Avoid cascading updates for milestones server-side and move logic to client #3709

maxceem opened this issue Feb 18, 2020 · 10 comments

maxceem commented Feb 18, 2020

Follow-up from #3299 (comment)

As part of the V5 Standards, we have to avoid making cascading updates server-side, implement endpoints for bulk operations, and move the cascading logic to the client-side.

This task would involve both server-side and client-side work.


maxceem commented Feb 18, 2020

@vikasrohit in this task we would get rid of cascading milestone updates while keeping milestones as "spans".

Our plan is to create a Bulk Update endpoint which would update all milestones of a timeline at once, and the client would control everything by itself.

So we would remove the cascading update logic from the CREATE and UPDATE endpoints. I would just like to note that after we remove this logic, these endpoints would let us create "inconsistent data" (as per our current requirements), as the server would not enforce consistency anymore. For example:

  • we could create multiple milestones with the same "order", while now we cannot do so, as milestones are automatically reordered, see code.
  • we could also create overlapping milestones, as we wouldn't force cascading updates anymore, see code.

There are two ways to go:

  1. Accept it and make the client responsible for all of this, so the server-side wouldn't enforce it in any way.
  2. Replace the cascading update logic with server-side validation logic: instead of automatically making cascading updates, the server could validate the data coming from the client and return errors if the client tries to create inconsistent data such as a duplicate "order" or overlapping milestones (see the sketch below). IMHO, as we are planning to move towards milestones as points, we don't need these validations, as they would be removed when we migrate to milestones as points.
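
For illustration, a minimal sketch of what the option 2 validation could look like, assuming Joi for schema validation; the field names (order, startDate, endDate) and the helper are illustrative, not the actual project-service code:

const Joi = require('joi');

// Hypothetical validator for option 2: reject inconsistent milestone data
// instead of silently fixing it with cascading updates.
const milestoneSchema = Joi.object().keys({
  id: Joi.number().integer().positive(),
  name: Joi.string().max(255).required(),
  order: Joi.number().integer().required(),
  startDate: Joi.date().required(),
  endDate: Joi.date().min(Joi.ref('startDate')).required(),
});

// Reject duplicate "order" values across the submitted milestones.
const milestonesSchema = Joi.array().items(milestoneSchema)
  .unique((a, b) => a.order === b.order);

// The overlap check is easier to express in plain code than in Joi:
// sort by startDate and make sure each milestone starts no earlier
// than the previous one ends.
function validateNoOverlap(milestones) {
  const sorted = [...milestones].sort((a, b) => new Date(a.startDate) - new Date(b.startDate));
  for (let i = 1; i < sorted.length; i += 1) {
    if (new Date(sorted[i].startDate) < new Date(sorted[i - 1].endDate)) {
      throw new Error(`Milestones "${sorted[i - 1].name}" and "${sorted[i].name}" overlap`);
    }
  }
}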


maxceem commented Feb 18, 2020

Milestones Bulk Update Endpoint Specification

We can add an endpoint PATCH /v5/timelines/:timelineId(\d+)/milestones which would update all the milestones of a timeline.

Imagine we have a timeline with 3 milestones:

milestones: [
  {
    "id": 1,
    "name": "Milestone 1",
    "startDate": "15 Dec 2020"
  },
  {
    "id": 2,
    "name": "Milestone 2",
    "startDate": "20 Dec 2020"
  },
  {
    "id": 3,
    "name": "Milestone 3",
    "startDate": "25 Dec 2020"
  }
]

If we then send a bulk request with the following payload:

[
  {
    "id": 2,
    "name": "Milestone 2 Updated",
    "startDate": "21 Dec 2020"
  },
  {
    "id": 3,
    "name": "Milestone 3",
    "startDate": "25 Dec 2020"
  },
  {
    "name": "Milestone 4",
    "startDate": "30 Dec 2020"
  }
]

then the timeline would look like this:

milestones: [
  {
    "id": 2,
    "name": "Milestone 2 Updated", // updated name
    "startDate": "21 Dec 2020"     // updated date
  },
  { // stays as it is
    "id": 3,
    "name": "Milestone 3",
    "startDate": "25 Dec 2020"
  },
  { // new milestone
    "id": 4,
    "name": "Milestone 4",
    "startDate": "30 Dec 2020"
  }
]
  • the milestone with id=1 has been deleted
  • the milestone with id=2 has been updated
  • the milestone with id=3 stays as it is
  • a new milestone has been created and got id=4

Errors

This endpoint should work as a single transaction, so if at least one operation performed by this endpoint fails, then:

  • all other operations should be rolled back
  • a single error should be returned for the whole endpoint (see the sketch below)
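
A minimal sketch of these semantics, assuming Sequelize for persistence; the Milestone model, its fields, and the function wiring are illustrative, not the actual project-service code:

const { Op } = require('sequelize');

async function bulkUpdateMilestones(models, timelineId, incoming) {
  // a managed transaction: Sequelize commits if the callback resolves
  // and rolls everything back if it throws
  return models.sequelize.transaction(async (t) => {
    const incomingIds = incoming.filter(m => m.id).map(m => m.id);

    // 1. delete milestones missing from the payload (id=1 in the example above)
    await models.Milestone.destroy({
      where: { timelineId, id: { [Op.notIn]: incomingIds } },
      transaction: t,
    });

    // 2. update milestones that carry an id (id=2),
    //    create the ones that don't ("Milestone 4")
    const results = [];
    for (const m of incoming) {
      if (m.id) {
        const row = await models.Milestone.findOne({
          where: { id: m.id, timelineId },
          transaction: t,
        });
        results.push(await row.update(m, { transaction: t }));
      } else {
        results.push(await models.Milestone.create({ ...m, timelineId }, { transaction: t }));
      }
    }

    // any error thrown above aborts the whole transaction, so either
    // all operations succeed or none do, and one error is returned
    return results;
  });
}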

@vikasrohit what do you think about such logic?

@vikasrohit

Agreed. We can live with the missing server enforcement because, as you said, we ultimately want to move towards the milestones-as-points implementation, which would delegate all validations to the caller.

I am good with the bulk endpoint specs as well, go ahead.


maxceem commented Feb 18, 2020

Thanks for the confirmation @vikasrohit.

One more thing. The bulk endpoint should send Kafka events after the individual milestone create/update/delete operations so that project-processor-es can reindex them in ES. There are 3 approaches:

  1. For each individual create/update/delete operation, send a single Kafka event (sketched after this list).
    • advantage: all interested third-party services are notified about any operation on milestones
    • disadvantage: previously, in V4, we had an issue when updating multiple milestones in ES at one time (see comment); that's why we had a special handler to index all the updated milestones at once
  2. Instead of multiple Kafka events for each milestone, send one single "updated" Kafka event for the timeline.
    • advantage: it would work much faster, and there definitely wouldn't be version conflict errors during indexing in ES
    • disadvantage: we would break standards, and any services listening for milestone changes wouldn't get events about them in this case
  3. Send both kinds of events described in 1 and 2, but add a flag to the payloads of the individual milestone events so that project-processor-es would ignore them and reindex ES only on the common timeline update event. This way other services could still get events about single milestone updates, while we wouldn't have version conflict errors during indexing.
    • advantage: it would work well for our needs without any issues
    • disadvantage: the logic would become quite custom; IMHO it's better to have unified event payloads for all events without such "hacky flags"
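
A rough sketch of approach 1, assuming a bus client with a postEvent(topic, payload) method; the topic names and payload shape are written here as assumptions, not the actual event contract:

const TOPICS = {
  create: 'project.action.create', // hypothetical topic names
  update: 'project.action.update',
  delete: 'project.action.delete',
};

// operations: [{ type: 'create' | 'update' | 'delete', milestone }, ...]
async function emitMilestoneEvents(busClient, operations) {
  for (const op of operations) {
    // one Kafka event per individual milestone operation,
    // so any interested service can react to each change
    await busClient.postEvent(TOPICS[op.type], {
      resource: 'milestone',
      data: op.milestone,
    });
  }
}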

This is quite a nasty issue, as all of the approaches have strong disadvantages.
I would give the 1st approach a chance and see whether we really come across the version conflict error again or not. Alternatively, before choosing an approach, we may run an F2F challenge to create a demo script that sends multiple create/update/delete requests to project-processor-es to see if version conflict errors appear. If so, we would have minimal code to reproduce the issue and could think about the solution.

@vikasrohit

@maxceem I agree. The only viable approach is 1. For avoiding version conflict errors, I guess we can use the approach suggested in https://www.elastic.co/guide/en/elasticsearch/reference/current/optimistic-concurrency-control.html. Notice the usage of if_seq_no and if_primary_term. I think that should solve the problem if we get the document first to note down the seq_no and primary_term and then send them with the next update call. We may have to include retry logic in case the update fails because of an if_seq_no or if_primary_term mismatch.
I think it makes sense to have an F2F to throw bulk events at project-processor-es and then fix the version conflict error using the approach I suggested above, or any better way participants can come up with.
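
For illustration, a minimal sketch of that retry approach using the official @elastic/elasticsearch client (v7-style responses); the index name and retry count are illustrative:

async function updateWithRetry(client, id, doc, retries = 3) {
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    // read the current document to note down _seq_no and _primary_term
    const { body: current } = await client.get({ index: 'timelines', id });
    try {
      await client.update({
        index: 'timelines',
        id,
        if_seq_no: current._seq_no,
        if_primary_term: current._primary_term,
        body: { doc },
      });
      return;
    } catch (err) {
      // 409 = version conflict: another writer updated the document
      // between our get and update, so re-read and try again
      if (err.statusCode !== 409 || attempt === retries) throw err;
    }
  }
}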


maxceem commented Feb 21, 2020

@vikasrohit the test has been implemented in branch https://github.com/topcoder-platform/project-processor-es/tree/feature/stress-test. And the results are quite interesting:

  1. if we send multiple events using Kafka, as we do in reality, project-processor-es consumes them one by one and there are no version conflict errors
  2. but if we call the handlers to create/update/delete milestones directly at the same time, then version conflict errors appear; we even have one unit test disabled, probably because it sometimes fails due to this error: https://github.com/topcoder-platform/project-processor-es/blob/develop/test/e2e/processor.timeline.index.test.js#L623

So it looks like using project-processor-es with Kafka keeps us safe from this error.
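
To make the difference concrete, a sketch of the two execution patterns from the stress test; milestoneUpdatedHandler is a hypothetical stand-in for the processor's handler:

// all updates hit the same ES document concurrently,
// so version conflicts can appear (case 2 above)
async function concurrent(updates) {
  await Promise.all(updates.map(u => milestoneUpdatedHandler(u)));
}

// one by one, as the Kafka consumer effectively processes messages,
// so no conflicts were observed (case 1 above)
async function sequential(updates) {
  for (const u of updates) {
    await milestoneUpdatedHandler(u);
  }
}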

@vikasrohit

Good findings @maxceem.
What are the practical cases where we call create/update/delete milestones directly? I mean, we always raise an event for indexing, even in CRUD operations by the API. So does that make it all safe?

@vikasrohit

I don't remember the reason why those unit tests are disabled; can we get them enabled and fixed now?


maxceem commented Feb 21, 2020

What are the practical cases where we call create/update/delete milestones directly? I mean, we always raise an event for indexing, even in CRUD operations by the API. So does that make it all safe?

We don't call them directly in real life, only in unit tests.

I don't remember the reason why those unit tests are disabled; can we get them enabled and fixed now?

I guess they were disabled due to the version conflict error. I've enabled them locally but didn't get this issue; probably it's not stable. I guess I can enable them and we can watch whether they cause random test failures or not.

@vikasrohit

I guess I can enable them and we can watch whether they cause random test failures or not.

Go ahead.
