Avoid cascading updates for milestones server-side and move logic to client #3709

maxceem opened this issue Feb 18, 2020 · 10 comments

maxceem commented Feb 18, 2020

Follow-up from #3299 (comment)

As part of the V5 Standards, we have to avoid making cascading updates server-side, implement endpoints for bulk operations, and move the cascading logic to the client-side.

This task would involve both server-side and client-side work.


maxceem commented Feb 18, 2020

@vikasrohit in this task we would get rid of cascading milestone updates while keeping milestones as "spans".

Our plan is to create a Bulk Update endpoint which would update all milestones of a timeline at once, and the client would control everything by itself.

So we would remove the cascading update logic from the CREATE and UPDATE endpoints. I would just like to note that after we remove this logic, these endpoints would let us create "inconsistent data" (as per our current requirements), as the server would not enforce consistency anymore. For example:

  • we could create multiple milestones with the same "order", while now we cannot do so, as milestones are automatically reordered, see code.
  • we could also create overlapping milestones, as we wouldn't force cascading updates anymore, see code.

There are two ways to go:

  1. Accept it and make the client responsible for all of this, so the server-side wouldn't enforce it in any way.
  2. Replace the cascading update logic with server-side validation logic: instead of automatically making cascading updates, the server could validate the data coming from the client and return errors if the client tries to create inconsistent data such as a duplicate "order" or overlapping milestones (see the sketch below). IMHO, as we are planning to move towards milestones as points, we don't need these validations, as they would be removed when we migrate to milestones as points.
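
For illustration, a minimal sketch of what the option 2 validation could look like, assuming Joi for schema validation; the field names (order, startDate, endDate) and the helper are illustrative, not the actual project-service code:

const Joi = require('joi');

// Hypothetical validator for option 2: reject inconsistent milestone data
// instead of silently fixing it with cascading updates.
const milestoneSchema = Joi.object().keys({
  id: Joi.number().integer().positive(),
  name: Joi.string().max(255).required(),
  order: Joi.number().integer().required(),
  startDate: Joi.date().required(),
  endDate: Joi.date().min(Joi.ref('startDate')).required(),
});

// Reject duplicate "order" values across the submitted milestones.
const milestonesSchema = Joi.array().items(milestoneSchema)
  .unique((a, b) => a.order === b.order);

// The overlap check is easier to express in plain code than in Joi:
// sort by startDate and make sure each milestone starts no earlier
// than the previous one ends.
function validateNoOverlap(milestones) {
  const sorted = [...milestones].sort((a, b) => new Date(a.startDate) - new Date(b.startDate));
  for (let i = 1; i < sorted.length; i += 1) {
    if (new Date(sorted[i].startDate) < new Date(sorted[i - 1].endDate)) {
      throw new Error(`Milestones "${sorted[i - 1].name}" and "${sorted[i].name}" overlap`);
    }
  }
}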


maxceem commented Feb 18, 2020

Milestones Bulk Update Endpoint Specification

We can add an endpoint PATCH /v5/timelines/:timelineId(\d+)/milestones which would update all the milestones of a timeline.

Imagine we have a timeline with 3 milestones:

milestones: [
  {
    "id": 1,
    "name": "Milestone 1",
    "startDate": "15 Dec 2020"
  },
  {
    "id": 2,
    "name": "Milestone 2",
    "startDate": "20 Dec 2020"
  },
  {
    "id": 3,
    "name": "Milestone 3",
    "startDate": "25 Dec 2020"
  }
]

If we then send a bulk request with the following payload:

[
  {
    "id": 2,
    "name": "Milestone 2 Updated",
    "startDate": "21 Dec 2020"
  },
  {
    "id": 3,
    "name": "Milestone 3",
    "startDate": "25 Dec 2020"
  },
  {
    "name": "Milestone 4",
    "startDate": "30 Dec 2020"
  }
]

then the timeline would look like this:

milestones: [
  {
    "id": 2,
    "name": "Milestone 2 Updated", // updated name
    "startDate": "21 Dec 2020"     // updated date
  },
  { // stays as it is
    "id": 3,
    "name": "Milestone 3",
    "startDate": "25 Dec 2020"
  },
  { // new milestone
    "id": 4,
    "name": "Milestone 4",
    "startDate": "30 Dec 2020"
  }
]
  • the milestone with id=1 has been deleted
  • the milestone with id=2 has been updated
  • the milestone with id=3 stays as it is
  • a new milestone has been created and got id=4

Errors

This endpoint should work as a single transaction, so if at least one operation performed by this endpoint fails, then:

  • all other operations should be rolled back
  • a single error should be returned for the whole endpoint (see the sketch below)
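
A minimal sketch of these semantics, assuming Sequelize for persistence; the Milestone model, its fields, and the function wiring are illustrative, not the actual project-service code:

const { Op } = require('sequelize');

async function bulkUpdateMilestones(models, timelineId, incoming) {
  // a managed transaction: Sequelize commits if the callback resolves
  // and rolls everything back if it throws
  return models.sequelize.transaction(async (t) => {
    const incomingIds = incoming.filter(m => m.id).map(m => m.id);

    // 1. delete milestones missing from the payload (id=1 in the example above)
    await models.Milestone.destroy({
      where: { timelineId, id: { [Op.notIn]: incomingIds } },
      transaction: t,
    });

    // 2. update milestones that carry an id (id=2),
    //    create the ones that don't ("Milestone 4")
    const results = [];
    for (const m of incoming) {
      if (m.id) {
        const row = await models.Milestone.findOne({
          where: { id: m.id, timelineId },
          transaction: t,
        });
        results.push(await row.update(m, { transaction: t }));
      } else {
        results.push(await models.Milestone.create({ ...m, timelineId }, { transaction: t }));
      }
    }

    // any error thrown above aborts the whole transaction, so either
    // all operations succeed or none do, and one error is returned
    return results;
  });
}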

@vikasrohit what do you think about such logic?

@vikasrohit

Agreed. We can live with the missing server enforcement because, as you said, we ultimately want to move towards the milestones-as-points implementation, which would delegate all validations to the caller.

I am good with the bulk endpoint specs as well, go ahead.


maxceem commented Feb 18, 2020

Thanks for the confirmation @vikasrohit.

One more thing. The bulk endpoint should send Kafka events after the individual milestone create/update/delete operations so that project-processor-es can reindex them in ES. There are 3 approaches:

  1. For each individual create/update/delete operation, send a single Kafka event (sketched after this list).
    • advantage: all interested third-party services are notified about any operation on milestones
    • disadvantage: previously, in V4, we had an issue when updating multiple milestones in ES at one time (see comment); that's why we had a special handler to index all the updated milestones at once
  2. Instead of multiple Kafka events for each milestone, send one single "updated" Kafka event for the timeline.
    • advantage: it would work much faster, and there definitely wouldn't be version conflict errors during indexing in ES
    • disadvantage: we would break standards, and any services listening for milestone changes wouldn't get events about them in this case
  3. Send both kinds of events described in 1 and 2, but add a flag to the payloads of the individual milestone events so that project-processor-es would ignore them and reindex ES only on the common timeline update event. This way other services could still get events about single milestone updates, while we wouldn't have version conflict errors during indexing.
    • advantage: it would work well for our needs without any issues
    • disadvantage: the logic would become quite custom; IMHO it's better to have unified event payloads for all events without such "hacky flags"
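
A rough sketch of approach 1, assuming a bus client with a postEvent(topic, payload) method; the topic names and payload shape are written here as assumptions, not the actual event contract:

const TOPICS = {
  create: 'project.action.create', // hypothetical topic names
  update: 'project.action.update',
  delete: 'project.action.delete',
};

// operations: [{ type: 'create' | 'update' | 'delete', milestone }, ...]
async function emitMilestoneEvents(busClient, operations) {
  for (const op of operations) {
    // one Kafka event per individual milestone operation,
    // so any interested service can react to each change
    await busClient.postEvent(TOPICS[op.type], {
      resource: 'milestone',
      data: op.milestone,
    });
  }
}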

This is quite a nasty issue, as all of the approaches have strong disadvantages.
I would give the 1st approach a chance and see whether we really come across the version conflict error again or not. Alternatively, before choosing an approach, we may run an F2F challenge to create a demo script that sends multiple create/update/delete requests to project-processor-es to see if version conflict errors appear. If so, we would have minimal code to reproduce the issue and could think about the solution.

@vikasrohit

@maxceem I agree. The only viable approach is 1. For avoiding version conflict errors, I guess we can use the approach suggested in https://www.elastic.co/guide/en/elasticsearch/reference/current/optimistic-concurrency-control.html. Notice the usage of if_seq_no and if_primary_term. I think that should solve the problem if we get the document first to note down the seq_no and primary_term and then send them with the next update call. We may have to include retry logic in case the update fails because of an if_seq_no or if_primary_term mismatch.
I think it makes sense to have an F2F to throw bulk events at project-processor-es and then fix the version conflict error using the approach I suggested above, or any better way participants can come up with.
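
For illustration, a minimal sketch of that retry approach using the official @elastic/elasticsearch client (v7-style responses); the index name and retry count are illustrative:

async function updateWithRetry(client, id, doc, retries = 3) {
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    // read the current document to note down _seq_no and _primary_term
    const { body: current } = await client.get({ index: 'timelines', id });
    try {
      await client.update({
        index: 'timelines',
        id,
        if_seq_no: current._seq_no,
        if_primary_term: current._primary_term,
        body: { doc },
      });
      return;
    } catch (err) {
      // 409 = version conflict: another writer updated the document
      // between our get and update, so re-read and try again
      if (err.statusCode !== 409 || attempt === retries) throw err;
    }
  }
}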


maxceem commented Feb 21, 2020

@vikasrohit the test has been implemented in branch https://github.com/topcoder-platform/project-processor-es/tree/feature/stress-test. And the results are quite interesting:

  1. if we send multiple events using Kafka, as we do in reality, project-processor-es consumes them one by one and there are no version conflict errors
  2. but if we call the handlers to create/update/delete milestones directly at the same time, then version conflict errors appear; we even have one unit test disabled, probably because it sometimes fails due to this error: https://github.com/topcoder-platform/project-processor-es/blob/develop/test/e2e/processor.timeline.index.test.js#L623

So it looks like using project-processor-es with Kafka keeps us safe from this error.
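
To make the difference concrete, a sketch of the two execution patterns from the stress test; milestoneUpdatedHandler is a hypothetical stand-in for the processor's handler:

// all updates hit the same ES document concurrently,
// so version conflicts can appear (case 2 above)
async function concurrent(updates) {
  await Promise.all(updates.map(u => milestoneUpdatedHandler(u)));
}

// one by one, as the Kafka consumer effectively processes messages,
// so no conflicts were observed (case 1 above)
async function sequential(updates) {
  for (const u of updates) {
    await milestoneUpdatedHandler(u);
  }
}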

@vikasrohit

Good findings @maxceem.
What are the practical cases where we call create/update/delete milestones directly? I mean, we always raise an event for indexing, even in CRUD operations by the API. So does that make it all safe?

@vikasrohit

I don't remember the reason why those unit tests are disabled; can we get them enabled and fixed now?


maxceem commented Feb 21, 2020

What are the practical cases where we call create/update/delete milestones directly? I mean, we always raise an event for indexing, even in CRUD operations by the API. So does that make it all safe?

We don't call them directly in real life, only in unit tests.

I don't remember the reason why those unit tests are disabled; can we get them enabled and fixed now?

I guess they were disabled due to the version conflict error. I've enabled them locally but didn't get this issue; probably it's not stable. I guess I can enable them and we can watch whether they cause random test failures or not.

@vikasrohit

I guess I can enable them and we can watch whether they cause random test failures or not.

Go ahead.
