Skip to content

Add the ability to update event by item id#131

Merged
sergehuber merged 1 commit intoapache:masterfrom
YotpoLtd:DATA-2785-update-event-from-context-servlet
Apr 24, 2020
Merged

Add the ability to update event by item id#131
sergehuber merged 1 commit intoapache:masterfrom
YotpoLtd:DATA-2785-update-event-from-context-servlet

Conversation

@RonBarabash
Copy link
Copy Markdown
Contributor

@RonBarabash RonBarabash commented Feb 22, 2020

Hey,
As we understand from the Unomi docs, events are an immutable list of objects that can only be appended too.
As more and more event driven system are being based on tools such as Kafka, we want to have the ability to guard ourselves from receiving duplicate events as consumers usually operates in "at least once" semantics.
We thought that having the ability to pass in the itemId as a parameter to the context servlet would help us in achieving that goal without breaking the current behaviour.
We also want to add the ability to ignore duplicate events in some rules we define, hence we added the raiseEventOnlyOnce parameter to the rule creation.
I know that this is modifying Event, which is a Unomi core entity.
We were wondering if this is something u might want to consider adding to the core functionality.
If not, is there a way for us to achieve such behaviour only by extending or creating a plugin for Unomi?
Your help and thoughts would be kindly appreciated,
Cheers,
Ron

@sergehuber
Copy link
Copy Markdown
Contributor

sergehuber commented Feb 24, 2020

Hello Ron,

Thank you so much for your contribution, it is very appreciated. However, as a reviewer there are some points I would like to review. Thanks for your understanding.

Could you explain a little bit more why you need to update events ? Maybe an example could be great?

My biggest concern with this PR is that it introduces the potential for security vulnerabilities. As secured events such as logins or profile properties events exist in the system, re-using itemIds could make it possible to overwrite or retrieve profile data.

Maybe an secondary ID could be used on the events to do what you need ?

@sergehuber
Copy link
Copy Markdown
Contributor

Or another solution might be to allow this operation only for authorized third party servers by using something like the EventService.isEventAllowed mechanism ?

@sigdestad
Copy link
Copy Markdown

Imho this would break a fundamental concept in Unomi of immutable events. Ability to modify an event will break any possibility to playback a chain of events and any action resulting from the event. Also, It might lead to inconsistence in terms of other data. If you need to "change" an event, you should rather register an event that undoes the initial.

@sigdestad
Copy link
Copy Markdown

Also - are you saying repeated events is cause by a technichal problem in your event streaming pipeline, or something else? If you only want to avoid duplicate entries, this can be handled in other ways than modifying existing events.

@RonBarabash
Copy link
Copy Markdown
Contributor Author

Hey all,
Firstly, Thank you for your responses.
Our use case with Unomi is as follows:
We want to build a user profile based on Internal System events that are transported through Kafka, for example When a user submits a new purchase, an event is being fired to the "OrderCreated/Updated" topic with the purchase details (Email, OrderAmount etc.).
At some point in time let's say we want to add a new property to the purchase details and send it to all past events as well (for instance: OrderCurrency). We have the ability to stream all of the events from beginning time and the relevant consumers should be able to update past events with the new property. This scenario is something we already encountered several times and we build our systems in a way that entities can be updated at their persistent target.
Again, if there is a way we can achieve this by not changing the core Unomi entities i would be more than happy to hear.
In the way we architect our system, Unomi is not accessible from the outside (i.e the internet) of our systems, hence we do not have any security considerations.
@sergehuber What do u mean by secondary id? Should we add a new property to Event and that would act as our id for updating?
@sigdestad I agree with you, Can u elaborate on how would u implement this without breaking event immutability?
Looking forward for your responses,
10x
Ron

@lyogev
Copy link
Copy Markdown
Contributor

lyogev commented Feb 25, 2020

Regarding security we can maybe make sure we are at least in the same session, @sergehuber will that suffice?

@jbonofre jbonofre self-requested a review February 25, 2020 19:55
@jbonofre
Copy link
Copy Markdown
Member

@lyogev yeah, I think it's fair enough.

@RonBarabash
Copy link
Copy Markdown
Contributor Author

@jbonofre i added the check for session id as well

Copy link
Copy Markdown
Contributor

@sergehuber sergehuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, thanks for the context. The way I see it two solutions could be used:

  • The one you propose in this patch, but it would need to be better secured (see my comments in the code). It's not clear to me what you are doing with rules in this part, if executing them at all?
  • Another "more Unomi compliant" would be to, instead of replying the SAME events, you would send a new event of type "updateEvent" (or any name you choose), that would have an associated rule that would modify existing events. The disadvantage of this method is that events might grow quickly every type you update the events, but at least you would have full traceability of what happened overtime.
    I look forward to your answer, this is an interesting topic :)

@RonBarabash RonBarabash force-pushed the DATA-2785-update-event-from-context-servlet branch 2 times, most recently from e4bf912 to c8c6d8a Compare March 1, 2020 13:11
@RonBarabash
Copy link
Copy Markdown
Contributor Author

RonBarabash commented Mar 1, 2020

Hey @sergehuber,
I updated the code to use the isEventAllowed mechanism before creating an event with a given item id...
Let me know if there is anything else i need to in order to get this pr in,
10x
Ron

Copy link
Copy Markdown
Contributor

@sergehuber sergehuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ron, I suggested a small change that should secure the requests properly. Once that is added I think this PR will be ready.

Thanks for your understanding and patience.

logger.warn("Event is not allowed : {}", event.getEventType());
continue;
}
if (event.getItemId() != null) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be best to check if the thirdPartyId != null in that condition so it would become:

if (thirdPartyId != null && event.getItemId() != null) {

This way the itemId would only be accepted if a configured server Id is doing the request and it provides the Unomi Key.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @sergehuber,
I understand, i added your suggestion to the pr,
10x
Ron

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if i missed something ion the conversation threads let me know :)
Thank you for your responsiveness,
Ron

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ron, thanks for adding the suggestion, this is much better. Now all that is left is to check for migration issues but I have to test the branch to check for that.

Actually now that you mention it you didn't give me any feedback on this part of my suggestion: "Another "more Unomi compliant" would be to, instead of replying the SAME events, you would send a new event of type "updateEvent" (or any name you choose), that would have an associated rule that would modify existing events. The disadvantage of this method is that events might grow quickly every type you update the events, but at least you would have full traceability of what happened overtime."

Finally, I'd love to have more context on what you are using this for, because I wonder for example how often you see these model changes and if you could give me more examples of real cases? I'm just curious here because I haven't encountered this much for the moment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @sergehuber,
So the solution u suggested might work but it doesn't seem very scalable to create a new event type for each entity, i think it would create a lot of mess within the data, and how would you treat these events on the query side, if you would want to create a rather complex aggregation which involves the update events + the created events? i would say that it would be over-complicating things for a rather simple & straight solution.
I think that if you would like to have full traceability of what happened overtime u might want to save this somewhere else outside of Unomi, maybe fire state changed events to Kafka and take it from there like (CDC)

About our use case:
So we are now at the beginning of building Yotpo's internal CDP based on apache Unomi.
As i mentioned earlier. we compose the user profiles based on internal system events which are themselves state changed events of core Yotpo entities. (i.e OrderCreated, ReviewCreated, PointsRedeemed etc)
All of these distributed systems are sending events to Kafka, from there we trigger Unomi Consumers which transform and validate the events properties and store them in Unomi.
We work in a matter that event publishers are consumer agnostic, meaning Unomi is just another downstream consumer of events.
We have scenarios of which entities are being created and after a while being updated. for example: a customer left a review which triggered ReviewCreate event. for each review we are performing sentiment analysis asynchronously (after the review was created, we send the text through an NLP pipeline) which eventually update the review entity with a sentiment score and triggers the ReviewUpdated event.
We need to be able to store the sentiment score within Unomi, and associate it with the relevant event that was already registered in Unomi.
We can dig deeper to some of the other use cases if u would like offline :)
nonetheless, the change proposed in this PR is supporting our use case without breaking the default use cases you guys had in mind while building Unomi.
10x
Ron

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if there is anything i can do to help u test the migration rules.
10x
Ron

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I'm sorry to bring this up now but it seems like this is not a proper usage of events. Events are just what they are: objects that indicate something has happened at a certain time.

What you are describing in the use case seems more like you would benefit more from adding a new Item type called "Review" and then using events to update the Review items. This could be easily done just by building your own plugin. You could also build your own custom API to do any queries or aggregations you need on your custom item type.

I know this might come as a (big?) change, but with your use case it makes things a little different.

Copy link
Copy Markdown
Contributor Author

@RonBarabash RonBarabash Mar 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im sorry @sergehuber but i dont understand why creating a new item might solve this.
Its also very unscalable to create a new item per event type we want to add to the system.
Do u think maybe we should talk in over the phone for u to get a better understanding on why we need this feature?
And again, Why not add a generic behaviour and functionality that should work for some of the use cases with Unomi? this PR does not suggesting breaking changes or a change in concept. having the ability to update events to me seems trivial, some of the users of Unomi can enable it, and some dont, even more so when we guarded this using the 3rd party id check...
Creating Item Types for each Event Type for me is abusing the system to do something that it doesn't suppose to do.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following our meeting it was decided to:

@RonBarabash RonBarabash force-pushed the DATA-2785-update-event-from-context-servlet branch from c8c6d8a to a51a9b6 Compare March 3, 2020 07:00
@RonBarabash RonBarabash force-pushed the DATA-2785-update-event-from-context-servlet branch 8 times, most recently from 66483e0 to 1f6d187 Compare April 13, 2020 15:07
@RonBarabash
Copy link
Copy Markdown
Contributor Author

Hey @sergehuber & @jbonofre,
I added some itests, can u please take a look and see if this is what u had in mind? if there are more use cases u wish to test let me know and ill add them...
Testing this stuff is quite tiresome as most of the underlying processes are async which forces the tests to wait for them.
anyhow, if there is another place u wish to document this let me know,
Much appreciated,
Ron

@sergehuber
Copy link
Copy Markdown
Contributor

Hello Ron,

Thanks for all this, I'll look at it asap.

Regards,
Serge...

@sergehuber sergehuber self-requested a review April 13, 2020 17:29
@sergehuber
Copy link
Copy Markdown
Contributor

sergehuber commented Apr 13, 2020

I just had a quick look at the code and it looks good, I'll want to test it before merging it.

We also talk about documenting the feature and explaining it, will you contribute this separately (you could for example add content to the manual) ?

@RonBarabash
Copy link
Copy Markdown
Contributor Author

Hey @sergehuber,
No problems I'll add it to the manual in this pr,
10x
Ron

@RonBarabash RonBarabash force-pushed the DATA-2785-update-event-from-context-servlet branch from 1f6d187 to 7ca776e Compare April 16, 2020 07:26
@RonBarabash
Copy link
Copy Markdown
Contributor Author

Hey @sergehuber,
Added a new doc to the manual, let me know what u think,
10x
Ron

Copy link
Copy Markdown
Contributor

@sergehuber sergehuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor change, please reference the new document from the index.adoc

@RonBarabash RonBarabash force-pushed the DATA-2785-update-event-from-context-servlet branch from 7ca776e to fd614d5 Compare April 17, 2020 11:25
Copy link
Copy Markdown
Contributor

@sergehuber sergehuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patience and your hard work. I think this PR is ready to be merged !

@RonBarabash
Copy link
Copy Markdown
Contributor Author

@sergehuber 🙏 🙏 🙏 🙏
Good luck!

@lyogev
Copy link
Copy Markdown
Contributor

lyogev commented Apr 21, 2020

@sergehuber can you merge this?

@sergehuber
Copy link
Copy Markdown
Contributor

@lyogev I'm just checking something quickly but I'm hoping to do asap.

@sergehuber
Copy link
Copy Markdown
Contributor

In case you're wondering I want to make this part of release 1.5.0 so don't worry about that :)

@lyogev
Copy link
Copy Markdown
Contributor

lyogev commented Apr 21, 2020

@sergehuber Thank you!! and I'm not worried :)
1.5.0 is coming out soon?

@sergehuber
Copy link
Copy Markdown
Contributor

1.5.0 is planned for next week if I can finish everything on time :)

@RonBarabash given the size of the contribution I think it would be best if you sign a CLA. You can find the information here: https://www.apache.org/licenses/contributor-agreements.html

This only needs to be done once, and if you did it for another project it's not needed. Just make sure you put "Unomi" in the project name and let me know when it's done.

Thanks a lot !

@RonBarabash
Copy link
Copy Markdown
Contributor Author

Hey @sergehuber ,
Done!

@sergehuber
Copy link
Copy Markdown
Contributor

Thanks. Did you submit an ICLA by email? As soon as it is processed by the secretary I'll merge the PR

@RonBarabash
Copy link
Copy Markdown
Contributor Author

@sergehuber yes sent an email to secretary@apache.org..

@sergehuber
Copy link
Copy Markdown
Contributor

Ok did you use a nickname or pseudo ? I'm still not seeing it in the system.

Copy link
Copy Markdown
Member

@jbonofre jbonofre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked and I confirm that Ron's ICLA is in the records. +1 to merge this PR.

@sergehuber sergehuber merged commit bacb7bc into apache:master Apr 24, 2020
@sergehuber
Copy link
Copy Markdown
Contributor

Thank you Ron for your patience and your contribution, it is now merged !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants