Optimistic Locking friendly changes #86

Merged
merged 35 commits into from Sep 27, 2017

Conversation

4 participants
@paneq
Member

paneq commented Aug 29, 2017

It started here: https://github.com/RailsEventStore/aggregate_root/issues/8

3 possible ways of writing

Number

  • :none - works like -1: assumes the stream is empty so far and starts adding new events to it
  • a given number in -1..Infinity - assumes the stream is at version N and starts adding N+1, N+2, ...

Should work well for

  • non-legacy apps
  • the event sourcing scenario
  • no surrounding transaction required

Auto

  • :auto - assumes a lock in a higher layer. Will query for the last position N and start writing N+1, N+2, ... etc. [New mode]
  • good for legacy apps
  • requires a surrounding transaction and lock

Any

  • :any - NULLable position in stream, order unknown
  • for copying into highly-contended technical streams where we don't really care about the exact position
  • best-guess order can be determined based on the EventInStream auto-increment id (see the sketch after this list)
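A minimal sketch (hypothetical helper, not the PR's actual code) of how these modes could map the expected version onto the position of the first newly appended event, using the EventInStream model introduced in this PR:

    def starting_position(expected_version, stream_name)
      case expected_version
      when :none
        0                       # behaves like -1: stream assumed empty so far
      when Integer
        expected_version + 1    # caller claims the stream is currently at version N
      when :auto
        # assumes a lock is already held in a higher layer
        last = EventInStream.where(stream: stream_name).order("position DESC").first
        last ? last.position + 1 : 0
      when :any
        nil                     # position stays NULL, order unknown
      end
    end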

This is just a spike that I wanted to share with you.

What do you think @mpraglowski @pawelpacana @mlomnicki ?

Checklist

  • MySQL
  • Postgres
  • Re-enable mutation testing
  • Verbose / non-verbose mode of running tests
  • Change migration to be DB-dependent and use string for MySQL UUID
  • Discuss 3 tables vs 2 tables
  • Fix enrich_event_metadata to not edit metadata directly but rather a cloned/duped version.
  • GLOBAL_STREAM is all instead of __global__ and it should only write there once.
  • Consider global linearization with a very quick lock on write...
  • Make sure AggregateRoot can handle conflicts
  • Should an event be unique in a stream?

Optimistic Locking friendly changes WIP
I collected the work from across all gems.

https://github.com/RailsEventStore/aggregate_root/issues/8

It is missing the required Travis changes to run
rails_event_store_active_record on MySQL and Postgres
@paneq

Member
paneq commented Aug 29, 2017

Right now it is missing the required Travis changes to run rails_event_store_active_record on MySQL and Postgres.

paneq added some commits Aug 29, 2017

Make mutant happy but me unhappy
I considered using `prepend` to initialize
@unpublished_events = []
instead of using the lazy load pattern:

  def unpublished_events
    @unpublished_events ||= []
  end

but I decided the overhead is not worth it.
Although we could use it to set the version to -1 as well.

I will create a separate issue for it.
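For reference, a hedged sketch of the `prepend` alternative mentioned above (module and class names are illustrative, not the gem's actual code):

    module AggregateRoot
      module Defaults
        def initialize(*args)
          @unpublished_events = []
          @version = -1
          super
        end
      end

      def self.included(klass)
        klass.prepend(Defaults)   # Defaults#initialize runs before the class's own
      end
    end

    class Order
      include AggregateRoot
    end
    Order.new   # @unpublished_events and @version are set without lazy readers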

paneq added some commits Aug 30, 2017

@@ -1,26 +1,55 @@
if ENV['CODECLIMATE_REPO_TOKEN']

@mlomnicki

mlomnicki Aug 30, 2017

Member

not needed anymore. We got rid of CC

t.text :data, null: false
t.integer :position, null: true
if ENV['DATABASE_URL'].start_with?("postgres")
t.references :event, null: false, type: :uuid

@mlomnicki

mlomnicki Aug 30, 2017

Member

Do we actually need the :uuid? Maybe :string would suffice? With :string we don't have to enable pgcrypto

@paneq

paneq Aug 30, 2017

Member

And what's wrong with pgcrypto?

@mlomnicki

mlomnicki Aug 30, 2017

Member

Nothing wrong. It's not enabled by default. Are we going to force users to enable it even though it's not really needed?

@paneq

paneq Aug 30, 2017

Member

Yep :)

@mlomnicki

mlomnicki Aug 30, 2017

Member

🤔

@mlomnicki

Member

mlomnicki commented Aug 30, 2017

Excellent job @paneq! This is quite a big PR. Could we maybe split it into smaller chunks?

The checklist already contains some bits that could be extracted to separate PRs (MySQL/Postgres, fix enrich_event_metadata, etc.)

I would also open separate PRs for the following:

  • ability to publish multiple events
  • Event#hash
  • Discuss - do we have to depend on activerecord-import?

Makes sense?

when :none
-1
when :auto
eis = EventInStream.where(stream: stream_name).order("position DESC").first

@gottfrois

gottfrois Aug 30, 2017

Contributor

we should use a repository to access this

@gottfrois

gottfrois Aug 30, 2017

Contributor

it should also be in its own private method

@paneq

paneq Aug 30, 2017

Member

The code we are writing here is the repository :)

@gottfrois

gottfrois Aug 30, 2017

Contributor

it's the event repository, not EventInStream repository. I would make a clear distinction

@@ -17,75 +54,75 @@ def delete_stream(stream_name)
end
def has_event?(event_id)
adapter.exists?(event_id: event_id)
Event.exists?(id: event_id)

@gottfrois

gottfrois Aug 30, 2017

Contributor

should use the adapter

Show outdated Hide outdated ...re_active_record/lib/rails_event_store_active_record/event_repository.rb
unless start_event_id.equal?(:head)
starting_event = adapter.find_by(event_id: start_event_id)
stream = stream.where('id > ?', starting_event)
def read_events_forward(stream_name, after_event_id, count)

@gottfrois

gottfrois Aug 30, 2017

Contributor

same comments about adapters

@paneq

Member

paneq commented Aug 30, 2017

The checklist already contains some bits that could be extracted to separate PRs

I agree. It's more so that I don't forget about those things than to actually include them in this PR.

@paneq

Member

paneq commented Aug 30, 2017

ability to publish multiple events

Well... we could add such functionality in a separate PR without releasing a new version, because publishing multiple events using the old code is still going to be completely broken from an optimistic locking perspective. But maybe a separate PR for extending the API is a good idea? Dunno.

@paneq

Member

paneq commented Aug 30, 2017

Event#hash

No brainer. Feel free to extract and merge. Very simple. Related mutant discussion https://twitter.com/pankowecki/status/902521997573455877

@paneq

Member

paneq commented Aug 30, 2017

Discuss - do we have to depend on activerecord-import?

Ideally not, but I didn't want to write all the code for doing a single INSERT with multiple records myself. It might not be relevant if we migrate to the transactions and 3 tables approach.
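For context, a hedged sketch of what activerecord-import buys here (column and variable names are illustrative):

    require 'activerecord-import'

    records = events.each_with_index.map do |event, index|
      EventInStream.new(stream: stream_name, event_id: event.event_id, position: start_position + index)
    end
    EventInStream.import(records)   # one multi-row INSERT instead of N separate INSERTs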

@paneq

Member

paneq commented Aug 30, 2017

@gottfrois Thanks for the comments about extracting methods etc. Right now, however, this is totally Work In Progress, in which I am trying to find out whether what we are trying to achieve is possible within the constraints I described. I am still not exactly sure, and there are bigger decisions to be made (3 vs 2 tables, handling concurrency, etc.). Once I am happy with the big decisions, I am going to focus on refactoring the smaller parts.

@paneq

Member

paneq commented Aug 30, 2017

@gottfrois Also, I am trying to reach a state in which this can be merged to master, unreleased. Then everyone can easily improve the codebase with proper tests until we are satisfied with the results, and then we could do an official release. So I am going to dismiss the review right now. That does not mean I don't agree; I just want to handle those kinds of things later :)

@gottfrois

Contributor

gottfrois commented Aug 30, 2017

no hard feelings :)

We will deal with those changes later :)

@mlomnicki

Member

mlomnicki commented Aug 30, 2017

Event#hash

No brainer.

Apparently mere developers such as me don't understand it :) I.e. how is this change related to optimistic locking? Why is the big number written in binary notation? Why do we even need to override Event#hash, etc.?

@paneq

Member

paneq commented Aug 30, 2017

Apparently mere developers such as me don't understand it :) I.e. how is this change related to optimistic locking?

In one race condition test I wanted to guarantee that all events are returned, but the order is not known, so I used Array#to_set and checked whether two sets are equal. That failed. Which brings us to your next question:

Why do we even need to overwrite Event#hash, etc

Because our Event has an == operator, which makes it behave like a Value Object (ignoring metadata). But value objects which are equal in the == sense should also return the same hash, which is what gets used when you put such an object into a Hash or a Set.

       expect({
         klass.new(event_id: "doh") => :YAY
       }[ klass.new(event_id: "doh") ]).to eq(:YAY)
       expect(Set.new([
         klass.new(event_id: "doh")
       ])).to eq(Set.new([klass.new(event_id: "doh")]))

These two are failing without implementing hash

Why the big number is written in binary notation?

It's easier that way to see how many bits it has.
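To make the above concrete, a hedged sketch (not necessarily the gem's exact implementation) of keeping #hash consistent with #== for a value-object-like event:

    class MyEvent
      attr_reader :event_id, :data

      def initialize(event_id:, data: {})
        @event_id = event_id
        @data     = data
      end

      # value-object equality: class, id and data; metadata is ignored
      def ==(other)
        other.instance_of?(self.class) &&
          other.event_id.eql?(event_id) &&
          other.data.eql?(data)
      end
      alias eql? ==

      # Hash and Set look objects up via #hash plus #eql?; XOR-ing with a
      # class-specific constant (written in binary so its bit width is obvious)
      # keeps hashes of unrelated objects with the same fields apart.
      BIG_VALUE = 0b1010101010101010101010101010101010101010101010101010101010101010
      def hash
        [self.class, event_id, data].hash ^ BIG_VALUE
      end
    end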

@mlomnicki

Member

mlomnicki commented Aug 30, 2017

Wow that's a proper explanation. Much appreciated, thanks!!

paneq added some commits Aug 30, 2017

kill mutant
Simple test to check that #has_event? still returns true when given a
different string with the same content.
Deleting a stream does not delete events in other streams
This changes the original logic but it makes sense imho
because we now have the concept of an event belonging to multiple
streams.
No point in this adapter
We now have 2 (possibly 3 later) classes responsible for storing
events. I don't see much point in this adapter anymore.
@paneq

Member

paneq commented Aug 30, 2017

Regarding mutations in rails_event_store_active_record... I don't think it makes sense to run them on Postgres or MySQL. Probably I should bring SQLite back into the testing suite and mutate only on that.

@mlomnicki mlomnicki referenced this pull request Aug 31, 2017

Open

Benchmarks for RES #100

@paneq

Member

paneq commented Sep 9, 2017

Today I realized that ideally I would want to come up with a DB schema that allows for Event Store as a queue ( #106 ), and since I don't want to keep changing the DB schema constantly, I kind of coupled both problems together.

paneq added some commits Sep 11, 2017

Add lovely tests
Kill lovely mutations.
Stronger check
I am not sure if `equal` will always work, but I think it will as two
identical symbols must be the same object, no matter how constructed.
Exclude `preload()`
1.

`preload()` only has observable performance effects, so mutating it brings
no benefit, and I am not sure how we would kill it anyway.

I was thinking about adding in build_event_entity(record) a check like

raise "use preload()" unless record.association(:event).loaded?

but then how do we kill mutations around this ^^ check if it does not
affect the external API in any way?

2.

where('id < ?', before_event) automatically substitutes id so killing a
few mutations
Verify preloading effect with nr of queries
I was wrong in bc51e52

we can use the number of DB queries to verify preloading.
It is not part of the linted spec but an additional test of a particular
implementation (see the sketch after these commits).
Use constant
It does not matter whether we shift all positions by 1, 2, 3, etc. We use 1 so
that the first event in a stream is recorded with position 0 :)
explicit sorting by position - working test
I had to remove_index :event_store_events_in_streams, [:stream,
:position] because it was used by default when fetching the records, and
even without an explicit order the rows came back in the order defined by that index.
Surprise.
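A hedged sketch of the query-counting check mentioned in the "Verify preloading effect" commit above (the repository call and the expected count are illustrative):

    def count_sql_queries(&block)
      count = 0
      counter = ->(_name, _start, _finish, _id, payload) {
        count += 1 unless %w[CACHE SCHEMA].include?(payload[:name])
      }
      ActiveSupport::Notifications.subscribed(counter, "sql.active_record", &block)
      count
    end

    # e.g. with preload(:event) in place, a whole read should stay at a fixed,
    # small number of queries:
    expect(count_sql_queries { repository.read_events_forward('stream', :head, 100) }).to eq(2)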
@paneq

Member

paneq commented Sep 25, 2017

@pawelpacana @mlomnicki before we merge this I have a few questions. Let's start with most important one:

  • add_index :event_store_events_in_streams, [:stream, :event_uuid], unique: true - Do you think we need it? I think it makes sense. A given event should appear in a stream once and only once. read_events_forward and other similar methods basically rely on that fact, because they find the position based on the event id, and if there were two events with the same id in a stream then we could get random results and failures.
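A hedged sketch of the migration being discussed, and of how the unique index would surface a duplicate append (the domain error name is hypothetical):

    class AddUniqueIndexOnEventsInStreams < ActiveRecord::Migration[5.1]
      def change
        add_index :event_store_events_in_streams, [:stream, :event_uuid], unique: true
      end
    end

    # appending the same event to the same stream twice then fails at the DB level
    require 'securerandom'
    duplicate_id = SecureRandom.uuid
    begin
      EventInStream.create!(stream: "Order$1", event_uuid: duplicate_id, position: 0)
      EventInStream.create!(stream: "Order$1", event_uuid: duplicate_id, position: 1)
    rescue ActiveRecord::RecordNotUnique
      raise EventDuplicatedInStream   # hypothetical domain-level error
    end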
@mlomnicki

This comment has been minimized.

Show comment
Hide comment
@mlomnicki

mlomnicki Sep 26, 2017

Member

@paneq yep, defo makes sense 👍

Member

mlomnicki commented Sep 26, 2017

@paneq yep, defo makes sense 👍

paneq added some commits Sep 26, 2017

Merge remote-tracking branch 'origin/master' into locking_friendly
Conflicts:
	aggregate_root/spec/aggregate_root_spec.rb
	rails_event_store_active_record/Makefile
	rails_event_store_active_record/rails_event_store_active_record.gemspec
	ruby_event_store/Makefile
	ruby_event_store/lib/ruby_event_store/client.rb
	ruby_event_store/spec/subscription_spec.rb
Unique index on events in streams
Our codebase relies on non-duplicated events. Right now it is not
possible to publish the same event twice. But if we add append_to_stream, it
will be possible to have the same event in many streams.

Don't mutate #detect_pkey_index_violated because it has logic which works
on different databases and we can't run mutant across many DBs at the
same time easily (we could have many connections but not worth it imho).
@paneq

Member

paneq commented Sep 27, 2017

@mlomnicki @pawelpacana I want to merge today, please review holistically :)

The idea behind the __global__ stream is that it contains all events, but only once. It might not be needed right now, but when we add the feature to copy events between streams it could be useful to have __global__ without duplicates.

So publishing in stream a would add the event to stream a and to __global__, and linking an event into stream b would only add it to stream b, without adding it again to __global__. Does that make sense?
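A hedged sketch (hypothetical helper names) of the rule described above:

    GLOBAL_STREAM = "__global__".freeze   # the checklist proposes renaming this to "all"

    # publishing writes the event to its own stream and, exactly once, to the global stream
    def publish(event, stream_name:)
      append_to_stream(event, stream_name)
      append_to_stream(event, GLOBAL_STREAM)
    end

    # linking an already-published event only adds the new stream entry
    def link(event, to:)
      append_to_stream(event, to)
    end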

@paneq

Member

paneq commented Sep 27, 2017

So far I was not sure how we could introduce a 3rd table, streams, with name and position and a unique index on name, that could work in :any mode and not have a race condition on creating the record. This is an actual problem we had in another system, where 2 background jobs tried to write the first event to the same stream.
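For illustration, a hedged sketch of the race in question and a common mitigation (Stream is a hypothetical model backed by a unique index on name):

    # Two concurrent jobs both see no stream record and both try to create it;
    # a DB-level unique index turns the loser's INSERT into an error we can retry.
    def find_or_create_stream(name)
      Stream.find_or_create_by!(name: name)
    rescue ActiveRecord::RecordNotUnique
      retry
    end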

@paneq paneq merged commit b81796d into master Sep 27, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

@paneq paneq deleted the locking_friendly branch Sep 27, 2017

@@ -87,7 +97,7 @@ def enrich_event_metadata(event)
metadata[:timestamp] ||= clock.()
metadata.merge!(metadata_proc.call || {}) if metadata_proc
event.class.new(event_id: event.event_id, metadata: metadata, data: event.data)
# event.class.new(event_id: event.event_id, metadata: metadata, data: event.data)

@pawelpacana

pawelpacana Oct 10, 2017

Member

what's up with this comment?

@paneq

paneq Oct 12, 2017

Member

Changing metadata inside the instance vs a new instance with new metadata. Keeping the old behavior was not possible during the refactorings. A separate task would be to bring back the old behavior if still wanted, but test it properly; it was not.
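A hedged sketch of the non-mutating variant described here (it mirrors the diff above; not necessarily the final code):

    def enrich_event_metadata(event)
      metadata = event.metadata.dup                  # work on a copy, not the caller's hash
      metadata[:timestamp] ||= clock.()
      metadata.merge!(metadata_proc.call || {}) if metadata_proc
      event.class.new(event_id: event.event_id, metadata: metadata, data: event.data)
    end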

end
end
it 'reads events different uuid object but same content' do

@pawelpacana

pawelpacana Oct 10, 2017

Member

I don't quite get the description about different uuid object

@paneq

paneq Oct 12, 2017

Member

I wasn't clear at all here. We provide a different string instance containing the same UUID. This makes it possible to establish whether we need ==, eql?, or equal? comparison logic. So instead of using the same variable, we use a different string variable with identical content.
