
Optimistic Locking friendly changes #86

Merged
merged 35 commits into master from locking_friendly on Sep 27, 2017

Conversation

@paneq (Member) commented Aug 29, 2017

It started here: https://github.com/RailsEventStore/aggregate_root/issues/8

3 possible ways of writing (a usage sketch follows the three modes below):

Number

  • :none - works like -1: assumes the stream is empty so far and starts adding new events to the stream
  • a given number N in -1..Infinity - assumes the stream is at version N and starts adding at N+1, N+2, ...

Should work well for

  • non-legacy
  • event sourcing scenario
  • transaction around not required

Auto

  • :auto - assumes a lock is taken in a higher layer. Will query for the last position N and start writing N+1, N+2, ... etc. [new mode]
  • good for legacy
  • requires a surrounding transaction and lock

Any

  • :any - NULLable position in the stream, order unknown
  • For copying into highly-contended technical streams where we don't really care about the exact position
  • Best-guess order can be determined based on EventInStream auto-increment id.
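A minimal usage sketch of the three modes against the append_to_stream(events, stream_name, expected_version) API from this PR; the event class, repository variable, and stream names below are illustrative:

  # OrderPlaced and the stream names are illustrative, not from this PR.
  events = [OrderPlaced.new, OrderPlaced.new]

  # :none - the stream is expected to be empty; the events get positions 0 and 1.
  repository.append_to_stream(events, "Order$42", :none)

  # Explicit number - the stream is assumed to be at version 1, so the new
  # events get positions 2 and 3; a concurrent writer assuming the same
  # version fails with a uniqueness violation.
  repository.append_to_stream(events, "Order$42", 1)

  # :auto - the repository queries the last position itself; the caller is
  # expected to wrap the call in a transaction and hold a lock in a higher layer.
  repository.append_to_stream(events, "Order$42", :auto)

  # :any - position stays NULL; the exact order in the stream is not tracked.
  repository.append_to_stream(events, "SalesReport2017", :any)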

This is just a spike that I wanted to share with you.

What do you think @mpraglowski @pawelpacana @mlomnicki ?

Checklist

  • Mysql
  • Postgres
  • Re-enable mutation testing
  • Verbose / non-verbose mode of running tests
  • Change migration to be DB-dependent and use string for Mysql UUID
  • Discuss 3 tables vs 2 tables
  • Fix enrich_event_metadata to not edit metadata directly but rather a cloned/duped version.
  • GLOBAL_STREAM is all instead of __global__ and it should only write there once.
  • Consider global linearization with a very quick lock on write...
  • Make sure AggregateRoot can handle conflicts
  • Should event be unique in a stream?

I collected the work from across all gems.

https://github.com/RailsEventStore/aggregate_root/issues/8

It is missing required Travis changes to run
rails_event_store_active_record on Mysql and Postgres
@paneq (Member Author) commented Aug 29, 2017

Right now it is missing the required Travis changes to run rails_event_store_active_record on Mysql and Postgres.

I considered using `prepend` to initialize
@unpublished_events = []
instead of using the lazy load pattern:

  def unpublished_events
    @unpublished_events ||= []
  end

but I decided the overhead is not worth it.
Although we could use it to set version to -1 as well.

I will create a separate issue for it.
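For context, a minimal sketch of that prepend approach (module and class names here are illustrative, not part of this PR):

  module EagerAggregateDefaults
    # Because the class below prepends this module, this #initialize runs
    # first and then reaches the class' own #initialize via super.
    def initialize(*)
      @unpublished_events = []
      @version = -1
      super
    end
  end

  class Order
    prepend EagerAggregateDefaults

    def initialize(id)
      @id = id
    end
  end

  Order.new(42).instance_variable_get(:@unpublished_events) # => []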
@@ -1,26 +1,55 @@
if ENV['CODECLIMATE_REPO_TOKEN']
Member: not needed anymore. We got rid of CC.

t.text :data, null: false
t.integer :position, null: true
if ENV['DATABASE_URL'].start_with?("postgres")
t.references :event, null: false, type: :uuid
Member: Do we actually need the :uuid? Maybe :string would suffice? With :string we don't have to enable pgcrypto.

Member Author: And what's wrong with pgcrypto?

Member: Nothing wrong. It's not enabled by default. Are we going to force users to enable it even though it's not really needed?

Member Author: Yep :)

Member: 🤔

@mlomnicki (Member) commented:

Excellent job @paneq! This is quite a big PR. Could we maybe split it into smaller chunks?

The checklist already contains some bits that could be extracted to separate PRs (mysql/postgres, fix enrich_event_metadata, etc.)

I would also open separate PRs for the following:

  • ability to publish multiple events
  • Event#hash
  • Discuss - do we have to depend on activerecord-import?

Makes sense?

event
def append_to_stream(events, stream_name, expected_version)
events = [*events]
expected_version = case expected_version
Contributor: this should be in its own private method. It makes append_to_stream very hard to reason about otherwise, and it also violates the single responsibility principle.

when :none
-1
when :auto
eis = EventInStream.where(stream: stream_name).order("position DESC").first
Contributor: we should use a repository to access this

Contributor: it should also be in its own private method

Member Author: The code we are writing here is the repository :)

Contributor: it's the event repository, not an EventInStream repository. I would make a clear distinction.

expected_version
end
in_stream = events.flat_map.with_index do |event, index|
position = if expected_version == :any
Contributor: extract into its own method

else
expected_version + index + 1
end
Event.create!(
Contributor: should use the adapter

Member Author: I don't see the point of this adapter anymore now that the repository needs to work with 2-3 ActiveRecord classes instead of 1.

Contributor: I think it makes it way harder to rewrite this repository for another ORM if it is a mix of multiple Active Record classes instead of a single-responsibility repository which maps to one Active Record class.

end
EventInStream.import(in_stream)
self
rescue ActiveRecord::RecordNotUnique
Contributor: this class should know nothing about active record

Member Author: Why would rails_event_store_active_record/event_repository.rb need to know nothing about active record when it actively uses active record to implement the needed features?

Contributor: nvm, didn't see we are in active record namespace :)

stream: "__global__",
position: nil,
event_id: event.event_id
)]
Contributor: a better indentation technique would be easier to read IMO

metadata: event.metadata,
event_type: event.class,
)
[EventInStream.new(
Contributor: should be extracted into its own method as well

@@ -17,75 +54,75 @@ def delete_stream(stream_name)
end

def has_event?(event_id)
adapter.exists?(event_id: event_id)
Event.exists?(id: event_id)
Contributor: should use the adapter

end

def last_stream_event(stream_name)
build_event_entity(adapter.where(stream: stream_name).last)
build_event_entity(
EventInStream.preload(:event).where(stream: stream_name).order('position DESC, id DESC').first
Contributor: should use the EventInStream repository, as in the earlier comment

unless start_event_id.equal?(:head)
starting_event = adapter.find_by(event_id: start_event_id)
stream = stream.where('id > ?', starting_event)
def read_events_forward(stream_name, after_event_id, count)
Contributor: same comments about adapters

@paneq (Member Author) commented Aug 30, 2017

The checklist already contains some bits that could be extracted to separate PRs

I agree. It's more so that I don't forget about those things than to actually include them in this PR.

@paneq (Member Author) commented Aug 30, 2017

ability to publish multiple events

Well... We could add such functionality in a separate PR without releasing a new version, because publishing multiple events using the old code is still going to be completely broken from an optimistic locking perspective. But maybe a separate PR for extending the API is a good idea? Dunno.

@paneq (Member Author) commented Aug 30, 2017

Event#hash

No brainer. Feel free to extract and merge. Very simple. Related mutant discussion https://twitter.com/pankowecki/status/902521997573455877

@paneq (Member Author) commented Aug 30, 2017

Discuss - do we have to depend on activerecord-import?

Ideally not, but I didn't want to write all the code myself for doing a single INSERT with multiple records. It might not be relevant if we migrate to the transactions and 3-tables approach.

@paneq (Member Author) commented Aug 30, 2017

@gottfrois Thanks for the comments about extracting methods etc. Right now, however, this is totally Work In Progress, in which I am trying to find out if what we are trying to achieve is possible within the constraints that I described. I am still not exactly sure, and there are bigger decisions to be made (3 vs 2 tables), handling concurrency, etc. So once I am happy with the big decisions, I am gonna focus on refactoring the smaller parts.

@paneq (Member Author) commented Aug 30, 2017

@gottfrois Also, I am trying to reach a state in which this can be merged to master, unreleased. And then everyone can easily improve the codebase with proper tests until we are satisfied with the results. And then we could do an official release. So I am gonna dismiss the review right now. But that does not mean I don't agree. I just want to handle those kinds of things later :)

@gottfrois (Contributor) commented:

no hard feelings :)

@paneq paneq dismissed gottfrois’s stale review August 30, 2017 08:24

We will deal with those changes later :)

@mlomnicki (Member) commented:

Event#hash

No brainer.

Apparently mere developers such as me don't understand it :) I.e. how is this change related to optimistic locking? Why is the big number written in binary notation? Why do we even need to override Event#hash, etc.?

@paneq (Member Author) commented Aug 30, 2017

Apparently mere developers such as me don't understand it :) I.e. how is this change related to optimistic locking?

In one race condition test I wanted to guarantee that all events are returned but the order is not known, so I used Array#to_set and checked if two sets are equal. That failed. Which brings us to your next question:

Why do we even need to override Event#hash, etc.?

Because our Event has an == operator which makes it behave like a Value Object (ignoring metadata). But value objects which are equal in the == sense should also return the same hash, which is used when you put such an object into a Hash or a Set.

       expect({
         klass.new(event_id: "doh") => :YAY
       }[ klass.new(event_id: "doh") ]).to eq(:YAY)
       expect(Set.new([
         klass.new(event_id: "doh")
       ])).to eq(Set.new([klass.new(event_id: "doh")]))

These two are failing without implementing hash

Why is the big number written in binary notation?

It's easier that way to see how many bits it has.
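For illustration, a hedged sketch of a hash override consistent with such a value-object ==; the class, the fields compared, and the binary constant are illustrative, not necessarily the PR's exact implementation:

  class MyEvent
    # A large constant mixed into the hash; writing it in binary makes its
    # bit width obvious. The value itself is illustrative.
    BIG_VALUE = 0b1100100100011011101000101011001

    attr_reader :event_id, :data

    def initialize(event_id:, data: {})
      @event_id = event_id
      @data     = data
    end

    # Value-object equality, ignoring metadata.
    def ==(other)
      other.instance_of?(self.class) &&
        other.event_id == event_id &&
        other.data == data
    end
    # Hash and Set lookups go through eql?/hash, so keep them in sync with ==.
    alias_method :eql?, :==

    def hash
      [self.class, event_id, data].hash ^ BIG_VALUE
    end
  end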

@mlomnicki (Member) commented:

Wow that's a proper explanation. Much appreciated, thanks!!

Simple test to check that #has_event? still returns true when given a different string with the same content.
This changes the original logic, but it makes sense imho because we now have the concept of an event belonging to multiple streams.
We now have 2 (it can later be 3) classes responsible for storing events. I don't see much point in this adapter anymore.
@paneq (Member Author) commented Aug 30, 2017

Regarding mutations in rails_event_store_active_record... I don't think it makes sense to run them on postgres or mysql. Probably I should bring back Sqlite into the testing suite and mutate only on that.

@mlomnicki mentioned this pull request Aug 31, 2017
@paneq (Member Author) commented Sep 9, 2017

Today I realized that ideally I would want to come up with a DB schema that allows for using the Event Store as a queue (#106), and since I don't want to keep changing the DB schema constantly, I kind of coupled both problems together.

Kill lovely mutations.
I am not sure if `equal` will always work, but I think it will as two
identical symbols must be the same object, no matter how constructed.
1.

`preload()` only has observable performance effects, so mutating it does not
provide any benefit, and I am not sure how we would kill that mutation anyway.

I was thinking about adding in build_event_entity(record) a check like

raise "use preload()" unless record.association(:event).loaded?

but then how do we kill mutations around this ^^ check if it does not affect the
external API in any way?

2.

where('id < ?', before_event) automatically substitutes the record's id, thus
killing a few mutations.
I was wrong in bc51e52

We can use the number of DB queries to verify preloading.
It is not part of the linted spec, but an additional test of a particular
implementation.
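A hedged sketch of what such a query-count check could look like in a spec; the helper, the repository call, and the expected count are illustrative:

  # Count SQL statements issued inside a block via Active Record's
  # instrumentation, ignoring cached and schema queries.
  def count_queries(&block)
    count = 0
    counter = ->(_name, _start, _finish, _id, payload) do
      count += 1 unless %w[CACHE SCHEMA].include?(payload[:name])
    end
    ActiveSupport::Notifications.subscribed(counter, "sql.active_record", &block)
    count
  end

  # With preload(:event) in place there should be no extra query per event.
  expect(count_queries { repository.read_stream_events_forward("stream") }).to eq(2)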
It does not matter if we shift all positions by 1, 2, 3, etc. We use 1 so
that the first event in a stream is recorded with nr 0 :)
I had to remove_index :event_store_events_in_streams, [:stream, :position]
because it was used by default when fetching the records, and even without an
explicit order the results came back in the order defined by that index.
Surprise.
@paneq (Member Author) commented Sep 25, 2017

@pawelpacana @mlomnicki before we merge this I have a few questions. Let's start with the most important one:

  • add_index :event_store_events_in_streams, [:stream, :event_uuid], unique: true - Do you think we need it? I think it makes sense. A given event should appear in a stream once and only once. read_events_forward and other similar methods basically rely on that fact, because they find the position based on the event id, and if there were two events with the same id in a stream then we could get random results and failures.

@mlomnicki (Member) commented:

@paneq yep, defo makes sense 👍

Conflicts:
	aggregate_root/spec/aggregate_root_spec.rb
	rails_event_store_active_record/Makefile
	rails_event_store_active_record/rails_event_store_active_record.gemspec
	ruby_event_store/Makefile
	ruby_event_store/lib/ruby_event_store/client.rb
	ruby_event_store/spec/subscription_spec.rb
Our codebase relies on non-duplicated events. Right now it is not
possible to publish the same event twice. But if we add append_to_stream, it
will be possible to have the same event in many streams.

Don't mutate #detect_pkey_index_violated because it has logic which works
on different databases, and we can't easily run mutant across many DBs at the
same time (we could have many connections, but it's not worth it imho).
@paneq (Member Author) commented Sep 27, 2017

@mlomnicki @pawelpacana I want to merge today, please review holistically :)

The idea behind the __global__ stream is that it contains all events, but each only once. It might not be needed right now, but when we add the feature to copy events between streams it could be useful to have __global__ without duplicates.

So publishing in stream a would add the event to stream a and __global__, while linking an event into stream b would only add it to stream b, without adding it again to __global__. Does that make sense?
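To illustrate that intent with the EventInStream rows involved (the linking call does not exist yet, so the second half is a sketch of the planned behavior):

  # Publishing event e1 to stream "a" creates two rows, matching the
  # position: nil write to "__global__" visible in the diff above:
  #   { stream: "a",          event_id: e1.event_id, position: 0 }
  #   { stream: "__global__", event_id: e1.event_id, position: nil }
  #
  # Linking e1 into stream "b" later (future API) would add only:
  #   { stream: "b",          event_id: e1.event_id, position: nil }
  # and would not touch "__global__" again, keeping it duplicate-free.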

@paneq (Member Author) commented Sep 27, 2017

So far I have not been sure how we could introduce a 3rd table, streams, with name and position and a unique index on name, that could work in :any mode and not have a race condition on creating the record. This is an actual problem we had in another system, where 2 background jobs tried to write the first event to the same stream.

@paneq paneq merged commit b81796d into master Sep 27, 2017
@paneq paneq deleted the locking_friendly branch September 27, 2017 14:49
@@ -87,7 +97,7 @@ def enrich_event_metadata(event)
metadata[:timestamp] ||= clock.()
metadata.merge!(metadata_proc.call || {}) if metadata_proc

event.class.new(event_id: event.event_id, metadata: metadata, data: event.data)
# event.class.new(event_id: event.event_id, metadata: metadata, data: event.data)
Member: what's up with this comment?

Member Author: Changing metadata inside the instance vs a new instance with new metadata. Keeping the old behavior was not possible during the refactorings. A separate task would be to bring back the old behavior if it is still wanted, but test it properly. It was not tested properly before.
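A hedged sketch of the checklist variant above ("not edit metadata directly but rather a cloned/duped version"), reusing the lines visible in the diff; the .dup is the assumed addition:

  def enrich_event_metadata(event)
    # Work on a copy so the caller's event instance keeps its original metadata.
    metadata = event.metadata.dup
    metadata[:timestamp] ||= clock.()
    metadata.merge!(metadata_proc.call || {}) if metadata_proc

    # Return a fresh event instance carrying the enriched metadata.
    event.class.new(event_id: event.event_id, metadata: metadata, data: event.data)
  end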

end
end

it 'reads events different uuid object but same content' do
Member: I don't quite get the description about different uuid object.

Member Author: I wasn't clear at all here. We provide a different string instance containing the same UUID. This makes it possible to establish whether we need ==, eql?, or equal? comparison logic. So instead of using the same variable, we use a different string variable with identical content.
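For illustration, the distinction that "different string instance, same content" is probing (the UUID value below is made up):

  uuid = "b2d506fd-409d-4ec7-b02f-c6d2295c7edd"   # made-up value
  a = String.new(uuid)   # two distinct String objects...
  b = String.new(uuid)   # ...with identical content

  a == b       # => true  - value equality
  a.eql?(b)    # => true  - hash-key equality
  a.equal?(b)  # => false - not the same object

  # A spec that looks an event up via a freshly built string with the same
  # UUID only passes if the lookup compares by value, not by object identity.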
