Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Populate channels map in batch #3470

Merged
merged 2 commits into from Jul 10, 2021
Merged

Populate channels map in batch #3470

merged 2 commits into from Jul 10, 2021

Conversation

nzpr
Copy link
Collaborator

@nzpr nzpr commented Jul 9, 2021

Overview

Channels map population is taking long time because each hash is committed separately.
This PR introduces batch write. Preliminary results show that it significantly reduces replay time. (15sec -> 9 sec)

Notes

This is needed just for transitional period before channels map is removed completely.

Please make sure that this PR:

Bors cheat-sheet:

  • bors r+ runs integration tests and merges the PR (if it's approved),
  • bors try runs integration tests for the PR,
  • bors delegate+ enables non-maintainer PR authors to run the above.

@nzpr nzpr requested review from zsluedem and tgrospic July 9, 2021 11:22
@nzpr nzpr added this to In progress in Core team via automation Jul 9, 2021
@nzpr nzpr added this to the Block merge milestone Jul 9, 2021
Copy link
Collaborator

@tgrospic tgrospic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

val storeChannels = Stream
.emits(actions.map(a => Stream.eval(storeChannelHash(a).map(_.asLeft[History[F]]))))
.parJoinProcBounded
val storeChannels = storeChannelHash(actions).map(_.asLeft[History[F]])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this leftover .map(_.asLeft[History[F]])?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is, will remove.

continuationHash = hashContinuationsChannels(channels, sc)
} yield (eventKey, ContinuationHash(continuationHash))

conts.toList.traverse(convert).flatMap(store.put)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorted keys can also speed up insertion.
https://stackoverflow.com/a/62980373

In LMDB by default the keys are compared lexically. ByteString has also lexical ordering defined. Maybe it's worth to test.
https://stackoverflow.com/questions/39387702/sorting-an-lmdb-file-for-sequential-access-according-to-key-order
http://104.237.133.194/doc/group__mdb.html#ga68e47ffcf72eceec553c72b1784ee0fe

@tgrospic tgrospic removed their assignment Jul 9, 2021
@tgrospic tgrospic moved this from In progress to Review in progress in Core team Jul 9, 2021
Copy link
Collaborator

@zsluedem zsluedem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -65,7 +65,7 @@ object ChannelStoreImpl {
joinHash = hashJoinsChannel(channel, sc)
} yield (eventKey, DataJoinHash(dataHash, joinHash))

channels.toList.traverse(convert).flatMap(store.put)
channels.toList.traverse(convert).flatMap(kvs => store.put(kvs.sortBy({ case (k, _) => k })))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tgrospic is this what you meant by sorting?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, should we make sorting default on put API method of KVStore?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not be the same sorting as LMDB does internally because it uses lexical comparison.

// Get lexical Ordering for ByteString
implicit val bsComparator: Comparator[ByteString] = ByteString.unsignedLexicographicalComparator()
// ... then use key as ByteString for sorting
...flatMap(kvs => store.put(kvs.sortBy({ case (k, _) => k.toByteString })))

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm assuming that ByteString lexical comparison is the same as in LMDB.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This trick did not make any observable difference, might be it require more investigation.

@nzpr
Copy link
Collaborator Author

nzpr commented Jul 10, 2021

bors merge

bors bot added a commit that referenced this pull request Jul 10, 2021
3470: Populate channels map in batch r=nzpr a=nzpr



Co-authored-by: nutzipper <1746367+nzpr@users.noreply.github.com>
@bors
Copy link
Contributor

bors bot commented Jul 10, 2021

Build failed:

@nzpr
Copy link
Collaborator Author

nzpr commented Jul 10, 2021

bors merge

@bors
Copy link
Contributor

bors bot commented Jul 10, 2021

Build succeeded:

@bors bors bot merged commit 9cfb895 into rchain:dev Jul 10, 2021
Core team automation moved this from Review in progress to Done Jul 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants