Skip to content

Conversation

@peterbroadhurst
Copy link
Contributor

@peterbroadhurst peterbroadhurst commented Mar 13, 2022

Set of incremental test+fix changes, based on analysis of flame graphs against runs.

  • fce2ea6
    • Use multiple value insert for pins.
  • 041005c
    • Increase the message writer timeout, accepting 50ms latency vs. benefits of more writes per commit
  • 4e0a86d
    • Delay individual events for up to 250ms when read-ahead is enabled, to reduce polling overhead
  • 5c4d55b
    • Better debug logging for cache misses
  • 364e8ae
    • Only query id+sequence when reading pages in batch manager. The full message is almost always in the cache when we're assembling a batch, so there's no need to query the full message+datarefs from the database.
  • 33c244c
    • Prevent new msg cache overwritting batch cache of message. We have to write the cache first before the DB update, as we have no idea how quickly the batch assembly will happen - and we can overwrite its updates if we do it after
  • a0232b7
    • Better debug for cases when we have blocking due to missed pins
  • 8d6be55
    • Update message sequence from query of just the IDs+sequences
  • 8f63fc9
    • Add a minimum poll time to the batch manager, so it doesn't thrash rewinding as commits come in
  • 21789a0
    • The log formatter we have (which is great for other reasons) does a regex on every message, if you don't have a prefix key - so set one to the node name
  • ff877bd
    • Cache first nodes for each parent org to avoid DB queries on every message
  • d1dea98
    • Significant changes to how batch persistence works, to optimize the DB queries. Means that under normal circumstances, the sender only writes once (in batch assembly), and the receiver only writes once (in batch persist).
  • edfaede
    • One DB operation to update all the pins within a batch, using a list of Batch+ID combinations
  • 7f1fb77
    • Enhancement to include the nonce in the pins array on a message (reducing the max topics for each message to 10), to aid debugging in the case of stuck contexts
  • 0d5b479
    • Commit offset updates to the DB asyncronously from the polling loop
    • Aggregator batch size to match input batch size
    • Fix descending sorting by default on created time to match confirmed descending sorting (which takes precedenced)
  • 80ed638
    • Write back to the DB when the broadcast dispatch adds public references to blobs

Instead we now just query the ID+Sequence, and then read the rest from the cache

Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
@peterbroadhurst peterbroadhurst changed the title Only query id+sequence when reading pages in batch manager Performance enhancements from flame graph analysis Mar 14, 2022
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
@codecov-commenter
Copy link

Codecov Report

Merging #599 (80ed638) into main (fa21655) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##              main     #599      +/-   ##
===========================================
- Coverage   100.00%   99.98%   -0.02%     
===========================================
  Files          306      306              
  Lines        18018    18185     +167     
===========================================
+ Hits         18018    18182     +164     
- Misses           0        2       +2     
- Partials         0        1       +1     
Impacted Files Coverage Δ
internal/data/message_writer.go 100.00% <ø> (ø)
internal/database/sqlcommon/data_sql.go 100.00% <ø> (ø)
pkg/fftypes/batch.go 100.00% <ø> (ø)
pkg/fftypes/pin.go 100.00% <ø> (ø)
cmd/firefly.go 100.00% <100.00%> (ø)
internal/batch/batch_manager.go 100.00% <100.00%> (ø)
internal/batch/batch_processor.go 100.00% <100.00%> (ø)
internal/broadcast/manager.go 100.00% <100.00%> (ø)
internal/config/config.go 100.00% <100.00%> (ø)
internal/data/data_manager.go 100.00% <100.00%> (ø)
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fa21655...80ed638. Read the comment docs.

@peterbroadhurst
Copy link
Contributor Author

I do not believe the WS Heartbeat code coverage drop is related to my changes:
image

bp.conf.BatchMaxSize = 1
bp.conf.txType = fftypes.TransactionTypeUnpinned

mockRunAsGroupPassthrough(mdi)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed whenever this helper came into existence, but we probably need it in every package 😃

}

type iPinCollection interface {
// InsertPins - Inserts a list of pins - fails if they already exist, so caller can fall back to UpsertPin individually
Copy link
Contributor

@awrichar awrichar Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fails if they already exist, so caller can fall back to UpsertPin individually

This is a useful bit of information - is it relevant on some of the other multi-insert calls added recently?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't before, but it is now after this PR 👍

Copy link
Contributor

@awrichar awrichar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good - couldn't see any major problems here. Great to see things getting to this level of optimization.

Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
@peterbroadhurst peterbroadhurst merged commit 3d0ab63 into hyperledger:main Mar 15, 2022
@peterbroadhurst peterbroadhurst deleted the msg-state-pages branch March 15, 2022 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants