Skip to content
This repository has been archived by the owner on Jul 16, 2021. It is now read-only.

Read mutations via low and high watermarks #1045

Merged
merged 20 commits into from
Oct 4, 2018

Conversation

gdbelvin
Copy link
Contributor

@gdbelvin gdbelvin commented Sep 20, 2018

  • Define an API for reading mutations from the queue by timestamp range.
  • Store high watermark in the MapRoot.
  • Retrieve next set of mutations via the range (CurrentMapRoot.HighWatermark, CurrentHighWatermark]

@gdbelvin gdbelvin added the blocked PR cannot be merged until another PR is merged. label Sep 20, 2018
@gdbelvin gdbelvin changed the title F/retry/queue Read mutations via low and high watermarks Sep 20, 2018
@codecov
Copy link

codecov bot commented Sep 20, 2018

Codecov Report

Merging #1045 into master will decrease coverage by 1.08%.
The diff coverage is 64.34%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1045      +/-   ##
==========================================
- Coverage   66.25%   65.16%   -1.09%     
==========================================
  Files          39       39              
  Lines        2670     2733      +63     
==========================================
+ Hits         1769     1781      +12     
- Misses        587      632      +45     
- Partials      314      320       +6
Impacted Files Coverage Δ
impl/sql/mutationstorage/mutations.go 60.78% <ø> (ø) ⬆️
impl/integration/env.go 73.21% <ø> (-0.56%) ⬇️
core/integration/monitor_tests.go 76.05% <50%> (-0.66%) ⬇️
core/sequencer/server.go 63.54% <58.69%> (-0.66%) ⬇️
impl/sql/mutationstorage/queue.go 54.87% <66.66%> (-9.13%) ⬇️
core/sequencer/sequencer.go 42.85% <69.23%> (-37.54%) ⬇️
core/integration/client_tests.go 84.37% <72.72%> (+0.67%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f2f212b...e38d0a0. Read the comment docs.

@@ -25,6 +25,12 @@ package google.keytransparency.sequencer;
import "google/protobuf/empty.proto";
import "v1/keytransparency.proto";

message MapMetadata {
// high_watermark is the highest timestamp (inclusive) of the source log that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What units are the timestamp in? (Hopefully something strictly monotonic...)

Can there be multiple entries with identical timestamps? If so, presumably the high watermark encompasses all of them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamps are in nanoseconds from the epoch (January 1, 1970 UTC).

Added tests in sql/queue_test.go for

  • Timestamp uniqueness - enforced by Time being part of the primary key
  • Monotonicity - enforced by querying for MAX(Time) and inserting in a single transaction.

core/sequencer/sequencer_api.proto Outdated Show resolved Hide resolved
core/sequencer/server.go Outdated Show resolved Hide resolved
core/sequencer/server.go Show resolved Hide resolved
core/sequencer/server.go Show resolved Hide resolved
core/sequencer/server.go Show resolved Hide resolved
impl/sql/mutationstorage/mutations.go Outdated Show resolved Hide resolved
}

// Read mutations
batch, err := s.queue.ReadQueue(ctx, in.DomainId, low.HighWatermark, high)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CT sequencers (both old and new) have a guard window to avoid sequencing things that are super-recent; I'm not sure exactly what problem that was intended to solve, but probably worth investigating to seem if the same concern applies here.

impl/sql/mutationstorage/queue_test.go Show resolved Hide resolved
core/integration/client_tests.go Outdated Show resolved Hide resolved
core/sequencer/server.go Outdated Show resolved Hide resolved
Ensure that queue.Send fails with a retryable error code.
Improve MapMetadata by defining both low and high timestamps.

Also reuse MapMetadata in the CreateEpoch request.
* Create a metadata proto to describe the low and high watermarks for a
map revision.
* Save this metadata proto in the SetLeaves API
* Use this metadata proto to perform reads during CreateEpoch.
Rework PeriodicallyRunBatchForAllDomains into PeriodicallyRun.
This makes the function useful for all kinds of uses.

Move calling PeriodicallyRun from Env to the test cases themselves.
Some tests want tight control over the sequencer, while other tests
want it to be automatically running in the background.
Now that we've from accepting mutation directly in the CommitBatch API
to using Metadata watermark references into the mutation queue, we need
to actualy use the KT front end for queueing mutations in all tests.
@gdbelvin
Copy link
Contributor Author

gdbelvin commented Oct 1, 2018

@daviddrysdale, many thanks for your review. As a result, I've added a bunch of unit-tests and reworked the CreateEpoch function to be metadata / watermark driven.

This had some follow on effects as a few integration tests were "cheating". Tests that wanted tight control over what went in a particular revision were sending their mutations directly to CreateEpoch, skipping the queue entirely. These tests now use the queue like everyone else, and responsibility for running RunBatch has been moved inside the individual test functions which may call PeriodicallyRun or RunBatch as they need.

We are not reusing the prepared statements, and thus are not deriving
any benefit from them.
Copy link

@daviddrysdale daviddrysdale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I've skipped the _test.go files on this pass in order to get feedback out sooner.

cmd/keytransparency-sequencer/main.go Show resolved Hide resolved
core/sequencer/sequencer.go Outdated Show resolved Hide resolved
core/sequencer/sequencer.go Show resolved Hide resolved
core/sequencer/sequencer.go Outdated Show resolved Hide resolved
core/sequencer/sequencer.go Show resolved Hide resolved
impl/sql/mutationstorage/queue.go Outdated Show resolved Hide resolved
impl/sql/mutationstorage/queue.go Outdated Show resolved Hide resolved
impl/sql/mutationstorage/queue.go Outdated Show resolved Hide resolved
impl/sql/mutationstorage/queue.go Show resolved Hide resolved
impl/sql/mutationstorage/queue.go Outdated Show resolved Hide resolved
* master:
  Return error code for old requests. (google#1052)
@gdbelvin
Copy link
Contributor Author

gdbelvin commented Oct 3, 2018

Addressed feedback so far (I think). PTAL

Copy link

@daviddrysdale daviddrysdale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(GitHub UI is getting confused for me, so I'm going to 'submit' what I've got so far then look some more...)

core/sequencer/sequencer.go Show resolved Hide resolved
core/sequencer/server.go Outdated Show resolved Hide resolved
core/sequencer/server.go Show resolved Hide resolved
impl/sql/mutationstorage/queue.go Show resolved Hide resolved
core/sequencer/server.go Outdated Show resolved Hide resolved
impl/sql/mutationstorage/queue.go Show resolved Hide resolved
impl/sql/mutationstorage/queue_test.go Show resolved Hide resolved
impl/sql/mutationstorage/queue_test.go Outdated Show resolved Hide resolved
impl/sql/mutationstorage/queue_test.go Show resolved Hide resolved
* master:
  Cleanup tink errors (google#1055)
  Generate consistency proofs in test vectors (google#1053)
  Merge Tink API changes (google#1054)
@gdbelvin gdbelvin merged commit fb5cf0e into google:master Oct 4, 2018
@gdbelvin gdbelvin deleted the f/retry/queue branch October 4, 2018 17:15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
for review PR is ready for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants