Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Java client to asynchronous RPCs. #1511

Merged
merged 4 commits into from
Feb 20, 2016
Merged

Conversation

enisoc
Copy link
Member

@enisoc enisoc commented Feb 17, 2016

@erzel

To get the old synchronous behavior, append .checkedGet() to a call
that returns SQLFuture. For example:

Cursor cursor = vtgateConn.execute(...).checkedGet();

Review on Reviewable

@erzel
Copy link
Contributor

erzel commented Feb 19, 2016

Reviewed 38 of 38 files at r1.
Review status: all files reviewed at latest revision, 5 unresolved discussions.


java/client/src/main/java/com/youtube/vitess/client/SQLFuture.java, line 27 [r1] (raw file):
It's a little confusing here to say that you don't implement that interface because you do, you just don't declare that you do.
So perhaps rephrase to say: "this class does not declare that it implements the interface"


java/client/src/main/java/com/youtube/vitess/client/SQLFuture.java, line 97 [r1] (raw file):
So you're wrapping the ExecutionException in the SQLException not the underlying SQLException?


java/client/src/main/java/com/youtube/vitess/client/VTGateTx.java, line 42 [r1] (raw file):
Agreed. I'm wondering what are the use-cases for an asynchronous Transaction API. It would simplify this code if we can get away with only allowing a synchronous API.
If we do allow asynchronous calls here I think you should add enforcement for this (e.g. throw an exception if there's already an op in progress). Currently, if the caller issues two queries after a begin() they could end up executed in different MySQL connections.


java/client/src/main/java/com/youtube/vitess/client/VTGateTx.java, line 43 [r1] (raw file):
I'm unclear as to why you need synchronized in the Vitess API methods.


java/grpc-client/src/main/java/com/youtube/vitess/client/grpc/GrpcStreamAdapter.java, line 48 [r1] (raw file):
Did GRPC change their API?


Comments from the review on Reviewable.io

@erzel
Copy link
Contributor

erzel commented Feb 19, 2016

Looks nice. Sorry for the delay in reviewing.


Review status: all files reviewed at latest revision, 5 unresolved discussions.


Comments from the review on Reviewable.io

@enisoc
Copy link
Member Author

enisoc commented Feb 19, 2016

PTAL


Review status: 36 of 38 files reviewed at latest revision, 5 unresolved discussions.


java/client/src/main/java/com/youtube/vitess/client/SQLFuture.java, line 27 [r1] (raw file):
Done.


java/client/src/main/java/com/youtube/vitess/client/SQLFuture.java, line 97 [r1] (raw file):
Yes, although I was torn about whether that's the right thing to do. In the end, I thought more information is better?


java/client/src/main/java/com/youtube/vitess/client/VTGateTx.java, line 42 [r1] (raw file):
Certainly for commit() I could imagine wanting to do other work while waiting for the server to get back to you. You can also do reads as part of a transaction, and you may want those async too for the same reasons our customers want async queries in general. Since the underlying RpcClient interface is async anyway, the alternative is to have VTGateTx call checkedGet() for you. I'd rather leave the option in the user's hands.

I agree it would be nice to enforce the in-flight operation limit. Originally I was thinking the second attempted operation would block until the previous one finished, but that seemed too complicated and prone to deadlock. I like your suggestion to just throw an exception to say that's not allowed.


java/client/src/main/java/com/youtube/vitess/client/VTGateTx.java, line 43 [r1] (raw file):
The session field gets updated by the async callback, which could be in a different thread. In order for the next call to execute (for example) to be guaranteed to see the new session value, it must be synchronized. Otherwise the Java memory model allows it to see the old value from a thread-local cache.


java/grpc-client/src/main/java/com/youtube/vitess/client/grpc/GrpcStreamAdapter.java, line 48 [r1] (raw file):
Yes I updated the Maven dependency to grpc-java 0.12 to match the rest of our project being updated to grpc 0.12.


Comments from the review on Reviewable.io

@erzel
Copy link
Contributor

erzel commented Feb 19, 2016

Reviewed 2 of 2 files at r2.
Review status: all files reviewed at latest revision, 5 unresolved discussions.


java/client/src/main/java/com/youtube/vitess/client/SQLFuture.java, line 97 [r1] (raw file):
SGTM


java/client/src/main/java/com/youtube/vitess/client/VTGateTx.java, line 42 [r1] (raw file):
Fair enough. I think you've convinced me about the need for asynchronicity in transactions.


java/client/src/main/java/com/youtube/vitess/client/VTGateTx.java, line 43 [r1] (raw file):
You're right. I was confused about the meaning of synchronized in Java.
Can you just move the word "only" to after the word "synchronized" in the comment to make this slightly clearer.


java/grpc-client/src/main/java/com/youtube/vitess/client/grpc/GrpcStreamAdapter.java, line 48 [r1] (raw file):
LGTM, but please see one last comment.


Comments from the review on Reviewable.io

Approved with PullApprove

To get the old synchronous behavior, append `.checkedGet()` to a call
that returns SQLFuture. For example:

```
Cursor cursor = vtgateConn.execute(...).checkedGet();
```
Because of the session cookie, only one asynchronous call can be
in-flight at a time. Violating that constraint is a bug in the client
app, so we throw an IllegalStateException to prevent race conditions.
@enisoc
Copy link
Member Author

enisoc commented Feb 19, 2016

Review status: all files reviewed at latest revision, 2 unresolved discussions.


java/client/src/main/java/com/youtube/vitess/client/VTGateTx.java, line 43 [r1] (raw file):
Done.


Comments from the review on Reviewable.io

enisoc added a commit that referenced this pull request Feb 20, 2016
Convert Java client to asynchronous RPCs.
@enisoc enisoc merged commit 014aa04 into vitessio:master Feb 20, 2016
@enisoc enisoc deleted the java-async branch February 20, 2016 00:15
dbussink added a commit that referenced this pull request Jan 30, 2023
* boost: midflow upqueries

Signed-off-by: Vicent Marti <vmg@strn.cat>

* WIP

Signed-off-by: Vicent Marti <vmg@strn.cat>

* boost: better midflow upqueries

Signed-off-by: Vicent Marti <vmg@strn.cat>

* boosttest: enable go-mysql-server compat flag

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Restore test that is now passing

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* domain: fix RowSet behavior

RowSet should always be a pointer so we can tell when we're dealing with
a missing rowset

Signed-off-by: Vicent Marti <vmg@strn.cat>

* WIP: Make partial extremums work with eviction

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Revert "Disable MIN() / MAX() support"

This reverts commit ef888fecfe8c2d3123f043c688b63b8bd365cf17.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Handle cases when no upquery is available

This ensures we keep the legacy full replay path working when no
upquery is available (such as for UNION today).

Fixes a whole bunch of tests as well.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* boostplan: cleanup upquery generation

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Fix processing evictions

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Explicitly mark flushes in misses

Not every miss needs to explicitly flush, but some do. For example from
a join but now also from a group by if we have an extremum.

The logic treats it independently from node type so we can also fix this
for cases like TopK in the future.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Fix basic test

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* add upquery support for UNION

Signed-off-by: Andres Taylor <andres@planetscale.com>

* fix UNION planning

Signed-off-by: Andres Taylor <andres@planetscale.com>

* update test expect

Signed-off-by: Andres Taylor <andres@planetscale.com>

* Add GTID executed to each side of the union

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Mark all packets with the correct intermediate upquery flag

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Handle empty results for full materialization on group by

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* flownode: cleanup empty state initialization

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Add GTID tracker for mid flow upqueries

This adds the GTID tracker and cleans up the ignore hack which should no
longer be needed.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* common: extract a generic Map

Signed-off-by: Vicent Marti <vmg@strn.cat>

* integration: use row strings for tests

Signed-off-by: Vicent Marti <vmg@strn.cat>

* integration: add Seed helper

Signed-off-by: Vicent Marti <vmg@strn.cat>

* flownode: proper SUM handling for integrals

Signed-off-by: Vicent Marti <vmg@strn.cat>

* boost: split Protos into individual packages

Signed-off-by: Vicent Marti <vmg@strn.cat>

* packet: Remove VStream specific packet

Signed-off-by: Vicent Marti <vmg@strn.cat>

* topk: handle large evictions in TopK

Signed-off-by: Vicent Marti <vmg@strn.cat>

* proto: extract dataflow package

Signed-off-by: Vicent Marti <vmg@strn.cat>

* sql: rename RowSet

Signed-off-by: Vicent Marti <vmg@strn.cat>

* proto: reduce interface usage

Signed-off-by: Vicent Marti <vmg@strn.cat>

* sizegen: update

Signed-off-by: Vicent Marti <vmg@strn.cat>

* handle grouping of literals better (#1439)

* handle grouping of literals better

Signed-off-by: Andres Taylor <andres@planetscale.com>

* sizegen

Signed-off-by: Andres Taylor <andres@planetscale.com>

* another alternative for literal grouping

Signed-off-by: Andres Taylor <andres@planetscale.com>

Signed-off-by: Andres Taylor <andres@planetscale.com>

* integration: test for full materialization race

Signed-off-by: Vicent Marti <vmg@strn.cat>

* integration: more race tests

Signed-off-by: Vicent Marti <vmg@strn.cat>

* flownode: cleanup data packets in stream processing

Signed-off-by: Vicent Marti <vmg@strn.cat>

* integration: support TRACK_GTID in fake executor

Signed-off-by: Vicent Marti <vmg@strn.cat>

* vstream: wait for startup as to not miss packets

Signed-off-by: Vicent Marti <vmg@strn.cat>

* boost: handle duplicated packets during stream

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Add support for DECIMAL extremums

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* boost: New Domain API (#1442)

* wip: domain API

Signed-off-by: Vicent Marti <vmg@strn.cat>

* boostrpc: GRPC design

Signed-off-by: Vicent Marti <vmg@strn.cat>

* domain: remove unused map

Signed-off-by: Vicent Marti <vmg@strn.cat>

* domain: cleaner API for RPCs during planning

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Allow using primary key column annotation

Small improvement mostly for testing to allow primary key annotation in
the table directly.

We'd never see this style input from a `SHOW CREATE` from MySQL, but for
manual schemas in tests this allows us to pick up the primary key for a
table.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Add support for MIN / MAX on date, datetime and timestamp columns

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* go-mysql-server: Handle YEAR as integer for aggregation

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Implement additional extremums and refactor graph type checking

We add support here for quoted type extremums for min / max and fix our
graph type checking.

With the fixed type checks we now both can finalize in the dummy
migration but also ensure we don't panic but pass through errors
properly.

We don't support SUM() for non numerical types at this point. They do
have defined behavior in MySQL but that is something to add another day.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* mod: upgrade exp/slices

Signed-off-by: Vicent Marti <vmg@strn.cat>

* domain: split domain logic into individual files

Signed-off-by: Vicent Marti <vmg@strn.cat>

* internal/grouped: refactor all the extremum aggregations

Signed-off-by: Vicent Marti <vmg@strn.cat>

* integration: split test files

Signed-off-by: Vicent Marti <vmg@strn.cat>

* dispatch: add TODO for fully materialized nodes

Signed-off-by: Vicent Marti <vmg@strn.cat>

* gtid: add an extra check in the GTID tracker

Signed-off-by: Vicent Marti <vmg@strn.cat>

* controller: Introduce migration target interface

This interface is added so that we can mock more of the controller logic
and run things like domain placements also for the dummy incorporator.
This is needed since domain placements can impact things like replay
paths etc. which result in different behavior that can introduce bugs.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Add check from broken handling of multiple states on single node

Right now things are broken if you have multiple states keyed by
different columns on the same internal node. This causes data writes
after partial materialization to not show and be dropped leading to
inconsistent results.

This is already a problem with Noria as well, but never uncovered there
apparently.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* migration: cleanup dummy migration

Signed-off-by: Vicent Marti <vmg@strn.cat>

* flow: support replacements in fully materialized nodes

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Add additional tests for multiple level upqueries

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* wip

Signed-off-by: Vicent Marti <vmg@strn.cat>

* unify the external base and gtid tracker

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Implement correct packet filtering with tags

The filtering needs to happen where the replay path ends. This is
because there can be branches mid replay path to other nodes that have
not seen the packet yet.

By dropping it at the end of the replay path we ensure that only the
node filters the packet where it surely has been processed.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* track with row offsets

Signed-off-by: Vicent Marti <vmg@strn.cat>

* [wip] with bitsets

Signed-off-by: Vicent Marti <vmg@strn.cat>

* use Seen structs for this stuff

Signed-off-by: Vicent Marti <vmg@strn.cat>

* materialization: better semantics for midflow upqueries

Signed-off-by: Vicent Marti <vmg@strn.cat>

* graphviz: use a sane default font name

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Fix derived table rewriting - copy dependencies (#1497)

* Fix TestMultiLevelUpqueriesPartialMaterialized test

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* Simplify code change

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* better external replay piece tracking

Signed-off-by: Vicent Marti <vmg@strn.cat>

* wip

Signed-off-by: Vicent Marti <vmg@strn.cat>

* unify the upquery paths

Signed-off-by: Vicent Marti <vmg@strn.cat>

* move intermediate as a property of upquery

Signed-off-by: Vicent Marti <vmg@strn.cat>

* fix tests

Signed-off-by: Vicent Marti <vmg@strn.cat>

* replay: do not block forever waiting for slots

Signed-off-by: Vicent Marti <vmg@strn.cat>

* gtid_tracker: clear records, not the packet

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Maintain upqueries when nodes are reused

Before the migration would not have access to the upquery for nodes that
already exist, but it needs those.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* add support for derived tables in upqueries (#1504)

* add support for derived tables in upqueries

Signed-off-by: Andres Taylor <andres@planetscale.com>

* use a marker struct instead of nil when filling out the tables slice

Signed-off-by: Andres Taylor <andres@planetscale.com>

* better default implementations

Signed-off-by: Andres Taylor <andres@planetscale.com>

Signed-off-by: Andres Taylor <andres@planetscale.com>

* boost: add support for double-shuffle (#1505)

* boost: add support for double-shuffle misses

Signed-off-by: Vicent Marti <vmg@strn.cat>

* dbg: simplify the pretty printing

Signed-off-by: Vicent Marti <vmg@strn.cat>

* dataflow: remove trace package

Signed-off-by: Vicent Marti <vmg@strn.cat>

* common: improve printing of memory tables

Signed-off-by: Vicent Marti <vmg@strn.cat>

* integration: this query is now supported!

Signed-off-by: Vicent Marti <vmg@strn.cat>

* go mod tidy

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: Dirkjan Bussink <d.bussink@gmail.com>

* Add basic time expiry for recent GTID sets (#1506)

* Add basic time expiry for recent GTID sets

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* gtid_tracker: use monotonic time

Signed-off-by: Vicent Marti <vmg@strn.cat>

* hack: commit missing file

Signed-off-by: Vicent Marti <vmg@strn.cat>

* dataflow: add a heap for GTID expiry

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Vicent Marti <vmg@strn.cat>

* boost: Remove Bases (#1508)

* wip

Signed-off-by: Vicent Marti <vmg@strn.cat>

* wip

Signed-off-by: Vicent Marti <vmg@strn.cat>

* removal

Signed-off-by: Vicent Marti <vmg@strn.cat>

* remove proto

Signed-off-by: Vicent Marti <vmg@strn.cat>

* boost: rename Source to Root

Signed-off-by: Vicent Marti <vmg@strn.cat>

* flownode: rename 'ExternalBase' to 'Table'

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Remove some panics and add more robust migration (#1511)

* Remove undeeded RowSet checks

This is only created through `NewRowSet` so we don't need to check for
nil here since it can never be nil.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Avoid panics in topo

This logic runs inside the vtgate so it's better to avoid panicing there
if something is up. Instead we propagate the error where needed.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Remove unused code

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Cleanup panics in tests and use t.Fatal instead

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Avoid panic in worker

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Avoid panics in migration

Migration should use errors where possible since we also run these for
linting as well and even in the weirdest edge case we want to use proper
errors.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Don't let requesting JSON marshalling panic

We don't use JSON but if some external request uses it, we should not
panic but error.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Remove panics from controller

This avoids the controller from crashing also in edge cases which is
useful as it runs as part of Singularity as well for query linting.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Revert "Remove panics from controller"

This reverts commit ea7c10f3863a6670a10f600c4396cdf55ff4e448.

* Revert "Avoid panics in migration"

This reverts commit 4150e2e17fbcdaba969ef7f13de9151b6c9c5b5a.

* materialization: apply safely

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Add back checks that are valid

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Add explicit error type for recovered panic

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Increase sleeps for test resilience, yay

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* stop walking down the tree when we have found an offset

Signed-off-by: Andres Taylor <andres@planetscale.com>

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Andres Taylor <andres@planetscale.com>

* Support partially overlapping (#1514)

* plan: support overlapping partial materializations

Signed-off-by: Vicent Marti <vmg@strn.cat>

* boost: better handling of multi-key insertions

Signed-off-by: Vicent Marti <vmg@strn.cat>

* boost: fix removal cases in aggregations

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Implement dependency processing for regular messages

Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* memory: fix dependencies

Signed-off-by: Vicent Marti <vmg@strn.cat>

* domain: do not notify midflow upqueries

Signed-off-by: Vicent Marti <vmg@strn.cat>

* Add upquery mode enum for additional mode with no upqueries

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Add back partial overlapping materialization check

When we don't have a generated upquery, we still want to verify and do
this check.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* backlog: fix data race when replacing heads

Signed-off-by: Vicent Marti <vmg@strn.cat>

* state: add primary key optimizations

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Co-authored-by: Vicent Marti <vmg@strn.cat>

* Add coalescing of upqueries with multiple keys (#1520)

* Add coalescing of upqueries with multiple keys

When we generate misses, coalesce the cases where we have different keys
for otherwise the same miss. This ensures that we only have to execute a
single upquery.

This also exposed a bug in how we generate parameters for a multi
column, multi key upquery since we were `OR`ing the columns instead of
using `AND`.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* Add waits between attempts since deletion can take some time

CI can be really slow on these, so we need to have some time buffer
here.

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

* boost: left/right concurrency (#1521)

* backlog: proper left-right concurrency

Signed-off-by: Vicent Marti <vmg@strn.cat>

* backlog: rename package

Signed-off-by: Vicent Marti <vmg@strn.cat>

* go atomics??

Signed-off-by: Vicent Marti <vmg@strn.cat>

* leftright: work around sequential consistency

Signed-off-by: Vicent Marti <vmg@strn.cat>

Signed-off-by: Vicent Marti <vmg@strn.cat>

---------

Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Vicent Martí <42793+vmg@users.noreply.github.com>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants