Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create google-common module #2613

Merged
merged 58 commits into from
Mar 25, 2021
Merged

Create google-common module #2613

merged 58 commits into from
Mar 25, 2021

Conversation

armanbilge
Copy link
Contributor

@armanbilge armanbilge commented Mar 12, 2021

References #773 #2548; cc @tg44

This is an attempt at the beginnings of a long-awaited google-common module.

The implementation is mostly taken from the new BigQuery connector, which includes:

  • OAuth2 Credentials providers w/ Service Account and Compute Engine configurations
    • A lightweight wrapper as a Google Credentials class, for interop with gRPC
  • A GoogleHttp class with auth, retry, and proxy support for requests
  • A PaginatedRequest source for paginated resources
  • A ResumableUpload flow (will be shared between GCS and BigQuery, at least)
  • GoogleSettings and stream attributes classes for managing config

The idea is that this config is generally done centrally (although of course can be overridden) so that multiple connectors can share access tokens from the same Credentials instance.

If it's not too ambitious, I'd like to try and migrate all the Google connectors to this module in this PR. I think this is the best path towards determining a relatively stable (internal) API for this module that meets the needs of all of the connectors.

Unfortunately, it seems that the various Google connectors don't share a top level package: they are split between a.s.a.google and a.s.a.googlecloud, probably too late to change?

Looking forward to your feedback!

@tg44
Copy link
Contributor

tg44 commented Mar 15, 2021

Nice! I think we don't need to keep it internal only, simply just document that everything can change until we freeze the API. So I don't think the namespaces are problematic.

I'd like to try and migrate all the Google connectors to this module in this PR.

This will be the interesting part. If you want to I can try to migrate FCM.

@armanbilge
Copy link
Contributor Author

Thanks @tg44! I'm open to making it public API with those caveats.

This will be the interesting part. If you want to I can try to migrate FCM.

Thanks, that would be a big help!! Before we start migrating code, my big question is if (and how much) we can break the public APIs of the existing connectors? Of course this module is still useful even if only used to replace the internal implementations of the connectors, without changing their public APIs.

But, to achieve truly central configuration (and shared credentials/access tokens) will probably require breaking changes to config and settings classes for each of the connectors. If we are going to do it, Alpakka 3 would probably be our best chance. Hopefully @seglo or @ennru can comment on this?

@tg44
Copy link
Contributor

tg44 commented Mar 16, 2021

I created a commit; tg44@601a464
The java API and the docs are missing (also I haven't tested against Google). But the "flow" of the migration seemed ok to me. Also, I don't think I broke any existing API.

(I can't PR it to your fork, I think you need to enable something.)

I removed the extends AnyVal and final from GoogleHttp so we can mock it from now on. I eased up some visibility rules too to write backward compatible conversions.

@armanbilge
Copy link
Contributor Author

armanbilge commented Mar 16, 2021

Wow, this is awesome @tg44, thanks for this! Love that your commit nearly halved the size of the code. Not sure why you couldn't PR so I started it for you at armanbilge#1.

But the "flow" of the migration seemed ok to me.

I'm glad you had a positive experience, any feedback? Actually, I'm still noodling a bit with the GoogleHttp class so that it directly takes unmarshallers and has customizable retrying.

Also, I don't think I broke any existing API.

I added some comments, but you're right, it looks like you pulled this off without breaking much/any public API! I hope it can be this easy for the other connectors.

@seglo
Copy link
Member

seglo commented Mar 16, 2021

If it's not too ambitious, I'd like to try and migrate all the Google connectors to this module in this PR. I think this is the best path towards determining a relatively stable (internal) API for this module that meets the needs of all of the connectors.

To make sure IUC, you're asking if you can update the existing connectors to use this updated module in this PR? That sounds reasonable to me, and will help to make sure we don't miss any functionality required by one or more of google connectors.

Unfortunately, it seems that the various Google connectors don't share a top level package: they are split between a.s.a.google and a.s.a.googlecloud, probably too late to change?

If we did this I would choose a package that most of the connectors use and deprecate all the APIs found in the other packages. I don't think we should make a clean break. I'm not sure this is really worth the effort though.

Thanks, that would be a big help!! Before we start migrating code, my big question is if (and how much) we can break the public APIs of the existing connectors? Of course this module is still useful even if only used to replace the internal implementations of the connectors, without changing their public APIs.

But, to achieve truly central configuration (and shared credentials/access tokens) will probably require breaking changes to config and settings classes for each of the connectors. If we are going to do it, Alpakka 3 would probably be our best chance. Hopefully @seglo or @ennru can comment on this?

We can deprecate existing configuration, but it would be unfortunate to have to make breaking changes across the board for Alpakka 3.0.0. Alpakka major releases don't have to be as rare as they are right now. If there's a good reason (and maybe this is one of them) then we could release an Alpakka 4.0.0 in a more reasonable interval, maybe after 3 months?

@armanbilge
Copy link
Contributor Author

make sure we don't miss any functionality required by one or more of google connectors.

Exactly my thinking!

We can deprecate existing configuration, but it would be unfortunate to have to make breaking changes across the board for Alpakka 3.0.0.

Thanks for your thoughts! This makes sense; I agree. Ok, let's focus on not breaking stuff for now, and once the dust settles see where that leaves us regarding need for 4.0.

@armanbilge
Copy link
Contributor Author

Just pushed my first migration effort for the Google Cloud Storage module. All tests pass, but I did have to break a couple things.

  1. In the original config you could specify non-standard endpoints for the Google APIs. I assume that this existed purely for testing purposes (but please correct me if I'm mistaken) and dropped support for non-default values. Users who needed this functionality can use the new forward-proxy config.
  2. There was a single test for a failing request that broke: previously, it failed silently (returned empty Source) but now it fails with an exception. This behavior seemed inconsistent with the rest of the connector's API (only 404s fail silently like this) so I didn't fix it.

@seglo Are these breaking changes acceptable?

client-email = deprecated
private-key = deprecated

# No longer supported, use config path alpakka.google.forward-proxy
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the configurable endpoints I dropped support for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. Maybe some people with some recent experience using GCS can comment. @francisdb, @DanieleSassoli, @gabrielreid ?

See Arman's note about introducing configuration breaking changes here: #2613 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a look, seems this was added by @jkobejs when he worked further on my pr. We are not using these settings so for my part ok to make unsupported.

@armanbilge
Copy link
Contributor Author

@tg44 I just pushed the FCM migration based on your code, thanks for the help! Would be great if you could have a look at it.

@armanbilge
Copy link
Contributor Author

Thanks for the review @seglo, working my way through it.

These are probably corner cases anyway.

As far as I can tell the compute-engine strategy also works on cloud shell since that runs in a compute engine instance anyway. It would be a happy surprise if App Engine could also use this but not sure.

@armanbilge
Copy link
Contributor Author

armanbilge commented Mar 24, 2021

@seglo did I miss any comments or anything else left to do? Otherwise I think should be in good shape for merge pending CI. EDIT: still a couple things to fix.

Ok, good to go for real! Pending CI :)

Copy link
Member

@seglo seglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I'll wait for the build to go thru and merge this evening or tomorrow.

@seglo seglo merged commit 3d3d267 into akka:master Mar 25, 2021
@seglo seglo added this to the 3.0.0-M1 milestone Mar 25, 2021
@armanbilge
Copy link
Contributor Author

Thanks for merging! Besides #773 there's a number of issues this potentially closes: #523, #825, #923, #1385, #1705, #2547.

@seglo
Copy link
Member

seglo commented Mar 26, 2021

Got it 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants