java CDK: build no longer downloads files from connector registry #34441

Merged: postamar merged 7 commits into master from postamar/remove-file-downloads-from-build on Jan 25, 2024

Contributor

postamar commented Jan 23, 2024

These files were part of the CDK build and formed a circular dependency: their content is updated based on the connector metadata when a connector is published. Incidentally, removing these also allows removing the dependencies on micronaut.

This PR is best reviewed commit-by-commit:

  • 95f6e26 removes the whole init-oss module and changes DestinationAcceptanceTest, which is the only downstream dependency, to rely on the connector's metadata.yaml file instead of the connector metadata JSON blob.
  • 40bf905 removes all references to micronaut in the repo; this is made possible by the previous commit.
  • b25fe6f removes the dependency on the specs_secrets_mask.yaml file, which was used by a custom log4j plugin to scrub secrets from the logs. The log config in this repo is now only used for testing, and since tests are run via airbyte-ci, the secrets will already be scrubbed by dagger.
  • The final two commits are version accounting boilerplate and also ensure the changes are validated on a couple of connectors.

Contributor

github-actions bot commented Jan 23, 2024

Coverage report for source-postgres

There is no coverage information present for the files changed.

Total Project Coverage 71.63% 🍏

postamar force-pushed the postamar/remove-file-downloads-from-build branch 2 times, most recently from cad9d5b to 8c129b9 (January 23, 2024 17:24)
postamar marked this pull request as ready for review (January 23, 2024 17:33)
postamar requested review from a team as code owners (January 23, 2024 17:33)
postamar force-pushed the postamar/remove-file-downloads-from-build branch from 80ad63c to 92e6586 (January 23, 2024 20:23)
@postamar
Contributor Author

Confirmed with @alafanechere that the test logs will be scrubbed in GitHub, but it's possible that the test logs end up in some other report, like a Gradle scan, and remain unscrubbed. I'll investigate further.

postamar removed the request for review from bnchrch (January 24, 2024 18:09)
postamar force-pushed the postamar/remove-file-downloads-from-build branch from 92e6586 to a0bac97 (January 24, 2024 21:57)
postamar requested a review from a team (January 24, 2024 21:57)
postamar force-pushed the postamar/remove-file-downloads-from-build branch from a0bac97 to 4e1a328 (January 24, 2024 22:02)
@postamar
Contributor Author

@alafanechere, could you review my last commit 4e1a328, please?
This should ensure that the Java logs never leak any secrets. I've tested this manually.

Contributor

alafanechere left a comment

To what extent would it be harder / simpler to create this pattern within the containers under test instead of building the pattern in the Python logic?

I'm also wondering if we could declare this logic within the GradleTask step.

Slightly out of scope:
Would it also be worth configuring the log scrubbing at the connector image level at runtime: when the config is parsed?

@@ -326,3 +327,28 @@ def fail_if_missing_docker_hub_creds(ctx: click.Context) -> None:
        raise click.UsageError(
            "You need to be logged to DockerHub registry to run this command. Please set DOCKER_HUB_USERNAME and DOCKER_HUB_PASSWORD environment variables."
        )


def java_log_scrub_pattern(secrets_to_mask: List[str]) -> str:
Contributor

Could you please add a unit test for this? 🙏

Contributor Author

Yes! Sorry, it slipped my mind.

Contributor Author

Thanks for taking a look! I'll add the unit test.

> To what extent would it be harder / simpler to create this pattern within the containers under test instead of building the pattern in the Python logic?

Unfortunately, the scrubbing needs to be applied to all logs, including the test framework, and not just the logs emitted by the connector container. Also, not all secrets are equal: the scrubbing should only apply to those secrets downloaded by ci_credentials; we don't want to scrub passwords for testcontainer instances (which, furthermore, are often dumb strings like "password" or "test").

> I'm also wondering if we could declare this logic within the GradleTask step.

The integration test task class would be its proper place but for some reason, dagger secrets are defined by the PipelineContext, which isn't something I'm a fan of. This is why I moved the pattern generation logic to util instead of inlining it.
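
For readers following the thread, the pattern generation discussed here can be sketched roughly as below. This is a minimal illustration and not the code from the PR: the names `scrub_pattern_sketch` and `scrub`, the mask string, and the choice to escape each secret with `re.escape` and join them with `|` are all assumptions. The thread also does not say how the pattern is handed to the Java logging configuration, so any extra escaping needed on that side is left out.

```python
import re
from typing import List


def scrub_pattern_sketch(secrets_to_mask: List[str]) -> str:
    """Hypothetical helper: build one regex that matches any given secret literally."""
    # Drop empty strings: an empty alternative would match at every position.
    unique = {s for s in secrets_to_mask if s}
    # Longest first, so a secret that is a prefix of another does not shadow it.
    ordered = sorted(unique, key=lambda s: (-len(s), s))
    # re.escape makes regex metacharacters inside a secret match literally.
    return "|".join(re.escape(s) for s in ordered)


def scrub(line: str, pattern: str, mask: str = "**********") -> str:
    """Hypothetical helper: replace every match of the pattern with a mask."""
    if not pattern:
        # No secrets to mask; an empty regex would otherwise match everywhere.
        return line
    return re.sub(pattern, mask, line)


if __name__ == "__main__":
    pattern = scrub_pattern_sketch(["s3cr3t", "a.b"])
    print(scrub("connecting with token=s3cr3t", pattern))
```

Only the strings passed in are masked, which fits the point above that secrets downloaded by ci_credentials should be scrubbed while testcontainer passwords are left alone.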

> Slightly out of scope:
> Would it also be worth configuring the log scrubbing at the connector image level at runtime: when the config is parsed?

Indeed. @stephane-airbyte and I are also thinking about how to properly scrub secrets in prod. It's a bit more involved in that case.

postamar force-pushed the postamar/remove-file-downloads-from-build branch from 4e1a328 to b96fa1b (January 25, 2024 17:35)
@postamar
Contributor Author

I added a unit test and checked again that the scrubbing works (it does).
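
The unit test itself is not shown in this conversation. As a rough idea of what such a test might check, here is a sketch using the standard `unittest` module. `build_scrub_pattern` is a hypothetical stand-in for the function under test, and both test cases are assumptions about reasonable behavior, not the assertions from the PR.

```python
import re
import unittest
from typing import List


def build_scrub_pattern(secrets: List[str]) -> str:
    # Hypothetical stand-in for the function under test; not the PR's code.
    return "|".join(re.escape(s) for s in secrets if s)


class ScrubPatternTest(unittest.TestCase):
    def test_metacharacters_are_matched_literally(self) -> None:
        # A secret full of regex metacharacters must still be masked as-is.
        pattern = build_scrub_pattern(["p@ss(w)+rd$"])
        self.assertEqual(re.sub(pattern, "***", "pw=p@ss(w)+rd$;"), "pw=***;")

    def test_unlisted_strings_are_left_alone(self) -> None:
        # "password" stands for a testcontainer credential that was never
        # passed in, so it must survive scrubbing.
        pattern = build_scrub_pattern(["real-ci-secret"])
        self.assertEqual(
            re.sub(pattern, "***", "password real-ci-secret"),
            "password ***",
        )


if __name__ == "__main__":
    unittest.main()
```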

postamar force-pushed the postamar/remove-file-downloads-from-build branch from b96fa1b to d6c9f7a (January 25, 2024 17:58)
@postamar
Contributor Author

postamar commented Jan 25, 2024

/publish-java-cdk

🕑 https://github.com/airbytehq/airbyte/actions/runs/7658937523
✅ Successfully published Java CDK version=0.15.0!

postamar merged commit d01bb65 into master (Jan 25, 2024)
28 checks passed
postamar deleted the postamar/remove-file-downloads-from-build branch (January 25, 2024 19:44)
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 21, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024