chore: remove google cloud storage dependency for gcs datalake test using fake-gcs-server #3576
Conversation
Force-pushed from c9e50a5 to 3582644.
Codecov Report: patch coverage has no change; project coverage changed by +0.07%.

@@            Coverage Diff             @@
##           master    #3576      +/-   ##
==========================================
+ Coverage   67.94%   68.02%   +0.07%
==========================================
  Files         318      318
  Lines       50262    50262
==========================================
+ Hits        34153    34193      +40
+ Misses      13877    13845      -32
+ Partials     2232     2224       -8

View full report in Codecov by Sentry.
@achettyiitr This is nice, and the project seems well maintained too. However, I'm concerned about not being able to detect regressions as quickly as we would if we were using the real thing.
Considering that we are already using the real GCS, what do you think we would gain by using this emulator in its place (besides faster tests)?
So you're saying that when handling data in BQ, the underlying API does the same operations we do for Datalakes, i.e. creating buckets and uploading files to GCS?
Before doing that we would need to sync with @pChondros so that we stay covered until we're ready there.
Yeah, that's the reason. For the S3 datalake I am using a minio container, and for the Azure datalake I am using an azurite container.
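Following the same pattern as the minio and azurite containers, a fake-gcs-server service could be wired in roughly like this. This is a hypothetical sketch, not the PR's actual test setup: the service name, image tag, and port mapping are assumptions (fake-gcs-server listens on 4443 by default).

```yaml
# Hypothetical compose service mirroring the minio/azurite setup above.
# Image tag, port, and flags are assumptions, not taken from this PR.
services:
  gcs:
    image: fsouza/fake-gcs-server:latest
    command: ["-scheme", "http", "-port", "4443"]
    ports:
      - "4443:4443"
```

Tests would then point the GCS client at `http://localhost:4443` (for example via the `STORAGE_EMULATOR_HOST` environment variable) instead of the real Google endpoint.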
I checked the bigquery client, and even though it is calling the Google API, it is calling different endpoints; it's not really creating buckets. What am I missing? Anyway, considering that Google Cloud APIs do proper versioning, I am OK with the change. As long as we don't upgrade the client we should be fine. However, we have to keep in mind that we would no longer be able to just upgrade the client, run a test, and release. We would need to check manually, and manual checks mean that people need to remember to do them. So I'm keen to have some regression tests in the e2e suite that would run less often than these. Thoughts? (cc @pChondros)
Since we already have the …
It makes sense now. So I think we should have an integration test for GCS, one for S3, and one for Azure in the layer that is using them (e.g. …). This way we can mock the storage layer in the other tests like BQ, Datalakes, Redshift, etc.
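The "mock the storage layer" idea above could look roughly like this: the warehouse/datalake code depends on a narrow storage interface, integration tests exercise the real GCS/S3/Azure implementations, and all other tests use an in-memory fake. This is an illustrative sketch; the names (`ObjectStorage`, `InMemoryStorage`, `load_table`) are hypothetical and not taken from the rudder-server codebase.

```python
# Hypothetical sketch of a mockable storage layer; names are illustrative.
from abc import ABC, abstractmethod


class ObjectStorage(ABC):
    """Minimal interface the warehouse/datalake code would depend on."""

    @abstractmethod
    def upload(self, bucket: str, key: str, data: bytes) -> None: ...

    @abstractmethod
    def download(self, bucket: str, key: str) -> bytes: ...


class InMemoryStorage(ObjectStorage):
    """Fake used by higher-level tests (BQ, Datalakes, Redshift, ...)."""

    def __init__(self) -> None:
        self._objects: dict[tuple[str, str], bytes] = {}

    def upload(self, bucket: str, key: str, data: bytes) -> None:
        self._objects[(bucket, key)] = data

    def download(self, bucket: str, key: str) -> bytes:
        return self._objects[(bucket, key)]


def load_table(storage: ObjectStorage, bucket: str, key: str) -> list[str]:
    """Example consumer: reads newline-delimited rows from object storage."""
    return storage.download(bucket, key).decode().splitlines()
```

With this split, only the `ObjectStorage` implementations need container-backed integration tests; everything above them can run against `InMemoryStorage`.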
Description
Notion Ticket
https://www.notion.so/rudderstacks/Remove-GCS-dependency-using-fake-gcs-server-7321148ad63a44daa2d967bfa02dfaf8?pvs=4
Security