
bug-1881575: support storing crashes in gcs #977

Merged
merged 9 commits into main from gcs-support on Mar 7, 2024

Conversation

relud
Member

@relud relud commented Feb 13, 2024

No description provided.

@relud relud force-pushed the gcs-support branch 7 times, most recently from f42a75f to 3670126 on February 20, 2024 22:05
@relud relud marked this pull request as ready for review February 20, 2024 22:11
@relud relud requested a review from a team as a code owner February 20, 2024 22:11
@willkg willkg self-assigned this Feb 21, 2024
@willkg
Collaborator

willkg commented Feb 22, 2024

This is tied to the wrong bug. This PR is one step in updating crash ingestion to use GCS instead of S3 for storage. The Socorro processor also has a crash storage for exporting crash data to Telemetry, which is ultimately stored in telemetry.socorro_crash. That export writes crash data to a different S3 bucket, which a DAG reads in order to import the data into telemetry.socorro_crash.

I don't think we have a general bug for switching from S3 to GCS as part of the GCP migration. We need to create a new bug.

@relud relud changed the title from "bug-1579266: support storing crashes in gcs" to "bug-1881575: support storing crashes in gcs" on Feb 22, 2024
@relud
Member Author

relud commented Feb 22, 2024

> We need to create a new bug.

Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1881575 and updated this PR to point there instead.

@willkg
Collaborator

willkg commented Feb 22, 2024

Thank you! Sorry it took me ages to notice.

@relud relud force-pushed the gcs-support branch 4 times, most recently from dc5b852 to 5d3f78b on February 27, 2024 22:30
project="test",
)
else:
self.client = storage.Client()
Member Author

@relud relud Feb 27, 2024

Unlike for Pub/Sub in #974 (comment), the default retry and timeout behavior for google-cloud-storage is better defined. We use two network methods in this class: Client.get_bucket and Blob.upload_from_string, both of which have a default timeout of 60 seconds. Blob.upload_from_string is expected not to retry, because the client assumes it is not an idempotent action. Client.get_bucket may retry, and I think that's fine given that a default timeout is set and the default retry deadline is 120 seconds. Overall, a single file upload is bounded by (bucket retry timeout) + (bucket RPC timeout) + (blob upload RPC timeout) => 120 + 60 + 60 => 240 seconds.

tl;dr: I think this client sets sane defaults for retry and timeout, and we shouldn't mess with them unless/until we see an issue.
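For illustration, here is roughly what that upload path looks like with those defaults spelled out explicitly. The bucket name, object key, and payload below are hypothetical, and the timeout/retry values simply restate the google-cloud-storage defaults described above rather than anything this PR changes:

```python
from google.cloud import storage
from google.cloud.storage.retry import DEFAULT_RETRY

client = storage.Client()

# Client.get_bucket defaults to a 60-second per-RPC timeout and DEFAULT_RETRY,
# whose overall retry deadline is 120 seconds.
bucket = client.get_bucket("example-crash-bucket", timeout=60, retry=DEFAULT_RETRY)

# Blob.upload_from_string also defaults to a 60-second timeout, and it does not
# retry unless the upload is made conditional (e.g. if_generation_match),
# because an unconditional upload is treated as non-idempotent.
blob = bucket.blob("raw_crash/example-crash-id")  # hypothetical object key
blob.upload_from_string(b'{"example": "crash data"}', timeout=60)

# Worst case for a single save, per the reasoning above:
# 120 (get_bucket retry deadline) + 60 (get_bucket RPC) + 60 (upload RPC) = 240s.
```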

systemtest/conftest.py (review thread resolved, outdated)
Collaborator

@willkg willkg left a comment

There's one thing that should get fixed, one thing we should figure out, and I think we should redo the tests (other than the permissions one) to use the emulator rather than mocking everything out.

Other than that, this looks good!
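For context, an emulator-backed test points the real google-cloud-storage client at a local fake GCS server instead of mocking the library. A minimal sketch of such a pytest fixture is below; the emulator host, project, and bucket naming are hypothetical and not necessarily what this PR ended up using:

```python
import os
import uuid

import pytest
from google.cloud import storage


@pytest.fixture
def gcs_bucket():
    # Hypothetical emulator endpoint (e.g. a fake-gcs-server container).
    # The google-cloud-storage client honors STORAGE_EMULATOR_HOST and sends
    # its requests there instead of to the real GCS API.
    os.environ["STORAGE_EMULATOR_HOST"] = "http://gcs-emulator:8001"
    client = storage.Client(project="test")
    bucket = client.create_bucket(f"test-bucket-{uuid.uuid4().hex[:8]}")
    yield bucket
    # Clean up: force=True also deletes any blobs written by the test.
    bucket.delete(force=True)
```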

antenna/ext/gcs/crashstorage.py (review thread resolved, outdated)
docker/config/local_dev.env (review thread resolved, outdated)
tests/unittest/test_gcs_crashstorage.py (review thread resolved)
antenna/ext/gcs/crashstorage.py (review thread resolved, outdated)
@relud

This comment was marked as resolved.

@relud

This comment was marked as resolved.

tests/conftest.py (review thread resolved, outdated)
@relud relud requested a review from willkg March 6, 2024 16:34
Collaborator

@willkg willkg left a comment

I don't understand the changes you made in the tests, and I'm puzzled about the new problem with markus not closing the statsd socket. Can you explain more about what you're thinking?

```diff
@@ -36,6 +33,7 @@ CRASHMOVER_CRASHSTORAGE_ENDPOINT_URL=http://localstack:4566
 CRASHMOVER_CRASHSTORAGE_REGION=us-east-1
 CRASHMOVER_CRASHSTORAGE_ACCESS_KEY=foo
 CRASHMOVER_CRASHSTORAGE_SECRET_ACCESS_KEY=foo
+# Used for S3 and GCS
```
Collaborator

That is a much better idea than what I was thinking. 👍

setup.cfg (review thread resolved, outdated)
systemtest/conftest.py (review thread resolved, outdated)
tests/conftest.py (review thread resolved, outdated)
tests/external/test_gcs_crashstorage_with_emulator.py (review thread resolved, outdated)
tests/unittest/conftest.py (review thread resolved, outdated)
@relud relud requested a review from willkg March 7, 2024 18:54
Collaborator

@willkg willkg left a comment

Looks good!

@relud relud merged commit a61d0cb into main Mar 7, 2024
1 check passed
@relud relud deleted the gcs-support branch March 7, 2024 20:18