DM-36999: Complete integration test at USDF #38

kfindeisen · 2022-11-15T17:45:58Z

This PR contains a number of miscellaneous fixes for problems that prevented an end-to-end run on USDF. Many of these are preexisting bugs that were never exercised on GCP.

hsinfang · 2022-11-17T23:51:12Z

python/tester/upload.py

+            _log.info(f"No raw files found for {instrument}, generating dummy files instead.")
+            upload_from_random(producer, instrument, dest_bucket, n_groups, new_group_base)
+    finally:
+        producer.flush(30.0)


I see the warning but I'm not sure if it matters to flush and wait here. Don't they get sent out still?

kfindeisen · 2022-11-18

Yes, the messages get sent, but such warnings usually mean that a class finalizer is doing cleanup that it can't be trusted to do. Also, the correctness of not flushing depends on the lower-level code not using callbacks, and there's no need for main to assume that.

As for waiting, the delay when I tested it was too small to measure.
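To illustrate the point about explicit cleanup, here is a minimal sketch of the flush-in-`finally` pattern. `RecordingProducer` and `send_all` are hypothetical stand-ins written for this example, not code from `upload.py`; the stand-in's `flush` returning the number of undelivered messages is an assumption about the producer interface.

```python
class RecordingProducer:
    """Hypothetical stand-in for a Kafka producer; mimics produce() and flush() only."""

    def __init__(self):
        self.queued = []
        self.flushed = False

    def produce(self, topic, value):
        # Messages are only queued here; nothing is delivered until flush().
        self.queued.append((topic, value))

    def flush(self, timeout):
        # Pretend every queued message is delivered within the timeout.
        self.queued.clear()
        self.flushed = True
        return 0  # assumed meaning: number of messages still waiting


def send_all(producer, topic, messages):
    """Queue every message, then flush even if queuing raises."""
    try:
        for message in messages:
            if message is None:
                raise ValueError("cannot send an empty message")
            producer.produce(topic, message)
    finally:
        # Explicit cleanup, so delivery does not depend on a finalizer.
        remaining = producer.flush(30.0)
    return remaining
```

With this shape, the caller does not need to know whether lower-level code registers delivery callbacks: the queue is drained on both the normal path and the error path.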

This bug is currently caused by repetition of input exposure IDs, but it may also happen in the future as a result of retries, especially late-stage ones. The solution is to have the export code check for the latest run and export only that one, though this comes at the cost of not being able to retry previously failed exports.
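A rough sketch of the "export only the latest run" idea follows. The function name, the `(dataset_id, run)` tuple layout, and the assumption that run names end in a timestamp and therefore sort chronologically as strings are all illustrative; the actual export code is not shown in this PR page.

```python
def select_latest_run(datasets):
    """Keep only the datasets that belong to the most recent run.

    `datasets` is a list of (dataset_id, run) tuples. Run names are
    assumed (hypothetically) to sort chronologically as plain strings.
    """
    if not datasets:
        return []
    latest_run = max(run for _, run in datasets)
    return [(dataset_id, run) for dataset_id, run in datasets if run == latest_run]
```

The trade-off mentioned above is visible here: a dataset that exists only in an earlier run is dropped, so an export that failed for that run is never retried.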

kfindeisen added 7 commits November 14, 2022 15:02

Use explicit S3 endpoint in activator.

d81072f

This appears to be a limit of the boto3 API; see #35 for discussion.
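For readers unfamiliar with the pattern, a minimal sketch of passing an explicit endpoint to boto3 is given below. `endpoint_url` is a real keyword argument of `boto3.client`; the environment variable name `S3_ENDPOINT_URL` and both helper functions are assumptions made for this example, since the page does not show how the activator obtains the endpoint.

```python
import os


def s3_client_kwargs(environ=os.environ):
    """Build keyword arguments for boto3.client("s3", ...)."""
    kwargs = {}
    # Hypothetical variable name; the real configuration source is not shown.
    endpoint = environ.get("S3_ENDPOINT_URL")
    if endpoint:
        kwargs["endpoint_url"] = endpoint
    return kwargs


def make_s3_client(environ=os.environ):
    """Create an S3 client that talks to the explicitly configured endpoint."""
    import boto3  # imported lazily so s3_client_kwargs works without boto3

    return boto3.client("s3", **s3_client_kwargs(environ))
```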

Update instructions for upload.py to reflect USDF.

be1a742

Document log streaming in playbook.

d87df99

Fix fatal bug in check_for_snap.

8a58099

The Contents field does not exist if no files are found, so the old code would crash with a KeyError.
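The failure mode described in this commit message can be sketched as follows. `object_keys` is a hypothetical helper, not the actual body of `check_for_snap`; it only shows one way to tolerate a listing response that has no `Contents` entry.

```python
def object_keys(response):
    """Return the object keys in an S3 listing response, or [] if none matched."""
    # Indexing response["Contents"] raises KeyError when nothing matched,
    # so fall back to an empty list instead.
    return [entry["Key"] for entry in response.get("Contents", [])]
```

For example, `object_keys({"KeyCount": 0})` returns an empty list, whereas `{"KeyCount": 0}["Contents"]` would raise `KeyError`.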

Remove Google dependencies from MiddlewareInterface.

52981ab

Fix preliminary bucket notification topic.

de3458f

The topic is still hard-coded into the file, until we can decide how we want topics to be organized.

Fix use of old ingestion API in raw-notification branch.

f52b86b

kfindeisen mentioned this pull request Nov 15, 2022

DM-36999: Complete integration test at USDF #37

Closed

kfindeisen force-pushed the tickets/DM-36999 branch from ad32cb8 to 9b4a6e7 Compare November 16, 2022 01:44

kfindeisen marked this pull request as ready for review November 16, 2022 02:23

kfindeisen requested a review from hsinfang November 16, 2022 18:26

hsinfang reviewed Nov 17, 2022

python/activator/middleware_interface.py (resolved)

hsinfang reviewed Nov 17, 2022

tests/test_middleware_interface.py (outdated, resolved)

hsinfang reviewed Nov 17, 2022

hsinfang approved these changes Nov 17, 2022

kfindeisen added 3 commits November 18, 2022 10:41

Remove unnecessarily negative test assertion.

81ff890

The test assertion for raw export was originally written in terms of avoiding the standard behavior, but there's no reason not to test for the behavior we want instead, and it's easier to read.

Clean up Producer use in upload.py.

4bbce1b

We're not currently using callbacks, but Producer emits a warning if we don't flush anyway.

kfindeisen force-pushed the tickets/DM-36999 branch from 9b4a6e7 to 4bbce1b Compare November 18, 2022 18:44

kfindeisen merged commit fbaf4e3 into main Nov 18, 2022

kfindeisen deleted the tickets/DM-36999 branch November 18, 2022 19:07
