Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove GenBank metadata diff #266

Merged
merged 2 commits into from Jan 4, 2022
Merged

Remove GenBank metadata diff #266

merged 2 commits into from Jan 4, 2022

Conversation

joverlee521
Copy link
Contributor

The csv-diff within notify-on-metadata-change began running into
a MemoryError on December 26, 2021. This was patched during the holidays
by increasing the memory for Genbank ingest runs (bd2c2ca).

Further investigation showed that we have not been receiving the
metadata diff output files since we switched to Snakemake on
December 1, 2021 (74de8b1). The upload rule has been running before
notify_genbank resulting in diffing of the same file that has been
uploaded to S3. This is caused by upload not having any dependencies
on notify_genbank and notify_genbank having an additional dependency
on check_locations. This prompted an internal discussion that showed
no one has been using the outputs of this metadata diff.

The `csv-diff` within `notify-on-metadata-change` began running into
a MemoryError on December 26, 2021. This was patched during the holidays
by increasing the memory for Genbank ingest runs (bd2c2ca).

Further investigation showed that we have not been receiving the
metadata diff output files since we switched to Snakemake on
December 1, 2021 (74de8b1). The `upload` rule has been running before
`notify_genbank` resulting in diffing of the same file that has been
uploaded to S3. This is caused by `upload` not having any dependencies
on `notify_genbank` and `notify_genbank` having an additional dependency
on `check_locations`. This prompted an internal discussion that showed
no one has been using the outputs of this metadata diff.
This reverts commit bd2c2ca.

The memory issue has since been resolved by
commit 64885d1.
@joverlee521
Copy link
Contributor Author

Tested locally that notify_genbank works as expected after the changes.
Cancelled the branch runs since they are not affected by the change.

@joverlee521 joverlee521 merged commit 02a947b into master Jan 4, 2022
@joverlee521 joverlee521 deleted the genbank-metadata-diff branch January 4, 2022 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant