Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add OGM harvest to CLI commands #134

Merged
merged 3 commits into from
Feb 6, 2024
Merged

Conversation

ghukill
Copy link
Collaborator

@ghukill ghukill commented Feb 5, 2024

Purpose and background context

This PR fills out the stubbed OGM harvest command in click CLI commands, making the CLI ready when records normalization is complete for OGM harvests.

How can a reviewer manually see the effects of these changes?

Set env vars:

WORKSPACE=dev
SENTRY_DSN=None
S3_RESTRICTED_CDN_ROOT=s3://cdn-origin-dev-222053980223/cdn/geo/restricted/
S3_PUBLIC_CDN_ROOT=s3://cdn-origin-dev-222053980223/cdn/geo/public/
GEOHARVESTER_SQS_TOPIC_NAME=geo-harvester-input-dev

It is now possible to run an OGM harvest via the CLI, that should produce some output indicating it has started:

pipenv run harvester --verbose harvest \
--harvest-type=incremental \
--from-date=2024-01-10 \
ogm

This will throw some errors, as the normalization logic is not complete, but it can be seen the harvest has started, with arguments like --from-date getting used as expected.

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

Developer

  • All new ENV is documented in README
  • All new ENV has been added to staging and production environments
  • All related Jira tickets are linked in commit message(s)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer(s)

  • The commit message is clear and follows our guidelines (not just this PR message)
  • There are appropriate tests covering any new functionality
  • The provided documentation is sufficient for understanding any new functionality introduced
  • Any manual tests have been performed and verified
  • New dependencies are appropriate or there were no changes

Why these changes are being introduced:
In preparation for OGM record normalization, which is the final major component before it is functional,
it made sense to finally expose OGM harvests at the CLI level for easy testing and use later.

How this addresses that need:
* Completes the OGM CLI harvest command and arguments

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/GDT-109
Copy link
Contributor

@jonavellecuerdo jonavellecuerdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but needs to be rebased with main! :)

@ghukill ghukill force-pushed the GDT-109-normalize-ogm-records branch from fe419b4 to aff7ae9 Compare February 5, 2024 21:24
@ghukill
Copy link
Collaborator Author

ghukill commented Feb 5, 2024

Looks good but needs to be rebased with main! :)

@jonavellecuerdo - Rebased! This was an interesting one... I'm pretty confident the code would have been okay without that step, but you're right, that doing so might result in a cleaner shared git history.

Copy link
Contributor

@ehanson8 ehanson8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few suggestions but code looks solid

tests/test_cli.py Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@ghukill ghukill merged commit d9aa2dd into main Feb 6, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants