This repository has been archived by the owner on Aug 4, 2023. It is now read-only.

Add reingestion DAG for Phylopic #830

Merged 2 commits on Oct 28, 2022

@stacimc (Contributor) commented on Oct 25, 2022:

Fixes

Fixes WordPress/openverse#1500 by @stacimc

Description

Now that all blockers are resolved, we can add a reingestion workflow for Phylopic! After its recent refactor (#747), Phylopic is a straightforward dated DAG, so we can wire up a reingestion workflow with the same minimal configuration as Metropolitan.

[Screenshot, 2022-10-25]

I've chosen the same conf options as described in #819.
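
For readers unfamiliar with the pattern, here is a minimal sketch of how a reingestion schedule for a dated DAG could be expressed. This is not the code from this PR: `reingestion_day_shifts`, `reingestion_dates`, and the partition values below are hypothetical, and the idea of revisiting recent days densely and older days more sparsely is an assumption made for illustration. The actual conf options are the ones described in #819.

```python
from datetime import date, timedelta


def reingestion_day_shifts(partitions):
    """Return how many days back each reingestion run should look.

    `partitions` is a list of (interval_in_days, count) pairs. Each pair
    adds `count` more runs, spaced `interval_in_days` apart, continuing
    from where the previous pair stopped. (Hypothetical helper.)
    """
    shifts = []
    days_back = 0
    for interval, count in partitions:
        for _ in range(count):
            days_back += interval
            shifts.append(days_back)
    return shifts


def reingestion_dates(run_date, partitions):
    """Map the day shifts onto concrete dates relative to `run_date`."""
    return [
        run_date - timedelta(days=shift)
        for shift in reingestion_day_shifts(partitions)
    ]


# Hypothetical schedule: three daily steps, then two weekly steps.
EXAMPLE_PARTITIONS = [(1, 3), (7, 2)]
```

Each date in the resulting list would be passed to the dated ingester as the day to re-fetch, which is why a provider only needs to be "dated" for this kind of workflow to be wired up with little extra configuration.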

Testing Instructions

Run just test.

Set an INGESTION_LIMIT to something small and run phylopic_reingestion_workflow locally 😄
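
As a rough illustration of why a small limit keeps local runs short, the sketch below caps the number of records pulled from a sequence of batches. It assumes the limit simply bounds the total record count and that 0 means "no limit"; `ingest_with_limit` is a hypothetical name and this is not the catalog's implementation.

```python
def ingest_with_limit(batches, limit):
    """Yield records batch by batch, stopping once `limit` records are seen.

    A `limit` of 0 is treated as "no limit" (an assumption for this sketch).
    """
    count = 0
    for batch in batches:
        for record in batch:
            if limit and count >= limit:
                return  # Limit reached: stop before fetching anything else.
            count += 1
            yield record


# Example: two batches of placeholder records, capped at four records.
sample_batches = [["a", "b", "c"], ["d", "e"]]
limited = list(ingest_with_limit(sample_batches, 4))
```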

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the default branch of the repository (main) or a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@stacimc added the labels 🟨 priority: medium (Not blocking but should be addressed soon), 🌟 goal: addition (Addition of new feature), and 💻 aspect: code (Concerns the software code in the repository) on Oct 25, 2022
@stacimc requested a review from a team as a code owner on October 25, 2022 16:04
@stacimc added this to In progress in Openverse PRs via automation on Oct 25, 2022
@stacimc self-assigned this on Oct 25, 2022
@openverse-bot moved this from In progress to Needs review in Openverse PRs on Oct 25, 2022
@stacimc force-pushed the add/reingestion-flow-for-phylopic branch from 93d9f90 to 70cb19e on October 27, 2022 19:28
@AetherUnbound (Contributor) left a comment:

I ran this locally and it worked great! I noticed that skips aren't bubbled up so I made #844 to tackle that 🚀

@krysal (Member) left a comment:

So nice it's so straightforward now! ✨

However, one thing I noticed while reviewing is that the Phylopic script isn't getting any data; see #847. I'm not sure whether this can be fully tested with that issue open, but I haven't entirely wrapped my head around the ingestion process (yet) anyway, so I'm approving to unblock.

Openverse PRs automation moved this from Needs review to Reviewer approved on Oct 28, 2022
@stacimc merged commit 96d9c85 into main on Oct 28, 2022
Openverse PRs automation moved this from Reviewer approved to Merged! on Oct 28, 2022
@stacimc deleted the add/reingestion-flow-for-phylopic branch on October 28, 2022 17:32
@stacimc (Contributor, Author) commented on Oct 28, 2022:

There's a response on WordPress/openverse#1374 but commenting here as well for posterity -- Phylopic is just very sparse, so we often see no data. Testing locally I'm still getting data for some days, even if it's only a single image.

Linked issue: Create ingestion workflow for Phylopic