GPL-733 As a developer I would like to run a Manual MLWH and DART Migration #144

Closed
9 tasks done
Chris-Friend opened this issue Nov 4, 2020 · 2 comments · Fixed by #160
Labels: Beckman integration

Chris-Friend commented Nov 4, 2020

User story
As developers, we want to be able to run a manual process when a sample upload to the MLWH or DART database fails with a critical exception but the upload to MongoDB succeeds. The process should pull the samples added to MongoDB between two timestamps and attempt to re-add them to MLWH and DART (see the manual MLWH migration that already exists in crawler; we want a combined version that keeps both in sync). The process needs to be idempotent, so that data that already exists is not overwritten. This will help keep the MongoDB, MLWH and DART databases in sync through MLWH and DART upload failures. NB: DART inserts should only be done if no samples on a given plate have already been cherrypicked. This story does not handle changes to the filtered positive rules; see GPL-709 for that.
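
The idempotency requirement above can be sketched as an insert that skips rows which already exist. This is only an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for MLWH, the two-column schema is hypothetical, and `make_demo_db` and `insert_missing_samples` are names invented for this example rather than functions from crawler.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory stand-in for the MLWH table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lighthouse_sample ("
        "lh_sample_uuid TEXT PRIMARY KEY, result TEXT)"
    )
    return conn


def insert_missing_samples(conn, samples):
    """Insert only samples that are not already present.

    Rows whose primary key already exists are skipped, never overwritten,
    so re-running the migration over the same time window is safe.
    Returns the number of rows actually added.
    """
    count_sql = "SELECT COUNT(*) FROM lighthouse_sample"
    before = conn.execute(count_sql).fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO lighthouse_sample (lh_sample_uuid, result) "
        "VALUES (?, ?)",
        [(s["lh_sample_uuid"], s["Result"]) for s in samples],
    )
    conn.commit()
    after = conn.execute(count_sql).fetchone()[0]
    return after - before
```

Running the function twice over the same batch adds rows the first time and nothing the second time. A MySQL-backed version would need the equivalent conflict handling for that database.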

Acceptance criteria
To be considered successful the solution must:

  • Add a new combined MLWH/DART migration or update the existing MLWH migration that:
    • requires start and end timestamp input parameters
    • performs the existing MLWH migration functionality
    • for DART, fetches all RESULT=Positive samples from mongo between the timestamps, and checks that each has the filtered_positive_version, lh_sample_uuid and lh_source_plate_uuid properties. Samples missing any of these should not be added, as DART would become inconsistent (the Beckman workflow expects these properties)
    • for DART, further filters the above samples to those that aren't cherrypicked (if any samples in a plate are cherrypicked, don't add the plate), e.g. by checking for the sample cherrypicked event in MLWH (see the get_cherrypicked_samples method in lighthouse), or by checking the MLWH lighthouse_sample table for a COG UK Id (currently samples only get a COG UK Id when selected for cherrypicking, so this may be susceptible to future changes)
    • Needs discussing: Re-determine if filtered positive? Is this required? Filtered positive fields should already be recorded in mongo, and there is a separate migration to update databases following filtered positive rule changes (GPL-709)
    • Needs discussing: Re-add to Mongo & MLWH (only necessary if re-determining filtered positive fields)
    • Add samples to DART (plates and well properties - see GPL-709's implementation and GPL-745)
  • include README.md documentation of this new process
  • in the normal file processing, don't attempt to add to DART if the MLWH insert fails (split out to GPL-764 "Do not update DART if the MLWH insert fails", #162)
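
The DART selection criteria above could be sketched roughly as follows. This is a minimal illustration, not the crawler implementation: the field names `Result`, `created_at` and `plate_barcode`, the ISO timestamp format, and all three function names are assumptions made for the example. Only `filtered_positive_version`, `lh_sample_uuid` and `lh_source_plate_uuid` come from the criteria themselves.

```python
from datetime import datetime

# Properties the Beckman workflow expects, per the acceptance criteria.
REQUIRED_DART_FIELDS = (
    "filtered_positive_version",
    "lh_sample_uuid",
    "lh_source_plate_uuid",
)


def parse_window(start_text, end_text):
    """Parse the two required timestamps (ISO format is an assumption)."""
    start = datetime.fromisoformat(start_text)
    end = datetime.fromisoformat(end_text)
    if start >= end:
        raise ValueError("start timestamp must be before end timestamp")
    return start, end


def build_positive_samples_query(start, end):
    """Build a MongoDB filter for positive samples inside the window
    that carry every property DART needs."""
    query = {
        "Result": "Positive",  # assumed field name
        "created_at": {"$gte": start, "$lte": end},  # assumed field name
    }
    for field in REQUIRED_DART_FIELDS:
        query[field] = {"$exists": True}
    return query


def samples_for_dart(samples, cherrypicked_uuids):
    """Drop every plate that has at least one cherrypicked sample."""
    picked_plates = {
        s["plate_barcode"]
        for s in samples
        if s["lh_sample_uuid"] in cherrypicked_uuids
    }
    return [s for s in samples if s["plate_barcode"] not in picked_plates]
```

The set of cherrypicked sample UUIDs is taken as an input here; in practice it would come from whichever MLWH check is chosen (cherrypicked events or COG UK Ids). Note that this sketch only detects cherrypicked samples among those fetched for the window.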

Dependencies
This story is blocked by the following dependencies:

Additional context
NB. There are specific error types that will show for a file in the Lighthouse-UI imports screen. This will need to be monitored to indicate to us when we need to run the migration.
The migration should also be able to insert legacy data for us, e.g. to get the initial plates into DART for cherrypicking.

Chris-Friend changed the title from "GPL-nnn Manual DART Migration" to "GPL-nnn As a developer I would like to run a Manual DART Migration" on Nov 4, 2020
Chris-Friend transferred this issue from sanger/General-Backlog-Items on Nov 4, 2020
rl15 changed the title from "GPL-nnn As a developer I would like to run a Manual DART Migration" to "GPL-733 As a developer I would like to run a Manual DART Migration" on Nov 4, 2020
rl15 added the Beckman integration label on Nov 4, 2020
andrewsparkes changed the title from "GPL-733 As a developer I would like to run a Manual DART Migration" to "GPL-733 As a developer I would like to run a Manual MLWH and DART Migration" on Nov 16, 2020
pjvv self-assigned this on Nov 23, 2020
Chris-Friend commented

With regard to uploading legacy plates that have already been destroyed or used and are no longer available, the current approach of just filtering out cherrypicked samples is sufficient. The risk of uploading too much data will be mitigated by only adding legacy data within a sensible timeframe, and reporting will most likely use existing reporting channels rather than the DART database. Better to have too much data in DART than not enough.

rl15 commented Dec 2, 2020

Discussion in the weekly Heron meeting: tentative agreement to load samples with a tested date from 1st December.
