This repository has been archived by the owner on Jan 23, 2021. It is now read-only.

Use case: Create a build manifest given a set of sequencescape IDs, oxford code/ROMA IDs or Alfresco study codes #26

Open
magnusmanske opened this issue Nov 26, 2018 · 3 comments

Comments

@magnusmanske
Contributor

In the following, a build manifest is considered to be a file with one line per file, and to contain at the minimum:

  • iRODS path
  • Sample ID (Oxford code or ROMA ID)
  • Alfresco study code
  • NCBI taxon ID
  • Manual QC status
  • Date manual QC complete
  • ENA run accession
  • ENA sample accession
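
As a minimal sketch of what one manifest line could look like, the fields listed above can be written out as one tab-separated row per file. The class name `ManifestRow`, the column names, the tab delimiter, and the header row are all assumptions made for illustration; the issue does not specify a file format.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class ManifestRow:
    """One manifest line; field names are hypothetical, taken from the list above."""
    irods_path: str
    sample_id: str             # Oxford code or ROMA ID
    alfresco_study_code: str
    ncbi_taxon_id: int
    manual_qc_status: str
    manual_qc_date: str        # date manual QC was completed
    ena_run_accession: str
    ena_sample_accession: str


def write_manifest(rows, out):
    """Write a header line followed by one tab-separated line per file."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([f.name for f in fields(ManifestRow)])
    for row in rows:
        writer.writerow(astuple(row))


def render_manifest(rows):
    """Return the manifest as a string (convenience wrapper for small manifests)."""
    buf = io.StringIO()
    write_manifest(rows, buf)
    return buf.getvalue()
```

Passing an open file handle to `write_manifest` instead of using `render_manifest` avoids holding a large manifest in memory.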

Create a build manifest given a set of sequencescape IDs

Possible with current version:

SELECT vw_sample_tag.value, full_path FROM vw_sample_tag, vw_sample_file WHERE tag_id=3585 AND vw_sample_tag.sample_id=vw_sample_file.sample_id AND `value` IN (list_of_sequencescape_IDs);

See [how to build a manifest](https://github.com/wtsi-team112/fits/blob/master/documentation/How_to_build_a_manifest.md).
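
For scripted use, the query above could be run with bound parameters rather than a pasted-in ID list. The sketch below is illustrative only: it uses an in-memory SQLite database with stand-in tables named after the two views, and it assumes that `tag_id` belongs to `vw_sample_tag` and that the views have only the columns shown. The helper names `files_for_sequencescape_ids` and `demo_connection` are hypothetical, and a different database driver would likely need a different placeholder style than `?`.

```python
import sqlite3

SEQUENCESCAPE_TAG_ID = 3585  # tag ID taken from the query above


def files_for_sequencescape_ids(conn, sequencescape_ids):
    """Return (sequencescape_id, full_path) pairs for the given IDs."""
    ids = list(sequencescape_ids)
    if not ids:
        return []  # an empty IN () list is not valid SQL
    placeholders = ",".join("?" for _ in ids)
    sql = (
        "SELECT t.value, f.full_path "
        "FROM vw_sample_tag AS t "
        "JOIN vw_sample_file AS f ON t.sample_id = f.sample_id "
        "WHERE t.tag_id = ? AND t.value IN (" + placeholders + ") "
        "ORDER BY t.value, f.full_path"
    )
    return conn.execute(sql, [SEQUENCESCAPE_TAG_ID, *ids]).fetchall()


def demo_connection():
    """Build a toy database; schema and rows are made up for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE vw_sample_tag (sample_id INTEGER, tag_id INTEGER, value TEXT);
        CREATE TABLE vw_sample_file (sample_id INTEGER, full_path TEXT);
        INSERT INTO vw_sample_tag VALUES (1, 3585, '1001'), (2, 3585, '1002'), (3, 9999, '1001');
        INSERT INTO vw_sample_file VALUES
            (1, '/demo/a.cram'), (1, '/demo/b.cram'), (2, '/demo/c.cram'), (3, '/demo/d.cram');
        """
    )
    return conn
```

The explicit `JOIN ... ON` is equivalent to the comma join in the original query; it just keeps the join condition separate from the filters.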

Create a build manifest given a set of Oxford codes and/or ROMA IDs

Possible with current version, similar to above.

Create a build manifest given a set of Alfresco study codes

The current version does not track Alfresco study codes. These can be imported, though a sample tracking system might be a more appropriate place for this information. Alfresco study names (number & text) are present in FITS for many samples, imported from Solaris, as are study names from Sequencescape, in various stages of completion and correctness.

@magnusmanske magnusmanske added this to the MVP V1 milestone Nov 26, 2018
@magnusmanske magnusmanske self-assigned this Nov 26, 2018
@podpearson
Member

@magnusmanske you say "The current version does not track Alfresco study codes", but issue #25 says it is possible to build a manifest with Alfresco study code with the current version. These can't both be correct. Does the current version contain Alfresco study codes or not?

@magnusmanske
Contributor Author

Trying to be specific here:

  • FITS current does track 44,582 samples (by FITS definition) with an Alfresco study name, such as "1087-AN-HAPMAP-DONNELLY".
  • FITS current does NOT track dedicated Alfresco study IDs, such as "1087". These can be extracted from the Alfresco study name on-the-fly.
    There is no mechanism at the moment to automatically add new associations of samples to Alfresco study names. Sequencescape study names are imported from MLWH automatically, but I hesitate to implement a "guessing game" that sets them as Alfresco studies based on some pattern, especially since some Sequencescape study names that are supposed to be Alfresco study names are mangled or wrong.
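
To illustrate the on-the-fly extraction mentioned above, here is a minimal sketch. It assumes the study ID is the run of digits before the first hyphen, as in "1087-AN-HAPMAP-DONNELLY"; the function name `alfresco_study_id` is hypothetical. Names that do not match, such as mangled ones, yield `None` rather than a guess.

```python
import re
from typing import Optional

# Assumption: the ID is the leading digits followed by a hyphen.
_STUDY_ID = re.compile(r"^(\d+)-")


def alfresco_study_id(study_name: str) -> Optional[str]:
    """Return the leading numeric ID of an Alfresco study name, or None."""
    match = _STUDY_ID.match(study_name.strip())
    return match.group(1) if match else None


print(alfresco_study_id("1087-AN-HAPMAP-DONNELLY"))  # prints 1087
```

Returning the ID as a string keeps any leading zeros intact.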

@podpearson
Member

@magnusmanske yes, I am interested in the full Alfresco study name here (not just the 4-digit code). As with #28, I'm still unclear about what the data sources for "Alfresco study" are, and whether this can be complete without first addressing #44. I think I can live with this not being in MVP v1, in which case we should perhaps close this issue and create a new one, but let's get feedback from the production team before we close it. Also, if it is not in MVP v1, I think this should be a high priority for v2.

@podpearson podpearson reopened this Dec 18, 2018