Create a data manager to retrieve liftOver files from UCSC #1904

Open
4 tasks
jennaj opened this issue Mar 10, 2016 · 0 comments

Comments

@jennaj
Member

jennaj commented Mar 10, 2016

Suggested options:

  • List all UCSC reference genomes already loaded on the instance, by dbkey and name (UCSC genomes only: not just entries in the builds list, but dbkeys that actually have fasta data loaded), so that several can be checked off and queued as one launched job. Spawning an individual output dataset per genome is probably a good idea, to track what was pulled and what was not. A "Select all / none" header above the checkboxes would be helpful.
  • Make the DM "smart" so that it will get new liftOver data only and ignore data already loaded (no duplicates). Ideally, the tool would check for pre-existing dbkeys that have liftOver content loaded (loc), compare existing files against all files available at UCSC, then load whatever is new. This is somewhat important since genomes often have liftOver data added as time passes. A way to update this data with a DM, without creating duplicates, along with retrieving all data for new genomes, in one tool, in batch, will make data admins happy.
  • Limit the tool so that data is retrieved for 1-3 genomes at a time: allow that batch to complete, then start the next 1-3, and repeat. UCSC will time out if too many connections are made at the same time, resulting in failed jobs. _The jobs must be paced temporally in order to not trigger a block._
  • Other ideas to make this type of tool useful?
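
A rough sketch of how the "smart" update and the pacing could fit together, in case it helps the discussion. Everything here is hypothetical: the function names, the `fetch` callback, the 60-second pause, and the example file names are assumptions for illustration, not existing Galaxy or UCSC APIs. It assumes the DM has already built two mappings of dbkey to file names, one from the local loc data and one from the UCSC listing.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, List, Set


def missing_liftover_files(available: Dict[str, Set[str]],
                           loaded: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, per dbkey, the remote file names that are not loaded locally yet."""
    todo: Dict[str, List[str]] = {}
    for dbkey, remote_files in available.items():
        # Set difference drops anything already loaded, so no duplicates are fetched.
        new_files = sorted(remote_files - loaded.get(dbkey, set()))
        if new_files:
            todo[dbkey] = new_files
    return todo


def batches(items: Iterable[str], size: int = 3) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def fetch_paced(todo: Dict[str, List[str]],
                fetch: Callable[[str, List[str]], None],
                batch_size: int = 3,
                pause_seconds: float = 60.0,  # assumed value, not a documented UCSC limit
                sleep: Callable[[float], None] = time.sleep) -> None:
    """Fetch genomes a few at a time, pausing between batches."""
    groups = list(batches(sorted(todo), batch_size))
    for index, group in enumerate(groups):
        for dbkey in group:
            # `fetch` is a placeholder for whatever actually downloads the files.
            fetch(dbkey, todo[dbkey])
        if index < len(groups) - 1:
            sleep(pause_seconds)  # let one batch finish before starting the next
```

With `available = {"hg19": {"hg19ToHg38.over.chain.gz", "hg19ToMm10.over.chain.gz"}}` and `loaded = {"hg19": {"hg19ToHg38.over.chain.gz"}}`, `missing_liftover_files` would queue only `hg19ToMm10.over.chain.gz` for hg19. Downloads inside a batch run one after another here; whether they should run in parallel is left open.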

Move this request to tools? galaxy or iuc?
