Nextflow pipeline to download omics data from DRS (Data Repository Service) based platforms.
This Nextflow pipeline contains a single process based on fasp-scripts.
Read more resources related to DRS -
--reads A CSV file with a particular set of columns
--key_file Credentials file
--genome_fasta A genome fasta file (required for CRAM files)
--gcp_id Google cloud project ID
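Putting the four parameters together, a launch command might look like the sketch below. The entry script name `main.nf`, the file names, and the project ID are placeholders chosen for illustration, not values taken from this repository. The function only prints the command so you can inspect it before running it.

```shell
# Placeholder values (assumptions); replace them with your own paths and IDs.
PIPELINE=main.nf              # assumed name of the pipeline entry script
READS=manifest.csv            # CSV manifest passed to --reads
KEY_FILE=credentials.json     # credentials file passed to --key_file
GENOME_FASTA=genome.fa        # reference FASTA, needed for CRAM files
GCP_ID=my-gcp-project         # Google Cloud project ID

# Dry run: print the nextflow command instead of executing it.
drs_download_cmd() {
  echo nextflow run "$PIPELINE" \
    --reads "$READS" \
    --key_file "$KEY_FILE" \
    --genome_fasta "$GENOME_FASTA" \
    --gcp_id "$GCP_ID"
}

drs_download_cmd
```

Once the printed command looks right, remove the `echo` inside the function (or copy the printed line into your terminal) to start the pipeline.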
You will need two things from https://gen3.theanvil.io/ :
- manifest file
- credentials file
The manifest.json file as originally downloaded (example) needs to be converted into manifest.csv (example) before --reads will accept it. To do that, run:
pip install csvkit
in2csv manifest.json > manifest.csv
NOTE: Make sure the manifest.csv file has five columns; check the examples for reference.
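A quick way to confirm the column count after conversion is to count the fields in the header row. The helper below is a minimal sketch; `count_columns` is a name made up for this example, and it assumes the header row contains no quoted commas.

```shell
# count_columns FILE
# Print the number of comma-separated fields in the first line of FILE
# (prints 0 for an empty file). Assumes no quoted commas in the header.
count_columns() {
  awk -F',' 'NR == 1 { n = NF } END { print n + 0 }' "$1"
}

# Warn if manifest.csv exists but does not have exactly five columns.
if [ -f manifest.csv ] && [ "$(count_columns manifest.csv)" -ne 5 ]; then
  echo "manifest.csv: expected 5 columns, found $(count_columns manifest.csv)" >&2
fi
```

If csvkit is already installed, `csvcut -n manifest.csv` lists the column names and is a more robust check for files with quoted fields.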
The downloaded credentials.json file can be provided with the --key_file param.
NOTE: Make sure credentials.json is up to date. Credentials are issued with an expiry date when you download them.
If you are running with AnVIL Gen3-DRS files, you also need to provide a genome FASTA file with --genome_fasta, which will be used to convert CRAM files to BAM format.