I might have missed the information in the documentation, but I don't think there is a way to input a custom-made reference genome for the alignment step.
I was able to get around this by placing my genome fasta in the resources folder and renaming it to genome.fasta. This mirrors the output of the ENSEMBL-SEQUENCE wrapper so snakemake recognizes that step as already having been completed. I am still having issues with downstream rules so I'm unsure of how reliable this solution is.
Yes, the workflow downloads the reference from Ensembl based on the config file, because sadly things like the exact naming of chromosomes matter a lot for a bunch of downstream tools, especially picard, GATK and the annotation tools.
So what @ArsenaultResearch describes is one way to get around this automatic download, but you will run into a bunch of downstream parsing issues that you will then probably have to debug by introducing intermediate steps that fix things like the chromosome naming to match what these tools expect. Classic bioinformatics, sorry... 🤷
Also, please make sure you properly document what you are doing, ideally by writing a download rule for the reference you want to use, with the link you download the reference from set in the config.yaml. Otherwise the workflow becomes less reproducible: it will be hard for others (or your future self ;) to find out where the reference came from, and they would have to download it manually and place it in the correct folder under the correct name. Automation is our friend... 🤖
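To make the "intermediate step that fixes the chromosome naming" a bit more concrete, here is a minimal Python sketch of what such a step could do to a custom FASTA. The function name `rename_fasta_headers` and the `name_map` entries are made up for illustration; the names your downstream tools actually expect depend on your reference and annotation, so treat the mapping as an assumption and adapt it.

```python
import io


def rename_fasta_headers(lines, name_map):
    """Yield FASTA lines with sequence names replaced according to name_map.

    Only the first space-delimited token of each header line is treated as
    the sequence name; any description after it is kept unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            # Header layout assumed here: ">name optional description"
            name, sep, rest = line[1:].rstrip("\r\n").partition(" ")
            # Names missing from the mapping are passed through untouched
            yield ">" + name_map.get(name, name) + sep + rest + "\n"
        else:
            yield line


# Hypothetical mapping from custom names to the names downstream tools expect
name_map = {"chr1": "1", "chrM": "MT"}

fasta = io.StringIO(">chr1 assembled scaffold\nACGT\n>chrM\nGGCC\n")
renamed = "".join(rename_fasta_headers(fasta, name_map))
print(renamed)
```

In a workflow this would typically sit in its own rule between the reference download and the indexing steps, so the renamed file is what every later rule sees.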
I might have missed the information in the documentation, but I don't think there is a way to input a custom-made reference genome for the alignment step.
If I have understood correctly, the pipeline uses the ENSEMBL-SEQUENCE wrapper to output a .fasta file based on the species, build, and release parameters.
https://snakemake-wrappers.readthedocs.io/en/stable/wrappers/reference/ensembl-sequence.html
Is there any option to input a local .fasta file as the reference genome?