Thanks for the great package.
I installed nucleoatac and tried the provided example.
The example worked and finished in around 5 minutes.
Then I ran it on my own BAM file (human data) and it was very slow. The BED file is the human genome modified with slopBed (-b 600). Chromosomes are labelled with numbers only (e.g. 3 instead of chr3 or chrIII), and this is consistent across the BED, BAM and FASTA files.
Here is what I ran:
/package/NucleoATAC-0.3.1/bin/nucleoatac run --bed /data/hs38_genes_b600.bed --bam /data/Sample_Lib_A376_2/Bowtie2/50K.bam --fasta /data/repository/organisms/GRCh38_ensembl/genome_fasta/genome.fa --out Sample_Lib_A376_2
Bowtie generates these BAM files very quickly, but with the same computational resources nucleoatac had been running for 4 days and was still not finished. It had created the output file and was still modifying it. There were no error messages.
The BAM file contains 40 million reads. Is it normal for it to run for a week? I am new to data analysis, so it would be nice if you could point me to the relevant settings that I should check.
You can use the --cores flag to run on multiple cores and speed things up. For example, set --cores 10 to use 10 cores.
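As a rough sketch, the original command with the --cores flag added could be wrapped like this. The function name run_nucleoatac and the default of 10 cores are only illustrative assumptions; the other flags are the ones from the command in the question.

```shell
# Hypothetical wrapper (not part of NucleoATAC): same flags as the command
# in the question, plus --cores.
# Arguments: BED file, BAM file, FASTA file, output prefix, cores (optional).
run_nucleoatac() {
    nucleoatac run \
        --bed "$1" \
        --bam "$2" \
        --fasta "$3" \
        --out "$4" \
        --cores "${5:-10}"   # assumed default of 10 if no core count is given
}

# Example call (paths are placeholders):
#   run_nucleoatac peaks.bed sample.bam genome.fa Sample_Lib_A376_2 10
```

Pick a core count that matches what the machine actually has available.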
Even with multiple cores, the program is not intended to be run on the whole genome or anything close to it. Most of the genome will have very little coverage (and what little coverage there is will be background noise), so running it on anything but open chromatin peaks is a waste of time.
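One quick way to see whether a BED file is "close to the whole genome" is to add up the bases it covers before starting a run. This is a minimal sketch using plain awk; the helper name bed_total_bp is made up for illustration, and it assumes a standard BED layout (chromosome, start, end in the first three columns). Overlapping regions are counted more than once unless the file has been merged first.

```shell
# Hypothetical helper: print the total number of bases spanned by the
# regions in a BED file (sum of end - start over all data lines).
bed_total_bp() {
    awk '!/^(#|track|browser)/ && NF >= 3 { total += $3 - $2 }
         END { print total + 0 }' "$1"
}

# Example call (path is a placeholder):
#   bed_total_bp /data/hs38_genes_b600.bed
```

If the total is a large fraction of the roughly 3 billion bases in the human genome, the region set is probably far bigger than a set of open chromatin peaks would be.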
Hi Alicia,
Many Thanks!!
Ken
Python 2.7.5
NucleoATAC (0.3.1)
$ pip list
cycler (0.9.0)
Cython (0.23.4)
matplotlib (1.5.0)
numpy (1.10.1)
pip (1.4.1)
pyparsing (2.0.5)
pysam (0.8.3)
python-dateutil (2.4.2)
pytz (2015.7)
scipy (0.16.1)
setuptools (0.9.8)
six (1.10.0)
wsgiref (0.1.2)