very slow nucleoatac run #34

kenkclam · 2016-01-25T14:20:30Z

Hi Alicia,

Thanks for the great package.
I installed nucleoatac and tried the provided example.
The example worked and finished in around 5 minutes.

Then I ran my own bam file (human data) and it was very slow. The bed file is from the human genome, modified with slopBed (-b 600). Chromosomes are labelled with numbers only (i.e. 3 instead of chr3 or chrIII), and this is consistent across the bed, bam and fa files.
Here is what I ran:
/package/NucleoATAC-0.3.1/bin/nucleoatac run --bed /data/hs38_genes_b600.bed --bam /data/Sample_Lib_A376_2/Bowtie2/50K.bam --fasta /data/repository/organisms/GRCh38_ensembl/genome_fasta/genome.fa --out Sample_Lib_A376_2

Bowtie generates these bam files very quickly, but with the same computational resources nucleoatac had been running for 4 days and was still not finished. It had created the output file and was still modifying it. There were no error messages.

The bam file contains 40 million reads. Is it normal for it to run for a week? I am new to data analysis, so it would be nice if you could point me to the relevant settings that I should check.

Many Thanks!!
Ken

Python 2.7.5
NucleoATAC (0.3.1)
$ pip list
cycler (0.9.0)
Cython (0.23.4)
matplotlib (1.5.0)
numpy (1.10.1)
pip (1.4.1)
pyparsing (2.0.5)
pysam (0.8.3)
python-dateutil (2.4.2)
pytz (2015.7)
scipy (0.16.1)
setuptools (0.9.8)
six (1.10.0)
wsgiref (0.1.2)

kenkclam · 2016-01-27T23:23:34Z

Actually, I cannot run nucleoatac with all the active genes.
When I reduced the bed file a lot, it ran fine.

AliciaSchep · 2016-01-28T03:00:46Z

You can use the --cores flag to run on multiple cores and speed things up. For example, set --cores 10 to use 10 cores.
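
As an illustration of that flag, here is a minimal Python sketch that rebuilds the command from the original post with `--cores 10` appended. The helper name `build_nucleoatac_command` is hypothetical and not part of NucleoATAC; the paths are copied from the post, and only flags already mentioned in this thread are used.

```python
import shlex


def build_nucleoatac_command(bed, bam, fasta, out, cores=1):
    """Assemble the argument list for `nucleoatac run` (hypothetical helper)."""
    return [
        "nucleoatac", "run",
        "--bed", bed,
        "--bam", bam,
        "--fasta", fasta,
        "--out", out,
        "--cores", str(cores),  # number of cores to use
    ]


# Paths taken from the command in the original post.
cmd = build_nucleoatac_command(
    bed="/data/hs38_genes_b600.bed",
    bam="/data/Sample_Lib_A376_2/Bowtie2/50K.bam",
    fasta="/data/repository/organisms/GRCh38_ensembl/genome_fasta/genome.fa",
    out="Sample_Lib_A376_2",
    cores=10,
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` to actually launch the run, assuming `nucleoatac` is on the `PATH`.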

Even using multiple cores, the program is not intended to be run on a whole genome or anything close to that. As most of the genome will have very little coverage (and the little coverage there is will be background noise), running it on anything but open chromatin peaks is a waste of time.
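
One rough way to check how much sequence a bed file asks the program to process is to total the base pairs it covers. The sketch below is only a sanity check under stated assumptions, not part of NucleoATAC: `merged_bp` is a hypothetical helper, it assumes the first three whitespace-separated columns are chromosome, start and end, and it counts overlapping intervals once (relevant after padding regions with slopBed).

```python
from collections import defaultdict


def merged_bp(bed_lines):
    """Return the total base pairs covered by BED intervals, counting overlaps once."""
    by_chrom = defaultdict(list)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and common BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        by_chrom[fields[0]].append((int(fields[1]), int(fields[2])))

    total = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlapping or adjacent: extend the current merged interval.
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


# Toy input: two overlapping intervals on chromosome 1, one on chromosome 2.
example = ["1\t100\t500", "1\t400\t900", "2\t0\t250"]
print(merged_bp(example))  # prints 1050
```

For a real file, pass an open file handle, e.g. `with open("/data/hs38_genes_b600.bed") as fh: print(merged_bp(fh))`. A total that approaches the size of the genome suggests the regions are far broader than a set of open chromatin peaks.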

kenkclam closed this as completed Jan 27, 2016
