run_FastQTL_threaded.py script doesn't work #4

Closed
Veliborka opened this issue Jul 10, 2018 · 3 comments

Comments

@Veliborka

Hello,

We are trying to run an eQTL analysis with this modified version of FastQTL, which supports multi-threaded execution. Our BED and VCF files contain all chromosomes. We completed all the necessary preprocessing steps: every genotype other than 0|0, 0|1, 1|0, or 1|1 was replaced with .|., and the samples are the same in the VCF and BED files. We successfully ran the original version of FastQTL on the same files, but we could not get this modified script to run. This is the command we tried:

/opt/fastqtl/python/run_FastQTL_threaded.py all_chr.vcf.gz all_genes_expression.bed.gz all_genes --window 1e6 --chunks 100 --threads 16

This is the error we get:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "/opt/fastqtl/python/run_FastQTL_threaded.py", line 48, in perm_worker
s = subprocess.check_call(cmd, shell=True, executable='/bin/bash', stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '/opt/fastqtl/bin/fastQTL --vcf /sbgenomics/Projects/cb8a783b-a338-4a0b-a63b-8d154c5f7ebd/all_test_vcf_tabix.vcf.gz --bed /sbgenomics/Projects/cb8a783b-a338-4a0b-a63b-8d154c5f7ebd/all_genes_expression_tabix.bed.gz --window 1e6 --maf-threshold 0.0 --ma-sample-threshold 0 --chunk 5 100 --out all_genes_expression_tabix_chunk005.txt.gz --log all_genes_expression_tabix_chunk005.log' returned non-zero exit status -9
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/fastqtl/python/run_FastQTL_threaded.py", line 86, in
assert res.get()[0]==0
File "/usr/lib/python3.5/multiprocessing/pool.py", line 608, in get
raise self._value
subprocess.CalledProcessError: Command '/opt/fastqtl/bin/fastQTL --vcf /sbgenomics/Projects/cb8a783b-a338-4a0b-a63b-8d154c5f7ebd/all_test_vcf_tabix.vcf.gz --bed /sbgenomics/Projects/cb8a783b-a338-4a0b-a63b-8d154c5f7ebd/all_genes_expression_tabix.bed.gz --window 1e6 --maf-threshold 0.0 --ma-sample-threshold 0 --chunk 5 100 --out all_genes_expression_tabix_chunk005.txt.gz --log all_genes_expression_tabix_chunk005.log' returned non-zero exit status -9

We also wrote our own script that runs several tasks in parallel, but each time we run it only one of the tasks finishes successfully and all the others are killed.
We tried everything we could think of, but nothing helped; we get the same error every time. Could you help us solve this problem?

Thank you in advance,
Best regards,
Veliborka

@francois-a
Owner

The error states that a specific chunk failed. Can you please try to run that separately and post the output?

/opt/fastqtl/bin/fastQTL --vcf /sbgenomics/Projects/cb8a783b-a338-4a0b-a63b-8d154c5f7ebd/all_test_vcf_tabix.vcf.gz --bed /sbgenomics/Projects/cb8a783b-a338-4a0b-a63b-8d154c5f7ebd/all_genes_expression_tabix.bed.gz --window 1e6 --maf-threshold 0.0 --ma-sample-threshold 0 --chunk 5 100 --out all_genes_expression_tabix_chunk005.txt.gz --log all_genes_expression_tabix_chunk005.log
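
As the traceback shows, the threaded script sends each chunk's stdout and stderr to `subprocess.DEVNULL`, so any message from a failing chunk is discarded. Below is a minimal sketch of running one command from Python with its output captured instead. The helper name `run_chunk` is hypothetical and not part of FastQTL, and the `echo` command is only a stand-in for the real fastQTL command line.

```python
import subprocess


def run_chunk(cmd):
    """Run one shell command and return (returncode, combined stdout/stderr text).

    Hypothetical helper for illustration; not part of run_FastQTL_threaded.py.
    """
    proc = subprocess.run(
        cmd,
        shell=True,                # the threaded script also runs its command through a shell
        stdout=subprocess.PIPE,    # capture output instead of discarding it
        stderr=subprocess.STDOUT,  # merge stderr into the captured stream
        universal_newlines=True,   # decode bytes to str
    )
    return proc.returncode, proc.stdout


# Stand-in command; substitute the failing fastQTL command line here.
returncode, output = run_chunk("echo hello")
print(returncode)
print(output)
```

A non-zero `returncode` together with the captured text is usually more informative than the bare `CalledProcessError` raised by `check_call`.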

@Veliborka
Author

Yes, and this is not the only chunk that failed; many of them did. I ran the command for several of the failed chunks (5, 9, 16, etc.) and the error is the same each time. Below are the command and the log for chunk 9:

/opt/fastqtl/bin/fastQTL --vcf /data/all_mutations_vcf.tab.vcf.gz --bed /data/all_genes_new.tab.bed.gz --window 1e6 --maf-threshold 0.0 --ma-sample-threshold 0 --chunk 9 100 --out all_genes_expression_tabix_chunk009.txt.gz --log all_genes_expression_tabix_chunk009.log

Fast QTL

Perform nominal analysis (used to get raw p-values of association)

  • Using p-value threshold = 1.0000000000
  • Random number generator is seeded with 1531241022
  • Considering variants within 1e+06 bp of the MPs
  • Using minor allele frequency threshold = 0.0000
  • Using minor allele sample threshold = 0
  • Using INFO field AF threshold = 0.0000
  • Chunk processed 9 / 100

Scanning phenotype data in [/data/all_genes_new.tab.bed.gz]

  • 14200 phenotypes

Reading phenotype data in [/data/all_genes_new.tab.bed.gz]

  • region = 1:183387737-214454445
  • 445 samples included
  • 150 phenotypes included

Reading genotype data in [/data/all_mutations_vcf.tab.vcf.gz] in VCF format

  • region = 1:182387737-215454445
  • 100000 lines parsed
  • 200000 lines parsed
  • 300000 lines parsed
  • 400000 lines parsed
  • 500000 lines parsed
  • 600000 lines parsed
  • 700000 lines parsed
Killed

@francois-a
Owner

It's not easy to guess the error from this output. Could this be a memory issue?
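
One clue that may support this: Python's `subprocess` module reports a child terminated by a signal as the negated signal number, so the `-9` in the traceback means the process received signal 9. On Linux that is SIGKILL, which is what the kernel's out-of-memory killer sends, although other causes (for example a scheduler or container limit) cannot be ruled out from the log alone. A minimal sketch of decoding such return codes follows; the function name `describe_returncode` is hypothetical.

```python
import signal


def describe_returncode(returncode):
    """Explain a return code as reported by Python's subprocess module on POSIX."""
    if returncode == 0:
        return "exited successfully"
    if returncode < 0:
        # A negative value -N means the child was terminated by signal N.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return "killed by signal {} ({})".format(signum, name)
    return "exited with status {}".format(returncode)


print(describe_returncode(-9))
```

If memory is the cause, it would also explain why several chunks run in parallel are killed while a single one survives: each fastQTL process loads its own genotype region into memory.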
