make_snpdb: invalid GATK command #21

Closed
ScaonE opened this issue Nov 12, 2018 · 2 comments


ScaonE commented Nov 12, 2018

Dear all,

When launching the following command:
python2.7 ./run_snapperdb.py make_snpdb -c custom_salmo.txt;

it prints the following to stderr:

/home/scaonp01/.local/lib/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")
Namespace(command='make_snpdb', config_file='custom_salmo.txt', fastqs=[], log_dir='/home/scaonp01/Software/SnapperDB')
db_snapperdb already exists
FASTQs found for AM933172
[2018-11-12 15:12:59,831] INFO: Version: N/A
[2018-11-12 15:12:59,832] INFO: Initialising data matrix.
[2018-11-12 15:12:59,837] INFO: Mapping data file with bwa.
[2018-11-12 15:14:29,448] INFO: Creating digitised variants with gatk.
[2018-11-12 15:14:30,924] WARNING: Calling variants returned non-zero exit status.
[2018-11-12 15:14:30,925] WARNING: USAGE: [-h]

Available Programs:
...
.... (GATK available programs are listed) (-h)
....
A USER ERROR has occurred: '-T' is not a valid command.


Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.

[2018-11-12 15:14:30,925] ERROR: VCF was not created.
/home/scaonp01/Software/SnapperDB/reference_genomes/snpdb/AM933172.filtered.vcf not found ...

I am using snapperdb.py 1.0.5, GATK-4.0.11.0.

printenv shows:
PICARD_JAR=/home/scaonp01/Software/picard.jar
GATK_JAR=/home/scaonp01/Software/GATK-4.0.11.0/gatk-package-4.0.11.0-local.jar
GASTROSNAPPER_CONFPATH=/home/scaonp01/Software/SnapperDB/user_configs
GASTROSNAPPER_REFPATH=/home/scaonp01/Software/SnapperDB/reference_genomes

The PostgreSQL user "user_snapperdb" is a superuser and can access the PostgreSQL database "db_snapperdb".

Any tips? Should I work with a 3.X version of GATK?
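
For context: `-T <ToolName>` is GATK 3 command-line syntax, whereas GATK 4 expects the tool name as the first argument, which matches the "'-T' is not a valid command" error above. A minimal sketch of a pre-flight check follows, assuming the major version can be guessed from the jar path (as it can for the GATK_JAR value shown above). GATK 3 jars are often named just GenomeAnalysisTK.jar, so the guess may come back empty. The function names are hypothetical and not part of SnapperDB.

```python
import os
import re

# Matches a dotted version such as "4.0.11.0" or "3.7" and captures the major part.
_VERSION_RE = re.compile(r"(\d+)\.\d+(?:\.\d+)*")


def gatk_major_version(jar_path):
    """Guess the GATK major version from the jar path; return None if unknown."""
    # Look at the jar file name first, then at the directory that contains it.
    candidates = [
        os.path.basename(jar_path),
        os.path.basename(os.path.dirname(jar_path)),
    ]
    for name in candidates:
        match = _VERSION_RE.search(name)
        if match:
            return int(match.group(1))
    return None


def check_gatk_jar(jar_path):
    """Return 'ok' for GATK 3, 'incompatible' for other versions, else 'unknown'."""
    major = gatk_major_version(jar_path)
    if major is None:
        return "unknown"
    return "ok" if major == 3 else "incompatible"


if __name__ == "__main__":
    # GATK_JAR is the environment variable shown in the printenv output above.
    jar = os.environ.get("GATK_JAR", "")
    print("GATK_JAR=%r -> %s" % (jar, check_gatk_jar(jar)))
```

Because the check only reads the path, it cannot replace asking the jar itself for its version; it is meant as a cheap early warning.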

Edit (config file "custom_salmo.txt"):

snpdb_name db_snapperdb
reference_genome AM933172
pg_uname user_snapperdb
pg_pword somepassword
pg_host localhost
depth_cutoff 10
mq_cutoff 30
ad_cutoff 0.9
average_depth_cutoff 30
mapper bwa
mapper_threads 8
variant_caller gatk
variant_caller_threads 8
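
The config file above is a plain list of whitespace-separated `key value` pairs. As a rough sketch, it could be read and sanity-checked like this. The parser, the `#` comment handling, and the choice of "required" keys are assumptions made for illustration and are not SnapperDB's own implementation.

```python
# Keys assumed (for this sketch only) to be needed before anything can run.
REQUIRED_KEYS = ("snpdb_name", "reference_genome", "pg_uname", "pg_pword", "pg_host")


def parse_config(text):
    """Parse 'key value' lines into a dict; blank lines and '#' lines are skipped."""
    config = {}
    for line_number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first run of whitespace so values may contain spaces.
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(
                "line %d: expected 'key value', got %r" % (line_number, raw_line)
            )
        key, value = parts
        config[key] = value.strip()
    return config


def missing_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent from the parsed config."""
    return [key for key in required if key not in config]


if __name__ == "__main__":
    sample = "snpdb_name db_snapperdb\nreference_genome AM933172\nmapper bwa\n"
    parsed = parse_config(sample)
    print(parsed)
    print("missing:", missing_keys(parsed))
```

All values are kept as strings here; converting cutoffs such as `depth_cutoff` to numbers is left to the caller.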


ScaonE commented Nov 12, 2018

OK, I picked an arbitrary GATK 3.X version (3.7.0) and ran the same command again: it went further but did not complete. Now it seems that I have a PostgreSQL-related issue (see below):

PS: The required GATK version should be specified in the README.

/home/scaonp01/.local/lib/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")
Namespace(command='make_snpdb', config_file='custom_salmo.txt', fastqs=[], log_dir='/home/scaonp01/Software/SnapperDB')
db_snapperdb already exists
FASTQs found for AM933172
[2018-11-12 15:34:37,157] INFO: Version: N/A
[2018-11-12 15:34:37,157] INFO: Initialising data matrix.
[2018-11-12 15:34:37,162] INFO: Mapping data file with bwa.
[2018-11-12 15:36:05,364] INFO: Creating digitised variants with gatk.
[2018-11-12 15:37:49,841] INFO: Annotating
[2018-11-12 15:39:43,494] INFO: Applying filters: ['mq_score:30', 'min_depth:10', 'ad_ratio:0.9']
Traceback (most recent call last):
  File "./run_snapperdb.py", line 312, in <module>
    main()
  File "./run_snapperdb.py", line 307, in main
    run_command(args)
  File "./run_snapperdb.py", line 102, in run_command
    vcf_to_db(args, config_dict, vcf)
  File "/home/scaonp01/Software/SnapperDB/snapperdb/snpdb/__init__.py", line 52, in vcf_to_db
    snpdb.snpdb_upload(vcf,args)
  File "/home/scaonp01/Software/SnapperDB/snapperdb/snpdb/snpdb.py", line 463, in snpdb_upload
    if not self.check_duplicate(vcf, 'strains_snps'):
  File "/home/scaonp01/Software/SnapperDB/snapperdb/snpdb/snpdb.py", line 190, in check_duplicate
    dict_cursor.execute("select distinct(name) FROM %s where name = '%s'" % (database, vcf.sample_name))
  File "/home/scaonp01/.local/lib/python2.7/site-packages/psycopg2/extras.py", line 141, in execute
    return super(DictCursor, self).execute(query, vars)
psycopg2.ProgrammingError: relation "strains_snps" does not exist
LINE 1: select distinct(name) FROM strains_snps where name = 'AM9331...
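
The error above means the database exists but the `strains_snps` table was never created in it. One way to confirm this before uploading is to ask PostgreSQL whether the table is present. The sketch below assumes any DB-API cursor (for example one from `psycopg2`) and the default `public` schema; `table_exists` is a hypothetical helper, not part of SnapperDB.

```python
def table_exists(cursor, table_name, schema="public"):
    """Return True if schema.table_name is listed in information_schema.tables."""
    # Values are passed as query parameters rather than interpolated into
    # the SQL string, so the driver takes care of quoting.
    cursor.execute(
        "SELECT EXISTS ("
        "SELECT 1 FROM information_schema.tables "
        "WHERE table_schema = %s AND table_name = %s)",
        (schema, table_name),
    )
    return bool(cursor.fetchone()[0])


# Possible usage with psycopg2 (connection details are placeholders):
#
#   import psycopg2
#   conn = psycopg2.connect(dbname="db_snapperdb", user="user_snapperdb",
#                           host="localhost", password="...")
#   with conn.cursor() as cur:
#       if not table_exists(cur, "strains_snps"):
#           print("strains_snps is missing: the schema was not created")
```

A `False` result here points at an empty or partially created database rather than at a problem with the VCF.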

Edit (associated log_dir file):

2018-11-12 15:34:36,892 snapperdb.make_snpdb INFO PARAMS: config = custom_salmo.txt
2018-11-12 15:34:36,914 snapperdb.fastq_to_vcf INFO Running fastq_to_vcf
2018-11-12 15:34:36,914 snapperdb.fastq_to_vcf INFO Parsing config_dict
2018-11-12 15:34:36,914 snapperdb.fastq_to_vcf INFO Defining class variables and making output files
2018-11-12 15:34:36,915 snapperdb.fastq_to_vcf INFO Making FASTQs
2018-11-12 15:34:36,915 snapperdb.fastq_to_vcf INFO Running Pheonix
2018-11-12 15:42:03,196 snapperdb.snpdb.vcf_to_db INFO Initialising SNPdb class
2018-11-12 15:42:03,196 snapperdb.snpdb.vcf_to_db INFO Parsing config dict
2018-11-12 15:42:03,220 snapperdb.snpdb.vcf_to_db INFO You are running vcf_to_db. Initialising Vcf class.
2018-11-12 15:42:03,221 snapperdb.snpdb.vcf_to_db INFO Making SNPdb variables and output files
2018-11-12 15:42:03,438 snapperdb.snpdb.vcf_to_db INFO Uploading to SNPdb

Edit 2: I read this issue, as it was pretty similar.
@timdallman commented:

it looks like the database did not form correctly when you made it manually.
if you can delete it and try make_snpdb function it should work

I thus tried to follow the steps listed under "Deleting or purging your database" in the README:
dropdb -U user_snapperdb db_snapperdb;
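
The README step quoted above can also be scripted. This is a minimal sketch, assuming the `dropdb` client is on the PATH. `--if-exists` is a standard `dropdb` option that is not in the original command, and the helper names are hypothetical.

```python
import subprocess


def build_dropdb_command(db_name, user, if_exists=True):
    """Build the argument list for PostgreSQL's dropdb client."""
    command = ["dropdb", "-U", user]
    if if_exists:
        # Do not fail if the database has already been removed.
        command.append("--if-exists")
    command.append(db_name)
    return command


def drop_database(db_name, user):
    """Run dropdb and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(build_dropdb_command(db_name, user), check=True)


if __name__ == "__main__":
    # Only print the command here; call drop_database() to actually run it.
    print(" ".join(build_dropdb_command("db_snapperdb", "user_snapperdb")))
```

Passing the arguments as a list avoids shell quoting issues with unusual database names.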

@jb2cool did not have the postgresql-contrib package installed (I do).

I launched the command again after this; here is the stderr:

/home/scaonp01/.local/lib/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")
Namespace(command='make_snpdb', config_file='custom_salmo.txt', fastqs=[], log_dir='/home/scaonp01/Software/SnapperDB')
Cant connect to SnapperDB db_snapperdb
The SNPdb db_snapperdb does not exist - running sql to create database
FASTQs found for AM933172
[2018-11-12 16:06:38,596] INFO: Version: N/A
[2018-11-12 16:06:38,596] INFO: Initialising data matrix.
[2018-11-12 16:06:38,601] INFO: Mapping data file with bwa.
[2018-11-12 16:08:06,988] INFO: Creating digitised variants with gatk.
[2018-11-12 16:09:48,255] INFO: Annotating
[2018-11-12 16:11:42,808] INFO: Applying filters: ['mq_score:30', 'min_depth:10', 'ad_ratio:0.9']
Calulated depth is 128.05 - cuttoff is 30
Completed 2018-11-12 16:14:15.390364

It's all good now, right?
What misled me was the "snpdb_name" line in the config file. I thought I had to create a PostgreSQL database before launching anything. It seems I was wrong about this.

timdallman (Contributor) commented

Thanks for pointing this out; I will update the README.
