teflon error #76

Closed
tomaszjacek opened this issue Jan 14, 2021 · 17 comments

Comments

@tomaszjacek

Hi,

When I run the TEFLoN analysis with the command

python3 mcclintock.py \
-r /work/mcclintock/test/sacCer2.fasta \
-c /work/mcclintock/test/sac_cer_TE_seqs.fasta \
-g /work/mcclintock/test/reference_TE_locations.gff \
-t /work/mcclintock/test/sac_cer_te_families.tsv \
-1 /data/mcclintock/test/SRR800842_1.fastq.gz \
-2 /data/mcclintock/test/SRR800842_2.fastq.gz \
-p 10 \
-m teflon \
-o /data/mcclintock/test/output/

I get the following error:

Job counts:
count jobs
1 make_consensus_fasta
1 make_reference_fasta
1 make_te_annotations
1 setup_reads
1 summary_report
1 teflon_post
1 teflon_preprocessing
1 teflon_run
8
Environment defines Python version < 3.5. Using Python of the master process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.5 only.
Environment defines Python version < 3.5. Using Python of the master process to execute script. Note that this cannot be avoided, because the script uses data structures from Snakemake which are Python >=3.5 only.
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20
[Thu Jan 14 18:10:01 2021]
Error in rule teflon_run:
jobid: 2
output: /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/genotypes/sample.genotypes.txt
conda-env: /work/mcclintock/install/envs/conda/c707b3e8

RuleException:
CalledProcessError in line 49 of /work/mcclintock/snakefiles/teflon.snakefile:
Command 'source /opt/conda/envs/mcclintock/bin/activate '/work/mcclintock/install/envs/conda/c707b3e8'; set -euo pipefail; /opt/conda/envs/mcclintock/bin/python3.7 /data/mcclintock/test/output/snakemake/3802957/.snakemake/scripts/tmpifvv9aex.teflon_run.py' returned non-zero exit status 1.
File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
File "/work/mcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
File "/opt/conda/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
File "/opt/conda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper
Exiting because a job execution failed. Look above for error message
snakemake --use-conda --conda-prefix /work/mcclintock/install/envs/conda --quiet --configfile /data/mcclintock/test/output/snakemake/config/config_3802957.json --cores 10 /data/mcclintock/test/output/SRR800842_1/results/teflon/SRR800842_1_teflon_nonredundant.bed /data/mcclintock/test/output/SRR800842_1/results/summary/data/run/summary_report.txt

Is this a bug in TEFLoN, or should I pass an extra option in the command?

Thank you,
tj

@pbasting
Collaborator

Hi @tomaszjacek,

Can you post the contents of the TEFLoN-specific log? That should make it easier for me to determine what is going wrong. Based on the paths in the error you posted, the TEFLoN log should be at: /data/mcclintock/test/output/log/*/teflon.log

Thanks,
Preston

@tomaszjacek
Author

tomaszjacek commented Jan 15, 2021

I'm sorry, I don't know how to attach the file. Is that possible here?
For now I have to paste it.
The teflon.log file is 1135 lines long and repeats "Processed 990100 reads..." many times,
but it ends with an error. The last part of the log is below.

Thank you,
tj

[M::mem_process_seqs] Processed 990100 reads in 83.325 CPU sec, 8.546 real sec
[M::process] read 990100 sequences (100000100 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (217, 401435, 74, 95)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (61, 132, 672)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1894)
[M::mem_pestat] mean and std.dev: (313.10, 375.83)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 2505)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (276, 301, 320)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (188, 408)
[M::mem_pestat] mean and std.dev: (298.18, 33.74)
[M::mem_pestat] low and high boundaries for proper pairs: (144, 452)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (257, 3703, 9499)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27983)
[M::mem_pestat] mean and std.dev: (4134.85, 3903.56)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 37225)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (495, 753, 1247)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2751)
[M::mem_pestat] mean and std.dev: (747.34, 386.19)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3503)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 86.595 CPU sec, 8.862 real sec
[M::process] read 990100 sequences (100000100 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (205, 396794, 77, 95)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (62, 140, 510)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1406)
[M::mem_pestat] mean and std.dev: (314.52, 367.90)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1854)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (275, 301, 319)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (187, 407)
[M::mem_pestat] mean and std.dev: (297.60, 34.11)
[M::mem_pestat] low and high boundaries for proper pairs: (143, 451)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (271, 4322, 8277)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 24289)
[M::mem_pestat] mean and std.dev: (3993.38, 3576.53)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 32295)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (449, 703, 1217)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2753)
[M::mem_pestat] mean and std.dev: (687.53, 371.81)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3521)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 92.206 CPU sec, 9.404 real sec
[M::process] read 918116 sequences (92729716 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (211, 394908, 65, 89)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (71, 135, 446)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1196)
[M::mem_pestat] mean and std.dev: (211.70, 216.78)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1571)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (274, 300, 319)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (184, 409)
[M::mem_pestat] mean and std.dev: (296.91, 34.69)
[M::mem_pestat] low and high boundaries for proper pairs: (139, 454)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (285, 2584, 9521)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27993)
[M::mem_pestat] mean and std.dev: (3933.18, 3790.37)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 37229)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (404, 643, 1227)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 2873)
[M::mem_pestat] mean and std.dev: (683.83, 464.21)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 3696)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 990100 reads in 92.694 CPU sec, 9.479 real sec
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (174, 337492, 61, 93)
[M::mem_pestat] analyzing insert size distribution for orientation FF...
[M::mem_pestat] (25, 50, 75) percentile: (69, 131, 548)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1506)
[M::mem_pestat] mean and std.dev: (310.03, 353.80)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1985)
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (271, 298, 317)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (179, 409)
[M::mem_pestat] mean and std.dev: (294.40, 35.79)
[M::mem_pestat] low and high boundaries for proper pairs: (133, 455)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (308, 2984, 9472)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 27800)
[M::mem_pestat] mean and std.dev: (4027.59, 3658.31)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 36964)
[M::mem_pestat] analyzing insert size distribution for orientation RR...
[M::mem_pestat] (25, 50, 75) percentile: (513, 721, 809)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1401)
[M::mem_pestat] mean and std.dev: (719.77, 315.99)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1984)
[M::mem_pestat] skip orientation FF
[M::mem_pestat] skip orientation RF
[M::mem_pestat] skip orientation RR
[M::mem_process_seqs] Processed 918116 reads in 97.453 CPU sec, 9.857 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 10 -Y /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered//teflon.prep_MP/teflon.mappingRef.fa /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_1.fq /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_2.fq
[main] Real time: 389.589 sec; CPU: 3788.028 sec
bwa mem -t 10 -Y /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered//teflon.prep_MP/teflon.mappingRef.fa /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_1.fq /data/mcclintock/test/output/SRR800842_1/intermediate/fastq/SRR800842_1_2.fq > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sam
samtools view -Sb /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sam > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.bam
[bam_sort_core] merging from 20 files...
samtools sort -@ 10 -o /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.bam
samtools index /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam
awk: line 1: syntax error at or near *
Calculating alignment statistics
cmd: samtools stats -t /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/teflon.genomeSize.txt /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam
cmd: samtools depth -Q 20 /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.bam | awk '{sum+=$3; sumsq+=$3*$3} END {print "Average = ",sum/NR; print "Stdev = ",sqrt(sumsq/NR - (sum/NR)**2)}' > /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.sorted.cov.txt
Insert size standard deviation estimated as 45. Use the override option if you suspect this is incorrect!
Warning: coverage could not be estimated, enter coverage manually
python /work/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 10 -q 20
Traceback (most recent call last):
  File "/work/mcclintock/install/tools/teflon/teflon_collapse.py", line 165, in <module>
    main()
  File "/work/mcclintock/install/tools/teflon/teflon_collapse.py", line 103, in main
    samples.append([line.split()[0], line.split()[1], [readLen, insz, sd, total_n,cov,cov_sd]])
UnboundLocalError: local variable 'cov' referenced before assignment
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20
python /work/mcclintock/install/tools/teflon/teflon_collapse.py -wd /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/ -d /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/teflon.prep_TF/ -s /data/mcclintock/test/output/SRR800842_1/results/teflon/unfiltered/samples.tsv -t 10 -n1 1 -n2 1 -q 20

-bash-4.2$ wc -l teflon.log
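
A note on reading the end of this log: the awk syntax error comes first, then the warning "coverage could not be estimated", and only then the UnboundLocalError for `cov`. A plausible reading is that the coverage file written by the awk command did not contain the expected "Average = ..." and "Stdev = ..." lines, so the later script never assigned `cov`. The sketch below is not TEFLoN's code; `parse_coverage` is a hypothetical helper that shows how such a file could be parsed so that a missing value fails with a clear message instead of an UnboundLocalError.

```python
def parse_coverage(text):
    """Return (average, stdev) from 'Average = x' / 'Stdev = y' lines.

    Hypothetical helper: the line format is taken from the awk command
    shown in the log above, not from TEFLoN's source.
    """
    values = {}
    for line in text.splitlines():
        key, sep, raw = line.partition("=")
        if not sep:
            continue  # not a "key = value" line
        try:
            values[key.strip()] = float(raw)
        except ValueError:
            continue  # value missing or not numeric
    missing = [k for k in ("Average", "Stdev") if k not in values]
    if missing:
        raise ValueError("coverage file is missing: " + ", ".join(missing))
    return values["Average"], values["Stdev"]
```

Calling `parse_coverage("")` on an empty coverage file raises a ValueError that names the missing fields, which points directly at the failed awk step.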

@cbergman
Member

@tomaszjacek: thanks for your feedback on running McClintock. You can attach files by clicking on the bottom bar of the comment box and navigating in your finder/explorer and uploading. Alternatively, you can drag and drop files of select types into the comment box and it will upload automatically. See more here: https://docs.github.com/en/free-pro-team@latest/github/managing-your-work-on-github/file-attachments-on-issues-and-pull-requests

@zhjpeng

zhjpeng commented Jan 15, 2021

Hi, when I run McClintock as follows:

python3 ${MCK}/mcclintock.py --reference ../10-reference/HaSCD2.fa \
                                 --consensus ../10-reference/Hadb-families_rename.fa \
                                 --first ../20-NGS/${K}/${K}_1.fastq \
                                 --second ../20-NGS/${K}/${K}_2.fastq \
                                 --proc 48 \
                                 --out ${K} \
                                 --locations ./TE_annotations/HaSCD2/reference_te_locations/unaugmented_inrefTEs.gff \
                                 --taxonomy ./TE_annotations/HaSCD2/te_taxonomy/unaugmented_taxonomy.tsv

I get the following TEFLoN-related errors:

Error in rule teflon_run:
    jobid: 20
    output: /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/genotypes/sample.genotypes.txt
    conda-env: /home/dell/biosoft/mcclintock/install/envs/conda/54b8d4d7

RuleException:
CalledProcessError in line 49 of /home/dell/biosoft/mcclintock/snakefiles/teflon.snakefile:
Command 'source /home/dell/miniconda3/envs/mcclintock/bin/activate '/home/dell/biosoft/mcclintock/install/envs/conda/54b8d4d7'; set -euo pipefail;  /home/dell/miniconda3/envs/mcclintock/bin/python3.7 /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/snakemake/1571076/.snakemake/scripts/tmpc34m4ip0.teflon_run.py' returned non-zero exit status 1.
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
  File "/home/dell/biosoft/mcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
  File "/home/dell/miniconda3/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper

The teflon.log is as follows:

writing TE bed files...
writing TE bed files completed!
reducing search space...
cmd: samtools view -@ 4 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_complete.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b
search space succesfully reduced...
new reduced bam file: /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.sam_files/mega_complete.bam
clustering TE positions...
[ ================================================== ] 100.00%
clustering TE positions completed!
final reduction of search space...
cmd: samtools view -@ 4 -q 20 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b
Error running samtools: p.returncode = 1
python /home/dell/biosoft/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/ -d /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.prep_TF/ -s /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 4 -q 20
python /home/dell/biosoft/mcclintock/install/tools/teflon/teflon.v0.4.py -wd /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/ -d /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.prep_TF/ -s /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/samples.tsv -i sample -l1 family -l2 family -t 4 -q 20

When I run the samtools view command manually:

samtools view -@ 4 -q 20 -L /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/teflon.sorted.bam -b

I get the following error:

[bed_read] Parse error reading "/home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed" at line 63797
samtools view: Could not read file "/home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed"

So I looked at line 63797 of /home/newsdc/zhang_20201215/insertTE/30-mcclintock/Ac12/Ac12_1/results/teflon/unfiltered/sample.bed_files/mega_clustered.bed, which reads:
4007749
It contains only a single coordinate, which may be a start or an end.
I also found another potentially malformed record:
chr19 4007485 Unchr32 651720 651859
It looks like a chimeric record.

So, could the error above have been introduced during the "clustering TE positions" step?
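
To locate every such record rather than only the first one samtools reports, a small scan of the BED file can help. This is a rough sketch, not samtools' own parser: `find_malformed_bed_lines` is a hypothetical helper, and it assumes whitespace-separated fields with at least chrom, start, and end, where start and end are non-negative integers.

```python
def find_malformed_bed_lines(lines):
    """Yield (line_number, reason, text) for records that look malformed.

    Assumption: a valid record has >= 3 whitespace-separated fields and
    integer start/end with 0 <= start <= end.
    """
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        # Skip blank lines and the usual BED header/comment lines.
        if not text.strip() or text.startswith(("#", "track", "browser")):
            continue
        fields = text.split()
        if len(fields) < 3:
            yield number, "fewer than 3 fields", text
            continue
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            yield number, "start/end not integers", text
            continue
        if start < 0 or start > end:
            yield number, "invalid interval", text
```

Usage, with the path being whatever mega_clustered.bed your run produced:

```python
# with open("mega_clustered.bed") as handle:
#     for number, reason, text in find_malformed_bed_lines(handle):
#         print(number, reason, text, sep="\t")
```

Both records quoted above would be flagged: the first for having a single field, the second because its third field is a chromosome name rather than an end coordinate.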

@pbasting
Collaborator

@tomaszjacek

  • Based on the information posted, it looks like you are having issues running TEFLoN on the test data. I've run this dataset many times without issue, which suggests that the problem is unrelated to the data and is more likely an issue with the environment.
  • When I look through the log you posted, I see that the first error is: awk: line 1: syntax error at or near *
  • I do not have an awk interpreter included in the TEFLoN conda environment, so any awk commands internal to TEFLoN would be using the awk interpreter installed on your system.
  • I've been running with GNU Awk 4.2.1 and haven't had issues
$ awk --version
GNU Awk 4.2.1, API: 2.0 (GNU MPFR 3.1.6-p2, GNU MP 6.1.2)
Copyright (C) 1989, 1991-2018 Free Software Foundation.
  • TEFLoN has two lines where awk is used.
$ grep "awk" *.py
teflon.v0.4.py:    cmd="""%s depth -Q %s %s | awk '{sum+=$3; sumsq+=$3*$3} END {print "Average = ",sum/NR; print "Stdev = ",sqrt(sumsq/NR - (sum/NR)**2)}' > %s""" %(exeSAM, str(qual), bam, covFILE)
$ grep "awk" teflon_scripts/*.py
teflon_scripts/subsample_alignments.py:                cmd="""%s depth -Q %s %s | awk '{sum+=$3; sumsq+=$3*$3} END {print "Average = ",sum/NR; print "Stdev = ",sqrt(sumsq/NR - (sum/NR)**2)}' > %s""" %(exePATH, str(qual), bamFILE, covFILE)

  • I think the easiest solution is to include gawk in the TEFLoN conda environment to ensure that users are using the same awk interpreter.
  • @tomaszjacek I'll update the TEFLoN environment yaml and test that it is working properly. Then I'll let you know when it's ready for you to try out. Hopefully this will resolve your issue.
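
For context on why the interpreter matters: the one-liner uses `**` for exponentiation, which is not part of POSIX awk (POSIX defines `^`), while gawk accepts both. As an interpreter-independent cross-check, the same statistics can be computed in Python. This is a minimal sketch, not TEFLoN's code; `depth_stats` is a hypothetical name, and it assumes the three-column `samtools depth` output (chromosome, position, depth) that the awk command reads.

```python
import math


def depth_stats(depth_lines):
    """Mean and population std. dev. of column 3, mirroring the awk one-liner."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for line in depth_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        depth = float(fields[2])
        n += 1
        total += depth
        total_sq += depth * depth
    if n == 0:
        raise ValueError("no depth records")
    mean = total / n
    # Guard against tiny negative values caused by floating-point rounding.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)
```

Feeding it the lines produced by `samtools depth -Q 20 teflon.sorted.bam` should give the same "Average" and "Stdev" values the awk command writes when it runs successfully.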

@pbasting
Collaborator

  • @zhjpeng (teflon error #76 (comment)) I have seen this issue before as well. It seems to be sample dependent. Most of my McClintock runs with TEFLoN do not have this issue but some specific samples will have this occur where the mega_clustered.bed is malformed.
  • I am fairly certain this is a bug in TEFLoN and not related to mcclintock, so I am going to work on replicating this bug outside of McClintock with just TEFLoN. Then I'll open an issue on the actual TEFLoN repository (https://github.com/jradrion/TEFLoN) to see if their developers know what is going on.
  • I'll let you know when I've posted the issue

@zhjpeng

zhjpeng commented Jan 16, 2021

> • @zhjpeng (#76 (comment)) I have seen this issue before as well. It seems to be sample dependent. Most of my McClintock runs with TEFLoN do not have this issue but some specific samples will have this occur where the mega_clustered.bed is malformed.
> • I am fairly certain this is a bug in TEFLoN and not related to mcclintock, so I am going to work on replicating this bug outside of McClintock with just TEFLoN. Then I'll open an issue on the actual TEFLoN repository (https://github.com/jradrion/TEFLoN) to see if their developers know what is going on.
> • I'll let you know when I've posted the issue

Thanks for your reply. I am running McClintock on more samples and will check whether they show similar errors.

@tomaszjacek
Author

> @tomaszjacek: thanks for your feedback on running McClintock. You can attach files by clicking on the bottom bar of the comment box and navigating in your finder/explorer and uploading. Alternatively, you can drag and drop files of select types into the comment box and it will upload automatically. See more here: https://docs.github.com/en/free-pro-team@latest/github/managing-your-work-on-github/file-attachments-on-issues-and-pull-requests

Thank you,
tj

@pbasting
Collaborator

  • @tomaszjacek I've updated the mcclintock master branch b61563e with the change to the TEFLoN environment that now includes gawk. You should be able to update your mcclintock repository with a git pull. Then you should do a clean install with mcclintock.py --install which will install TEFLoN with the updated conda environment.
  • Let me know if this resolves the bug you were experiencing earlier.

@cbergman
Member

@tomaszjacek
Author

> • @tomaszjacek I've updated the mcclintock master branch b61563e with the change to the TEFLoN environment that now includes gawk. You should be able to update your mcclintock repository with a git pull. Then you should do a clean install with mcclintock.py --install which will install TEFLoN with the updated conda environment.
> • Let me know if this resolves the bug you were experiencing earlier.

@pbasting

It works,
Thank you,
tj

@yuryfunikov

yuryfunikov commented Mar 14, 2021

Unfortunately, git pull && mcclintock.py --install didn't help me.
Is there any way to verify that TEFLoN was updated, and/or a way to get the version of the component being used?

@pbasting
Collaborator

Hi @yuryfunikov ,

  • Can you post the version of mcclintock you are using?
cd /path/to/mcclintock
git rev-parse HEAD
  • Also can you describe the issues you are having in detail and provide examples of the error messages you are receiving?
  • If they are different than what is described in: teflon error #76 (comment) and teflon error #76 (comment) then please post this information in a new issue.
  • And to answer your question from: teflon error #76 (comment), the best way to be sure you are using the correct versions of McClintock and the component methods is to do a clean installation of the newest commit: 5849097 following the instructions in the README
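
The commit check above can also be scripted, for example to confirm that a checkout is at the commit named in this thread before starting a long run. This is a minimal sketch: `current_head` and `head_matches` are hypothetical helper names, and the only external call is `git rev-parse HEAD`.

```python
import subprocess


def current_head(repo_dir):
    """Return the full commit hash of HEAD in repo_dir via `git rev-parse HEAD`."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def head_matches(full_sha, abbreviated_sha):
    """True if the full hash starts with the abbreviated hash (case-insensitive)."""
    return bool(abbreviated_sha) and full_sha.lower().startswith(abbreviated_sha.lower())
```

For example, `head_matches(current_head("/path/to/mcclintock"), "5849097")` would be true only if the repository is at the commit mentioned above. Note that this confirms the McClintock commit only; it does not by itself prove that the conda environments were rebuilt, which is why a clean installation is still the safer route.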

Thanks!

Preston

@yuryfunikov

Hi and thanks for the answer,

this is what i got:

  1. I ran git pull && mcclintock.py --install
  2. I checked the commit with git rev-parse HEAD:
mcclintock$ git rev-parse HEAD
5849097de4f74b0b8b149cad138e31024082924c
  3. Then I ran:
python3 ./../mcclintock/mcclintock.py -r dvir-all-chromosome-r.1.06.fasta -c asymmetric_TEs_v1.fasta -1 160JB_dna_seq_1_trimmed.fastq.gz -2 160JB_dna_seq_2_trimmed.fastq.gz -p 1 -m teflon -o mcclintock_out_assTEv1_160_refgen/ --resume --debug

That resulted in the following error:

RuleException:
CalledProcessError in line 49 of /path/to/file/mcclintock/snakefiles/teflon.snakefile:
Command 'source /opt/miniconda/envs/mcclintock/bin/activate '/path/to/file/mcclintock/install/envs/conda/cc1216b5'; set -euo pipefail;  /opt/miniconda/envs/mcclintock/bin/python3.7 /path/to/file/mcclintock_out_assTEv1_160_refgen/snakemake/3370691/.snakemake/scripts/tmp6dm6acdf.teflon_run.py' returned non-zero exit status 1.
  File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
  File "/path/to/filemcclintock/snakefiles/teflon.snakefile", line 49, in __rule_teflon_run
  File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
  File "/opt/miniconda/envs/mcclintock/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
  File "/opt/miniconda/envs/mcclintock/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: //path/to/file/mcclintock_out_assTEv1_160_refgen/snakemake/3370691/.snakemake/log/2021-03-15T001010.010823.snakemake.log
  4. Then I checked the TEFLoN log:
-rw-rw-r-- 1 sergey sergey   2425 Mar 15 00:16 ./mcclintock_out_assTEv1_160_refgen/logs/20210315.001008.3370691/teflon.log

./mcclintock_out_assTEv1_160_refgen/logs/20210315.001008.3370691/teflon.log:

writing TE bed files...
writing TE bed files completed!
reducing search space...
cmd: samtools view -@ 1 -L /path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/sample.bed_files/mega_complete.bed /path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/teflon.sorted.bam -b
Error running samtools: p.returncode = 1
py

It also looks like mega_complete.bed wasn't created at all:

/path/to/file/mcclintock_out_assTEv1_160_refgen/160JB_dna_seq_1_trimmed/results/teflon/unfiltered/sample.bed_files/mega_complete.bed: No such file or directory

I should also mention that the pipeline used to work without problems, but then it started failing with this error from time to time, and now it fails every time we run the script.

Please let me know if you think I should file a new ticket for this.

@pbasting
Collaborator

Thanks @yuryfunikov, this looks similar to the problem described in: #76 (comment). We have contacted the TEFLoN developer and I think that the bug has been fixed (see: jradrion/TEFLoN#8), but I am currently testing the fix and integrating the changes into mcclintock. I'll let you know when these changes have been integrated.

@yuryfunikov

Hi,

Sorry to bother you, but have you had a chance to look into this?

@pbasting
Collaborator

pbasting commented Apr 8, 2021

@yuryfunikov Sorry for not replying earlier. I have integrated the most recent update to TEFLoN into mcclintock, so I'd suggest re-installing the newest version of mcclintock (40863ac) and trying TEFLoN again on your sample to see if the issue is resolved.
