Running Tabsat test
#################################
## Welcome to TABSAT! ##
## github.com/tadKeys/tabsat ##
#################################
- Library is PE
- Using aligner: bowtie2
- MIN_READ_QUAL: 21
- MIN_READ_LEN: 9
- Maximum read length not specified. Setting it to 100000
- MAX_READ_LEN: 100000
- PERCENT_TARGET: 0.7
- READ_CUTOFF: 4
- Sort list not specified. Setting it to ''
- SORT_LIST:
- checking the target list
- Target list is ok
- Creating and setting coverage directory: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2
- Creating and setting CpG directory: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/CPG_NONDIR_bowtie2
-- Creating quality directory
- Basename: SRR3296596_1
- current_dir: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2
- Running Bismark...
- First file of PE SRR3296596_1
- Found 2nd PE file /media/MyBook/progs/tabsat-master/tools/zz_test/SRR3296596_2.fastq
- Starting 02_meth_pipe.sh
- /media/MyBook/progs/tabsat-master/tools/zz_test/SRR3296596_1.fastq /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2 /media/MyBook/progs/tabsat-master/tools/zz_test/target_list_miseq.csv NONDIR bowtie2 9 100000 21 hg19 PE /media/MyBook/progs/tabsat-master/tools/zz_test/SRR3296596_2.fastq
- SeqLibrary: NONDIR
- Aligner: bowtie2
- param_min_length: 9
- param_max_length: 100000
- param_min_qual: 21
- param_ref_genome: hg19
- param_seq_library: PE
- PE library - looking for second file
- Found 2nd PE file: /media/MyBook/progs/tabsat-master/tools/zz_test/SRR3296596_2.fastq
-- seq_library: NONDIR
-- Performing QC with /media/MyBook/progs/tabsat-master/tools/zz_test/SRR3296596_1.fastq
- folder: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2
- filename: SRR3296596_1.fastq
-- ... done performing QC
-- Performing QC with /media/MyBook/progs/tabsat-master/tools/zz_test/SRR3296596_2.fastq
- folder: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2
- filename: SRR3296596_2.fastq
-- ... done performing QC
-- Performing Fastq filtering/trimming (filter min-length: 9bp, max-length: 100000bp, trim 3' end quality <21)
-- Filter for PE
- FILENAME in PE setting: SRR3296596
-- ... done performing filtering/trimming
-- Performing QC with /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_2.fastq
- folder: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2
- filename: SRR3296596_trimmed_2.fastq
-- Performing QC with /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq
- folder: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2
- filename: SRR3296596_trimmed_1.fastq
-- ... done performing QC
-- ... done performing QC
-- Calling Bismark PE: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq and /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_2.fastq
-- ... done with bismark.
-- Performing methyl extraction on SAM_FILE: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596*.sam ...
-- ... done with methyl extraction.
Copying coverage files
cp /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/*.cov /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2
Copying CpG files with strand information
cp /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/CpG_*txt /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/CPG_NONDIR_bowtie2
Copying SAM files
Copying qc files
Removing all reads with coverage 1
convert target list to bed
/media/MyBook/progs/tabsat-master/tools/samtools/samtools-1.4/samtools view -S -b -h /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam > /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one.bam
/media/MyBook/progs/tabsat-master/tools/samtools/samtools-1.4/samtools sort /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one.bam -o /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one_sorted.bam
/media/MyBook/progs/tabsat-master/tools/bedtools/bedtools2/bin/intersectBed -v -a /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one_sorted.bam -b /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/target_list_for_cov_one.bed > /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one_sorted_off_target.bam
***** WARNING: File /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one_sorted.bam has inconsistent naming convention for record:
1 146549825 146549977 SRR3296596.106107_106107_length=150/1 0 +
***** WARNING: File /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one_sorted.bam has inconsistent naming convention for record:
1 146549825 146549977 SRR3296596.106107_106107_length=150/1 0 +
/media/MyBook/progs/tabsat-master/tools/bedtools/bedtools2/bin/genomeCoverageBed -ibam /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one_sorted_off_target.bam -bg | awk '$4 < 2' | /media/MyBook/progs/tabsat-master/tools/bedtools/bedtools2/bin/intersectBed -a /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_cov_one_sorted.bam -b - -v > /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_one_final.bam
/media/MyBook/progs/tabsat-master/tools/samtools/samtools-1.4/samtools view -h /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_one_final.bam > /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam_removed_one_final.sam
Copy to current sam
Create samtools stats
/media/MyBook/progs/tabsat-master/tools/samtools/samtools-1.4/samtools view -S -b /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam > /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_1.bam
/media/MyBook/progs/tabsat-master/tools/samtools/samtools-1.4/samtools sort /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_1.bam /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_1.bam.sorted.bam
/media/MyBook/progs/tabsat-master/tools/samtools/samtools-1.4/samtools index /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_1.bam.sorted.bam
-- Copying the CpH files
-- Creating quality stats
/media/MyBook/progs/tabsat-master/tools/ait/check_quality.sh /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/quality /media/MyBook/progs/tabsat-master/tools/zz_test/target_list_miseq.csv /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_1_NONDIR_bowtie2/SRR3296596_1.bam.sorted.bam
***** WARNING: File - has inconsistent naming convention for record:
1 146549826 146549975 SRR3296596.24685_24685_length=150/1 8 +
***** WARNING: File - has inconsistent naming convention for record:
1 146549826 146549975 SRR3296596.24685_24685_length=150/1 8 +
***** WARNING: File - has inconsistent naming convention for record:
1 146549826 146549975 SRR3296596.24685_24685_length=150/1 8 +
***** WARNING: File - has inconsistent naming convention for record:
1 146549826 146549975 SRR3296596.24685_24685_length=150/1 8 +
/media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/quality/SRR3296596_1.bam.sorted.non_intersect_cov.bed
- Lines in no-intersect cov_bed: 21
/media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/quality/SRR3296596_1.bam.sorted.non_intersect_cov_merged.bed
- Empty intersect cov_bed
- Basename: SRR3296596_2
- current_dir: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/SRR3296596_2_NONDIR_bowtie2
- Running Bismark...
- In list of files to process a 2nd PE file was found. Don't do anything; continue with loop
-- Combinind idxstats
-- .. done with combining idxstats
-- Creating final table..
/media/MyBook/progs/tabsat-master/tools/ait/create_final_table.py /media/MyBook/progs/tabsat-master/tools/zz_test/target_list_miseq.csv /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2 bowtie2 /media/MyBook/progs/tabsat-master/tools/ait/all_cpgs_only_pos_hg19.txt 4
- target_list: /media/MyBook/progs/tabsat-master/tools/zz_test/target_list_miseq.csv
- cov_dir: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2
- mapper: bowtie2
- cpg_file_path: /media/MyBook/progs/tabsat-master/tools/ait/all_cpgs_only_pos_hg19.txt
- checking the target list
- Target list is ok
- using the read cutoff specified by the user: 4
***** Generating final output table (Python) *****
- reading cutoff config
-- cov_sum_cutoff: 4
-- cov_pos_cutoff: 2
-- read_cutoff: 4
- preparing list
2018-09-21 11:00:39 Prefilling strand information buffer with 190 items
2018-09-21 11:01:03
- creating entries
cur_cov_file_basename: SRR3296596_trimmed_1.fastq_bismark_bt2_pe.bismark.cov
cov_file name: SRR3296596
- cleaning result file
- filtering low cov calls
- cleaning filtered result file
- keeping only CpGs from reference
- Remove temp files
- Final output file: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/ResultMethylListOnlyReferenceCpGs.csv
Done.
-- ... done with table creation.
-- Creating final bed ...
Converting table: ResultMethylListOnlyReferenceCpGs.csv
-- .. done with bed creation.
-- Create lollipop plots for every target
Rscript --vanilla /media/MyBook/progs/tabsat-master/tools/ait/lollipop.R /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2 NONDIR bowtie2
named list()
[1] "File does not exist: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/.csv"
[1] "Done printing all targets."
-- .. done with plotting.
- CMD subpopulations: /media/MyBook/progs/tabsat-master/tools/MethylSubpop/subpopulations.sh -i /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations -p 0.7 -t /media/MyBook/progs/tabsat-master/tools/zz_test/target_list_miseq.csv
- Starting with methylation pattern analysis
- Output will be saved in /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/Output
- Whole Target for /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
- Intermediate Positions for /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
- Paste intermediate Positions for /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
- Intermediate Subpops for /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
- Final Positions for /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
- Paste final Positions for /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
- Final Subpops for /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/SRR3296596_trimmed_1.fastq_bismark_bt2_pe.sam
- Comparison of first and last methylation positions in all samples
- Finding methylation subpopulations
- Done with workflow
-- Creating patternmap
copy command: cp /media/MyBook/progs/tabsat-master/tools/zz_test/SRR3296596_1.fastq /media/MyBook/progs/tabsat-master/tools/zz_test/SRR3296596_2.fastq "/media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/copied_inputs"
- CMD patternmap: /media/MyBook/progs/tabsat-master/tools/Patternmap/patternmap.sh -i /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq -s /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/copied_inputs -t /media/MyBook/progs/tabsat-master/tools/zz_test/target_list_miseq.csv -a bowtie2
Preparing SampleComparison for Patternmap
INDIR: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq
OUTDIR: /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/Patternmap
cp /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/MethylSubpopulations/Output/SampleComparison.txt /media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/Patternmap/All_targets.txt
---- Crete target lists in patternmap
rm: cannot remove ‘Target*.target’: No such file or directory
awk: fatal: cannot open file `*.target' for reading (No such file or directory)
awk: fatal: cannot open file `*.target.trshld.target' for reading (No such file or directory)
*.target.jsons: No such file or directory
pos.log: No such file or directory
- Patternmap: removing *.log, *.target, *.jsons
/media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/Patternmap
rm: cannot remove ‘*.log’: No such file or directory
rm: cannot remove ‘*.jsons’: No such file or directory
rm: cannot remove ‘Target*’: No such file or directory
Done with Patternmap
-- Copying the plots
cp: cannot stat ‘/media/MyBook/progs/tabsat-master/tabsat_test_output_miseq/COVERAGE_NONDIR_bowtie2/PLOTS/*’: No such file or directory
Start summarizing off-targets ...
Finished summarizing off-targets.
-- Converting all plots to pngs
ls: cannot access *.pdf: No such file or directory
Creating final JSON file
/media/MyBook/progs/tabsat-master/tools/ait/prepare_json.py ResultMethylListOnlyReferenceCpGs.csv plots subpopulations qc idxstats cph ResultMethylListOnlyReferenceCpGs.bed
Creating final JSON file from ResultMethylListOnlyReferenceCpGs.csv using plot directory: 'plots' and subpop dir: 'subpopulations'
Traceback (most recent call last):
  File "/media/MyBook/progs/tabsat-master/tools/ait/prepare_json.py", line 323, in <module>
    main()
  File "/media/MyBook/progs/tabsat-master/tools/ait/prepare_json.py", line 314, in main
    do(final_table, plot_directory, subpopulation_directory, qc_directory, idx_directory, zip_file_path)
  File "/media/MyBook/progs/tabsat-master/tools/ait/prepare_json.py", line 214, in do
    count_reader.next()
  File "/usr/lib64/python2.7/csv.py", line 104, in next
    row = self.reader.next()
StopIteration
Finished analyzing the data.
##########################################
## Thank you for using TABSAT! ##
## github.com/tadKeys/tabsat ##
##########################################