https://f1000research.com/articles/6-2055/v2

> #### Resolution of the DNA methylation state of single CpG dyads using in silico strand annealing and WGBS data
Nature Protocols | https://www.nature.com/articles/s41596-018-0090-x

> - Whole-genome bisulfite sequencing (WGBS) has been widely used to quantify cytosine DNA methylation frequency in an expanding array of cell and tissue types. Because of the denaturing conditions used, this method ultimately leads to the measurement of methylation frequencies at single cytosines. 
> - Hence, the methylation frequency of CpG dyads (two complementary CG dinucleotides) can be only indirectly inferred by overlaying the methylation frequency of two cytosines measured independently. 
> - Furthermore, hemi-methylated CpGs (hemiCpGs) have not been previously analyzed in WGBS studies. We recently developed in silico strand annealing (iSA), a bioinformatics method applicable to WGBS data, to resolve the methylation status of CpG dyads into __unmethylated__, __hemi-methylated__, and __methylated__. 
> - HemiCpGs account for 4–20% of the DNA methylome in different cell types, and some can be inherited across cell divisions, suggesting a role as a stable epigenetic mark. Therefore, it is important to resolve hemiCpGs from fully methylated CpGs in WGBS studies. This protocol describes step-by-step commands to accomplish this task, including dividing alignments by strand, pairing alignments between strands, and extracting single-fragment methylation calls. 
> - The versatility of iSA enables its application downstream of other WGBS-related methods such as nasBS-seq (nascent DNA bisulfite sequencing), ChIP-BSseq (ChIP followed by bisulfite sequencing), TAB-seq, oxBS-seq, and fCAB-seq. iSA is also tunable for analyzing the methylation status of cytosines in any sequence context. 
> - We exemplify this flexibility by uncovering the single-fragment non-CpG methylome. 
> - This protocol provides enough details for users with little experience in bioinformatic analysis and takes 2–7 h.
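
To make the three dyad states from the abstract concrete, here is a minimal Python sketch. It is not the iSA implementation (iSA pairs alignments between strands first); it only shows the classification that follows once both strands of one fragment have a call. The function name and the toy calls are made up for illustration.

```python
from collections import Counter


def classify_dyad(top_methylated: bool, bottom_methylated: bool) -> str:
    """Classify one CpG dyad from the calls on its two strands."""
    if top_methylated and bottom_methylated:
        return "methylated"
    if top_methylated or bottom_methylated:
        return "hemi-methylated"
    return "unmethylated"


# Hypothetical paired calls (top strand, bottom strand) for four dyads.
paired_calls = [(True, True), (True, False), (False, False), (False, True)]

dyad_counts = Counter(classify_dyad(top, bottom) for top, bottom in paired_calls)
print(dyad_counts)
```

Averaging the two cytosines independently would hide the two hemi-methylated dyads in this toy set, which is the point the abstract makes.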

> #### Extended-representation bisulfite sequencing of gene regulatory elements in multiplexed samples and single cells
https://www.nature.com/articles/s41587-021-00910-x

<!-- > - Before alignment, primer dimers were filtered using Cutadapt version 2.7 and the following parameters: `--discard -a GCTCTTCCGATCT`. Short read pairs were trimmed using Trim Galore version 0.6.5 and the following parameters: `--paired --illumina --nextseq 20`. 
> - High-quality sequencing reads were then aligned to an in silico bisulfite-converted reference genome (hg38 and mm10) using methylCtools version 1.0.0 (https://github.com/hovestadt/methylCtools, ref. 49) and bwa mem version 0.7.17. 
> - Sorted alignments were further processed to only maintain uniquely mapped read pairs with a mapping score ≥1, that were mapping to an MspI cut site and that had an insert size between 20 bp and 600 bp. Putative PCR duplicates were removed by considering the outer mapping position of both paired-end reads (read 2 being located at the MspI cut site and read 1 being located at variable positions), as well as the random hexamer sequence that was trimmed before alignment and functions as a UMI. For library complexity analysis, alignments were downsampled before this step. 
> - We note that multiple random hexamer priming events during the second-strand synthesis step might lead to additional sequencing reads from the same original fragment that cannot be identified using this approach. DNA methylation calling was performed using methylCtools bcall and the --trimPE parameter. Detailed quality metrics for each library are provided in Supplementary Table 1. DNA methylation values were deposited in the GEO (GSE149954) for all samples reported in this study.


 -->
 
This is simply called the RRBS protocol :)

# SRA Data

my `sra` env

`fastq-dump`

https://www.biostars.org/p/222122/


<!--     fastq-dump --outdir fastq --gzip --split-3 sra/SRR11711273.sra

    Read 17522209 spots for sra/SRR11711273.sra
    Written 17522209 spots for sra/SRR11711273.sra


___

    parallel-fastq-dump --threads 12 --outdir fastq/ --split-files --gzip sra/SRR11711272.sra -->
    

<!--     fasterq-dump --split-files --include-technical --threads 1 
    --temp . --outfile HL60_10ng_dmso.fastq --progress SRR11711272


    fasterq-dump --split-files --include-technical --threads 1 
    --temp . --outfile HL60_10ng_decitabine.fastq --progress SRR11711273 -->


    wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR117/072/SRR11711272/SRR11711272_1.fastq.gz
    wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR117/072/SRR11711272/SRR11711272_2.fastq.gz
    
    wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR117/073/SRR11711273/SRR11711273_1.fastq.gz
    wget ftp.sra.ebi.ac.uk/vol1/fastq/SRR117/073/SRR11711273/SRR11711273_2.fastq.gz
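
The four URLs above follow a regular pattern, so they can be generated from the run accession. A small sketch, assuming the directory layout visible in those URLs (first six characters of the accession, then a zero-padded subdirectory built from the digits beyond the sixth) and paired-end `_1`/`_2` file names; `ena_fastq_url` is a hypothetical helper, not part of any tool.

```python
def ena_fastq_url(run: str, mate: int) -> str:
    """Build the ENA FTP path for one mate of a paired-end run (assumed layout)."""
    prefix = run[:6]   # e.g. "SRR117"
    digits = run[3:]   # numeric part of the accession
    if len(digits) <= 6:
        # Short accessions sit directly under the prefix directory.
        directory = f"{prefix}/{run}"
    else:
        # Longer accessions get an extra zero-padded subdirectory,
        # e.g. the trailing "72" becomes "072".
        subdir = digits[6:].zfill(3)
        directory = f"{prefix}/{subdir}/{run}"
    return f"ftp.sra.ebi.ac.uk/vol1/fastq/{directory}/{run}_{mate}.fastq.gz"


for run in ("SRR11711272", "SRR11711273"):
    for mate in (1, 2):
        print(ena_fastq_url(run, mate))
```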

In [3]:
mkdir fastq

In [4]:
!mv -v *fastq.gz fastq/

‘SRR11711272_1.fastq.gz’ -> ‘fastq/SRR11711272_1.fastq.gz’
‘SRR11711272_2.fastq.gz’ -> ‘fastq/SRR11711272_2.fastq.gz’
‘SRR11711273_1.fastq.gz’ -> ‘fastq/SRR11711273_1.fastq.gz’
‘SRR11711273_2.fastq.gz’ -> ‘fastq/SRR11711273_2.fastq.gz’


# Reading and annotating the methylation counts

## Trimming

<!-- > Before alignment, primer dimers were filtered using Cutadapt version 2.7 and the following parameters: `--discard -a GCTCTTCCGATCT`. Short read pairs were trimmed using Trim Galore version 0.6.5 and the following parameters: `--paired --illumina --nextseq 20`. -->

In [7]:
!mkdir -p logs/
!mkdir -p logs/trim_galore/

In [8]:
%%bash
for fq1 in fastq/*_1.fastq.gz; do
    # Mate file and log name are derived from the read 1 file name
    fq2=${fq1/_1.fastq/_2.fastq}
    b=$(basename "$fq1")
    log_file=${b/_1.fastq.gz/.log}
    cm="trim_galore --core 3 --paired --rrbs -o fastq $fq1 $fq2"
    echo $cm
    $cm &> "logs/trim_galore/$log_file"
done

trim_galore --core 3 --paired --rrbs -o fastq fastq/SRR11711272_1.fastq.gz fastq/SRR11711272_2.fastq.gz
trim_galore --core 3 --paired --rrbs -o fastq fastq/SRR11711273_1.fastq.gz fastq/SRR11711273_2.fastq.gz
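
The name handling in the loop (read 1 file, mate file, sample log) is easy to get wrong, so here is the same derivation as a Python sketch. It only builds the command and log path and does not run anything; `trim_galore_job` is a hypothetical helper, and the flags are copied from the commands echoed above.

```python
from pathlib import Path


def trim_galore_job(fq1: str, cores: int = 3, outdir: str = "fastq"):
    """Return (command, log_path) for one read 1 FASTQ file."""
    r1 = Path(fq1)
    suffix = "_1.fastq.gz"
    if not r1.name.endswith(suffix):
        raise ValueError(f"not a read 1 file: {fq1}")
    sample = r1.name[: -len(suffix)]
    r2 = r1.with_name(f"{sample}_2.fastq.gz")
    command = [
        "trim_galore", "--core", str(cores), "--paired", "--rrbs",
        "-o", outdir, r1.as_posix(), r2.as_posix(),
    ]
    log_path = Path("logs/trim_galore") / f"{sample}.log"
    return command, log_path


command, log_path = trim_galore_job("fastq/SRR11711272_1.fastq.gz")
print(" ".join(command))
print(log_path.as_posix())
```

Returning the command as a list keeps it ready for `subprocess.run(command)` without shell quoting issues.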


In [9]:
mv -v fastq/*trimming_report.txt logs/trim_galore/

‘fastq/SRR11711272_1.fastq.gz_trimming_report.txt’ -> ‘logs/trim_galore/SRR11711272_1.fastq.gz_trimming_report.txt’
‘fastq/SRR11711272_2.fastq.gz_trimming_report.txt’ -> ‘logs/trim_galore/SRR11711272_2.fastq.gz_trimming_report.txt’
‘fastq/SRR11711273_1.fastq.gz_trimming_report.txt’ -> ‘logs/trim_galore/SRR11711273_1.fastq.gz_trimming_report.txt’
‘fastq/SRR11711273_2.fastq.gz_trimming_report.txt’ -> ‘logs/trim_galore/SRR11711273_2.fastq.gz_trimming_report.txt’


In [10]:
!multiqc logs/trim_galore/ -f -n multiqc-trim


  /// MultiQC 🔍 | v1.12

|           multiqc | Search path : /data_gilbert/home/aarab/Projects/Decitabine-treatment/Bisulfite-seq/logs/trim_galore
|         searching | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 6/6
|          cutadapt | Found 4 reports
|           multiqc | Compressing plot data
|           multiqc | Deleting    : multiqc-trim.html   (-f was specified)
|           multiqc | Deleting    : multiqc-trim_data   (-f was specified)
|           multiqc | Report      : multiqc-trim.html
|           multiqc | Data        : multiqc-trim_data
|           multiqc | MultiQC complete


## Processing the BS-seq FASTQ files with Bismark

Bismark: https://github.com/FelixKrueger/Bismark

It failed, maybe because of the genome annotation; see [there's no output result #34](https://github.com/FelixKrueger/Bismark/issues/34).

I've discussed my issues here: https://github.com/FelixKrueger/Bismark/issues/475

### Index genome

__(I) Running bismark_genome_preparation__

In [109]:
# !bismark_genome_preparation ~/genomes/hg38/gencode.v34/

In [112]:
!gzip -d /data_gilbert/home/aarab/genomes/hg38/chromosomes/*.fa.gz

In [None]:
!bismark_genome_preparation --verbose ~/genomes/hg38/chromosomes/

Path to genome folder specified as: /data_gilbert/home/aarab/genomes/hg38/chromosomes/
Aligner to be used: >> Bowtie 2 << (default)
Writing bisulfite genomes out into a single MFA (multi FastA) file

Bismark Genome Preparation - Step I: Preparing folders

Bisulfite Genome Indexer version v0.23.1 (last modified: 27 Jan 2021)
Created Bisulfite Genome folder /data_gilbert/home/aarab/genomes/hg38/chromosomes/Bisulfite_Genome/
Created Bisulfite Genome folder /data_gilbert/home/aarab/genomes/hg38/chromosomes/Bisulfite_Genome/CT_conversion/
Created Bisulfite Genome folder /data_gilbert/home/aarab/genomes/hg38/chromosomes/Bisulfite_Genome/GA_conversion/

Step I - Prepare genome folders - completed


Bismark Genome Preparation - Step II: Bisulfite converting reference genome

conversions performed:
chromosome	C->T	G->A
chr1	48055043	48111528
chr10	27639505	27719976
chr11	27903257	27981801
chr12	27092804	27182678
chr13	18839192	18933605
chr14	18423758	18559033
chr15	17752941	17825903
chr16	18172

In [35]:
!bismark_genome_preparation --verbose --hisat2 ~/genomes/hg38/chromosomes/

Path to genome folder specified as: /data_gilbert/home/aarab/genomes/hg38/chromosomes/
Aligner to be used: >> HISAT2 <<
Writing bisulfite genomes out into a single MFA (multi FastA) file

Bismark Genome Preparation - Step I: Preparing folders

Bisulfite Genome Indexer version v0.23.1 (last modified: 27 Jan 2021)

A directory called /data_gilbert/home/aarab/genomes/hg38/chromosomes/Bisulfite_Genome/ already exists. Already existing converted sequences and/or already existing Bowtie 2 or HISAT2) indices will be overwritten!


Step I - Prepare genome folders - completed


Bismark Genome Preparation - Step II: Bisulfite converting reference genome

conversions performed:
chromosome	C->T	G->A
chr1	48055043	48111528
chr10	27639505	27719976
chr11	27903257	27981801
chr12	27092804	27182678
chr13	18839192	18933605
chr14	18423758	18559033
chr15	17752941	17825903
chr16	18172742	18299976
chr17	18723944	18851500
chr18	15794455	16061651
chr19	13954580	14061132
chr2	48318180	48450903
chr20	13916133	14
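
The `C->T` and `G->A` columns in the log above are per-chromosome counts of converted bases for the two in silico converted genomes (the `CT_conversion/` and `GA_conversion/` folders). A toy Python sketch of that idea, not Bismark's code; the sequence and the function name are made up.

```python
def bisulfite_convert(sequence: str) -> dict:
    """Return both fully converted versions of a sequence plus conversion counts."""
    seq = sequence.upper()
    return {
        "CT_conversion": seq.replace("C", "T"),  # forward-strand conversion
        "GA_conversion": seq.replace("G", "A"),  # reverse-strand equivalent
        "C->T": seq.count("C"),                  # number of C->T conversions
        "G->A": seq.count("G"),                  # number of G->A conversions
    }


toy = bisulfite_convert("ACGTCCGGA")  # hypothetical 9 bp "chromosome"
print(toy)
```

Reads are converted the same way before alignment, which is why methylation can only be called afterwards by comparing each read against the unconverted reference.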

### Analyze samples

__(II) Running bismark__

In [None]:
%%bash 
genDIR="/data_gilbert/home/aarab/genomes/hg38/chromosomes/"

cd fastq
for fq1 in *_1_val_1.fq.gz; do 
    # Trim Galore names the validated mates *_1_val_1 and *_2_val_2
    fq2=${fq1/_1_val_1.fq/_2_val_2.fq}
    sample=${fq1/_1_val_1.fq.gz/}
    cm="bismark --gzip --genome $genDIR -1 $fq1 -2 $fq2"
    # Optional looser alignment threshold (not used here): --score_min L,0,-0.4
    echo $cm
    $cm &> "${sample}.log"
done
cd ../

bismark --gzip --genome /data_gilbert/home/aarab/genomes/hg38/chromosomes/ -1 SRR11711272_1_val_1.fq.gz -2 SRR11711272_2_val_2.fq.gz


__(III) Running deduplicate_bismark__
> Deduplication should not be used for reduced representation libraries such as RRBS, amplicon or target enrichment libraries.

This is an RRBS-style library, so the step is left commented out.

In [98]:
# %%bash 
# for bam in bam/*.bam; do 
#     deduplicate_bismark --bam $bam
# done 

__(IV) Running bismark_methylation_extractor__

Extract context-dependent (CpG/CHG/CHH) methylation calls.

This will produce three methylation output files:

    CpG_context_test_dataset_bismark_bt2.txt.gz
    CHG_context_test_dataset_bismark_bt2.txt.gz
    CHH_context_test_dataset_bismark_bt2.txt.gz
as well as a bedGraph and a Bismark coverage file. For more on these files and their formats please see below.

Note that in this run, which does not use `--comprehensive`, the extractor wrote strand-specific files instead (`CpG_OT_*`, `CpG_OB_*`, and so on), as its log shows.
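
For downstream work the coverage file (`*.bismark.cov.gz`) is the most convenient output. A minimal Python reader, assuming tab-delimited lines in the column order the extractor prints in its own log (chromosome, start, end, methylation percentage, count methylated, count non-methylated). The record type, function names, and the example line are made up for illustration.

```python
import gzip
from typing import Iterator, NamedTuple


class CovRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    percent: float
    methylated: int
    unmethylated: int


def parse_cov_line(line: str) -> CovRecord:
    """Parse one tab-delimited coverage line (assumed column order)."""
    chrom, start, end, percent, meth, unmeth = line.rstrip("\n").split("\t")
    return CovRecord(chrom, int(start), int(end), float(percent), int(meth), int(unmeth))


def read_cov(path: str, min_depth: int = 1) -> Iterator[CovRecord]:
    """Yield records from a gzipped coverage file with at least min_depth reads."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            record = parse_cov_line(line)
            if record.methylated + record.unmethylated >= min_depth:
                yield record


# Made-up example line, not taken from the real output.
example = parse_cov_line("chr1\t10525\t10525\t75.0\t3\t1\n")
print(example)
```

A depth filter such as `read_cov(path, min_depth=10)` is a common first step before comparing samples; the threshold is a judgment call, not something the source prescribes.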

In [107]:
%%bash 
cd bam/
for bam in *.bam; do 
    bismark_methylation_extractor --multicore 30 --gzip --bedGraph $bam
done
cd ../


 *** Bismark methylation extractor version v0.23.1 ***

Trying to determine the type of mapping from the SAM header line of file SRR11711272_1_val_1_bismark_bt2_pe.bam
Treating file(s) as paired-end data (as extracted from @PG line)

Setting option '--no_overlap' since this is (normally) the right thing to do for paired-end data

Core usage currently set to more than 20 threads. Let's see how this goes... (set value: 30)

Summarising Bismark methylation extractor parameters:
Bismark paired-end SAM format specified (default)
Number of cores to be used: 30
Output will be written to the current directory ('/data_gilbert/home/aarab/Projects/Decitabine-treatment/Bisulfite-seq/bam')


Summarising bedGraph parameters:
Generating additional output in bedGraph and coverage format
bedGraph format:	<Chromosome> <Start Position> <End Position> <Methylation Percentage>
coverage format:	<Chromosome> <Start Position> <End Position> <Methylation Percentage> <count methylated> <count non-methylated>






...passed!
Writing result file containing methylation information for C in CpG context from the original top strand to CpG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz
Writing result file containing methylation information for C in CpG context from the complementary to original top strand to CpG_CTOT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz
Writing result file containing methylation information for C in CpG context from the complementary to original bottom strand to CpG_CTOB_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz
Writing result file containing methylation information for C in CpG context from the original bottom strand to CpG_OB_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz

Writing result file containing methylation information for C in CHG context from the original top strand to CHG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz
Writing result file containing methylation information for C in CHG context from the complementary to original top strand to CHG_CTOT_SRR11711272_1_val_1_bisma

Now testing Bismark result file >SRR11711272_1_val_1_bismark_bt2_pe.bam< for positional sorting (which would be bad...)	SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.1
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.2
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.3
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.4
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.5
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.6
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.7
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.8
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.9
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.10
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.11
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.12
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.13
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.14
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.15
SRR1


Processed 24516791 lines in total
Total number of methylation call strings processed: 49033582

Final Cytosine Methylation Report
Total number of C's analysed:	340278125

Total methylated C's in CpG context:	29977708
Total methylated C's in CHG context:	913413
Total methylated C's in CHH context:	2500583

Total C to T conversions in CpG context:	18819225
Total C to T conversions in CHG context:	75700969
Total C to T conversions in CHH context:	212366227

C methylated in CpG context:	61.4%
C methylated in CHG context:	1.2%
C methylated in CHH context:	1.2%



Merging individual M-bias reports into overall M-bias statistics from these 30 individual files:


SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.1.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.2.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.3.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.4.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.5.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.6.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.7.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.8.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.9.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.10.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.11.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.12.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.13.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.14.mbias
SRR11711272_1_val_1_bismark_bt2_pe_splitting_report.txt.15.mbias
SRR11711272_1_val_1_bismark_bt2_pe

Determining maximum read lengths for M-Bias plots
Maximum read length of Read 1: 32
Maximum read length of Read 2: 28

Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table)
Deleting unused files ...

CpG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz contains data ->	kept
CpG_CTOT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz was empty ->	deleted
CpG_CTOB_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz was empty ->	deleted
CpG_OB_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz contains data ->	kept
CHG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz contains data ->	kept
CHG_CTOT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz was empty ->	deleted
CHG_CTOB_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz was empty ->	deleted
CHG_OB

/data_gilbert/home/aarab/Projects/Decitabine-treatment/Bisulfite-seq/bam/CpG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz	/data_gilbert/home/aarab/Projects/Decitabine-treatment/Bisulfite-seq/bam/CpG_OB_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz



Writing bedGraph to file: SRR11711272_1_val_1_bismark_bt2_pe.bedGraph.gz
Also writing out a coverage file including counts methylated and unmethylated residues to file: SRR11711272_1_val_1_bismark_bt2_pe.bismark.cov.gz

Now writing methylation information for file >>CpG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz<< to individual files for each chromosome
Finished writing out individual chromosome files for CpG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz
Now writing methylation information for file >>CpG_OB_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz<< to individual files for each chromosome
Finished writing out individual chromosome files for CpG_OB_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz

Collecting temporary chromosome file information... Processing the following input file(s):
CpG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz.chrchr12.methXtractor.temp
CpG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz.chrchr15.methXtractor.temp
CpG_OT_SRR11711272_1_val_1_bismark_bt2_pe.txt.gz.chrchr




...passed!
Writing result file containing methylation information for C in CpG context from the original top strand to CpG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz
Writing result file containing methylation information for C in CpG context from the complementary to original top strand to CpG_CTOT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz
Writing result file containing methylation information for C in CpG context from the complementary to original bottom strand to CpG_CTOB_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz
Writing result file containing methylation information for C in CpG context from the original bottom strand to CpG_OB_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz

Writing result file containing methylation information for C in CHG context from the original top strand to CHG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz
Writing result file containing methylation information for C in CHG context from the complementary to original top strand to CHG_CTOT_SRR11711273_1_val_1_bisma

Now testing Bismark result file >SRR11711273_1_val_1_bismark_bt2_pe.bam< for positional sorting (which would be bad...)	SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.1
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.2
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.3
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.4
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.5
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.6
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.7
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.8
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.9
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.10
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.11
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.12
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.13
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.14
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.15
SRR1


Processed 6784977 lines in total
Total number of methylation call strings processed: 13569954

Final Cytosine Methylation Report
Total number of C's analysed:	93345981

Total methylated C's in CpG context:	6652839
Total methylated C's in CHG context:	244172
Total methylated C's in CHH context:	682610

Total C to T conversions in CpG context:	6679675
Total C to T conversions in CHG context:	20695055
Total C to T conversions in CHH context:	58391630

C methylated in CpG context:	49.9%
C methylated in CHG context:	1.2%
C methylated in CHH context:	1.2%



Merging individual M-bias reports into overall M-bias statistics from these 30 individual files:


SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.1.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.2.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.3.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.4.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.5.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.6.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.7.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.8.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.9.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.10.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.11.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.12.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.13.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.14.mbias
SRR11711273_1_val_1_bismark_bt2_pe_splitting_report.txt.15.mbias
SRR11711273_1_val_1_bismark_bt2_pe

Determining maximum read lengths for M-Bias plots
Maximum read length of Read 1: 32
Maximum read length of Read 2: 28

Perl module GD::Graph::lines is not installed, skipping drawing M-bias plots (only writing out M-bias plot table)
Deleting unused files ...

CpG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz contains data ->	kept
CpG_CTOT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz was empty ->	deleted
CpG_CTOB_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz was empty ->	deleted
CpG_OB_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz contains data ->	kept
CHG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz contains data ->	kept
CHG_CTOT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz was empty ->	deleted
CHG_CTOB_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz was empty ->	deleted
CHG_OB

/data_gilbert/home/aarab/Projects/Decitabine-treatment/Bisulfite-seq/bam/CpG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz	/data_gilbert/home/aarab/Projects/Decitabine-treatment/Bisulfite-seq/bam/CpG_OB_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz



Writing bedGraph to file: SRR11711273_1_val_1_bismark_bt2_pe.bedGraph.gz
Also writing out a coverage file including counts methylated and unmethylated residues to file: SRR11711273_1_val_1_bismark_bt2_pe.bismark.cov.gz

Now writing methylation information for file >>CpG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz<< to individual files for each chromosome
Finished writing out individual chromosome files for CpG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz
Now writing methylation information for file >>CpG_OB_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz<< to individual files for each chromosome
Finished writing out individual chromosome files for CpG_OB_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz

Collecting temporary chromosome file information... Processing the following input file(s):
CpG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz.chrchr6.methXtractor.temp
CpG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz.chrchr10.methXtractor.temp
CpG_OT_SRR11711273_1_val_1_bismark_bt2_pe.txt.gz.chrchr5
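
The percentages in the two "Final Cytosine Methylation Report" blocks above can be rechecked from the raw counts: methylated calls divided by methylated plus C-to-T converted calls, per context. A short Python sketch using the numbers from the log; the dictionary layout and the function name are mine.

```python
def percent_methylated(methylated: int, converted: int) -> float:
    """Methylation level in percent from methylated and C->T converted counts."""
    return 100.0 * methylated / (methylated + converted)


# (methylated C's, C to T conversions) per context, copied from the log above.
reports = {
    "SRR11711272": {
        "CpG": (29977708, 18819225),
        "CHG": (913413, 75700969),
        "CHH": (2500583, 212366227),
    },
    "SRR11711273": {
        "CpG": (6652839, 6679675),
        "CHG": (244172, 20695055),
        "CHH": (682610, 58391630),
    },
}

for run, contexts in reports.items():
    for context, (methylated, converted) in contexts.items():
        print(f"{run}\t{context}\t{percent_methylated(methylated, converted):.1f}%")
```

The CpG level drops from 61.4% to 49.9% between the two runs, while CHG and CHH stay at 1.2% in both.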

<!-- __(V) Running bismark2report__ -->

__(VI) Running bismark2summary__

In [118]:
%%bash 
cd bismark/ 
bismark2summary
cd ../

No Bismark/Bowtie2 single-end BAM files detected
Found Bismark/Bowtie2 paired-end files
No Bismark/HISAT2 single-end BAM files detected
No Bismark/HISAT2 paired-end BAM files detected

Generating Bismark summary report from 2 Bismark BAM file(s)...
>> Reading from Bismark report: SRR11711272_1_val_1_bismark_bt2_PE_report.txt
No deduplication report present, skipping...
>> Reading from Bismark report: SRR11711273_1_val_1_bismark_bt2_PE_report.txt
No deduplication report present, skipping...

Wrote Bismark project summary to >> bismark_summary_report.html <<



In [None]:
%%bash 
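for bam in bismark/*.bam; do 
    # Skip files that are already sorted, so a rerun does not create *.srt.srt.bam
    [[ $bam == *.srt.bam ]] && continue
    out=${bam/.bam/.srt.bam}
    cm="samtools sort -@ 24 -o $out $bam"
    echo $cm
    $cm
done 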


In [119]:
!date

Tue Mar  8 10:20:19 PST 2022
