# Presentation and workshop for analyzing ATAC-seq data from Oryza sativa (rice)

The data comes from:

* [Comprehensive mapping and modelling of the rice regulome landscape unveils the regulatory architecture underlying complex traits](https://www.nature.com/articles/s41467-024-50787-y)

The workshop covers four steps:

1. Prepare the data
2. Align to the genome
3. Find regions with many reads (peaks)
4. Visualize alongside other data

In [None]:
%%bash

# Sample name and input/output folders
SAMPLE_NAME="atac_oryza_down"
IN_FOLDER="/data/Epigenetics_Workshop/input_data"
OUT_FOLDER="/data/Epigenetics_Workshop/input_data"

# Trim adapters and filter low-quality reads from the paired-end FASTQ files;
# fastp also writes an HTML and a JSON quality report.
fastp -w 16 -i "${IN_FOLDER}/${SAMPLE_NAME}_r1.fq" \
      -I "${IN_FOLDER}/${SAMPLE_NAME}_r2.fq" \
      -o "${OUT_FOLDER}/${SAMPLE_NAME}_r1_filt.fq.gz" \
      -O "${OUT_FOLDER}/${SAMPLE_NAME}_r2_filt.fq.gz" \
      -h "${OUT_FOLDER}/${SAMPLE_NAME}_filt.html" \
      -j "${OUT_FOLDER}/${SAMPLE_NAME}_filt.json" \
      --detect_adapter_for_pe
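
A quick sanity check after filtering is to compare read counts before and after fastp. The sketch below is only an illustration: `count_reads` is a hypothetical helper, the FASTQ records are made up, and it assumes the usual four-lines-per-read FASTQ layout.

```bash
# Count reads in a FASTQ file (assumes 4 lines per read); plain or gzipped input.
count_reads() {
    case "$1" in
        *.gz) gzip -dc "$1" | awk 'END { print NR / 4 }' ;;
        *)    awk 'END { print NR / 4 }' "$1" ;;
    esac
}

# Toy demonstration with a made-up two-read FASTQ (not workshop data).
TOY_DIR="$(mktemp -d)"
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "${TOY_DIR}/toy_r1.fq"
# Pretend that filtering kept only the first read.
head -n 4 "${TOY_DIR}/toy_r1.fq" | gzip -c > "${TOY_DIR}/toy_r1_filt.fq.gz"

RAW_READS="$(count_reads "${TOY_DIR}/toy_r1.fq")"
FILT_READS="$(count_reads "${TOY_DIR}/toy_r1_filt.fq.gz")"
echo "raw=${RAW_READS} filtered=${FILT_READS}"
```

On the workshop files the same helper could be pointed at `atac_oryza_down_r1.fq` and `atac_oryza_down_r1_filt.fq.gz`; the fastp HTML report gives the same numbers in more detail.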


In [None]:
%%bash
IX_DIR="/data/Epigenetics_Workshop/input_data/oryza_sativa_mm2/ref_rename.mmi"
IN_DIR="/data/Epigenetics_Workshop/input_data/"
OUT_DIR="/data/Epigenetics_Workshop/input_data/"
TEMP_DIR="/data/Epigenetics_Workshop/input_data/tmp"
SAMPLE_NAME="atac_oryza_down"
cd "${IN_DIR}"

# Map with minimap2 using the short-read preset (-ax sr) and sort the BAM for calling peaks
minimap2 -t 16 -ax sr \
            "${IX_DIR}" \
            "${SAMPLE_NAME}_r1_filt.fq.gz" "${SAMPLE_NAME}_r2_filt.fq.gz" | \
            samtools sort -T "${TEMP_DIR}" -@ 8 \
            -o "${SAMPLE_NAME}_oryza_sativa_mm2_sort.bam" -
samtools index "${SAMPLE_NAME}_oryza_sativa_mm2_sort.bam"

# Make a bigWig coverage track for visualization.
# --Offset 5 -1 counts each read from its 5th through its last position.
bamCoverage --Offset 5 -1 -b "${SAMPLE_NAME}_oryza_sativa_mm2_sort.bam" \
            -o "${SAMPLE_NAME}_oryza_sativa_mm2.bw"
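
The peak-calling step uses a chromosome sizes file (`oryza_sativa_sizes.genome`) and a genome size value, but the notebook does not show how they were made. One common way to build such a file, sketched here with a made-up FASTA index, is to take the first two columns of the `.fai` file. The file names and lengths below are assumptions for illustration only.

```bash
# Toy FASTA index (.fai) with made-up lengths; in practice `samtools faidx`
# on the reference FASTA would produce this file.
TOY_DIR="$(mktemp -d)"
printf '1\t1000\t3\t60\t61\n2\t800\t1022\t60\t61\nMt\t50\t1840\t60\t61\n' > "${TOY_DIR}/toy.fa.fai"

# Columns 1-2 of the .fai are sequence name and length, which is the
# two-column "genome" file layout that bedtools expects.
cut -f 1,2 "${TOY_DIR}/toy.fa.fai" > "${TOY_DIR}/toy_sizes.genome"

# Sum the lengths of the numbered chromosomes only (1-2 here; 1-12 for rice).
NUCLEAR_BP="$(awk '$1 ~ /^[0-9]+$/ { total += $2 } END { print total + 0 }' "${TOY_DIR}/toy_sizes.genome")"
echo "nuclear genome length: ${NUCLEAR_BP} bp"
```

The summed length is only an upper bound on the effective (mappable) genome size that MACS3 takes with `-g`; how the value used in this workshop was derived is not stated here.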


In [None]:
%%bash
IN_DIR="/data/Epigenetics_Workshop/input_data/"
SAMPLE_NAME="atac_oryza_down"
GENOME_SIZE="340451842"
GEN_SIZE_CHR="oryza_sativa_sizes.genome"
cd "${IN_DIR}"

# Call peaks using two methods
# Poisson background based (MACS3 callpeak)
macs3 callpeak -f BAMPE --call-summits \
    -t ${SAMPLE_NAME}_oryza_sativa_mm2_sort.bam \
    -g ${GENOME_SIZE} -B -q 0.05 --trackline \
    -n ${SAMPLE_NAME}_oryza_sativa_atac.macs3.default.summits.bampe 
# HMM based (HMMRATAC): doesn't run on the downsampled data because the background model needs more data
#macs3 hmmratac -i ${SAMPLE_NAME}_oryza_sativa_mm2_sort.bam \
#    -n ${SAMPLE_NAME}_oryza_sativa_atac.macs3.hmmratac.bampe

# Keep only summits on chromosomes 1-12 (drops other sequence names and any header line)
awk '$1 ~ /^[0-9]+$/ && $1 >= 1 && $1 <= 12' \
    ${SAMPLE_NAME}_oryza_sativa_atac.macs3.default.summits.bampe_summits.bed > \
    ${SAMPLE_NAME}_oryza_sativa_atac_summits_filt.bed
    
# Extend each summit by 350 bp on both sides, clipped to the chromosome ends
bedtools slop -b 350 -g ${GEN_SIZE_CHR} \
    -i ${SAMPLE_NAME}_oryza_sativa_atac_summits_filt.bed > \
    ${SAMPLE_NAME}_oryza_sativa_atac_summits_filt_flank.bed
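
To see what the chromosome filter and the 350 bp flank do, here is a toy example that uses only awk. All coordinates and names are made up, and the second awk command is a hand-written stand-in for `bedtools slop -b 350`, not the tool itself.

```bash
TOY_DIR="$(mktemp -d)"
# Made-up summits: a track line, chromosomes 1 and 12, an organelle and a scaffold.
printf 'track name="toy"\n1\t100\t101\tp1\t5.0\n12\t990\t991\tp2\t7.5\nMt\t10\t11\tp3\t3.1\n10_scaf\t20\t21\tp4\t2.0\n' > "${TOY_DIR}/summits.bed"
# Made-up chromosome sizes.
printf '1\t2000\n12\t1000\n' > "${TOY_DIR}/sizes.genome"

# Keep only rows whose first column is an integer between 1 and 12.
awk -F '\t' '$1 ~ /^[0-9]+$/ && $1 >= 1 && $1 <= 12' \
    "${TOY_DIR}/summits.bed" > "${TOY_DIR}/summits_filt.bed"

# Widen each interval by 350 bp per side, clipped to [0, chromosome length].
awk -F '\t' -v OFS='\t' -v flank=350 '
    NR == FNR { size[$1] = $2; next }   # first file: chromosome sizes
    {
        new_start = $2 - flank; if (new_start < 0) new_start = 0
        new_end = $3 + flank;   if (new_end > size[$1]) new_end = size[$1]
        $2 = new_start; $3 = new_end; print
    }' "${TOY_DIR}/sizes.genome" "${TOY_DIR}/summits_filt.bed" > "${TOY_DIR}/summits_flank.bed"

cat "${TOY_DIR}/summits_flank.bed"
```

The summit near the start of chromosome 1 is clipped at 0 and the one near the end of chromosome 12 is clipped at the chromosome length, which is why the sizes file is needed.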
    

# Interpreting the results

Now we can take the output of our analysis and interpret it using published data. Specifically, [ChIPHub](https://biobigdata.nju.edu.cn/ChIPHub/) is a resource for epigenetic analysis in plants.


# Compare the track with histone marks

The chromatin accessibility data comes from pistil tissue, where the gene GW8 has been found to be very important (Figure 2c of [Comprehensive mapping and modelling of the rice regulome landscape unveils the regulatory architecture underlying complex traits](https://www.nature.com/articles/s41467-024-50787-y)).

We can use external resources such as [UniProt](https://www.uniprot.org/uniprotkb/Q6YZE8/external-links) to find the cross-reference (Xref) and look it up in our gene annotations in the .gff:
```NC_029263.1	Gnomon	CDS	26501223	26501713	.	+	0	gene_id "LOC4346133"; transcript_id "XM_015793891.2"; db_xref "GeneID:4346133"; gbkey "CDS"; gene "LOC4346133"; product "squamosa promoter-binding-like protein 16"; protein_id "XP_015649377.1"; exon_number "1"; ```
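
The lookup can be scripted. The sketch below copies the record shown above into a toy file (the file name is hypothetical) and searches the attribute column for the `GeneID:4346133` cross-reference, assuming that is the identifier listed among the UniProt external links.

```bash
TOY_DIR="$(mktemp -d)"
# One CDS record copied from the annotation line shown above, stored in a toy file.
printf 'NC_029263.1\tGnomon\tCDS\t26501223\t26501713\t.\t+\t0\tgene_id "LOC4346133"; transcript_id "XM_015793891.2"; db_xref "GeneID:4346133"; gbkey "CDS"; gene "LOC4346133"; product "squamosa promoter-binding-like protein 16"; protein_id "XP_015649377.1"; exon_number "1";\n' > "${TOY_DIR}/toy_annotation.gtf"

# Find records whose attribute column (9) contains the quoted cross-reference,
# then print sequence name, feature type, start, end and strand.
XREF='GeneID:4346133'
HIT="$(awk -F '\t' -v OFS='\t' -v xref="${XREF}" \
    'index($9, "\"" xref "\"") { print $1, $3, $4, $5, $7 }' "${TOY_DIR}/toy_annotation.gtf")"
echo "${HIT}"
```

Matching the quoted value avoids partial hits on longer identifiers that merely start with the same digits.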



# GW8 is important for yield
GW8 is known to affect rice grain size, as reported in [many papers](https://pubmed.ncbi.nlm.nih.gov/29892844/).


# Look at the GW8 region in the browser
Region: 8:26500336-26506390 (LOC_Os08g41940.1)
[Oryza Sativa Genome Viewer](https://biobigdata.nju.edu.cn/browser/?genome=oryza_sativa)

Here we can add the track that we created!

Tracks --> Custom Tracks --> Add new tracks --> Upload them from your computer --> atac_oryza_down_oryza_sativa_atac_summits_filt_flank.bed
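
Before uploading, it can help to check whether any flanked summits fall inside the GW8 window. This is a minimal sketch with made-up intervals standing in for the real `atac_oryza_down_oryza_sativa_atac_summits_filt_flank.bed`; only the window coordinates come from the region given above.

```bash
TOY_DIR="$(mktemp -d)"
# Made-up flanked summits (not real workshop output).
printf '8\t26500900\t26501601\tpeak_a\t12.3\n8\t100\t801\tpeak_b\t4.0\n1\t26501000\t26501701\tpeak_c\t6.1\n' > "${TOY_DIR}/toy_flank.bed"

# Window from the browser coordinates 8:26500336-26506390.
CHROM="8"; WIN_START=26500336; WIN_END=26506390

# Keep intervals on the same chromosome that overlap the window.
# BED starts are 0-based and ends are exclusive, hence ">=" on the end and "<" on the start.
IN_WINDOW="$(awk -F '\t' -v c="${CHROM}" -v s="${WIN_START}" -v e="${WIN_END}" \
    '$1 == c && $3 >= s && $2 < e' "${TOY_DIR}/toy_flank.bed")"
echo "${IN_WINDOW}"
```

If nothing is printed for the real file, the custom track will simply look empty at this locus in the browser.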


# Add other tracks

We can add more epigenetic tracks, such as H3K4me3 and H3K27ac!

* Tracks --> public track hubs --> ChIPHub analysis results searching by Factor(oryza_sativa)

* Tracks --> Track table --> We can add a few of them
