Command List

Date: 17-Jul-2023
Author: Moeko Okada

List of commands used in the polyploidy summer school.

Part 1

  0. Data download
  1. Quality control
  2. Homoeologous mapping
  3. Read classification by EAGLE-RC
  4. Read count
  5. Make a count matrix

0. Data download

Genome Data

$ cd src
src $ wget -O genome.tar.gz https://drive.switch.ch/index.php/s/ZQzcTZ5lGJEcCbA/download
src $ ls
genome.tar.gz  scripts
src $ tar xvzf genome.tar.gz
src $ ls
genome  genome.tar.gz  scripts
src $
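
Before unpacking, it can be worth confirming that the download completed. The following is a minimal sketch and not part of the course scripts: the function name `check_archive` is hypothetical, and it relies only on the standard `gzip -t` and `tar tzf` checks.

```shell
# Hypothetical helper (not part of the course scripts): succeeds only if
# the file is an intact gzip stream that contains a readable tar archive.
check_archive() {
    archive_path=$1
    # gzip -t tests the compressed stream without writing any output
    gzip -t "$archive_path" 2>/dev/null || {
        echo "corrupt or not gzip: $archive_path" >&2
        return 1
    }
    # tar tzf lists the archive members without extracting them
    tar tzf "$archive_path" >/dev/null 2>&1 || {
        echo "not a tar archive: $archive_path" >&2
        return 1
    }
    echo "ok: $archive_path"
}
```

One possible use is `check_archive genome.tar.gz && tar xvzf genome.tar.gz`, so that extraction only starts when the archive is readable.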

Downsampled read data

src $ wget -O fastq.tar.gz https://drive.switch.ch/index.php/s/0ny11xweoA5WhEX/download
src $ ls
fastq.tar.gz  genome  genome.tar.gz  scripts
src $ tar xvzf fastq.tar.gz 
src $ ls
fastq  fastq.tar.gz  genome  genome.tar.gz  scripts
src $ ls fastq
MUR_1_R1_ds.fastq.gz  MUR_3_R1_ds.fastq.gz       MUR_R48h_2_R1_ds.fastq.gz  trimmomatic_adapters.txt
MUR_2_R1_ds.fastq.gz  MUR_R48h_1_R1_ds.fastq.gz  MUR_R48h_3_R1_ds.fastq.gz
src $
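
The sample names used in the later steps can be read off the file names. A minimal sketch, assuming every read file follows the `<sample>_R1_ds.fastq.gz` pattern seen in the listing above; the function name `list_samples` is hypothetical.

```shell
# Hypothetical helper: print one sample name per read file in a directory,
# assuming the <sample>_R1_ds.fastq.gz naming seen in the listing above.
list_samples() {
    for fq in "$1"/*_R1_ds.fastq.gz; do
        # skip the unexpanded pattern when no file matches
        [ -e "$fq" ] || continue
        # basename drops the directory and the given suffix
        basename "$fq" _R1_ds.fastq.gz
    done
}
```

Run from `src`, `list_samples fastq` would print the six sample names and ignore `trimmomatic_adapters.txt`.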

1. Quality control

Use a shell script to run all samples.
Make sure you are in the "src" directory.

src $ bash scripts/1_qc.sh 
src $
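
Because every step script is started from `src`, a small guard can catch a wrong working directory before anything runs. A minimal sketch, assuming `src` can be recognized by the `scripts` directory shown in the `ls` output of the download step; the function name `in_src_dir` is hypothetical.

```shell
# Hypothetical guard: succeed only if the current directory looks like "src",
# i.e. it contains the scripts/ directory seen in the earlier ls output.
in_src_dir() {
    if [ -d scripts ]; then
        return 0
    fi
    echo "run this from the src directory (no scripts/ in $PWD)" >&2
    return 1
}
```

For example, `in_src_dir && bash scripts/1_qc.sh` only starts the quality control step when the guard passes. The same guard applies unchanged to the later step scripts.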

2. Mapping

Use a shell script to run all samples.
Make sure you are in the "src" directory.

src $ bash scripts/2_map.sh 
src $

3. Read classification

Use EAGLE-RC software to classify reads.

Use a shell script to run all samples.
Make sure you are in the "src" directory.

src $ bash scripts/3_eagle.sh 
src $

4. Count reads

Count the number of reads using featureCounts software.

Use a shell script to run all samples.
Make sure you are in the "src" directory.

src $ bash scripts/4_count.sh 
src $

5. Make expression table

src $ cd 4_count
4_count $ (echo "gene_id";ls *_hal_counts.txt; ) | sed -e s/_hal_counts.txt//g | (awk '{if (NR > 1) printf "\t"} {printf "%s_hal", $0}' && echo) > hal_counts.tsv
4_count $ (echo "gene_id";ls *_lyr_counts.txt; ) | sed -e s/_lyr_counts.txt//g | (awk '{if (NR > 1) printf "\t"} {printf "%s_lyr", $0}' && echo) > lyr_counts.tsv
4_count $ python /tmp/eagle/scripts/tablize.py -skip 1 -a -i 0 -c 6 *_hal_counts.txt >> hal_counts.tsv
4_count $ python /tmp/eagle/scripts/tablize.py -skip 1 -a -i 0 -c 6 *_lyr_counts.txt >> lyr_counts.tsv
4_count $
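
The first two commands write a tab-separated header line into each table, and the `tablize.py` calls then append the count rows beneath it. The header step can also be sketched as a small function that takes the subgenome suffix as an argument, which avoids editing the suffix by hand in two places. This is a minimal sketch: `make_header` is a hypothetical name, and unlike the one-liners above, which also append the suffix to `gene_id`, it leaves `gene_id` unsuffixed.

```shell
# Hypothetical helper: print a tab-separated header for one subgenome.
# $1 is the suffix (hal or lyr); the remaining arguments are count files
# named <sample>_<suffix>_counts.txt.
make_header() {
    suffix=$1
    shift
    printf 'gene_id'
    for f in "$@"; do
        # strip the directory and the _<suffix>_counts.txt tail,
        # then re-attach the suffix to the sample name
        printf '\t%s_%s' "$(basename "$f" "_${suffix}_counts.txt")" "$suffix"
    done
    printf '\n'
}
```

A possible use in `4_count` is `make_header lyr *_lyr_counts.txt > lyr_counts.tsv`, followed by the same `tablize.py` append as above.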