Supplementary methods

"A role for the Saccharomyces cerevisiae ABCF protein New1 during translation termination"

preliminary version in bioRxiv

Introduction

The following pipeline and scripts were used to process the Ribo-Seq data published at https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-7763. The pipeline Pipeline.py downloads the data; preprocesses it (quality filtering and removal of ncRNA); aligns RPFs (ribosome-protected fragments) to the genome; and finishes with corrected P-site assignment. Ribosome densities are stored in HDF5 format by default. Python scripts (sub-folder scripts/) read the HDF5 files and convert ribosome coverage to bedgraph files (hdf_2_bedgraph.py), calculate relative 3' UTR coverage (compute_relative_3utr_coverage.py), and compute the queuing score (compute_queuing.py). Gene coverage around the stop codon is summed for metagene plots by the script hdf_2_metagene_tables_stop.py. This script can be restricted to a gene list, or it can split the data by stop codon. Input files, the R script, and the output of the differential expression (DE) analysis are in the sub-folder Differential_Expression/.
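As an illustration of the coverage-to-bedgraph step, here is a minimal sketch that collapses per-base coverage into bedGraph intervals. It is not the code of hdf_2_bedgraph.py: the function name `coverage_to_bedgraph` and the toy coverage values are assumptions made for this example, and the coverage is passed in as a plain list instead of being read from the HDF5 file.

```python
from itertools import groupby


def coverage_to_bedgraph(chrom, coverage, start=0):
    """Collapse per-base coverage into bedGraph lines (0-based, half-open)."""
    lines = []
    pos = start
    # groupby yields one group per run of identical consecutive values
    for value, run in groupby(coverage):
        length = sum(1 for _ in run)
        if value != 0:  # zero-coverage runs are left out of the output
            lines.append(f"{chrom}\t{pos}\t{pos + length}\t{value}")
        pos += length
    return lines


# Hypothetical toy coverage for the first 8 positions of chromosome "I"
toy_coverage = [0, 0, 3, 3, 3, 1, 0, 2]
for line in coverage_to_bedgraph("I", toy_coverage):
    print(line)
```

Merging runs of equal coverage keeps the bedgraph file small, which matters for sparse Ribo-Seq tracks.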

Prerequisite

Python v.3.6 from Continuum (or another Python distribution). Continuum's Anaconda comes with many libraries and has a convenient package manager, conda. Add the bioconda channel :: conda config --add channels bioconda
wget - There is a detailed installation guide for OS-X. I would recommend precompiled binaries from the Rudix site. If you download the fastq files manually and place them under the folder 1-Raw/, then step 1 (wget) is not necessary. Don't forget to rename the fastq files as they are referred to in Param.in
cutadapt :: conda install cutadapt
pigz - optional; if it is not installed, cutadapt falls back to single-core mode
HISAT2 v. 2.0.5 or higher ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/downloads. By default, HISAT2 trims low-quality read ends. Starting from version 2.0.5 there is an option to turn this off.
bowtie2
samtools :: conda install samtools
pysam :: conda install pysam
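
Before running the pipeline it can help to confirm that the tools listed above are reachable. The following is a minimal sketch, not part of the repository: the function name `check_prerequisites` is made up for this example, and the executable names (for instance `hisat2` and `bowtie2`) are assumed to match what is installed on your PATH.

```python
import importlib.util
import shutil

# Executable names are assumptions based on the list above
TOOLS = ["wget", "cutadapt", "pigz", "hisat2", "bowtie2", "samtools"]
MODULES = ["pysam"]


def check_prerequisites(tools=TOOLS, modules=MODULES):
    """Return {name: True/False} for command-line tools and Python modules."""
    status = {}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        status[tool] = shutil.which(tool) is not None
    for module in modules:
        # find_spec returns None when a top-level module cannot be found
        status[module] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
```

Remember that pigz is optional, so a missing pigz only means cutadapt runs on a single core.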

Additional files

Sequences and annotation

The sequence and annotation files are placed under the sub-folder 0-References/:

Genome.fa - genome sequence in FastA format
ncRNA.fa - non-coding RNA in FastA format
Genome.gtf - genome annotation v.88 in GTF (gff2) format from Ensembl

Other data files are derived from these; the commands that generate them are listed in the file build_index.sh.

# create index files for aligners and readjust GTF file for tabix and pysam
cd 0-References/
sh build_index.sh

Data tables

readlength_offsets.txt - read-length-specific offsets used for P-site assignment.
E-MTAB-7763-riboseq.sdrf.txt - Samples table from ArrayExpress; it contains only the Ribo-Seq samples and is needed for downloading the fastq files (Step 1)
Param.in - Contains the parameters used by Pipeline.py
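
To show how read-length-specific offsets can be used for P-site assignment, here is a minimal sketch. It is not the code of Pipeline.py. The offset values are hypothetical placeholders (the real ones are in readlength_offsets.txt, whose layout is not reproduced here), and the sketch assumes that offsets are counted from the 5' end of the read.

```python
# Hypothetical offsets: read length -> distance from the read's 5' end
# to the P-site. Replace with the values from readlength_offsets.txt.
OFFSETS = {27: 11, 28: 12, 29: 12, 30: 13}


def p_site_position(start, end, strand, offsets=OFFSETS):
    """Return the 0-based P-site coordinate of a read aligned to [start, end).

    Returns None when the read length has no offset, so that such reads
    can be skipped by the caller.
    """
    length = end - start
    offset = offsets.get(length)
    if offset is None:
        return None
    if strand == "+":
        # 5' end is the leftmost aligned base
        return start + offset
    # on the minus strand the 5' end is the rightmost aligned base
    return end - 1 - offset


# A 28-nt read aligned to positions 100..127 on either strand
print(p_site_position(100, 128, "+"))
print(p_site_position(100, 128, "-"))
```

Reads whose length is missing from the table return None, which mirrors the idea that only read lengths with an assigned offset contribute to the ribosome density.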

Usage

python Pipeline.py

Remarks

The annotation in GTF format originates from Ensembl and contains features such as stop_codon and start_codon, which are essential for the pipeline. In versions 84 - 88 the stop_codon annotation was missing for approximately 145 ORFs; this had been corrected by version 95. We reran some analyses with annotation v.95 and saw subtle or no differences. We keep annotation version 88 here to be consistent with the paper.
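
One way to find ORFs that lack a stop_codon feature in a GTF file is sketched below. This is an illustrative example and not part of the repository: the function name `genes_missing_stop_codon` and the toy gene IDs are made up, and the sketch assumes that every relevant line carries a `gene_id "..."` attribute in column 9, as Ensembl GTF files do.

```python
import re
from collections import defaultdict

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def genes_missing_stop_codon(gtf_lines):
    """Return sorted gene IDs that have a CDS feature but no stop_codon."""
    features = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = GENE_ID.search(fields[8])
        if match:
            features[match.group(1)].add(fields[2])  # column 3 = feature type
    return sorted(
        gene for gene, kinds in features.items()
        if "CDS" in kinds and "stop_codon" not in kinds
    )


# Toy records with hypothetical gene IDs
toy_gtf = [
    'I\tensembl\tCDS\t335\t646\t.\t+\t0\tgene_id "GENE_A"; transcript_id "GENE_A_mRNA";',
    'I\tensembl\tstop_codon\t647\t649\t.\t+\t0\tgene_id "GENE_A"; transcript_id "GENE_A_mRNA";',
    'I\tensembl\tCDS\t1807\t2166\t.\t-\t0\tgene_id "GENE_B"; transcript_id "GENE_B_mRNA";',
]
print(genes_missing_stop_codon(toy_gtf))
```

To run it on the real annotation, pass an open file handle for 0-References/Genome.gtf in place of the toy list.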

References

The pipeline's backbone is based on code used by Radhakrishnan, A., et al. Cell (2016) (Green lab).

GCA-VH-lab/2019-NAR
