GCS-seq is a ChIP-seq-based approach that identifies genome-wide gyrase cleavage sites (GCSs) in E. coli populations grown to stationary phase. In this project, we performed GCS-seq with a panel of different fluoroquinolone (FQ) antibiotics that are known to differ in killing capacity, aiming to identify associations between cleavage patterns (e.g., number, cleavage strength, location) and bacterial persister levels.

This repository contains the scripts used to identify genome-wide GCSs and to perform the downstream analyses for the manuscript "Genome-wide mapping of fluoroquinolone-stabilized DNA gyrase cleavage sites displays drug specific effects that correlate with bacterial persistence".

For a full project description, please refer to our manuscript (in revision and available on

Raw sequencing data were deposited with GEO under accession number GSE206610.

File Description

This script reads in the coverage files (processed in Galaxy, available upon request), preprocesses them, and prepares them for GCS calling.

This script identifies GCSs.

Input: folder containing the NGS coverage data preprocessed with

Output: files containing GCSs (GCS calling output is available in the folder GCS_calling)


Draws a Venn diagram of the GCSs identified under levofloxacin (LEVO), moxifloxacin (MOXI), norfloxacin (NOR), ciprofloxacin (CIP), and gemifloxacin (GEMI) treatment.

Input: folder containing the identified GCSs
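A Venn diagram of this kind is built from set overlaps between the GCS positions called for each drug. The following is a minimal sketch of that overlap computation, not the repository's script: the function names `pairwise_overlaps` and `shared_by_all` are hypothetical, and the positions are toy values rather than real GCS calls.

```python
from itertools import combinations


def pairwise_overlaps(gcs_by_drug):
    """Return {(drug_a, drug_b): number of shared positions} for every drug pair."""
    return {
        (a, b): len(gcs_by_drug[a] & gcs_by_drug[b])
        for a, b in combinations(sorted(gcs_by_drug), 2)
    }


def shared_by_all(gcs_by_drug):
    """Return the positions that were called under every treatment."""
    return set.intersection(*gcs_by_drug.values())


# Toy genome coordinates (assumed for illustration only).
gcs_by_drug = {
    "CIP": {100, 250, 900},
    "LEVO": {100, 250, 400},
    "MOXI": {100, 900, 1200},
}

overlaps = pairwise_overlaps(gcs_by_drug)
core = shared_by_all(gcs_by_drug)
print(overlaps)
print(core)
```

Matching sites by exact coordinate is the simplest choice; whether the actual analysis allows a tolerance window between sites is not stated here.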


Script with supporting functions


This script visualizes GCSs within sets of regions (upstream, downstream, 5' region, and 3' region) that differ in transcription level.

Input: 1. annotation file (GCF_000005845.2_ASM584v2_genomic_insert_hierexon.gffread.gtf) 2. folder containing the identified GCSs 3. RNA expression data (raw read counts)

This script constructs motifs for MOXI-, NOR-, CIP-, GEMI-, and LEVO-treated samples.

Input: folder containing the identified GCSs
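A sequence motif is commonly summarized as a position frequency matrix computed from equal-length sequences aligned around each site. The sketch below illustrates that general idea under stated assumptions: the function name `position_frequency_matrix`, the `pseudocount` parameter, and the toy sequences are all hypothetical, and the repository's own motif construction may differ.

```python
from collections import Counter

BASES = "ACGT"


def position_frequency_matrix(sequences, pseudocount=0.0):
    """Return one {base: frequency} dict per alignment column.

    All sequences must have the same length. Characters other than
    A/C/G/T are ignored when computing frequencies.
    """
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    matrix = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in sequences)
        total = sum(counts[b] for b in BASES) + pseudocount * len(BASES)
        matrix.append({b: (counts[b] + pseudocount) / total for b in BASES})
    return matrix


# Toy aligned sequences (assumed; real input would be windows around GCSs).
toy = ["ACGT", "ACGA", "ATGT", "ACGT"]
pfm = position_frequency_matrix(toy)
for column in pfm:
    print(column)
```

A nonzero `pseudocount` keeps every frequency above zero, which matters if the matrix is later converted to log-odds scores.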


Plot GCS number/cleavage strength vs persistence level

This script scans the log-odds score of a given sequence, or across the genome, to predict cleavage probability based on the obtained motif.

Input: 1. folder containing the identified GCSs 2. genome FASTA file
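Log-odds scanning can be sketched as follows: each motif column is converted to log2(frequency / background), and the motif is slid along the sequence, summing one value per position in the window. This is an illustrative sketch only. The names `log_odds_matrix`, `score_window`, and `scan`, the uniform background, and the two-column toy motif are assumptions, not the repository's implementation.

```python
import math

BASES = "ACGT"


def log_odds_matrix(pfm, background=None):
    """Convert a frequency matrix to log2 odds against a background.

    Frequencies must be nonzero (use pseudocounts when building the matrix).
    The uniform 0.25 background is an assumption for this sketch.
    """
    background = background or {b: 0.25 for b in BASES}
    return [{b: math.log2(col[b] / background[b]) for b in BASES} for col in pfm]


def score_window(window, lom):
    """Sum the per-position log-odds for one window of motif width."""
    return sum(col[base] for base, col in zip(window, lom))


def scan(sequence, lom):
    """Score every forward-strand window; returns (start, score) pairs.

    Sequences with characters outside A/C/G/T raise KeyError in this sketch.
    """
    width = len(lom)
    return [
        (i, score_window(sequence[i:i + width], lom))
        for i in range(len(sequence) - width + 1)
    ]


# Toy two-column motif favoring "AG" (assumed values).
toy_pfm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
hits = scan("TAGC", log_odds_matrix(toy_pfm))
best_start, best_score = max(hits, key=lambda h: h[1])
print(hits)
```

A genome-wide scan would apply the same function to the chromosome sequence read from the FASTA file, and would typically also score the reverse complement.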


This script plots the GCS distribution across the chromosome and performs statistical tests.

Input: folder containing the identified GCSs


This script takes in the raw read counts of the FLAG and FLAGless control strains and zooms into 2 GCSs (Mu, mobA) and 2 control sites (nuoN, trkH) to plot the coverage depth.

Input: raw read count data

See the data folder for the numbers used in the scripts.


Performs PCA on the GCS distribution.

Input: merged CSV containing the cleavage strength for each replicate across the 5 FQ treatments


Generates a hierarchical clustering heat map of GCSs.

Input: merged CSV containing the cleavage strength for each replicate across the 5 FQ treatments


Sorts GCSs by cleavage strength and saves the sites and their strengths to a file for motif construction. It then retains the top sites, which are identified as real GCSs, for motif construction.

Also produces a heat map and bar plots showing the top-strength sites for each FQ treatment.
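The sort-and-retain step can be sketched as below. This is a minimal illustration, assuming sites are held as a mapping from genome position to cleavage strength; the function name `top_sites`, the cutoff `n`, and the toy values are hypothetical, and the criterion the script actually uses to decide which top sites count as real GCSs is not stated here.

```python
def top_sites(strength_by_position, n):
    """Return the n (position, strength) pairs with the highest strength."""
    ranked = sorted(
        strength_by_position.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:n]


# Toy positions and strengths (assumed for illustration only).
toy_strengths = {1200: 3.5, 48000: 12.1, 910000: 7.8, 2300000: 0.9}
selected = top_sites(toy_strengths, 2)
print(selected)
```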

Next step in pipeline for motif construction:


This script performs differential gene expression analysis and saves the results

Input: 1. annotation file (GCF_000005845.2_ASM584v2_genomic_insert_hierexon.gffread.gtf) 2. GCS file 3. gene expression files (raw counts)


This script adds GCS annotation and prepares the data for functional enrichment analysis.


Ranked GCS plot based on cleavage strength


Plot GO enrichment results