Single cell analysis of Trichuris-infected caecum

The data analysed in this repo is described in Duque-Correa et al., 2021, "Defining the early stages of intestinal colonisation by whipworms".

The main analysis scripts are:

sample_QC.R (sample QC)
merge_all.R (main single cell analysis)
PAGA.py (trajectory analysis)
SF3.R (subclustering of undifferentiated cells)
SF1.R (analysis of paired bulk libraries and whole caecum infection time course)

Retrieving data (Internal: Sanger-specific)

A manifest with Sanger sample IDs and metadata (samples.txt) is required.

kinit #get a Kerberos ticket so the irods commands can authenticate

cut -f 1 samples.txt | while read -r sample; do

 CRAMS=($(imeta qu -z seq -d sample = "${sample}" and target = 1 and type = cram | grep -o '[0-9_#]*\.cram'))
 
 for cram in "${CRAMS[@]}"; do
  printf '%s\t%s\tCRAM\n' "${sample}" "${cram}" >> data_locations.txt    #in case we need them later
 done
 
 #Retrieve CellRanger reports 
 #Check CRAMs were all sequenced in the same flow cell (all CRAMS are in the same collection)
 #CellRanger reports for samples sequenced in >1 flow cell end up in /seq/illumina 
 
 COLLECTIONS=($(imeta qu -z seq -d sample = "${sample}" and target = 1 and type = cram | grep 'collection' | sort -u | grep -o '/seq.*'))
 
 if [[ "${#COLLECTIONS[@]}" -eq 1 ]] ; then
  PATHS=($(ils "${COLLECTIONS[0]}/cellranger" | grep "${sample}" | grep -o '/seq.*'))
 else
  PATHS=($(ils /seq/illumina/cellranger | grep "${sample}" | grep -o '/seq.*'))
 fi
 for path in "${PATHS[@]}"; do
  printf '%s\t%s\tCELLRANGER\n' "${sample}" "${path}" >> data_locations.txt
 done
 
done
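
The collection check inside the loop can be pulled out into a small function, which makes the rule easy to test without iRODS access. This is only a sketch: cellranger_root is a hypothetical helper name, and the collection paths in the usage lines are made up for illustration.

```shell
# Hypothetical helper: given the distinct collections holding a sample's CRAMs,
# print the collection where the CellRanger reports are expected.
cellranger_root() {
  if [[ "$#" -eq 1 ]]; then
    # one flow cell: reports sit alongside the CRAMs
    printf '%s/cellranger\n' "$1"
  else
    # more than one flow cell (or none found): use the shared location
    printf '%s\n' "/seq/illumina/cellranger"
  fi
}

cellranger_root /seq/12345              # prints /seq/12345/cellranger
cellranger_root /seq/12345 /seq/67890   # prints /seq/illumina/cellranger
```

In the loop this would be called as cellranger_root "${COLLECTIONS[@]}", with the result passed to ils.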

Two samples are named differently: the paths for the first CellRanger v1 runs of 4672STDY6814755 and 4672STDY6814756 need to be added to data_locations.txt manually.

CRAMs are named like "flowcell_lane#index.cram". If needed, their paths on irods can be reconstructed later as /seq/flowcell/flowcell_lane#index.cram.
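
As a rough illustration of that naming rule, the path can be rebuilt from the file name alone. cram_path is a hypothetical helper, and the CRAM name in the usage line is invented; it assumes the flow cell ID itself contains no underscore.

```shell
# Hypothetical helper: rebuild the iRODS path of a CRAM from its file name,
# using the flowcell_lane#index.cram naming described above.
cram_path() {
  local cram="$1"
  local flowcell="${cram%%_*}"   # everything before the first underscore
  printf '/seq/%s/%s\n' "${flowcell}" "${cram}"
}

cram_path '12345_1#2.cram'   # prints /seq/12345/12345_1#2.cram
```

The second column of the CRAM rows in data_locations.txt could be fed through this function one name at a time.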

Retrieve cellranger metrics for these samples.

regex="cellranger$"
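#data_locations.txt is written to the working directory by the loop above; it is read from metadata/ here
grep CELLRANGER metadata/data_locations.txt | cut -f 1,2 | while read -r sample path; do 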

 version=$(echo "${path}" | grep -o 'cellranger[0-9]*_count' | grep -o 'cellranger[0-9]*')
 annotation=$(echo "${path}" | grep -o 'mm10.*$')
 
 if [[ $version =~ $regex ]]; then
  version="cellranger131"
 fi
 
 #pull down the HTML files
 
 mkdir -p websummaries/${annotation}/${version}
 
 iget ${path}/web_summary.html websummaries/${annotation}/${version}/${sample}.html   
 
 #get the same data in CSV format
 
 if [[ ! -e ${version}.txt ]]; then      #because we want the headers first time round
  echo -n "sample_id,transcriptome," > ${version}.txt
  iget ${path}/metrics_summary.csv - | head -n 1 >> ${version}.txt 
 fi 
 
 echo -n "${sample},${annotation}," >> ${version}.txt
 iget ${path}/metrics_summary.csv - | tail -n 1 >> ${version}.txt
 
 #get the counts matrices
 
 mkdir -p count_matrices/${annotation}/${version}
 
 #take the last filtered collection listed under the run ("C-" marks a collection in ils output)
 counts_dir=$(ils -r "${path}" | grep 'filtered' | grep 'C-' | sort | tail -n 1 | sed -e 's/C-//' | sed -e 's/\s//g')

 iget -r -f ${counts_dir} count_matrices/${annotation}/${version}/${sample}
 
done
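
The version and annotation parsing at the top of the loop can be checked on its own before any data are pulled. The sketch below wraps the same grep patterns in a function. parse_cellranger_path is a hypothetical name, and the paths used to try it out are invented, since the real collection names are not shown here.

```shell
# Hypothetical helper: print "version<TAB>annotation" for a CellRanger collection path.
parse_cellranger_path() {
  local path="$1" version annotation
  version=$(echo "${path}" | grep -o 'cellranger[0-9]*_count' | grep -o 'cellranger[0-9]*')
  annotation=$(echo "${path}" | grep -o 'mm10.*$')
  # runs with no version number in the name are labelled cellranger131, as in the loop above
  if [[ "${version}" =~ cellranger$ ]]; then
    version="cellranger131"
  fi
  printf '%s\t%s\n' "${version}" "${annotation}"
}

# invented example paths
parse_cellranger_path /seq/12345/cellranger/cellranger302_count_12345_S1_mm10-3_0_0
parse_cellranger_path /seq/12345/cellranger/cellranger_count_12345_S1_mm10-1_2_0
```

Running it over the second column of the CELLRANGER rows is a quick way to spot paths that yield an empty version or annotation.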

Metrics summaries are combined into a master file, cellranger_metrics.tsv.
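
How the per-version files are combined is not described here, so the following is only one possible approach. It reshapes each ${version}.txt into long-format rows (version, sample_id, transcriptome, metric, value), which sidesteps any difference in column sets between CellRanger versions. metrics_to_long is a hypothetical helper, and the long layout is an assumption rather than the actual layout of cellranger_metrics.tsv.

```shell
# Hypothetical helper: turn one per-version CSV into tab-separated long-format rows.
# Double-quoted values such as "1,234" are kept as a single field.
metrics_to_long() {
  local file="$1" version
  version=$(basename "${file}" .txt)   # e.g. cellranger131.txt gives cellranger131
  awk -v version="${version}" '
    # split a CSV line into f[1..n], ignoring commas inside double quotes
    function split_csv(line, f,    i, c, n, inq) {
      n = 1; f[1] = ""; inq = 0
      for (i = 1; i <= length(line); i++) {
        c = substr(line, i, 1)
        if (c == "\"") { inq = !inq }
        else if (c == "," && !inq) { f[++n] = "" }
        else { f[n] = f[n] c }
      }
      return n
    }
    NR == 1 { split_csv($0, header); next }   # first line holds the metric names
    {
      n = split_csv($0, value)
      # columns 1 and 2 are sample_id and transcriptome; the rest are metrics
      for (i = 3; i <= n; i++)
        printf "%s\t%s\t%s\t%s\t%s\n", version, value[1], value[2], header[i], value[i]
    }
  ' "${file}"
}

# usage (not run here; assumes the per-version files all start with "cellranger"):
#   for f in cellranger*.txt; do metrics_to_long "$f"; done > cellranger_metrics.tsv
```

A wide table with one column per metric would work equally well if every version reports the same metrics.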

cellranger_metrics.R compares the samples that have been run with both cellranger v2.1.1 and cellranger v3.0.2.
