## **Project**: Natural products from the Palaeolithic

## **Section**: Presence and absence of biosynthetic gene clusters (BGCs) in dereplicated (dRep) and ancient genomes


Anan Ibrahim, 01.05.2022

**Contents**
 - **Step1**: Create the conda environment with the required dependencies if not already installed
 - **Step2**: Screen for BGCs in the dereplicated (dRep) Chlorobiales genomes and the ancient genomes

##########

**Step1**: Create the conda environment with the required dependencies if not already installed

##########

In [None]:
# All conda envs can be found in EMN001_Paleofuran/02-scripts/ENVS_*.yml
conda env create -f antismash.yml
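
The "if not already installed" condition can be checked before creating the environment. The following is a minimal sketch, not part of the original workflow: `env_listed` and `ensure_env` are hypothetical helper names, and it assumes that `antismash.yml` defines an environment named `antismash` (the name activated in Step2).

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the environment name given as $1 appears
# in the first column of `conda env list` output read from stdin.
env_listed() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Hypothetical helper: create the environment from a YAML file only when
# no environment with that name exists yet.
ensure_env() {
  local name=$1 yml=$2
  if conda env list | env_listed "$name"; then
    echo "conda env '$name' already exists, skipping creation"
  else
    conda env create -f "$yml"
  fi
}

# Usage (assumes the YAML file defines an env called "antismash"):
#   ensure_env antismash antismash.yml
```

Matching on the first column avoids false hits when the environment name only appears inside a path.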

##########

**Step2**: Screen for BGCs in the dereplicated (dRep) Chlorobiales genomes and the ancient genomes

##########
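
The loops in this step name each output directory after the input file by calling `basename` with a suffix. `basename` removes the suffix only when it matches the end of the file name, so the suffix has to agree with the extension used in the glob. A small illustration, where `sample_name` and the example path are hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: strip the directory part and the given extension.
sample_name() {
  basename "$1" "$2"
}

# Matching suffix: the extension is removed.
sample_name /data/PROKKA/bin1/bin1.fna .fna   # prints: bin1

# Non-matching suffix: the name is returned unchanged.
sample_name /data/PROKKA/bin1/bin1.fna .gbk   # prints: bin1.fna
```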

In [None]:
#!/bin/bash

############################
#Hashes and Directories
############################

# NOTE: Change directories in bash script accordingly 
# NOTE: Add the ancient Bins/MAGs in $BINS

# Directories: 
OUT=/Net/Groups/ccdata/users/AIbrahim/ancientDNA/Deep-Evo/BGC/final-butyrolactone/Output
BINS=/Net/Groups/ccdata/users/AIbrahim/ancientDNA/Deep-Evo/BGC/final-butyrolactone/Input/BINS

DREPGENOME=/Net/Groups/ccdata/databases/ncbi-ref-genomes/Chlorobiales/drep/dereplicated_genomes
ANCIENT_CONTIGS=/Net/Groups/ccdata/users/AIbrahim/ancientDNA/Deep-Evo/BGC/final-butyrolactone/Input/ancient_contigs_names_butyrolactone

# -p: do not fail if the directory already exists
mkdir -p "$OUT"

############################
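# antiSMASH annotation of ancient MAGs and modern reference genomes
############################
# -p creates missing parent directories and tolerates reruns
mkdir -p "$OUT/ANTISMASH" "$OUT/ANTISMASH-drep-refgenomes"
cd "$OUT/ANTISMASH-drep-refgenomes" || exit 1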

eval "$(conda shell.bash hook)"
conda activate antismash

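# Ancient MAGs: Prokka FASTA plus the matching Prokka GFF3 annotation
for F in "$OUT"/PROKKA/*/*.fna; do
  # Strip the .fna extension (the glob matches .fna, not .gbk) so that $N
  # can be reused as the Prokka sample directory name below
  N=$(basename "$F" .fna)
  mkdir -p "$OUT/ANTISMASH/$N"
  antismash "$F" \
    --output-dir "$OUT/ANTISMASH/$N" \
    --genefinding-gff3 "$OUT/PROKKA/$N"/*.gff \
    --genefinding-tool none \
    --logfile "$OUT/ANTISMASH/$N/log_$N.txt" \
    --cb-knownclusters \
    --cb-general \
    --minlength 1000 \
    --smcog-trees -c 29
done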

# Modern genomes: annotated GenBank files from Prokka, so no gene finding is needed
for F in "$OUT"/PROKKA-drep-refgenomes/*/*.gbk; do
  N=$(basename "$F" .gbk)
  mkdir -p "$OUT/ANTISMASH-drep-refgenomes/$N"
  antismash "$F" \
    --output-dir "$OUT/ANTISMASH-drep-refgenomes/$N" \
    --genefinding-tool none \
    --logfile "$OUT/ANTISMASH-drep-refgenomes/$N/log_$N.txt" \
    --cb-knownclusters \
    --cb-general \
    --minlength 1000 \
    --smcog-trees -c 4
done

conda deactivate

############################
# Tabulate the antismash results 
############################

python3 antismash_to_tsv1.py \
  "$BINS" \
  "$OUT/PROKKA" \
  "$OUT/ANTISMASH-drep-refgenomes" \
  "$OUT/ANTISMASH-drep-refgenomes/antismash_bins_table.txt"

# Keep only columns 1 and 4 of the tab-separated table
awk -F '\t' '{print $1"\t"$4}' "$OUT/ANTISMASH-drep-refgenomes/antismash_bins_table.txt" > \
  "$OUT/ANTISMASH-drep-refgenomes/antismash_bins_table2.txt"
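
Before relying on the tabulated results, it can help to confirm that every genome actually produced antiSMASH output. The sketch below is an optional check, not part of the original workflow. `list_missing_runs` is a hypothetical helper, and it assumes that each input sample directory shares its name with the antiSMASH output directory and that a finished run leaves an `index.html` file there (the marker file name can be overridden with the third argument).

```shell
#!/bin/bash
# Hypothetical helper: print the name of every sample directory under $1
# that has no marker file in the directory of the same name under $2.
list_missing_runs() {
  local in_dir=$1 out_dir=$2 marker=${3:-index.html}
  local d n
  for d in "$in_dir"/*/; do
    [ -d "$d" ] || continue          # the glob matched nothing
    n=$(basename "$d")               # trailing slash is ignored by basename
    [ -e "$out_dir/$n/$marker" ] || echo "$n"
  done
}

# Usage with the directories from this notebook:
#   list_missing_runs "$OUT/PROKKA-drep-refgenomes" "$OUT/ANTISMASH-drep-refgenomes"
#   list_missing_runs "$OUT/PROKKA" "$OUT/ANTISMASH"
```

An empty result means every sample has the marker file. Any name printed can be cross-checked against the corresponding `log_<name>.txt` written by the loops above.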