# **Mobile Genetic Element Identification – MEFinder**

## Tool Information

- **Tool:** MEFinder (1.0.5)
- **Input:** Assembled genome contigs (FASTA)
- **Organism:** *Acinetobacter baumannii*
- **Analysis type:** Mobile genetic element detection

This notebook documents the identification of **mobile genetic elements (MGEs)** in *Acinetobacter baumannii* genomes using **MEFinder**.

MEFinder is a tool designed to detect **mobile genetic elements**, including insertion sequences (IS elements) and transposon-associated features, in bacterial genome assemblies.

It identifies known mobile elements by sequence similarity against curated databases, enabling the characterization of genomic regions involved in horizontal gene transfer.

The analysis was performed on assembled genome contigs generated by the GHRU assembly pipeline. Outputs from this step feed into the downstream interpretation of how antimicrobial resistance determinants are disseminated.

In [1]:
%%bash

# initialise conda
source /home/anaconda/miniconda3/etc/profile.d/conda.sh
conda activate mefinder

# check installation
mefinder --version

1.0.5


## Input Files

The input for MEFinder consists of assembled genome contigs generated by the
GHRU assembly pipeline.

### Input requirements

- Genome assemblies in FASTA format
- One assembly per sample
- Assemblies should be quality-checked prior to analysis
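
The requirements above can be checked before MEFinder is run. The following is a minimal sketch, not part of the original workflow: the helper names `is_fasta`, `count_contigs`, and `check_assemblies` are hypothetical, and the `.short.fasta` suffix is taken from the execution cell later in this notebook. It only confirms that each file is non-empty and begins with a FASTA header; it does not replace proper assembly quality control.

```bash
# Hypothetical helper: succeed if the file is non-empty and its first
# non-blank line starts with '>' (a FASTA header).
is_fasta() {
    local file="$1" first
    [ -s "$file" ] || return 1
    first=$(grep -m 1 -v '^[[:space:]]*$' "$file" || true)
    case "$first" in
        '>'*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Hypothetical helper: count sequence records (header lines) in a FASTA file.
count_contigs() {
    grep -c '^>' "$1"
}

# Hypothetical helper: report OK/FAIL for every *.short.fasta file in a
# directory; returns non-zero if any file fails the check.
check_assemblies() {
    local assembly_dir="$1" fasta status=0
    for fasta in "$assembly_dir"/*.short.fasta; do
        [ -e "$fasta" ] || continue   # skip the literal pattern if nothing matched
        if is_fasta "$fasta"; then
            printf 'OK\t%s\t%s contigs\n' "$(basename "$fasta")" "$(count_contigs "$fasta")"
        else
            printf 'FAIL\t%s\n' "$(basename "$fasta")"
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_assemblies` with the assembly directory in a `%%bash` cell prints one line per assembly, so malformed or empty inputs can be spotted before any MEFinder run is started.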

## MEFinder Database Initialization (First-Time Setup)

MEFinder requires a reference database to identify mobile genetic elements. This database needs to be initialized **once per environment** before running the `find` command.

For users running MEFinder for the first time, database initialization can be performed using the following command:

In [None]:
%%bash
# One-time database initialization (run only once per environment)
source /home/anaconda/miniconda3/etc/profile.d/conda.sh
conda activate mefinder

# /path/to/database is a placeholder: replace it with the database location
mefinder index --db-path /path/to/database

Once the database has been initialized, this step does not need to be repeated for subsequent analyses unless the MEFinder installation or database is updated.

For this analysis, the MEFinder reference database had already been generated during a prior setup step and was available in the execution environment. Database initialization was therefore not repeated, and the existing database was reused for all MEFinder runs.

## MEFinder Execution

MEFinder was executed using the `find` subcommand on assembled genome contigs.
Each assembly was processed individually, and results were written to per-sample
output files. Runtime messages were captured in a dedicated log file to ensure
traceability without cluttering the notebook output.

In [7]:
%%bash

# initialise conda
source /home/anaconda/miniconda3/etc/profile.d/conda.sh
conda activate mefinder

# define paths
ASSEMBLY_DIR=/data/internship_data/nidhi/aba/new_output/nextflow_output/assemblies
MEFINDER_OUTDIR=/data/internship_data/nidhi/aba/new_output/mefinder_output
LOGDIR=/data/internship_data/nidhi/aba/new_output/logs

# create directories
mkdir -p "$MEFINDER_OUTDIR" "$LOGDIR"

# run MEFinder per assembly
for fasta in "$ASSEMBLY_DIR"/*.short.fasta; do
    sample=$(basename "$fasta" .short.fasta)

    mefinder find \
        --contig "$fasta" \
        --threads 4 \
        "$MEFINDER_OUTDIR/${sample}_mefinder.tsv" \
        >> "$LOGDIR/mefinder.log" 2>&1
done
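
Because runtime messages go to the log file rather than the notebook, a failed sample is easy to overlook. The sketch below is one possible completeness check, not part of the original run: the function name `list_missing_samples` is hypothetical, and it assumes the `<sample>_mefinder.csv` naming seen in this notebook's output directory. It lists every assembly that has no non-empty CSV result.

```bash
# Hypothetical helper: print the sample name of every *.short.fasta
# assembly that lacks a non-empty <sample>_mefinder.csv in the output directory.
list_missing_samples() {
    local assembly_dir="$1" out_dir="$2" fasta sample
    for fasta in "$assembly_dir"/*.short.fasta; do
        [ -e "$fasta" ] || continue   # skip the literal pattern if nothing matched
        sample=$(basename "$fasta" .short.fasta)
        [ -s "$out_dir/${sample}_mefinder.csv" ] || echo "$sample"
    done
    return 0
}
```

Running `list_missing_samples "$ASSEMBLY_DIR" "$MEFINDER_OUTDIR"` after the loop prints nothing when every sample has a result; any names it does print are the samples whose entries in `mefinder.log` are worth inspecting.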

## Expected Outputs

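MEFinder generates multiple output files for each analysed genome assembly. These files capture both the detected mobile genetic elements and their corresponding sequence information. The output name passed to `mefinder find` (`<sample>_mefinder.tsv`) is used as a base name rather than as a literal file path, which is why `.tsv` appears inside the `_mge_sequences.fna` and `_result.txt` file names while the tabular summary carries a `.csv` extension.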

For each sample, the following outputs are produced:

In [11]:
%%bash
ls /data/internship_data/nidhi/aba/new_output/mefinder_output | head -n 6

ABA-1000_mefinder.csv
ABA-1000_mefinder.tsv_mge_sequences.fna
ABA-1000_mefinder.tsv_result.txt
ABA-1001_mefinder.csv
ABA-1001_mefinder.tsv_mge_sequences.fna
ABA-1001_mefinder.tsv_result.txt


- **`<sample>_mefinder.csv`**  
  Tabular summary of mobile genetic elements detected in the genome. This file contains information on element type, family, and associated annotations and is used for downstream quantitative and comparative analyses.


- **`<sample>_mefinder.tsv_mge_sequences.fna`**  
  FASTA file containing nucleotide sequences of the detected mobile genetic elements extracted from the genome assembly. These sequences can be used for further inspection, comparative analysis, or custom BLAST searches.


- **`<sample>_mefinder.tsv_result.txt`**  
  Detailed text-based report generated by MEFinder, containing alignment-level and detection information for each identified mobile genetic element.

All output files are generated on a per-sample basis and are stored in a dedicated MEFinder output directory.
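
As a first step towards the comparative analyses mentioned above, the per-sample CSV files can be reduced to a single table of element counts. The sketch below is illustrative only. It assumes that metadata lines in the CSV start with `#` and that the first remaining non-blank line is the column header; this should be confirmed against an actual output file before relying on the counts. The function names `count_mge_rows` and `summarise_mge_counts` are hypothetical.

```bash
# Hypothetical helper: count MGE records in one MEFinder CSV.
# Assumption: lines starting with '#' are metadata, and the first remaining
# non-blank line is the column header (hence the "n - 1").
count_mge_rows() {
    awk '!/^#/ && NF { n++ } END { print (n > 0 ? n - 1 : 0) }' "$1"
}

# Hypothetical helper: print a tab-separated sample/count table for every
# <sample>_mefinder.csv file in a directory.
summarise_mge_counts() {
    local out_dir="$1" csv sample
    printf 'sample\tmge_count\n'
    for csv in "$out_dir"/*_mefinder.csv; do
        [ -e "$csv" ] || continue   # skip the literal pattern if nothing matched
        sample=$(basename "$csv" _mefinder.csv)
        printf '%s\t%s\n' "$sample" "$(count_mge_rows "$csv")"
    done
    return 0
}
```

For example, `summarise_mge_counts "$MEFINDER_OUTDIR" > mge_counts.tsv` would write one row per sample, with `mge_counts.tsv` being an arbitrary file name chosen for illustration. Counting rows this way treats every CSV line as one record, so it would miscount if any field contained an embedded line break.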