Welcome to the eccToolkit repository. This repository contains the analytical code employed in the research paper titled "Dynamics of Extrachromosomal Circular DNA in Rice."
eccToolkit is a specialized software suite designed for the analysis and classification of eccDNA (extrachromosomal circular DNA) elements. It comprises two main scripts, eccDNA_Repeat_Classifier.py and eccDNA_Detected_eccDNA.py, which together facilitate the detection and categorization of eccDNA sequences from genomic data.
- eccDNA Detection: Identifies eccDNA elements within genomic data.
- Repeat Classification: Classifies detected eccDNA elements based on their repeat characteristics.
- Comprehensive Analysis: Integrates various data processing steps to provide detailed insights into eccDNA profiles.
- Python environment with necessary dependencies installed (e.g., Pandas, NumPy).
- Genomic data files in appropriate formats (e.g., BED, FASTA).
- Install Mamba via Miniconda or Anaconda, if not already installed.
- Create and activate the eccToolkit environment:
    mamba env create -f eccToolkit.yml
    conda activate eccToolkit
- Alternatively, use Conda for installation:
    conda env create -f eccToolkit.yml
    conda activate eccToolkit
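After activating the environment, a quick import check can confirm that the installation worked. The sketch below is not part of eccToolkit. It only looks for the two packages this README names as examples (pandas and NumPy). The authoritative dependency list is eccToolkit.yml, and the helper name check_packages is made up for illustration.

```python
import importlib.util

# Assumption: these are the example packages named in this README.
# The full dependency list lives in eccToolkit.yml.
ASSUMED_PACKAGES = ("pandas", "numpy")


def check_packages(names=ASSUMED_PACKAGES):
    """Return a dict mapping each top-level package name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If a package is reported as missing, the environment was probably not activated in the current shell.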
This script is responsible for the initial detection of eccDNA elements from genomic data.
- -i, --input (Required): Path to the input genomic data file.
- -o, --output (Required): Path for the output file where detected eccDNA elements will be listed.
- Additional parameters specific to the detection algorithm (e.g., sensitivity settings).
python eccDNA_Detected_eccDNA.py -i input_genomic_data.bed -o detected_eccDNA.csv
- Input: BED format genomic data.
- Output: CSV file listing detected eccDNA elements.
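Because the detection script expects BED input, it can help to sanity-check a file before running it. The following is a minimal sketch that assumes a standard BED layout: tab-separated, with at least chromosome, start, and end columns. This README does not say whether eccDNA_Detected_eccDNA.py requires further columns, and the helper name check_bed is hypothetical.

```python
import csv


def check_bed(path):
    """Return a list of (line_number, message) problems found in a BED file.

    Assumption: standard BED, i.e. tab-separated with chrom, start, end
    as the first three columns and integer coordinates.
    """
    problems = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for line_no, row in enumerate(reader, start=1):
            # Skip blank lines and the comment/header lines BED permits.
            if not row:
                continue
            first = row[0]
            if first.startswith("#") or first.split(" ", 1)[0] in ("track", "browser"):
                continue
            if len(row) < 3:
                problems.append((line_no, "fewer than 3 columns"))
                continue
            try:
                start, end = int(row[1]), int(row[2])
            except ValueError:
                problems.append((line_no, "start/end are not integers"))
                continue
            if start < 0 or end < start:
                problems.append((line_no, "expected 0 <= start <= end"))
    return problems


# Example use (file name taken from the command above):
#   for line_no, message in check_bed("input_genomic_data.bed"):
#       print(f"line {line_no}: {message}")
```

An empty list means no structural problems were found. It does not guarantee that the detection script will accept the file.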
This script classifies the detected eccDNA elements based on their repeat characteristics.
- -i, --input (Required): Path to the input file containing detected eccDNA elements.
- -o, --output (Required): Path for the output file where classified eccDNA elements will be listed.
- Additional parameters for classification criteria (e.g., repeat length, type).
python eccDNA_Repeat_Classifier.py -i detected_eccDNA.csv -o classified_eccDNA.csv
- Input: CSV file from eccDNA_Detected_eccDNA.py.
- Output: CSV file with eccDNA elements classified by repeat type.
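Once the classified CSV exists, a short summary of how many elements fall into each repeat type is often the first thing to look at. The sketch below assumes only that the output has a header row. The column holding the repeat type is not documented here, so its name is passed in as a parameter, and "repeat_type" in the usage comment is a placeholder rather than the real column name.

```python
import csv
from collections import Counter


def count_by_column(path, column):
    """Count how many rows share each value in `column` of a CSV file with a header."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found; available: {reader.fieldnames}")
        return Counter(row[column] for row in reader)


# Example use ("repeat_type" is a hypothetical column name; check the CSV header):
#   for label, n in count_by_column("classified_eccDNA.csv", "repeat_type").most_common():
#       print(label, n)
```

Inspecting the header of classified_eccDNA.csv first shows which column actually carries the classification.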
To fully utilize eccToolkit, follow these steps:
- Run eccDNA_Detected_eccDNA.py to detect eccDNA elements.
- Use the output of the first script as the input for eccDNA_Repeat_Classifier.py to classify the detected elements.
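The two steps can also be chained from a small driver script. This is a sketch, not something shipped with eccToolkit. It assumes both scripts sit in the current working directory and uses only the -i and -o options documented above. The function names build_commands and run_pipeline are made up for illustration.

```python
import subprocess
import sys


def build_commands(bed_path, detected_csv, classified_csv):
    """Build the two command lines; the detection output feeds the classifier."""
    return [
        [sys.executable, "eccDNA_Detected_eccDNA.py", "-i", bed_path, "-o", detected_csv],
        [sys.executable, "eccDNA_Repeat_Classifier.py", "-i", detected_csv, "-o", classified_csv],
    ]


def run_pipeline(bed_path, detected_csv, classified_csv):
    """Run detection, then classification; stop at the first failing step."""
    for command in build_commands(bed_path, detected_csv, classified_csv):
        # check=True raises CalledProcessError if a script exits with a non-zero status.
        subprocess.run(command, check=True)


# Example use (file names taken from the commands in this README):
#   run_pipeline("input_genomic_data.bed", "detected_eccDNA.csv", "classified_eccDNA.csv")
```

Using check=True means the classifier never runs on a missing or partial detection output.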
This project is licensed under the GNU General Public License v3.0. Refer to the LICENSE file for detailed terms and conditions.