gtAI is a new Python package for estimating the tRNA adaptation index (tAI).
- For more information about gtAI: https://www.frontiersin.org/articles/10.3389/fmolb.2023.1218518/full
Python >=3.7 is required.
- Biopython
- pandas
- numpy
- gaft
- lxml
Using pip
pip install gtAI
Contributions to the software are welcome
For bugs and suggestions, the most effective way is to raise an issue on the GitHub issue tracker. GitHub lets you classify your issue so that we know whether it is a bug report, a feature request, or feedback to the authors.
If you wish to contribute changes to the code, please submit a pull request (see GitHub's documentation on how to create a pull request).
from gtAI import Run_gtAI
df_tai, final_dict_wi, rel_values = Run_gtAI.gtai_analysis(main_fasta, GtRNA, genetic_code_number, size_pop, generation_number=50, ref_fasta=ref_fasta, bacteria=False)
Where:
main_fasta (str): The main FASTA file containing the genes to be analyzed.
GtRNA (dict): The tRNA gene copy numbers of the tested genome.
ref_fasta (str): A FASTA file of reference genes with the highest expression in the genome.
genetic_code_number (int): The genetic code number as defined by NCBI (https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi). (default = 1)
size_pop (int): A genetic algorithm parameter setting the population size, i.e. the number of candidate solutions used to optimize the Sij-values. (default = 60)
generation_number (int): A genetic algorithm parameter setting the number of generations. (default = 100)
bacteria (bool): True if the tested organism is a prokaryote (bacterium or archaeon), otherwise False. (default = False)
Note: for the ref_fasta parameter, the user can supply a reference set of interest (in FASTA format). Otherwise, the package automatically generates a reference set based on the ENc (effective number of codons) values of the tested genome. For more information, see the API documentation.
Note: The population size (size_pop) must be an even number.
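Because a genetic algorithm run can take a while, it may help to check the settings before calling gtai_analysis(). The snippet below is a minimal sketch of such a check; check_ga_parameters is a hypothetical helper written for this README, not part of the gtAI API, and it only enforces the even population size rule stated above plus basic positivity.

```python
def check_ga_parameters(size_pop=60, generation_number=100):
    """Hypothetical helper (not part of gtAI): validate GA settings early."""
    if not isinstance(size_pop, int) or size_pop <= 0:
        raise ValueError("size_pop must be a positive integer")
    if size_pop % 2 != 0:
        # The README states that the population size must be even.
        raise ValueError(f"size_pop must be an even number, got {size_pop}")
    if not isinstance(generation_number, int) or generation_number <= 0:
        raise ValueError("generation_number must be a positive integer")
    return size_pop, generation_number


size_pop, generation_number = check_ga_parameters(60, 100)
```

Calling the helper with an odd value such as 61 raises a ValueError before any analysis starts.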
Returns:
df_tai (dataframe): Contains each gene id and its gtAI value.
final_dict_wi (dict): Contains each codon and its absolute adaptiveness value.
rel_values (dict): Contains each codon and its relative adaptiveness values.
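To make the link between rel_values and a per-gene score concrete: in the classical tAI definition, the tAI of a gene is the geometric mean of the relative adaptiveness values of its codons. The sketch below illustrates that definition only. The gene_tai helper and the toy weights are hypothetical, codons without a weight (for example stop codons) are simply skipped, and gtAI's own implementation may handle such cases differently.

```python
import math


def gene_tai(sequence, rel_values):
    """Illustrative only: geometric mean of codon weights along a sequence."""
    sequence = sequence.upper()
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    # Work in log space; skip codons that have no (positive) weight.
    logs = [math.log(rel_values[c]) for c in codons
            if c in rel_values and rel_values[c] > 0]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Toy weights invented for illustration, not real gtAI output.
toy_rel_values = {"GCT": 1.0, "GCC": 0.25}
print(gene_tai("GCTGCC", toy_rel_values))  # geometric mean of 1.0 and 0.25, about 0.5
```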
1- Import gtAI functions.
from gtAI import Run_gtAI
from gtAI import gtAI
2- In this example, we will use Saccharomyces cerevisiae S288C coding sequences.
3- Prepare the tRNA gene copy number of the tested genome.
The user has two options: a) supply the tRNA gene copy numbers as a Python dictionary, or b) retrieve them automatically from a database, given the link to the tested genome (in our case Saccharomyces cerevisiae S288C). The GtRNAdb() function retrieves them from GtRNAdb, and the tRNADB_CE() function retrieves them from the tRNADB-CE database.
In this example, the second option (b) will be used.
url_GtRNAdb = "http://gtrnadb.ucsc.edu/genomes/eukaryota/Scere3/"
#### From GtRNAdb
GtRNA = gtAI.GtRNAdb(url_GtRNAdb)
For more information about GtRNAdb() and tRNADB_CE(), see the API documentation.
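For option (a), the dictionary can be built by hand or from any list of predicted tRNA genes. The sketch below counts one anticodon per tRNA gene with collections.Counter. The anticodon list is invented, and keying the dictionary by anticodon (written in DNA letters) is an assumption made for illustration; check the API documentation for the exact key format that gtai_analysis() expects.

```python
from collections import Counter

# Hypothetical input: one anticodon per predicted tRNA gene (toy data).
anticodons = ["AGC", "AGC", "TGC", "CGC", "TGC", "AGC"]

# Assumed layout: anticodon -> gene copy number.
GtRNA = dict(Counter(anticodons))
print(GtRNA)
```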
4- Parameter settings for gtai_analysis() function.
main_fasta = "SC.fasta"
genetic_code_number = 1
ref_fasta = ""
bacteria = False
size_pop = 60
generation_number = 100
For more information about gtai_analysis() and its parameters, see the API documentation.
5- Run gtAI.
df_tai, final_dict_wi, rel_values = Run_gtAI.gtai_analysis(
    main_fasta=main_fasta, GtRNA=GtRNA, ref_fasta=ref_fasta,
    genetic_code_number=genetic_code_number, size_pop=size_pop,
    generation_number=generation_number, bacteria=bacteria)
Returns:
df_tai (dataframe): Contains each gene id and its gtAI value
final_dict_wi (dict): Contains each codon and its absolute adaptiveness value
rel_values (dict): Contains each codon and its relative adaptiveness values
6- To save the gtAI result as a CSV file.
import pandas as pd
df_tai.to_csv("test.csv", header=True)
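The two codon dictionaries can be saved in a similar way. The sketch below writes a codon-to-value dictionary, such as rel_values or final_dict_wi, to a two-column CSV file using only the standard library. The save_codon_weights helper and the column names are hypothetical choices for this example, not part of gtAI.

```python
import csv


def save_codon_weights(weights, path):
    """Hypothetical helper: write a {codon: value} dict as a two-column CSV."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["codon", "relative_adaptiveness"])
        for codon in sorted(weights):  # sorted for a stable row order
            writer.writerow([codon, weights[codon]])
```

For example, save_codon_weights(rel_values, "rel_values.csv") would write one row per codon, sorted alphabetically.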
You can access the API documentation from here: gtAI Documentation
Anwar, Ali Mostafa, Saif M. Khodary, Eman Ali Ahmed, Aya Osama, Shahd Ezzeldin, Anthony Tanios, Sebaey Mahgoub, and Sameh Magdeldin. "gtAI: an improved species-specific tRNA adaptation index using the genetic algorithm". Frontiers in Molecular Biosciences 10 (2023). https://doi.org/10.3389/fmolb.2023.1218518