# ExTaxsI Python functions

ExTaxsI functions can be used separately from the ExTaxsI tool. \
There are six main functions, covering the following tasks:


*   load_configurations() - **mandatory before using ExTaxsI functions** - set the NCBI e-mail and API key required to query data
*   db_creation() - create FASTA, accession with taxonomy or enriched database
*   taxonomyID_converter() - convert NCBI taxid into organism taxonomy and vice versa
*   sunburst_plot() - create sunburst plot
*   scatterplot() - create scatterplot
*   worldmap_plot() - create worldmap plot

Installation via conda and pip is described below, followed by a tutorial on configuration and usage. The input files used are available in the **examples** directory. Use help() to see the details of each ExTaxsI function; this notebook reports the descriptions of all the main functions.

For any questions, please write to g.agostinetto@campus.unimib.it

## Conda installation

Source: https://github.com/qLSLab/ExTaxsI

Requirements:


*   Python 3 (https://www.python.org/)
*   Conda or Miniconda (https://conda.io/projects/conda/en/latest/user-guide/install/index.html)



In [None]:
# environment creation

conda create --name extaxsi -c qlslab -c conda-forge -c etetoolkit

# activate

conda activate extaxsi

# deactivate

conda deactivate

## PyPI installation

Source: https://pypi.org/project/extaxsi/

In [None]:
# installation via pip 

pip install extaxsi
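
After installing, it can be useful to confirm that the package is visible to the current interpreter. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the distribution name `extaxsi` is taken from the PyPI URL above.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "extaxsi" is the distribution name used in the pip command above.
version = installed_version("extaxsi")
if version is None:
    print("extaxsi is not installed in this environment")
else:
    print(f"extaxsi {version} is installed")
```

If the first message is printed inside the conda environment, the environment is probably not activated or the installation step did not complete.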

## Configuration and usage

The configuration step is **mandatory** before using ExTaxsI. \
See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/ to create an NCBI account and obtain your API key, then pass your e-mail and API key to load_configurations().

In [None]:
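# import functions
# the cells below call the functions by their bare names, so they must be in the namespace
# (assumption: the package exposes them at top level; adjust the module path if your install differs)

from extaxsi import *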

In [None]:
# function description

help(load_configurations)

    Load configurations necessary to NCBI and update taxonomy database.
    Keyword arguments:
    entrez_email (str) -- email registered on NCBI.
    api_key (str) -- api key generated by NCBI.
    taxa_database_update (str) -- download or update taxonomy database (default 'yes').

In [None]:
# taxa_database_update -- download or update taxonomy database (default 'yes').
# when configuring for the first time, use 'yes' or omit the argument

load_configurations("user@email.com","api_key_here", taxa_database_update = 'no')
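
To avoid hardcoding the e-mail and API key in a shared notebook, they can be read from environment variables before calling load_configurations(). This is only a sketch: the helper `read_ncbi_credentials` and the variable names `NCBI_EMAIL` and `NCBI_API_KEY` are hypothetical choices for illustration, not part of ExTaxsI.

```python
import os


def read_ncbi_credentials(environ=None):
    """Return (email, api_key) read from a mapping of environment variables.

    NCBI_EMAIL and NCBI_API_KEY are hypothetical names chosen for this sketch.
    """
    if environ is None:
        environ = os.environ
    email = environ.get("NCBI_EMAIL", "").strip()
    api_key = environ.get("NCBI_API_KEY", "").strip()
    # Collect the names of any variables that are unset or empty.
    missing = [
        name
        for name, value in (("NCBI_EMAIL", email), ("NCBI_API_KEY", api_key))
        if not value
    ]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return email, api_key


# Example with an explicit mapping instead of the real environment:
email, api_key = read_ncbi_credentials(
    {"NCBI_EMAIL": "user@email.com", "NCBI_API_KEY": "api_key_here"}
)
# The pair can then be passed on, for example:
# load_configurations(email, api_key, taxa_database_update='no')
```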

### Database creation

In [None]:
help(db_creation)

    Create FASTA, accession with taxonomy or enriched database.
    Keyword arguments:
    text_search (str) -- manual input from the user, ex. txid8832 (default None).
    file_search (str) -- file path from the user (default None).
    input_file_type (str) -- 'T','A' or 'O' if the input file is a list of Taxid,
                        accession or organism (default None).
    additional_query (str) -- add string if you want to restrict your output to a specific
                        element, ex. 'COI' (default 0).
    fasta_output -- create fasta database (default False).
    accession_taxonomy_output -- create file with accession and 6 rank taxonomy (default False).
    marker_output -- create gene marker output (default False).
    top10_plot -- create top 10 plot of marker output (default False).
    enrich_output -- create enriched file, i.e. accession with coordinates or country
                     location, gene information and organism (default False).
    overwrite_file -- if the file already exists, overwrite it (default True).

In [None]:
# example 1
# input text_search txid8832 = Aix galericulata

db_creation(text_search='txid8832',
            accession_taxonomy_output=True,
            fasta_output=False,
            marker_output=True,
            top10_plot=True,
            enrich_output=True)

In [None]:
# example 2
# input file_search = accession list available in the examples directory
# (https://github.com/qLSLab/ExTaxsI/tree/master/examples)

db_creation(file_search='example/A_accession_list_example.tsv',
            input_file_type='A',
            accession_taxonomy_output=True,
            fasta_output=True,
            marker_output=True,
            top10_plot=True)
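
The two examples show the two input modes of db_creation(): a text query, or a file plus its type. The sketch below checks such arguments before a long-running query is started. `check_db_creation_inputs` is a hypothetical helper, and the rule that exactly one of text_search and file_search is given is an assumption drawn from the examples, not documented ExTaxsI behaviour; only the 'T', 'A' and 'O' codes come from the help text.

```python
from pathlib import Path

# Codes taken from the db_creation help text.
VALID_FILE_TYPES = {"T": "taxid", "A": "accession", "O": "organism"}


def check_db_creation_inputs(text_search=None, file_search=None, input_file_type=None):
    """Raise if the input arguments look inconsistent; return True otherwise."""
    # Assumption: a call uses either a text query or an input file, not both.
    if (text_search is None) == (file_search is None):
        raise ValueError("give exactly one of text_search or file_search")
    if file_search is not None:
        if input_file_type not in VALID_FILE_TYPES:
            raise ValueError("input_file_type must be 'T', 'A' or 'O'")
        if not Path(file_search).is_file():
            raise FileNotFoundError(file_search)
    return True


# Passes: a single text query, as in example 1.
check_db_creation_inputs(text_search="txid8832")
```

Running the check first gives an immediate error for a mistyped path or type code instead of a failure partway through the query.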

### Taxonomy ID converter

In [None]:
help(taxonomyID_converter)

    Convert taxid into organism taxonomy and vice versa.
    Keyword arguments:
    text_search (str) -- manual search from the user (default None).
    file_search (str) -- file search from the user, see the examples in the documentation (default None).
    input_type (str) -- 'O' organism search, 'T' taxID search (default 'O').


In [None]:
taxonomyID_converter(file_search = 'example/O_organism_list_example.tsv', input_type = 'O')
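
When the query comes from user input, the input_type code can be chosen from the query itself. A minimal sketch, assuming that a taxID is written as plain digits and anything else is an organism name; `guess_input_type` is a hypothetical helper, not an ExTaxsI function.

```python
def guess_input_type(query):
    """Return 'T' for a purely numeric query (taxID), 'O' otherwise (organism)."""
    text = query.strip()
    if not text:
        raise ValueError("empty query")
    return "T" if text.isdigit() else "O"


print(guess_input_type("8832"))              # numeric, treated as a taxID
print(guess_input_type("Aix galericulata"))  # text, treated as an organism name
```

The returned code could then be passed as input_type together with text_search.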

### Sunburst visualization

In [None]:
help(sunburst_plot)

    Create sunburst plot.
    Keyword arguments:
    arg[0] (str) -- path to file, accession with taxonomy output if created with db_creation.
    arg[1] (str) -- title of the plot.
    filter_value (int) -- enter the minimum number of accessions per organism (default 0).

In [None]:
sunburst_plot("download/txid8832_taxonomy.tsv", "example")
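
The filter_value argument sets a minimum number of accessions per organism. The sketch below illustrates that idea on in-memory data; it is not ExTaxsI's implementation. The helper name, the placeholder accession identifiers and the inclusive comparison (`>=`) are assumptions made for illustration.

```python
from collections import Counter


def organisms_passing_filter(records, filter_value=0):
    """Count accessions per organism and keep organisms with at least filter_value."""
    counts = Counter(organism for _accession, organism in records)
    return {organism: n for organism, n in counts.items() if n >= filter_value}


# Placeholder (accession, organism) pairs, not real NCBI records.
records = [
    ("ACC0001", "Aix galericulata"),
    ("ACC0002", "Aix galericulata"),
    ("ACC0003", "Aix sponsa"),
]
print(organisms_passing_filter(records, filter_value=2))
```

With the default of 0 every organism is kept, which matches the unfiltered call above.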

### Scatter plot visualization

In [None]:
help(scatterplot)

    Create scatterplot
    Args:
        accession_taxonomy_output ([str]): path to file, accession with taxonomy output if created with db_creation.
        title_graph ([str]): title of the plot
        filter_value (int, optional): enter the minimum number of accessions per organism. Defaults to 0.

In [None]:
scatterplot("download/txid8832_taxonomy.tsv", "example")

### World map plot visualization

In [None]:
help(worldmap_plot)

    Create worldmap plot.
    Keyword arguments:
    enrich_output -- path to file, enriched output if created with db_creation.
    title_map -- title of the plot.
    Returns:
    plot_world_map -- interactive world map of features distributions.

In [None]:
worldmap_plot("download/txid8832_enriched.tsv",'example_worldmap')
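
The plotting calls above read files written by db_creation(), so it can help to confirm those files exist before plotting. A minimal sketch with the standard library; `missing_files` is a hypothetical helper, and the paths are the ones used in the calls above.

```python
from pathlib import Path


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [str(path) for path in paths if not Path(path).is_file()]


# Paths taken from the plotting examples in this notebook.
expected = [
    "download/txid8832_taxonomy.tsv",
    "download/txid8832_enriched.tsv",
]
missing = missing_files(expected)
if missing:
    print("run db_creation first; missing:", ", ".join(missing))
else:
    print("all input files found")
```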