# Initialize your IcaData object

## Minimum requirements
The IcaData object only requires two matrices, the **M** and the **A** matrices. However, many functions (such as [TRN enrichment]) will be limited without including additional data.  
Each matrix can be passed in as either a filename or a [pandas](https://pandas.pydata.org/) DataFrame.

In [3]:
from pymodulon.core import IcaData
from os import path

In [None]:
DATA_DIR = path.abspath(path.join('..','data','precise_data'))

In [None]:
M = path.join(DATA_DIR, 'M.csv')
A = path.join(DATA_DIR, 'A.csv')

In [None]:
ica_data = IcaData(M,A)
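
As an alternative to filenames, the matrices can be built as DataFrames first. The following is a minimal sketch with made-up toy values; the labels (`geneA`, `im0`, `s1`, ...) and the variable names `M_df` and `A_df` are hypothetical. It assumes the usual layout in which **M** has one row per gene and one column per iModulon, and **A** has one row per iModulon and one column per sample.

```python
import pandas as pd

# Hypothetical toy labels; real names come from your own dataset.
genes = ["geneA", "geneB", "geneC"]
imodulons = ["im0", "im1"]
samples = ["s1", "s2", "s3", "s4"]

# M: one row per gene, one column per iModulon (gene weights).
M_df = pd.DataFrame(
    [[0.9, 0.0], [0.1, 0.5], [0.0, 0.7]],
    index=genes,
    columns=imodulons,
)

# A: one row per iModulon, one column per sample (activities).
A_df = pd.DataFrame(
    [[1.0, 0.5, 0.0, -1.0], [0.0, 2.0, 1.0, 0.5]],
    index=imodulons,
    columns=samples,
)

# The two matrices must agree on the iModulon axis.
assert list(M_df.columns) == list(A_df.index)
print(M_df.shape, A_df.shape)
```

DataFrames shaped like these could then be passed in place of the filenames, as in `IcaData(M_df, A_df)`.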

In [None]:
from pymodulon.visualization import *

## Viewing matrices
You can view the M and A matrices directly as pandas DataFrames.

In [None]:
ica_data.A.head()

## Sample, gene, and iModulon names
If the M and A matrices have sample/gene/iModulon names, these will be saved as the official sample/gene/iModulon names.

In [None]:
print('Gene names:',ica_data.gene_names[:5])
print('Sample names:',ica_data.sample_names[:5])
print('iModulon names:',ica_data.imodulon_names[:5])

## Adding the X matrix
The **X** matrix contains expression data and is mainly used for plotting functions.

In [None]:
X = path.join(DATA_DIR, 'X.csv')

In [None]:
ica_data.X = X

# Sample, gene, and iModulon tables
You may load additional data tables with information about your samples, genes, or iModulons. If you do not load any data, these tables are initially empty, but they can be modified like any pandas DataFrame.

In [None]:
ica_data.gene_table.head()

In [None]:
ica_data.gene_table['gene_name'] = 'example'

In [None]:
ica_data.gene_table.head()

## Loading tables
Sample/gene/iModulon tables contain one sample/gene/iModulon per row, and information about the respective item in columns.  
**The sample/gene/iModulon names must match the names in your M and A matrix.** However, the order of the names does not matter.  
The tables can be passed in using either a filename or a pandas DataFrame.

In [None]:
gene_file = path.join(DATA_DIR, 'gene_table.csv')
sample_file = path.join(DATA_DIR, 'sample_table.csv')
imodulon_file = path.join(DATA_DIR, 'imodulon_table.csv')
print('Gene file:',gene_file)

In [None]:
ica_data.gene_table = gene_file
ica_data.sample_table = sample_file
ica_data.imodulon_table = imodulon_file
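
Because table names must match the matrix names while order is free, it can help to check a table before assigning it. This is a minimal sketch using plain pandas; `check_table_names` and the toy table are hypothetical helpers for illustration and are not part of pymodulon.

```python
import pandas as pd


def check_table_names(table, expected_names):
    """Compare a table's index against the expected names.

    Returns (missing, extra): names expected but absent from the table,
    and names present in the table but not expected.
    """
    table_names = set(table.index)
    expected = set(expected_names)
    missing = sorted(expected - table_names)
    extra = sorted(table_names - expected)
    return missing, extra


# Toy stand-in for the gene names taken from the M matrix.
gene_names = ["b0001", "b0002", "b0003"]

# Toy gene table whose rows are in a different order than gene_names.
toy_gene_table = pd.DataFrame(
    {"gene_name": ["thrA", "thrL", "thrB"]},
    index=["b0002", "b0001", "b0003"],
)

missing, extra = check_table_names(toy_gene_table, gene_names)
print("Missing:", missing, "Extra:", extra)

# Order does not matter for matching; reindex only if you want aligned rows.
aligned = toy_gene_table.reindex(gene_names)
```

With a real object, `expected_names` would be something like `ica_data.gene_names`, and the table would be the DataFrame you intend to assign.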

## Viewing tables

In [None]:
ica_data.sample_table.head()

In [None]:
ica_data.gene_table.head()

In [None]:
ica_data.imodulon_table.head()

# TRN Table

## Load the TRN
Again, this table can be passed in using either a filename or a pandas DataFrame.

In [None]:
import pandas as pd

trn = pd.read_csv(path.join(DATA_DIR, 'trn.csv'))
ica_data.trn = trn

In [None]:
ica_data.trn.head()

# iModulon properties

## Inspecting iModulons
The `view_imodulon` method shows information about each gene in an iModulon. Most of this information is stored in the `gene_table`, but the `regulator` column comes from the TRN.

In [None]:
ica_data.view_imodulon(0)
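
To illustrate how a `regulator` column can be derived from a TRN and attached to gene information, here is a rough sketch in plain pandas. It assumes a TRN with one regulator-gene pair per row and columns named `regulator` and `gene_id`; those column names and the toy data are assumptions, and this is not necessarily how `view_imodulon` builds its output.

```python
import pandas as pd

# Toy TRN: one row per regulator-gene interaction (column names are assumed).
toy_trn = pd.DataFrame(
    {
        "regulator": ["fur", "crp", "crp"],
        "gene_id": ["g1", "g1", "g2"],
    }
)

# Toy gene table indexed by gene ID.
toy_gene_table = pd.DataFrame(
    {"gene_name": ["a", "b", "c"]},
    index=["g1", "g2", "g3"],
)

# Collapse all regulators of each gene into one comma-separated string.
regulators = toy_trn.groupby("gene_id")["regulator"].apply(
    lambda r: ",".join(sorted(r))
)

# Attach the regulator column; genes absent from the TRN get NaN.
annotated = toy_gene_table.join(regulators)
print(annotated)
```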

## Renaming iModulons

In [None]:
# Rename iModulons using a dict
ica_data.rename_imodulons({0:'YieP'})

ica_data.imodulon_table.head()

In [None]:
# Rename iModulons with a list
ica_data.imodulon_names = ica_data.imodulon_table.name

In [None]:
# Renaming also updates the iModulon names in the M and A matrices
ica_data.M.head()

In [None]:
ica_data.A.head()
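
The effect of a rename on the two matrices can be sketched with plain pandas, assuming iModulons are the columns of **M** and the rows of **A**. The toy matrices and variable names are hypothetical; pymodulon handles this internally when you call `rename_imodulons`.

```python
import pandas as pd

# Toy matrices with default integer iModulon names.
M_toy = pd.DataFrame(
    [[0.9, 0.0], [0.1, 0.5]],
    index=["g1", "g2"],
    columns=[0, 1],
)
A_toy = pd.DataFrame(
    [[1.0, 0.5], [0.0, 2.0]],
    index=[0, 1],
    columns=["s1", "s2"],
)

# Same mapping as in the dict-based rename above.
name_map = {0: "YieP"}

# Apply the mapping to the iModulon axis of each matrix.
M_renamed = M_toy.rename(columns=name_map)
A_renamed = A_toy.rename(index=name_map)

print(list(M_renamed.columns), list(A_renamed.index))
```

Names not listed in the mapping are left unchanged, so a partial dict is enough.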

## Identifying "single-gene" iModulons
Some iModulons are dominated by a single, high-coefficient gene. These iModulons may result from:
1. Overdecomposition of the dataset, which can isolate individual noisy genes
2. Artificial knock-out of single genes
3. Regulons with only one target gene

No matter what causes these iModulons, it is important to be aware of them. The `find_single_gene_imodulons` function identifies iModulons that are likely dominated by a single gene.

In [None]:
ica_data.find_single_gene_imodulons(save=True)

In [None]:
ica_data.imodulon_table.loc[['fur-KO']]
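
One way to think about "dominated by a single gene" is to compare the largest absolute gene weight in a component with the second largest. The sketch below implements that idea on a toy matrix; the function name `single_gene_components` and the `ratio_cutoff` value of 0.1 are illustrative assumptions, and the heuristic is not guaranteed to match what `find_single_gene_imodulons` does internally.

```python
import pandas as pd


def single_gene_components(M, ratio_cutoff=0.1):
    """Flag columns of M whose second-largest |weight| is tiny relative to the largest.

    Assumes M has at least two rows (genes).
    """
    flagged = []
    for name in M.columns:
        weights = M[name].abs().sort_values(ascending=False)
        top, second = weights.iloc[0], weights.iloc[1]
        if top > 0 and second / top < ratio_cutoff:
            flagged.append(name)
    return flagged


# Toy M matrix: im0 is dominated by g1, im1 has several comparable genes.
M_toy = pd.DataFrame(
    {"im0": [0.95, 0.02, -0.01], "im1": [0.40, -0.35, 0.30]},
    index=["g1", "g2", "g3"],
)

flagged = single_gene_components(M_toy)
print(flagged)
```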

# Saving and Loading IcaData Objects
To facilitate data sharing, you can save IcaData objects as JSON files that can be easily reloaded.

In [None]:
from pymodulon.io import save_to_json, load_json_model

In [None]:
save_to_json(ica_data,'../example_data/example.json')

In [None]:
ica_data = load_json_model('../example_data/example.json')

In [None]:
ica_data.imodulon_table.head()
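
To show why JSON works well for sharing labeled matrices, here is a small round-trip sketch using pandas and the standard `json` module. The `"M"` key and the `split` layout are assumptions chosen for illustration; they are not a description of the file format written by `save_to_json`.

```python
import json

import pandas as pd

# Toy matrix with gene and iModulon labels.
M_toy = pd.DataFrame(
    [[0.9, 0.0], [0.1, 0.5]],
    index=["g1", "g2"],
    columns=["im0", "im1"],
)

# Serialize: the "split" orientation keeps index, columns, and data separate.
payload = json.dumps({"M": M_toy.to_dict(orient="split")})

# Deserialize and rebuild the DataFrame from its three parts.
loaded = json.loads(payload)["M"]
M_back = pd.DataFrame(
    loaded["data"], index=loaded["index"], columns=loaded["columns"]
)

print(M_back.equals(M_toy))
```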