# Installation

Materials available in the repository:

*   files for tutorials (dir: **tutorials**)
*   files for tests (dir: **test**)
*   functions (dir: **functions**)
*   template files to be filled (dir: **template_inputs**)
*   guided scripts (**script_***)
*   requirements.txt (for conda installation)






Prerequisites:

*   Python 3 (https://www.python.org/)
*   Conda or Miniconda (https://conda.io/projects/conda/en/latest/user-guide/install/index.html)



In [None]:
# clone the repository via git (or download the zip folder directly from the GitHub page)

git clone https://github.com/qLSLab/microFim.git

In [None]:
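# create the conda environment (the built-in conda channel is named 'defaults')

conda create --name microFIM --file requirements.txt --channel defaults --channel conda-forge --channel plotly

# activate the environment before running the scripts

conda activate microFIM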

# Script usage

Guided scripts must be run from the main directory (inside microFIM, after cloning the repository and creating the environment). The scripts are interactive and support auto-completion for ease of use.

We suggest creating a dedicated directory for your project and using it for both inputs and outputs.

In [None]:
python script_1_filtertable.py

This script can be used to filter your otu/taxa table based on a list of samples.
Files required and mandatory instructions:
* otu/esv/taxa table - the column name of OTU or TAXA must be '#ID'
* sample list  - the first row of your sample list must be '#SampleID'

The script will ask you to set the input directory and the two files mentioned above
(otu/esv/taxa table and sample list). The file format does not matter at this stage:
the script will ask you for the type of separator.

The output file will be a filtered CSV file saved into the input directory
(in order to allow subsequent analysis).
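The filtering step described above can be sketched as follows. This is only an illustration using the standard library, not the code of script_1_filtertable.py: the toy data and the helper name `filter_table` are hypothetical, and the only conventions taken from the text are the '#ID' column and the '#SampleID' header.

```python
import csv
import io

# Toy inputs (hypothetical data); real files would be opened from disk.
table_text = "#ID,S1,S2,S3\nTaxonA,10,0,3\nTaxonB,0,5,1\n"
samples_text = "#SampleID\nS1\nS3\n"


def filter_table(table_text, samples_text, sep=","):
    """Return the table rows restricted to '#ID' plus the listed samples."""
    sample_rows = list(csv.reader(io.StringIO(samples_text), delimiter=sep))
    # Skip the '#SampleID' header row and ignore empty lines.
    wanted = {row[0] for row in sample_rows[1:] if row}
    rows = list(csv.reader(io.StringIO(table_text), delimiter=sep))
    header = rows[0]
    # Column 0 is '#ID'; keep it and every column whose name is in the list.
    keep = [i for i, name in enumerate(header) if i == 0 or name in wanted]
    return [[row[i] for i in keep] for row in rows]


filtered = filter_table(table_text, samples_text)
for row in filtered:
    print(",".join(row))
```

Samples that appear in the list but not in the table are silently ignored in this sketch; the guided script may handle that case differently.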

In [None]:
python script_2_tableconversion.py

This script can be used to convert an otu/esv/taxa table into a list of transactions.
At this stage, do not worry about the format of the input. The script will ask
you which is the separator.

The output will be saved as a list of transactions in the input directory.
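To make the idea of a transaction list concrete, here is a minimal sketch. It assumes that each sample becomes one transaction containing the taxa with a non-zero count in that sample; this presence rule, the toy data and the helper name `to_transactions` are assumptions for illustration, not the implementation of script_2_tableconversion.py.

```python
# Toy table (hypothetical): first column is '#ID', the others are samples.
header = ["#ID", "S1", "S2", "S3"]
rows = [
    ["TaxonA", 10, 0, 3],
    ["TaxonB", 0, 5, 1],
    ["TaxonC", 2, 2, 0],
]


def to_transactions(header, rows):
    """Build one transaction per sample: the taxa with a non-zero count."""
    transactions = {}
    for col, sample in enumerate(header[1:], start=1):
        transactions[sample] = [row[0] for row in rows if row[col] > 0]
    return transactions


transactions = to_transactions(header, rows)
for sample, taxa in transactions.items():
    # One line per sample, taxa separated by spaces.
    print(sample, " ".join(taxa))
```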

In [None]:
python script_3_microfimcalculation.py

This script calculates microbial patterns.
Files:
- otu/esv/taxa table previously converted in transactions
- file with parameters in .csv format (support, zmin and zmax + type of report)
    template available in the tutorial folder
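The role of the three parameters can be illustrated with a small brute-force search. This sketch does not use the `fim` module that microFIM relies on; it assumes that support is the fraction of transactions containing a pattern and that zmin and zmax bound the number of taxa in a pattern. The function name `frequent_itemsets` and the toy transactions are hypothetical.

```python
from itertools import combinations

# Toy transactions (hypothetical): one set of taxa per sample.
transactions = [
    {"TaxonA", "TaxonC"},
    {"TaxonB", "TaxonC"},
    {"TaxonA", "TaxonB", "TaxonC"},
    {"TaxonA", "TaxonC"},
]


def frequent_itemsets(transactions, min_support, zmin, zmax):
    """Brute-force search returning {itemset: support as a fraction}."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    found = {}
    # Enumerate every candidate pattern with a size between zmin and zmax.
    for size in range(zmin, min(zmax, len(items)) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count / n >= min_support:
                found[candidate] = count / n
    return found


patterns = frequent_itemsets(transactions, min_support=0.5, zmin=1, zmax=2)
for itemset, supp in patterns.items():
    print(sorted(itemset), supp)
```

Brute-force enumeration grows exponentially with the number of taxa, so it is only suitable for toy data like this; dedicated algorithms are needed for real tables.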

In [None]:
python script_4_additionalmeasures.py

This script calculates additional interest measures that can be used
to filter results. Currently, the all-confidence metric is available (see README for details).
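As a rough illustration, all-confidence is commonly defined as the support of a pattern divided by the largest support among its individual items. The sketch below follows that standard definition on toy data; the helper names are hypothetical and it is not the code of script_4_additionalmeasures.py.

```python
# Toy transactions (hypothetical): one set of taxa per sample.
transactions = [
    {"TaxonA", "TaxonC"},
    {"TaxonB", "TaxonC"},
    {"TaxonA", "TaxonB", "TaxonC"},
    {"TaxonA", "TaxonC"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def all_confidence(itemset, transactions):
    """Support of the itemset divided by the highest single-item support."""
    max_single = max(support({item}, transactions) for item in itemset)
    return support(itemset, transactions) / max_single


ac_pair = all_confidence({"TaxonA", "TaxonC"}, transactions)
bc_pair = all_confidence({"TaxonB", "TaxonC"}, transactions)
print(ac_pair, bc_pair)
```

Values close to 1 indicate that the taxa in the pattern almost always occur together, which is why the measure is useful for filtering.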

In [None]:
python script_5_generatepatterntable.py

This script can be used to create the pattern table.
Inputs:
- pattern results;
- metadata file;
- transactional file.

The output will be saved as a CSV dataframe (with and without
interest measures) in the input directory.
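A pattern table can be pictured as a matrix of patterns by samples. The sketch below assumes that each cell records whether a pattern is present (1) or absent (0) in a sample; that layout, the toy data and the helper name `pattern_table` are assumptions for illustration and may differ from the actual output of script_5_generatepatterntable.py.

```python
# Toy inputs (hypothetical): transactions per sample and extracted patterns.
transactions = {
    "S1": {"TaxonA", "TaxonC"},
    "S2": {"TaxonB", "TaxonC"},
    "S3": {"TaxonA", "TaxonB", "TaxonC"},
}
patterns = [("TaxonA", "TaxonC"), ("TaxonB",)]


def pattern_table(patterns, transactions):
    """Map each pattern to {sample: 1 if all its taxa are present, else 0}."""
    table = {}
    for pattern in patterns:
        table[pattern] = {
            sample: int(set(pattern) <= taxa)
            for sample, taxa in transactions.items()
        }
    return table


table = pattern_table(patterns, transactions)
print(",".join(["Pattern"] + list(transactions)))
for pattern, row in table.items():
    print(",".join([" ".join(pattern)] + [str(v) for v in row.values()]))
```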

In [None]:
# available from Monday 15

python script_6_generateplots.py

# Library usage


The microFIM Python functions are divided into thematic sections, to make it easier to integrate new functions and to develop the tool. Here we present four commented scripts covering the complete framework; they can be run on the test/test2.csv file and its related metadata and parameter files.
 

*   The first one (named microFIM_example_code_1.py) filters the data table and converts it into a transactional file. To filter, use a metadata file from which you have removed the lines of the samples you want to exclude;
*   The second calculates patterns and additional measures;
*   The third creates the pattern table;
*   The fourth creates visualizations.

For simplicity, inputs are declared in the first lines.

### From import to conversion

Before running:

*   specify input files! (first lines)
*   pay attention to the last printed messages

At the end, this script suggests running a bash command to modify the output file (details are given after the script box):
*   if you have a Linux system run only the 'sed' script
*   if you have a Mac, run both 'sed' and 'rm'



In [None]:
import os
import sys
import pandas as pd
import numpy as np
import csv
from csv import writer
import readline
import re
import string

import fim
import functions.microdir as md
import functions.microfim as mf
import functions.microimport as mi
import functions.microinterestmeasures as mim


""" microFIM example code on test/test2.csv files
of microFIM github repository
Input files to run microFIM:
- test2.csv
- metadata_test2.csv
- parameters_test2.csv

"""

# setting files
## dir
set_dir = 'test'

## metadata
metadata = 'metadata_test2.csv'
meta_sep = ','

## otu/esv/taxa table
data_table_name = 'test2.csv'
data_sep = ','

# SETTINGS
## set dir
data_dir = md.set_inputs_dir_rev(set_dir)
print(data_dir)

## SET OUTPUT NAME
file_name = mi.output_file_name(data_table_name)

# IMPORT FILES
## import metadata
metadata = mi.import_metadata(metadata, data_dir, meta_sep)
print(metadata)

## import data table (otu, esv or taxa table)
data_table = mi.import_data_table(data_table_name, data_dir, data_sep)
print(data_table)


# FILTER DATA TABLE VIA SAMPLE METADATA
filter_table = mf.filter_data_table(metadata, data_table)
print(filter_table)


# CONVERT DATA TABLE IN TRANSACTIONAL data

t_list = mf.convert_in_transaction_list(filter_table, data_table_name)
print(t_list)

# save file
mf.save_transaction_list(data_dir, t_list, file_name)


# MESSAGES PRINTED WHEN RUNNING THIS SCRIPT
# name of the saved transactional file (without extension)
output = 'transactions_' + file_name[0]

print(f'\n\n> File converted and saved as {output}.csv in {data_dir}\n\n')

print(f'\n\n> Now run from your command line in {data_dir}:\n\n \
sed -i -e "s/,/ /g" {output}\n\n \
rm {output}-e\n\n')

In [None]:
# to be run in the output folder
# (replace {output} with your file name)

# Linux and Mac: replace commas with spaces
sed -i -e "s/,/ /g" {output}

# Mac only: remove the backup copy created by sed
rm {output}-e
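If you prefer to avoid the platform differences of `sed`, the same substitution can be done in Python. This is a hedged alternative, not part of the microFIM scripts; the helper name `commas_to_spaces` and the demo file are hypothetical.

```python
import os
import tempfile


def commas_to_spaces(path):
    """Rewrite the file at `path`, replacing every comma with a space."""
    with open(path, "r", encoding="utf-8") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text.replace(",", " "))


# Demonstration on a temporary file with hypothetical content.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "transactions_demo")
    with open(demo, "w", encoding="utf-8") as handle:
        handle.write("TaxonA,TaxonC\nTaxonB,TaxonC\n")
    commas_to_spaces(demo)
    with open(demo, encoding="utf-8") as handle:
        converted = handle.read()

print(converted)
```

Because the file is rewritten in place without a backup copy, no extra clean-up step is needed on any operating system.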

### Pattern extraction and pattern table creation

Before running, specify:


*   Type of patterns to be extracted - i, c or m (first lines). 
    * i = itemsets (standard; see README for details);
    * c = closed itemsets;
    * m = maximal itemsets.
*   Input files (first lines)

This script generates as output the pattern list, pattern dataframe, pattern table with and without interest measures.
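The difference between the three pattern types can be shown on a small example. The sketch uses the standard definitions (a closed itemset has no proper superset with the same support; a maximal itemset has no frequent proper superset) and hypothetical data and helper names; it does not reproduce how the `fim` module computes them.

```python
# Frequent itemsets with their absolute support (hypothetical toy result,
# i.e. what option 'i' would return for some minimum support).
supports = {
    frozenset({"A"}): 3,
    frozenset({"B"}): 2,
    frozenset({"C"}): 4,
    frozenset({"A", "C"}): 3,
    frozenset({"B", "C"}): 2,
}


def closed_itemsets(supports):
    """Keep itemsets with no proper superset of equal support (option 'c')."""
    return {
        x for x in supports
        if not any(x < y and supports[y] == supports[x] for y in supports)
    }


def maximal_itemsets(supports):
    """Keep itemsets with no frequent proper superset (option 'm')."""
    return {x for x in supports if not any(x < y for y in supports)}


closed = closed_itemsets(supports)
maximal = maximal_itemsets(supports)
print(sorted(sorted(x) for x in closed))
print(sorted(sorted(x) for x in maximal))
```

Every maximal itemset is also closed, so option m gives the most compact result and option i the most detailed one.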


In [None]:
import os
import sys
import pandas as pd
import numpy as np
import csv
from csv import writer
import readline
import re
import string

import fim
import functions.microdir as md
import functions.microfim as mf
import functions.microimport as mi
import functions.microinterestmeasures as mim


""" microFIM example code on test/test2.csv files
of microFIM github repository
Input files to run microFIM:
- test2.csv
- transactional file 2 (can be obtained with microFIM_example_code_1.py)
- metadata_test2.csv
- parameters_test2.csv

Default is itemsets patterns (i), but also closed (c) and maximal (m) can be calculated.

Finally, two pattern tables can be generated:
- complete pattern table (patterns + interest measures)
- clean pattern table

"""

# set fim options
to_calculate = 'i' # default (can be changed in c or m)

# setting files
## dir
set_dir = 'test'

## metadata
metadata = 'metadata_test2.csv'
meta_sep = ','

## otu/esv/taxa table
data_table_name = 'test2.csv'
data_sep = ','

## parameter file
par_file = 'parameters_test2.csv'

# transactional file
trans_file = 'transactions_test2'

# set output name
output_file = 'patterns_test2'
add_interest_file = 'addm_patterns_test2'
output_pattern_table = 'pattern_table_test2'


# SETTINGS
## set dir
data_dir = md.set_inputs_dir_rev(set_dir)

## SET OUTPUT NAME
file_name = mi.output_file_name(data_table_name)


# IMPORT FILES
## import metadata
metadata = mi.import_metadata(metadata, data_dir, meta_sep)

## import data table (otu, esv or taxa table)
data_table = mi.import_data_table(data_table_name, data_dir, data_sep)

## import transactions
t = mf.read_transaction(os.path.join(data_dir, trans_file))

## import file with parameters
minsupp, zmin, zmax = mi.itemsets_parameters(data_dir, par_file)


# FILTER DATA TABLE VIA SAMPLE METADATA
filter_table = mf.filter_data_table(metadata, data_table)

# calculate patterns
results = mf.fim_calculation(t, to_calculate, minsupp, zmin, zmax)

# write patterns results
file, out_file, new_out_file = mf.write_results(results, data_dir, output_file)

# convert itemsets results into a dataframe
df = mf.itemsets_dataframe(new_out_file)
print(df)

# export as a csv
mf.export_dataframe(df, data_dir, output_file)

print('Results saved as ' + new_out_file + ' in ' + data_dir + '\n\n')



# CALCULATE ADDITIONAL METRICS
## import patterns dataframe
df = mi.import_pattern_dataframe(data_dir, output_file)

# calculate and add all-confidence values
data_allc_update = mim.add_interest_measures(data_table, df, trans_file, data_dir)

# export dataframe
mim.add_table_export(data_allc_update, data_dir, add_interest_file)
print('Results saved as df_' + add_interest_file + '.csv in ' + data_dir + '\n\n')



# GENERATE PATTERN TABLE
## generate pattern table with patterns and interest measures
pattern_table_complete = mf.generate_pattern_table(data_allc_update, df, data_dir, trans_file, metadata, meta_sep)
print(pattern_table_complete)

## export pattern table as csv
mf.export_pattern_tables(pattern_table_complete, data_dir, output_pattern_table)

### Visualizations

In [None]:
# available from tuesday 16

# Integration in QIIME2 framework

## Export taxa tables for microFIM analysis

In [None]:
# activate the env (if you have not installed QIIME2 yet, please see https://docs.qiime2.org/2021.8/getting-started/)

conda activate qiime2-2020.8 # example version


# export biom file from qza

qiime tools export --input-path table.qza --output-path exported-feature-table


# convert biom file to tsv

biom convert -i exported-feature-table/feature-table.biom -o feature-table.tsv --to-tsv

In [None]:
# substitute #OTU ID with #ID

sed -i -e "s/#OTU ID/#ID/g" feature-table.tsv


# remove first row

sed -i '1d' feature-table.tsv


## READY TO BE IMPORTED IN microFIM ##
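The two `sed` steps above can also be reproduced in Python, which may help on systems where `sed -i` behaves differently. This is a sketch under the assumption that the first row of the exported TSV is a comment line to discard; the helper name `prepare_for_microfim` and the sample text are hypothetical.

```python
def prepare_for_microfim(tsv_text):
    """Drop the first row and rename the '#OTU ID' header to '#ID'."""
    lines = tsv_text.splitlines()
    body = lines[1:]  # same effect as: sed -i '1d'
    # same effect as: sed -i -e "s/#OTU ID/#ID/g"
    return "\n".join(line.replace("#OTU ID", "#ID") for line in body) + "\n"


# Hypothetical content of an exported feature table.
exported = (
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\n"
    "TaxonA\t10\t0\n"
)
cleaned = prepare_for_microfim(exported)
print(cleaned)
```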

## Import pattern tables in qza format to perform QIIME2 analysis

Rename the 'Pattern' column to '#OTU ID' before converting.
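The header change can be scripted, for instance as in the following sketch. It assumes a tab-separated pattern table whose first row is the header; the helper name `rename_pattern_header` and the sample content are hypothetical.

```python
def rename_pattern_header(tsv_text, old="Pattern", new="#OTU ID"):
    """Rename one column in the header row of a tab-separated table."""
    lines = tsv_text.splitlines()
    header = [new if name == old else name for name in lines[0].split("\t")]
    return "\n".join(["\t".join(header)] + lines[1:]) + "\n"


# Hypothetical pattern table with two samples.
pattern_tsv = "Pattern\tS1\tS2\nTaxonA TaxonC\t1\t0\n"
renamed = rename_pattern_header(pattern_tsv)
print(renamed)
```

Only the header row is touched, so pattern names that happen to contain the word 'Pattern' are left unchanged.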

In [None]:
# convert in biom file

biom convert -i pattern_table_test.tsv \
  -o pattern_table_test.biom --table-type="OTU table" --to-json


# import in qiime2

qiime tools import \
  --input-path pattern_table_test.biom \
  --type 'FeatureTable[Frequency]' \
  --input-format BIOMV100Format \
  --output-path pattern_table_test.qza

# Contacts
For any questions, please contact g.agostinetto@campus.unimib.it