# Recon3D_GP - Updating the GEM-PRO

The purpose of this notebook is to provide a quick pipeline to update a GEM-PRO with the most recent sequence and structure data. This pipeline can also be used to download all sequence and structure data if you are starting with just the GEM-PRO ``.json.gz`` file. Otherwise, make sure that the ``Recon3D_GP`` folder is in the same directory as this notebook.

Running this pipeline may take a while; each method below displays a progress bar with timing information.

For a full tutorial on how a GEM-PRO is actually created, and the details of each method, see [this tutorial notebook](http://ssbio.readthedocs.io/en/latest/notebooks/GEM-PRO%20-%20SBML%20Model%20%28iNJ661%29.html).

### Requirements:
- ``ssbio`` - installation instructions [here](http://ssbio.readthedocs.io/en/latest/#installation), documentation [here](http://ssbio.readthedocs.io/en/latest/index.html)

### Quick start:
##### Installation
```bash
pip install nglview
pip install ssbio
```

##### Running the notebook

1. Obtain one of these three items:
    1. GitHub repository clone (`git clone https://github.com/SBRG/Recon3D`)
    1. Lite GEM-PRO archive (`Recon3D_GP_archive-lite.tar.gz`)
    1. GEM-PRO model (``Recon3D_GP.json.gz``)
1. If you cloned the repository: just open this notebook and run it.
1. If you have the lite archive: extract it into the directory where this notebook is located.
1. If you have only the model file: create a folder named ``Recon3D_GP/model/`` in the directory where this notebook is located, and place ``Recon3D_GP.json.gz`` in it.
1. Make sure your files are arranged like so:
```
.
├── Recon3D_GP
│   ├── data
│   ├── genes
│   ├── homology_models_raw
│   └── model
│       └── Recon3D_GP.json.gz
├── Recon3D_GP - Loading and Exploring the GEM-PRO.ipynb
└── Recon3D_GP - Updating the GEM-PRO.ipynb
```
1. Run this notebook!
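
The layout shown in the tree above can be checked before loading anything. The following is a minimal sketch using only the standard library. The helper name `missing_paths` is hypothetical and not part of ``ssbio``, and the default assumes the notebook's working directory is the one containing the ``Recon3D_GP`` folder.

```python
from pathlib import Path

# Paths (relative to the notebook directory) taken from the tree above
EXPECTED = [
    "Recon3D_GP/data",
    "Recon3D_GP/genes",
    "Recon3D_GP/homology_models_raw",
    "Recon3D_GP/model/Recon3D_GP.json.gz",
]

def missing_paths(base=".", expected=EXPECTED):
    """Hypothetical helper: list the expected paths that do not exist under base."""
    base = Path(base)
    return [p for p in expected if not (base / p).exists()]

print("Missing:", missing_paths() or "nothing")
```

If you started from the model file alone, the three folders other than ``model`` will be reported as missing at this point; the entry to watch is the ``.json.gz`` file itself.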

### Imports and loading the model

In [None]:
# Loading the JSON file
# Change the location of the .json file if it is located somewhere else
from ssbio.core.io import load_json
Recon3D_GP = load_json('./Recon3D_GP/model/Recon3D_GP.json.gz', decompression=True)

In [None]:
# # Alternative - loading the pickle file
# # Uncomment and use this loading method if the JSON file fails to load
# from ssbio.core.io import load_pickle
# Recon3D_GP = load_pickle('./Recon3D_GP/model/Recon3D_GP.pckl')

In [None]:
# Displaying the directory to which information will be downloaded
# You can change the root_dir if the "Recon3D_GP" folder is located somewhere else, or if you want to start fresh
print('Location of the "Recon3D_GP" folder:', Recon3D_GP.root_dir)

## If the root directory needs to be changed
# Recon3D_GP.root_dir = '/path/to/new/root_dir/with/Recon3D_GP/folder/in/it'

In [None]:
# General imports used in later cells
import os.path as op
import json

# Setting logging display settings
import sys
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stderr)
formatter = logging.Formatter('[%(asctime)s] [%(name)s] %(levelname)s: %(message)s', datefmt="%Y-%m-%d %H:%M")
handler.setFormatter(formatter)
logger.handlers = [handler]

In [None]:
# Printing multiple outputs per cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

### Updating sequence information

In [None]:
# Loading pre-defined isoform ID mappings
with open(op.join(Recon3D_GP.root_dir, 'Recon3D_GP/data/170526-manual_id_mapping.json'), 'r') as f:
    mapping = json.load(f)
u_isoforms = {}
for k, v in mapping['u_isoform_id'].items():
    if v == 'nan':
        continue
    u_isoforms[k] = v
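
The filtering loop above can also be written as a dictionary comprehension. The sketch below runs on a small made-up mapping (the gene and isoform IDs are placeholders, not entries from the real file) and assumes, as the loop does, that missing values are stored as the string `'nan'`.

```python
# Toy stand-in for the 'u_isoform_id' section of the mapping file (placeholder IDs)
toy_mapping = {
    "u_isoform_id": {
        "gene_1": "UNIPROT_A-1",
        "gene_2": "nan",
        "gene_3": "UNIPROT_B",
    }
}

# Keep only the genes that have a mapped isoform ID
u_isoforms_toy = {k: v for k, v in toy_mapping["u_isoform_id"].items() if v != "nan"}
print(u_isoforms_toy)
```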

In [None]:
# Set all methods to force_rerun=True to re-download information from KEGG and UniProt
Recon3D_GP.manual_uniprot_mapping(gene_to_uniprot_dict=u_isoforms, 
                                  download_metadata_file_type='txt', 
                                  simple_parse=True, force_rerun=True)

### Updating structure information

In [None]:
# Set all methods to force_rerun=True to re-download information from the PDB
Recon3D_GP.map_uniprot_to_pdb(force_rerun=True)
Recon3D_GP.blast_seqs_to_pdb(seq_ident_cutoff=0.8, all_genes=True, force_rerun=True)
Recon3D_GP.set_representative_structure(allow_missing_on_termini=0.30, allow_insertions=True, force_rerun=True)

### Saving the updated GEM-PRO

In [None]:
# Both JSON and pickle saving methods are provided
# JSON is human-readable and the data can be used from other languages
# Pickles are Python-specific
# Note: the second argument to op.join must not start with "/", otherwise root_dir is discarded
# Recon3D_GP.save_json(op.join(Recon3D_GP.root_dir, 'Recon3D_GP/model/Recon3D_GP_updated.json'), compression=False)
Recon3D_GP.save_json(op.join(Recon3D_GP.root_dir, 'Recon3D_GP/model/Recon3D_GP_updated.json.gz'), compression=True)
Recon3D_GP.save_pickle(outfile=op.join(Recon3D_GP.root_dir, 'Recon3D_GP/model/Recon3D_GP_updated.pckl'))
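
When building output paths like the ones above, keep in mind that `os.path.join` discards every earlier component once it meets one that starts with a slash. The sketch below shows that behaviour with a made-up root directory. It uses `posixpath` (the POSIX flavour of `os.path`) so the result is the same on any platform.

```python
import posixpath

root_dir = "/home/user/projects"  # made-up root directory, for illustration only

# Relative second component: the file ends up under root_dir
inside = posixpath.join(root_dir, "Recon3D_GP/model/Recon3D_GP_updated.json.gz")

# Leading slash: root_dir is dropped and the path starts at the filesystem root
outside = posixpath.join(root_dir, "/Recon3D_GP/model/Recon3D_GP_updated.json.gz")

print(inside)   # /home/user/projects/Recon3D_GP/model/Recon3D_GP_updated.json.gz
print(outside)  # /Recon3D_GP/model/Recon3D_GP_updated.json.gz
```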