# The INDRA Database: Description and Demos

This notebook walks through the basic structure of the INDRA Database and then works through some use-case examples. Unless otherwise stated, it is assumed that the user has direct access to the database.

--------------------------------------

## The Need-to-knows of INDRA

As the name suggests, this database is built using the tools of INDRA, and in turn it supports many applications of INDRA. It is thus valuable to go over some key features of the INDRA toolbox.

### The INDRA Statement
The bread and butter of the INDRA Database, and of INDRA itself, is the INDRA Statement, which is described extensively [here](file:///home/patrick/Workspace/indra/doc/_build/html/modules/statements.html). These Statements provide a robust and fairly extensible format for representing mechanistic interactions as Python objects. For the purposes of this tutorial, it is essential to know that Statements:
- Have a **type**, for example:
    - Phosphorylation
    - Complex
- Have **agents**, which in turn have some **db refs**, for example:
    - MEK has the Famplex db ref id MEK
    - Vemurafenib is an agent with the db refs for a CHEBI id "CHEBI:63637" and a ChEMBL id "ChEMBL1229517"
    
Most have two agents, a subject and an object, for example:
- `Phosphorylation(MEK(), ERK())`
- `Inhibition(Vemurafenib(), BRAF())`

but there are some types of Statement that are notable exceptions:
- Complexes (any number of agents)
- Autophosphorylations (one agent)

### Sources of INDRA Statements
INDRA has implemented tools for loading and generating these Statements from several sources. Here, the key points to recall are that:
- INDRA can draw both from **machine reading systems**, such as REACH, and from **mechanism databases**, such as Pathway Commons
- For readings, INDRA also provides the groundwork for **running certain readers at massive scales**, fairly easily using AWS Batch.
- The results from these sources, especially when combined, **contain a lot of duplicate and closely related information**.

### Preassembly of INDRA Statements
To build useful models from all these sources, INDRA supplies tools to perform what is called "preassembly" (what you do before "assembling" your model), in which:
- grounding is regularized (fixes agent db refs), as are protein sites and agent names.
- the **redundant information between sources is merged, *with the original source information and evidence preserved*, into a distilled set of unique mechanisms**
- the relationship between similar mechanistic information is recorded, such that a more general Statement, such as `Phosphorylation(MEK(), ERK())` can be identified as generalizing `Phosphorylation(MAP2K1(), MAPK1())`.
- **Such Preassembled Statements can be uniquely identified by a hash generated from their contents**.
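
The merging step can be pictured with a toy sketch. This is not INDRA's preassembly code: the `matches_key` and `content_hash` helpers, the tuple layout, and the sample evidence strings are all assumptions made for illustration. It only shows the idea of grouping raw extractions by a hash of their content while keeping every piece of evidence.

```python
import hashlib
from collections import defaultdict


def matches_key(stmt_type, agents):
    """Build a hypothetical content key from the type and agent names."""
    return stmt_type + "(" + ", ".join(agents) + ")"


def content_hash(key):
    """Derive a reproducible integer from the key (toy stand-in for a real hash)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:14], 16)


# Hypothetical raw extractions: (type, agents, source, evidence text).
raw = [
    ("Phosphorylation", ("MEK", "ERK"), "reach", "MEK phosphorylates ERK."),
    ("Phosphorylation", ("MEK", "ERK"), "sparser", "ERK is phosphorylated by MEK."),
    ("Inhibition", ("Vemurafenib", "BRAF"), "reach", "Vemurafenib inhibits BRAF."),
]

# Group by content hash; every source and evidence text is preserved.
merged = defaultdict(list)
labels = {}
for stmt_type, agents, source, text in raw:
    key = matches_key(stmt_type, agents)
    h = content_hash(key)
    labels[h] = key
    merged[h].append((source, text))

for h, evidence in merged.items():
    print(labels[h], "supported by", len(evidence), "evidence")
```

The two phosphorylation extractions land under one hash with both evidence entries intact, while the inhibition remains its own entry.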


----------

## The Structure of the Database

<img src="db_basic_structure.png">

The INDRA Database is made up of several tables. There are 4 core groups, shown in the three cylinders and one box above:
- **Sources:** Keep track of the content that we read, and the readings of that content, including titles, abstracts, and full texts from various sources. Also keep some metadata on the databases we import.
    - `text_ref`
    - `text_content`
    - `reading`
    - `db_info`
- **Raw Statements:** Store all the statements extracted from all the sources, as-is.
    - `raw_statements`
- **Preassembled Statements:** Store the cleaned, distilled, and relation-mapped statements.
    - `raw_unique_links`
    - `pa_statements`
    - `pa_agents`
    - `pa_support_links`
- **Materialized Views:** Pre-calculate certain queries for rapid retrieval.
    - `pa_meta`
    - `fast_raw_pa_link`
    - `pa_stmt_src`
    - `reading_ref_link`

There are many more tables; however, they are in general not essential to this demo. Here is a diagram of the database schema, not including the materialized views (which are not really part of the schema), with the major groupings color-coded. Green indicates *sources*, orange indicates *raw statements*, and blue indicates *preassembled statements*. Each line indicates the presence of a foreign-key link.

<img src="indra_db.png">

------

## Demos

What follows are some demonstrations of the ways you can access the database, at various different levels.

### Low level access

#### pgadmin (or similar)

If you have pgadmin installed, you can browse the database content through a GUI, even changing values by hand.

#### Database Manager API

To access and manage the database at the lowest level, use the `DatabaseManager` class from `indra_db.managers.database_manager`. You need to have access to the database, hosted on AWS RDS, configured in a config file (documented elsewhere). Here is an example of getting a piece of content from the database:

In [1]:
from indra_db.util import get_db, unpack

# Get a handle to the database
db = get_db('primary')

# Get a piece of text content that is an abstract. Everything after the first argument is a condition.
tc = db.select_one(db.TextContent, db.TextContent.text_type == 'abstract')
print(tc)

text_content:
	insert_date: 2018-05-18 17:45:23.406707
	text_type: abstract
	source: pubmed
	id: 20202368
	last_updated: None
	content: [not shown]
	format: text
	text_ref_id: 28416337



The actual content is not shown so that the metadata is readable. But you can look at the content by just printing:

In [2]:
print(unpack(tc.content))

Visual expertise induces changes in neural processing for many different domains of expertise. However, it is unclear how expertise effects for different domains of expertise are related. In the present fMRI study, we combine large-scale univariate and multi-voxel analyses to contrast the expertise-related neural changes associated with two different domains of expertise, bird expertise (ornithology) and mineral expertise (mineralogy). Results indicated distributed expertise-related neural changes, with effects for both domains of expertise in high-level visual cortex and effects for bird expertise even extending to low-level visual regions and the frontal lobe. Importantly, a multivariate generalization analysis showed that effects in high-level visual cortex were specific to the domain of expertise. In contrast, the neural changes in the frontal lobe relating to expertise showed significant generalization, signaling the presence of domain-independent expertise effects. In conclusion,

Note that the content must be `unpack`ed, because it is stored in the database as compressed binary.
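
As a rough illustration of why unpacking is needed, here is a standard-library round trip. It assumes gzip-style compression of UTF-8 text; the scheme `unpack` actually uses is not shown in this notebook and may differ, and `pack_text` and `unpack_text` are hypothetical names.

```python
import gzip
import zlib


def pack_text(text):
    """Compress a string to gzip-format bytes (assumed storage format)."""
    return gzip.compress(text.encode("utf-8"))


def unpack_text(blob):
    """Decompress gzip-format bytes and decode them back to a string."""
    # wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    return zlib.decompress(blob, zlib.MAX_WBITS | 16).decode("utf-8")


abstract = "Visual expertise induces changes in neural processing."
blob = pack_text(abstract)
print(type(blob).__name__, len(blob))
print(unpack_text(blob))
```

Printing the compressed bytes directly would show unreadable binary, which is why the decompression step comes first.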

You can get a raw statement from a pmcid by using the `db.link` feature, which uses a networkx graph to construct the necessary joins on your behalf.

In [3]:
raw_stmt_rows = db.select_all(db.RawStatements, db.TextRef.pmcid == 'PMC4055958',
                              *db.link(db.RawStatements, db.TextRef))

Let's look at some of the objects that were returned. The `repr` of the object is not especially informative:

In [4]:
raw_stmt_rows[0]

<indra_db.managers.database_manager.DatabaseManager.__init__.<locals>.RawStatements at 0x7fcaf6cda588>

However, you can, as shown above, `print` the object. Again, the most verbose column, the `json` encoding of the Statement, is not printed in this display.

In [5]:
print(raw_stmt_rows[0])

raw_statements:
	create_date: 2019-05-31 14:06:53.451841
	indra_version: 1.12.0-8d138ebe7e70fefdb7edde1769c0c8bd8cb91526
	reading_id: 10100019060322
	source_hash: 6135995456400101117
	mk_hash: -22060906923060024
	batch_id: 533420918
	id: 10341406
	json: [not shown]
	type: DecreaseAmount
	db_info_id: None
	text_hash: -7392788542727949939
	uuid: 2d7607df-43c0-451f-a773-534d614f7baf



In [6]:
raw_stmt_rows[0].json

b'{"type": "DecreaseAmount", "subj": {"name": "NO", "db_refs": {"PUBCHEM": "24822", "TEXT": "NO"}}, "obj": {"name": "CYCS", "db_refs": {"UP": "P99999", "HGNC": "19986", "TEXT": "cytochrome c"}}, "belief": 1, "evidence": [{"source_api": "reach", "text": "NO and Ca 2+ synergistically inactivate mitochondrial complex I and cause a loss of cytochrome c, probably via formation of ONOO - [XREF_BIBR].", "annotations": {"found_by": "decrease_amount_1", "agents": {"coords": [[0, 2], [85, 97]]}}, "epistemics": {"direct": false, "section_type": null}, "text_refs": {"PMID": "18050169"}, "source_hash": 6135995456400101117}], "id": "2d7607df-43c0-451f-a773-534d614f7baf"}'
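
Because the `json` column holds UTF-8 encoded bytes, it can be inspected with the standard library alone. The sketch below uses a shortened copy of the record above (several fields, including most of the evidence entry, are trimmed for brevity) and pulls out the features described earlier: the type, the agents with their db refs, and the evidence source.

```python
import json

# Shortened copy of the bytes shown above; trimmed fields are simply omitted.
raw_json = (
    b'{"type": "DecreaseAmount", '
    b'"subj": {"name": "NO", "db_refs": {"PUBCHEM": "24822", "TEXT": "NO"}}, '
    b'"obj": {"name": "CYCS", "db_refs": {"UP": "P99999", "HGNC": "19986", '
    b'"TEXT": "cytochrome c"}}, '
    b'"evidence": [{"source_api": "reach", "text_refs": {"PMID": "18050169"}}]}'
)

stmt = json.loads(raw_json.decode("utf-8"))

print(stmt["type"])
for role in ("subj", "obj"):
    agent = stmt[role]
    print(role, agent["name"], agent["db_refs"])
for ev in stmt["evidence"]:
    print(ev["source_api"], ev["text_refs"]["PMID"])
```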

The details of the following code are not essential; however, you can see that we get a lot of statements from this full text, and that two different readings produced this content.

In [7]:
from collections import defaultdict
from indra_db.util import get_statement_object

# Make a dict of lists keyed by reading id (all statements grouped by reading)
raw_stmts_by_rid = defaultdict(list)
for row in raw_stmt_rows:
    raw_stmts_by_rid[row.reading_id].append(row)

# Print a sampling of the statements
for rid, some_rows in raw_stmts_by_rid.items():
    print(rid)
    for row in some_rows[:10]:
        print('\t', get_statement_object(row))
    if len(some_rows) > 10:
        print(f"\t ... and {len(some_rows) - 10} more!")


10100019060322
	 DecreaseAmount(NO(), CYCS())
	 IncreaseAmount(MDMA(), Ca())
	 Inhibition(METH(), ROS())
	 Activation(METH(), DA())
	 Inhibition(METH(), SLC18A2())
	 Activation(METH(), ROS())
	 Activation(NO(), peroxynitrite())
	 Activation(NOS1(), NO())
	 Activation(NO(), METH())
	 Activation(NO(), MDMA())
	 ... and 51 more!
20300019060322
	 Phosphorylation(None, SLC6A3())
	 Complex(NOS1(), serotonin())
	 Complex(PTPN5(), FOXM1())


You can also search for preassembled statements (`pa_statements`) by linking further from the raw statements to the preassembled statements through the `raw_unique_links` table, which is again handled tidily by the `db.link` feature.

In [8]:
for link in db.link(db.PAStatements, db.TextRef):
    print(link)

raw_unique_links.pa_stmt_mk_hash = pa_statements.mk_hash
raw_unique_links.raw_stmt_id = raw_statements.id
raw_statements.reading_id = reading.id
reading.text_content_id = text_content.id
text_content.text_ref_id = text_ref.id
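
The chain of joins printed above is a path through the foreign-key graph. The following standard-library sketch shows the idea with a breadth-first search. It is not the `db.link` implementation (which, as noted, uses networkx), `join_path` is a hypothetical helper, and the `FOREIGN_KEYS` list contains only the five links printed above.

```python
from collections import deque

# Foreign-key links copied from the output above: (table_a, col_a, table_b, col_b).
FOREIGN_KEYS = [
    ("raw_unique_links", "pa_stmt_mk_hash", "pa_statements", "mk_hash"),
    ("raw_unique_links", "raw_stmt_id", "raw_statements", "id"),
    ("raw_statements", "reading_id", "reading", "id"),
    ("reading", "text_content_id", "text_content", "id"),
    ("text_content", "text_ref_id", "text_ref", "id"),
]


def join_path(start, end):
    """Return the join conditions on a shortest path between two tables."""
    # Build an undirected adjacency map; each edge remembers its join condition.
    graph = {}
    for a, col_a, b, col_b in FOREIGN_KEYS:
        cond = f"{a}.{col_a} = {b}.{col_b}"
        graph.setdefault(a, []).append((b, cond))
        graph.setdefault(b, []).append((a, cond))

    # Breadth-first search, tracking the conditions used to reach each table.
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, conds = queue.popleft()
        if table == end:
            return conds
        for neighbor, cond in graph.get(table, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, conds + [cond]))
    return None


for cond in join_path("pa_statements", "text_ref"):
    print(cond)
```

With this small graph, the search reproduces the five conditions listed above, in the same order.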


In [9]:
pa_stmt_rows = db.select_all(db.PAStatements, db.TextRef.pmcid == 'PMC4055958', 
                             *db.link(db.PAStatements, db.TextRef))
print(f"I found {len(pa_stmt_rows)} preassembled statements.\n")

# Print some samples.
print("Here's a sample:")
for row in pa_stmt_rows[:10]:
    print(get_statement_object(row))

I found 49 preassembled statements.

Here's a sample:
Phosphorylation(PKC(), SLC6A3())
Complex(serotonin(), NOS1())
Activation(CHRN(), pyraclofos())
Activation(NOS1(), nitric oxide())
Activation(3,4-methylenedioxymethamphetamine(), ROS1())
Activation(METH(), DCF())
Inhibition(MEM(), alpha7 nAChR())
Activation(METH(), dopamine())
DecreaseAmount(calcium(2+)(), CYCS())
Activation(3,4-methylenedioxymethamphetamine(), calcium(2+)())


As you can see, the redundant Statements have been collapsed.

As a demonstration, we could use the results of this search to find more paper ids for papers involving similar mechanisms. This works because each preassembled statement is supported by multiple raw statements, in general from multiple papers. *Note that the preassembled statements are identified by a hash of what's called a matches-key, or `mk_hash`.* This is a reproducible value which uniquely identifies a preassembled statement by the information it represents.

In [10]:
text_ref_rows = db.select_all(db.TextRef, db.PAStatements.mk_hash.in_({row.mk_hash for row in pa_stmt_rows}),
                              *db.link(db.PAStatements, db.TextRef))
print(f"We found {len(text_ref_rows)} text refs with related mechanisms!\n")

# Print a sample of the pmids and pmcids for each ref.
for row in text_ref_rows[:10]:
    print(f"PMID: {str(row.pmid):10}  PMCID: {row.pmcid}")

We found 4374 text refs with related mechanisms!

PMID: 20047071    PMCID: None
PMID: 24275851    PMCID: PMC3817602
PMID: 23578024    PMCID: PMC3914398
PMID: 9182590     PMCID: None
PMID: 24549364    PMCID: PMC4138306
PMID: 24875574    PMCID: PMC4203735
PMID: 23959639    PMCID: PMC3859705
PMID: 16359614    PMCID: None
PMID: 19758695    PMCID: None
PMID: 27047180    PMCID: PMC4774759


We can of course also search for statements involving certain entities:

In [11]:
# Search for statements with agents whose 'NAME' is 'BRAF', where the agent is the object, where the
# Statement is an Inhibition.
inhibits_braf_rows = db.select_all(db.PAStatements, db.PAStatements.mk_hash == db.PAAgents.stmt_mk_hash,
                                   db.PAAgents.db_id == 'BRAF', db.PAAgents.db_name == 'NAME',
                                   db.PAAgents.role == 'OBJECT', db.PAStatements.type == 'Inhibition')
print(f"I found {len(inhibits_braf_rows)} statements about the inhibition of BRAF!\n")

# Print a sample
print("Here's a sample:")
for row in inhibits_braf_rows[:10]:
    print(get_statement_object(row))

I found 580 statements about the inhibition of BRAF!

Here's a sample:
Inhibition(vemurafenib(), BRAF())
Inhibition(mTORC1(), BRAF())
Inhibition(BRAF inhibitor(), BRAF(muts: (V, 600, E)))
Inhibition(SPRY2(), BRAF())
Inhibition(dabrafenib(), BRAF())
Inhibition(erlotinib(), BRAF())
Inhibition(phenformin(), BRAF())
Inhibition(PREP(), BRAF())
Inhibition(PTEN(), BRAF(muts: (None, None, None)))
Inhibition(EGFR(), BRAF())


### The Python Client API

This is a rather cumbersome way to look for statements, and moreover there are two problems with this result:
1. The raw evidence is not included.
2. You can only query by one agent, when what you often want is to search for _both_ entities in a relationship.

To address these problems, a higher-level API was developed, which can be found in `indra_db.client`, in particular `indra_db.client.optimized`. These tools allow fully formed (modulo support links) statements to be rapidly loaded from the database. Note: *This API makes use of the materialized views to speed up queries.*

The principal function implemented in the client allows you to search and retrieve preassembled statements based on their entities and type.

In [12]:
import json
from indra.statements import Statement
from indra_db import client as dbc

# Look for statements with two agents, a subject with the Famplex grounding of "MEK" and an object
# with the Famplex grounding of "ERK", that are of type "Phosphorylation", and return at most 5 pieces
# of evidence for each pa statement.
results = dbc.get_statement_jsons_from_agents([('SUBJECT', 'MEK', 'FPLX'), ('OBJECT', 'ERK', 'FPLX')],
                                              stmt_type='Phosphorylation', ev_limit=5)

# Print the keys.
print("The result has the following keys:", set(results.keys()))

# Summarize the results.
print(f"There is {results['total_evidence']} 'total_evidence' available, "
      f"and {results['evidence_returned']} ('evidence_returned') were returned "
      f"for {len(results['statements'])} Statements.")

# Print some samples
for stmt_json in results['statements'].values():
    print(Statement._from_json(stmt_json))

The result has the following keys: {'evidence_returned', 'total_evidence', 'statements', 'evidence_totals'}
There is 929 'total_evidence' available, and 51 ('evidence_returned') were returned for 24 Statements.
Phosphorylation(MEK(), ERK())
Phosphorylation(MEK(), ERK(), T)
Phosphorylation(MEK(), ERK(), Y)
Phosphorylation(MEK(), ERK(), Y, 204)
Phosphorylation(MEK(), ERK(), T, 202)
Phosphorylation(MEK(mods: (phosphorylation)), ERK())
Phosphorylation(MEK(), ERK(), Y, 205)
Phosphorylation(MEK(), ERK(), T, 867)
Phosphorylation(MEK(), ERK(), S, 221)
Phosphorylation(MEK(), ERK(), S)
Phosphorylation(MEK(), ERK(), T, 203)
Phosphorylation(MEK(mods: (modification)), ERK())
Phosphorylation(MEK(), ERK(), S, 217)
Phosphorylation(MEK(mods: (modification), muts: (None, None, None)), ERK())
Phosphorylation(MEK(muts: (None, None, None)), ERK())
Phosphorylation(MEK(), ERK(), T, 125)
Phosphorylation(MEK(), ERK(), S, 431)
Phosphorylation(MEK(), ERK(), A, 1726)
Phosphorylation(MEK(), ERK(), C, 20)
Phosphory

Even though we were fairly specific in our query, there are still variations in the details. Soon we will make it possible to search by the modifications and mutations.
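
Until such server-side filters exist, the returned JSON can be narrowed on the client. A minimal sketch, assuming the statement JSONs carry the optional `residue` and `position` fields that appear in the statement JSON printed in this notebook; the `filter_by_site` helper and the sample dictionaries are hypothetical.

```python
def filter_by_site(stmt_jsons, residue=None, position=None):
    """Keep statement JSONs whose modification site matches the given values."""
    kept = []
    for sj in stmt_jsons:
        if residue is not None and sj.get("residue") != residue:
            continue
        if position is not None and sj.get("position") != position:
            continue
        kept.append(sj)
    return kept


# Hypothetical, heavily trimmed statement JSONs.
stmt_jsons = [
    {"type": "Phosphorylation"},
    {"type": "Phosphorylation", "residue": "T"},
    {"type": "Phosphorylation", "residue": "T", "position": "202"},
    {"type": "Phosphorylation", "residue": "Y", "position": "204"},
]

for sj in filter_by_site(stmt_jsons, residue="T"):
    print(sj)
```

With the query result from the cell above, the same helper could be called as `filter_by_site(results['statements'].values(), residue='T')`. Note that positions are strings in the JSON.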

The JSON for each of these statements contains rich information. For example, let us inspect the very last JSON in our list:

In [13]:
print(json.dumps(stmt_json, indent=2))

{
  "type": "Phosphorylation",
  "enz": {
    "name": "MEK",
    "db_refs": {
      "TEXT": "MEK1/2",
      "FPLX": "MEK"
    }
  },
  "sub": {
    "name": "ERK",
    "db_refs": {
      "TEXT": "ERK1/2",
      "FPLX": "ERK"
    }
  },
  "residue": "T",
  "position": "581",
  "belief": 1,
  "id": "31f1ce72-b01d-49e9-b7f0-4a7d830996df",
  "evidence": [
    {
      "source_api": "reach",
      "text": "Our results showed that FeF at nontoxic doses effectively suppressed EGF induced transformation of JB6 Cl41 cells that was accompanied by decreased phosphorylation of TOPK (Thr9), ERK1/2 (Tyr202/204) and MSK 1 (Thr581), but not MEK1/2 (Ser221), which suggested that FeF attenuated EGF induced cell transformation by inhibiting of TOPK activity.",
      "annotations": {
        "found_by": "Phosphorylation_syntax_1b_noun",
        "agents": {
          "raw_text": [
            "MEK1/2",
            "ERK1/2"
          ]
        },
        "prior_uuids": [
          "229a23fe-9883-47bc-9bd4-3b5

You can also use the client to get such Statement JSONs by `mk_hash`. In this case, that could be useful for getting the rest of the evidence for the first, generic statement that was returned (`Phosphorylation(MEK(), ERK())`). In fact, we don't even need to use that object; we can just declare a Statement with those attributes and look up evidence for it:

In [14]:
from indra.statements import Phosphorylation, Agent

stmt = Phosphorylation(Agent('MEK', db_refs={'FPLX': 'MEK'}), Agent('ERK', db_refs={'FPLX': 'ERK'}))
print("Our brand-new off-the-lot Statement:", stmt)

# Show that the hash is the same
print("You can see the hashes are the same: new ", stmt.get_hash(),
      'vs old', list(results['statements'].keys())[0])

# And we can look it up on the database
one_stmt_result = dbc.get_statement_jsons_from_hashes([stmt.get_hash()])

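# Build the Statement from the result of the hash lookup (not from the earlier agent query)
stmt_from_db = Statement._from_json(list(one_stmt_result['statements'].values())[0])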

print()
print("The statement retrieved from the database:", stmt_from_db)
print("\nEvidence text and source for this statement:")
for ev in stmt_from_db.evidence:
    print()
    print('source:', ev.source_api)
    print('text:', ev.text)


Our brand-new off-the-lot Statement: Phosphorylation(MEK(), ERK())
You can see the hashes are the same: new  -31782050023208088 vs old -31782050023208088

The statement retrieved from the database: Phosphorylation(MEK(), ERK())

Evidence text and source for this statement:

source: sparser
text: It mediates its inhibitory properties by binding to the ERK-specific MAP kinase MEK, therefore preventing phosphorylation of ERK1/2 (p44/p42 MAPK) by MEK.

source: sparser
text: However, we did not employ this model because ERK is processively phosphorylated by MEK in mammalian cells ( Aoki et al., 2011 ).

source: sparser
text: MEK 1/2 phosphorylates and activates the extracellular signal–regulated kinases (ERK 1/2), and MEK-ERK signals regulate various cellular processes such as survival and apoptosis [ 35 ].

source: reach
text: MEK inhibition blocks ERK1/2 phosphorylation, the targets for the MEK kinases.

source: sparser
text: MEK in turn phosphorylates and activates MAPK, also known as ex

### The Web Service REST API

So far, all these demos require direct access to the database, which allows an individual not only to search but also to make changes to the database. This is of course not ideal, so we developed a REST API, a web service, to serve up results such as the above from the database. The API can be used directly through web requests, as will be demonstrated, but can also be navigated through a web User Interface (UI), and leveraged using a submodule of INDRA: `indra.sources.indra_db_rest`. The current structure is summarized here:

<img src='api_structure.png'>

where the web service is deployed in a serverless fashion using [Zappa](https://github.com/Miserlou/Zappa) and [AWS Lambda](https://aws.amazon.com/lambda/). This allows anyone, anywhere, to search the contents of the database (specifically the preassembled statements).
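
Requests to the service are plain HTTP GETs with query parameters. As a small sketch using only the standard library, a URL for the `statements/from_agents` endpoint used in the examples of this section can be assembled from keyword arguments. The endpoint and parameter names are the ones that appear in this notebook; `build_query_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://db.indra.bio/statements/from_agents"


def build_query_url(**params):
    """Assemble a from_agents URL, dropping parameters that are None."""
    query = {k: v for k, v in params.items() if v is not None}
    # Keep '@' unescaped so namespaced agents such as 'MEK@FPLX' stay readable.
    return BASE_URL + "?" + urlencode(query, safe="@")


print(build_query_url(object="BRAF", type="inhibition"))
print(build_query_url(subject="MEK@FPLX", object="ERK@FPLX",
                      type="phosphorylation", offset=0))
```

Building the query from a dictionary avoids hand-editing the string when parameters are added or removed.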

#### Direct API access

Here is an example of accessing the API directly, using the third-party `requests` package, essentially repeating our earlier search for inhibitors of BRAF:

In [15]:
import requests

# Execute the request
resp = requests.get('https://db.indra.bio/statements/from_agents?object=BRAF&type=inhibition')

# Print some information about the results
print("Got response with code:", resp.status_code)
result_dict = resp.json()

print("Keys in the result:", set(result_dict.keys()))
print("Evidence Returned:", result_dict['evidence_returned'])
print("Total Evidence:", result_dict['total_evidence'])
print("Statements Returned:", result_dict['statements_returned'])
print("An example statement json:\n", json.dumps(list(result_dict['statements'].values())[-1], indent=2))

Got response with code: 200
Keys in the result: {'evidence_limit', 'statements_returned', 'statements', 'evidence_totals', 'statement_limit', 'offset', 'evidence_returned', 'total_evidence'}
Evidence Returned: 1074
Total Evidence: 1361
Statements Returned: 538
An example statement json:
 {
  "type": "Inhibition",
  "subj": {
    "name": "PAK4",
    "db_refs": {
      "TEXT": "PAK4",
      "UP": "O96013",
      "HGNC": "16059"
    },
    "sbo": "http://identifiers.org/sbo/SBO:0000020"
  },
  "obj": {
    "name": "BRAF",
    "db_refs": {
      "TEXT": "BRAF",
      "UP": "P15056",
      "HGNC": "1097"
    },
    "sbo": "http://identifiers.org/sbo/SBO:0000642"
  },
  "obj_activity": "activity",
  "id": "01b2c123-a2ff-4fc1-8ecd-1da9df68c1da",
  "sbo": "http://identifiers.org/sbo/SBO:0000182",
  "evidence": [
    {
      "source_api": "reach",
      "pmid": "23233484",
      "text": "Knockdown of PAK4 or PAK1 inhibited the proliferation of mutant KRAS or BRAF colon cancer cells in vitro.",


#### INDRA Sources Client to the DB Web Service

However, if you are working in Python, it is (generally) better to get Statement objects directly. Moreover, the API currently imposes various restrictions on the amount of content that can be returned, in order to stay within the AWS API Gateway timeout limit. The `indra.sources.indra_db_rest` module was written to overcome these challenges by creating a seamless API for getting INDRA Statements from the Database via the Web Service, from anywhere.

In [16]:
from indra.sources import indra_db_rest as idbr

processor = idbr.get_statements(subject='ERK@FPLX', object='MEK@FPLX', stmt_type='phosphorylation')
for stmt in processor.statements:
    print(stmt)

INFO: [2019-07-06 18:21:02] indra.sources.indra_db_rest.processor - The remainder of the query will be performed in a thread...
INFO: [2019-07-06 18:21:02] indra.sources.indra_db_rest.util - url and query string: https://db.indra.bio/statements/from_agents?subject=ERK@FPLX&object=MEK@FPLX&offset=0&type=phosphorylation
INFO: [2019-07-06 18:21:02] indra.sources.indra_db_rest.processor - Waiting for thread to complete...
INFO: [2019-07-06 18:21:02] indra.sources.indra_db_rest.util - headers: {}
INFO: [2019-07-06 18:21:02] indra.sources.indra_db_rest.util - data: None
INFO: [2019-07-06 18:21:02] indra.sources.indra_db_rest.util - params: {'ev_limit': 10, 'best_first': True, 'api_key': '[api-key]'}


Phosphorylation(ERK(), MEK())
Phosphorylation(ERK(), MEK(), T, 867)
Phosphorylation(ERK(), MEK(), S, 472)
Phosphorylation(ERK(), MEK(), S)
Phosphorylation(ERK(mods: (phosphorylation)), MEK())
Phosphorylation(ERK(), MEK(), T)


This of course works for much larger queries as well:

In [17]:
# Quiet the logging a bit
from indra.sources.indra_db_rest.util import logger as idbr_util_logger
import logging

idbr_util_logger.setLevel(logging.WARNING)

# Make the query
processor_tp53 = idbr.get_statements(agents=['TP53'])
print(f"I found {len(processor_tp53.statements)} unique statements! Here are a few of them:")
for stmt in processor_tp53.statements[:10]:
    print(stmt)

INFO: [2019-07-06 18:21:06] indra.sources.indra_db_rest.processor - The remainder of the query will be performed in a thread...
INFO: [2019-07-06 18:21:06] indra.sources.indra_db_rest.processor - Waiting for thread to complete...


I found 39222 unique statements! Here are a few of them:
Activation(TP53(), apoptosis())
Phosphorylation(None, TP53())
Complex(TP53(), MDM2())
Acetylation(None, TP53())
Ubiquitination(None, TP53())
Inhibition(MDM2(), TP53())
Phosphorylation(None, TP53(), S, 15)
Activation(TP53(), cell cycle())
Activation(TP53(), CDKN1A())
Activation(TP53(), cell death())


Well, that took a while! See the section below on the future of the web service for some details about how we hope to improve this.
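
Large result sets are presumably retrieved in pages (note the `offset` parameter in the logged URL above). The following is a simplified sketch of offset-based paging, not the `indra_db_rest` implementation: `fetch_page` is a hypothetical stand-in for one HTTP request, and the response keys mirror those printed for the direct API call earlier (`statements`, `statements_returned`, `statement_limit`).

```python
def fetch_page(offset, limit=3):
    """Hypothetical stand-in for one request; serves slices of fake data."""
    fake_db = {f"hash{i}": {"type": "Complex"} for i in range(8)}
    keys = sorted(fake_db)[offset:offset + limit]
    return {
        "statements": {k: fake_db[k] for k in keys},
        "statements_returned": len(keys),
        "statement_limit": limit,
    }


def fetch_all():
    """Request pages until one comes back smaller than the page limit."""
    statements = {}
    offset = 0
    while True:
        page = fetch_page(offset)
        statements.update(page["statements"])
        if page["statements_returned"] < page["statement_limit"]:
            break
        offset += page["statements_returned"]
    return statements


all_stmts = fetch_all()
print(len(all_stmts))
```

Each page costs a full round trip, so a query matching tens of thousands of statements takes many requests to complete.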

#### Web Interface

You can also go [here](https://db.indra.bio) and explore statements manually.

#### IndraBot

We also implemented a Slack App called the IndraBot, which can answer simple questions in natural language.


### The Future(s) of the Web Service

The current structure of the web service is, for various technical reasons, less than ideal. We are currently in the process of overhauling our web interface and, among other key things, hope to separate the programmatic API service (shown above) from the graphical web interface/browsing feature, ultimately simplifying code and avoiding the complications caused by timeouts imposed by the serverless deployment we use. We also hope to fold some of the key features of the IndraBot into the web interface, for instance allowing natural language queries, follow-up questions to filter down results, and user logins to simplify the process of curation and open the door to other user-specific features.

This is a diagram of roughly what we think the new architecture will look like:

<img src='api_structure_future.png'>