This set of tutorials aims to introduce the important ephys and histology tables that are ready for use. We will mention some DataJoint basics along the way, but not systematically. For a full tutorial on the basics, please visit:  

>* [Get DataJoint Ready](../201909_code_camp/0-Get%20DataJoint%20Ready.ipynb): connection to database, set up config
>* [Explore IBL data pipeline with DataJoint](../201909_code_camp/1-Explore%20IBL%20data%20pipeline%20with%20DataJoint.ipynb): plot diagram, query, and fetch
>* [Analyze data with IBL pipeline and save results](../201909_code_camp/2-Analyze%20data%20with%20IBL%20pipeline%20and%20save%20results.ipynb): use imported and computed tables to autopopulate results

# Connect to the IBL DataJoint database

In [None]:
import datajoint as dj
from getpass import getpass

# set up dj.config
dj.config['database.host'] = 'datajoint.internationalbrainlab.org'
dj.config['database.user'] = '{YOUR_USER_NAME}'
dj.config['database.password'] = getpass('Please type in your password: ')

# connect to the database
dj.conn()

# save the config locally
dj.config.save_local()

# List all the schemas you have access to, using `dj.list_schemas()`

In [None]:
dj.list_schemas()

## Major schemas:   
Metadata from **Alyx**: `ibl_reference`, `ibl_subject`, `ibl_action`, `ibl_acquisition`, `ibl_data`, and `ibl_qc`  
Imported data from **FlatIron**: `ibl_behavior`, `ibl_ephys`, `ibl_histology`  
Computed analysis results: `ibl_analyses_behavior`, `ibl_analyses_ephys`
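
The grouping above can be applied to the output of `dj.list_schemas()` with a few lines of plain Python. The sketch below is illustrative only: `SCHEMA_GROUPS` and `group_schemas` are hypothetical names, not part of DataJoint or ibl-pipeline, and the sample list stands in for a real `dj.list_schemas()` result.

```python
# Hypothetical helper: sort schema names into the groups listed above.
SCHEMA_GROUPS = {
    'alyx_metadata': ['ibl_reference', 'ibl_subject', 'ibl_action',
                      'ibl_acquisition', 'ibl_data', 'ibl_qc'],
    'flatiron_imported': ['ibl_behavior', 'ibl_ephys', 'ibl_histology'],
    'computed': ['ibl_analyses_behavior', 'ibl_analyses_ephys'],
}


def group_schemas(schema_names):
    """Return {group: [schemas present]}; unlisted schemas go under 'other'."""
    grouped = {group: [] for group in SCHEMA_GROUPS}
    grouped['other'] = []
    for name in schema_names:
        for group, members in SCHEMA_GROUPS.items():
            if name in members:
                grouped[group].append(name)
                break
        else:
            # No group claimed this schema (e.g. a user-created one).
            grouped['other'].append(name)
    return grouped


# Stand-in for dj.list_schemas(); the last name is a made-up user schema.
sample = ['ibl_ephys', 'ibl_subject', 'ibl_analyses_ephys', 'group_shared_example']
grouped = group_schemas(sample)
```

With a live connection, `group_schemas(dj.list_schemas())` would give the same kind of overview for the schemas you can actually see.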

# Access the schemas

There are two ways of accessing the schemas with DataJoint:

>* Create virtual modules
>* Import modules from ibl-pipeline

## Create virtual modules 
The tables are designed and generated with DataJoint, and the code that defines them lives in ibl-pipeline. However, if you just want to access the table contents, you don't need that code. Instead, DataJoint provides a function called `create_virtual_module`, which reconstructs the modules and classes from the **current** structure of the tables in the database. For example:

In [None]:
ephys = dj.create_virtual_module('ephys', 'ibl_ephys')

The first argument is the name you would like to give the module (its `__name__`), which is usually not very important. The second argument is the schema name.
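
When several schemas are needed, the same call can be repeated in a loop. The following is a minimal sketch, assuming we name each virtual module after its schema with the `ibl_` prefix removed; `virtual_module_name` and `make_virtual_modules` are hypothetical helpers introduced here for illustration, and only `dj.create_virtual_module` comes from DataJoint.

```python
def virtual_module_name(schema_name, prefix='ibl_'):
    """Hypothetical naming rule: drop the 'ibl_' prefix ('ibl_ephys' -> 'ephys')."""
    if schema_name.startswith(prefix):
        return schema_name[len(prefix):]
    return schema_name


def make_virtual_modules(schema_names):
    """Create one virtual module per schema; needs a working database connection."""
    import datajoint as dj  # imported here so the naming helper works offline
    return {
        virtual_module_name(name): dj.create_virtual_module(
            virtual_module_name(name), name)
        for name in schema_names
    }


# The naming rule alone can be checked without a connection.
names = [virtual_module_name(s) for s in ['ibl_ephys', 'ibl_histology']]
# With a connection: modules = make_virtual_modules(['ibl_ephys', 'ibl_histology'])
```

Keeping the modules in a dictionary is just one option; assigning each to its own variable, as in the cell above, works equally well.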

Now we have the virtual module `ephys`, which contains all the classes needed to interact with the tables in the schema. Apart from the populate methods, you can perform all other DataJoint operations on this virtual module, including plotting diagrams, querying, fetching, creating child tables, deleting, and dropping. Please be extra cautious when deleting entries and dropping tables.

Let's take a look at the relational diagram of the module:

In [None]:
dj.Diagram(ephys)

Here is a friendly reminder of what these shapes, colors and lines mean:

**Table tiers**:  
Manual table: green box  
Lookup table: gray box  
Imported table: blue oval  
Computed table: red circle  
Part table: plain text

The meaning of the table tiers can be found in this [presentation](https://docs.google.com/presentation/d/1mp3Bro1_o_nPScD_g0ygw2z633Rdnd-GGlFEJZmhrBs/edit#slide=id.g7e7b39a7dc_0_5).

**Dependencies**:  
One-to-one primary: thick solid line  
One-to-many primary: thin solid line  
Secondary foreign key reference: dashed line  
Renamed secondary foreign key references: orange dot

We can access tables through the classes of the virtual module.

In [None]:
ephys.DefaultCluster().describe();

Virtual modules are particularly useful in the following scenarios:

>* `group_shared_` schemas: these schemas are created by users, and the code that defines them is not necessarily easy to access.
>* `ibl_` schemas: these schemas were created and defined in ibl-pipeline, but because we are in rapid development, the latest released ibl-pipeline package may not reflect the current state of the tables. Creating virtual modules is a good way to access the tables with their current definitions.

The ephys tables contain many external fields, such as the `blob@ephys` shown in the definition above. External storage is a DataJoint feature that allows bulky data to be saved in S3 buckets. From the user's point of view, an external field behaves no differently from an internal field. However, external fields require the storage location to be configured beforehand. Without this configuration, DataJoint does not know where to fetch the data from.

In [None]:
# fetch the first two entries
ephys.DefaultCluster.fetch('cluster_spikes_times', limit=2)

The fetch above fails when the external storage location has not been configured. To fix the problem, we can `import ibl_pipeline`, which configures the external storage location. The configuration is stable across different versions of ibl_pipeline.

In [None]:
import ibl_pipeline
ephys.DefaultCluster.fetch('cluster_spikes_times', limit=2)
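
For readers curious about what such a storage configuration looks like, DataJoint reads external store settings from `dj.config['stores']`, keyed by store name. The sketch below only builds the dictionary for a store named `ephys` (the name taken from the `blob@ephys` field type). `make_ephys_store` is a hypothetical helper, and every value passed to it is a placeholder: the actual settings used by ibl_pipeline are not reproduced here, so importing the package remains the recommended route.

```python
def make_ephys_store(bucket, location, access_key, secret_key,
                     endpoint='s3.amazonaws.com'):
    """Build a `stores` entry for an S3-backed store named 'ephys'.

    All argument values are supplied by the caller; nothing here reflects
    the real IBL bucket or credentials.
    """
    return {
        'ephys': {
            'protocol': 's3',
            'endpoint': endpoint,
            'bucket': bucket,
            'location': location,
            'access_key': access_key,
            'secret_key': secret_key,
        }
    }


# Placeholder values only, for illustration.
stores = make_ephys_store(bucket='my-bucket', location='my/location',
                          access_key='MY_ACCESS_KEY', secret_key='MY_SECRET_KEY')
# To apply it: import datajoint as dj; dj.config['stores'] = stores
```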

## Directly import from ibl-pipeline

A more routine approach is to import the modules directly from the package `ibl-pipeline`:

In [None]:
from ibl_pipeline import ephys, histology

In [None]:
ephys.DefaultCluster()

In [None]:
histology.ClusterBrainRegion() & 'insertion_data_source like "%Ephys%"'

# Summary

In this notebook, we introduced how to connect to the database and how to access schemas and tables. In particular, we illustrated the usage of `dj.create_virtual_module`, which is useful for accessing rapidly changing schemas and tables.

In the [next notebook](01-Introduction%20of%20ephys%20and%20histology%20tables.ipynb), we will go through the important tables in the ephys and histology schemas one by one.