# Introduction to the workflow structure

This notebook gives a brief overview of the workflow structure and introduces some useful DataJoint tools to facilitate the exploration.
+ DataJoint needs to be configured before running this notebook. If you haven't set up the configuration, refer to notebook [01-configuration](01-configuration.ipynb).
+ If you are familiar with DataJoint and the workflow structure, proceed directly to the next notebook, [03-process](03-process.ipynb), to run the workflow.
+ For a more thorough introduction to how DataJoint works, please visit our [general tutorial site](https://playground.datajoint.io).

To load the local configuration, we will change the directory to the package root.

In [None]:
import os
os.chdir('..')
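
As a quick sanity check after changing directories, the sketch below looks for a local configuration file in a given directory. It assumes the configuration was saved under DataJoint's default local file name, `dj_local_conf.json`. The helper `has_local_config` is a hypothetical name introduced here for illustration and is not part of DataJoint.

```python
import os

# Assumed default name of DataJoint's local configuration file.
LOCAL_CONFIG_NAME = "dj_local_conf.json"


def has_local_config(directory="."):
    """Return True if a local configuration file exists in `directory`."""
    return os.path.isfile(os.path.join(directory, LOCAL_CONFIG_NAME))


# Report whether the current working directory holds a local configuration.
print("local config found:", has_local_config(os.getcwd()))
```

If no file is found, the working directory is probably not the package root, or the configuration step in [01-configuration](01-configuration.ipynb) has not been completed yet.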

## Schemas and tables

The current workflow is composed of multiple database schemas, each of which corresponds to a module within `workflow_array_ephys.pipeline`.

In [None]:
import datajoint as dj
from workflow_array_ephys.pipeline import lab, subject, session, probe, ephys

+ Each module contains a schema object that enables interaction with the schema in the database.

In [None]:
ephys.schema

+ Each table class in the module corresponds to a table in the database schema, e.g. `ephys.EphysRecording` corresponds to the `_ephys_recording` table in the schema `neuro_ephys`.

In [None]:
# show the table name on the database side.
ephys.EphysRecording.table_name
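
The mapping from class name to database table name follows a naming convention: the class name is converted from CamelCase to snake_case, and a prefix encodes the table tier. The sketch below is a simplified approximation of that convention in plain Python, not DataJoint's own implementation. The names `camel_to_snake` and `table_name` are hypothetical helpers, and the sketch does not cover part tables or unusual class names.

```python
import re

# Tier prefixes used in database table names (Manual tables have no prefix).
TIER_PREFIX = {"Manual": "", "Lookup": "#", "Imported": "_", "Computed": "__"}


def camel_to_snake(name):
    """Convert a CamelCase class name to snake_case (simplified)."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def table_name(class_name, tier):
    """Build the database table name from a class name and its tier."""
    return TIER_PREFIX[tier] + camel_to_snake(class_name)


print(table_name("EphysRecording", "Imported"))  # → _ephys_recording
```

This is why the imported table `ephys.EphysRecording` appears as `_ephys_recording` on the database side.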

In [None]:
# preview the columns and contents of a table
ephys.EphysRecording()

+ When the modules are imported for the first time, the schemas and tables are created in the database.
+ Once created, importing the modules again does not recreate the schemas and tables. Instead, the modules give access to the existing schemas and tables so they can be queried and manipulated.

## DataJoint tools to explore schemas and tables

+ `dj.list_schemas()`: list all schemas a user has access to in the current database

In [None]:
dj.list_schemas()

+ `list_tables()`: list all tables in a schema

In [None]:
ephys.schema.list_tables()
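
The names returned by `list_tables()` are database-side table names, so the tier of each table can be read off its prefix. The sketch below groups names by tier under that assumption. The function `table_tier` and the example names are hypothetical and chosen only to illustrate the prefixes; they are not taken from the output of this workflow.

```python
def table_tier(table_name):
    """Infer the table tier from a database-side table name."""
    # Part tables are named "<master>__<part>", so after removing the
    # leading tier prefix a double underscore remains inside the name.
    core = table_name.lstrip("_#")
    if "__" in core:
        return "Part"
    if table_name.startswith("__"):
        return "Computed"
    if table_name.startswith("_"):
        return "Imported"
    if table_name.startswith("#"):
        return "Lookup"
    return "Manual"


# Illustrative names only; replace with the list from `list_tables()`.
example_names = ["subject", "#example_lookup", "_example_imported",
                 "__example_computed", "_example_imported__example_part"]
for name in example_names:
    print(name, "->", table_tier(name))
```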

+ `dj.Diagram()`: plot tables and dependencies. 

In [None]:
# plot diagram for all tables in a schema
dj.Diagram(ephys)

**Table tiers**: 

Manual table: green box, manually inserted table, expect new entries daily, e.g. Subject, ProbeInsertion.  
Lookup table: gray box, pre-inserted table, commonly used for general facts or parameters, e.g. Strain, ClusteringMethod, ClusteringParamSet.  
Imported table: blue oval, auto-processing table, the processing depends on importing external files, e.g. processing Clustering requires output files from kilosort2.  
Computed table: red circle, auto-processing table, the processing does not depend on files external to the database, commonly used for     
Part table: plain text, an appendix to the master table, all the part entries of a given master entry together form one intact set belonging to that master entry, e.g. Unit of a CuratedClustering.

**Dependencies**:  

One-to-one primary: thick solid line, the child table shares the exact same primary key as the parent table, meaning it inherits all the primary key fields from the parent table as its own primary key.  
One-to-many primary: thin solid line, the child table inherits the primary key from the parent table but has additional field(s) as part of its primary key.  
Secondary dependency: dashed line, the child table inherits the primary key fields from the parent table as its own secondary attributes.
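
The three dependency types above can be told apart by where the parent's primary key fields land in the child table. The sketch below expresses that rule in plain Python. The function `dependency_line` and the attribute names are hypothetical, and the sketch ignores renamed foreign key attributes, which real table definitions may use.

```python
def dependency_line(parent_pk, child_pk, child_secondary):
    """Return the diagram line style for a parent-to-child dependency."""
    parent, primary = set(parent_pk), set(child_pk)
    if parent <= primary:
        # Parent key sits inside the child's primary key.
        if parent == primary:
            return "thick solid"  # one-to-one primary
        return "thin solid"       # one-to-many primary
    if parent <= set(child_secondary):
        return "dashed"           # secondary dependency
    return None                   # no dependency on this parent


# Illustrative attribute names only.
print(dependency_line(["subject"], ["subject"], []))
print(dependency_line(["subject"], ["subject", "session_datetime"], []))
print(dependency_line(["probe_type"], ["probe"], ["probe_type"]))
```

The first call models a child that reuses the parent's key unchanged, the second a child that adds a field to it, and the third a child that only references the parent from a secondary attribute.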

In [None]:
# plot diagram of tables in multiple schemas
dj.Diagram(subject) + dj.Diagram(session) + dj.Diagram(ephys)

In [None]:
# plot diagram of selected tables and schemas
dj.Diagram(subject.Subject) + dj.Diagram(session.Session) + dj.Diagram(ephys)

+ `describe()`: show table definition with foreign key references.

In [None]:
ephys.EphysRecording.describe();

+ `heading`: show attribute definitions regardless of foreign key references

In [None]:
ephys.EphysRecording.heading

## Major DataJoint Elements installed in the current workflow

+ [`lab`](https://github.com/datajoint/element-lab): lab management related information, such as Lab, User, Project, Protocol, Source.

In [None]:
dj.Diagram(lab)

+ [`subject`](https://github.com/datajoint/element-animal): general animal information, such as User, Genetic background, Death, etc.

In [None]:
dj.Diagram(subject)

In [None]:
subject.Subject.describe();

+ [`session`](https://github.com/datajoint/element-session): general information about experimental sessions.

In [None]:
dj.Diagram(session)

In [None]:
session.Session.describe();

+ [`ephys`](https://github.com/datajoint/element-array-ephys): Neuropixels-based probe and ephys information.

In [None]:
dj.Diagram(probe) + dj.Diagram(ephys)

## Summary and next step

+ This notebook introduced the overall structure of the schemas and tables in the workflow, along with tools to explore the schema structure and table definitions.

+ The next notebook, [03-process](03-process.ipynb), walks through the steps of running the pipeline and shows the corresponding table contents.