[Prev](Primer03.ipynb)

DataJoint Primer. Section 4.
# Schemas as python modules

```python
%matplotlib notebook
import datajoint as dj
```

In the [previous part](Primer03.ipynb) we defined several relations of a schema in a Jupyter notebook cell.

In practice, defining relations in a notebook is rarely a good idea, because the Python classes that represent your tables should be reusable across several analyses. It is therefore useful to put them into a separate file and import them as a module.

Let's take the code from the [previous part](Primer03.ipynb) and put it in a separate file called `experiment.py`. The file should look like this.


```python
import datajoint as dj
schema = dj.schema('dimitri_experiment', locals())

@schema
class Subject(dj.Manual):
    definition = """
    # Basic subject info
    subject_id       : int     # internal subject id
    ---
    real_id                     :  varchar(40)    #  real-world name
    species = "mouse"           : enum('mouse', 'monkey', 'human')   # species
    date_of_birth=null          : date                          # animal's date of birth
    sex="unknown"               : enum('M','F','unknown')       #
    caretaker="Unknown"         : varchar(20)                   # person responsible for working with this subject
    animal_notes=""             : varchar(4096)                 # strain, genetic manipulations, etc
    """


@schema
class Experiment(dj.Manual):
    definition = """
    # Basic experiment info

    -> Subject
    experiment          : smallint   # experiment number for this subject
    ---
    experiment_folder               : varchar(255) # folder path
    experiment_date                 : date        # experiment start date
    experiment_notes=""             : varchar(4096)
    experiment_ts=CURRENT_TIMESTAMP : timestamp   # automatic timestamp
    """


@schema
class Session(dj.Manual):
    definition = """
    # a two-photon imaging session

    -> Experiment
    session_id    : tinyint  # two-photon session within this experiment
    -----------
    setup      : tinyint   # experimental setup
    lens       : tinyint   # lens e.g.: 10x, 20x, 25x, 60x
    """


@schema
class Scan(dj.Manual):
    definition = """
    # a two-photon scan

    -> Session
    scan_id : tinyint  # two-photon scan within this session
    ----
    depth  :   float    #  depth from surface
    wavelength : smallint  # (nm)  laser wavelength
    mwatts: numeric(4,1)  # (mW) laser power to brain
    """
```

Now we can import this module and re-use its tables in several scripts or notebooks.

```python
import experiment as expe
expe.Scan()
```

| subject_id | experiment | session_id | scan_id | depth | wavelength | mwatts |
|---|---|---|---|---|---|---|
|  |  |  |  |  |  |  |


We can also use these classes now to define new relations based on them. Let's say we want an additional table that stores the cortical layer for particular scans. Certainly, this class should depend on `expe.Scan`.

**Remark:** Of course, we would normally define a new relation that belongs to the experiment schema in the same python file. We just define `CorticalLayer` in this notebook to demonstrate a few details about how tables are generated. 

```python
schema = dj.schema('dimitri_experiment', locals())

@schema
class CorticalLayer(dj.Manual):
    definition = """
    # cortical layer a scan was performed in
    -> expe.Scan
    ---
    layer    : enum("L1","L2/3","L4", "L5","L6")
    """

CorticalLayer()
```

| subject_id | experiment | session_id | scan_id | layer |
|---|---|---|---|---|
|  |  |  |  |  |


Notice how we referred to the `Scan` relation as `expe.Scan`. This is because in _this_ notebook, the `Scan` class is reachable under the name `expe.Scan`. If we had imported it via `from experiment import Scan as Jabberwocky`, the table definition would look like this:

```python
from experiment import Scan as Jabberwocky
schema = dj.schema('dimitri_experiment', locals())

@schema
class CorticalLayer(dj.Manual):
    definition = """
    # cortical layer a scan was performed in 
    -> Jabberwocky
    ---
    layer    : enum("L1","L2/3","L4", "L5","L6")
    """
```

This works because we pass `locals()` as the second argument to `dj.schema` when we create the `schema` decorator. This argument must be a dictionary in which every name used in a foreign key reference (the part after `->`) can be resolved to the corresponding Python class. `locals()` returns a dictionary of all local variables, so it is an easy choice. However, the following would have worked as well.

```python
import experiment as expe
schema = dj.schema('dimitri_experiment', dict(Jabberwocky=expe.Scan))

@schema
class CorticalLayer(dj.Manual):
    definition = """
    # cortical layer a scan was performed in 
    -> Jabberwocky
    ---
    layer    : enum("L1","L2/3","L4", "L5","L6")
    """
```
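The lookup can be pictured with a small stand-alone sketch. It does not use DataJoint and is not how the library is implemented; `resolve_references` and the stand-in `Scan` class are hypothetical names made up for this illustration. It only shows why the name after `->` has to be resolvable in the dictionary handed to `dj.schema`.

```python
import re
import types


class Scan:
    """Stand-in for the real table class (hypothetical, no database involved)."""


# Mimics `import experiment as expe`: a namespace object with a Scan attribute.
expe = types.SimpleNamespace(Scan=Scan)


def resolve_references(definition, context):
    """Toy lookup: map every name after `->` to the object it names in `context`."""
    resolved = {}
    for line in definition.splitlines():
        match = re.match(r"\s*->\s*([\w.]+)", line)
        if match:
            name = match.group(1)
            # Evaluate dotted names such as `expe.Scan` against the context only.
            resolved[name] = eval(name, {"__builtins__": {}}, dict(context))
    return resolved


definition_a = """
-> expe.Scan
---
layer : enum("L1","L2/3","L4","L5","L6")
"""
definition_b = definition_a.replace("expe.Scan", "Jabberwocky")

# Both contexts lead to the same class object, just under different names.
via_module = resolve_references(definition_a, {"expe": expe})
via_alias = resolve_references(definition_b, {"Jabberwocky": expe.Scan})
```

In this toy version, a name that is missing from the context raises a `NameError`, which is the kind of failure to expect when a dependency cannot be found under the name used in the definition.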

This mechanism is very powerful, because it allows us to define dependencies between arbitrary relations on the same server. However, as you can see from the `Jabberwocky` example, it can also be used for all kinds of nonsensical code and black magic. We (the developers) have never found a case in which anything other than `locals()` was necessary or would have made sense. Use common sense and choose reasonable, informative names for the dependencies of a relation.

## Generate classes in a module from the database

Notice that the database `dimitri_experiment` now has an additional relation because we defined it here. Of course, the reasonable step would be to copy its definition into the file `experiment.py`. However, DataJoint can also generate the Python class from the database for us. To that end, we need to add the following call at the end of the file.

```python

# ... other class definitions

@schema
class Scan(dj.Manual):
    definition = """
    # a two-photon scan

    -> Session
    scan_id : tinyint  # two-photon scan within this session
    ----
    depth  :   float    #  depth from surface
    wavelength : smallint  # (nm)  laser wavelength
    mwatts: numeric(4,1)  # (mW) laser power to brain
    """
    
schema.spawn_missing_classes()
```

This function looks through the database, defines the corresponding Python class objects, and puts them into the local namespace. The most common use case for this mechanism is when relations are defined in Matlab, but we want to access them from Python. All we need to do in this case is to define a new module, create a `schema` decorator with the correct database, and call the function.

```python
# Python module for a fictional schema whose tables were defined in Matlab

import datajoint as dj
schema = dj.schema('matlab_schema', locals())

schema.spawn_missing_classes()
```

Of course, we could just have copied the definitions from Matlab and generated the corresponding classes in Python. However, generating the classes from the database is more robust when a definition is changed in Matlab but the Python code is not updated. DataJoint would still load the correct version from the database, but the definition in the Python file would no longer match it. It is better to not define anything at all in Python and refer the user to the corresponding Matlab code.
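As a rough mental model of what spawning classes means, the sketch below creates placeholder classes for table names that have no class yet and stores them in a namespace dictionary. It is not DataJoint's implementation: the real function reads the tables from the database server, while `spawn_placeholders`, the hard-coded table list, and the snake_case to CamelCase naming rule are assumptions made for this illustration.

```python
def spawn_placeholders(table_names, context):
    """Add a placeholder class to `context` for every table without one."""
    created = []
    for table_name in table_names:
        # Assumed naming rule: `cortical_layer` becomes `CorticalLayer`.
        class_name = "".join(part.capitalize() for part in table_name.split("_"))
        if class_name not in context:
            context[class_name] = type(class_name, (object,), {"table_name": table_name})
            created.append(class_name)
    return created


class Scan:
    """Already defined by hand, so it must not be replaced."""


# Stand-in for the module namespace that `locals()` would provide.
namespace = {"Scan": Scan}
created = spawn_placeholders(["scan", "cortical_layer"], namespace)
```

Classes that already exist are left alone; only the missing ones are created and added to the namespace.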

[Next](Primer05.ipynb)