# Schema

A **schema** is a namespace for a collection of related tables in the database. The words "schema" and "database" can be used interchangeably in DataJoint, but "schema" is preferred. A single data pipeline can include multiple schemas. This notebook covers creating new schemas or connecting to existing ones. 

In [1]:
import datajoint as dj
dj.__version__

'0.12.dev7'

The function call `dj.schema(schema_name)` makes an object that references the named schema. If this is a new schema, then `dj.schema` will create it on the server.

It is a common convention to name the returned object `schema`, and we will follow this convention throughout this course. This implies that a separate Python module is created to work with each schema on the database server. We will follow this convention as well: 1 Python module $\equiv$ 1 database schema.

You need `schema` to create new tables within this schema.

In [2]:
# create the dj101_university database
schema = dj.schema('dj101_university')

Connecting dimitri@localhost:3306


The `schema` object is used as the **decorator** for DataJoint table classes. 

For example, let's create a table to represent university departments.

In [3]:
@schema
class Department(dj.Manual):
    definition = """
    department : char(8)  # short name
    ---
    department_name : varchar(255)  # department name
    """

The new table is declared on the database server when the decorator executes, so the table becomes ready for use:

In [4]:
Department.insert1(("MATH", "Department of Mathematics"))

In [5]:
Department()

| department (short name) | department_name (department name) |
|---|---|
| MATH | Department of Mathematics |
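
To see why an object such as `schema` can sit on the line above a class, here is a toy sketch of the class-decorator pattern in plain Python. `ToySchema` and its `tables` registry are hypothetical stand-ins for illustration only; this is not DataJoint's implementation, and it never contacts a database.

```python
class ToySchema:
    """Hypothetical stand-in for a schema object; not DataJoint's implementation."""

    def __init__(self, name):
        self.name = name
        self.tables = {}  # maps lower-cased class name -> decorated class

    def __call__(self, cls):
        # A class decorator receives the class object and returns a class.
        # The side effect here is registration in a dict; DataJoint instead
        # declares the table on the database server at this point.
        self.tables[cls.__name__.lower()] = cls
        cls.schema_name = self.name
        return cls


toy_schema = ToySchema("dj101_university")


@toy_schema  # equivalent to: Department = toy_schema(Department)
class Department:
    definition = """
    department : char(8)  # short name
    ---
    department_name : varchar(255)  # department name
    """


print(sorted(toy_schema.tables))
print(Department.schema_name)
```

The point of the sketch is that the work happens when the decorator executes, which is at class definition time rather than at first use.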


You can view all accessible schemas on the database server:

In [6]:
dj.list_schemas()

['dimitri_alter',
 'dimitri_attach',
 'dimitri_blob',
 'dimitri_blobs',
 'dimitri_debug',
 'dimitri_filepath',
 'dimitri_nphoton',
 'dimitri_nwb',
 'dimitri_schema',
 'dimitri_singular',
 'dimitri_social',
 'dimitri_test',
 'dimitri_tutorial',
 'dimitri_unique',
 'dimitri_university',
 'dimitri_uuid',
 'dj101_university',
 'test_attach',
 'test_external',
 'test_filepath',
 'test_graphs',
 'test_graphs_external',
 'test_graps',
 'test_mikkel',
 'test_orders',
 'test_parse',
 'test_question001',
 'test_question002',
 'test_unique',
 'university']
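
Assuming `dj.list_schemas()` returns a plain list of strings, as the output above suggests, ordinary list operations are enough to search it. The sketch below uses a shortened, hard-coded copy of that list so it runs without a database; `with_prefix` is a hypothetical helper, not part of DataJoint.

```python
# A shortened copy of the list shown above. In a live session this would
# come from dj.list_schemas() instead of being typed in.
schema_names = [
    'dimitri_test',
    'dimitri_university',
    'dj101_university',
    'test_graphs',
    'test_orders',
    'university',
]


def with_prefix(names, prefix):
    """Return the names that start with `prefix`, preserving their order."""
    return [name for name in names if name.startswith(prefix)]


test_schemas = with_prefix(schema_names, 'test_')
has_course_schema = 'dj101_university' in schema_names

print(test_schemas)
print(has_course_schema)
```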

## Keyword arguments `create_schema` and `create_tables`
If you wish to connect to an existing schema, you may want to set `create_schema=False` to prevent the accidental creation of a new schema from a misspelled schema name:

In [7]:
try:
    schema = dj.schema('dj101_misspelled', create_schema=False)
except dj.DataJointError as err:
    print(type(err), err)

<class 'datajoint.errors.DataJointError'> Database named `dj101_misspelled` was not defined. Set argument create_schema=True to create it.
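
When a name is rejected this way, it can help to look for existing schemas with similar names. The following is a minimal sketch using the standard library's `difflib`; `suggest_schema` is a hypothetical helper, the misspelled name `dj101_universty` is made up for the example, and the list of existing names is a hard-coded subset of the `dj.list_schemas()` output shown earlier.

```python
import difflib


def suggest_schema(name, existing, cutoff=0.8):
    """Return up to three existing schema names that closely resemble `name`."""
    return difflib.get_close_matches(name, existing, n=3, cutoff=cutoff)


# Hard-coded subset of the schema names listed earlier in this notebook.
existing = ['dimitri_university', 'dj101_university', 'test_graphs', 'university']

# 'dj101_universty' is a deliberately misspelled, made-up name.
suggestions = suggest_schema('dj101_universty', existing)
print(suggestions)
```

The `cutoff` value is a judgment call: a higher value returns fewer, closer matches.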


Also, if you connect to an existing schema without the intention of extending it with new tables, set `create_tables=False`. This is good practice for modules that are deployed for routine use after the active design phase. This way, obsolete copies of the module will not attempt to re-create deprecated tables:

In [8]:
schema = dj.schema('dj101_university', create_schema=False, create_tables=False)

In [9]:
try:
    @schema
    class Deprecated(dj.Manual):
        definition = """
        a : int    
        """
except dj.DataJointError as err:
    print(type(err), err)

<class 'datajoint.errors.DataJointError'> Table `deprecated` not declared
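
One way to keep the two flags consistent across a project is to derive them from a single setting. This is only a sketch: `schema_flags` and its `deployed` parameter are hypothetical, and the function merely builds the keyword arguments, so it runs without a database connection.

```python
def schema_flags(deployed):
    """Hypothetical helper: keyword arguments for dj.schema by project phase."""
    if deployed:
        # Routine use: never create a schema or a table by accident.
        return {'create_schema': False, 'create_tables': False}
    # Active design: allow both, matching the behavior seen earlier when
    # dj.schema was called without these arguments.
    return {'create_schema': True, 'create_tables': True}


flags = schema_flags(deployed=True)
print(flags)

# In a live session (requires a database connection):
# schema = dj.schema('dj101_university', **flags)
```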


## Other arguments of `dj.schema`

`dj.schema` also takes a `context` argument, which is largely unnecessary now. In early versions of DataJoint for Python, this argument passed in the namespace in which to look for table class declarations. In some old DataJoint code, you may still find something like `schema = dj.schema('pipeline_ephys', locals())`. Unless you wish to explicitly set the context in which the schema object should find class names, omit this argument.

The `connection` argument is provided for the rare cases when it is necessary to explicitly define the database connection instead of using the default connection that is accessible by calling `dj.conn()`. This may be helpful when multiple database connections are accessed simultaneously.



## Dropping a schema
Dropping a schema should be done with extreme caution: it is the quickest way to lose a lot of data. `schema.drop` removes the table contents and table definitions for the entire schema.

In [10]:
schema.drop()

Proceed to delete entire schema `dj101_university`? [yes, No]: yes


## Troubleshooting

The most common problems with `dj.schema` involve user privileges. If you do not have privileges to create a schema with a specific name or name pattern, a `dj.errors.AccessError` will occur:

In [11]:
try:
    schema = dj.schema('some_schema')
except dj.DataJointError as err:
    print(type(err), err)

<class 'datajoint.errors.AccessError'> ('Insufficient privileges.', "Access denied for user 'dimitri'@'%' to database 'some_schema'", 'CREATE DATABASE `some_schema`')
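
The `except dj.DataJointError` clause above caught an `AccessError`, which is what Python does when the raised class derives from the class named in `except`. The toy sketch below uses stand-in classes (`ToyDataJointError` and `ToyAccessError` are hypothetical names, not the library's classes) to show how the privilege case could be handled separately by listing the more specific class first.

```python
class ToyDataJointError(Exception):
    """Stand-in for a library's base error class."""


class ToyAccessError(ToyDataJointError):
    """Stand-in for a more specific, privilege-related error."""


def classify(exc):
    """Raise `exc` and report which except clause handles it."""
    try:
        raise exc
    except ToyAccessError:
        # The more specific class is listed first so it is matched first.
        return 'privileges'
    except ToyDataJointError:
        return 'other'


print(classify(ToyAccessError('Insufficient privileges.')))
print(classify(ToyDataJointError('Table not declared')))
```

If the clauses were listed in the opposite order, the base-class clause would match both errors and the specific branch would never run.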
