<div style="overflow: hidden;">
    <img src="images/DREGS_logo_v2.png" width="300" style="float: left; margin-right: 10px;">
</div>

# The production schema

When we connect to the data registry without specifying a schema, we connect to the default schema, which is used for registering and storing data from active DESC projects. Another primary schema of the data registry is the "production" schema.

The production schema is for projects that are no longer under active development and are ready to be archived or distributed to the wider DESC community.

Note that **only administrators have write access to the production schema and its shared space**. This tutorial covers how the process works, but an administrator is required to carry out the commands that register production datasets.

All users can query the production schema.

### What we cover in this tutorial

In this tutorial we will learn how to:

- Connect to the production schema and register a new dataset (admin only)
- Query the production schema

### Before we begin

If you want to run this tutorial interactively and haven't done so already, check out the [getting setup](https://lsstdesc.org/dataregistry/tutorial_setup.html) page in the documentation.

A quick way to check everything is set up correctly is to run the first cell below, which should load the `dataregistry` package, and print the package version.

In [None]:
# Come up with a random owner name to avoid clashes
from random import randint
import os
# Fall back to "unknown" if USER is not set, so the concatenation cannot fail on None
OWNER = "tutorial_" + os.environ.get('USER', 'unknown') + '_' + str(randint(0, int(1e6)))

import dataregistry
print(f"Working with dataregistry version: {dataregistry.__version__} as random owner {OWNER}")

**Note** that some of the cells below may fail, especially if run multiple times. This is most likely due to clashes with the unique constraints within the database (hopefully the error output is informative). If this happens, either (1) rerun the cell above to generate a new random owner name, or (2) manually change the values of the database column(s) that clash during registration.
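One way to reduce such clashes is to append a fresh random suffix to any name that has to be unique. The sketch below is only an illustration and is not part of the `dataregistry` package: `unique_name` is a hypothetical helper, and the base name is an arbitrary example.

```python
import uuid


def unique_name(base: str) -> str:
    """Return `base` with a short random hexadecimal suffix appended.

    Hypothetical helper for this tutorial, not a dataregistry function.
    """
    # uuid4 is randomly generated, so two calls are very unlikely
    # to produce the same 8-character suffix.
    return f"{base}_{uuid.uuid4().hex[:8]}"


name_a = unique_name("my_desc_production_dataset")
name_b = unique_name("my_desc_production_dataset")
print(name_a)
print(name_b)
```

A name built this way could be used wherever the cells below use `OWNER` to make a value unique.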

## Registering a new production dataset

The production schema is essentially identical in layout to the default schema. Working with it is therefore no different from working with the default schema, which we covered in the getting started tutorials.

To register a new dataset in the production schema, we first connect to it:

In [None]:
from dataregistry import DataRegistry

# Establish connection to the production schema
datareg = DataRegistry(schema="tutorial_production", owner="production", owner_type="production")

Here we have connected to the data registry tutorial production schema (`schema="tutorial_production"`). We have set both `owner` and `owner_type` to "production", which are the only values allowed for the production schema.

In [None]:
# Add new entry.
dataset_id, execution_id = datareg.Registrar.dataset.register(
    f"nersc_production_tutorial:my_desc_production_dataset_{OWNER}",
    "1.0.0",
    description="A production output from some DESC code",
    location_type="dummy"
)

print(f"Created dataset {dataset_id}, associated with execution {execution_id}")

This would register a new dataset in the production schema, exactly as the same call would in the default schema.

To recap, for production datasets:
- Only administrators have write access to the production schema and shared space
- All datasets in the production schema have `owner="production"` and `owner_type="production"`
- Production datasets can never be overwritten, even if `is_overwritable=True`
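
The owner rule in the recap can be expressed as a small client-side check run before attempting to connect. This is a minimal sketch under the assumption that "production" is the only permitted value for both fields. `check_production_owner` is a hypothetical helper and not part of `dataregistry`, which may enforce the rule differently.

```python
# Assumed from the recap above: the only value allowed for both fields.
PRODUCTION_OWNER = "production"


def check_production_owner(owner: str, owner_type: str) -> None:
    """Raise ValueError unless owner and owner_type are both "production".

    Hypothetical pre-flight check, not a dataregistry function.
    """
    if owner != PRODUCTION_OWNER or owner_type != PRODUCTION_OWNER:
        raise ValueError(
            "The production schema requires owner='production' and "
            f"owner_type='production', got owner={owner!r}, "
            f"owner_type={owner_type!r}"
        )


# Passes silently for the values used in this tutorial
check_production_owner("production", "production")

# Any other combination is rejected
try:
    check_production_owner("alice", "user")
except ValueError as err:
    print(f"Rejected: {err}")
```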

## Querying the production schema

Whilst only administrators have write access to the production schema, it is open for everyone to query.

Querying the production schema is no different from querying the default schema (just make sure you connect to the production schema when initiating the `DataRegistry` object).

For example, to find all datasets owned by "production", which includes the dataset we just registered, we would do:

In [None]:
# Create a filter that queries on the owner
f = datareg.Query.gen_filter('dataset.owner', '==', 'production')

# Query the database
results = datareg.Query.find_datasets(['dataset.dataset_id', 'dataset.name', 'dataset.owner',
                                       'dataset.relative_path', 'dataset.creation_date'],
                                      [f],
                                      return_format="dataframe")

print(results)

Note that when using the command line interface to query datasets, e.g., `dregs ls`, both the default schema you are connected to and the production schema are searched.