<div style="text-align:center;">
    <img src="./pics/datacat_logo.svg" width="80%"></img>
    <br>
    <br>
    <h1 >Tutorial 1: Getting started with DataLad Catalog</h1>
</div>

### Quick Links:

- [DataLad Catalog primer](./datalad_catalog_primer.md)
- [DataLad Catalog demo site](https://datalad.github.io/datalad-catalog/)
- [3-minute explainer video](https://www.youtube.com/watch?v=4GERwj49KFc)
- [DataLad Catalog code repository](https://github.com/datalad/datalad-catalog)
- [DataLad Catalog documentation](http://docs.datalad.org/projects/catalog/en/latest/?badge=latest)
- [DataLad website](https://www.datalad.org/)
- [DataLad code repository](https://github.com/datalad/datalad)
- [DataLad MetaLad code repository](https://github.com/datalad/datalad-metalad)

### Tutorial goals

This tutorial guides you through the process of installing DataLad Catalog and getting to know its basic functionality. After you've gone through these steps, you should be able to create your own data catalog from structured metadata.

### Tutorial prerequisites

If you are not familiar with DataLad Catalog and its function in the wider DataLad ecosystem, it is recommended that you first read [DataLad Catalog - A primer](./datalad_catalog_primer.ipynb), although this is not strictly required before continuing with this tutorial.

### Sections

1. [Install DataLad Catalog](#install)
2. [The main catalog functionality](#main_commands):
   1. [Create a new catalog](#catalog_create)
   2. [Render a catalog locally](#catalog_serve)
   3. [Adding catalog metadata](#catalog_add)
   4. [Setting the default dataset of a catalog](#catalog_super)
   5. [Catalog configuration](#catalog_config)
   6. [Removing catalog metadata](#catalog_remove)


<div id="install"></div>

# 1. Install DataLad Catalog

DataLad Catalog is a free and open source command line tool that has a Python API and is available for installation via [PyPI](https://pypi.org/). Let's start by creating and activating a new and empty virtual environment:

In [1]:
python -m venv my_catalog_env

In [2]:
source my_catalog_env/bin/activate


Now we can clone `datalad-catalog` and install it with `pip`. This will also install `datalad` and other dependencies.

In [3]:
git clone https://github.com/datalad/datalad-catalog.git
cd datalad-catalog
pip install -e .
cd ..

Cloning into 'datalad-catalog'...
remote: Enumerating objects: 2285, done.
remote: Counting objects: 100% (1363/1363), done.
remote: Compressing objects: 100% (915/915), done.
remote: Total 2285 (delta 568), reused 1208 (delta 430), pack-reused 922
Receiving objects: 100% (2285/2285), 5.24 MiB | 2.17 MiB/s, done.
Resolving deltas: 100% (1098/1098), done.
Obtaining file:///Users/jsheunis/Documents/psyinf/tutorials/notebooks/catalog_tutorials/datalad-catalog
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting datalad-metalad>=0.3.2
  Using cached datalad_metalad-0.3.2-py3-none-any.whl (133 kB)
Collecting datalad>=0.16
  Using cached datalad-0.16.4-py3-none-any.whl (1.4 MB)
Collecting jsonschema
  Using cached jsonschema-4.6.0-py3-none-any.whl

After that, you can check the installation by running the `datalad catalog` command with the `--help` flag:

In [4]:
datalad catalog --help

Usage: datalad catalog [-h] [-c CATALOG_DIR] [-m METADATA] [-i DATASET_ID]
                       [-v DATASET_VERSION] [-f] [-y CONFIG_FILE] [--version]
                       {create|add|remove|serve|set-super|validate}

Generate web-browser-based user interface for browsing metadata of a 
DataLad dataset.

(Long description of arbitrary volume.)

positional arguments:
  {create|add|remove|serve|set-super|validate}
                        This is the subcommand to be executed by datalad-
                        catalog. Options include: create, add, remove, serve,
                        set-super, and validate. Example: ''. Constraints:
                        value must be one of ('create', 'add', 'remove',
                        'serve', 'set-super', 'validate')

optional arguments:
  -h, --help, --help-np
                        show this help message. --help-np forcefully disables
                        the use of a pager for displaying the help message
  -c CATALOG_DIR, --cata


You might be wondering why the catalog command is preceded by `datalad`, as in `datalad catalog <command>`. DataLad Catalog is an extension of [DataLad](https://github.com/datalad), which is installed as a dependency of DataLad Catalog and provides the base functionality that the catalog generation process relies on. Feel free to explore DataLad's capabilities with `datalad --help`.

<div id="main_commands"></div>

# 2. The main catalog functionality

As you likely saw in the `--help` information, DataLad Catalog has several main commands to support the process of catalog generation. These include:

- `create`: create a new catalog
- `add`: add metadata to a catalog
- `remove`: remove metadata from a catalog
- `serve`: serve the catalog locally on an http server for testing/viewing purposes
- `set-super`: set the so-called super dataset of the catalog, i.e. the dataset that will be displayed when navigating to the root URL of the catalog
- `validate`: validate metadata according to the catalog schema

Below you are taken through several steps showing how to use these commands and their supporting flags.

<div id="catalog_create"></div>

## 2.1. Create a new catalog: `datalad catalog create`

With this command, you can create a new catalog. Let's try it out!

In [5]:
WORKING_DIRECTORY=$(pwd)
CATALOG_PATH="$WORKING_DIRECTORY/test_catalog"
datalad catalog create -c $CATALOG_PATH

catalog create(ok): /Users/jsheunis/Documents/psyinf/tutorials/notebooks/catalog_tutorials [Catalog successfully created at: /Users/jsheunis/Documents/psyinf/tutorials/notebooks/catalog_tutorials/test_catalog]

The `catalog create(ok)` result shows that the catalog was successfully created at the specified location, which was passed to the command with the `-c/--catalog-dir` flag.

We can inspect the catalog's content with the `tree` command.

In [6]:
cd $CATALOG_PATH
tree
cd ..

.
├── artwork
│   ├── 404.svg
│   ├── binder_logo.svg
│   ├── catalog_logo.svg
│   ├── datalad_catalog_logo_1_dark.svg
│   ├── datalad_catalog_logo_1_light.svg
│   └── datalad_logo_wide.svg
├── assets
│   ├── favicon
│   │   ├── favicon-16x16.png
│   │   ├── favicon-32x32.png
│   │   └── favicon.ico
│   ├── md5-2.3.0.js
│   ├── style.css
│   └── vue_app.js
├── config.json
├── index.html
└── metadata

4 directories, 14 files

As you can see, the catalog's root directory contains subdirectories for:
- *artwork*: images that make the catalog pretty
- *assets*: mainly the JavaScript and CSS code that underlie the user interface of the catalog
- *metadata*: this is where metadata content for any datasets and files rendered by the catalog will be contained

It also contains an `index.html` file, which is the main catalog HTML content that will be served to users in their browsers, as well as a `config.json` file, which contains default and user-specified configuration settings for the browser. These directories and files are all populated into their respective locations by the `datalad catalog create` command.
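As a quick sanity check, the layout described above can be tested from the shell. The sketch below is not part of DataLad Catalog: `check_catalog_layout` is a hypothetical helper that only tests for the top-level entries listed in the `tree` output above, and the demo directory is a stand-in created on the spot.

```bash
# Hypothetical helper (not a datalad command): report whether the
# top-level entries shown in the tree output above exist in a directory.
check_catalog_layout() {
    catalog_dir="$1"
    missing=0
    for entry in artwork assets metadata index.html config.json; do
        if [ ! -e "$catalog_dir/$entry" ]; then
            echo "missing: $entry"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if nothing was missing
    [ "$missing" -eq 0 ]
}

# Stand-in directory that mimics a fresh catalog, for demonstration only
DEMO_DIR=$(mktemp -d)
mkdir "$DEMO_DIR/artwork" "$DEMO_DIR/assets" "$DEMO_DIR/metadata"
touch "$DEMO_DIR/index.html" "$DEMO_DIR/config.json"

if check_catalog_layout "$DEMO_DIR"; then
    echo "layout looks complete"
fi
```

On your own system you would call `check_catalog_layout "$CATALOG_PATH"` instead of using the stand-in directory.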

Let's have a look at the catalog that you just created.

<div id="catalog_serve"></div>

## 2.2. Render a catalog locally: `datalad catalog serve`

Since the catalog contains HTML, JavaScript, and CSS that can be viewed in any common browser (Google Chrome, Safari, Mozilla Firefox, etc), this content needs to be served.

With the `serve` command, you can serve the content of a catalog locally via an HTTP server. 

If you are running the code of this tutorial on your own system (i.e. not from within the Jupyter environment), you can do that as follows:

```bash
datalad catalog serve -c $CATALOG_PATH
```

In the Jupyter environment, we can open a new HTTP Server window to achieve the same outcome. If you're using JupyterLab, you can open a new Launcher window from the main menu bar, and then click on the HTTP Server option. If you're running a Jupyter Notebook environment, you can select "New" in the top right of the tree view, and then select "HTTP Server" from the dropdown menu.


<br>
<div style="display: flex; border: 1px solid grey">
    <div style="flex: 50%; border: 1px solid black">
        <img src="./pics/open_server_lab1.png" width="70%"></img>
    </div>
    <div style="flex: 50%; border: 1px solid black">
        <img src="./pics/open_server_lab2.png" width="55%"></img>
    </div>
</div>

<div style="text-align:center;">
    <h5>Opening a new HTTP Server window in JupyterLab</h5>
</div>

<br>
<br>
<div style="text-align:center;">
    <img src="./pics/open_server_notebook.png" width="20%" style="border: 1px solid black"></img>
    <h5>Opening a new HTTP Server window in Jupyter Notebook</h5>
</div>


Once the content is served, you can visit the local URL to view the catalog. On your own system, this should be at http://localhost:8000/. From the Jupyter environment, you should see the directory tree of your current directory, which should include the `test_catalog` that you created previously:

<br>
<div style="text-align:center;">
    <img src="./pics/http_server_testcatalog.png" width="50%" style="border: 1px solid black"></img>
    <h5>The directory tree rendered via the HTTP Server</h5>
</div>

If you navigate to the `test_catalog` directory, the catalog should be rendered. You should see the 404 page, since there is no metadata in the catalog yet. (Don't worry, that will change soon!)

<br>
<div style="text-align:center;">
    <img src="./pics/404.png" width="60%" style="border: 1px solid black"></img>
    <h5>The rendered catalog, here showing the 404 page</h5>
</div>

If you press the browser's back button until you see the directory tree again, and then navigate to `datalad-catalog/datalad_catalog`, the fully functional demo catalog should be rendered. This is the same as the demo catalog hosted here: https://datalad.github.io/datalad-catalog/

<div id="catalog_add"></div>

## 2.3. Adding catalog metadata

The catalog is, of course, only as useful as the metadata that is contained within it. So let's add some! This can easily be done with the `add` command and `-m/--metadata` flag:

```bash
datalad catalog add -c $CATALOG_PATH -m <path_to_metadata_file>
```

At the time of writing, DataLad Catalog accepts metadata input in the JSON Lines format, i.e. a text file (typically `.json`, `.jsonl`, or `.txt`) in which each line is a single, correctly formatted JSON object.
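To make the one-object-per-line format concrete, here is a minimal sketch that writes two objects to a file and counts them. The file name `example_metadata.jsonl` and the stripped-down objects are made up for illustration; they are not complete catalog metadata and are not meant to pass schema validation.

```bash
# Hypothetical file name and location, used only for this illustration
EXAMPLE_FILE="${TMPDIR:-/tmp}/example_metadata.jsonl"

# Each line is one complete JSON object: no commas between lines and no
# enclosing list around them
cat > "$EXAMPLE_FILE" <<'EOF'
{"type": "dataset", "dataset_id": "abcd", "dataset_version": "1234"}
{"type": "file", "dataset_id": "abcd", "dataset_version": "1234", "path": "README"}
EOF

# The number of lines equals the number of metadata objects in the file
N_OBJECTS=$(grep -c '' "$EXAMPLE_FILE")
echo "$EXAMPLE_FILE contains $N_OBJECTS objects"
```

Note that a pretty-printed JSON object spanning several lines does not fit this format; each object has to be collapsed onto a single line first.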


### 2.3.1. The Catalog Schema

Each JSON object provided to the Catalog in the metadata file should be structured according to the Catalog schema, which is based on [JSON Schema](https://json-schema.org/): a vocabulary that allows you to annotate and validate JSON documents.

The implication is that you will have to format your metadata objects to conform to this standard. At the core of this standard are the concepts of a **dataset** and a **file**, which shouldn't be surprising to anyone working with data: we have a set of files organized in some kind of hierarchy, and sets of files are often delineated from other sets of files (here we call this delineation a *dataset*). There are a few core specifications of metadata objects within the context of the Catalog schema:

1. A metadata object can only be about a dataset or a file (its `type`).
2. Each metadata object has multiple "key/value"-pairs that describe it. For example, an object of type `dataset` might have a `name` (key) equal to `my_test_dataset` (value), and a `keywords` field equal to the list `["quick", "brown", "fox"]` (value). An object of type `file` might have a `format` (key) equal to `JSON` (value).
3. Each metadata object should have a way to identify its related dataset. For an object of type `dataset`, this will be the `dataset_id` and `dataset_version` of the actual dataset. For an object of type `file`, this will be the `dataset_id` and `dataset_version` of its parent dataset (i.e. the dataset which the file forms part of).
4. Each metadata object of type `file` should have a `path` key for which the value specifies exactly where the file is located relative to the root directory of its parent dataset.
5. Datasets can have subdatasets.
6. The Catalog schema specifies exactly which fields are required and which data types are accepted for each key/value-pair.

For a better understanding of the Catalog schema, you can inspect the [JSON documents here](https://github.com/datalad/datalad-catalog/tree/main/datalad_catalog/templates) (the files named `jsonschema_*`).

Here is a toy example of three metadata objects.

First a dataset:

```json
{
    "type": "dataset",
    "dataset_id":"abcd",
    "dataset_version":"1234",
    "name": "My toy dataset",
    "short_name": "My toy dataset",
    "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus nec justo tellus. Nunc sagittis eleifend magna, eu blandit arcu tincidunt eu. Mauris pharetra justo nec volutpat euismod. Curabitur bibendum vitae nunc a pharetra. Donec non rhoncus risus, ac consequat purus. Pellentesque ultricies ut enim non luctus. Sed viverra dolor enim, sed blandit lorem interdum sit amet. Aenean tincidunt et dolor sit amet tincidunt. Vivamus in sollicitudin ligula. Curabitur volutpat sapien erat, eget consectetur mauris dapibus a. Phasellus fringilla justo ligula, et fringilla tortor ullamcorper id. Praesent tristique lacus purus, eu convallis quam vestibulum eget. Donec ullamcorper mi neque, vel tincidunt augue porttitor vel.",
    "doi": "",
    "url": ["https://github.com/jsheunis/multi-echo-super"],
    "license": {
      "name": "CC BY 4.0",
      "url": "https://creativecommons.org/licenses/by/4.0/"
    },
    "authors": [
        {
            "givenName":"Stephan",
            "familyName":"Heunis"
        }
    ],
    "keywords": ["lorum", "ipsum", "foxes"],
    "funding": [
        {
            "name":"Stephan's Bank Account",
            "identifier":"No. 42",
            "description":"Nothing to see here"
        }
    ],
    "extractors_used": [
        {
            "extractor_name": "stephan_manual",
            "extractor_version": "1",
            "extraction_parameter": {},
            "extraction_time": 1652340647.0,
            "agent_name": "Stephan Heunis",
            "agent_email": ""
        }
    ]
}
```

And then two of its files:

```json
{
    "type": "file",
    "dataset_id": "abcd",
    "dataset_version": "1234",
    "contentbytesize": 1403,
    "path": "README",
    "extractors_used": [
        {
            "extractor_name": "stephan_manual",
            "extractor_version": "1",
            "extraction_parameter": {},
            "extraction_time": 1652340647.0,
            "agent_name": "Stephan Heunis",
            "agent_email": ""
        }
    ]
}
{
    "type": "file",
    "dataset_id": "abcd",
    "dataset_version": "1234",
    "contentbytesize": 15572,
    "path": "main_data/main_results.png",
    "extractors_used": [
        {
            "extractor_name": "stephan_manual",
            "extractor_version": "1",
            "extraction_parameter": {},
            "extraction_time": 1652340647.0,
            "agent_name": "Stephan Heunis",
            "agent_email": ""
        }
    ]
}
```

These objects are also contained in the `test_data/toy_metadata.jsonl` file:

In [7]:
cat $WORKING_DIRECTORY/test_data/toy_metadata.jsonl | jq .

{
  "type": "dataset",
  "dataset_id": "abcd",
  "dataset_version": "1234",
  "name": "My toy dataset",
  "short_name": "My toy dataset",
  "description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus nec justo tellus. Nunc sagittis eleifend magna, eu blandit arcu tincidunt eu. Mauris pharetra justo nec volutpat euismod. Curabitur bibendum vitae nunc a pharetra. Donec non rhoncus risus, ac consequat purus. Pellentesque ultricies ut enim non luctus. Sed viverra dolor enim, sed blandit lorem interdum sit amet. Aenean tincidunt et dolor sit amet tincidunt. Vivamus in sollicitudin ligula. Curabitur volutpat sapien erat, eget consectetur mauris dapibus a. Phasellus fringilla justo ligula, et fringilla tortor ullam


### 2.3.2. Validating your metadata: `datalad catalog validate`

For convenience during metadata setup and catalog generation, the Catalog has the `validate` command that lets you test whether your metadata conforms to the Catalog schema before adding it:

```bash
datalad catalog validate -m <path_to_metadata_file>
```

Let's test this on the toy metadata:

In [9]:
datalad catalog validate -m $WORKING_DIRECTORY/test_data/toy_metadata.jsonl

catalog validate(ok): /Users/jsheunis/Documents/psyinf/tutorials/notebooks/catalog_tutorials [Metadata successfully validated]
metadata validation against catalog schema: 100%|█| 3.00/3.00 [00:00<00:00, 2.98

Great, we have valid metadata!

Note that this validator also runs internally whenever metadata is added to the catalog, so there is no need to run validation explicitly unless you want to.

### 2.3.3. Adding metadata: `datalad catalog add`

Next, let's add our valid metadata to the catalog!

In [10]:
datalad catalog add -c $CATALOG_PATH -m $WORKING_DIRECTORY/test_data/toy_metadata.jsonl

catalog add(ok): /Users/jsheunis/Documents/psyinf/tutorials/notebooks/catalog_tutorials [Metadata items successfully added to catalog]

The `catalog add(ok)` result indicates that our metadata was added successfully to the catalog. You can inspect this by looking at the content of the metadata directory inside the catalog:

In [11]:
tree $WORKING_DIRECTORY/test_catalog/

/Users/jsheunis/Documents/psyinf/tutorials/notebooks/catalog_tutorials/test_catalog/
├── artwork
│   ├── 404.svg
│   ├── binder_logo.svg
│   ├── catalog_logo.svg
│   ├── datalad_catalog_logo_1_dark.svg
│   ├── datalad_catalog_logo_1_light.svg
│   └── datalad_logo_wide.svg
├── assets
│   ├── favicon
│   │   ├── favicon-16x16.png
│   │   ├── favicon-32x32.png
│   │   └── favicon.ico
│   ├── md5-2.3.0.js
│   ├── style.css
│   └── vue_app.js
├── config.json
├── index.html
└── metadata
    └── abcd
        └── 1234
            ├── 10f
            │   └── 7898cf7fc3465f078a67e15124c72.json
            └── 217
                └── 17d85bd1b1526b7e463279763cdb0.json

8 directories, 16 files

Where the `metadata` directory was previously empty, it now has several subdirectories and two `.json` files. Note first that the names of the two outermost subdirectories correspond to the `dataset_id` and `dataset_version`, respectively, of the dataset in the toy metadata that we added to the catalog. This supports DataLad Catalog's ability to identify specific datasets and their files by ID and version, so that the catalog can be updated easily (and, when it comes to decentralized contribution, without conflicts). The subdirectories further down the hierarchy, as well as the filenames, are hashes of the path to the specific directory node relative to the parent dataset. Let's look at the content of these files:

In [12]:
cat $WORKING_DIRECTORY/test_catalog/metadata/abcd/1234/217/17d85bd1b1526b7e463279763cdb0.json | jq .
cat $WORKING_DIRECTORY/test_catalog/metadata/abcd/1234/10f/7898cf7fc3465f078a67e15124c72.json | jq .

{
  "dataset_id": "abcd",
  "dataset_version": "1234",
  "type": "dataset",
  "children": [
    {
      "type": "file",
      "dataset_id": "abcd",
      "dataset_version": "1234",
      "contentbytesize": 1403,
      "path": "README",
      "extractors_used": [
        {
          "extractor_name": "stephan_manual",
          "extractor_version": "1",
          "extraction_parameter": {},
          "extraction_tim

As you can see, the content of these files is very similar to the original toy data, but slightly transformed. This transformation creates a structure that is easier for the associated browser application to read and render. Additionally, structuring data into metadata files that represent nodes in the dataset hierarchy (i.e. datasets or directories) allows the browser application to load only the metadata file for the node that the user selects. This reduces loading time and makes the user experience more seamless.
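Because the two outermost directory levels under `metadata` are the `dataset_id` and `dataset_version`, the location of a dataset's metadata inside a catalog can be derived from those two values alone. The helper name `dataset_metadata_dir` below is hypothetical and only restates the layout seen in the `tree` output; it is not a DataLad Catalog command, and it makes no attempt to reproduce the hashing of the deeper levels.

```bash
# Hypothetical helper: print the directory that holds the metadata of one
# dataset version, following the metadata/<dataset_id>/<dataset_version>
# layout shown in the tree output above.
dataset_metadata_dir() {
    printf '%s/metadata/%s/%s\n' "$1" "$2" "$3"
}

# Fall back to the tutorial's catalog location if CATALOG_PATH is not set
CATALOG_PATH="${CATALOG_PATH:-$(pwd)/test_catalog}"
TOY_DIR=$(dataset_metadata_dir "$CATALOG_PATH" abcd 1234)
echo "$TOY_DIR"

# List the metadata files of the toy dataset, if the catalog exists here
if [ -d "$TOY_DIR" ]; then
    find "$TOY_DIR" -name '*.json'
fi
```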

### 2.3.4. Viewing a particular dataset

That was everything that happened behind the scenes during the `datalad catalog add` procedure, but what does our updated catalog look like? Let's take a look. If you navigate to the HTTP server tab/window that you opened previously, and hit refresh, you should see... no change?!

The reason for this is that we didn't specify the details of the particular dataset that we want to view.

If we want to view the specific dataset that we just added to the catalog, we can specify its `dataset_id` and `dataset_version` by appending them to the URL in the format: `/#/dataset/<dataset_id>/<dataset_version>`. This makes it possible to view any uniquely identifiable dataset by navigating to a unique URL.
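As a small sketch of that URL format, the snippet below assembles the address from its parts. `BASE_URL` is an assumption: it matches the local address mentioned earlier for a catalog served on your own system, and it will differ in the Jupyter HTTP Server window, where the catalog lives under the `test_catalog` directory.

```bash
# Assumed address of the served catalog; adjust it to your own setup
BASE_URL="http://localhost:8000"

# Identifier and version of the toy dataset added to the catalog
DATASET_ID="abcd"
DATASET_VERSION="1234"

# Append the dataset route to the catalog address
DATASET_URL="${BASE_URL}/#/dataset/${DATASET_ID}/${DATASET_VERSION}"
echo "$DATASET_URL"
```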

Let's try it with our example. Go to the HTTP server tab/window, make sure that you are located in the `test_catalog` directory (the 404 page should be displaying), append `/#/dataset/abcd/1234` to the end of the URL, and hit ENTER/RETURN. You should see something like this:

<br>
<div style="text-align:center;">
    <img src="./pics/dataset_view_subdatasets.png" width="75%" style="border: 1px solid black"></img>
    <h5>The rendered catalog, showing the dataset view with the subdatasets tab</h5>
</div>

This is the dataset view, with the subdatasets tab (auto-)selected. This view displays all the main content related to the dataset that was provided by the metadata, and gives the user further functionality such as downloading the dataset with DataLad, downloading the metadata, filtering subdatasets by keyword, browsing files, and viewing extended attributes such as funding information related to the dataset. Below are two more views, the first with the files tab selected, and the second with the funding tab selected.

<br>
<div style="text-align:center;">
    <img src="./pics/dataset_view_files.png" width="75%" style="border: 1px solid black"></img>
    <h5>The rendered catalog, showing the dataset view with the files tab</h5>
    <br>
    <img src="./pics/dataset_view_funding.png" width="75%" style="border: 1px solid black"></img>
    <h5>The rendered catalog, showing the dataset view with the funding tab</h5>
</div>

<div id="catalog_super"></div>

## 2.4. Setting the default dataset of a catalog: `datalad catalog set-super`

When one navigates to a specific catalog's root address (in our toy case: `$WORKING_DIRECTORY/test_catalog`), i.e. without a `dataset_id` and `dataset_version` specified in the URL, the browser application checks if a so-called "superdataset" is specified for the catalog. If not, it renders the 404 page.

The specification of a superdataset could be useful for cases where the catalog, when navigated to, should always render the top-level list of available datasets in the catalog (provided by the metadata as subdatasets to the superdataset).

Let's set our toy dataset as the catalog's superdataset, using the `set-super` command and additionally specifying the dataset's `dataset_id` (`-i/--dataset-id` flag) and `dataset_version` (`-v/--dataset-version` flag):

In [13]:
datalad catalog set-super -c $WORKING_DIRECTORY/test_catalog -i abcd -v 1234

catalog set-super(ok): /Users/jsheunis/Documents/psyinf/tutorials/notebooks/catalog_tutorials [Superdataset successfully set for catalog]

The `catalog set-super(ok)` result shows that the superdataset was successfully set for the catalog, and you will now also be able to see an additional `super.json` file in the catalog metadata directory. The content of this file is a simple JSON object specifying the superdataset's `dataset_id` and `dataset_version`:

In [14]:
tree $WORKING_DIRECTORY/test_catalog/metadata
cat $WORKING_DIRECTORY/test_catalog/metadata/super.json

/Users/jsheunis/Documents/psyinf/tutorials/notebooks/catalog_tutorials/test_catalog/metadata
├── abcd
│   └── 1234
│       ├── 10f
│       │   └── 7898cf7fc3465f078a67e15124c72.json
│       └── 217
│           └── 17d85bd1b1526b7e463279763cdb0.json
└── super.json

4 directories, 3 files
{"dataset_id": "abcd", "dataset_version": "1234"}

*Now*, when one navigates to the catalog's root address without a `dataset_id` and `dataset_version` specified in the URL, the browser application will find that a superdataset is indeed specified for the catalog, and it will navigate to that specific dataset view.
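The rule described above can be restated as a short shell sketch. This is not how the browser application is implemented (that logic lives in the catalog's JavaScript); `describe_root_view` is a hypothetical helper that only mirrors the documented outcome, and the stand-in catalog and hand-written `super.json` exist purely for the demonstration. In practice, use `datalad catalog set-super` to create that file.

```bash
# Hypothetical helper that mirrors the documented behaviour: a catalog
# with metadata/super.json opens on that dataset, otherwise on the 404 page.
describe_root_view() {
    if [ -f "$1/metadata/super.json" ]; then
        echo "dataset view"
    else
        echo "404 page"
    fi
}

# Stand-in catalog directory, created only for this demonstration
DEMO_CATALOG=$(mktemp -d)
mkdir "$DEMO_CATALOG/metadata"
BEFORE=$(describe_root_view "$DEMO_CATALOG")

# Mimic the file content shown in the output above
echo '{"dataset_id": "abcd", "dataset_version": "1234"}' > "$DEMO_CATALOG/metadata/super.json"
AFTER=$(describe_root_view "$DEMO_CATALOG")

echo "before: $BEFORE, after: $AFTER"
```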

<div id="catalog_config"></div>

## 2.5. Catalog configuration

***to be completed***

A useful feature of DataLad Catalog is the ability to configure certain properties according to your preferences. This is done with the help of a config file (in either `JSON` or `YAML` format) and the `-y/--config-file` flag. DataLad Catalog provides a default config file with the following content:

In [16]:
cat $WORKING_DIRECTORY/datalad-catalog/datalad_catalog/templates/config.json

{
    "catalog_name": "DataCat",
    "link_color": "#fba304",
    "link_hover_color": "#af7714",
    "property_source": {
        "dataset": {
            "dataset_id": "metalad_core",
            "dataset_version": "metalad_core",
            "type": "metalad_core",
            "children": "merge",
            "name": "metalad_studyminimeta",
            "short_name": "",
            "description": ["metalad_studyminimeta", "datacite_gin", "readme.md"],
            "doi": "",
            "url": "merge",
            "authors": "merge",
            "keywords": "merge",
            "license": "",
            "funding": "merge",
            "publications": "merge",
            "subdatasets": "merge",
            "extractors_used": "merge",
            "additional_display": "merge",
            "top_display": "merge"
        }
    }
}

<div id="catalog_remove"></div>

## 2.6. Removing catalog metadata

***to be completed***