---
title: "Annotation Tables"
aliases:
    - programmatic_access/em_py_03_annotation_tables.html
format: 
    html:
        toc: true 
        code-fold: false
execute:
    eval: False
    warning: False
jupyter: python3
bibliography: references.bib
---

The `minnie65_public` data release includes a number of annotation tables that help label the dataset.
This section describes the content of each of these tables. For instructions on how to query and filter tables, [see the CAVE quickstart](/quickstart_notebooks/em_py_02_cave_quickstart.html).


Unless otherwise specified (i.e. via `desired_resolution`), all positions are in voxel units at 4,4,40 nm/voxel resolution.
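
As a minimal sketch of what this unit convention means, the helper below converts a voxel-space position to nanometers by multiplying each axis by its voxel size. The function name `voxels_to_nm` is hypothetical (it is not part of any library), and the example point is made up; only the 4,4,40 nm/voxel resolution comes from the text above.

```python
def voxels_to_nm(position, resolution=(4, 4, 40)):
    """Convert an (x, y, z) voxel position to nanometers.

    `resolution` is the nm-per-voxel size along each axis.
    """
    return tuple(p * r for p, r in zip(position, resolution))


# An illustrative (made-up) point in voxel coordinates.
pt_position = (1000, 2000, 500)
pt_nm = voxels_to_nm(pt_position)
pt_um = tuple(v / 1000 for v in pt_nm)  # nanometers -> microns
print(pt_nm)  # (4000, 8000, 20000)
print(pt_um)  # (4.0, 8.0, 20.0)
```

Note that the z axis is scaled ten times more than x and y, so distances computed directly on voxel coordinates are distorted unless the positions are converted first.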

## Common Fields

Several fields (or column names) are common to many tables.
These fall into two main classes: the spatial point columns, which assign annotations to cells via points in 3d space, and the book-keeping columns, which are used internally to track the state of the data.

### Spatial Point Columns

Most tables have one or more **Bound Spatial Points**, each of which is a location in 3d space that tells the annotation to remain associated with the root id at that location.

Each bound spatial point has one prefix, usually `pt` (i.e. "point"), and three associated columns with different suffixes: `_position`, `_supervoxel_id`, and `_root_id`.

For a given prefix `{pt}`, the three columns are as follows:

* The `{pt}_position` indicates the location of the point in 3d space.
* The `{pt}_supervoxel_id` indicates a unique identifier in the segmentation, and is mostly internal bookkeeping.
* The `{pt}_root_id` indicates the root id of the annotation at that location.
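
The naming rule above can be sketched as a small helper. The function `bound_point_columns` is hypothetical, and the `pre_pt` prefix is used only as an illustrative non-default prefix; check the columns of the table you are working with for the prefixes it actually uses.

```python
def bound_point_columns(prefix="pt"):
    """Return the three column names of a bound spatial point for a prefix."""
    suffixes = ("position", "supervoxel_id", "root_id")
    return [f"{prefix}_{suffix}" for suffix in suffixes]


print(bound_point_columns())
# ['pt_position', 'pt_supervoxel_id', 'pt_root_id']
print(bound_point_columns("pre_pt"))
# ['pre_pt_position', 'pre_pt_supervoxel_id', 'pre_pt_root_id']
```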

### Book-keeping Columns

Several columns are common to many or all tables, and mostly used as internal book-keeping.
Rather than describe these for every table, they will just be mentioned briefly here:

| Column | Description |
| :--- | :--- |
| `id` | A unique ID specific to the annotation within that table |
| `created` | The date that the annotation was created |
| `valid` | Internal bookkeeping column, should always be `t` for data you can download |
| `target_id` | Some tables reference other tables, particularly the nucleus table. If present, this column will be the same as `id` |
| `created_ref` / `valid_ref` / `id_ref` (optional) | For reference tables, the data shows both the created/valid/id of the reference annotation and the target annotation. The values with the `_ref` suffix are those of the reference table (usually something like proofreading state or cell type) and the values without a suffix are those of the target table (usually a nucleus) |

: Common columns {.light .hover}


## Synapse Table
{{< include _annotation_tables/_synapses_pni_v2.qmd >}}


## Nucleus tables

The 'nucleus centroid' of a cell is unlikely to change with proofreading, and so is a useful static identifier for a given cell. The results of automatic nucleus segmentation and neuron-detection are available in the following tables. These tables are often the 'reference' table for other annotations.

### Nucleus Detection Table
{{< include _annotation_tables/_nucleus_detection_v0.qmd >}}

### Neuron-Nucleus Table
{{< include _annotation_tables/_nucleus_ref_neuron_svm.qmd >}}

### Nucleus brain area assignment
{{< include _annotation_tables/_nucleus_functional_area_assignment.qmd >}}

## Cell Type Tables

There are several tables that contain information about the cell type of neurons in the dataset, with each table representing a different method of doing the classification.
Because each method requires a different kind of information, not all cells are present in all tables.
Each of the cell type tables has the same format, and in all cases the `id` column references the nucleus id of the cell in question.
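
Because the tables share a format and are keyed on nucleus id, labels from two methods can be compared with an ordinary join. The following is a minimal sketch using pandas, with small made-up dataframes standing in for query results; the ids, labels, and variable names are illustrative assumptions, not real data.

```python
import pandas as pd

# Toy stand-ins for two cell type tables (made-up ids and labels).
manual = pd.DataFrame({"id": [101, 102, 103], "cell_type": ["23P", "BC", "MC"]})
metamodel = pd.DataFrame({"id": [102, 103, 104], "cell_type": ["BC", "BPC", "4P"]})

# An outer join on nucleus id keeps cells that appear in only one table;
# `indicator=True` adds a `_merge` column recording where each row came from.
combined = manual.merge(
    metamodel,
    on="id",
    how="outer",
    suffixes=("_manual", "_metamodel"),
    indicator=True,
)

# Cells present in both tables, and whether the two labels agree.
both = combined[combined["_merge"] == "both"].copy()
both["agree"] = both["cell_type_manual"] == both["cell_type_metamodel"]
print(combined[["id", "_merge"]])
print(both[["id", "cell_type_manual", "cell_type_metamodel", "agree"]])
```

Rows marked `left_only` or `right_only` in `_merge` are cells that one method classified and the other did not, which is expected given the different input requirements of each method.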

### Manual Cell Types (V1 Column)

Table names: `allen_v1_column_types_slanted_ref` and `aibs_column_nonneuronal_ref`

A subset of nucleus detections in a 100 um column (n=2204) in VISp were manually classified by anatomists at the Allen Institute into categories of cell subclasses, first distinguishing cells into classes of non-neuronal, excitatory and inhibitory. 

The key columns are:

| Column | Description |
| :--- | :--- |
| `id` | Soma ID for the cell |
| `pt_position` \ `pt_supervoxel_id` \ `pt_root_id` | Bound spatial point columns associated with the centroid of the cell nucleus |
| `classification_system`| One of `aibs_coarse_excitatory` or `aibs_coarse_inhibitory` for detected neurons, or `aibs_coarse_nonneuronal` for non-neurons (glia/pericytes) |
| `cell_type` | One of several cell types, detailed below |

: AIBS Manual Cell Types, V1 Column {.light .hover}

This is a reference table on `nucleus_detection_v0`. The cell types in the table are:

::: {.callout-note appearance="minimal" collapse=true}

### Manual Cell Types (neurons)

| Cell Type | Subclass | Description |
| :--- | :--- | :--- |
| `23P` | Excitatory | Layer 2/3 cells | 
| `4P` | Excitatory | Layer 4 cells | 
| `5P-IT` | Excitatory | Layer 5 **i**ntra**t**elencephalic cells | 
| `5P-ET` | Excitatory | Layer 5 **e**xtra**t**elencephalic cells | 
| `5P-NP` | Excitatory | Layer 5 near-projecting cells | 
| `6P-IT` | Excitatory | Layer 6 **i**ntra**t**elencephalic cells | 
| `6P-CT` | Excitatory | Layer 6 **c**ortico**t**halamic cells | 
| `BC` | Inhibitory | Basket cell | 
| `BPC` | Inhibitory | Bipolar cell. *In practice, this label was used for all cells thought to be VIP cells, not only those with a bipolar dendrite* | 
| `MC` | Inhibitory | Martinotti cell. *In practice, this label was used for all inhibitory neurons that appeared to be Somatostatin cells, not only those with a Martinotti cell morphology*| 
| `Unsure` | Inhibitory | Unsure. *In practice, this label is also used for all likely-inhibitory neurons that did not match other types*| 

: AIBS Manual Cell Type definitions (neurons) {.light .hover}
:::

::: {.callout-note appearance="minimal" collapse=true}
### Manual Cell Types (non-neurons)
| Cell Type | Subclass | Description |
| :--- | :--- | :--- |
| `OPC` | Non-neuronal | Oligodendrocyte precursor cell | 
| `astrocyte` | Non-neuronal | Astrocyte | 
| `microglia` | Non-neuronal | Microglia | 
| `pericyte` | Non-neuronal | Pericyte | 
| `oligo` | Non-neuronal | Oligodendrocyte | 

: AIBS Manual Cell Type definitions (non-neurons) {.light .hover}

:::

### Predictions from soma/nucleus features

Table name: `aibs_metamodel_celltypes_v661`

This table contains the results of a hierarchical classifier trained on features of the cell body and nucleus of cells. This was applied to most cells in the dataset that had complete cell bodies (e.g. not cut off by the edge of the data). For more details, see "Perisomatic Features" [@elabbady_perisomatic_2022]. The classifier performs well in general, but sometimes misclassifies layer 5 inhibitory neurons as excitatory.

The key columns are:

| Column | Description |
| :--- | :--- |
| `id` | Soma ID for the cell |
| `pt_position` \ `pt_supervoxel_id` \ `pt_root_id` | Bound spatial point columns associated with the centroid of the cell nucleus |
| `classification_system`| One of `excitatory_neuron` or `inhibitory_neuron` for detected neurons, or `nonneuron` for non-neurons (glia/pericytes) |
| `cell_type` | One of several cell types, detailed below |

: AIBS Soma Nuc Metamodel Table {.light .hover}

This is a reference table on `nucleus_detection_v0`, with small-objects and multi-soma errors removed. The model was run with cell-based features as of version 661 of the dataset. The cell types in the table are:

::: {.callout-note appearance="minimal" collapse=true}

### Soma Nuc Metamodel Cell types

| Cell Type | Subclass | Description |
| :--- | :--- | :--- |
| `23P` | Excitatory | Layer 2/3 cells | 
| `4P` | Excitatory | Layer 4 cells | 
| `5P-IT` | Excitatory | Layer 5 **i**ntra**t**elencephalic cells | 
| `5P-ET` | Excitatory | Layer 5 **e**xtra**t**elencephalic cells | 
| `5P-NP` | Excitatory | Layer 5 near-projecting cells | 
| `6P-IT` | Excitatory | Layer 6 **i**ntra**t**elencephalic cells | 
| `6P-CT` | Excitatory | Layer 6 **c**ortico**t**halamic cells | 
| `BC` | Inhibitory | Basket cell | 
| `BPC` | Inhibitory | Bipolar cell. *In practice, this label was used for all cells thought to be VIP cells, not only those with a bipolar dendrite* | 
| `MC` | Inhibitory | Martinotti cell. *In practice, this label was used for all inhibitory neurons that appeared to be Somatostatin cells, not only those with a Martinotti cell morphology*| 
| `NGC` | Inhibitory | Neurogliaform cell. *In practice, this label is also used for all inhibitory neurons in layer 1, many of which may not be neurogliaform cells although they might be in the same molecular family*| 
| `OPC` | Non-neuronal | Oligodendrocyte precursor cell | 
| `astrocyte` | Non-neuronal | Astrocyte | 
| `microglia` | Non-neuronal | Microglia | 
| `pericyte` | Non-neuronal | Pericyte | 
| `oligo` | Non-neuronal | Oligodendrocyte | 

: AIBS Soma Nuc Metamodel: Cell Type definitions {.light .hover}

:::

Previous versions of this table include: `aibs_soma_nuc_metamodel_preds_v117` (run on a subset of data, the V1 column) and `aibs_soma_nuc_exc_mtype_preds_v117` (using training data labeled by another classifier: see `mtypes` below). 


### Coarse prediction from spine detection
Table name: `baylor_log_reg_cell_type_coarse_v1`

This table contains the results of a logistic regression classifier trained on properties of neuronal dendrites. This was applied to many cells in the dataset, but it requires more data than soma and nucleus features alone, so fewer cells completed the pipeline. It has very good performance on excitatory vs inhibitory neurons because it focuses on dendritic spines, a characteristic property of excitatory neurons. It is a good table for double-checking E/I classifications if in doubt.

The key columns are:

| Column | Description |
| :--- | :--- |
| `id` | Soma ID for the cell |
| `pt_position` \ `pt_supervoxel_id` \ `pt_root_id` | Bound spatial point columns associated with the centroid of the cell nucleus |
| `classification_system`| `baylor_log_reg_cell_type_coarse` for all entries |
| `cell_type` | `excitatory` or `inhibitory` |

: Baylor Dendrite Feature Table {.light .hover}
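
One way to use this table as an E/I double check is to join it against the soma/nucleus predictions and list the cells where the two disagree. The sketch below uses made-up dataframes in place of query results. The column name `classification_system`, the intermediate column `ei_soma`, and the label mapping are assumptions for illustration, based on the values described in the tables above.

```python
import pandas as pd

# Toy stand-ins for the two tables, keyed on nucleus id (made-up values).
metamodel = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "classification_system": [
        "excitatory_neuron", "excitatory_neuron", "inhibitory_neuron", "nonneuron",
    ],
})
coarse = pd.DataFrame({
    "id": [1, 2, 3],
    "cell_type": ["excitatory", "inhibitory", "inhibitory"],
})

# Map the soma/nucleus labels onto the coarse vocabulary; non-neurons map to NaN.
to_coarse = {"excitatory_neuron": "excitatory", "inhibitory_neuron": "inhibitory"}
metamodel["ei_soma"] = metamodel["classification_system"].map(to_coarse)

# Inner join: only cells that completed both pipelines can be compared.
merged = metamodel.merge(coarse, on="id", how="inner")
disagree = merged[merged["ei_soma"] != merged["cell_type"]]
print(disagree[["id", "ei_soma", "cell_type"]])
```

Cells that appear in `disagree` are candidates for manual inspection rather than automatic relabeling, since either classifier could be the one in error.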

### Fine prediction from dendritic features

Table name: `aibs_metamodel_mtypes_v661_v2`

This table contains all detected neurons across the dataset.

Excitatory neurons and inhibitory neurons were distinguished with the `soma_nucleus` model above, and subclasses were assigned based on a data-driven clustering of the neuronal features. Inhibitory neurons were classified based on how they distributed their synaptic outputs onto target cells, while excitatory neurons were classified based on a collection of dendritic features.

For more details, see the section on the [minnie column](../em_03_proofreading.html#minnie-column) or read the
preprint "A connectomic census" [@schneider-mizell2023]. 

Note that all cell-type labels in this table come from a clustering specific to this paper, and while they are intended to align with the broader literature they are not a direct mapping or a well-established convention. 

For a more conventional set of labels on the same set of cells, look at the manual table `allen_v1_column_types_slanted_ref`. Cell types in that table align with those in the `aibs_metamodel_celltypes_v661` classifier above.

The key columns are:

| Column | Description |
| :--- | :--- |
| `id` | Soma ID for the cell |
| `pt_position` \ `pt_supervoxel_id` \ `pt_root_id` | Bound spatial point columns associated with the centroid of the cell nucleus |
| `classification_system`| `excitatory` or `inhibitory` |
| `cell_type` | One of several cell types, detailed below |

: Column M-type Table {.light .hover}

This is a reference table on `nucleus_detection_v0`, with non-neuronal objects removed. The model was run with cell-based features as of version 661 of the dataset. The cell types in the table are:



::: {.callout-note appearance="minimal" collapse=true}

### M-type Cell Type definitions

| Cell Type | Subclass | Description |
| :--- | :--- | :--- |
| `L2a` | Excitatory | A cluster of layer 2 (upper layer 2/3) excitatory neurons | 
| `L2b` | Excitatory | A cluster of layer 2 (upper layer 2/3) excitatory neurons | 
| `L3a` | Excitatory | A cluster of excitatory neurons transitioning between upper and lower layer 2/3 | 
| `L3b` | Excitatory | A cluster of layer 3 (lower layer 2/3) excitatory neurons | 
| `L3c` | Excitatory | A cluster of layer 3 (lower layer 2/3) excitatory neurons | 
| `L4a` | Excitatory | The largest cluster of layer 4 excitatory neurons | 
| `L4b` | Excitatory | Another cluster of layer 4 excitatory neurons | 
| `L4c` | Excitatory | A cluster of layer 4 excitatory neurons along the border with layer 5 | 
| `L5a` | Excitatory | A cluster of layer 5 IT neurons at the top of layer 5 | 
| `L5b` | Excitatory | A cluster of layer 5 IT neurons throughout layer 5 | 
| `L5ET` | Excitatory | The cluster of layer 5 ET neurons | 
| `L5NP` | Excitatory | The cluster of layer 5 NP neurons | 
| `L6a` | Excitatory | A cluster of layer 6 IT neurons at the top of layer 6 | 
| `L6b` | Excitatory | A cluster of layer 6 IT neurons throughout layer 6. *Note that this is different than the label "Layer 6b" which refers to a narrow band at the border between layer 6 and white matter* | 
| `L6c` | Excitatory | A cluster of tall layer 6 cells (unsure if IT or CT) | 
| `L6CT` | Excitatory | A cluster of tall layer 6 cells matching manual CT labels | 
| `L6wm` | Excitatory | A cluster of layer 6 cells along the border with white matter | 
| `PTC` | Inhibitory | Perisomatic targeting cells, a cluster of inhibitory neurons that target the soma and proximal dendrites of excitatory neurons. Approximately corresponds to **basket cell** | 
| `DTC` | Inhibitory | Dendrite targeting cells, a cluster of inhibitory neurons that target the distal dendrites of excitatory neurons. Most **SST cells** would be DTC | 
| `STC` | Inhibitory | Sparsely targeting cells, a cluster of inhibitory neurons that don't concentrate multiple synapses onto the same target neurons. Many **neurogliaform cells** and layer 1 interneurons fall into this category |
| `ITC` | Inhibitory | Inhibitory targeting cells, a cluster of inhibitory neurons that preferentially target other inhibitory neurons. Most **VIP cells** would be ITCs |

:::

Previous versions of this table include: `allen_column_mtypes_v1` (run on a subset of data, the V1 column)

## Proofreading Tables

### Proofreading Status and Strategy
{{< include _annotation_tables/_proofreading_status_and_strategy.qmd >}}


:::: {.callout-note appearance="minimal" collapse=true}
#### Proofreading status at public release

::: {.callout-important}
This table is out-of-date, and remains here for convenient reference
:::

{{< include _annotation_tables/_proofreading_status_public_release.qmd >}}

::::

## Functional Coregistration Tables

To relate the structural data to functional data, cell bodies must be coregistered between the functional imaging and EM volumes.
The results of this coregistration are stored in two tables with the same columns:

* `coregistration_manual_v4` : The results of manually verified coregistration. This table is well-verified, but contains fewer ROIs (N=15,352 root ids, 19,181 ROIs).
* `coregistration_auto_phase3_fwd_apl_vess_combined` : The results of automated functional matching between the EM and 2-p functional data. This table is not manually verified, but contains more ROIs (N=84,233 ROIs). This table reconciles the following two tables, which each make a best match of the registration using different techniques: `coregistration_auto_phase3_fwd` and `apl_functional_coreg_vess_fwd`.

::: {.callout-important}
Please see the [Functional Data](../functional_tutorial/functional_intro.html) section for more information about using this data. 

:::

The column descriptions are:

| Column | Description |
| :--- | :--- |
| `id` | Soma ID for the cell |
| `pt_position` \ `pt_supervoxel_id` \ `pt_root_id` | Bound spatial point columns associated with the centroid of the cell nucleus |
| `session`| The session index from functional imaging |
| `scan_idx` | The scan index from functional imaging |
| `unit_id` | The functional unit index from imaging. Only unique within scan and session |
| `field` | The field index from functional imaging |
| `residual` | The residual distance between the functional and the assigned structural points after transformation, in microns |
| `score` | A separation score, measuring the difference between the residual distance to the assigned neuron and the distance to the nearest non-assigned neuron, in microns. *This can be negative if the non-assigned neuron is closer than the assigned neuron. Larger values indicate fewer nearby neurons that could be confused with the assigned neuron.* |

: Coregistration table {.light .hover}
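
The `residual` and `score` columns can be used together to keep only the matches you are most confident in. The sketch below filters a made-up dataframe standing in for a coregistration table. The threshold values are placeholders chosen only for illustration, not recommended cutoffs.

```python
import pandas as pd

# Toy stand-in for a coregistration table (made-up values, in microns).
coreg = pd.DataFrame({
    "id": [11, 12, 13, 14],
    "residual": [2.1, 9.5, 3.0, 4.2],
    "score": [8.0, 5.0, -1.5, 6.3],
})

# Placeholder thresholds chosen only for illustration.
MAX_RESIDUAL_UM = 5.0
MIN_SCORE_UM = 0.0

# Keep matches that are both close (small residual) and well separated
# from the nearest alternative neuron (positive score).
confident = coreg[
    (coreg["residual"] < MAX_RESIDUAL_UM) & (coreg["score"] > MIN_SCORE_UM)
]
print(confident)
```

In this toy data, id 12 is dropped for its large residual and id 13 for its negative score, which indicates that a non-assigned neuron was closer than the assigned one.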

Previous versions of this analysis include the following tables: 

* `coregistration_manual_v3` : The results of manually verified coregistration. This table is well-verified, but contains fewer ROIs (N=12,052 root ids, 13,925 ROIs).
* `apl_functional_coreg_forward_v5` : The results of automated functional matching between the EM and 2-p functional data. This table is not manually verified, but contains more ROIs (N=36,078 root ids, 68,873 ROIs).


### Functional properties

A summary of the functional properties for each of the coregistered neurons (as of `coregistration_manual_v3`) is available for convenience. 

The table `functional_properties_v3_bcm` is a reference table on `nucleus_detection_v0`, and adds the following columns:


| Column | Description |
| :--- | :--- |
| `target_id` | Soma ID for the cell |
| `pt_position` \ `pt_supervoxel_id` \ `pt_root_id` | Bound spatial point columns associated with the centroid of the nucleus |
| `session` | The session index from functional imaging |
| `scan_idx` | The scan index from functional imaging |
| `unit_id` | The functional unit index from imaging. Only unique within scan and session |
| `pref_ori` | Preferred orientation in radians (0 - pi). A horizontal bar moving upward is 0 and orientation increases clockwise. Extracted from model responses to oriented noise stimuli |
| `pref_dir` | Preferred direction in radians (0 - 2pi). A horizontal bar moving upward is 0 and direction increases clockwise. Extracted from model responses to oriented noise stimuli |
| `gOSI` | Global orientation selectivity index |
| `gDSI`| Global direction selectivity index |
| `cc_abs` | Prediction performance of the model, higher is better |

: Functional properties of coregistered neurons {.light .hover}
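
Because `pref_ori` and `pref_dir` are circular quantities (wrapping at pi and 2pi respectively), a plain subtraction overstates the difference between two cells whose values sit on either side of the wrap point. The helper below is a minimal sketch of a wrap-aware comparison; the function name `circular_difference` and the example angles are hypothetical.

```python
import math


def circular_difference(a, b, period):
    """Smallest absolute difference between two angles on a circle of `period`."""
    d = abs(a - b) % period
    return min(d, period - d)


# Orientation wraps at pi: 0.1 rad and (pi - 0.1) rad are only 0.2 rad apart.
ori_diff = circular_difference(0.1, math.pi - 0.1, period=math.pi)
# Direction wraps at 2*pi: opposite directions are pi apart.
dir_diff = circular_difference(0.0, math.pi, period=2 * math.pi)
print(round(math.degrees(ori_diff), 1))  # 11.5
print(round(math.degrees(dir_diff), 1))  # 180.0
```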



## Overview of relevant tables
| Table Name | Number of Annotations | Description |
| :--- | :--- | :--- |
| `synapses_pni_v2` | 337,312,429 | The locations of synapses and the segment ids of the pre- and post-synaptic partners, from automated synapse detection |
| `nucleus_detection_v0` | 144,120 | The locations of nuclei detected via a fully automated method |
| `nucleus_alternative_points`| 8,388 | A reference annotation table marking alternative segment_id lookup locations for a subset of nuclei in nucleus_detection_v0 that is more accurate than the centroid location listed there |
| `nucleus_ref_neuron_svm` | 144,120 | A reference annotation indicating the output of a model detecting which nucleus detections are neurons and which are not |
| `coregistration_manual_v4` | 19,181 |  A table indicating the association between individual units in the functional imaging data and nuclei in the structural data, derived from human powered matching. Includes residual and separation scores to help assess confidence |
| `coregistration_auto_phase3_fwd_apl_vess_combined` | 84,233 |  A table indicating the association between individual units in the functional imaging data and nuclei in the structural data, derived from the automated procedure. Includes residuals and separation scores to help assess confidence |
| `proofreading_status_and_strategy` | 1,421 |  A table indicating which neurons have been proofread on their axons or dendrites |
| `aibs_column_nonneuronal_ref` | 542 | Cell type reference annotations from a human expert of non-neuronal cells located amongst the Minnie Column |
| `allen_v1_column_types_slanted_ref` | 1,357 | Neuron cell type reference annotations from human experts of neuronal cells located amongst the Minnie Column |
| `allen_column_mtypes_v1` | 1,357 | Neuron cell type reference annotations from data driven unsupervised clustering of neuronal cells |
| `aibs_metamodel_mtypes_v661_v2`| 72,158| Reference annotations indicating the output of a model predicting cell types across the dataset based on the labels from allen_column_mtypes_v1.1 |
| `aibs_metamodel_celltypes_v661` | 94,014 | Reference annotations indicating the output of a model predicting cell classes based on the labels from allen_v1_column_types_slanted_ref and aibs_column_nonneuronal_ref |
| `baylor_log_reg_cell_type_coarse_v1` | 55,063 | Reference annotations indicating the output of a logistic regression model predicting whether the nucleus is part of an excitatory or inhibitory cell |
| `baylor_gnn_cell_type_fine_model_v2` | 49,051 | Reference annotations indicating the output of a graph neural network model predicting the cell type based on the human labels in allen_v1_column_types_slanted_ref |
| `proofreading_edits` | 121,271 |  A table containing the number of edits on every segment_id associated with a nucleus in the volume |
| `vortex_astrocyte_proofreading_status` | 12 |  This table reports the status of a manually selected subset of astrocytes within the VISp column. Astrocyte selection and proofreading were performed as part of VORTEX. |

: heard you like tables--here's a table for your tables {.light .hover}