# Circuit composition

Copyright (c) 2025 Open Brain Institute

Authors: Michael W. Reimann

last modified: 07.2025

## Summary
This notebook lists the neuronal composition of a (SONATA) circuit model as a Sankey plot.
Use the dropdown menus to select the node population and the [node set](https://sonata-extension.readthedocs.io/en/latest/sonata_nodeset.html) whose composition you want to display.

Then, from the list of properties that follows, select between two and eight properties to display.

For details, see the [README](README.md).

## Circuit selection and download

The SONATA circuit under `./analysis_circuit/circuit_config.json` will be analyzed. To place the data there, first select
the circuit, then download it.

Should the circuit of interest already be placed at that location, you can skip ahead to the section `Circuit analysis` below.
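
Whether the download can be skipped can also be checked programmatically. The following is a minimal sketch, assuming the notebook runs from the directory that contains `analysis_circuit`; the helper name `circuit_is_present` is hypothetical and not part of any OBI package.

```python
from pathlib import Path


def circuit_is_present(base_dir="."):
    """Return True if a circuit config already exists at the expected location."""
    # This is the path that the analysis section of the notebook reads.
    return (Path(base_dir) / "analysis_circuit" / "circuit_config.json").is_file()


if circuit_is_present():
    print("Circuit found: skip ahead to 'Circuit analysis'.")
else:
    print("No circuit found: run the selection and download cells first.")
```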

#### Project selection
First, select a project that you have access to and that the circuit is associated with. If the circuit of interest is part of the public OBI assets, any project can be selected.

In [None]:
from entitysdk import Client, ProjectContext, models
from obi_auth import get_token
import os
import time

from obi_notebook import get_projects
from obi_notebook import get_entities

token = get_token(environment="production", auth_mode="daf")
project_context = get_projects.get_projects(token)

#### Circuit selection

Next, we select the circuit. If you already know the unique identifier of the circuit of interest, paste it below into line 4 of the next cell.

Otherwise, a widget for circuit selection will be created that allows you to simply mark the circuit of interest.
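
The choice between the two routes comes down to whether the placeholder was replaced. Below is a sketch of that logic in isolation. `pick_circuit_ids` and the `select_interactively` callback are hypothetical stand-ins for the widget call, not part of `obi_notebook`, and the real widget fills its list only after a selection has been made.

```python
PLACEHOLDER = "<CIRCUIT-ID>"


def pick_circuit_ids(entity_id, select_interactively):
    """Return a one-element list with the given ID, or defer to a selection callback."""
    entity_id = entity_id.strip()
    if entity_id and entity_id != PLACEHOLDER:
        # An explicit ID was pasted in: use it directly.
        return [entity_id]
    # Otherwise fall back to the interactive selection (a widget in the notebook).
    return select_interactively()


# Usage with a dummy callback in place of the selection widget.
print(pick_circuit_ids("<CIRCUIT-ID>", lambda: ["id-chosen-in-widget"]))
print(pick_circuit_ids("my-circuit-id", lambda: ["id-chosen-in-widget"]))
```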

In [None]:
client = Client(environment="production", project_context=project_context, token_manager=token)

# Optional: Download using unique ID
entity_ID = "<CIRCUIT-ID>"  # <<< FILL IN UNIQUE CIRCUIT ID HERE


if entity_ID != "<CIRCUIT-ID>":
    circuit_ids = [entity_ID]
else:
    # Alternative: Select from a table of entities
    circuit_ids = []
    circuit_ids = get_entities.get_entities("circuit", token, circuit_ids,
                                            project_context=project_context,
                                            multi_select=False,
                                            default_scale="small")

#### Fetch circuit
The circuit is copied to the local system at the expected location.
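
The download cell checks that neither the download folder nor the target folder exists, then renames the downloaded folder. Below is a sketch of the same check-then-rename pattern with explicit exceptions instead of `assert`, since assertions are skipped when Python runs with `-O`. `ensure_free` and `move_into_place` are hypothetical helpers, and the download itself is not shown.

```python
import os


def ensure_free(*paths):
    """Raise FileExistsError if any of the given paths already exists."""
    for path in paths:
        if os.path.exists(path):
            raise FileExistsError(f"'{path}' already exists; delete it or choose another path.")


def move_into_place(downloaded_dir, target_dir):
    """Rename the downloaded folder to the folder the analysis expects."""
    ensure_free(target_dir)
    os.rename(downloaded_dir, target_dir)
    return target_dir
```

In the notebook, `ensure_free(asset_dir, circuit_dir)` would run before the download and `move_into_place(asset_dir, circuit_dir)` after it.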

In [None]:
# Fetch circuit
fetched = client.get_entity(entity_id=circuit_ids[0], entity_type=models.Circuit)
print(f"Circuit fetched: {fetched.name} (ID {fetched.id})\n")
print(f"#Neurons: {fetched.number_neurons}, #Synapses: {fetched.number_synapses}, #Connections: {fetched.number_connections}\n")
print(f"{fetched.description}\n")

# Download SONATA circuit files
asset = [asset for asset in fetched.assets if asset.label=="sonata_circuit"][0]
asset_dir = asset.path 
circuit_dir = "analysis_circuit"
assert not os.path.exists(asset_dir), f"ERROR: Circuit download folder '{asset_dir}' already exists! Please delete folder."
assert not os.path.exists(circuit_dir), f"ERROR: Circuit folder '{circuit_dir}' already exists! Delete folder or choose a different path."

t0 = time.time()
client.download_directory(
    entity_id=fetched.id,
    entity_type=models.Circuit,
    asset_id=asset.id,
    output_path=".",
    max_concurrent=4,  # Parallel file download
)
t = time.time() - t0
print(f"Circuit files downloaded to '{asset_dir}' in {t:.1f}s")
os.rename(asset_dir, circuit_dir)
print(f"'{asset_dir}' folder renamed to '{circuit_dir}'")

## Circuit analysis

In [None]:
import os  # Imported again here in case the download cells above were skipped

import bluepysnap as snap
import pandas

from ipywidgets import widgets
import plotly.graph_objects as go

# Path to existing circuit config
circuit_config = "./analysis_circuit/circuit_config.json"
assert os.path.exists(circuit_config), f"ERROR: Circuit config '{os.path.split(circuit_config)[1]}' not found!"

circ = snap.Circuit(circuit_config)

nodepop = widgets.Dropdown(
    options=list(circ.nodes.keys()),
    description="Node population",
)
nodeset = widgets.Dropdown(
    options=list(circ.node_sets.content.keys()),
    description="Node set",
)

#### Selection of node population and node set

Please select one of the node populations and node sets defined in the circuit model from the following menu.
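
The node set dropdown is filled with the keys of the circuit's node sets definition. In SONATA this is a JSON object whose entries are either property filters or lists that reference other node sets. Below is a sketch with made-up content; the names and values are illustrative and not taken from any particular circuit.

```python
import json

# Illustrative node sets content; all names and values are made up.
node_sets_json = """
{
    "Layer1": {"layer": "1"},
    "Excitatory": {"synapse_class": "EXC"},
    "Layer1_and_Excitatory_sets": ["Layer1", "Excitatory"]
}
"""

node_sets = json.loads(node_sets_json)

# Basic sets filter on node properties; compound sets list other node set names.
basic = sorted(name for name, rule in node_sets.items() if isinstance(rule, dict))
compound = sorted(name for name, rule in node_sets.items() if isinstance(rule, list))

print("Basic:", basic)
print("Compound:", compound)
```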

In [None]:
display(nodepop)
display(nodeset)

#### Selection of properties to display
Please select *between two and eight* properties from the following list of categorical properties defined in the circuit model.
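
A property is offered for selection if its column has a categorical dtype or holds at most 25 distinct values. The second criterion can be sketched without pandas. The table below is made-up data and `has_few_values` is a hypothetical helper.

```python
MAX_NUM_UNIQUE_VALS = 25  # Same threshold as in the notebook cell


def has_few_values(values, max_unique=MAX_NUM_UNIQUE_VALS):
    """Return True if the column holds at most `max_unique` distinct values."""
    return len(set(values)) <= max_unique


# Made-up node properties: one list of values per property, 40 neurons each.
columns = {
    "layer": ["1", "2", "2", "3"] * 10,
    "synapse_class": ["EXC", "INH"] * 20,
    "x": [float(i) for i in range(40)],  # Continuous coordinate: 40 distinct values
}

valid_props = [name for name, values in columns.items() if has_few_values(values)]
print(valid_props)
```

The continuous coordinate `x` is rejected because it has more distinct values than the threshold allows, which is why numerical properties would need binning before they could be displayed.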

In [None]:
# Get dataframe of all properties and their values
val_df = circ.nodes[nodepop.value].get(nodeset.value)

# This type of display only works for categorical properties. In the future, numerical properties could be binned...
max_num_unique_vals = 25
is_categorical = val_df.dtypes.apply(lambda _x: isinstance(_x, pandas.CategoricalDtype))
has_few_vals = val_df.apply(lambda _x: len(_x.drop_duplicates()) <= max_num_unique_vals, axis=0)
valid_props = is_categorical[is_categorical | has_few_vals].index.values

# Preselect at most 8 properties; 8 is an arbitrarily chosen maximum.
to_display = widgets.SelectMultiple(options=valid_props,
                                    index=tuple(range(len(valid_props)))[:8],
                                    description="Properties")

display(to_display)

In [None]:
# Test of user selection
assert len(to_display.value) >= 2, "Please select AT LEAST 2 properties"
assert len(to_display.value) <= 8, "Please select AT MOST 8 properties"
# Dataframe of only the selected properties
use_df = val_df[list(to_display.value)].apply(pandas.Categorical, axis=0)

# Create a dataframe for a lookup of every possible (categorical) value of the selected properties to a unique index.
# Index: level 0: Name of the property, level 1: value of the property; values: unique index.
label_idx_lo = pandas.concat([pandas.Series(use_df[col].values.categories.values, name="value")
                              for col in use_df.columns], keys=use_df.columns,
                              names=["column"], axis=0).reset_index(level="column")
label_idx_lo["index"] = range(len(label_idx_lo))
label_idx_lo = label_idx_lo.set_index(["column", "value"])["index"]

# The sankey links are built by iterating over pairs of adjacent columns.
lnk_src, lnk_tgt, lnk_sz = [], [], []

for c1, c2 in zip(use_df.columns[:-1], use_df.columns[1:]):
    # Size of a link: number of neurons that have this combination of values in c1 and c2.
    counts = use_df[[c1, c2]].value_counts()
    for row_idx, row_val in counts.items():
        lnk_src.append(label_idx_lo[c1][row_idx[0]])
        lnk_tgt.append(label_idx_lo[c2][row_idx[1]])
        lnk_sz.append(row_val)

# Create sankey
fig = go.Figure(data=[go.Sankey(
    node=dict(
        pad=15,
        thickness=20,
        line=dict(color="black", width=0.5),
        label=label_idx_lo.index.to_frame()["value"],
        color="blue",
    ),
    link=dict(
        source=lnk_src,  # Indices into the node labels above
        target=lnk_tgt,
        value=lnk_sz,
    ),
)])

fig.update_layout(title_text=f"Composition: {nodepop.value}/{nodeset.value}", font_size=10)
fig.show()
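
The link construction above can be traced on a small made-up table. The sketch below uses only the standard library. Each (property, value) pair gets one node index, and each pair of adjacent properties contributes one link per observed value combination, weighted by the number of neurons. The data and the helper name `sankey_links` are illustrative and not part of the notebook.

```python
from collections import Counter


def sankey_links(rows, columns):
    """Build Sankey node labels and links from rows of categorical values.

    rows: list of dicts mapping property name -> value (one dict per neuron).
    columns: ordered property names; links connect adjacent properties.
    """
    # One node per (property, value) pair, numbered in column order.
    node_index = {}
    for col in columns:
        for value in sorted({row[col] for row in rows}):
            node_index[(col, value)] = len(node_index)

    sources, targets, sizes = [], [], []
    for c1, c2 in zip(columns[:-1], columns[1:]):
        # Link size: number of neurons with this combination of values.
        counts = Counter((row[c1], row[c2]) for row in rows)
        for (v1, v2), n in sorted(counts.items()):
            sources.append(node_index[(c1, v1)])
            targets.append(node_index[(c2, v2)])
            sizes.append(n)

    labels = [value for (_, value) in node_index]
    return labels, sources, targets, sizes


# Made-up example: four neurons with two categorical properties.
rows = [
    {"layer": "L1", "synapse_class": "INH"},
    {"layer": "L2", "synapse_class": "EXC"},
    {"layer": "L2", "synapse_class": "EXC"},
    {"layer": "L2", "synapse_class": "INH"},
]
labels, sources, targets, sizes = sankey_links(rows, ["layer", "synapse_class"])
print(labels)
print(list(zip(sources, targets, sizes)))
```

Because every neuron contributes to exactly one link per pair of adjacent properties, the link sizes between two adjacent properties always sum to the number of neurons in the selected node set.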