# Node sets

## Preamble
The code in this section is identical to the code in the "Introduction" and "Loading" sections of the previous tutorial. It assumes that you have already downloaded the circuit. If not, take a look at the notebook **01_circuits** (Downloading a circuit).

In [1]:
import bluepysnap
import pandas as pd

circuit_path = "sonata/circuit_sonata.json"
circuit = bluepysnap.Circuit(circuit_path)

## Introduction
As briefly mentioned in the [node properties notebook](./03_node_properties.ipynb), node sets are a predetermined collection of named queries for nodes. They are saved in a JSON file, which is usually added to the circuit and/or simulation config. For a more in-depth explanation, please see: [SONATA Node Sets - Circuit Documentation](https://sonata-extension.readthedocs.io/en/latest/sonata_nodeset.html).
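To make the "JSON file of named queries" idea concrete, here is a minimal sketch that builds a tiny node sets definition as a plain Python dict and serializes it with the standard `json` module. The entries are copied from this circuit's node sets; the variable names are only illustrative, and nothing here uses SNAP itself.

```python
import json

# A few entries taken from this circuit's node sets:
# a dict value is a property query, a list value refers to other node sets by name.
example_node_sets = {
    "All": ["thalamus_neurons"],
    "thalamus_neurons": {"population": "thalamus_neurons"},
    "Excitatory": {"synapse_class": "EXC"},
}

# This string is what the content of a node sets JSON file would look like.
example_json = json.dumps(example_node_sets, indent=2)
print(example_json)

# Reading the JSON back yields the same mapping of names to queries.
round_trip = json.loads(example_json)
```

Each top-level key is a node set name, and each value is the query that the name stands for.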

We can access the node sets directly in SNAP if they are added to the circuit config:

In [2]:
circuit.node_sets.content

{'Mosaic': ['All'],
 'All': ['thalamus_neurons'],
 'thalamus_neurons': {'population': 'thalamus_neurons'},
 'Excitatory': {'synapse_class': 'EXC'},
 'Inhibitory': {'synapse_class': 'INH'},
 'Rt_RC': {'mtype': 'Rt_RC'},
 'VPL_IN': {'mtype': 'VPL_IN'},
 'VPL_TC': {'mtype': 'VPL_TC'},
 'bAC_IN': {'etype': 'bAC_IN'},
 'cAD_noscltb': {'etype': 'cAD_noscltb'},
 'cNAD_noscltb': {'etype': 'cNAD_noscltb'},
 'dAD_ltb': {'etype': 'dAD_ltb'},
 'dNAD_ltb': {'etype': 'dNAD_ltb'},
 'mc0;Rt': {'region': 'mc0;Rt'},
 'mc0;VPL': {'region': 'mc0;VPL'},
 'mc1;Rt': {'region': 'mc1;Rt'},
 'mc1;VPL': {'region': 'mc1;VPL'},
 'mc2;Rt': {'region': 'mc2;Rt'},
 'mc2;VPL': {'region': 'mc2;VPL'},
 'mc3;Rt': {'region': 'mc3;Rt'},
 'mc3;VPL': {'region': 'mc3;VPL'},
 'mc4;Rt': {'region': 'mc4;Rt'},
 'mc4;VPL': {'region': 'mc4;VPL'},
 'mc5;Rt': {'region': 'mc5;Rt'},
 'mc5;VPL': {'region': 'mc5;VPL'},
 'mc6;Rt': {'region': 'mc6;Rt'},
 'mc6;VPL': {'region': 'mc6;VPL'},
 'IN': {'mtype': {'$regex': '.*IN'}, 'region': {'$reg

To prove a point, let's query some ids using a node set and compare the result with that of an equivalent dict query:

In [3]:
node_set_result = circuit.nodes.ids('VPL_TC')
print(f'Ids found: {len(node_set_result)}')

query_result = circuit.nodes.ids({'mtype': 'VPL_TC'})
print(f'Queries result in the same outcome : {node_set_result == query_result}')

Ids found: 64856
Queries result in the same outcome : True


We'll go into querying in more detail later. For now, let's go over other aspects of node sets.

## Features / use cases
Sometimes, we may want to work with node sets that aren't found in a circuit or simulation config. Reasons for this include:
* experimenting
* being unable or unwilling to modify the config file
* combining node sets

First of all, let's see how we can open / create node sets.

### Opening a node set file
For demonstration purposes, let's open the circuit's node sets from a file:

In [4]:
node_sets_circuit = bluepysnap.node_sets.NodeSets.from_file('./sonata/networks/nodes/node_sets.json')
print(f"Contents match: {node_sets_circuit.content == circuit.node_sets.content}")

Contents match: True


### Creating node sets on the fly

For example, we may want to test node sets without having to write them to a file and load it over and over again. For this, we can create node sets directly from a dict:

In [5]:
node_sets_0_9 = bluepysnap.node_sets.NodeSets.from_dict({'nodes_0-9': {'node_id': [*range(10)]}})
node_sets_0_9.content

{'nodes_0-9': {'node_id': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}}

This may be handy if you can't modify the existing node sets file but want to augment it with additional node sets.

### Combining node sets
So now that we have two node sets objects, `node_sets_circuit` read from a file and `node_sets_0_9` created from a dict, let's combine them. Naturally, we could also open two node sets files and combine them.

For this purpose, node sets objects have a `NodeSets.update()` method. `update` takes another node sets object as an argument and adds all of its node sets to itself.

Let's update the `node_sets_circuit` with `node_sets_0_9`:

In [6]:
node_sets_circuit.update(node_sets_0_9)
node_sets_circuit.content

{'Mosaic': ['All'],
 'All': ['thalamus_neurons'],
 'thalamus_neurons': {'population': 'thalamus_neurons'},
 'Excitatory': {'synapse_class': 'EXC'},
 'Inhibitory': {'synapse_class': 'INH'},
 'Rt_RC': {'mtype': 'Rt_RC'},
 'VPL_IN': {'mtype': 'VPL_IN'},
 'VPL_TC': {'mtype': 'VPL_TC'},
 'bAC_IN': {'etype': 'bAC_IN'},
 'cAD_noscltb': {'etype': 'cAD_noscltb'},
 'cNAD_noscltb': {'etype': 'cNAD_noscltb'},
 'dAD_ltb': {'etype': 'dAD_ltb'},
 'dNAD_ltb': {'etype': 'dNAD_ltb'},
 'mc0;Rt': {'region': 'mc0;Rt'},
 'mc0;VPL': {'region': 'mc0;VPL'},
 'mc1;Rt': {'region': 'mc1;Rt'},
 'mc1;VPL': {'region': 'mc1;VPL'},
 'mc2;Rt': {'region': 'mc2;Rt'},
 'mc2;VPL': {'region': 'mc2;VPL'},
 'mc3;Rt': {'region': 'mc3;Rt'},
 'mc3;VPL': {'region': 'mc3;VPL'},
 'mc4;Rt': {'region': 'mc4;Rt'},
 'mc4;VPL': {'region': 'mc4;VPL'},
 'mc5;Rt': {'region': 'mc5;Rt'},
 'mc5;VPL': {'region': 'mc5;VPL'},
 'mc6;Rt': {'region': 'mc6;Rt'},
 'mc6;VPL': {'region': 'mc6;VPL'},
 'IN': {'mtype': {'$regex': '.*IN'}, 'region': {'$reg

As we can see, the node sets object now contains the newly created node set `nodes_0-9`.

**WARNING:** If the node sets object already contains node sets with the same names as those in the update, those node sets will be overwritten. The names of the overwritten node sets are returned by the `update` method:

In [7]:
# Let's overwrite 'nodes_0-9'
fake_0_9_node_set = bluepysnap.node_sets.NodeSets.from_dict({'nodes_0-9': {'node_id': [1]}})
overwritten = node_sets_circuit.update(fake_0_9_node_set)
print(f'Overwritten node sets: {overwritten}')
print(f'content["nodes_0-9"]: {node_sets_circuit.content["nodes_0-9"]}')

Overwritten node sets: {'nodes_0-9'}
content["nodes_0-9"]: {'node_id': [1]}
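The overwrite behaviour can be pictured with plain dicts. The sketch below is only an analogy, not SNAP's implementation: `update_with_report` is a hypothetical helper that merges one dict into another and reports which names were replaced, mirroring what we observed above.

```python
def update_with_report(target: dict, other: dict) -> set:
    """Merge `other` into `target` in place and return the overwritten names.

    Hypothetical helper for illustration only; not part of bluepysnap.
    """
    # Names present in both mappings are the ones that get replaced.
    overwritten = set(target) & set(other)
    target.update(other)
    return overwritten


existing = {
    "VPL_TC": {"mtype": "VPL_TC"},
    "nodes_0-9": {"node_id": list(range(10))},
}
replacement = {"nodes_0-9": {"node_id": [1]}}

overwritten_names = update_with_report(existing, replacement)
print(f"Overwritten node sets: {overwritten_names}")
print(f"nodes_0-9 is now: {existing['nodes_0-9']}")
```

If overwriting is not desired, the same set intersection could be computed on the `content` dicts of two node sets objects before calling `update`, to check for name clashes in advance.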


### Compound node sets
Compound node sets are node sets that are composed of other node sets. Let's create a node sets object with one compound node set:

In [8]:
node_sets_compound = bluepysnap.node_sets.NodeSets.from_dict({
    'nodes_0-4': {'node_id': [*range(5)]},
    'nodes_5-9': {'node_id': [*range(5,10)]},
    'nodes_0-9': ['nodes_0-4', 'nodes_5-9'], # Compound node set with node set names in a list results in OR case
})
node_sets_compound.content

{'nodes_0-4': {'node_id': [0, 1, 2, 3, 4]},
 'nodes_5-9': {'node_id': [5, 6, 7, 8, 9]},
 'nodes_0-9': ['nodes_0-4', 'nodes_5-9']}

**NOTE:** Compound node sets always represent "OR" instead of "AND". I.e., the queries return results belonging to any of the node sets listed in a compound node set.
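To illustrate the "OR" semantics, here is a small sketch in plain Python that resolves a compound node set into the union of its members' ids. It is a simplification under stated assumptions: `resolve_ids` is a hypothetical helper, it only understands `node_id` queries, and it is not how SNAP resolves node sets internally.

```python
def resolve_ids(name: str, node_sets: dict) -> set:
    """Return the node ids matched by `name`, following compound references.

    Hypothetical helper for illustration; only handles 'node_id' queries.
    """
    entry = node_sets[name]
    if isinstance(entry, list):
        # Compound node set: union ("OR") of all referenced node sets.
        ids = set()
        for member in entry:
            ids |= resolve_ids(member, node_sets)
        return ids
    # Basic node set: here, simply the listed node ids.
    return set(entry["node_id"])


example = {
    "nodes_0-4": {"node_id": list(range(5))},
    "nodes_5-9": {"node_id": list(range(5, 10))},
    "nodes_0-9": ["nodes_0-4", "nodes_5-9"],
}

resolved = resolve_ids("nodes_0-9", example)
print(sorted(resolved))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

An "AND" interpretation would instead intersect the member sets, which for these two disjoint ranges would give an empty result.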

### Referring to a node set in a `NodeSets` object
A `NodeSets` object works much like a `dict`: to refer to a specific node set, use the same syntax as with a `dict`:

In [9]:
node_sets_circuit['VPL_TC']

<bluepysnap.node_sets.NodeSet at 0x7fffa4493580>

Above, we got a `NodeSet` (not `NodeSets`!) object, i.e., one instance of a node set. For our purposes, we don't really have to know what it is, as long as we know how to access it. This will come in handy when querying.

## Conclusion
In this notebook, we took a deeper look at node sets. In the next notebook of the series, we will learn how to query nodes in SNAP with and without node sets.