# Feature Extractor
We can use the feature extractor to get morphological data for features of interest. You can find the collection of features in the Allen Institute dataset [here](https://neuron-morphology.readthedocs.io/en/latest/autoapi/neuron_morphology/features/index.html).

You can find the steps used to load the .swc files used in this notebook [here](https://github.com/tb-harris/neuroscience-2024/blob/main/tools/Feature_Extractor_and_Reconstructions_Setup.ipynb).

This is based on code adapted from Curt and the Allen Institute.

## Setup

In [4]:
from neuron_morphology.swc_io import morphology_from_swc
from neuron_morphology.feature_extractor.data import Data
from neuron_morphology.feature_extractor.feature_extractor import FeatureExtractor
from neuron_morphology.features.default_features import default_features
from neuron_morphology.constants import (
    SOMA, AXON, BASAL_DENDRITE, APICAL_DENDRITE
)
import numpy as np
import neuron_morphology.feature_extractor.feature_writer as fw

import pandas as pd

The installed NumPy version should be 1.23:

In [6]:
np.__version__

'1.23.0'
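
If you would rather have an explicit guard than read the version by eye, a minimal sketch along these lines could work. The helper name `has_expected_version` is hypothetical, and it assumes a plain `major.minor.patch` version string.

```python
def has_expected_version(version: str, major: int = 1, minor: int = 23) -> bool:
    """Return True if `version` starts with the expected major.minor pair."""
    parts = version.split(".")
    return int(parts[0]) == major and int(parts[1]) == minor


# Example with literal strings; in the notebook you would pass np.__version__
print(has_expected_version("1.23.0"))
print(has_expected_version("1.26.4"))
```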

In [5]:
manifest = pd.read_csv('2021-09-13_mouse_file_manifest.csv')
metadata = pd.read_csv('20200711_patchseq_metadata_mouse.csv', index_col="cell_specimen_id")

In [7]:
reconstructions_manifest = manifest.loc[
    (manifest["file_type"] == "transformed_swc")
]


In [8]:
reconstructions_manifest['archive_uri']

2        ftp://download.brainlib.org:8811/biccn/zeng/ps...
9        ftp://download.brainlib.org:8811/biccn/zeng/ps...
16       ftp://download.brainlib.org:8811/biccn/zeng/ps...
23       ftp://download.brainlib.org:8811/biccn/zeng/ps...
30       ftp://download.brainlib.org:8811/biccn/zeng/ps...
                               ...                        
19314    ftp://download.brainlib.org:8811/biccn/zeng/ps...
19361    ftp://download.brainlib.org:8811/biccn/zeng/ps...
19368    ftp://download.brainlib.org:8811/biccn/zeng/ps...
19375    ftp://download.brainlib.org:8811/biccn/zeng/ps...
19410    ftp://download.brainlib.org:8811/biccn/zeng/ps...
Name: archive_uri, Length: 573, dtype: object

## Step 0: Download morphologies

If you don't have the morphologies downloaded yet, uncomment and run the code below to download them to a folder called `reconstructions/`:

In [9]:
'''
import os
import urllib.request

# Create the download folder once, before looping over the manifest
os.makedirs("./reconstructions/", exist_ok=True)

for index, swc_urls in reconstructions_manifest.iterrows():
    target_path = "./reconstructions/" + swc_urls.loc["file_name"]
    # Skip files that have already been downloaded
    if not os.path.exists(target_path):
        print("Downloading " + swc_urls.loc["file_name"])
        urllib.request.urlretrieve(swc_urls.loc["archive_uri"], target_path)
'''

## Step 1: Load morphologies

Get the morphology of each cell (takes some time). If you want to focus only on a specific group of cells, subset here.

In [None]:
# Apply the morphology_from_swc() function to each file
morphologies = reconstructions_manifest["file_name"].apply(lambda name : morphology_from_swc("reconstructions/" + name))
morphologies.index = reconstructions_manifest["cell_specimen_id"].astype(int)
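
One way to focus on a specific group of cells is to filter the manifest by metadata before loading. The sketch below uses toy stand-ins for `metadata` and `reconstructions_manifest`; the `cell_type` column and its values are hypothetical, so substitute a real column from your metadata file.

```python
import pandas as pd

# Toy stand-ins for the notebook's tables (values are made up for illustration)
metadata = pd.DataFrame(
    {"cell_type": ["A", "B", "A"]},  # hypothetical column
    index=pd.Index([101, 102, 103], name="cell_specimen_id"),
)
reconstructions_manifest = pd.DataFrame({
    "cell_specimen_id": [101, 102, 103],
    "file_name": ["101.swc", "102.swc", "103.swc"],
})

# IDs of the cells we care about
target_ids = metadata.index[metadata["cell_type"] == "A"]

# Keep only the manifest rows for those cells
subset_manifest = reconstructions_manifest[
    reconstructions_manifest["cell_specimen_id"].isin(target_ids)
]
print(subset_manifest["file_name"].tolist())
```

The filtered manifest can then be run through the same loading cell in place of `reconstructions_manifest`.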

## Step 2: Import and register our features
Find the relevant feature(s) in the [documentation](https://neuron-morphology.readthedocs.io/en/latest/autoapi/neuron_morphology/features/index.html) or by looking directly at the [code](https://github.com/AllenInstitute/neuron_morphology/tree/dev/neuron_morphology/features).

Import the relevant features, and then put them together in a list to register with the FeatureExtractor.

The example below is adapted from the [default features](https://github.com/AllenInstitute/neuron_morphology/blob/dev/neuron_morphology/features/default_features.py) set and takes a substantial amount of time to run, so I recommend using far fewer features.

`specialize()` and `nested_specialize()` are used to specify specializations:
 * `NEURITE_SPECIALIZATIONS` - AxonSpec, ApicalDendriteSpec, BasalDendriteSpec, DendriteSpec
 * `COORD_TYPE_SPECIALIZATIONS` - which points' coordinates are used (as far as the library source indicates: all nodes, bifurcations, compartments, or tips)

In [None]:
# Features - Change the features of interest here
from neuron_morphology.features.dimension import dimension
from neuron_morphology.features.intrinsic import (
    num_branches, num_tips, num_nodes, mean_fragmentation,
    max_branch_order
)
from neuron_morphology.features.branching.bifurcations import (
    num_outer_bifurcations, mean_bifurcation_angle_local, mean_bifurcation_angle_remote
)
from neuron_morphology.features.size import (
    total_length, total_surface_area, total_volume, mean_diameter,
    mean_parent_daughter_ratio, max_euclidean_distance
)
from neuron_morphology.features.path import (
    max_path_distance, early_branch_path, mean_contraction
)
from neuron_morphology.features.statistics.overlap import overlap
from neuron_morphology.features.statistics.moments import moments

from neuron_morphology.features.layer.layer_histogram import (
    earth_movers_distance, normalized_depth_histogram)

# ----------------------------

# Other imports
from neuron_morphology.constants import (
    AXON, BASAL_DENDRITE, APICAL_DENDRITE
)

# Feature Extractor, marks, and specializations
from neuron_morphology.feature_extractor.data import Data
from neuron_morphology.feature_extractor.marked_feature import (
    marked, specialize, nested_specialize
)
from neuron_morphology.feature_extractor.mark import (
    RequiresLayerAnnotations, Intrinsic, Geometric, AllNeuriteTypes,
    RequiresSoma)
from neuron_morphology.feature_extractor.feature_specialization import (
    NEURITE_SPECIALIZATIONS, NEURITE_COMPARISON_SPECIALIZATIONS,
    AxonSpec, ApicalDendriteSpec, BasalDendriteSpec, DendriteSpec,
    AxonCompareSpec, ApicalDendriteCompareSpec,
    BasalDendriteCompareSpec, DendriteCompareSpec, AllNeuriteSpec
)
from neuron_morphology.features.statistics.coordinates import COORD_TYPE_SPECIALIZATIONS
# ----------------------------


# Adapted from default features; mean_fragmentation removed due to division-by-zero errors
features_to_calculate = [
    nested_specialize(
            dimension,
            [COORD_TYPE_SPECIALIZATIONS, NEURITE_SPECIALIZATIONS]),
    specialize(num_nodes, NEURITE_SPECIALIZATIONS),
    specialize(num_branches, NEURITE_SPECIALIZATIONS),
    specialize(num_tips, NEURITE_SPECIALIZATIONS),
    specialize(max_branch_order, NEURITE_SPECIALIZATIONS),
    specialize(num_outer_bifurcations, NEURITE_SPECIALIZATIONS),
    specialize(mean_bifurcation_angle_local, NEURITE_SPECIALIZATIONS),
    specialize(mean_bifurcation_angle_remote, NEURITE_SPECIALIZATIONS),
    specialize(total_length, NEURITE_SPECIALIZATIONS),
    specialize(total_surface_area, NEURITE_SPECIALIZATIONS),
    specialize(total_volume, NEURITE_SPECIALIZATIONS),
    specialize(mean_diameter, NEURITE_SPECIALIZATIONS),
    specialize(mean_parent_daughter_ratio, NEURITE_SPECIALIZATIONS),
    specialize(max_euclidean_distance, NEURITE_SPECIALIZATIONS),
    max_path_distance,
    early_branch_path,
    mean_contraction,
    nested_specialize(
            overlap,
            [{AxonSpec, ApicalDendriteSpec, BasalDendriteSpec, DendriteSpec},
             {AxonCompareSpec, ApicalDendriteCompareSpec,
              BasalDendriteCompareSpec,
              DendriteCompareSpec}]),
    nested_specialize(
            moments,
            [COORD_TYPE_SPECIALIZATIONS, NEURITE_SPECIALIZATIONS]),
    specialize(normalized_depth_histogram, NEURITE_SPECIALIZATIONS),
    nested_specialize(
        earth_movers_distance, 
        [
            {AxonSpec, ApicalDendriteSpec, BasalDendriteSpec, DendriteSpec},
            {
                AxonCompareSpec, ApicalDendriteCompareSpec,
                BasalDendriteCompareSpec,
                DendriteCompareSpec
            },
        ]
    )

]


# Create a new feature extractor
fe = FeatureExtractor()
# Register our target features
fe.register_features(features_to_calculate)

<neuron_morphology.feature_extractor.feature_extractor.FeatureExtractor at 0x44b1ec280>

## Step 3: Extract features

Function to extract features for a single neuron:

In [200]:
from neuron_morphology.feature_extractor.utilities import unnest
from neuron_morphology.feature_extractor.data import Data

# Extract the features from a single neuron morphology object
def extract_features(neuron_morphology):
    data = Data(neuron_morphology)
    try:
        feature_extraction_run = fe.extract(data, required_marks=frozenset())
        results = feature_extraction_run.results
    except Exception as e:
        # Report the failure and return an empty dict for this cell
        print(f"Error occurred while extracting features: {e}")
        return dict()

    return unnest(results)

Create a data frame by running the `extract_features()` function on each neuron morphology (takes some time). To try it out first, set `test = True` so that features are calculated for only the first 3 neurons.

In [221]:
test = False

In [None]:
morphologies_target = morphologies

if test:
    morphologies_target = morphologies_target[:3]


features = pd.DataFrame(
    (extract_features(neuron) for neuron in morphologies_target.values),
    index=morphologies_target.index
)


Double-check that your `features` data frame has values for each of your target features across all the cells.
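
A minimal sketch of such a check, using a toy stand-in for `features` (the column names and values are made up). It relies on `extract_features()` returning an empty dict when extraction fails, which pandas turns into a row of missing values.

```python
import pandas as pd

# Toy stand-in: the second cell mimics a failed extraction (empty dict)
rows = [
    {"feature_a": 1.0, "feature_b": 2.0},
    {},
    {"feature_a": 3.0},
]
features = pd.DataFrame(rows, index=[101, 102, 103])

# Cells where every feature is missing, i.e. extraction failed entirely
failed_cells = features.index[features.isna().all(axis=1)].tolist()

# Number of cells missing each feature
missing_per_feature = features.isna().sum()

print(failed_cells)
print(missing_per_feature)
```

Cells listed in `failed_cells` are worth re-running individually to see the error message, while a feature with many missing values may simply not apply to those cells.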

## Step 4: Save features

In [None]:
features.to_csv(
    'features.csv', # File name
    index_label='cell_specimen_id'
)
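
To use the saved features later, the file can be read back with `cell_specimen_id` as the index and joined to the metadata. The sketch below round-trips a toy table through an in-memory buffer rather than `features.csv`; the feature and metadata columns are made up for illustration.

```python
import io

import pandas as pd

# Toy stand-ins for the notebook's tables
features = pd.DataFrame({"feature_a": [1.0, 3.0]}, index=[101, 103])
metadata = pd.DataFrame(
    {"cell_type": ["A", "B", "A"]},  # hypothetical column
    index=pd.Index([101, 102, 103], name="cell_specimen_id"),
)

# Write the features the same way as above, but to an in-memory buffer
buffer = io.StringIO()
features.to_csv(buffer, index_label="cell_specimen_id")
buffer.seek(0)

# Read them back with the cell ID as the index, then attach the metadata
loaded = pd.read_csv(buffer, index_col="cell_specimen_id")
combined = loaded.join(metadata, how="left")
print(combined)
```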