# Data from: https://brainmapportal-live-4cc80a57cd6e400d854-f7fdcae.divio-media.net/filer_public/94/2b/942bdfbc-89cb-4414-9eda-a348fadef841/mouse_patch-seq_demo.html


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


import sys

sys.path.append("../src")
sys.path.append("../downloads")

Metadata file
The metadata file has information about each Patch-seq cell, including its different identifiers and assigned cell types.

In [None]:
metadata = pd.read_csv("../downloads/20200711_patchseq_metadata_mouse.csv")

metadata.head()

File manifest
The file manifest contains URLs for the different data files associated with the Patch-seq cells that are located in various archives.

In [None]:
# Changed: the file manifest is actually an Excel file, so read it with read_excel
file_manifest = pd.read_excel("../downloads/2021-09-13_mouse_file_manifest.xlsx")
file_manifest.head()

Transcriptomic data
Here, we have already downloaded the (large) transcriptomic expression data file. In that file, expression is quantified as counts per million (CPM); each row is a gene and each column is a cell.

In [None]:
gene_data = pd.read_csv(
    "../downloads/20200513_Mouse_PatchSeq_Release_cpm.v2.csv",
    index_col=0,
)

In [None]:
gene_data

Let's create a 2D projection of the data using the transcriptomic data to see the structure of the different subclasses and types. We will transform the data, then use the UMAP algorithm to create a nonlinear embedding of a subset of the differentially expressed genes.

The select_markers.csv file was not available, so it needs to be generated. I am trying this method:
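
As a rough illustration of what the generated file has to look like, here is a minimal sketch that cleans a candidate gene list into a single "Gene" column of unique names that all exist in the expression matrix. The function name `clean_marker_list` and the toy data are assumptions made for this example; they are not part of the original demo, and how the candidate genes are chosen in the first place is left open.

```python
import pandas as pd


def clean_marker_list(candidates, available_genes):
    """Return a one-column DataFrame ("Gene") of unique genes found in the data.

    Hypothetical helper for illustration only.
    """
    # Normalize the candidate names: drop missing values and stray whitespace
    genes = pd.Series(list(candidates), dtype="object").dropna().astype(str).str.strip()
    genes = genes[genes != ""]
    # Keep the first occurrence of each gene, preserving order
    genes = genes.drop_duplicates()
    # Keep only genes that are rows of the expression matrix
    genes = genes[genes.isin(available_genes)]
    return pd.DataFrame({"Gene": genes.to_numpy()})


# Toy stand-ins for gene_data and a raw candidate list (assumed values)
toy_gene_data = pd.DataFrame(
    {"cell_a": [1.0, 0.0, 5.0], "cell_b": [2.0, 3.0, 0.0]},
    index=["Npy", "Lamp5", "Sst"],
)
toy_candidates = ["Npy", "Lamp5", "Npy", " Sst ", "NotAGene"]

markers = clean_marker_list(toy_candidates, toy_gene_data.index)
print(markers)

# With the real data, the result could be written out like this:
# markers.to_csv("../downloads/select_markers.csv")
```

Writing the file with the default index means that reading it back with `index_col=0` leaves a single "Gene" column, which is what the next cell expects.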



In [None]:
import umap

marker_genes_for_umap = pd.read_csv("../downloads/select_markers.csv", index_col=0)

# select_markers must be a single column of unique gene names that all exist in gene_data (see generate_select markers py)
embedding = umap.UMAP(n_neighbors=25).fit_transform(
    np.log2(gene_data.loc[marker_genes_for_umap["Gene"], :].values.T + 1)
)
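
To make the transform step concrete, the sketch below applies the same expression to a tiny made-up matrix. It shows that the marker rows are selected first, the matrix is transposed so that each row is a cell (the orientation that `fit_transform` expects for samples), and `log2(cpm + 1)` is applied element-wise. The toy values and names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Toy genes-by-cells CPM matrix (assumed values)
toy_cpm = pd.DataFrame(
    {"cell_a": [0.0, 3.0], "cell_b": [1.0, 7.0], "cell_c": [15.0, 0.0]},
    index=["gene_1", "gene_2"],
)
toy_markers = ["gene_1", "gene_2"]

# Select marker rows, transpose to cells-by-genes, then take log2(cpm + 1)
features = np.log2(toy_cpm.loc[toy_markers, :].values.T + 1)

print(features.shape)  # (3, 2): three cells, two marker genes
print(features)
```

Adding 1 before taking the logarithm keeps zero counts finite (they map to 0).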

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

In [None]:
plt.figure(figsize=(8, 8))
plt.scatter(*embedding.T, s=1, edgecolor="none")
sns.despine()

We can see where specific types fall in this embedding by using the metadata to identify specific cells.

In [None]:
# Identify the cells from a particular t-type (Lamp5 Plch2 Dock5)
my_ttype_metadata = metadata.loc[metadata["T-type Label"] == "Lamp5 Plch2 Dock5", :]

my_ttype_metadata

In [None]:
my_ttype_mask = gene_data.columns.isin(my_ttype_metadata["transcriptomics_sample_id"].tolist())

plt.figure(figsize=(8, 8))
plt.scatter(*embedding.T, s=1, edgecolor="none")
plt.scatter(*embedding[my_ttype_mask, :].T, s=2, edgecolor="none")
sns.despine()
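
The mask works because the rows of the embedding are in the same order as the columns of the expression matrix. The sketch below checks this on toy data and also lists sample IDs that appear in the metadata but not in the expression matrix, which would otherwise be dropped silently. All names and values here are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy expression matrix: 2 genes x 4 cells (assumed sample IDs)
toy_gene_data = pd.DataFrame(
    np.zeros((2, 4)), index=["g1", "g2"], columns=["s1", "s2", "s3", "s4"]
)
# Toy embedding with one row per column of toy_gene_data
toy_embedding = np.arange(8, dtype=float).reshape(4, 2)

# Toy metadata for one t-type; "s9" has no expression data
toy_metadata = pd.DataFrame({"transcriptomics_sample_id": ["s4", "s2", "s9"]})
wanted_ids = toy_metadata["transcriptomics_sample_id"].tolist()

# Boolean mask in column order, so it can index the embedding rows directly
mask = toy_gene_data.columns.isin(wanted_ids)
selected = toy_embedding[mask, :]

# Sample IDs that the mask cannot match
missing = set(wanted_ids) - set(toy_gene_data.columns)

print(mask)
print(selected)
print(missing)
```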

We can also look at how expression of a particular gene varies across types in this embedding.

In [None]:
plt.figure(figsize=(8, 8))
plt.scatter(
    *embedding.T,
    s=1,
    c=gene_data.loc["Npy", :].values,
    vmin=0,
    vmax=5e3,
    cmap="viridis",
    edgecolor="none"
)

Finding cells with electrophysiology and morphology

In [None]:
my_ttype_with_recon_metadata = metadata.loc[
    (metadata["T-type Label"] == "Lamp5 Plch2 Dock5") &
    (metadata["neuron_reconstruction_type"].isin(["full"])),
    :]

In [None]:
my_ttype_with_recon_metadata.iloc[0, :]

Electrophysiology
Let's get the NWB file that has the electrophysiology data for this cell and use the IPFX library to process it.

In [None]:
my_specimen_id = my_ttype_with_recon_metadata.iloc[0, :]["cell_specimen_id"]

nwb_urls = file_manifest.loc[
    (file_manifest["cell_specimen_id"] == float(my_specimen_id)) &
    (file_manifest["file_type"] == "nwb"),
    :
]

In [None]:
nwb_urls["archive_uri"].values[0]
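
Since the same lookup is needed again for the morphology file, it may be convenient to wrap it in a small function. This is only a sketch: `find_archive_uri` is a hypothetical helper, and the toy manifest reuses just the column names seen above with made-up rows and URLs.

```python
import pandas as pd


def find_archive_uri(manifest, specimen_id, file_type):
    """Return the first archive URI for a specimen and file type.

    Hypothetical helper for illustration only.
    """
    rows = manifest.loc[
        (manifest["cell_specimen_id"] == float(specimen_id))
        & (manifest["file_type"] == file_type)
    ]
    if rows.empty:
        # Fail loudly instead of raising an IndexError further down
        raise LookupError(f"no {file_type!r} file for specimen {specimen_id}")
    return rows["archive_uri"].iloc[0]


# Toy manifest with made-up IDs and URLs
toy_manifest = pd.DataFrame(
    {
        "cell_specimen_id": [111.0, 111.0, 222.0],
        "file_type": ["nwb", "transformed_swc", "nwb"],
        "archive_uri": [
            "https://example.org/111.nwb",
            "ftp://example.org/111_transformed.swc",
            "https://example.org/222.nwb",
        ],
    }
)

print(find_archive_uri(toy_manifest, 111, "nwb"))
print(find_archive_uri(toy_manifest, 111, "transformed_swc"))
```

The returned URL could then be passed to the download command instead of being pasted in by hand.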

In [None]:
!dandi download https://api.dandiarchive.org/api/assets/5a0d8719-3b7c-41f7-b235-3640d3f242e7/download/

In [None]:
nwb_path = nwb_urls["file_name"].values[0]
nwb_path

In [None]:
from ipfx.dataset.create import create_ephys_data_set
from ipfx.data_set_features import extract_data_set_features
from ipfx.utilities import drop_failed_sweeps

data_set = create_ephys_data_set(nwb_file=nwb_path)
drop_failed_sweeps(data_set)
cell_features, sweep_features, cell_record, sweep_records, _, _ = \
    extract_data_set_features(data_set, subthresh_min_amp=-100.0)

In [None]:
cell_features.keys()

In [None]:
cell_features["long_squares"].keys()

In [None]:
cell_features["long_squares"]["rheobase_sweep"]

In [None]:
cell_features["long_squares"]["rheobase_sweep"]["latency"]
# Keep the long-square features (a dict-like object, despite the name) for later inspection
df = cell_features["long_squares"]
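
The feature dictionaries are nested and mix single numbers with lists and sub-dictionaries, so it can help to pull out only the scalar entries at one level. The sketch below does this on a toy dictionary. Only the key names `long_squares`, `rheobase_sweep`, `latency`, and `sweep_number` come from the cells above; the `spikes` key, all values, and the helper `scalar_items` are assumptions for illustration.

```python
import numbers


def scalar_items(features):
    """Return only the numeric (non-bool) entries of a feature dictionary.

    Hypothetical helper for illustration only.
    """
    return {
        key: value
        for key, value in features.items()
        if isinstance(value, numbers.Number) and not isinstance(value, bool)
    }


# Toy stand-in for cell_features (assumed values)
toy_cell_features = {
    "long_squares": {
        "rheobase_sweep": {
            "latency": 0.05,
            "sweep_number": 42,
            "spikes": [{"threshold_v": -40.0}],
        }
    }
}

rheobase_scalars = scalar_items(toy_cell_features["long_squares"]["rheobase_sweep"])
print(rheobase_scalars)
```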

In [None]:
swp = data_set.sweep(cell_features["long_squares"]["rheobase_sweep"]["sweep_number"])

# Note: this overrides the rheobase sweep selected above, so sweep 10 is plotted instead
swp = data_set.sweep(10)

In [None]:
plt.figure(figsize=(10, 6))

plt.plot(swp.t, swp.v)
plt.xlabel("time (s)", fontsize=16)
plt.ylabel("membrane potential (mV)", fontsize=16)
sns.despine()

Morphology
Now we'll get the SWC file that has the morphological reconstruction of this cell and use the neuron_morphology library to process it.

In [None]:
swc_urls = file_manifest.loc[
    (file_manifest["cell_specimen_id"] == float(my_specimen_id)) &
    (file_manifest["file_type"] == "transformed_swc"),
    :
]

In [None]:
swc_urls["archive_uri"].values[0]

In [None]:
!wget ftp://download.brainlib.org:8811/biccn/zeng/pseq/morph/200526/645169930_transformed.swc
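
Before turning to the neuron_morphology library, it can be useful to look at the file directly. The sketch below does not use neuron_morphology; it reads an SWC file with pandas, relying on the standard SWC layout (comment lines starting with `#`, then whitespace-separated columns for node ID, type, x, y, z, radius, and parent ID, with -1 marking the root). The helper names and the toy reconstruction are assumptions for illustration, and the cable-length calculation is a simplification that sums every parent-child segment, including those leaving the soma.

```python
import io

import numpy as np
import pandas as pd

SWC_COLUMNS = ["id", "type", "x", "y", "z", "radius", "parent"]


def read_swc(path_or_buffer):
    """Read an SWC file into a DataFrame indexed by node ID."""
    return pd.read_csv(
        path_or_buffer,
        sep=r"\s+",
        comment="#",
        header=None,
        names=SWC_COLUMNS,
        index_col="id",
    )


def total_cable_length(nodes):
    """Sum the Euclidean length of every parent-child segment."""
    children = nodes[nodes["parent"] != -1]
    parent_xyz = nodes.loc[children["parent"], ["x", "y", "z"]].to_numpy()
    deltas = children[["x", "y", "z"]].to_numpy() - parent_xyz
    return float(np.linalg.norm(deltas, axis=1).sum())


# Toy reconstruction: a soma (type 1) with a two-segment dendrite (type 3)
toy_swc = io.StringIO(
    "# toy reconstruction\n"
    "1 1 0.0 0.0 0.0 5.0 -1\n"
    "2 3 3.0 4.0 0.0 1.0 1\n"
    "3 3 3.0 4.0 12.0 1.0 2\n"
)

nodes = read_swc(toy_swc)
print(nodes)
print(total_cable_length(nodes))  # 17.0 (segments of length 5 and 12)
```

With the downloaded file, `read_swc("645169930_transformed.swc")` should work the same way, assuming the file follows this layout.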