# Pre-Lab ELN Final Dataset Production

### Name: Kylie Joyce


### Date: 02/22/22


### PID: 730333803


### Jupyter Notebook Number: 5

## Purpose/Objective:
The objective of this notebook is to break down the very broad dataset provided by the Allen Brain Atlas into the four specific datasets we want to focus on: VIP inhibitory neuron data from males viewing familiar images, from females viewing familiar images, from males viewing novel images, and from females viewing novel images. This will tell us how many data points we are working with and give us an idea of how our Figure 1 panels may look, since Figure 1 focuses on the breakdown of our data into the final datasets.

## Protocol:
1. Import the AllenSDK data.
2. Filter and save sessions by sex (one list for males, one list for females).
3. Break down the data into excitatory and inhibitory neuron data.
4. Save all information from the inhibitory neuron trials: one list for males and one for females.
5. Restrict the data to VISp-only data points.
6. Further break down these lists into two lists each: one containing data from viewing novel images and one containing data from viewing familiar images.
7. Troubleshoot any syntax errors or other failures.
8. If we finish early and everything works, we will check whether the depth of the cells within V1 should be another variable (we need to see how many depths were imaged in each sex and how much data falls into each depth per sex).
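
The filtering steps in the protocol can be sketched on a small stand-in table. This is only an illustration under stated assumptions: the rows are invented, the helper name `select` is hypothetical, and the column names (`sex`, `full_genotype`, `targeted_structure`, `session_type`) are assumed to match the AllenSDK experiment table. Inhibitory here means the Sst and Vip genotypes used in the cells later in this notebook.

```python
import pandas as pd

# Genotypes and session types as they are written later in this notebook.
INHIBITORY = [
    "Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt",
    "Vip-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt",
]
EXCITATORY_EXAMPLE = "Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai94(TITL-GCaMP6s)/wt"
FAMILIAR = ["OPHYS_2_images_A_passive", "OPHYS_2_images_B_passive"]
NOVEL = ["OPHYS_5_images_A_passive", "OPHYS_5_images_B_passive"]

# Invented rows: a toy stand-in for the real experiment table.
toy = pd.DataFrame({
    "sex": ["M", "M", "F", "F", "M", "F"],
    "full_genotype": [INHIBITORY[1], INHIBITORY[0], INHIBITORY[1],
                      INHIBITORY[1], INHIBITORY[1], EXCITATORY_EXAMPLE],
    "targeted_structure": ["VISp", "VISp", "VISp", "VISl", "VISp", "VISp"],
    "session_type": [FAMILIAR[0], FAMILIAR[0], NOVEL[1],
                     FAMILIAR[0], NOVEL[1], FAMILIAR[0]],
})


def select(table, sex, session_types):
    """Return inhibitory VISp rows for one sex and one group of session types."""
    mask = (
        (table["sex"] == sex)
        & table["full_genotype"].isin(INHIBITORY)
        & (table["targeted_structure"] == "VISp")
        & table["session_type"].isin(session_types)
    )
    return table[mask]


# One count per final dataset (sex x familiar/novel).
counts = {
    (sex, label): len(select(toy, sex, sessions))
    for sex in ("M", "F")
    for label, sessions in (("familiar", FAMILIAR), ("novel", NOVEL))
}
print(counts)
```

Combining the conditions into one boolean mask keeps the result as a DataFrame, so column names stay available at every step.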

## Expected Results:
We expect to find a reasonable amount of data in each category listed in the objective above. Each count should be greater than 5 (which we consider quite a small n) but not outrageously large (such as over 1000 data points). Having these datasets broken down and defined gives us a much better understanding of what we are actually working with and will allow us to narrow or expand the scope of our research as needed.
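
Once the four counts are known, the size expectation can be checked in a few lines. The sketch below is illustrative only: the helper name `check_group_sizes` and the example counts are hypothetical, and the bounds of 5 and 1000 are simply the informal limits stated above.

```python
def check_group_sizes(counts, low=5, high=1000):
    """Label each group count as 'too small', 'ok', or 'too large'.

    low and high mirror the informal bounds in the text (more than 5,
    not over 1000); they are not statistical thresholds.
    """
    labels = {}
    for group, n in counts.items():
        if n <= low:
            labels[group] = "too small"
        elif n > high:
            labels[group] = "too large"
        else:
            labels[group] = "ok"
    return labels


# Hypothetical counts, invented purely to exercise the helper.
example_counts = {"M_familiar": 12, "M_novel": 4, "F_familiar": 30, "F_novel": 1200}
example_labels = check_group_sizes(example_counts)
print(example_labels)
```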

### As a note, this prelab was not broken up into individual portions because separating the data relies heavily on having previously split data (that is, splitting inhibitory from excitatory specifically in males and females separately). All members also worked on this intensely; we tried to troubleshoot any for-loop issues beforehand but want to be able to troubleshoot together if anything goes wrong during lab.

### Installing AllenSDK into your local environment. 

In [None]:
pip install allensdk

### Import Packages:

In [None]:
import os
import shutil
import allensdk
import pprint
from pathlib import Path

import numpy as np
import pandas as pd
import scipy.stats as st

import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set_context('notebook', font_scale=1.5, rc={'lines.markeredgewidth': 2})

### If working with Visual Coding: Neuropixels Data use the following code block to import the cache object and point it to the data already downloaded and stored on LongLeaf.  Do not change this code.

In [None]:
#this code block should only be run if you are working with the neuropixels data
from allensdk.brain_observatory.ecephys.ecephys_project_cache import EcephysProjectCache

data_directory = '/overflow/NSCI274/projects/ecephysdata/' 

manifest_path = os.path.join(data_directory, "manifest.json")

cache = EcephysProjectCache.from_warehouse(manifest=manifest_path)

### If working with Visual Coding: 2P Ca2+ Imaging Data use the following code block to import the cache object and point it to the data already downloaded and stored on LongLeaf.  Do not change this code.

In [None]:
#this code block should only be run if you are working with the brain observatory Ca2+ imaging data
from allensdk.core.brain_observatory_cache import BrainObservatoryCache

drive_path = '/overflow/NSCI274/projects/BrainObservatoryOPhysData/'

manifest_file = os.path.join(drive_path,'manifest.json')

boc = BrainObservatoryCache(manifest_file=manifest_file)

### If working with Visual Behavior: 2P Ca2+ Imaging Data use the following code block to import the cache object and point it to the data already downloaded and stored on LongLeaf.  Do not change this code.

In [None]:
#this code block should only be run if you are working with the visual behavior Ca2+ imaging data
from allensdk.brain_observatory.behavior.behavior_project_cache import VisualBehaviorOphysProjectCache

data_storage_directory = Path("/overflow/NSCI274/projects/ophysdata")

cache = VisualBehaviorOphysProjectCache.from_s3_cache(cache_dir=data_storage_directory)

### To filter and sort the data into two lists, one for each sex.

In [None]:
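# Load the experiment table from the Visual Behavior cache created above.
# Each row of this table is one imaging experiment, not one mouse.
ophys_experiments = cache.get_ophys_experiment_table()

M_ophys_exp_data = ophys_experiments[ophys_experiments.sex == 'M']
print("Total number of experiments from males is: " + str(len(M_ophys_exp_data)))

F_ophys_exp_data = ophys_experiments[ophys_experiments.sex == 'F']
print("Total number of experiments from females is: " + str(len(F_ophys_exp_data)))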

### In order to know what genotypes are a part of the dataset, we will use the unique function to pull out the exact names of each genotype.

In [None]:
M_ophys_exp_data.full_genotype.unique()

In [None]:
F_ophys_exp_data.full_genotype.unique()
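
As a possible shortcut, `value_counts()` returns each distinct genotype together with how many rows carry it, which gives the names and the counts in one step. The sketch below runs on an invented three-row column; only the string format follows the genotype names used in this notebook.

```python
import pandas as pd

VIP = "Vip-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt"
SST = "Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt"

# Toy genotype column; the rows are invented for illustration.
toy_genotypes = pd.Series([VIP, SST, VIP], name="full_genotype")

# One entry per distinct genotype, with the number of rows for each.
genotype_counts = toy_genotypes.value_counts()
print(genotype_counts)
```

On the real tables the same call would be `M_ophys_exp_data.full_genotype.value_counts()`.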

### Get a count of how many data points are excitatory neurons or inhibitory neurons.

#### Males

In [None]:
# Ai93 carries GCaMP6f; confirm all strings against the unique() output above.
excitatory_genotypes = ['Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai94(TITL-GCaMP6s)/wt',
                        'Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai93(TITL-GCaMP6f)/wt']
inhibitory_genotypes = ['Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt',
                        'Vip-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt']

# isin() keeps the rows whose genotype matches any entry in the list.
M_Excitatory_ophys_exp_id = M_ophys_exp_data[M_ophys_exp_data.full_genotype.isin(excitatory_genotypes)]

print("Total Number of Male Neurons that are Excitatory is: " + str(len(M_Excitatory_ophys_exp_id)))

M_Inhibitory_ophys_exp_id = M_ophys_exp_data[M_ophys_exp_data.full_genotype.isin(inhibitory_genotypes)]

print("Total Number of Male Neurons that are Inhibitory is: " + str(len(M_Inhibitory_ophys_exp_id)))

#### Females

In [None]:
# Ai93 carries GCaMP6f; confirm all strings against the unique() output above.
excitatory_genotypes = ['Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai94(TITL-GCaMP6s)/wt',
                        'Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai93(TITL-GCaMP6f)/wt']
inhibitory_genotypes = ['Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt',
                        'Vip-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt']

# isin() keeps the rows whose genotype matches any entry in the list.
F_Excitatory_ophys_exp_id = F_ophys_exp_data[F_ophys_exp_data.full_genotype.isin(excitatory_genotypes)]

print("Total number of excitatory neurons measured for females is: " + str(len(F_Excitatory_ophys_exp_id)))

F_Inhibitory_ophys_exp_id = F_ophys_exp_data[F_ophys_exp_data.full_genotype.isin(inhibitory_genotypes)]

print("Total number of inhibitory neurons measured for females is: " + str(len(F_Inhibitory_ophys_exp_id)))

### Store the data for each row that returns a mouse with the excitatory genotypes, in case we decide to pursue this route of research as well.

#### Males 

In [None]:
M_Excitatory_ophys_exp_tot = []
for i in range(len(M_ophys_exp_data)):
    # Look the genotype up by column name rather than by column position.
    full_genotype = M_ophys_exp_data['full_genotype'].iloc[i]
    x = M_ophys_exp_data.iloc[i].array
    # Ai93 carries GCaMP6f; confirm both strings against the unique() output above.
    if full_genotype in ('Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai93(TITL-GCaMP6f)/wt',
                         'Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai94(TITL-GCaMP6s)/wt'):
        M_Excitatory_ophys_exp_tot.append(x)
print("The number of excitatory neurons measured for males is: " + str(len(M_Excitatory_ophys_exp_tot)))

#### Females

In [None]:
F_Excitatory_ophys_exp_tot = []

for i in range(len(F_ophys_exp_data)):
    # Look the genotype up by column name rather than by column position.
    full_genotype = F_ophys_exp_data['full_genotype'].iloc[i]
    x = F_ophys_exp_data.iloc[i].array
    # Ai93 carries GCaMP6f; confirm both strings against the unique() output above.
    if full_genotype in ('Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai93(TITL-GCaMP6f)/wt',
                         'Slc17a7-IRES2-Cre/wt;Camk2a-tTA/wt;Ai94(TITL-GCaMP6s)/wt'):
        F_Excitatory_ophys_exp_tot.append(x)
print("The number of excitatory neurons measured for females is: " + str(len(F_Excitatory_ophys_exp_tot)))

### Store the data for each index that returns a mouse that has the inhibitory genotypes we are primarily interested in.

#### Males

In [None]:
M_Inhibitory_ophys_exp_tot = []

for i in range(len(M_ophys_exp_data)):
    # Look the genotype up by column name rather than by column position.
    full_genotype = M_ophys_exp_data['full_genotype'].iloc[i]
    x = M_ophys_exp_data.iloc[i].array
    if full_genotype in ('Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt',
                         'Vip-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt'):
        M_Inhibitory_ophys_exp_tot.append(x)
print("The number of inhibitory neurons measured for males is: " + str(len(M_Inhibitory_ophys_exp_tot)))

#### Females

In [None]:
F_Inhibitory_ophys_exp_tot = []

for i in range(len(F_ophys_exp_data)):
    # Look the genotype up by column name rather than by column position.
    full_genotype = F_ophys_exp_data['full_genotype'].iloc[i]
    x = F_ophys_exp_data.iloc[i].array
    if full_genotype in ('Sst-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt',
                         'Vip-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt'):
        F_Inhibitory_ophys_exp_tot.append(x)
print("The number of inhibitory neurons measured for females is: " + str(len(F_Inhibitory_ophys_exp_tot)))

### Find which positions in each stored row hold the targeted structure, the session type, and the imaging depth.

In [None]:
print(F_Inhibitory_ophys_exp_tot[0])
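
Because each stored row is a plain array of values, the column names are no longer attached to it. One way to recover the positions, sketched below on an invented one-row table, is to pair each position with the column name from the source DataFrame, or to ask for a single position with `columns.get_loc`. The column names and the depth value here are assumptions for illustration, not values read from the real table.

```python
import pandas as pd

# Invented one-row table; the column names are assumed to resemble the real ones.
toy_table = pd.DataFrame({
    "full_genotype": ["Vip-IRES-Cre/wt;Ai148(TIT2L-GC6f-ICL-tTA2)/wt"],
    "imaging_depth": [175],
    "targeted_structure": ["VISp"],
    "session_type": ["OPHYS_2_images_A_passive"],
})

# Same storage format as the lists above: the row's values without column names.
row = toy_table.iloc[0].array

# Print every position next to its column name and value.
for position, name in enumerate(toy_table.columns):
    print(position, name, row[position])

# Look up a single position directly by column name.
structure_idx = toy_table.columns.get_loc("targeted_structure")
print(row[structure_idx])
```

Looking positions up by name avoids hard-coding numbers that would silently change if the table gained or lost a column.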

### Separate the data into the VISp data for each sex 

##### We will have to fill in the indices found in the cell above.

##### !!IOET stands for inhibitory_ophys_exp_tot so we can keep names consistent while using short, accurate variable names!!

#### Males

In [None]:
M_IOET_VISp = []

# Position of the targeted-structure value in each stored row.
# Assumes the column is named 'targeted_structure'; confirm against the row printed above.
target_structure_idx = M_ophys_exp_data.columns.get_loc('targeted_structure')

for x in M_Inhibitory_ophys_exp_tot:
    if x[target_structure_idx] == 'VISp':
        M_IOET_VISp.append(x)
print("The number of inhibitory neurons measured for males in the VISp is: " + str(len(M_IOET_VISp)))

#### Females

In [None]:
F_IOET_VISp = []

# Position of the targeted-structure value in each stored row.
# Assumes the column is named 'targeted_structure'; confirm against the row printed above.
target_structure_idx = F_ophys_exp_data.columns.get_loc('targeted_structure')

for x in F_Inhibitory_ophys_exp_tot:
    if x[target_structure_idx] == 'VISp':
        F_IOET_VISp.append(x)
print("The number of inhibitory neurons measured for females in the VISp is: " + str(len(F_IOET_VISp)))

### Split data into the respective novel and familiar image trials per sex.

##### IOETV now stands for inhibitory_ophys_exp_tot_visp

#### Males

##### Familiar

In [None]:
M_IOETV_Familiar = []

# Position of the session-type value in each stored row.
# Assumes the column is named 'session_type'; confirm against the row printed above.
session_type_idx = M_ophys_exp_data.columns.get_loc('session_type')

for x in M_IOET_VISp:
    if x[session_type_idx] in ('OPHYS_2_images_A_passive', 'OPHYS_2_images_B_passive'):
        M_IOETV_Familiar.append(x)
print("The number of trials for males passively viewing familiar images is: " + str(len(M_IOETV_Familiar)))

##### Novel

In [None]:
M_IOETV_Novel = []

# Position of the session-type value in each stored row.
# Assumes the column is named 'session_type'; confirm against the row printed above.
session_type_idx = M_ophys_exp_data.columns.get_loc('session_type')

for x in M_IOET_VISp:
    if x[session_type_idx] in ('OPHYS_5_images_A_passive', 'OPHYS_5_images_B_passive'):
        M_IOETV_Novel.append(x)
print("The number of trials for males passively viewing novel images is: " + str(len(M_IOETV_Novel)))

#### Females

##### Familiar

In [None]:
F_IOETV_Familiar = []

# Position of the session-type value in each stored row.
# Assumes the column is named 'session_type'; confirm against the row printed above.
session_type_idx = F_ophys_exp_data.columns.get_loc('session_type')

for x in F_IOET_VISp:
    if x[session_type_idx] in ('OPHYS_2_images_A_passive', 'OPHYS_2_images_B_passive'):
        F_IOETV_Familiar.append(x)
print("The number of trials for females passively viewing familiar images is: " + str(len(F_IOETV_Familiar)))

##### Novel

In [None]:
F_IOETV_Novel = []

# Position of the session-type value in each stored row.
# Assumes the column is named 'session_type'; confirm against the row printed above.
session_type_idx = F_ophys_exp_data.columns.get_loc('session_type')

for x in F_IOET_VISp:
    if x[session_type_idx] in ('OPHYS_5_images_A_passive', 'OPHYS_5_images_B_passive'):
        F_IOETV_Novel.append(x)
print("The number of trials for females passively viewing novel images is: " + str(len(F_IOETV_Novel)))

### If we have time, separate the data to see how many depths in V1 were imaged

In [None]:
# Position of the imaging-depth value in each stored row.
# Assumes the column is named 'imaging_depth'; confirm against the row printed above.
depth_idx = M_ophys_exp_data.columns.get_loc('imaging_depth')

# A set keeps each distinct depth only once.
Depths_M_Familiar = set()

for x in M_IOETV_Familiar:
    Depths_M_Familiar.add(x[depth_idx])

print("There are " + str(len(Depths_M_Familiar)) + " different levels of imaging in the V1 area for inhibitory male neurons passively viewing familiar images.")
print("They are " + str(Depths_M_Familiar))

Depths_M_Novel = set()

for x in M_IOETV_Novel:
    Depths_M_Novel.add(x[depth_idx])

print("There are " + str(len(Depths_M_Novel)) + " different levels of imaging in the V1 area for inhibitory male neurons passively viewing novel images.")
print("They are " + str(Depths_M_Novel))

Depths_F_Familiar = set()

for x in F_IOETV_Familiar:
    Depths_F_Familiar.add(x[depth_idx])

print("There are " + str(len(Depths_F_Familiar)) + " different levels of imaging in the V1 area for inhibitory female neurons passively viewing familiar images.")
print("They are " + str(Depths_F_Familiar))

Depths_F_Novel = set()

for x in F_IOETV_Novel:
    Depths_F_Novel.add(x[depth_idx])

print("There are " + str(len(Depths_F_Novel)) + " different levels of imaging in the V1 area for inhibitory female neurons passively viewing novel images.")
print("They are " + str(Depths_F_Novel))
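
The sets above show which depths occur but not how much data falls at each one, which protocol step 8 also asks for. A grouped count is one way to get both, sketched here on an invented table. The column names `sex` and `imaging_depth` are assumed to match the real experiment table, and the depth values are made up.

```python
import pandas as pd

# Invented rows standing in for the filtered VISp inhibitory experiments.
toy_depths = pd.DataFrame({
    "sex": ["M", "M", "M", "F", "F"],
    "imaging_depth": [175, 175, 275, 175, 275],
})

# Number of rows at each depth, separately for each sex.
depth_counts = toy_depths.groupby(["sex", "imaging_depth"]).size()
print(depth_counts)

# Number of distinct depths imaged in each sex.
n_depths_per_sex = toy_depths.groupby("sex")["imaging_depth"].nunique()
print(n_depths_per_sex)
```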

## Resources:


### Sample Allen Jupyter Notebooks to get started:
https://allensdk.readthedocs.io/en/latest/visual_behavior_optical_physiology.html
https://allensdk.readthedocs.io/en/latest/visual_coding_neuropixels.html


### Referenced NSCI290_2P_Ca_Imaging_Behavior_Inlab for data and variable names

### Referenced to create for loops and if statements when using pandas dataframes
https://www.datacamp.com/community/tutorials/for-loops-in-python