# 1 Brain State Data Processing

This script processes brain state data read from a sleep-scoring Excel sheet. It follows these main steps:

1. **Define a Mapping Function**: `map_states` assigns each row of the original DataFrame an integer label according to which column is set to 1. The conditions are checked in the order listed, and a row that matches none of them gets no label (NaN):
    * 2: `durR = 1`
    * 1: `durNR = 1`
    * 0: `durW = 1`
    * 4: `seizure = 1`
    * 5: `noise/packet loss = 1`
2. **Create a New DataFrame**: `brain_states` keeps only the `start_epoch` column from the original DataFrame and adds a `brainstate` column holding the values returned by `map_states`.
3. **Reset Index**: The index of `brain_states` is reset to a default integer index.
4. **Create End Epoch Column**: An `end_epoch` column is added whose values start at 5 and increase by 5 per row (5, 10, 15, ...).
5. **Reorder Columns**: The columns are reordered to `['brainstate','start_epoch','end_epoch']`.
6. **Check NaN Values**: The number of NaN values in the `brainstate` column is counted and printed.
7. **Identify NaN Values**: A boolean mask marks the NaN values in the `brainstate` column, and their row positions are extracted from it.
8. **Assign Brain State Labels**: Labels are assigned by hand to the rows identified as NaN.
9. **Change Data Type**: The `brainstate` column is cast to `int64`, which only succeeds once no NaN values remain.
10. **Export DataFrame**: `brain_states` is saved as a pickle file whose name includes the animal number. The file is then reloaded and its dtypes are printed as a check.
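
The steps above can be sketched end to end on a small synthetic table. This is only an illustration: the column names follow the notebook, but the values are invented, and the label chosen for the unlabeled row is a placeholder for a decision that would normally be made by inspecting the recording.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one sheet of the scoring workbook (values invented).
df = pd.DataFrame({
    "start_epoch":       [0, 5, 10, 15, 20, 25],
    "durW":              [1, 0, 0, 0, 0, 0],
    "durNR":             [0, 1, 0, 0, 0, 0],
    "durR":              [0, 0, 1, 0, 0, 0],
    "seizure":           [0, 0, 0, 1, 0, 0],
    "noise/packet loss": [0, 0, 0, 0, 1, 0],
})

def map_states(row):
    """Return the integer label for one row, or NaN if no flag is set."""
    if row["durR"] == 1:
        return 2
    if row["durNR"] == 1:
        return 1
    if row["durW"] == 1:
        return 0
    if row["seizure"] == 1:
        return 4
    if row["noise/packet loss"] == 1:
        return 5
    return np.nan

# Steps 2-5: build the table, add end_epoch, and fix the column order.
brain_states = pd.DataFrame({
    "start_epoch": df["start_epoch"],
    "brainstate": df.apply(map_states, axis=1),
}).reset_index(drop=True)
brain_states["end_epoch"] = range(5, len(brain_states) * 5 + 1, 5)
brain_states = brain_states[["brainstate", "start_epoch", "end_epoch"]]

# Steps 6-7: locate rows that received no label.
nan_rows = np.where(brain_states["brainstate"].isna())[0]
print("unlabeled rows:", nan_rows)

# Step 8: placeholder manual decision (label 0 is an arbitrary choice here).
brain_states.loc[nan_rows, "brainstate"] = 0

# Step 9: the cast is safe now that no NaN values remain.
brain_states["brainstate"] = brain_states["brainstate"].astype("int64")
print(brain_states)
```

In this toy table only the last row has no flag set, so it is the only one that needs a manual label.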

In [5]:
import numpy as np
import pandas as pd

In [6]:
# Read the Excel file; sheet_name must be changed for each animal
df = pd.read_excel(r'/Users/valentinreateguirangel/Python/Scoring(melissa_format)/Data_excel/GRIN2B_SleepScoring_Alex.xlsx', sheet_name='129')

FileNotFoundError: [Errno 2] No such file or directory: '/Users/valentinreateguirangel/Python/Scoring(melissa_format)/Data_excel/GRIN2B_SleepScoring_Alex.xlsx'

In [None]:
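# Map one row of the scoring sheet to its integer brain-state label.
# Conditions are checked in order; the first one that holds decides the label.
def map_states(x):
    if x['durR'] == 1:
        return 2
    elif x['durNR'] == 1:
        return 1
    elif x['durW'] == 1:
        return 0
    elif x['seizure'] == 1:
        return 4
    elif x['noise/packet loss'] == 1:
        return 5
    # No flag set: return NaN explicitly so the row shows up in the NaN check
    return np.nan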

# Create a new DataFrame with the mapped values
brain_states = pd.DataFrame()
brain_states['start_epoch'] = df['start_epoch']
brain_states['brainstate'] = df.apply(map_states, axis=1)

# Reset the index of the new DataFrame
brain_states = brain_states.reset_index(drop=True)
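
As an alternative to the row-wise `apply`, the same mapping could be written in vectorized form with `np.select`, which picks the label of the first condition that holds, like the `if`/`elif` chain. The sketch below is not the notebook's method; it uses an invented table, and its last row deliberately sets two flags to show the precedence.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: row 5 has no flag set, row 6 has both durR and durW set.
df = pd.DataFrame({
    "durR":              [1, 0, 0, 0, 0, 0, 1],
    "durNR":             [0, 1, 0, 0, 0, 0, 0],
    "durW":              [0, 0, 1, 0, 0, 0, 1],
    "seizure":           [0, 0, 0, 1, 0, 0, 0],
    "noise/packet loss": [0, 0, 0, 0, 1, 0, 0],
})

# Conditions in the same order as the if/elif chain in map_states.
conditions = [
    (df["durR"] == 1).to_numpy(),
    (df["durNR"] == 1).to_numpy(),
    (df["durW"] == 1).to_numpy(),
    (df["seizure"] == 1).to_numpy(),
    (df["noise/packet loss"] == 1).to_numpy(),
]
labels = [2, 1, 0, 4, 5]

# Rows matching no condition fall back to NaN, as with the apply version.
brainstate = pd.Series(
    np.select(conditions, labels, default=np.nan),
    index=df.index,
    name="brainstate",
)
print(brainstate)
```

On large sheets this usually avoids the per-row Python call overhead of `apply(axis=1)`, at the cost of keeping the condition order in sync with the label list by hand.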

In [None]:
# Create the end_epoch column: 5, 10, 15, ... (one value per row)
brain_states['end_epoch'] = range(5, len(brain_states)*5 + 1, 5)
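
Because `end_epoch` is generated from the row count rather than read from the sheet, it may be worth checking it against `start_epoch`. The sketch below assumes, hypothetically, that `start_epoch` begins at 0 and advances by 5 per row; the real sheet may differ, in which case the check would need adjusting.

```python
import pandas as pd

# Hypothetical start times: assumed to begin at 0 and advance by 5 per row.
brain_states = pd.DataFrame({"start_epoch": [0, 5, 10, 15]})
brain_states["end_epoch"] = range(5, len(brain_states) * 5 + 1, 5)

# Under that assumption every epoch spans exactly 5 units.
epoch_length = brain_states["end_epoch"] - brain_states["start_epoch"]
all_five = bool((epoch_length == 5).all())
print("all epochs span 5 units:", all_five)
```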

In [3]:
# Reorder the columns of brain_states into the expected format, then count NaN labels
brain_states = brain_states.reindex(columns = ['brainstate','start_epoch','end_epoch'])
print(brain_states['brainstate'].isna().sum())

NameError: name 'brain_states' is not defined

In [4]:
# create a boolean mask of NaN values in the 'brainstate' column
mask = brain_states['brainstate'].isna()

# get the indices where the mask is True
indices = np.where(mask)[0]

print(indices)

NameError: name 'brain_states' is not defined
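
When many rows are unlabeled, the raw index array can be hard to read. One possible way to summarize it is to group consecutive positions into (first, last) runs, sketched here on an invented Series; the grouping step is an addition for illustration, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical labels with two gaps.
brainstate = pd.Series([0, np.nan, np.nan, 1, 2, np.nan, 4], dtype="float64")

# Same mask-and-where step as in the notebook.
indices = np.where(brainstate.isna())[0]

# A new run starts wherever two consecutive NaN positions are not adjacent.
breaks = np.where(np.diff(indices) > 1)[0] + 1
runs = (
    [(int(run[0]), int(run[-1])) for run in np.split(indices, breaks)]
    if indices.size
    else []
)
print(runs)  # [(1, 2), (5, 5)]
```

Each tuple gives the first and last position of one block of missing labels, which maps directly onto an inclusive `.loc` slice on a default integer index.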

In [17]:
# Manually assign labels to the rows where brainstate is unexpectedly NaN.
# .loc slices are label-based and include both endpoints.
brain_states.loc[2784:2785, 'brainstate'] = 4
brain_states.loc[2786:2802, 'brainstate'] = 2

In [21]:
brain_states['brainstate'] = brain_states['brainstate'].astype('int64')
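
The cast to `int64` raises an error if any NaN is still present, so a guard before it can make the failure easier to interpret. The sketch below uses an invented Series and also shows pandas' nullable `Int64` dtype as one possible way to keep missing values; whether downstream code accepts that dtype is an assumption that would need checking.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one label still missing.
brainstate = pd.Series([0.0, 1.0, np.nan, 2.0])

remaining = int(brainstate.isna().sum())
if remaining:
    # astype('int64') would fail here because NaN has no integer representation.
    print(f"{remaining} unlabeled row(s) left; assign them before casting")
else:
    brainstate = brainstate.astype("int64")

# Alternative: the nullable integer dtype stores missing values as <NA>.
nullable = brainstate.astype("Int64")
print(nullable)
```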

In [22]:
# Export as a pickle file; the number in the file name must be changed for each animal
brain_states.to_pickle('129_score.pkl')

In [24]:
# Reload the pickle file into a DataFrame to verify the export
score = pd.read_pickle('129_score.pkl')

In [None]:
print(score.dtypes)
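
Beyond printing the dtypes, the round trip could be verified by comparing the reloaded table with the one that was saved. This is a minimal sketch on an invented table, written to a temporary directory with a hypothetical file name so it does not touch any real score file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical table in the notebook's final column order.
brain_states = pd.DataFrame({
    "brainstate": pd.Series([0, 1, 2], dtype="int64"),
    "start_epoch": [0, 5, 10],
    "end_epoch": [5, 10, 15],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_score.pkl")  # hypothetical file name
    brain_states.to_pickle(path)
    score = pd.read_pickle(path)

# Raises AssertionError if values, dtypes, or column order differ.
pd.testing.assert_frame_equal(score, brain_states)
print(score.dtypes)
```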