In [1]:
# !pip install pandas hvplot

In [2]:
import pandas as pd
import hvplot.pandas

# Jupyter Notebooks To Make Advanced Analysis Documents

Jupyter Notebooks are a powerful tool for combining code and analysis documentation in a single document.
In this session, we start with the basic unit of a Jupyter notebook, the cell.
Then we look at some of the more advanced ways to integrate text-based explanations into an analysis document.
Once we are familiar with the structure of a Jupyter notebook, we move on to looking at the data itself within a notebook using `Pandas`.
Finally, we use `hvPlot` to add visualizations.

To make things easier, you can use the following shortcut table to quickly navigate and edit a Jupyter notebook.

| **Category**               | **Shortcut**              | **Action**                                         |
|----------------------------|---------------------------|---------------------------------------------------|
| **Cell Operations**         | `Shift + Enter`           | Run the current cell and move to the next         |
|                            | `Ctrl + Enter`            | Run the current cell but stay on the same cell    |
|                            | `Ctrl + Shift + -`        | Split the current cell at the cursor              |
| **Cell Insertion and Deletion** | `Esc + A`             | Insert a new cell above                           |
|                            | `Esc + B`                 | Insert a new cell below                           |
|                            | `Esc + D + D`             | Delete the current cell                           |
|                            | `Esc + Z`                 | Undo cell deletion                                |
| **Cell Type Conversion**    | `Esc + M`                 | Convert the current cell to Markdown              |
|                            | `Esc + Y`                 | Convert the current cell to Code                  |
|                            | `Esc + R`                 | Convert the current cell to Raw                   |
| **Editing and Saving**      | `Ctrl + S`                | Save the notebook                                 |
|                            | `Ctrl + /`                | Toggle comment on the selected code line(s)       |

---


## Cell Types and Cell Modes

A cell is the basic unit of a Jupyter notebook. It is where we write, execute, and organize our code and notes. There are three types of cells in Jupyter.

![Cell Type](img/cell_mode.png)

- **Code Cells**: Execute code and display the resulting output below the cell.
- **Markdown Cells**: Render formatted text for documentation, using Markdown syntax.
- **Raw Cells**: Contain raw, unprocessed text meant for export without modification.

There are two cell modes: Command mode and Edit mode. Command mode lets you change cells at the notebook level, while Edit mode lets you change the contents of a cell.

| Mode          | How to Access                | What Can Be Done                                  |
|---------------|------------------------------|---------------------------------------------------|
| **Command Mode** | Press **Esc** or click outside the cell | Manage cells: Add, delete, copy, paste, move cells. |
| **Edit Mode**    | Press **Enter** or click inside the cell | Edit cell content: Type, modify text/code, format. |

**Example** In a code cell, type `1+1` and execute the cell (`Ctrl + Enter`). What do you see?

In [3]:
1+1

2

Jupyter evaluates the arithmetic expression and displays the result, 2.
Since it is an expression, the output is captured and numbered as `Out[n]: 2`, where the `n` inside the square brackets is the execution count (3 in the run shown here).
If you execute the same cell again, the execution count increases.

Type `"Hello"` in a code cell and execute it. Do you still see the output number?

In [4]:
"Hello"

'Hello'

You will, because `"Hello"` is a string expression, which Jupyter also evaluates and displays.
Now, let's use Python's `print()` function to display hello.

Type `print("Hello")` in a code cell. Any difference?

In [5]:
print('Hello')

Hello


Unlike the expressions above, `print()` writes its text to the console and returns `None`, which Jupyter does not display. So the result appears as console output without a corresponding `Out[]` number.
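
To make the difference concrete, here is a minimal sketch that captures what `print()` hands back. The variable name `returned` is only illustrative.

```python
# print() writes "Hello" to the console as a side effect...
returned = print("Hello")

# ...and the value it returns is None, which Jupyter does not echo as Out[].
print(returned is None)
```

Running this in a code cell should show only console output and no `Out[]` number.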

**Example** Assign the value 10 to the variable `a` and execute the code cell.

In [6]:
a=10

This is an assignment statement. 
It assigns the value 10 to the variable a. 
Since assignment statements don’t produce output by themselves, there is no result displayed or numbered output.

If you want to see an output, type 
```
a=10
a
```

In [7]:
a=10
a

10

After the assignment `a = 10`, the last line, `a` on its own, evaluates the variable and returns its value, 10.
This is treated as an expression, so Jupyter displays the result and assigns an output number.

Assign 10 to `a` and 10 to `b`, add `a` and `b`, and execute the cell.

In [8]:
a=10
b=10
a+b

20
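
Each of the cells above ends in a single expression. As a small sketch, assuming standard Jupyter behavior in which only the last expression of a cell receives an `Out[]` number, intermediate values can be shown with `print()`. The variable name `total` is only illustrative.

```python
a = 10
b = 10

print(a)       # console output, no Out[] number
print(b)       # console output, no Out[] number

total = a + b  # assignment: nothing is displayed
total          # last expression in the cell: Jupyter displays it with an Out[] number
```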

Let's see a small example of a markdown cell.

**Example** Type 1+1 in a markdown cell

1 + 1

Nothing is computed, because markdown cells are not evaluated.
They are only for formatting text.

Try `print("hello")` in markdown. What do you expect?

print("hello")

**Example** Type `1+1` in a raw cell

Nothing? Same as Markdown? Then why Raw? 
Raw cells display content exactly as it is written, with no formatting, processing, or interpretation. 
They are typically used when you want to export the notebook in a specific way or preserve unformatted text. 
Markdown cells process the content, apply markdown formatting, and display it in a rendered form.

Let's summarize the three types of cells with the exercises below.

**Example** Type the below code in a code cell. 
```python
#### this is in code
```

In [9]:
#### this is in code

The cell runs as code, but nothing happens.
In Python, everything after a `#` on a line is ignored when the code runs.
Such text is called a comment.

Type `#### this is in markdown` in a markdown cell

#### this is in markdown

Markdown renders a line that starts with one or more `#` characters as a heading.
So you see the text formatted as a heading, without the `#` characters.

Type `#### this is raw` in a raw cell

Here the text is not formatted. 

---

## Markdown

Markdown in Jupyter Notebooks is a great tool for formatting text, embedding images, and organizing information.

You can create headings using `#` symbols for different levels.
The more `#` symbols precede the heading text, the smaller the heading.
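
As an illustration, the following sketch builds the Markdown source for all six heading levels in a code cell. The heading text is made up; paste the printed lines into a markdown cell to compare the sizes.

```python
# Build one Markdown heading line per level (1 to 6).
heading_lines = ["#" * level + f" Level {level} heading" for level in range(1, 7)]

for line in heading_lines:
    print(line)
```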

**Example** Create a markdown cell that looks like this
# This is level 1 heading

Hint: 
```markdown
# This is level 1 heading
```

--- *your code here*

Create a level 2 heading

<h2>This is a level 2 heading</h2>

--- *your code here*

Create a level 6 heading

<h6>This is a level 6 heading</h6>

--- your code here

You can also make the text bold, italics, or both.

**Example** Make text bold

```markdown
**This is Bold**
```

--- your code here

Display the text below using only one `*`, instead of two, on either side of the text.

<i>This is italics</i>

--- your code here

Display the text below using `***` on either side of the text.

<b><i>This is bold and italics</i></b>

--- your code here

Making ordered and unordered lists in Jupyter notebook is quite simple.

**Example** Make an unordered list of three programming languages
```markdown
- Python
- Julia
- R
```

--- your code here

You can also use `*` or `+` to create unordered lists.
Make an unordered list of your three favorite fruits using `*`

--- your code here

Make an unordered list of your favorite vegetables using `+`

--- your code here

Order your favorite vegetables starting from most favorite using `1` to `3` to number your list.

--- your code here

Create this list in markdown

<ul>
  <li>First item
    <ul>
      <li>Sub-item 1
        <ul>
          <li>Sub-sub-item 1</li>
          <li>Sub-sub-item 2</li>
        </ul>
      </li>
      <li>Sub-item 2</li>
    </ul>
  </li>
  <li>Second item
    <ul>
      <li>Sub-item 1
        <ul>
          <li>Sub-sub-item 1</li>
        </ul>
      </li>
      <li>Sub-item 2</li>
    </ul>
  </li>
</ul>


--- your code here

```markdown
- First item
  - Sub-item 1
    - Sub-sub-item 1
    - Sub-sub-item 2
  - Sub-item 2
- Second item
  - Sub-item 1
    - Sub-sub-item 1
  - Sub-item 2
```

Create this list

1. First item (ordered)
   - Sub-item 1 (unordered)
     - Sub-sub-item 1 (unordered)
     - Sub-sub-item 2 (unordered)
   - Sub-item 2 (unordered)
2. Second item (ordered)
   1. Sub-item 1 (ordered)
      - Sub-sub-item 1 (unordered)
      - Sub-sub-item 2 (unordered)
   2. Sub-item 2 (ordered)
3. Third item (ordered)
   - Sub-item 1 (unordered)
   - Sub-item 2 (unordered)
     1. Sub-sub-item 1 (ordered)
     2. Sub-sub-item 2 (ordered)


--- your code here

```markdown
1. First item (ordered)
   - Sub-item 1 (unordered)
     - Sub-sub-item 1 (unordered)
     - Sub-sub-item 2 (unordered)
   - Sub-item 2 (unordered)
2. Second item (ordered)
   1. Sub-item 1 (ordered)
      - Sub-sub-item 1 (unordered)
      - Sub-sub-item 2 (unordered)
   2. Sub-item 2 (ordered)
3. Third item (ordered)
   - Sub-item 1 (unordered)
   - Sub-item 2 (unordered)
     1. Sub-sub-item 1 (ordered)
     2. Sub-sub-item 2 (ordered)
```

You can add hyperlinks. 

```markdown
[The Name You Want To See](https://abc.com)
```

Add a link to your favorite website

```markdown
[Google](https://www.google.com)
```

Adding images uses almost the same syntax, with a leading `!`

```markdown
![Sample Image](path/to/image.png)
```

**Example** Type the text below in a markdown cell
```markdown
![iBehave](img/iBehave_Logo.png)
```

--- your code here

Have a mathematical equation? No problem. You can use LaTeX to write equations. Type the code below into a markdown cell and see how it renders.

```markdown
$E = mc^2$
```
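
Single dollar signs keep the equation inline with the surrounding text. As a further sketch, double dollar signs put the equation on its own centered line in a markdown cell. The formula for the mean is only an illustrative choice.

```latex
$$
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
$$
```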

--- your code here

Want to make a table?

```markdown
| Column 1 | Column 2 |
|----------|----------|
| Value 1  | Value 2  |
| Value 3  | Value 4  |
```

Make a table in the below cell with two columns and three rows

--- your code here

Want to display a block of code in markdown? You can do so by enclosing your code within ```

**Example** Show syntax of printing hello world in Python (Press enter on this cell to see the code)
```python
print("Hello, world!")
```

--- your code here

---

## Reading in Data and Seeing it: Pandas

`Pandas` is a popular Python library for tabular data analysis. Jupyter notebooks display pandas tables, known as DataFrames, in a simple tabular format, making it easy to 'see' our data. Let's see how these tables appear in Jupyter notebooks.

In this notebook, we'll get our first look at the experiment we'll be analyzing in this course: curated data from the [Steinmetz et al., 2019 paper](https://www.nature.com/articles/s41586-019-1787-x).

In this notebook we focus on three CSV files, each containing sessions from a different stretch of data collection. They contain trial-level data from the experiment:

```
steinmetz_winter2016.csv
steinmetz_summer2017.csv
steinmetz_winter2017.csv
```

**Run the code below to download the datasets to the `data` folder.**

In [10]:
import sys
sys.path.append('src')
import sciebo

sciebo.download_file('https://uni-bonn.sciebo.de/s/G5rdvTsoESXolF4', 'data/steinmetz_winter2017.csv')
sciebo.download_file('https://uni-bonn.sciebo.de/s/xKAG9nqHyWmXBBI', 'data/steinmetz_winter2016.csv')
sciebo.download_file('https://uni-bonn.sciebo.de/s/XLDoTQQoDdFLhlz', 'data/steinmetz_summer2017.csv')

Downloading data/steinmetz_winter2017.csv: 100%|███████████████████████████████████████████████| 806k/806k [00:00<00:00, 4.05MB/s]
Downloading data/steinmetz_winter2016.csv: 100%|███████████████████████████████████████████████| 359k/359k [00:00<00:00, 3.11MB/s]
Downloading data/steinmetz_summer2017.csv: 100%|███████████████████████████████████████████████| 276k/276k [00:00<00:00, 3.74MB/s]


**Example** Load the Winter 2016 dataset and preview the first 3 rows of the data

In [11]:
df1 = pd.read_csv('data/steinmetz_winter2016.csv')
df1.head(3)

Unnamed: 0,trial,active_trials,contrast_left,contrast_right,stim_onset,gocue_time,response_type,response_time,feedback_time,feedback_type,reaction_time,reaction_type,mouse,session_date,session_id
0,1,True,100,0,0.5,1.027216,1.0,1.150204,1.186819,1.0,170.0,1.0,Cori,2016-12-14,5dd41e
1,2,True,0,50,0.5,0.874414,-1.0,1.399503,1.437623,1.0,230.0,-1.0,Cori,2016-12-14,5dd41e
2,3,True,100,50,0.5,0.825213,1.0,0.949291,0.986016,1.0,200.0,1.0,Cori,2016-12-14,5dd41e


Load the Winter 2017 Dataset and preview the first 5 rows of the data

In [12]:
df2 = pd.read_csv('data/steinmetz_winter2017.csv')
df2.head(5)

Unnamed: 0,trial,active_trials,contrast_left,contrast_right,stim_onset,gocue_time,response_type,response_time,feedback_time,feedback_type,reaction_time,reaction_type,mouse,session_date,session_id
0,1,True,100,0,0.5,0.508117,1.0,0.903312,0.946524,1.0,210.0,1.0,Theiler,2017-10-11,aeb92f
1,2,True,0,100,0.5,0.678304,1.0,0.859908,0.859908,-1.0,270.0,1.0,Theiler,2017-10-11,aeb92f
2,3,True,0,100,0.5,0.508295,-1.0,0.646241,0.683098,1.0,320.0,-1.0,Theiler,2017-10-11,aeb92f
3,4,True,0,25,0.5,0.437219,-1.0,0.985264,1.022429,1.0,790.0,-1.0,Theiler,2017-10-11,aeb92f
4,5,True,100,25,0.5,0.672789,1.0,1.137715,1.175197,1.0,250.0,1.0,Theiler,2017-10-11,aeb92f


Load the Summer 2017 Dataset and preview the last 4 rows of the data

In [13]:
df3 = pd.read_csv('data/steinmetz_summer2017.csv')
df3.tail(4)

Unnamed: 0,trial,active_trials,contrast_left,contrast_right,stim_onset,gocue_time,response_type,response_time,feedback_time,feedback_type,reaction_time,reaction_type,mouse,session_date,session_id
2743,449,False,100,25,0.5,,,,,,,,Hench,2017-06-18,dd9ee9
2744,450,False,0,100,0.5,,,,,,,,Hench,2017-06-18,dd9ee9
2745,451,False,0,100,0.5,,,,,,,,Hench,2017-06-18,dd9ee9
2746,452,False,0,100,0.5,,,,,,,,Hench,2017-06-18,dd9ee9
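
The previews above rely on `head(n)` and `tail(n)`, which return the first and last `n` rows of a DataFrame. Here is a minimal sketch on a small made-up table. The values are hypothetical, and only the column names are borrowed from the dataset.

```python
import pandas as pd

# Hypothetical toy data, not taken from the real CSV files.
toy = pd.DataFrame({
    "trial": [1, 2, 3, 4, 5, 6],
    "contrast_left": [100, 0, 100, 0, 50, 25],
    "contrast_right": [0, 50, 50, 0, 100, 25],
})

first_rows = toy.head(3)  # first 3 rows
last_rows = toy.tail(2)   # last 2 rows

print(first_rows)
print(last_rows)
print(toy.shape)          # (number of rows, number of columns)
```

In a notebook, ending the cell with `toy.head(3)` instead of `print(...)` gives the formatted table view.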


**Example** Display basic structural information of Winter 2016

In [14]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3767 entries, 0 to 3766
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   trial           3767 non-null   int64  
 1   active_trials   3767 non-null   bool   
 2   contrast_left   3767 non-null   int64  
 3   contrast_right  3767 non-null   int64  
 4   stim_onset      3767 non-null   float64
 5   gocue_time      2437 non-null   float64
 6   response_type   2437 non-null   float64
 7   response_time   2437 non-null   float64
 8   feedback_time   2437 non-null   float64
 9   feedback_type   2437 non-null   float64
 10  reaction_time   2437 non-null   float64
 11  reaction_type   2437 non-null   float64
 12  mouse           3767 non-null   object 
 13  session_date    3767 non-null   object 
 14  session_id      3767 non-null   object 
dtypes: bool(1), float64(8), int64(3), object(3)
memory usage: 415.8+ KB


Display basic structural information of Winter 2017

In [15]:
df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7906 entries, 0 to 7905
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   trial           7906 non-null   int64  
 1   active_trials   7906 non-null   bool   
 2   contrast_left   7906 non-null   int64  
 3   contrast_right  7906 non-null   int64  
 4   stim_onset      7906 non-null   float64
 5   gocue_time      5596 non-null   float64
 6   response_type   5596 non-null   float64
 7   response_time   5596 non-null   float64
 8   feedback_time   5596 non-null   float64
 9   feedback_type   5596 non-null   float64
 10  reaction_time   5596 non-null   float64
 11  reaction_type   5596 non-null   float64
 12  mouse           7906 non-null   object 
 13  session_date    7906 non-null   object 
 14  session_id      7906 non-null   object 
dtypes: bool(1), float64(8), int64(3), object(3)
memory usage: 872.6+ KB


Display basic structural information of Summer 2017

In [16]:
df3.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2747 entries, 0 to 2746
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   trial           2747 non-null   int64  
 1   active_trials   2747 non-null   bool   
 2   contrast_left   2747 non-null   int64  
 3   contrast_right  2747 non-null   int64  
 4   stim_onset      2747 non-null   float64
 5   gocue_time      2017 non-null   float64
 6   response_type   2017 non-null   float64
 7   response_time   2017 non-null   float64
 8   feedback_time   2017 non-null   float64
 9   feedback_type   2017 non-null   float64
 10  reaction_time   2017 non-null   float64
 11  reaction_type   2017 non-null   float64
 12  mouse           2747 non-null   object 
 13  session_date    2747 non-null   object 
 14  session_id      2747 non-null   object 
dtypes: bool(1), float64(8), int64(3), object(3)
memory usage: 303.3+ KB
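
The `Non-Null Count` column in the `info()` outputs shows that several columns have fewer entries than the table has rows, which means some values are missing. Here is a minimal sketch of counting missing values per column with `isna().sum()`, using a made-up table in which only the column names mirror the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with two missing response times.
toy = pd.DataFrame({
    "trial": [1, 2, 3, 4],
    "active_trials": [True, True, False, False],
    "response_time": [1.15, 1.40, np.nan, np.nan],
})

# isna() marks missing cells as True; sum() counts them per column.
missing_per_column = toy.isna().sum()
print(missing_per_column)
```

The same call on `df1`, `df2`, or `df3` would be the complement of the non-null counts reported by `info()`.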


**Example** Generate descriptive statistics of Winter 2016

In [17]:
df1.describe()



Unnamed: 0,trial,contrast_left,contrast_right,stim_onset,gocue_time,response_type,response_time,feedback_time,feedback_type,reaction_time,reaction_type
count,3767.0,3767.0,3767.0,3767.0,2437.0,2437.0,2437.0,2437.0,2437.0,2437.0,2437.0
mean,182.598354,34.483674,43.980621,0.5,0.838138,-0.007796,1.573492,1.599236,0.321297,inf,0.066065
std,118.353819,41.510643,41.289574,0.0,0.201099,0.809136,0.640202,0.635914,0.947173,,0.836455
min,1.0,0.0,0.0,0.5,0.480407,-1.0,0.568211,0.59921,-1.0,0.0,-1.0
25%,86.0,0.0,0.0,0.5,0.664811,-1.0,1.034471,1.056812,-1.0,250.0,-1.0
50%,172.0,0.0,25.0,0.5,0.842013,0.0,1.354081,1.386823,1.0,680.0,0.0
75%,260.0,100.0,100.0,0.5,1.007211,1.0,2.205901,2.226439,1.0,,1.0
max,554.0,100.0,100.0,0.5,1.193219,1.0,2.713576,2.738448,1.0,inf,1.0


Generate descriptive statistics of Winter 2017

In [18]:
df2.describe()



Unnamed: 0,trial,contrast_left,contrast_right,stim_onset,gocue_time,response_type,response_time,feedback_time,feedback_type,reaction_time,reaction_type
count,7906.0,7906.0,7906.0,7906.0,5596.0,5596.0,5596.0,5596.0,5596.0,5596.0,5596.0
mean,193.392993,34.69833,43.580825,0.5,0.59558,-0.028592,1.292851,1.320909,0.42995,inf,-0.037527
std,116.445395,42.062204,41.371914,0.0,0.115727,0.831617,0.640483,0.636515,0.902933,,0.872591
min,1.0,0.0,0.0,0.5,0.395066,-1.0,0.479414,0.494678,-1.0,0.0,-1.0
25%,95.0,0.0,0.0,0.5,0.495339,-1.0,0.771072,0.802457,-1.0,210.0,-1.0
50%,189.0,0.0,25.0,0.5,0.595182,0.0,0.982517,1.014364,1.0,350.0,0.0
75%,284.0,100.0,100.0,0.5,0.695798,1.0,2.015343,2.035058,1.0,1720.0,1.0
max,514.0,100.0,100.0,0.5,0.800223,1.0,2.602421,2.63634,1.0,inf,1.0


Generate descriptive statistics of Summer 2017

In [19]:
df3.describe()



Unnamed: 0,trial,contrast_left,contrast_right,stim_onset,gocue_time,response_type,response_time,feedback_time,feedback_type,reaction_time,reaction_type
count,2747.0,2747.0,2747.0,2747.0,2017.0,2017.0,2017.0,2017.0,2017.0,2017.0,2017.0
mean,211.746269,38.942483,40.398617,0.5,0.852217,0.08825,1.601423,1.625137,0.300942,inf,0.109569
std,134.892877,43.057757,40.922708,0.0,0.194813,0.790661,0.663819,0.657148,0.953879,,0.852397
min,1.0,0.0,0.0,0.5,0.488406,-1.0,0.552185,0.586403,-1.0,0.0,-1.0
25%,99.0,0.0,0.0,0.5,0.693611,-1.0,1.02175,1.054412,-1.0,250.0,-1.0
50%,197.0,25.0,25.0,0.5,0.851214,0.0,1.402214,1.432843,1.0,690.0,0.0
75%,311.5,100.0,100.0,0.5,1.020012,1.0,2.2745,2.292037,1.0,,1.0
max,557.0,100.0,100.0,0.5,1.198819,1.0,2.699877,2.736444,1.0,inf,1.0
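
In the `describe()` tables above, `reaction_time` has a mean and maximum of `inf`, which suggests the column contains infinite values. One possible way to get finite summary statistics is to treat those values as missing before summarizing. This is a sketch on made-up numbers, and whether infinite reaction times should count as missing is an assumption about the analysis, not something the dataset states.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one infinite reaction time.
toy = pd.DataFrame({"reaction_time": [170.0, 230.0, np.inf, 200.0]})

raw_mean = toy["reaction_time"].mean()
print(raw_mean)  # a single inf makes the mean infinite

# Replace +inf and -inf with NaN, which pandas skips in summary statistics.
cleaned = toy.replace([np.inf, -np.inf], np.nan)
clean_mean = cleaned["reaction_time"].mean()
print(clean_mean)
print(cleaned["reaction_time"].describe())
```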


**Example** Make a histogram of `response_time` of Winter 2016

In [20]:
df1.hvplot.hist('response_time')

The plot shows up as the cell output. On the right side of the plot is a toolbar with options for interacting with it.
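
A histogram counts how many values fall into each bin. As a sketch of that counting step (not necessarily how hvPlot computes it internally), NumPy's `np.histogram` returns the counts and bin edges directly. The response times and the bin settings below are made up.

```python
import numpy as np

# Hypothetical response times in seconds.
response_times = np.array([0.6, 0.9, 1.1, 1.4, 2.2, 2.3])

# Two equal-width bins between 0.5 and 2.5: [0.5, 1.5) and [1.5, 2.5].
counts, edges = np.histogram(response_times, bins=2, range=(0.5, 2.5))

print(counts)  # number of values in each bin
print(edges)   # boundaries of the bins
```

Four of the six values fall below 1.5, so the first bin holds 4 values and the second holds 2.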

Make a histogram of `feedback_time` of Winter 2017 and
1. Pan the plot using the `Pan` option
2. Reset the plot using the `Reset` option

You can see the name of each option by hovering your mouse over it.

In [21]:
df2.hvplot.hist('feedback_time')

Make a scatter plot of `response_time` with `feedback_time` of Summer 2017 and 

1. Zoom into a section of your interest using `Box Zoom`
2. Save the plot with `Save`
3. Reset the plot and `Save`

In [22]:
df3.hvplot.scatter(x='response_time', y='feedback_time')