# Google Colab and Python

## NSCI801: Introduction to Python for Neuroscience

### Today's Journey:
- **Why** Python for neuroscience research?
- **How** to use Google Colab
- **What** are the fundamental data structures?

## Agenda

**A. Why Python?**

- Common uses for Python in research

**B. Understanding the Python Environment**

- Navigating Google Colab
- Help and documentation

**C. Using Python**

- Lists, arrays, dataframes

# Part A: Why Python?


## The Origins of Python

Python is an **open source** programming language designed to be **easy-to-read** and **powerful**.

Created by **Guido van Rossum** in 1991 (Netherlands)
- Named after the television show *Monty Python's Flying Circus*
- Many Python examples and tutorials include jokes from the show!

## Python in Scientific Computing

Python has seen a **surge in popularity** across the sciences:

- **Readability** - code that reads like English
- **Modularity** - reusable components
- **Large standard library** - batteries included

### Key Milestone: NumPy (2006)

The merger of Numeric and Numarray packages created **NumPy**, enabling optimized operations on large arrays.

### Neuroscience Ecosystem (2007-present)

- Computational neuroscience
- Neuroimaging
- Electrophysiology
- Psychophysics


## Common Uses for Python in Research

1. **Data Acquisition** - Interface with experimental hardware

2. **Multi-format Data Importing** - Read any format (CSV, MATLAB, binary files)

3. **Analysis Tools & Statistics** - Existing libraries and custom tools

4. **Graphing & Visualization** - Publication-quality figures

## Data Acquisition

A framework for bringing live, measured data into computers via acquisition hardware.

**Examples:**
- EEG systems
- Eye trackers
- Behavioral response boxes
- Custom experimental setups

![data-acquisition-system.png](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/data-acquisition-system.png?raw=1)

## Multi-platform, Multi-format Data Importing

Data can be loaded into Python from **almost any format**:
- Binary data files (e.g. REX, PLEXON)
- ASCII text (e.g. Eyelink I, II)
- MATLAB .mat files
- CSV, Excel, HDF5

**Key packages:** numpy, pandas, scipy

![importing_data.png](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/importing_data2.png?raw=1)
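
As a rough illustration of importing tabular data, the sketch below reads CSV text with `pandas.read_csv`. The column names and values are made up for the example, and an in-memory string stands in for a real file so that the cell runs anywhere. For MATLAB .mat files, `scipy.io.loadmat` plays a similar role.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real data file
csv_text = """trial,reaction_time_ms,correct
1,512,1
2,430,1
3,655,0
"""

# pd.read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df.shape)  # (3, 3): 3 rows, 3 columns
```

With a real file, the call would take the file path instead of the `io.StringIO` object.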

## Analysis Tools

### Library of Existing Tools
Python has a considerable variety of statistical tests available (parametric, non-parametric, time-series analysis, etc.)
- **scikit-learn/sklearn**: Machine learning
- **SciPy**: Statistical tests, signal processing
- **statsmodels**: Statistical modeling

### Supports Custom Tools
Python provides a framework for the design, creation, and implementation of any custom analysis tool imaginable



## Graphing

**Comprehensive plotting options:**

- 2D, 3D, and 4D visualizations
- Full control of formatting, axes, and other visual representational elements
- Publication-quality output

**Common libraries:** matplotlib, seaborn, plotly, etc.

![Mpl_screenshot_figures_and_code.png](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/Mpl_screenshot_figures_and_code.png?raw=1)

## Machine Learning in Neuroscience

Machine learning (ML) is one of the hottest trends in modern neuroscience (and in science in general):

- **34% growth** in ML patents between 2013 and 2017, a trend expected to continue
- **Python is the #1 programming language** for ML (according to GitHub)

### Why is Python the Best-Suited Programming Language for Machine Learning?

Rich ecosystem (TensorFlow, PyTorch, scikit-learn)

### Neuroscience Applications

Neural decoding, Pattern classification, Brain-computer interfaces


# Part B: Understanding the Python Environment

Most students are new to Python and can easily run into the following practical problems:

1. Not knowing how to install and set up a Python environment
2. Not knowing how to find solutions effectively when problems arise
3. Not knowing how to collaborate with others on group tasks
4. Not knowing how to handle version control, which can lead to chaotic code

These are the main pain points for Python beginners.

One option is to install Python on your own machine using Anaconda:

![alt text](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/Screenshot_anaconda.png?raw=1)

## Getting Started with Google Colab

1. Log in to your Google account
2. Go to: **https://colab.research.google.com**

### Advantages of Colab:

- Zero installation required (in-browser)
- Free GPU/TPU access
- Easy sharing
- Auto-save to Google Drive
- Pre-installed packages

### Opening a Jupyter Notebook in Colab:

When you open the website, you will see a pop-up containing the following tabs:

![Screenshot%202019-12-24%2013.54.17.png](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/Screenshot%202019-12-24%2013.54.17.png?raw=1)

- **EXAMPLES:** Contains a number of example Jupyter notebooks.
- **RECENT:** Jupyter notebooks you have recently worked with.
- **GOOGLE DRIVE:** Jupyter notebooks in your Google Drive.
- **GITHUB:** You can add Jupyter notebooks from your GitHub, but you first need to connect Colab with GitHub.
- **UPLOAD:** Upload from your local directory.

Creating a new notebook produces a Jupyter notebook named Untitled.ipynb, saved to your Google Drive in a folder named Colab Notebooks. Because it is essentially a Jupyter notebook, all Jupyter notebook commands work here.

![notebook_figure.png](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/notebook_figure.png?raw=1)

You can change the runtime environment from the “Runtime” dropdown menu: select “Change runtime type”.

![runtime_enviornment.png](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/runtime_enviornment.png?raw=1)

In the notebook settings dialog that opens, pick the option you want (GPU, CPU, None) from the “Hardware accelerator” dropdown menu.

![notebook_settings.png](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/notebook_settings.png?raw=1)


## Colab Notebook Interface

### Two Cell Types:

1. **Code Cells** - Run with Shift + Enter
2. **Text Cells** - Formatted text using Markdown

Combines code, results, and documentation in one place

## Installing Python Packages

Most common packages are **pre-installed**:
- numpy, scipy, pandas, matplotlib, scikit-learn

### To install additional packages:

Run the following line in a code cell:

    !pip install package_name
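
Before installing anything, it can help to check whether a package is already available and which version is present. The short sketch below assumes the pre-installed packages listed above and prints their version strings.

```python
import numpy as np
import pandas as pd

# Most packages expose their version as a string in __version__
print("NumPy version:", np.__version__)
print("pandas version:", pd.__version__)
```

If an import fails with `ModuleNotFoundError`, the package is not installed yet and the `!pip install` line is needed.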

## Help Documentation - Your Best Friend!

### Official Documentation:

- **NumPy**: https://numpy.org
- **SciPy**: https://scipy.org
- **Pandas**: https://pandas.pydata.org
- **Scikit-learn**: https://scikit-learn.org

**Learning to read documentation is a VERY important skill!**
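
Documentation is also available from inside a notebook. As a minimal sketch, the built-in `help()` function prints the documentation of any object, and the same text is stored in the object's `__doc__` attribute. In Colab, typing `np.mean?` in a code cell should show similar information.

```python
import numpy as np

# help() prints the documentation of any object
help(len)

# Docstrings are also available as plain text via __doc__
first_lines = np.mean.__doc__.strip().splitlines()[:3]
print("\n".join(first_lines))
```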

# Variables and Expressions

**Variables** are like name tags that point to objects/values stored in memory. 

Python has several fundamental data types that variables can store:

In [None]:
# 1. Integers (int) - Whole numbers
year = 2026
temperature = -10

# 2. Floats (float) - Decimal numbers
pi = 3.14159
body_temperature = 98.6


# 3. Strings (str) - Text enclosed in quotes (' or ")
message = 'Hello, World!'  # Either quote style can contain the other, e.g. 'say "hi"'
address = "123 Main Street"

# 4. Booleans (bool) - True or False values
is_student = True
has_graduated = False
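
To see which type a variable currently holds, the built-in `type()` function can be used. The sketch below reuses some of the variable names from the cell above. The final reassignment is added only to illustrate that a name can later point to a value of a different type.

```python
year = 2026
pi = 3.14159
message = "Hello, World!"
is_student = True

print(type(year))        # <class 'int'>
print(type(pi))          # <class 'float'>
print(type(message))     # <class 'str'>
print(type(is_student))  # <class 'bool'>

# A variable can be rebound to a value of a different type
year = "twenty twenty-six"
print(type(year))        # <class 'str'>
```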

Variable names are used to retrieve stored values

In [None]:
# Example:
price = 29.99
quantity = 3
total = price * quantity
print(f"Total cost: ${total:.2f}")  # Uses the stored values; :.2f rounds to 2 decimal places

Variables are **case-sensitive**

In [None]:
temperature = 72
Temperature = 65
TEMPERATURE = 80

print(temperature)  # 72
print(Temperature)  # 65
print(TEMPERATURE)  # 80
# These are three different variables!

Python evaluates **expressions**

Expressions are combinations of values, variables, and operators that Python evaluates to produce a result. 

Examples:
- `5 + 3` evaluates to `8`
- `10 / 2` evaluates to `5.0`
- `x * 2 + 1` evaluates based on the value of `x`
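
The examples above use only a few operators. As a quick sketch, the cell below runs through the common arithmetic operators, with `x = 7` chosen arbitrarily for illustration.

```python
x = 7

print(x + 3)      # 10   addition
print(x - 3)      # 4    subtraction
print(x * 2)      # 14   multiplication
print(x / 2)      # 3.5  true division always returns a float
print(x // 2)     # 3    floor division
print(x % 2)      # 1    remainder (modulo)
print(x ** 2)     # 49   exponentiation
print(x * 2 + 1)  # 15   multiplication is evaluated before addition
```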

**Variable = Expression**

The right side (expression) is evaluated first, then stored in the left side (assigned to a variable)

In [None]:
# Example
x = 5
y = x + 10  # First calculates 5 + 10, then stores 15 in y
print(y)

x = x + 1   # Takes current x (5), adds 1, stores 6 back in x
print(x)
print(y)    # What is the output?


# Documentation

The hash symbol (#) starts an in-line comment. The interpreter ignores everything from the # to the end of the line.

Triple quotes (''' or """) are used to create multi-line strings and docstrings (documentation strings).

What are the use cases?
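
One possible illustration of these use cases: comments explain individual steps to a human reader, while a docstring documents what a function does and is retrievable at runtime. The function name and threshold value below are hypothetical and exist only for this example.

```python
# A comment explains a single step to the human reader
threshold_mv = -55  # comments can also follow code on the same line


def is_above_threshold(voltage_mv):
    '''Return True if voltage_mv exceeds the module-level threshold.

    This docstring is stored on the function and shown by help().
    '''
    return voltage_mv > threshold_mv


print(is_above_threshold(-40))     # True
print(is_above_threshold.__doc__)  # prints the docstring text
```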

# Lists

In Python, a list is created by placing all the items (elements) inside square brackets [ ], separated by commas.

It can have any number of items and they may be of different types (integer, float, string etc.).

In [None]:
# empty list
my_list = []
# list of integers
my_list = [1, 2, 3]
# list with mixed datatypes
my_list = [1, "Hello", 3.4]

A list can even have another list as an item. This is called a nested list.

In [None]:
# nested list
my_list = ["mouse", [8, 4, 6], ['a']]

## How to access elements from a list

There are various ways in which we can access the elements of a list.

**List Index**

We can use the index operator [] to access an item in a list. 

Indexing starts from 0, so a list with 5 elements has indices ranging from 0 to 4.

Trying to access an element outside of this range will raise an **IndexError**. 

The index must be an integer. We can't use a float or other types; doing so results in a **TypeError**.

Nested lists are accessed using nested indexing.

![python-list-index.png](https://github.com/BlohmLab/NSCI801-QuantNeuro/blob/master/Figures/python-list-index.png?raw=1)

In [None]:
# Indexing a list
my_list = ['p','r','o','b','e']
# Output: p
print(my_list[0])
# Output: o
print(my_list[2])
# Output: e
print(my_list[4])
# Error! Only integers (or slices) can be used for indexing
# my_list[4.0]

# Nested List
n_list = ["Happy", [2,0,1,5]]
# Nested indexing
# Output: a
print(n_list[0][1])    
# Output: 5
print(n_list[1][3])

**Negative indexing**

Python allows negative indexing for its sequences. An index of -1 refers to the last item, -2 to the second-to-last item, and so on.

In [None]:
my_list = ['p','r','o','b','e']
# Output: e
print(my_list[-1])
# Output: p
print(my_list[-5])

In [None]:
# How would I access element 'h' using only negative indexing?
test_list = ['try', ['t', 'h', 'i', 's'], 'example']
# print(...)

We can also access a range of elements (a slice) using:
    
    my_list[start:end]
    
The slice includes the start index but excludes the end index.

In [None]:
my_list = ['p','r','o','b','e']
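print(my_list[:])    # all elements: ['p', 'r', 'o', 'b', 'e']
print(my_list[1:])   # index 1 to the end: ['r', 'o', 'b', 'e']
print(my_list[:3])   # start up to (not including) index 3: ['p', 'r', 'o']
print(my_list[1:3])  # index 1 up to (not including) index 3: ['r', 'o']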


### Not all data structures are mutable (able to have individual elements changed)
- Lists **ARE** mutable
- Other data structures, like tuples and strings, are **NOT** mutable

We can use the assignment operator (=) to change an item or a range of items.

In [None]:
# mutable list example
# these values are even by mistake; they are corrected below
odd = [2, 4, 6, 8]

# change the 1st item    
odd[0] = 1            
# Output: [1, 4, 6, 8]
print(odd)

# change 2nd to 4th items
odd[1:4] = [3, 5, 7]  
# Output: [1, 3, 5, 7]
print(odd)   

In [None]:
# Non-mutable string example
string_var = "probe"
print(string_var)
print(string_var[2])

# string_var[3] = "v"  # uncommenting this raises a TypeError: strings are immutable
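
Tuples behave the same way as strings here. The sketch below (with an arbitrary example tuple) shows the error raised on item assignment, and one common workaround: building a new object instead of changing the existing one.

```python
point = (3, 4)          # a tuple uses parentheses
print(point[0])         # 3: reading by index works like a list

try:
    point[0] = 10       # tuples do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# To "change" an immutable object, build a new one instead
string_var = "probe"
new_string = string_var[:3] + "v" + string_var[4:]
print(new_string)       # prove
```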

## Operators and Methods

There are multiple ways in which you can modify elements of a mutable data structure

### Operators

Certain math operators also apply to data structures such as lists

### Methods

Python data structures have built-in methods for many different data manipulations 
(See https://docs.python.org/3/tutorial/datastructures.html)

The "+" operator can be used to combine two lists. This is called concatenation.

In [None]:
odd = [1, 3, 5]
even = [2, 4, 6]

# Output: [1, 3, 5, 2, 4, 6]
print(odd + even)
print(odd + [2, 4, 6])

Similarly, the **append()** and **extend()** methods allow us to add to lists

In [None]:
odd = [1, 3, 5]
# using the method "append"
odd.append(7)
# Output: [1, 3, 5, 7]
print(odd)

# using the method "extend"
odd.extend([9, 11, 13])
# Output: [1, 3, 5, 7, 9, 11, 13]
print(odd)

For more control, we can use the **insert** method which allows us to insert one item at a desired location. 

We can also achieve this through list indexing by squeezing multiple items into an empty slice of a list.

In [None]:
odd = [1, 9]

# using the method "insert"
odd.insert(1,3)
# odd.insert(1,[3,5]) # Will this work?
# Output: [1, 3, 9] 
print(odd)


odd[2:2] = [5, 7]
# Output: [1, 3, 5, 7, 9]
print(odd)

The "*" operator repeats a list a given number of times.

In [None]:
odd = [1, 3, 5]
#Output: [1, 3, 5, 1, 3, 5, 1, 3, 5]
print(odd * 3)

#Output: ["re", "re", "re"]
print(["re"] * 3)

### Python List Methods:

**append()** - Add an element to the end of the list

**extend()** - Add all elements of one list to the end of another list

**insert()** - Insert an item at the defined index

**remove()** - Removes an item from the list

**pop()** - Removes and returns an element at the given index

**clear()** - Removes all items from the list

**index()** - Returns the index of the first matched item

**count()** - Returns the number of times the given item appears in the list

**sort()** - Sort items in a list in place (ascending order by default)

**reverse()** - Reverse the order of items in the list

**copy()** - Returns a shallow copy of the list
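
The methods not demonstrated so far can be tried out on a small list. The values below are arbitrary, and the comments show the list after each step.

```python
values = [5, 3, 9, 3, 1]

values.remove(3)          # removes the first 3 only
print(values)             # [5, 9, 3, 1]

last = values.pop()       # with no index, pop() removes the last item
print(last, values)       # 1 [5, 9, 3]

print(values.index(9))    # 1
print(values.count(3))    # 1

backup = values.copy()    # independent (shallow) copy
values.sort()             # sorts in place and returns None
print(values)             # [3, 5, 9]
values.reverse()
print(values)             # [9, 5, 3]
print(backup)             # [5, 9, 3] (unchanged by sort/reverse)

values.clear()
print(values)             # []
```

Note that `sort()`, `reverse()`, and `clear()` modify the list itself rather than returning a new one.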

# Arrays

Arrays are provided by the **numpy** package, which must be imported first.

Arrays are similar to lists in many ways but have a few key differences:

- All elements of an array must be of the same data type (usually numeric).
- Arrays can be N-dimensional (1D, 2D, 3D, etc.).
- Vectorized operations: math on entire arrays at once
- Memory efficiency: arrays store raw values, while lists store pointers to full Python objects


The easiest way to create a NumPy array is by converting a Python list to an array.

In [None]:
import numpy as np

my_list = [1,9,8,3]

# Convert a list to an array
numpy_array_from_list = np.array(my_list) 
print(numpy_array_from_list)


# Convert an array to a list
my_restored_list = numpy_array_from_list.tolist()
print(my_restored_list)

### Numeric Type
If the numeric type is not given explicitly, NumPy infers it from the data. To set it explicitly, pass the dtype argument, e.g. np.array(my_list, dtype=np.float32).

You can check the numeric type using the **dtype** array attribute.

In [None]:
my_list = [1,9,8,3]
numpy_array_from_list = np.array(my_list) 
print(numpy_array_from_list)

# get numeric type
print(numpy_array_from_list.dtype)
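
To complement the inferred type above, the following sketch sets the type explicitly with the `dtype` argument and converts an existing array with `astype()`. The example values are arbitrary.

```python
import numpy as np

my_list = [1, 9, 8, 3]

float_array = np.array(my_list, dtype=np.float32)
print(float_array)          # [1. 9. 8. 3.]
print(float_array.dtype)    # float32

# A list mixing integers and floats is upcast to a floating-point type
mixed_array = np.array([1, 2.5, 3])
print(mixed_array.dtype)    # float64

# astype() returns a converted copy; decimals are truncated toward zero
int_array = mixed_array.astype(np.int64)
print(int_array)            # [1 2 3]
```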

### Mathematical Operations

You can perform mathematical operations like addition, subtraction, division and multiplication on an array. 

The syntax is the array name followed by the operator (+, -, *, /) followed by the operand.

In [None]:
# Operation comparison on list vs. array
my_list = [1,2,3,4]
my_array = np.array([1, 2, 3, 4])

# Compare the following:
print(my_list * 3)
print(my_array * 3)

In [None]:
# Operations are applied individually to each element of the array

my_array = np.array([1, 2, 3, 4])
print(my_array + 10)

# This operation adds 10 to each element of the numpy array.
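
The operand can also be another array of the same shape, in which case the operation is applied to matching pairs of elements. A minimal sketch with two arbitrary arrays, plus the list-based equivalent for comparison:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print(a + b)    # [11 22 33 44]
print(a * b)    # [ 10  40  90 160]
print(b / a)    # [10. 10. 10. 10.]
print(a ** 2)   # [ 1  4  9 16]

# The equivalent with plain lists needs an explicit loop or comprehension
list_sum = [x + y for x, y in zip([1, 2, 3, 4], [10, 20, 30, 40])]
print(list_sum)  # [11, 22, 33, 44]
```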

### N-Dimensional arrays (1D, 2D, 3D, etc.)

Each dimension is written as a nested list, with the inner lists separated by a comma ",".

Note that all inner lists must sit within the outermost brackets [].


In [None]:
# 2D array example (3 rows, 4 columns)
my_2d_array = np.array([[1, 2, 3, 4],
                        [5, 6, 7, 8],
                        [9, 10, 11, 12]])
print(my_2d_array)

In [None]:
# 3D array example (2 blocks, each with 2 rows and 3 columns)
my_3d_array = np.array([[[1, 2, 3], [4, 5, 6]],
                        [[7, 8, 9], [10, 11, 12]]])
print(my_3d_array)

### Array Shape
You can check the shape of an array using the **shape** attribute.

In [None]:
my_1d_array  = np.array([1,2,3])

my_2d_array = np.array([[1, 2, 3, 4],
                        [5, 6, 7, 8],
                        [9, 10, 11, 12]])

my_3d_array = np.array([[[1, 2, 3],[4, 5, 6]],[[7, 8, 9],[10, 11, 12]]])

# get shape of arrays (one number per dimension)
print(my_1d_array.shape)  # (3,)
print(my_2d_array.shape)  # (3, 4): 3 rows, 4 columns
print(my_3d_array.shape)  # (2, 2, 3)

### Other Useful NumPy Methods and Operations

**np.zeros()** - create an array full of zeroes

**np.ones()** - create an array full of ones

**reshape()** - change the shape of the array, Ex. from wide to long

**flatten()** - converts a multi-dimensional NumPy array into a one-dimensional array

**mean()** - calculates the arithmetic mean (average) of elements in a given array along a specified axis

**max()** - returns the maximum value within an array

**min()** - returns the minimum value within an array


In [None]:
#zeros
zeros_array = np.zeros((3,4))
print(zeros_array)

#ones
ones_array = np.ones((2,3))
print(ones_array)

In [None]:
# Reshape an array
old_array  = np.array([[1,2,3], [4,5,6]])
new_array = old_array.reshape(3,2)
print(old_array)
print(new_array)

In [None]:
# Flatten an array
my_2d_array = np.array([[1, 2, 3, 4],
                        [5, 6, 7, 8],
                        [9, 10, 11, 12]])
flat_array = my_2d_array.flatten()
print(flat_array)

In [None]:
my_2d_array = np.array([[1, 2, 3, 4],
                        [5, 6, 7, 8],
                        [9, 10, 11, 12]])

print("Mean value is: ", my_2d_array.mean()) # alternatively could be np.mean(my_2d_array)
print("Max value is: ", my_2d_array.max()) # alternatively could be np.max(my_2d_array)
print("Min value is: ", my_2d_array.min()) # alternatively could be np.min(my_2d_array)
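
These methods also accept an `axis` argument, as mentioned for **mean()** above. The sketch below reuses the same 2D array: `axis=0` collapses the rows (one result per column) and `axis=1` collapses the columns (one result per row).

```python
import numpy as np

my_2d_array = np.array([[1, 2, 3, 4],
                        [5, 6, 7, 8],
                        [9, 10, 11, 12]])

print(my_2d_array.mean())         # 6.5 (all 12 elements)
print(my_2d_array.mean(axis=0))   # [5. 6. 7. 8.] one mean per column
print(my_2d_array.mean(axis=1))   # [ 2.5  6.5 10.5] one mean per row
print(my_2d_array.max(axis=0))    # [ 9 10 11 12] largest value in each column
```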


## Indexing Array Elements

Indexing elements in an array works almost identically to indexing a list, with one index (or slice) per dimension, separated by commas.


In [None]:
my_2d_array = np.array([[1, 2, 3, 4],
                        [5, 6, 7, 8],
                        [9, 10, 11, 12]])

# indexing a single element
print(my_2d_array[0, 0])       # 1 (first row, first column)
print(my_2d_array[1, 2])       # 7 (second row, third column)
print(my_2d_array[2, 1])       # 10 (third row, second column)

# indexing entire rows
print(my_2d_array[0])          # first row
print(my_2d_array[-1])         # last row

# indexing entire columns
print(my_2d_array[:, 0])       # first column
print(my_2d_array[:, 1])       # second column

# indexing subsections
print(my_2d_array[0:2, 0:2])   # First 2 rows, first 2 columns  
print(my_2d_array[:, 1:3])     # All rows, middle 2 columns
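
The main difference from lists can be seen by indexing the same data both ways. This is a small sketch with arbitrary values: a nested list only supports chained brackets, while an array also accepts a comma-separated index.

```python
import numpy as np

nested_list = [[1, 2, 3], [4, 5, 6]]
array_2d = np.array(nested_list)

# Both support chained indexing
print(nested_list[1][2])   # 6
print(array_2d[1][2])      # 6

# Only the array accepts a comma-separated index and column slices
print(array_2d[1, 2])      # 6
print(array_2d[:, 0])      # [1 4]

try:
    nested_list[1, 2]      # lists reject tuple indices
except TypeError as err:
    print("TypeError:", err)
```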

# DataFrames

### What Are Pandas DataFrames?

Before you start, let’s have a brief recap of what data frames are.

Those who are familiar with R know the data frame as a way to store data in rectangular grids that can easily be overviewed. 

Each row of these grids corresponds to measurements or values of an instance, while each column is a vector containing data for a specific variable. This means that a data frame’s rows do not need to contain, but can contain, the same type of values: they can be numeric, character, logical, etc.


### Creating DataFrames

Creating a DataFrame is the first step in almost any data-wrangling task in Python. You can build one from scratch, or convert another data structure, such as a list or NumPy array, to a Pandas DataFrame. This section covers only the latter.

In [None]:
import pandas as pd

data = np.array([[1,2],[3,4]])
col = ['Col1','Col2']

df = pd.DataFrame(data=data,columns=col)

print(df)
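
Another common way to convert existing data is to pass a dictionary of lists, where each key becomes a column name. The column names and values below are hypothetical and only illustrate that each column can hold a different type.

```python
import pandas as pd

# Hypothetical example values, one dictionary key per column
data = {
    "subject": ["s01", "s02", "s03"],
    "age": [24, 31, 28],
    "reaction_time_s": [0.51, 0.43, 0.66],
}

df = pd.DataFrame(data)
print(df)
print(df.dtypes)   # each column keeps its own data type
print(df.shape)    # (3, 3)
```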

### Fundamental DataFrame Operations

Now that your data is in a more convenient Pandas DataFrame structure, you can start working with it.

The basic operations on a DataFrame include adding, selecting, deleting, and renaming rows and columns. This section focuses on selection.

**Selecting an Index or Column**


In [None]:
# Using `iloc[]` (position-based: row 0, column 0)
print('using iloc ', df.iloc[0, 0])

# Using `loc[]` (label-based: row label 0, column 'Col2')
print('using loc ', df.loc[0, 'Col2'])

# Using `at[]`
print('using at ',df.at[0,'Col2'])

# Using `iat[]`
print('using iat ', df.iat[0,0])


How do we select just the data, columns or Index?

In [None]:
# get the values as a NumPy array
vals = df.values
print("these are the values ", vals)

# get the column names
cls = df.columns.values
print("these are the columns ", cls)

# get the index (row labels)
idx = df.index.values
print("this is the index ", idx)


From here, the extracted values are NumPy arrays, so the same operations used on arrays above are available!
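
The other basic operations named earlier (adding, renaming, deleting) can be sketched on the same small DataFrame. The new column name `Sum` is chosen only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.array([[1, 2], [3, 4]]), columns=["Col1", "Col2"])

# Adding: assign to a new column name
df["Col3"] = df["Col1"] + df["Col2"]

# Renaming: rename() returns a new DataFrame by default
df = df.rename(columns={"Col3": "Sum"})

# Deleting: drop() also returns a new DataFrame
df = df.drop(columns=["Col2"])

print(df)
print(df["Sum"].tolist())   # [3, 7]
```

Reassigning the result to `df` keeps the changes, because `rename()` and `drop()` leave the original DataFrame untouched by default.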