# Fundamentals of Python

## Working with the Jupyter Notebook

1. There are two basic types of cells, Code and Markdown. 


2. Markdown cells contain meta-text that explains what is going on in the code cells. For our purposes, the code cells contain Python code only. 


3. Some shortcuts (more can be found in the menu above):
    * "shift + enter": run the currently selected cell
    * "A" / "B": insert a new cell above or below the current cell
    

4. Within a code cell "#" is used for comments (this is true for Python code everywhere)


5. Code cells will need to be run in order to take effect and change the workspace. 

**Compared to other languages I have used, Python always seems to have 1000 ways of doing the same thing. Do not let that freak you out.**

This code was inspired by Mark Kramer's [Python for the Practicing Neuroscientist](https://hub.gke2.mybinder.org/user/mark-kramer-case-studies-python-qflt1uo9/notebooks/01.ipynb), which provides additional details that were not included here. I recommend checking that out. 

## Importing modules

In [1]:
import numpy as np

## The workspace

In [3]:
# let's define a few variables
my_dog = "golden retriever"
dog_age = 5

In [6]:
# list variables in the workspace
%who

dog_age	 my_dog	 np	 


In [8]:
# same, but more detailed
%whos 

Variable   Type      Data/Info
------------------------------
dog_age    int       5
my_dog     str       golden retriever
np         module    <module 'numpy' from '/Us<...>kages/numpy/__init__.py'>


In [9]:
# return working directory (the folder I am currently in)
%pwd

'/Users/kohler/code/git/teaching/NRSC-2200/python_fundamentals'

In [10]:
# list files in working directory
%ls

fundamentals.ipynb


In [11]:
# reset the workspace
%reset

Once deleted, variables cannot be recovered. Proceed (y/[n])? n
Nothing done.


## Variables

Detailed and useful information about variable types [here](https://physics.nyu.edu/pine/pymanual/html/chap3/chap3_arrays.html).

In [12]:
# NUMBERS (float and integers)

# Integers are numbers without decimal points.
num_int = 8

# Floats are numbers with decimal points.
num_flo = 5.0

# you can convert one to the other
print(float(num_int))
print(int(num_flo))

8.0
5


In [13]:
# STRINGS
# Strings are sequences of characters. Any character that you can type from a computer keyboard, 
# plus a variety of other characters, can be elements in a string.
# Strings can be defined using both double and single quotes

a = "My dog's name is"
b = "Bingo"
c = a + " " + b # you can add strings

print(c)

# use format to generate strings
d = "My dog {} is a {}, he is {} years old".format("Bingo", my_dog, 5)

print(d)

# use format to specify the formatting of numbers in a string
e = "My dog {} is a {}, he is {:.2f} years old".format("Bingo", my_dog, 5)

print(e)

My dog's name is Bingo
My dog Bingo is a golden retriever, he is 5 years old
My dog Bingo is a golden retriever, he is 5.00 years old
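
As an aside, here is a minimal sketch of the same two strings built with f-strings (available in Python 3.6 and later), where the variable names go directly inside the braces. The variable `dog_name` is introduced only for this illustration and is not part of the notebook above.

```python
# f-strings: an alternative to the format method (Python 3.6+)
my_dog = "golden retriever"
dog_name = "Bingo"  # hypothetical variable, added only for this example
dog_age = 5

# variables are inserted where their names appear inside the braces
f_plain = f"My dog {dog_name} is a {my_dog}, he is {dog_age} years old"

# the same format specifiers work after a colon
f_float = f"My dog {dog_name} is a {my_dog}, he is {dog_age:.2f} years old"

print(f_plain)
print(f_float)
```

Both approaches give the same result, so which one to use is mostly a matter of taste.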


In [14]:
# LISTS
# The elements of lists can be numbers or strings, or both. 
# Lists (we will discuss tuples later) are defined by a pair of square brackets on either end 
# with individual elements separated by commas. 

# can be numbers
my_list = [1, 2, 3, 4]

# or strings
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

# or combinations
test = [203, "dog", np.pi]

# you can multiply and add lists
print(fruits*2)
print(fruits + fruits)

['apple', 'banana', 'cherry', 'kiwi', 'mango', 'apple', 'banana', 'cherry', 'kiwi', 'mango']
['apple', 'banana', 'cherry', 'kiwi', 'mango', 'apple', 'banana', 'cherry', 'kiwi', 'mango']


In [15]:
# list comprehension (making a new list from an existing list - so powerful):
 
fruits_with_a = [x for x in fruits if "a" in x]

print(fruits_with_a) 

['apple', 'banana', 'mango']
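
If the list comprehension looks cryptic, it may help to see it next to an ordinary loop. The sketch below should build the same list both ways; the name `fruits_with_a_loop` is made up for this comparison.

```python
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

# the list comprehension from the cell above
fruits_with_a = [x for x in fruits if "a" in x]

# the same logic written as an explicit loop
fruits_with_a_loop = []
for x in fruits:
    if "a" in x:                      # keep only fruits containing the letter "a"
        fruits_with_a_loop.append(x)

print(fruits_with_a == fruits_with_a_loop)
```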


In [16]:
# use zip to combine lists

hemi = [ "L", "R" ]
brain_area = ["V1", "V2", "V3"]

area_list = [x + "-" + y for x,y in zip(brain_area*2, hemi*3)]
print(area_list)

['V1-L', 'V2-R', 'V3-L', 'V1-R', 'V2-L', 'V3-R']
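
To see what zip is doing, it can help to print the pairs it produces. The sketch below also shows a nested comprehension as one possible alternative for building every area/hemisphere combination; `all_pairs` is a name chosen only for this example.

```python
hemi = ["L", "R"]
brain_area = ["V1", "V2", "V3"]

# zip pairs up elements position by position and stops at the shorter input,
# which is why the lists were repeated (brain_area*2, hemi*3) in the cell above
short_pairs = list(zip(brain_area, hemi))
print(short_pairs)

# a nested comprehension loops over every combination of area and hemisphere
all_pairs = [a + "-" + h for a in brain_area for h in hemi]
print(all_pairs)
```

Note that the nested comprehension returns the combinations grouped by area, so the order differs from the zip version.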


In [19]:
# use sort to sort lists
area_list.sort()
print(area_list)

# sort is a method, but unlike many other methods and functions
# sort does not return anything useful (it returns None), so the output should not be assigned to a variable
# instead, sort changes the content of the variable area_list

['V1-L', 'V1-R', 'V2-L', 'V2-R', 'V3-L', 'V3-R']
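
If you want a sorted copy and need to keep the original list untouched, the built-in function `sorted` is one option. A small sketch, using a made-up list called `unsorted_list`:

```python
unsorted_list = ["V2-R", "V1-L", "V3-L"]  # example list, made up for this sketch

sorted_copy = sorted(unsorted_list)    # sorted() returns a new, sorted list
print(sorted_copy)
print(unsorted_list)                   # the original list is unchanged

return_value = unsorted_list.sort()    # the sort method changes the list in place
print(return_value)                    # None
print(unsorted_list)                   # now the original is sorted too
```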


In [20]:
area_list.sort?

In [None]:
# you can create lists using functions like range([start], stop, [step])

num_list = [x for x in range(0, 10) ] # numbers from 0-9, step size 1 (default)
num_list = [x for x in range(0, 10, 2) ] # numbers from 0-9, step size 2 = skip odd numbers

print(num_list)
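
A few more range examples may make the start, stop, and step arguments clearer. This is only a sketch; the variable names are made up for the illustration.

```python
# only stop given: start defaults to 0 and step to 1
first_five = list(range(5))
print(first_five)

# start and stop: the stop value itself is excluded
two_to_four = list(range(2, 5))
print(two_to_four)

# a negative step counts down
countdown = list(range(10, 0, -3))
print(countdown)
```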

In [22]:
# DICTIONARIES
# Collection of Python objects, just like a list, but one that is indexed by strings or numbers 

d = {} # define empty dictionary

d["last name"] = "Kohler" # add key-value pairs

d["first name"] = "Peter"

d["birthday"] = "October 13"

# print dictionary
print(d)

{'last name': 'Kohler', 'first name': 'Peter', 'birthday': 'October 13'}


In [23]:
print(d["last name"]) # get value for a specific key

Kohler


In [24]:
# define set of keys / values

area_sizes = {"V1": 500, "V2": 300, "V3": 200}

print(area_sizes.values()) # get all values from a dictionary 
print(area_sizes.keys()) # get all keys from a dictionary 

dict_values([500, 300, 200])
dict_keys(['V1', 'V2', 'V3'])
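
Keys and values often need to be used together. The sketch below shows one common way to do that with the `items` method, plus two ways of handling a key that may not exist; the key "V4" is used only as an example of a missing key.

```python
area_sizes = {"V1": 500, "V2": 300, "V3": 200}

# items() gives key-value pairs that can be unpacked in a loop
for area, size in area_sizes.items():
    print("{} has size {}".format(area, size))

# "in" checks whether a key exists
has_v4 = "V4" in area_sizes
print(has_v4)

# get() returns a default value instead of raising an error for a missing key
v4_size = area_sizes.get("V4", 0)
print(v4_size)

# values() can be passed to functions like sum
total_size = sum(area_sizes.values())
print(total_size)
```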


In [26]:
# NUMPY ARRAYS
# The elements of a NumPy array, or simply an array, are usually numbers, 
# but can also be booleans, strings, or other objects. 
# When the elements are numbers, they must all be of the same type. 
# For example, they might be all integers or all floating point numbers.

# here's an array with a single row (the double brackets make it two-dimensional, with shape (1, 4))
my_array = np.array([[0,1,2,3]])

# use shape to get the shape of the array
my_array.shape 
# returns a tuple, similar to a list

(1, 4)

In [27]:
# we can get specific dimensions by indexing the tuple
my_array.shape[0]

1

In [28]:
# now let's make a two-dimensional array, a matrix: 

my_array = np.array([[0, 0, 0], [1, 1, 1]])
print( "here's my array: \n {}".format(my_array) )

# numpy has a ton of functions for making arrays:
zero_array = np.zeros_like(my_array) # make an array filled with zeros that has the same shape and type as my_array
print( "here's a version filled with zeros:\n {}".format(zero_array) )

# again use the shape attribute to get the shape of the array
print("the shape of my array is {} by {}".format( my_array.shape[0], my_array.shape[1] ) )

here's my array: 
 [[0 0 0]
 [1 1 1]]
here's a version filled with zeros:
 [[0 0 0]
 [0 0 0]]
the shape of my array is 2 by 3


In [29]:
# note that shape is an attribute of the numpy array, so it is used without parentheses
# a method is a function that "belongs to" an object, like an np.array variable,
# and is called with parentheses
  
# np arrays have many methods, for example: mean, sum, max
my_array.sum(1)

# note that the corresponding stand-alone function does the same thing
np.sum(my_array, 1)
print("built-in function: {}, stand-alone function: {}".format( my_array.sum(1), np.sum(my_array, 1) ) )

built-in function: [0 3], stand-alone function: [0 3]


While some functions are also available as methods, this is not true for all functions. 
A relevant example: 
    
    np.nanmean() 

... which gives you the mean while excluding any NaNs in the array.  
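
Here is a small sketch of the difference, using a made-up array called `data` that contains one NaN:

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0])  # example data, made up for this sketch

plain_mean = np.mean(data)      # a single NaN makes the ordinary mean NaN
nan_mean = np.nanmean(data)     # nanmean ignores the NaN and averages 1.0 and 3.0

print(plain_mean)
print(nan_mean)

# nanmean only exists as a stand-alone function, not as an array method
print(hasattr(data, "nanmean"))
```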

Finally, note that format, which we used above, is a method of strings in Python:
    
    "{}, {}".format(x, y ...) 

In [30]:
# now that we've defined some variables, we can use whos to see what they are
%whos

Variable        Type       Data/Info
------------------------------------
a               str        My dog's name is
area_list       list       n=6
area_sizes      dict       n=3
b               str        Bingo
brain_area      list       n=3
c               str        My dog's name is Bingo
d               dict       n=3
dog_age         int        5
e               str        My dog Bingo is a golden <...>ver, he is 5.00 years old
fruits          list       n=5
fruits_with_a   list       n=3
hemi            list       n=2
my_array        ndarray    2x3: 6 elems, type `int64`, 48 bytes
my_dog          str        golden retriever
my_list         list       n=4
np              module     <module 'numpy' from '/Us<...>kages/numpy/__init__.py'>
num_flo         float      5.0
num_int         int        8
test            list       n=3
zero_array      ndarray    2x3: 6 elems, type `int64`, 48 bytes


In [43]:
# we can also inspect the variable type for a specific variable, using the "type" function
type(area_sizes)

dict

## Manipulating arrays and broadcasting
Broadcasting is a powerful feature of NumPy that allows you to perform operations using arrays of different shapes.

In [89]:
base_array = np.array([[2,2,2],[4,4,4]])
print(base_array)

# we can manipulate arrays by adding, multiplying, dividing, by single numbers (scalars)

print(base_array + 1)

print(base_array * 2)

[[2 2 2]
 [4 4 4]]
[[3 3 3]
 [5 5 5]]
[[4 4 4]
 [8 8 8]]


In [92]:
# we can also perform operations using two arrays

print(base_array + np.array([[0,0,0],[1,1,1]]))

[[2 2 2]
 [5 5 5]]


In [97]:
# let's try to add two arrays that have different shapes
sub_array = np.array([[1,2,3]])

print("shape of base is {}, shape of sub is {}".format(base_array.shape, sub_array.shape))

shape of base is (2, 3), shape of sub is (1, 3)


In [99]:
# when arrays have different shapes, broadcasting extends the shape of the smaller array to fit the larger array
# subject to certain constraints

print(base_array - sub_array)

[[ 1  0 -1]
 [ 3  2  1]]


Here,

    [ [2, 2, 2], [4, 4, 4] ] - [ 1, 2, 3]
    
is evaluated as:

    [ [2-1, 2-2, 2-3], [4-1, 4-2, 4-3] ] = [ [ 1  0 -1], [ 3  2  1] ]

Note that an operation with a scalar is just the simplest case of this:
    
    [1, 2, 3] * a
    
is evaluated as:

    [1*a, 2*a, 3*a]
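
The constraints mentioned above come down to the shapes being compatible: comparing the shapes from the last dimension backwards, each pair of dimensions must either be equal or one of them must be 1. A rough sketch of one case that works and one that does not, using the made-up arrays `col_array` and `bad_array`:

```python
import numpy as np

base_array = np.array([[2, 2, 2], [4, 4, 4]])   # shape (2, 3)

# shape (2, 1): the single column is stretched across the three columns of base_array
col_array = np.array([[10], [20]])
col_result = base_array + col_array
print(col_result)

# shape (2,): the last dimensions are 3 and 2, which are neither equal nor 1
bad_array = np.array([1, 2])
failed = False
try:
    base_array + bad_array
except ValueError as err:
    failed = True
    print("broadcasting failed:", err)
```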

## Selecting subsets from a list

In [48]:
# you can index lists to grab individual elements
print(area_list[0]) # grab first element of area list

# note that indexing in Python is zero-based, indices start at zero

V1-L


In [49]:
print(area_list[:2]) # grab first two elements of area list

['V1-L', 'V1-R']


In [50]:
print(area_list[2:]) # grab all elements except the first two

['V2-L', 'V2-R', 'V3-L', 'V3-R']


In [45]:
print(area_list[-2:]) # grab the last two elements (negative values mean starting from the back)

['V3-L', 'V3-R']


In [47]:
V3_list = [x for x in area_list if "V3" in x ] # of course, you can also use list comprehension

print(V3_list)

['V3-L', 'V3-R']
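
Slices follow the same start, stop, step pattern as range, with the stop index excluded. A few more examples as a sketch (the variable names are made up for the illustration):

```python
area_list = ["V1-L", "V1-R", "V2-L", "V2-R", "V3-L", "V3-R"]

middle = area_list[1:4]       # elements at index 1, 2 and 3 (index 4 is excluded)
print(middle)

every_other = area_list[::2]  # every second element, starting from the first
print(every_other)

last_one = area_list[-1]      # a single negative index grabs from the back
print(last_one)
```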


## Selecting subsets from an array
### Numerical Indexing

In [74]:
my_array = np.array([[1, 2, 3], [4, 5, 6]])

my_array[0,1] # grab second element in first row
              # note the square brackets

2

In [67]:
my_array[:,1] # grab entire second column

array([2, 5])
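
In a two-dimensional array the first index selects rows and the second selects columns, and a colon means "everything along this dimension". A short sketch with a few more selections; the variable names are made up for this example.

```python
import numpy as np

my_array = np.array([[1, 2, 3], [4, 5, 6]])

first_row = my_array[0, :]     # all columns of the first row
second_col = my_array[:, 1]    # all rows of the second column
last_two_cols = my_array[:, 1:]  # slices work along each dimension

print(first_row)
print(second_col)
print(last_two_cols)
```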

### Logical Indexing

In [102]:
# let's define a matrix of random numbers 
test = np.random.randn(2,5) # returns a 2 x 5 array of random numbers drawn from a standard normal distribution

In [73]:
non_zero_test = test[test > 0] # returns the positive values of test (as a one-dimensional array)
                               # note the square brackets

print(non_zero_test)

[0.01904696 1.22586239 1.14557532]


In [70]:
lgl_idx = test > 0 # this is a logical index (can be applied to any array that has the same shape)

print("here is the logical index: \n {}".format(lgl_idx))

here is the logical index: 
 [[ True False False False False]
 [False  True False  True False]]


In [71]:
# use the nonzero method to get the numerical indexes of the True entries

num_idx = lgl_idx.nonzero()
print("here is the numerical index: \n {}".format(num_idx))

here is the numerical index: 
 (array([0, 1, 1]), array([0, 1, 3]))


In [72]:
# now we can use the numerical index to grab the positive values
print(test[num_idx])

[0.01904696 1.22586239 1.14557532]


In [82]:
# now we want to identify the absolute values bigger than 1

big_num_test = test[np.abs(test) > 1]

print(big_num_test)

[ 1.22586239 -1.47862988  1.14557532]


In [175]:
# what if we want to identify the index associated with some value 
# e.g. minimum and maximum of an array? 

# np.min and np.max return the minimum and maximum values
max_val = np.max(test)
min_val = np.min(test)

# create a logical index identifying positions in test where the value is equal to the max
# in our case, there will only be one such position
max_lgl_idx = test==max_val

# we can again use nonzero to convert to numerical index

max_num_idx = max_lgl_idx.nonzero()

# they can then be used to index into the same variable or other variables

print( "np.max: {:.2f}, index method: {:.2f}".format(max_val, test[max_num_idx].item()) )

np.max: 1.78, index method: 1.78
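
In keeping with the 1000-ways theme, NumPy also has `np.argmax` (and `np.argmin`), which return the position of the maximum directly. The sketch below is an alternative to the logical-index approach, not a replacement for it. It assumes a seeded random generator so the example is repeatable, which differs from the `np.random.randn` call used above.

```python
import numpy as np

# seeded generator: an assumption made only so this example is repeatable
rng = np.random.default_rng(0)
test = rng.standard_normal((2, 5))

# argmax returns the position of the maximum in the flattened array
flat_idx = np.argmax(test)

# unravel_index converts the flat position into a (row, column) pair
row, col = np.unravel_index(flat_idx, test.shape)

print("max is at row {}, column {}".format(row, col))
print(test[row, col] == np.max(test))
```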


## How to get help?
Adding "?" after the name of a function or variable (as in the cells below) will give you information about it. 

In [33]:
# can be applied to functions
np.nanmean?

In [38]:
# methods
area_list.sort?

In [39]:
# modules
np?

In [40]:
# and even variables
area_sizes?
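
The "?" shortcut belongs to Jupyter/IPython. Outside a notebook, plain Python offers the built-in `help` function and the `__doc__` attribute, which show similar information. A minimal sketch, using `list.sort` as the example:

```python
# __doc__ holds the documentation string of a function or method
sort_doc = list.sort.__doc__
print(sort_doc)

# help() prints a formatted version of the same documentation
help(list.sort)
```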