## Preliminaries

If you want to follow along (and you should), please do the following in the command line:

1. Download the repository:
   ```git clone https://github.com/LorenFrankLab/franklab_python_tutorial.git```
2. Change into the `franklab_python_tutorial` folder:
   ```cd franklab_python_tutorial```
3. Create the conda environment (this installs the packages into the environment):
   ```conda env create -f environment.yml```
4. Activate the conda environment:
   ```conda activate franklab_python_tutorial```
5. Install the repository in editable mode so that you can change its code and use the changes right away (**WARNING: note the period**):
   ```pip install --editable .```
6. Run Jupyter Lab:
   ```jupyter lab```

### What did we just do?

1. We used `git` to download a folder (repository) full of the files we want from GitHub.
    + GitHub is a service that allows you to store these folders "in the cloud" (aka on some other computer).
    + `git` itself is a way to keep track of the changes in a repository folder and to synchronize them with changes made on other computers.
2. We went into the `franklab_python_tutorial` folder.
3. We used `conda` to create an environment from the `environment.yml` file located in the `franklab_python_tutorial` folder.
    + An environment is an isolated container that you install software packages into.
    + Python has many useful software packages made by other people that you want to use.
    + The reason you use a container to hold these is that these software packages often rely on other software packages. Making sure these software packages play together well is a hard problem.
    + I make an environment for each project I'm working on so that the software packages from one project don't interfere with the software packages from another project.
    + `conda` is a software program that helps you manage these dependencies as well as environments. `pip` is another tool for installing software packages but it doesn't have environments. You can use pip to install software packages into conda environments.
    + The `environment.yml` file is a convenient way of installing software packages we want to use like `python`, `numpy`, `scipy`. These packages don't have to be python based. The reason you would use an `environment.yml` instead of installing each package independently is that this allows `conda` to figure out all the dependencies together. If you did this serially (installing one after another), then `conda` has to figure out the dependencies based on what's previously installed each time and this can lead to a suboptimal configuration of packages.
    + That being said, when working on a project you will inevitably have to install other packages you did not think of. You can do this by typing `conda install <package>` in the command line. You have to make sure this is in the activated environment (see below).
4. We activated the conda environment to tell the computer which environment we are using. By default there is the `base` environment, which has one set of packages installed. To switch to the environment we just created, we run `conda activate franklab_python_tutorial`. We can switch back to the base environment by running `conda activate base`. To see which packages are in the active environment, we can run `conda list`.
5. We want to be able to run and change code we put in the `franklab_python_tutorial` folder. Using `pip install .` alone installs `franklab_python_tutorial` using the `setup.py` file and allows us to import and run the existing code. But if we want to be able to change that code and see the updates without repeatedly running `pip install .`, we use the `--editable` flag. I have found very few cases where we don't want to be able to do this.
6. We started Jupyter Lab -- a program in your browser that allows you to make jupyter notebooks. Jupyter notebooks are an interactive way to write code that also can have text, images, and plots. It is good for explanation and prototyping code. This code can be Julia, Python or R (Ju, Pyt, R).
    + Importantly, **this is not the only way to write code for python and execute it** and should not be the only way you write code for python. It is convenient because it is interactive and allows you to prototype and explain things, but it can lead to bad code. More on this later.
    
## Jupyter Notebook basics
You can execute python code in Jupyter Lab notebooks by pressing SHIFT + ENTER, or by selecting the cell and hitting the play button in the toolbar above.

**Warning**: You can execute code in cells in Jupyter Lab out of order. This can lead to variables being assigned a different thing than you intended. Good practice is to periodically run your notebooks from start to finish (or better yet, don't keep all your code in notebooks).

## Python basics

You can find a lot of this material at: https://docs.python.org/3/tutorial/

### Numbers

In python you can assign variables with an equal sign. This is how you assign an **integer** (..., -2, -1, 0, 1, 2, ...) to a variable.

In [1]:
x = 1
x

1

In [2]:
type(x)

int

You can even assign multiple variables in a single line

In [3]:
x = y = 1

y

1

We can do multiplication with this variable.

In [4]:
x * 3

3

In [5]:
type(x * 3)

int

It is important to know whether you are working with floating point numbers or integers because some operations behave differently on them. A floating point number (float) is how python represents numbers that can have a fractional part. Floats are written with a decimal point. They are special because a computer is digital and cannot store them with infinite precision. For example, division with `/` always returns a float:

In [6]:
x / 3

0.3333333333333333

In [7]:
type(x / 3)

float
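
Because floats are stored with finite precision, sums that look exact on paper can be off by a tiny amount. Here is a minimal sketch of that effect. It uses `math.isclose` from the standard library to compare floats with a tolerance. The variable names are only for illustration.

```python
import math

total = 0.1 + 0.2

# The stored result is very close to, but not exactly, 0.3
is_exactly_equal = total == 0.3

# Compare floats with a tolerance instead of using ==
is_close = math.isclose(total, 0.3)

print(total, is_exactly_equal, is_close)
```

The takeaway is to avoid `==` when comparing the results of float arithmetic.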

If you intend your number to be a float and it can be misconstrued as an integer, it is good practice to put the decimal point there

In [8]:
x = 3.0 # instead of x = 3

In [9]:
type(x)

float

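You can also do floor division (often called integer division) with two forward slashes. It rounds the result down to the nearest whole number. Note that the result here is still a float because `x` is a float: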

In [10]:
x // 3

1.0

In [11]:
type(x // 3)

float
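
The type of the result of `//` depends on the operands, and the rounding always goes down rather than toward zero. A short sketch with made-up variable names shows both points:

```python
# With two integers, // returns an integer
int_result = 7 // 2          # 3

# If either operand is a float, the result is a float
float_result = 7.0 // 2      # 3.0

# // rounds toward negative infinity, not toward zero
negative_result = -7 // 2    # -4

# % gives the remainder that goes with //
remainder = 7 % 2            # 1

print(int_result, float_result, negative_result, remainder)
```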

Here are some other useful math operations. In python, comments are denoted with an octothorpe (`#`), `**` means exponentiation, and parentheses can be used to group things:

In [12]:
1 - (x + 1) ** 3

-63.0

### Strings

Python variables do not have a declared type, so you can reassign a variable to a value of a different type. Here we use single or double quotation marks to indicate that the value is a string and not a number.

In [13]:
x = 'a'

x

'a'

You can concatenate two string variables with a `+`

In [14]:
x + 'b'

'ab'

Adjacent strings inside parentheses with no commas between them will also be concatenated. This is helpful when breaking up long strings.

In [15]:
('Neuroscience (or neurobiology) is the scientific study of the nervous system. It is a multidisciplinary science that combines physiology, anatomy, molecular biology, developmental biology'
 ', cytology, computer science and mathematical modeling to understand the fundamental and emergent properties of neurons and neural circuits.')

'Neuroscience (or neurobiology) is the scientific study of the nervous system. It is a multidisciplinary science that combines physiology, anatomy, molecular biology, developmental biology, cytology, computer science and mathematical modeling to understand the fundamental and emergent properties of neurons and neural circuits.'

There are also what are known as formatted strings ('f-strings'). These allow you to easily put variables in strings and enhance the readability of the code. Use them:

In [16]:
n_chickens = 3

f'There are {n_chickens} chickens'

'There are 3 chickens'

NOTE: by convention people use snake_case in python (underscore to separate words such as `n_chickens`) except for classes. This is different from Matlab, which conventionally uses camelCase. There are a lot of conventions in python, and following them generally makes your code more readable and consistent.

You can also use format strings to format the variable. For example, say we want to pad the number with leading zeros to a total width of four digits.

In [17]:
f'There are {n_chickens:04d} chickens'

'There are 0003 chickens'
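
The part after the colon inside the braces is a format specification. Here is a small sketch of a few common ones. The variable `weight_kg` is made up for this example.

```python
n_chickens = 3
weight_kg = 2.4567

# Zero-pad an integer to a width of 4
padded = f'{n_chickens:04d}'            # '0003'

# Round a float to two decimal places
rounded = f'{weight_kg:.2f}'            # '2.46'

# Expressions are allowed inside the braces
doubled = f'{n_chickens * 2} chickens'  # '6 chickens'

print(padded, rounded, doubled)
```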

You can learn about other ways to do string formatting here: https://pyformat.info/


There are a lot of built-in methods for strings that come with python. They are handy to learn. Learn about them here: https://docs.python.org/3/library/stdtypes.html#string-methods. Here are a few:

In [18]:
x = 'brown bear'

x.upper()

'BROWN BEAR'

In [19]:
x.lower()

'brown bear'

In [20]:
x.title()

'Brown Bear'

In [21]:
x.startswith('fox')

False

In [22]:
x.endswith('bear')

True

### Booleans and Comparisons

Notice that these last two introduce another data type: Booleans. Booleans are either `True` or `False`, and the first letter is always capitalized.

In [23]:
type(x.endswith('bear'))

bool

In [24]:
True or False

True

In [25]:
True and False

False

In [26]:
False or True

True

In [27]:
x = True

not x

False

In [28]:
not x or x

True

In [29]:
'a' in 'Bear'

True

In [30]:
'c' in 'Bear'

False

In [31]:
'b' in 'Bear'

False
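
Comparisons also produce Booleans. The sketch below shows the usual comparison operators and why `'b' in 'Bear'` is `False`: membership tests on strings are case-sensitive. The variable names are only for illustration.

```python
n_rats = 5

is_equal = n_rats == 5            # True
is_not_equal = n_rats != 5        # False
is_in_range = 0 < n_rats <= 10    # True, comparisons can be chained

# Membership tests on strings are case-sensitive
has_lower_b = 'b' in 'Bear'              # False
has_b_any_case = 'b' in 'Bear'.lower()   # True

print(is_equal, is_not_equal, is_in_range, has_lower_b, has_b_any_case)
```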

## Data Structures

A data structure is an object that can contain other python objects. This includes numbers, strings, functions, and even other data structures. Basically they are ways to group things.

There are four main data structures:
1. Lists
2. Tuples
3. Dictionaries
4. Sets


### Lists

Lists are the most generic type of data structure. For example here is a list that contains numbers:

In [32]:
squares = [1, 4, 9, 16, 25]
squares

[1, 4, 9, 16, 25]

It can also contain letters or even numbers and letters

In [33]:
letters = ['a', 'b', 'c']
letters

['a', 'b', 'c']

In [34]:
letters_and_numbers = [1, 'a', 2]

letters_and_numbers

[1, 'a', 2]

In [35]:
[type(element) for element in letters_and_numbers]

[int, str, int]

You can access elements of the list by using an index **which starts from zero**. For example this is the first element:

In [36]:
letters_and_numbers[0]

1

And second element:

In [37]:
letters_and_numbers[1]

'a'

You can also access elements of the list counting from the end. In Matlab, this is like the `end` statement. For example this is the last element of the list:

In [38]:
letters_and_numbers[-1]

2

And the second to last number:

In [39]:
letters_and_numbers[-2]

'a'

In [40]:
letters_and_numbers[-2] == letters_and_numbers[1]

True

In [41]:
letters_and_numbers[-3] == letters_and_numbers[0]

True

In [42]:
letters_and_numbers[-1] == letters_and_numbers[2]

True

We can figure out the length of a list by using the `len` function:

In [43]:
len(letters_and_numbers)

3

We can check if something is in the list by using `in`

In [44]:
'a' in letters_and_numbers

True

We can also "slice" lists, meaning we can access a subset of a list. We do this by using `list[start:stop:step]`, where `start` is the index where the slice begins, `stop` is the last index of the slice + 1, and `step` is the size of the step to take between start and stop.

**The key point is the stop value represents the first value that is not in the selected slice**

For example, if we only want the first two elements of the list, the indices are 0 and 1, so we set start=0, stop=2, and step=1:

In [45]:
letters_and_numbers[0:2:1]

[1, 'a']

Let's say we want every other element of the list aka indices 0 and 2. Then we want start=0, stop=3, and step=2:

In [46]:
letters_and_numbers[0:3:2]

[1, 2]

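Another feature is that we can use `None` in place of `start`, `stop`, or `step` to get the default value: the beginning of the list for `start`, the end of the list for `stop` (when the step is positive), and 1 for `step`. For example: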

In [47]:
letters_and_numbers[None:None:2]

[1, 2]

Another allowable shorthand instead of using `None` is simply omitting it.

In [48]:
letters_and_numbers[::2]

[1, 2]

You can also do this for stop or start:

In [49]:
letters_and_numbers[:2:]

[1, 'a']

In [50]:
letters_and_numbers[2::]

[2]

An even further shorthand leaves out the step at the end:

In [51]:
letters_and_numbers[:2]

[1, 'a']

In [52]:
letters_and_numbers[2:]

[2]

We can also make the step negative. Let's set step=-1:

In [53]:
letters_and_numbers[::-1]

[2, 'a', 1]

We can see that this reverses the list. This is equivalent to:

In [54]:
letters_and_numbers[3:-4:-1]

[2, 'a', 1]

Or more generally:

In [55]:
n_letters_and_numbers = len(letters_and_numbers)

letters_and_numbers[n_letters_and_numbers:-n_letters_and_numbers-1:-1]

[2, 'a', 1]

One last handy thing to know about is the `slice` object, which is equivalent to the list[start:stop:step] notation.

In [56]:
letters_and_numbers[slice(0, 2, 1)]

[1, 'a']

For more information on slicing and how it works, see: https://stackoverflow.com/questions/509211/understanding-slice-notation
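
If you are ever unsure what a slice will select, one way to check is the `indices` method of a `slice` object. Given the length of the list, it returns the concrete `(start, stop, step)` that python will use. The sketch below applies it to the examples above. Note that in the returned tuple a stop of `-1` means "one before index 0", not "the last element".

```python
letters_and_numbers = [1, 'a', 2]
n = len(letters_and_numbers)

# Omitted (None) values become the defaults for the step direction
forward = slice(None, None, 2).indices(n)     # (0, 3, 2)
backward = slice(None, None, -1).indices(n)   # (2, -1, -1)

# Out-of-range values are clipped to the ends of the list
clipped = slice(3, -4, -1).indices(n)         # (2, -1, -1)

# The slice selects the same elements as range() over the resolved indices
reversed_list = [letters_and_numbers[i] for i in range(*backward)]

print(forward, backward, clipped, reversed_list)
```

This is why `letters_and_numbers[3:-4:-1]` and `letters_and_numbers[::-1]` give the same result for a list of length 3.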

**Another key feature of lists is that they are mutable**. This means that you can change a list after you have created it, which includes making it longer or changing a particular element in the list.

In [57]:
letters_and_numbers

[1, 'a', 2]

In [58]:
letters_and_numbers[1] = 'b'
letters_and_numbers

[1, 'b', 2]

You can even use slicing to change a list

In [59]:
letters_and_numbers[:2] = [1, 'a']
letters_and_numbers

[1, 'a', 2]

You can also add elements to the end of the list

In [60]:
letters_and_numbers.append('c')
letters_and_numbers

[1, 'a', 2, 'c']

You can even append another list. Note that the whole list is added as a single element:

In [61]:
letters_and_numbers.append(['j', 'f', 'k'])
letters_and_numbers

[1, 'a', 2, 'c', ['j', 'f', 'k']]
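
Appending a list nests it inside the original list. If the goal is to add each element separately, the built-in list method `extend` does that. Here is a short sketch comparing the two, with variable names chosen only for this example:

```python
nested = [1, 'a']
nested.append(['j', 'f', 'k'])   # adds the whole list as ONE element

flat = [1, 'a']
flat.extend(['j', 'f', 'k'])     # adds each element separately

print(nested)  # [1, 'a', ['j', 'f', 'k']]
print(flat)    # [1, 'a', 'j', 'f', 'k']
```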

In [62]:
letters_and_numbers.insert?

Signature: letters_and_numbers.insert(index, object, /)
Docstring: Insert object before index.
Type:      builtin_function_or_method


In [63]:
letters_and_numbers.insert(1, 'b')
letters_and_numbers

[1, 'b', 'a', 2, 'c', ['j', 'f', 'k']]

We can also concatenate lists using the `+` operator:

In [64]:
[1, 2, 3] + ['a', 'b', 'c']

[1, 2, 3, 'a', 'b', 'c']

You can also *unpack* one list into another using the `*` operator:

In [65]:
a = [1, 2, 3]

['blah', *a]

['blah', 1, 2, 3]

### Tuples

Tuples are just like lists, but you can't change them once they are created. This is good if you don't intend for the data to be changed, and tuples are also slightly faster and more memory efficient. For example, a tuple of two numbers is created with a comma separating the two elements:

In [66]:
a = (1, 2)
a

(1, 2)

**Note that you do not have to have the parentheses. They are there by convention and for visual convenience.**

This also works:

In [67]:
a = 1, 2
a

(1, 2)

This can trip you up if you accidentally put in a comma somewhere because you will create a tuple, e.g.:

In [68]:
a = 1,
a

(1,)

Tuples cannot be changed once they are created:

In [149]:
b = 1, 5, 7

b[1] = 3

TypeError: 'tuple' object does not support item assignment

Like lists, tuples can hold anything as an element including another list or tuple:

In [70]:
c = (1, 2, 3), 'a', [4, 5, 6]
c

((1, 2, 3), 'a', [4, 5, 6])

You can index into tuples just like lists:

In [71]:
c[2]

[4, 5, 6]

In [72]:
c[0]

(1, 2, 3)

In [73]:
c[1]

'a'

We can also use tuples to assign variables individually

In [74]:
a, b = 1, 2

In [75]:
a

1

In [76]:
b

2

Or even swap things

In [77]:
x, y = 3, 4

a, b = y, x

a, b

(4, 3)
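
The same mechanism lets us swap two existing variables in place without a temporary variable. A minimal sketch, which also shows a starred name collecting the leftover elements when unpacking:

```python
a, b = 1, 2

# The right-hand side is packed into a tuple first, then unpacked
a, b = b, a

# A starred name collects whatever is left over into a list
first, *rest = [10, 20, 30]

print(a, b, first, rest)
```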

### Dictionaries

A dictionary maps keys to values. The key works like an index: you give the dictionary the key and it returns the corresponding value (instead of giving it a position). Keys must be hashable, which in practice means immutable things (numbers, strings, tuples containing immutable things). Values can be any python object. Dictionaries are constructed like so:

In [78]:
n_animals = {'n_frogs': 3, 'n_bats': 2}

n_animals

{'n_frogs': 3, 'n_bats': 2}

Items in a dictionary can be accessed by giving the key

In [79]:
n_animals['n_bats']

2

You can also initialize a dictionary this way:

In [80]:
n_animals2 = dict(n_frogs=3, n_bats=2)
n_animals2

{'n_frogs': 3, 'n_bats': 2}

You can add key, value pairs to a dictionary:

In [81]:
n_animals['n_rats'] = 30
n_animals

{'n_frogs': 3, 'n_bats': 2, 'n_rats': 30}

As well as remove key, value pairs

In [82]:
n_animals.pop('n_bats')
n_animals

{'n_frogs': 3, 'n_rats': 30}

We can check which keys are in the dictionary by using `in`

In [83]:
'n_bats' in n_animals

False

In [84]:
'n_rats' in n_animals

True

We can list the keys by converting the dictionary to a `list` or `tuple` or simply using the `keys` method.

In [85]:
list(n_animals)

['n_frogs', 'n_rats']

In [86]:
tuple(n_animals)

('n_frogs', 'n_rats')

In [87]:
n_animals.keys()

dict_keys(['n_frogs', 'n_rats'])

We can use the `values` method to get the values of the dictionary:

In [88]:
n_animals.values()

dict_values([3, 30])

Or `items` to get both a key and a value pair as a tuple

In [89]:
n_animals.items()

dict_items([('n_frogs', 3), ('n_rats', 30)])
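
Two more things are useful to know in practice: how to look up a key that might be missing, and what can be used as a key. The sketch below uses the built-in `get` method and a few made-up dictionaries (`counts_by_site`, `animals_by_site`) to illustrate both.

```python
n_animals = {'n_frogs': 3, 'n_rats': 30}

# .get returns a default instead of raising KeyError for a missing key
n_bats = n_animals.get('n_bats', 0)

# Tuples of immutable things are hashable, so they can be keys
counts_by_site = {('frog', 'pond'): 3}

# Lists are mutable, so using one as a key raises TypeError
try:
    bad = {['frog', 'pond']: 3}
except TypeError as error:
    error_message = str(error)

# Values have no such restriction: a list is fine as a value
animals_by_site = {'pond': ['frog', 'newt']}

print(n_bats, counts_by_site, error_message, animals_by_site)
```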

### Sets

Finally, there is a data structure called sets. Sets are not ordered and cannot contain duplicate entries. The syntax for sets looks very similar to dictionaries so be forewarned.

In [90]:
{'a', 'b', 'a'}

{'a', 'b'}

Sets are not used that often but they can be useful for finding intersections and unions.

In [91]:
animals = {'cat', 'bear', 'dog', 'pig'}
annoying_animals = {'cat'}
fat_animals = {'panda'}

The union of the two sets (every element that appears in either set, with no duplicates)

In [92]:
animals | annoying_animals

{'bear', 'cat', 'dog', 'pig'}

In [93]:
animals | fat_animals

{'bear', 'cat', 'dog', 'panda', 'pig'}

The intersection of the two sets (everything shared between the sets)

In [94]:
animals & annoying_animals

{'cat'}

In [95]:
animals & fat_animals

set()
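
Sets support a couple of other operations in the same style, and the empty result above hints at a syntax trap: `{}` creates an empty dictionary, not an empty set. A brief sketch, with `tolerable_animals` and `only_in_one` as made-up names:

```python
animals = {'cat', 'bear', 'dog', 'pig'}
annoying_animals = {'cat'}

# Difference: elements of the first set that are not in the second
tolerable_animals = animals - annoying_animals

# Symmetric difference: elements in exactly one of the two sets
only_in_one = animals ^ {'cat', 'panda'}

# {} creates an empty dictionary, so use set() for an empty set
empty_dict_type = type({})
empty_set_type = type(set())

print(tolerable_animals, only_in_one, empty_dict_type, empty_set_type)
```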

## Control Flow (for, if, else)


### For Loops
One of the more common things you will need to do is to loop through variables. This is done with a `for` statement.

**Key thing: Indentation matters in python. You should use 4 spaces per indentation level.**

Here is how you can loop over a list

In [96]:
letters = ['a', 'b', 'c', 'd', 'e'] # list of letters

for letter in letters:
    print(letter)

a
b
c
d
e


If you do not indent you will get an error

In [148]:
for letter in letters:
print(letter)

IndentationError: expected an indented block (<ipython-input-148-f7b174667ece>, line 2)

Here is how you loop over a dictionary. Notice that looping over a dictionary only gives you the keys.

In [98]:
n_animals = {'n_frogs': 3, 'n_bats': 2}

for animal in n_animals:
    print(animal)

n_frogs
n_bats


You can also loop over keys and values. Notice that by looping over items, we get a key and value as a tuple

In [99]:
n_animals = {'n_frogs': 3, 'n_bats': 2}

for animal in n_animals.items():
    print(animal)

('n_frogs', 3)
('n_bats', 2)


We can assign each of these tuple elements to a variable:

In [100]:
n_animals = {'n_frogs': 3, 'n_bats': 2}

for animal, count in n_animals.items():
    print(animal)
    print(count)

n_frogs
3
n_bats
2


What if we have a tuple as the value? We can still assign each of its elements to its own variable.

In [101]:
n_animals = {'n_frogs': (1, 3),
             'n_bats': (5, 7)}

for animal, (count1, count2) in n_animals.items():
    print(animal)
    print(count1)
    print(count2)

n_frogs
1
3
n_bats
5
7


#### Comprehensions

There is special syntax for simple loops over lists and dictionaries that produce new lists and dictionaries, called `list comprehensions` and `dictionary comprehensions` respectively.

Comprehensions are important because they are often a more efficient and readable way of creating lists and dictionaries than writing out the loop.

This is a list comprehension:

In [102]:
a = [1, 2, 3]

[x**2 for x in a]

[1, 4, 9]

This is a dictionary comprehension

In [103]:
d = {'a': 1, 'b': 2}

{key: value**2 for key, value in d.items()}

{'a': 1, 'b': 4}
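
To see what a comprehension replaces, it helps to put it next to the equivalent loop. The sketch below also adds an `if` at the end of the comprehension to filter elements. The variable names are made up for this example.

```python
a = [1, 2, 3, 4]

# Loop version: build the list one element at a time
squares_loop = []
for x in a:
    if x % 2 == 0:
        squares_loop.append(x**2)

# Comprehension version: same result in one expression
squares_comprehension = [x**2 for x in a if x % 2 == 0]

print(squares_loop, squares_comprehension)
```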

#### Other important functions to use with loops

`range` returns numbers from start to stop-1 at a given step, just like slice. So the syntax is `range(start, stop, step)`

In [104]:
for num in range(0, 5, 2):
    print(num)

0
2
4


Like slice, you can omit start and step and by default start=0 and step=1:

In [105]:
for num in range(5):
    print(num)

0
1
2
3
4


`zip` is another extremely useful function. It allows you to iterate over two iterables. For example if you need to iterate over two lists simultaneously, zip will return both outputs as a tuple:

In [106]:
list1 = ['a', 'b', 'c']
list2 = [1, 2, 3]

for element1, element2 in zip(list1, list2):
    print(element1, element2)

a 1
b 2
c 3


**Be careful about the length of both lists because zip will iterate only up to the shortest list**

In [107]:
list1 = ['a', 'b', 'c', 'd']
list2 = [1, 2, 3]

for element1, element2 in zip(list1, list2):
    print(element1, element2)

a 1
b 2
c 3


In [108]:
len(list1)

4

In [109]:
len(list2)

3
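
If silently dropping the extra elements is not what you want, one option is `itertools.zip_longest` from the standard library. It keeps going until the longest input is exhausted and fills the gaps with a value you choose. A minimal sketch:

```python
import itertools

list1 = ['a', 'b', 'c', 'd']
list2 = [1, 2, 3]

# zip silently stops at the shortest input
zipped = list(zip(list1, list2))

# zip_longest continues and fills the missing values with fillvalue
zipped_longest = list(itertools.zip_longest(list1, list2, fillvalue=None))

print(zipped)
print(zipped_longest)
```

Another simple safeguard is to compare `len(list1)` and `len(list2)` before looping.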

Finally, the `itertools` module is a handy set of functions from the python standard library. You have to import it (tell python you're going to use this other set of code that's not in the basic python) to use it, but here is an example of how it can be useful. We use the `combinations` function to get all unique combinations of the elements of a list (ignoring order):

In [110]:
import itertools

a = [1, 2, 3]
n_combinations = 2

list(itertools.combinations(a, n_combinations))

[(1, 2), (1, 3), (2, 3)]

Notice that we had to use `list` to get the list of tuples returned by itertools. That's because the result is returned as an iterator, which only produces each item upon request in case there are very many of them. We can still use it in a loop like so:

In [111]:
for comb1, comb2 in itertools.combinations(a, n_combinations):
    print(comb1, comb2)

1 2
1 3
2 3


In [112]:
for comb1, comb2, comb3 in itertools.combinations(a, 3):
    print(comb1, comb2, comb3)

1 2 3
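
One consequence of getting items only upon request is that the object returned by `itertools.combinations` can be looped over only once. The sketch below shows this, with `pairs`, `first_pass`, and `second_pass` as made-up names:

```python
import itertools

pairs = itertools.combinations([1, 2, 3], 2)

first_pass = list(pairs)    # consumes all the items
second_pass = list(pairs)   # nothing is left to produce

print(first_pass)
print(second_pass)
```

If you need the combinations more than once, store them in a list first or call `itertools.combinations` again.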


### If/Else

Sometimes we want to do different things based on some condition. We can use an if-else statement to control what happens based on that condition. These statements also rely on indentation:

In [113]:
type_of_animal = 'dog'

if type_of_animal == 'cat':
    print('hiss')
else:
    print('bark')

bark


In [114]:
type_of_animal = 'cat'

if type_of_animal == 'cat':
    print('hiss')
else:
    print('bark')

hiss


We can use `elif` to add more conditions

In [115]:
type_of_animal = 'god'

if type_of_animal == 'cat':
    print('hiss')
elif type_of_animal == 'dog':
    print('bark')
else:
    print('oy')

oy


## Functions

Functions are important because they define short, reusable chunks of code.
+ As soon as you find yourself using a set of code more than once, you should make it into a function. Functions are good for writing good code because any error you fix in the function will automatically be fixed for any other use of the code. 
+ Functions are also good ways to make your code more understandable and readable because lots of lines of code are hard to understand. Well named, small functions (<60 lines) can break complex code down into smaller parts.
+ Functions are also an opportunity to document your code and tell your future self what is going on. Document your functions.
+ Give functions good names that say what the function is doing.
+ Don't rewrite functions that are already implemented by a well-known package unless you have to.

Here is the basic skeleton of a function. It also relies on indentation and colons for the format. Note that `pass` is a special python keyword that says nothing is happening:

In [116]:
def my_function():
    pass

When a function returns no output, it technically always returns a `None`

In [117]:
my_function()

In [118]:
my_function() is None

True

Here is a function that takes two variables and adds them. Notice that we use the `return` keyword to return an output of the function. Otherwise, `None` will be returned.

In [119]:
def my_adds_function(x, y):
    return x + y

In [120]:
my_adds_function(1, 2)

3

In [121]:
my_adds_function('a', 'b')

'ab'

We can also specify default values for inputs to the function using an equals sign. If the caller does not supply that input, the default value is used. Inputs with defaults are commonly known as keyword arguments.

In [122]:
def my_adds_function2(x=1, y=2):
    return x + y

In [123]:
my_adds_function2()

3

In [124]:
my_adds_function2(5, 9)

14

Inputs without a default are known as positional arguments. In the function definition, positional arguments must come before keyword arguments.

In [125]:
def my_adds_function3(w, x=1, y=2):
    return w + x + y

In [126]:
my_adds_function3(3)

6

In [147]:
def my_adds_function4(x=1, y=2, w):
    return w + x + y

SyntaxError: non-default argument follows default argument (<ipython-input-147-a54f4a0190cd>, line 1)
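
When calling a function, arguments can also be passed by name. This lets us override one default while leaving the others alone. A short sketch that repeats the `my_adds_function3` definition from above so that it runs on its own:

```python
def my_adds_function3(w, x=1, y=2):
    return w + x + y

by_position = my_adds_function3(3, 4, 5)        # w=3, x=4, y=5
skip_x = my_adds_function3(3, y=10)             # x keeps its default of 1
all_named = my_adds_function3(y=10, w=3, x=0)   # order does not matter when named

print(by_position, skip_x, all_named)
```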

We also should document our functions. We do this using something called a doc-string, which is a special string that goes at the beginning of the function. I prefer to use the numpy style to document my functions: https://numpydoc.readthedocs.io/en/latest/format.html

In [128]:
def my_adds_function3(w, x=1, y=2):
    """A short description of what the function does.
    
    Parameters
    ----------
    w : int
        A description of the variable can go here.
    x : int
    y : int
    
    Returns
    -------
    output : int
    """
    return w + x + y
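
The doc-string is stored on the function itself, so it can also be read outside of a notebook. As a small sketch, the `__doc__` attribute holds the text and `inspect.signature` from the standard library reports the inputs. The one-line doc-string here is a shortened stand-in for the one above.

```python
import inspect

def my_adds_function3(w, x=1, y=2):
    """Add three numbers together."""
    return w + x + y

docstring = my_adds_function3.__doc__
signature = str(inspect.signature(my_adds_function3))

print(docstring)
print(signature)
```

The built-in `help(my_adds_function3)` prints the same information in any python session.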

In jupyter notebooks we can use a `?` to look at the doc-string for a function

In [129]:
my_adds_function3?

Signature: my_adds_function3(w, x=1, y=2)
Docstring:
A short description of what the function does.

Parameters
----------
w : int
    A description of the variable can go here.
x : int
y : int

Returns
-------
output : int
File:      ~/Documents/GitHub/franklab_python_tutorial/notebooks/<ipython-input-128-c57d6ca564f9>
Type:      function


We can also use two question marks (`??`) to look at the corresponding code:

In [130]:
my_adds_function3??

Signature: my_adds_function3(w, x=1, y=2)
Source:   
def my_adds_function3(w, x=1, y=2):
    """A short description of what the function does.
    
    Parameters
    ----------
    w : int
        A description of the variable can go here.
    x : int
    y : int
    
    Returns
    -------
    output : int
    """
    return w + x + y
File:      ~/Documents/GitHub/franklab_python_tutorial/notebooks/<ipy

#### Debugging functions in jupyter notebooks

A handy feature in jupyter notebooks is that if a function breaks, we can go to the exact spot in the function where it broke with all the corresponding variables. We do this using the debug "magic" in jupyter notebooks. Magics in jupyter notebooks are special functions that are preceded with a `%`.

So for example, here is how we use the debug magic (`%debug`).
 + We can use `dir` to see what variables exist in the scope of the function.
 + We can use `q` to exit
 + We can use `u` and `d` to go up and down the call stack (aka if another function calls this function we can go to the calling function)
 + We can use `ll` to see the whole function

In [131]:
def broken():
    list1 = [1, 2, 3]
    list2 = [4, 5, 6]
    output_list = []
    
    for element1, element2 in zip(list1, list2):
        output_list.append(element1 + element2[0])
    
    return output_list

In [145]:
broken()

TypeError: 'int' object is not subscriptable

In [146]:
%debug

> <ipython-input-131-3bd5881d2136>(7)broken()
      5 
      6     for element1, element2 in zip(list1, list2):
----> 7         output_list.append(element1 + element2[0])
      8 
      9     return output_list

ipdb>  dir()
['element1', 'element2', 'list1', 'list2', 'output_list']

ipdb>  list1
[1, 2, 3]

ipdb>  list2
[4, 5, 6]

ipdb>  element1
1

ipdb>  element2
4

ipdb>  u
> <ipython-input-145-ba3e6ebe1246>(1)<module>()
----> 1 broken()

ipdb>  d
> <ipython-input-131-3bd5881d2136>(7)broken()
      5 
      6     for element1, element2 in zip(list1, list2):
----> 7         output_list.append(element1 + element2[0])
      8 
      9     return output_list

ipdb>  q
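
In the debugging session, `element2` turned out to be the integer 4, so indexing it with `[0]` is what raised the `TypeError`. Here is one possible fix, sketched under the assumption that the intent was to add the two lists element by element. The name `add_elementwise` is made up for this example.

```python
def add_elementwise(list1, list2):
    """Add two equal-length lists element by element."""
    output_list = []

    for element1, element2 in zip(list1, list2):
        # element2 is already a number, so no indexing is needed
        output_list.append(element1 + element2)

    return output_list


result = add_elementwise([1, 2, 3], [4, 5, 6])
print(result)
```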


## Modules, Packages and Scripts

Okay, you've written a bunch of code in a jupyter notebook. You've turned some of that code into functions because you use the same chunk of code repeatedly. You've documented the code inputs and outputs via the docstring and some handy comments in the function body itself. But several problems arise:
+ You have a lot of functions, some of which are related and some of which are not and it's hard to keep track of them
+ Your functions keep having errors because sometimes you run code out of order in jupyter notebooks.
+ You accidentally keep changing things when you come back to the code and you don't remember what you did before. Or you want to change back to something you did before.


The solution?
+ Use modules, packages and scripts
+ Use programs specifically made for coding (vscode, pycharm, atom, etc.).
+ Use version control (git)


This will make your code more readable, well organized, and less error prone.

**Jupyter notebooks do not work well with version control** because version control tracks changes in text files, and jupyter notebooks are stored as JSON files that contain a mixture of things the notebook needs (metadata, encoded images, etc.) in addition to your code.

Code editors are good because you can use things called **linters**, which will help format your code so that it is more readable and catch common mistakes. There is a guide for writing clean code in python called PEP 8: https://www.python.org/dev/peps/pep-0008/. Linters can help you conform to this standard, which ultimately will lead to more uniform, cleaner code. You want cleaner code so that future you can figure out what's going on with your code.

Finally, modules, packages and scripts help you organize the code in a sensible, reproducible way.

### Modules

A module is a text file that ends with `.py`. It usually contains functions but it can contain other things like variables (or code that gets executed when the module is imported). You use functions in a module by importing them using the keyword `import`. You have already seen an example of this. You can import modules from programs already installed in your conda environment or you can import modules that you have made locally (in your repository).

Before we import a file, let's use two jupyter notebook magics to make sure that any changes we make to the files will be reflected in the code here:

+ `%reload_ext autoreload`:  Loads the extension autoreload
+ `%autoreload 2`: Reload all modules (except those excluded by %aimport) every time before executing the Python code typed.

See more documentation here: https://ipython.org/ipython-doc/3/config/extensions/autoreload.html

In [134]:
%reload_ext autoreload 
%autoreload 2

Let's try importing a module from the `src` folder

In [135]:
import src.my_first_module

You can use the built-in `dir` function in python to inspect the contents of the module. Alternatively, you can use tab-completion (hitting tab after putting a dot) in jupyter lab to inspect the contents of the module.

In [136]:
dir(src.my_first_module)

['N_CATS',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'get_abs_path_of_cur_dir',
 'my_first_function',
 'my_second_function',
 'os']

We see a bunch of names with double underscores (also known as *dunders*), along with `N_CATS`, `get_abs_path_of_cur_dir`, `my_first_function`, `my_second_function`, and `os` (which shows up because the module itself imports it). The double underscore is a way to indicate that these are special things that you might not use that often, but that might be useful in some situations. By default, you won't see them when using tab completion. One that you will see used in scripts is the `__name__` global variable. This just gives the name of the module. More on this later.

In [137]:
src.my_first_module.__name__

'src.my_first_module'
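
Going back to the `dir` listing: if the dunder names get in the way, one option is to filter them out. The sketch below uses the standard-library `itertools` module as a stand-in for `src.my_first_module` so that it runs anywhere; the variable name `public_names` is made up for this example.

```python
import itertools

# Keep only the names that do not start with an underscore.
public_names = [name for name in dir(itertools) if not name.startswith("_")]

print(public_names)
```

This is roughly the set of names that tab completion shows you by default.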

We can also look at one of the functions in the module:

In [138]:
src.my_first_module.my_first_function?

Signature: src.my_first_module.my_first_function()
Docstring: Prints Success
File:      ~/Documents/GitHub/franklab_python_tutorial/src/my_first_module.py
Type:      function


And run the function

In [139]:
src.my_first_module.my_first_function()

Success


We can also rename the imported module using `as`. This can cut down on the amount of typing you have to do, but has the downside of obscuring where things are coming from if your new name is ambiguous.

Some packages/modules have standard names they are renamed to. For example, `numpy` is usually imported as `np`.

In [140]:
import src.my_first_module as my_first_module

my_first_module.my_first_function()

Success


We can also import the function for use using the `from` keyword

In [141]:
from src.my_first_module import my_first_function

my_first_function()

Success


Lastly, we can import every public name in a module (everything that doesn't start with an underscore, unless the module defines `__all__`) using `*`.

**Note that this is bad practice because you can unknowingly import a bunch of things you don't know the name of, some of which may conflict with other things you have defined.**

A common mantra in Python is that explicit is better than implicit.

In [142]:
from src.my_first_module import *

We imported `my_second_function` and we didn't know it

In [143]:
my_second_function()

3
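
To see how such a conflict can arise, consider two standard-library modules, `math` and `cmath`, which both define a function named `sqrt`. The sketch below is not part of the tutorial repository; it imports the two modules explicitly and lists the public names they share. With `from math import *` followed by `from cmath import *`, each of those shared names would silently end up pointing at the `cmath` version.

```python
import cmath
import math

# Explicit imports keep the two versions of `sqrt` apart.
real_root = math.sqrt(4)       # a float
complex_root = cmath.sqrt(4)   # a complex number

# Public names defined by both modules: a star import of both would
# leave only whichever version was imported last.
shared_names = sorted(
    name
    for name in set(dir(math)) & set(dir(cmath))
    if not name.startswith("_")
)

print(type(real_root).__name__, type(complex_root).__name__)
print(shared_names)
```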


Other things:
+ You can also import functions from other modules into modules (see `my_first_module.py`).
+ There is a standard set of modules in Python known as the standard library. They are full of useful functions. `itertools`, which we imported before, is one of them.

### Scripts

What is the difference between a module and a script?

Functionally, nothing except for the intended use.

A script is still just a text file with the `.py` suffix.

A module is a collection of functions (and/or variables) intended for use elsewhere.

A script is code intended to be run to accomplish a task. For example, if we want to plot some data, we might write a script that loads the data, processes it, and plots it.

You often execute scripts in the command line by using the following syntax:
```bash
python my_first_script.py
```

You can also do the following in a jupyter notebook (although it is a little weird).

In [144]:
%run ../scripts/my_first_script.py

Loading data...
Processing data...
Plotting and saving figure...
Done!
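
This is where `__name__` comes in. When a file is run as a script, Python sets `__name__` to `"__main__"`; when the same file is imported, `__name__` is the module name instead. A common pattern is to wrap the script's steps in functions and call them under an `if __name__ == "__main__":` guard. The sketch below is a hypothetical layout, not the actual contents of `my_first_script.py`: the function names and the toy data are made up, and the plotting step is replaced by a `print` so that the example has no dependencies.

```python
def load_data():
    """Stand-in for reading data from disk (hypothetical toy data)."""
    print("Loading data...")
    return [1, 2, 3, 4]


def process_data(data):
    """Stand-in for real processing: double every value."""
    print("Processing data...")
    return [2 * value for value in data]


def main():
    data = load_data()
    processed = process_data(data)
    # A real script would plot and save a figure here.
    print(f"Result: {processed}")
    return processed


# Runs only when the file is executed as a script
# (e.g. `python this_file.py`), not when it is imported.
if __name__ == "__main__":
    main()
```

With this layout the same file can be used both ways: run it to do the whole job, or import it to reuse `load_data` and `process_data` elsewhere.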


### Packages

Packages are just collections of modules: folders with an `__init__.py` file and modules in them. You can have subpackages within packages if you have a lot of code to organize. Packages are less important to understand until your code grows complicated enough to need them.
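
To make the folder layout concrete, here is a small sketch that builds a throwaway package in a temporary directory and imports a module from it. The names `demo_package`, `greetings`, and `say_hello` are made up for this example; in a real project you would create these files by hand in your repository rather than from code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Lay out:  tmp_dir/demo_package/__init__.py
    #           tmp_dir/demo_package/greetings.py
    package_dir = Path(tmp_dir) / "demo_package"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "greetings.py").write_text(
        "def say_hello(name):\n"
        "    return 'Hello, ' + name\n"
    )

    # Make the temporary folder importable, then import the module.
    sys.path.insert(0, tmp_dir)
    importlib.invalidate_caches()
    try:
        greetings = importlib.import_module("demo_package.greetings")
        message = greetings.say_hello("Frank lab")
    finally:
        sys.path.remove(tmp_dir)

print(greetings.__name__)
print(message)
```

The dotted name `demo_package.greetings` mirrors the folder structure, just as `src.my_first_module` does in this repository.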

### Classes

Classes are actually quite similar to modules. They are objects that organize a set of functions. The difference is that they typically have data attached to them that you can manipulate. If this doesn't make sense, don't worry about it.

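Here is how you define a class with a method. Note that the PEP 8 convention for class names is CapWords (for example, `MyFirstClass`); the class below starts with a lowercase letter, which works but departs from that convention.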

In [1]:
class myFirstClass():
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def my_first_function(self, z):
        return self.x + self.y + z

Notice that classes have an `__init__` method (a method is a function in a class), which you use to pass in data that the class needs to store.

Also notice that methods take `self` as their first argument. This is how a method gets access to the data and other methods of the instance it is called on.

To use a class you need to instantiate it (create an instance, usually assigned to a variable) like so:

In [3]:
c = myFirstClass(x=1, y=2)
c

<__main__.myFirstClass at 0x7ff23435db50>

Notice that the function `my_first_function` depends on the data in the class

In [4]:
c.my_first_function(3)

6
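
To make the "data attached to them" point concrete, the sketch below repeats the class (named `MyFirstClass` here, following the PEP 8 CapWords convention) and creates two instances. Each instance carries its own `x` and `y`, and changing that data changes what the method returns.

```python
class MyFirstClass:
    def __init__(self, x, y):
        # Store the data on the instance.
        self.x = x
        self.y = y

    def my_first_function(self, z):
        # Uses the data stored on this particular instance.
        return self.x + self.y + z


a = MyFirstClass(x=1, y=2)
b = MyFirstClass(x=10, y=20)

print(a.my_first_function(3))  # 6
print(b.my_first_function(3))  # 33

# Changing the data on one instance affects only that instance.
a.x = 100
print(a.my_first_function(3))  # 105
print(b.my_first_function(3))  # 33
```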