# Lecture 1.1 - Variables
## Introduction to Python

### Why Python?
- Relatively easy to learn.
- Powerful - can do anything (websites, deep learning, ...).
- You write code and can immediately run it - no intermediate "compilation" steps required. 
- Big Community -> a lot of problems have been solved before.

### Core concepts
- accessing/storing data: variables (numbers, text, True/False) and lists (sequences of variables like time series)
- manipulating data: arithmetic (+, - etc), comparison (>, <, ==)
- control logic: _for_ loops, _if_ statements
- use external libraries and functions for plotting etc 
- write your own functions
- algorithmic thinking - dividing a problem (and its solution) into small programming steps
- how to help yourself (searching for error messages, reading documentation)

## The problem - spike detection
We want to detect the spikes in this voltage trace.

A spike is a peak in the voltage trace exceeding a threshold value. 

More specifically, we are interested in when the spikes occur - the spike times.

![](voltage_trace.png)

#### How would you solve this?
Let's break this down into a sequence of simple steps, like a recipe:

1. weigh 100 grams of flour
2. take 3 eggs
3. mix eggs and flour in a bowl
4. beat it!
5. ...

Now try to explain to your grandma or a 5-year-old how you would extract the time of each spike:

1. 
2.
3.

### What do we need to know to detect spikes using python?
- [ ] __Present data in code (individual voltage values, manipulate them and store the results) - variables__
- [ ] Compare variables (voltage to threshold) - boolean values
- [ ] Perform different actions based on the value of a variable (only keep the position if the voltage exceeds the threshold) - if-else statements
- [ ] Present and access data in a time series of voltage values - lists
- [ ] Perform an action for each element in a sequence of values (inspect voltage values one-by-one) - for loops
- [ ] Separate data and logic so we can use the same code for new recordings - functions
- [ ] Apply this to multiple data files
- [ ] Plot and save the results

## Presenting data in code

### Why have variables?
You can work directly with numbers and use python as a calculator:

In [5]:
1 + 7

8

Try it yourself by clicking into the cell above and changing the numbers.

You can execute the code using Control+Enter.

### Variables abstract away specific data "types"/"roles" from the specific data values
Working with numeric values directly isn't very general, since you tie your code to specific numeric values.

Variables are a way to make your code more general (and thereby more useful), by separating specific data values from the general computation. You can think of the variable as a storage container: it can store information that you can access via its name and manipulate in your program.

A variable is created with the syntax `name = value`.

- `name` can be any combination of letters, digits and underscores (as long as it does not start with a digit)
- `value` can be a number, text, or any other "Python object".

That way, you can express your program's computation in general terms - as the manipulation of variables.

> What is a Variable? In computer programming, a variable has a name and contains a value. A variable is like a box. If you labeled the box as `toys` and put a yo-yo inside it, in programming terms, `toys` is the variable name, and yo-yo is the value.

### Baking a cake
Take the ingredients in a cake recipe as an analogy: A recipe typically contains eggs as an ingredient and contains an instruction like _"Add eggs"_. But what if we want to bake a vegan cake?

Eggs play a specific role in the recipe - for instance, they enrich the flavor and add moisture. We can make a general recipe by replacing the specific ingredient, "eggs", in a recipe with its role, "flavor": _"Add flavor"_.

In this example the variable name is `flavor`. In a recipe for an old-fashioned cake, the variable `flavor` has the value `"eggs"`:
`flavor = "eggs"`. When we execute the instructions, we replace the variable name `flavor` with its value `"eggs"` and the instruction becomes: _"Add eggs"_.

But the recipe is now much more general and we can easily make a vegan cake by replacing a specific ingredient: By changing the value of the `flavor` variable to `"banana mash"`, `flavor = "banana mash"`, the instruction now becomes _"Add banana mash"_.
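
The recipe analogy can be written down as a small sketch. It assumes that `+` can be used to join two pieces of text, something not covered in this lecture; the names `instruction` and `vegan_instruction` are made up for illustration.

```python
# The instruction is written once, in terms of the variable `flavor`.
flavor = "eggs"  # old-fashioned cake
instruction = "Add " + flavor  # `+` joins two pieces of text
print(instruction)  # Add eggs

# Changing only the value of `flavor` gives the vegan version.
flavor = "banana mash"
vegan_instruction = "Add " + flavor
print(vegan_instruction)  # Add banana mash
```

The line that builds the instruction is identical in both cases - only the value stored in `flavor` changes.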

### Computing firing rates
Back to neuroscience: Say we want to compute the firing rate of a neuron. We have counted the number of spikes (132) our neuron fired during a recording of 16 seconds. The firing rate - the number of spikes per second (1/s=Hz) - is given by the ratio of both:

In [7]:
132 / 16

8.25

But this is 

1. not very general and 
2. not very readable. 

First, we directly tie the computation - calculating the ratio - to the data values. If we want to compute the firing rate for another neuron, we have to meddle with the code. This becomes more relevant as the computation becomes more complicated.

Second, it's unclear what `132 / 16` means without knowing the context of the code - these numbers could mean anything.

Variables with informative names provide context - it's clear from looking at the code what is being computed!

A variable is created and assigned a value, using the `=` character, like in math:
`x = 10`. Here, `x` is the name of the variable and `10` is its value.

We can assign the number of spikes and the duration of experiments to two variables, `n_spikes` and `duration` and compute the firing rate as their ratio:

In [3]:
n_spikes = 132
duration = 16  # seconds
# compute the firing rate
firing_rate = n_spikes / duration  # 1/seconds=Hz
firing_rate

8.25

__Note 1:__ The `#` character allows you to comment your code - everything following `#` will be treated as a comment and not as code that is executed by python. Comments allow us to add extra information to the code.

__Note 2:__ The value of the last line in a cell, in this case `firing_rate`, is printed below the cell automatically.

By changing the value of the variables, `n_spikes` and `duration`, you can run the same computation, `n_spikes / duration`, on different data (for instance, different trials from the same experiment). This is useful if the computation is more complicated.
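
As a sketch of this idea, the same computation can be run for two trials. The numbers for the second trial are invented for illustration, as are the names `firing_rate_trial_1` and `firing_rate_trial_2`.

```python
# Trial 1: the values from the example above
n_spikes = 132
duration = 16  # seconds
firing_rate_trial_1 = n_spikes / duration  # Hz

# Trial 2: hypothetical values - only the data changes, not the computation
n_spikes = 45
duration = 10  # seconds
firing_rate_trial_2 = n_spikes / duration  # Hz

print(firing_rate_trial_1, firing_rate_trial_2)  # 8.25 4.5
```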

Whenever you type a variable name in code it will be replaced by the value (in this case, the number) it is referring to.

Above, the expression `firing_rate = n_spikes / duration` is evaluated by Python as follows:
1. The value of the variable `n_spikes` is looked up as `132`
2. `n_spikes` in `n_spikes / duration` is replaced by `132`. The expression is now `132 / duration`
3. The value of the variable `duration` is looked up as `16`
4. `duration` in `132 / duration` is replaced by `16`. The expression is now `132 / 16`
5. The computation is performed and the result, `8.25`, is assigned to the variable `firing_rate`.

### Dealing with text
How about data that is not numeric, like names?

Simply assigning letters to a variable does not work - this will throw an error:

In [4]:
name_of_the_whale = Keiko

NameError: name 'Keiko' is not defined

This is because letters and words are interpreted as variable names. 
In the above example, Python assumes that `Keiko` is a variable name and tries to look up its value, but a variable with the name `Keiko` does not exist! Hence the `NameError`.

To tell python that the value of your variable is text, we wrap it in `'...'` or `"..."`.
Variables with text as values are called _strings_.

In [11]:
name_of_the_whale = 'Keiko'
speed = "Mach 2"
print('speed=', speed)
year = "1968"
year

speed= Mach 2


'1968'
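
A few more things about quotes, as a minimal sketch. It uses `==` (comparison) and `type`, which are only introduced later; the names `name_single`, `name_double` and `sentence` are made up for illustration.

```python
# single and double quotes create the same kind of value
name_single = 'Keiko'
name_double = "Keiko"
print(name_single == name_double)  # True

# wrapping text in double quotes lets the text itself contain a single quote
sentence = "It's Keiko"
print(sentence)  # It's Keiko

# digits inside quotes are text, not a number
year = "1968"
print(type(year))  # <class 'str'>
```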

__Note: Errors__ When you write "invalid" code, python does not make your computer crash, but generates an error with a (sometimes cryptic) description of the cause of the error. Errors are also called _exceptions_. We will learn later how to deal with - and raise our own - exceptions.

#### Note: Functions

To print the value of variables in the notebook, we relied on jupyter's functionality of printing the result of the last line in a cell.
To print the value of variables whenever you want, you can use `print`, which is something new: a _function_. 
More about functions (and how to write them yourself) later - they are a great way of hiding away complexity. 

For now, you only need to know this:
- The general form of a function is `output = function_name(input)`. `print` does not produce a useful output (it returns the special value `None`); all it does is print its input.
- In python, the input is called the _argument_, the output is called the _return value_.
- Like many functions, `print` accepts multiple arguments. Multiple arguments are separated by commas: `print(arg1, arg2)`.
- Using a function is also termed "to call a function". Functions are therefore said to be _callable_; regular variables such as numbers are not.

So, to print the value of a variable: `print(variable_name)`.

To print the value of multiple variables: `print(variable_name1, variable_name2)`.

In [12]:
# create a new variable `c` and give it a particular value (aka assign a value to `c`)
c = 1  
print('the value of c is', c)  # `c` will be replaced with its value
print(c)  # for the lazy...

#  create a new variable `d` and assign to it the outcome of `c + 5`
d = c + 5
print('the value of d is', d)

# you can also re-assign values to existing variables - this allows you to update a variable during computation
print('the value of c before reassignment is', c)
c = c + 8
print('the value of c after re-assignment is', c)

the value of c is 1
1
the value of d is 6
the value of c before reassignment is 1
the value of c after re-assignment is 9


You can get help about how to use a function using `help(function_name)`. The result may be very technical. Google is your friend!

In [None]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
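
The optional `sep` and `end` arguments listed in the help text can be tried out directly. This is a small sketch that reuses the variables `n_spikes` and `duration` from the firing rate example:

```python
n_spikes = 132
duration = 16

# default: the values are separated by a single space
print('n_spikes', n_spikes)            # n_spikes 132

# `sep` changes what is inserted between the values
print('n_spikes', n_spikes, sep='=')   # n_spikes=132

# `end` changes what is appended after the last value (default: a newline)
print('n_spikes', n_spikes, end='; ')
print('duration', duration)            # n_spikes 132; duration 16
```

Arguments that are passed by name, like `sep='='`, are called keyword arguments.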



### Types of variables

Variables come in different flavors - so far, we have encountered two kinds of variables: numbers and text.

Kinds of variables are called _types_. These are the most important ones:

- boolean variable: `bool`- only two values: `True` or `False`
- integer numbers: `int` - number without a decimal point (1, 2, 103241)
- floating point numbers: `float` - number with a decimal point (3.141)
- (complex numbers - not covered here: `complex`)
- string: `str`: sequence of characters ("yes", 'no')

### Working with numerical variables - arithmetic on numbers
| Symbol | Task Performed |
|----|---|
| +  | Addition |
| -  | Subtraction |
| *  | Multiplication |
| /  | Division |
| **  | To the power of |
| //  | Floor division |
| %  | Modulo (remainder of division) |
In [14]:
a = 16
b = 3

a / b


5.333333333333333

In [None]:
# Examples
a = 16
b = 3

print(f'Addition: {a} + {b} = {a+b}')
print(f'Subtraction: {a} - {b} = {a-b}')
print(f'Multiplication: {a} * {b} = {a*b}')
print(f'Division: {a} / {b} = {a/b}')
print(f'Power: {a} ** {b} = {a**b}')
print('Plus some weird stuff that we will ignore for now:')
print(f'Floor Division: {a} // {b} = {a//b} # Return the integer part of the quotient')
print(f'Modulo: {a} % {b} = {a%b}  # The modulo or mod operation returns the remainder of division {a} % {b} = {a} - ({b} * {int(a/b)})')

print('Floor division vs. modulo:')
print('Floor division', 10 // 3, 9 // 3, 8 // 3)
print('Modulo', 10 % 3, 9 % 3, 8 % 3)

Addition: 16 + 3 = 19
Subtraction: 16 - 3 = 13
Multiplication: 16 * 3 = 48
Division: 16 / 3 = 5.333333333333333
Power: 16 ** 3 = 4096
Plus some weird stuff that we will ignore for now:
Floor Division: 16 // 3 = 5 # Return the integer part of the quotient
Modulo: 16 % 3 = 1  # The modulo or mod operation returns the remainder of division 16 % 3 = 16 - (3 * 5)
Floor division vs. modulo:
Floor division 3 3 2
Modulo 1 0 2
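
Floor division and modulo are less weird than they look. A typical use, sketched here with a made-up duration (`total_seconds` is a hypothetical value), is splitting a number of seconds into minutes and remaining seconds:

```python
total_seconds = 200  # hypothetical duration of a recording

minutes = total_seconds // 60  # whole minutes
seconds = total_seconds % 60   # seconds left over
print(minutes, 'min', seconds, 's')  # 3 min 20 s

# the two results can always be recombined to the original number
print(minutes * 60 + seconds == total_seconds)  # True

# the built-in function `divmod` returns both results at once
print(divmod(total_seconds, 60))  # (3, 20)
```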


### Determining the type of variables
Wrong variable types can be a source of errors in your code - it is therefore often useful to determine the type of a variable.

In [17]:
a = 15.6
b = '12.4'
a + b

TypeError: unsupported operand type(s) for +: 'float' and 'str'

The `type` function returns the type of a variable.

Let's create four variables and print their types:

In [3]:
correct = True
a = 1
b = 1.0
c = '1.0'

type(correct), type(a), type(b), type(c)

(bool, int, float, str)

You can change the type of a variable - _cast_ its type - by using `bool`, `int`, `float`,  `str` as functions, with the variable you want to cast as the argument:

In [5]:
a_as_str = '1'
a_cast_to_float = float(a_as_str)  # cast from a string to a floating point number
type(a_as_str), type(a_cast_to_float)

(str, float)
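
Casting is also one way to fix the `TypeError` from the `a + b` example above. The sketch below shows both directions; which one you want depends on whether you mean to add numbers or to join text. The names `total` and `joined` are made up for illustration.

```python
a = 15.6
b = '12.4'  # a number stored as text

# a + b raises a TypeError; casting `b` to a float first makes the sum work
total = a + float(b)
print(total)

# casting `a` to a string instead joins the two pieces of text
joined = str(a) + b
print(joined)  # 15.612.4
```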

You can turn a boolean or a number into text using `str`:

In [6]:
str(True), str(10)

('True', '10')

You can turn a string into a number using `int` or `float`. This works only if the string contains nothing but a valid number:

In [7]:
age_as_str = '10'
age_as_number = int(age_as_str)
print(age_as_str, age_as_number)
print(type(age_as_str), type(age_as_number))

10 10
<class 'str'> <class 'int'>


Careful - casting sometimes works in unexpected ways.

For instance, any non-zero number and any non-empty string is converted to `True` in Python - even the string `'False'`.

In [8]:
# bool(0), bool(1.6), bool(10), bool(-1), bool('True'), bool('False'), bool('')
print(f"{bool(0)=}, {bool('')=}, {bool(1.6)=}, {bool(10)=}, {bool(-1)=}, {bool('True')=}, {bool('False')=}")

bool(0)=False, bool('')=False, bool(1.6)=True, bool(10)=True, bool(-1)=True, bool('True')=True, bool('False')=True


__Note:__ To format the above I used a python construct called f-strings. More on f-strings later.
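
Since `bool('False')` is `True`, turning text like `'True'` or `'False'` into a boolean needs a different approach. One possible sketch, assuming the text is spelled `true` or `false` in some capitalization, compares the text against `'true'`. It uses `==` and the string methods `strip` and `lower`, which have not been introduced yet; `answer` and `is_true` are hypothetical names.

```python
answer = 'False'  # text, for instance typed in by a user

# bool() only checks whether the string is empty
print(bool(answer))  # True

# remove surrounding whitespace, convert to lower case, compare with 'true'
is_true = answer.strip().lower() == 'true'
print(is_true)  # False
```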

Casting does not work for all values of variables. For instance, if the string contains non-numeric characters, such as letters or even just a blank space between the digits, Python produces a `ValueError` because the value of the argument is not "right":

In [13]:
age_as_str = '2years'
age_as_number = int(age_as_str)

ValueError: invalid literal for int() with base 10: '2years'

In [14]:
age_as_str = '2 0'
age_as_number = int(age_as_str)

ValueError: invalid literal for int() with base 10: '2 0'
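
Some related cases do work, or can be made to work by cleaning up the string first. The following is a sketch with made-up variable names; `replace` is a string method not covered here.

```python
# whitespace before and after the number is ignored
with_spaces = int(' 20 ')
print(with_spaces)  # 20

# whitespace between the digits has to be removed first
cleaned = int('2 0'.replace(' ', ''))
print(cleaned)  # 20

# int('2.5') raises a ValueError, but going via float works
# (the part after the decimal point is cut off)
truncated = int(float('2.5'))
print(truncated)  # 2
```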

### Naming variables
You can name a variable *almost* anything you want.

It can only start with a letter or "\_", and it can contain alphanumeric characters (letters or numbers) plus underscores ("\_"):

In [15]:
# Valid names
cool_variable = 10
_cool_variable = 10
password_123 = 'bad'

Variable names must not start with a number or contain special characters.

In [18]:
# Invalid names
10th_number = 3.34

SyntaxError: invalid decimal literal (3019652057.py, line 2)

In [19]:
a+b = 27

SyntaxError: cannot assign to expression here. Maybe you meant '==' instead of '='? (2438563722.py, line 1)

In addition, certain keywords must not be used as variable names since they are reserved for the language (no need to remember them - just keep this in mind so you know how to interpret the resulting error): `False`, `None`, `True`, `and`, `as`, `assert`, `async`, `await`, `break`, `class`, `continue`, `def`, `del`,
`elif`, `else`, `except`, `finally`, `for`, `from`, `global`, `if`, `import`, `in`, `is`, `lambda`, `nonlocal`, `not`, `or`, `pass`,
`raise`, `return`, `try`, `while`, `with`, `yield`

In [22]:
# Example
and = 3

SyntaxError: invalid syntax (2687088324.py, line 2)
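
If you are unsure whether a name is reserved, you can ask Python. This short sketch uses the `keyword` module from the standard library (importing modules is covered later):

```python
import keyword

print(keyword.iskeyword('and'))       # True - reserved, cannot be a variable name
print(keyword.iskeyword('duration'))  # False - fine as a variable name
print(keyword.iskeyword('print'))     # False - in Python 3, `print` is a function, not a keyword

# the full list of reserved words for the Python version you are running
print(keyword.kwlist)
```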

__Caution:__ Python lets you use the names of built-in functions as variable names. However, the function will then no longer be available under that name.

In [1]:
# `sorted` is a function for sorting a list of variables
# we don't need it today so we can mess it up to illustrate the point
print(sorted('acb'))
print(type(sorted))  # this will return "builtin_function_or_method"

['a', 'b', 'c']
<class 'builtin_function_or_method'>


In [3]:
sorted = 10  # create a variable with the same name as the function `sorted` and assign the value 10
print(type(sorted))  # now sorted is not a function anymore, but an integer
sorted('this will fail')  # this fails because you can't "call" an integer, only a function

<class 'int'>


TypeError: 'int' object is not callable
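
If this happens by accident, the original function is not lost. One way to get it back, sketched here, is to fetch it from the `builtins` module. In a notebook, deleting your variable with `del sorted` or restarting the kernel should also work.

```python
import builtins

sorted = 10              # the name `sorted` now refers to an integer
print(callable(sorted))  # False

sorted = builtins.sorted  # point the name back at the original function
print(sorted('acb'))      # ['a', 'b', 'c']
```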

#### Names should be short (but not too short) and descriptive
- `a` is very short but meaningless - it is unclear what type of information `a` stores.
- `concentration` or `subject_name` is a bit longer but its meaning is evident from the name.
- `name_of_the_first_subject_on_monday_morning` is too long.

In [4]:
a = 1  # could be anything
concentration = 0.1  # mM
subject_name = "Mabel"  # good descriptive variable name
name_of_the_first_subject_on_monday_morning = "Thelonious"  # too long

### Re-using and updating variables

Alice and Bob count mosquitoes. Alice, to store her count, creates a variable with name `n_alice_mosquitoes` and value `1` (she has counted only one mosquito so far).

In [6]:
# create a new variable `n_alice_mosquitoes` and 
# give it a particular value (aka assign the value 1 to `n_alice_mosquitoes`)
n_alice_mosquitoes = 1
n_alice_mosquitoes

1

Now Alice can perform a computation using `n_alice_mosquitoes`.
For instance, Bob has counted 10 more mosquitoes - so we add `10` to her own count and store the result in a new variable, `n_mosquitoes`, for the total count:

In [None]:
n_mosquitoes = n_alice_mosquitoes + 10
n_mosquitoes

11

What is going on here?

In the above expression, `n_alice_mosquitoes + 10`, Python sees the variable name, `n_alice_mosquitoes`, and looks up its value, `1`.

__Note:__ `n_alice_mosquitoes` was created and assigned a value in the previous cell. Python has a single "work space" for all cells in a notebook - all variables are available and can be modified in all cells. What matters is the order in which the cells were executed, not their position in the notebook!

Python then substitutes the value `1` for the name `n_alice_mosquitoes` and performs the computation (it evaluates the expression).

`n_alice_mosquitoes + 10` is evaluated as `1 + 10`. 

`n_mosquitoes = n_alice_mosquitoes + 10` means that the result of the operation, the value `11`, is saved in a new variable, with name `n_mosquitoes`.

Note that above we exploited that what you do in one cell transfers to the next:
- we defined `n_alice_mosquitoes` in one cell
- and used it in a computation in the next cell

We cannot use a variable we have not defined before:

In [7]:
n_mosquitoes = n_tim_mosquitoes + 10

NameError: name 'n_tim_mosquitoes' is not defined

The fact that we can re-use variables across cells allows us to successively build up our analysis code, cell by cell.

Suddenly, Trudy comes and claims that she found 3 more mosquitoes!!
We update the total count with this value and should now have a total count of 14: 1 from Alice, 10 from Bob, and 3 from Trudy.

But the code produces a count of 4. What's wrong with this code??

In [9]:
n_mosquitoes = n_alice_mosquitoes + 3  # add Trudy's mosquitoes 
n_mosquitoes

4
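
One possible fix, in case you want to check your answer: the code above starts again from Alice's count instead of the running total. Updating `n_mosquitoes` itself, in the same way `c = c + 8` updated `c` earlier, gives the expected result.

```python
n_alice_mosquitoes = 1
n_mosquitoes = n_alice_mosquitoes + 10  # Alice + Bob: 11

# add Trudy's mosquitoes to the running total, not to Alice's count
n_mosquitoes = n_mosquitoes + 3
print(n_mosquitoes)  # 14
```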

### What do we need to know to detect spikes using python?

![](voltage_trace.png)

- [x] __Present data in code (individual voltage values, manipulate them and store the results) - variables__
- [ ] Compare variables (voltage to threshold) - boolean values
- [ ] Perform different actions based on the value of a variable (only keep the position if the voltage exceeds the threshold) - if-else statements
- [ ] Present and access data in a time series of voltage values - lists
- [ ] Perform an action for each element in a sequence of values (inspect voltage values one-by-one) - for loops
- [ ] Separate data and logic so we can use the same code for new recordings - functions
- [ ] Apply this to multiple data files
- [ ] Plot and save the results
