# What `Python` is...

- Interpreted
- High-level
- (Relatively) easy to learn
- Very popular (web, data science and more) -> a lot of resources, tutorials, etc
<img src="img/python-usage.svg">

...and what it is not:

- NOT the most efficient language (speed/memory), but it is pretty good if you do things the right way
- NOT always straightforward (it can behave in somewhat unpredictable ways and may silently coerce values instead of raising an error; example: `numpy` broadcasting)
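
As a small illustration of the last point, here is a sketch in plain Python (not `numpy`) of operations that a newcomer might expect to fail but that quietly succeed. The variable names are made up for this example.

```python
# multiplying a string by an integer repeats it instead of raising an error
repeated = "3" * 2          # '33', not 6

# booleans are a kind of integer, so arithmetic on them is allowed
total = True + True         # 2

# multiplying a list by an integer repeats the list rather than scaling its elements
doubled = [1, 2, 3] * 2     # [1, 2, 3, 1, 2, 3]

print(repeated, total, doubled)
```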



# Basics overview

## Notebook's cell output

As you may have noticed, when you run a cell it displays the output of the code inside it. However, it displays only the output of the **last** line. If the last line doesn't produce an output, nothing is displayed. Compare the following two cells:

In [1]:
x = 10
x

10

In [2]:
x = 10
x
y = 9

If you want to display something else, you need to say so explicitly with the `print` function. Note that in this case the message is not an "output" *per se* (you can tell because there is no `Out[]:` label on the left); it is just printed. You can `print` as many things as you like, but a cell shows the output of only one expression.

In [3]:
x = 10

# this value is just printed
print(x)

y = 5

# this is going to be displayed as an output
y

10


5
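
The difference between printing and producing a value can also be seen outside the notebook. A minimal sketch (the names `shown` and `value` are only for illustration): `print` writes text to the screen but itself returns `None`, whereas an expression evaluates to a value that can be stored.

```python
x = 10

# print() displays the value but returns None
shown = print(x)

# an expression evaluates to a value, which can be assigned to a variable
value = x + 1

print(shown is None)  # True
print(value)          # 11
```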

**Pro-tip**: When you're not sure what a function outputs, just put the call in a separate cell and run it. If it has an output, it will show up as the output of the cell.

## Container data types

### Lists

In [30]:
X = [0,5,10,15,20]
X

[0, 5, 10, 15, 20]

In [32]:
X[2]

10

In [34]:
X[3] = -99
X

[0, 5, 10, -99, 20]

In [49]:
X[-1]

20

In [50]:
X[1:4]

[5, 10, -99]

#### A note on assignment

Variables allow you to store values. Or do they? In fact, a variable is just a pointer (a *reference*) to an object in memory. Here is an example that can be confusing to a novice:

In [52]:
X = [0,5,10,15,20]
Y = X
Y[1] = -999

print(X)
print(Y)

[0, -999, 10, 15, 20]
[0, -999, 10, 15, 20]


What happens here? We created a list in memory, and variable `X` points to that object. Variable `Y` is assigned the same value as `X`, but the list is not copied; rather, `Y` merely points to the same object. We modify the list through `Y` and discover that the change is also visible through `X`, because both names still point to the same object.

If you want to avoid this, use the explicit `.copy()` method on the list:

In [5]:
X = [0,5,10,15,20]
Y = X.copy()
Y[1] = -999
print(X)
print(Y)

[0, 5, 10, 15, 20]
[0, -999, 10, 15, 20]


This is an example of Python giving you more control over memory and references. In Matlab and R, assignment behaves as if the object were copied, which can lead to serious memory replication (a lot of memory spent on copies of the same object). Control is good, but you need to be aware of this behavior.
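
To check whether two names point to the same object, you can use the `is` operator. The sketch below also shows one caveat that is worth knowing: `.copy()` is a *shallow* copy, so lists nested inside the list are still shared. The names `Z`, `nested`, `shallow` and `deep` are made up for this example; `copy.deepcopy` comes from the standard library.

```python
import copy

X = [0, 5, 10, 15, 20]
Y = X            # second name for the same list
Z = X.copy()     # new list with the same elements

print(Y is X)    # True: both names refer to one object
print(Z is X)    # False: Z is a separate list
print(Z == X)    # True: the contents are equal

# .copy() is shallow: nested lists are still shared
nested = [[1, 2], [3, 4]]
shallow = nested.copy()
deep = copy.deepcopy(nested)

shallow[0][0] = -999
print(nested)    # [[-999, 2], [3, 4]]  (the inner list was shared)
print(deep)      # [[1, 2], [3, 4]]     (fully independent)
```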

### Tuples: immutable lists

We already learned about `list`: lists are containers for different types of stuff. There is another *built-in* container data type, called `tuple`. Tuples are denoted with parentheses `()` instead of brackets `[]`.

In [23]:
# make a tuple
info = ('Sergey', 28, 'Russian', 1989, 1, 9)
info

('Sergey', 28, 'Russian', 1989, 1, 9)

In [24]:
type(info)

tuple

Tuples are very much like lists, except for one thing -- unlike lists, they cannot be changed. Here is an example:

In [25]:
# let's make a list out of our tuple:
info_list = list(info)
info_list

['Sergey', 28, 'Russian', 1989, 1, 9]

In [26]:
type(info_list)

list

In [27]:
# now try to change something in it: it works
info_list[1] = 29
info_list

['Sergey', 29, 'Russian', 1989, 1, 9]

In [28]:
# let's try to do the same with tuple
info[1] = 29

TypeError: 'tuple' object does not support item assignment

We get an error if we try to change a value in a tuple. The same happens if we try to add something to it.

A good question would be -- why do we need something that is exactly like a `list`, but can do LESS than a `list`? It turns out that for many reasons it is very convenient to have a data type that cannot be changed. We won't go into details here, but if you have something which you don't intend to change, consider making it a `tuple` instead of a `list`. At the very least, you won't change it *accidentally*.
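
A short sketch of what "cannot add something" means in practice. The tuple below is a shortened version of the one above, and `longer` is a made-up name. Note that `+` does not modify a tuple; it builds a new one.

```python
info = ('Sergey', 28, 'Russian')

# tuples have no append method
print(hasattr(info, 'append'))   # False

# "+" builds a brand-new tuple; the original is left untouched
longer = info + (1989,)
print(info)     # ('Sergey', 28, 'Russian')
print(longer)   # ('Sergey', 28, 'Russian', 1989)

# a tuple can be unpacked into separate variables in one step
name, age, nationality = info
print(name, age)   # Sergey 28
```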

### Strings

In [45]:
my_string = 'alphabet'

In [46]:
my_string[0]

'a'

In [48]:
my_string[1]

'l'

### Mapping data types

Another *built-in* container data type is `dict` (short for *dictionary*). A `dict` contains **pairs of things**. Every entry in a `dict` is a `key`:`value` pair (in programming this relationship is called a *mapping*: each key maps onto a value). Think of it as a real-world dictionary -- in an English-Italian dictionary you have a `key` word, e.g. **shirt**, and a `value` associated with it: **camicia**. You find a `value` by looking up its `key`. Just like in a real dictionary, you cannot go the other way and find the word **shirt** by looking up **camicia** -- you would need another, Italian-English dictionary for that. The same holds for `dict`: `keys` and `values` are not symmetric, and you can only go in one direction, `key`->`value`.

Syntax for a `dict` is to put `key:value` pairs inside curly brackets `{}`, with different pairs separated by comma:

In [13]:
{'shirt':'camicia'}

{'shirt': 'camicia'}

In [14]:
info = {'name':'Adina', 'surname':'Drumea', 'lab':'Diamond', 'languages': ['Matlab','C++']}
info

{'lab': 'Diamond',
 'languages': ['Matlab', 'C++'],
 'name': 'Adina',
 'surname': 'Drumea'}

Here is another way of defining a `dict`. The results are equivalent, so choose whichever you like. Note that in this case the `keys` are written without quotes (as keyword arguments), but they become strings in the dict:

In [16]:
info = dict(name='Adina', surname='Drumea', lab='Diamond', languages=['Matlab','C++'])
info

{'lab': 'Diamond',
 'languages': ['Matlab', 'C++'],
 'name': 'Adina',
 'surname': 'Drumea'}

We can retrieve values from `dict` by specifying `key` like this:

In [None]:
info['surname']

In [None]:
info['taken_prog_class']
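
Asking for a key that is not in the `dict` raises a `KeyError`. Below is a minimal sketch of two common ways to handle that, using a small stand-in dictionary (`person` and the key `'age'` are made up for this example).

```python
person = {'name': 'Adina', 'surname': 'Drumea'}

# .get() returns None (or a default you choose) instead of raising
print(person.get('age'))             # None
print(person.get('age', 'unknown'))  # unknown

# alternatively, catch the KeyError explicitly
try:
    age = person['age']
except KeyError:
    age = None

print(age)   # None
```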

**Note**: Both `key` and `value` can be of almost any type. The only exception is that a `key` cannot be a `list` or another *modifiable* (mutable) type, because `dict` needs its keys to be hashable. If a `key` repeats, the later value overrides the earlier one:

In [None]:
{'name':'Adina', 'name':'Marinella'}
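
A hedged sketch of both points in the note. The dictionary `recordings` and the flag `list_key_failed` are made up for this example; the exact wording of the error message differs between Python versions, so it is not shown here.

```python
# a tuple can be a key, because it cannot be modified
recordings = {('S8', '20160701'): 'ok'}
print(recordings[('S8', '20160701')])   # ok

# a list cannot be a key: Python raises a TypeError
try:
    bad = {['S8', '20160701']: 'ok'}
    list_key_failed = False
except TypeError:
    list_key_failed = True
print(list_key_failed)   # True

# when a key repeats, the last value is kept
names = {'name': 'Adina', 'name': 'Marinella'}
print(names)   # {'name': 'Marinella'}
```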

Besides storing and retrieving values from `dict`, you can also iterate through `keys` and `values` easily:

In [None]:
for (key, value) in info.items():
    print('The key was:', key)
    print('The value was:', value)
    print('')

`dict` supports a lot of different operations (check full documentation <a href="https://docs.python.org/2/library/stdtypes.html#mapping-types-dict">here</a>). Here are some of them:

In [None]:
# check whether certain key is in the dict
'surname' in info

In [None]:
'age' in info

In [None]:
# return the keys (in Python 3 this is a view object; wrap it in list() to get a list)
info.keys()

In [None]:
# return the values (in Python 3 this is a view object; wrap it in list() to get a list)
info.values()

In [None]:
# add stuff to the dict
info.update({'age':28, 'rooms':324})

In [None]:
# removing stuff from the dict
del info['taken_prog_class']

In [None]:
info

**Note**: In older versions of Python (before 3.7), the most attentive of you would notice that the order of the `key`:`value` pairs can change when the `dict` is updated. This shows a potential pitfall of `dict` that you have to be careful about: **in those versions `dict` does NOT guarantee the order of the inserted pairs**! For example, if you iterate through the values in the dict, you cannot trust that they come out in the order in which you inserted the pairs. Since Python 3.7, a plain `dict` is guaranteed to keep insertion order.

If you ever need a mapping type that remembers the order regardless of the Python version, take a look at <a href="https://docs.python.org/2/library/collections.html#collections.OrderedDict">`OrderedDict` from `collections` module</a>. It operates the same way as `dict`, but keeps the insertion order when you iterate. (On Python older than 3.6 the order of keyword arguments is itself not preserved, which is why the output recorded below is shuffled; passing a list of `(key, value)` pairs avoids this.)

In [39]:
from collections import OrderedDict
info_ordered = OrderedDict(name='Adina', surname='Drumea', lab='Diamond', languages=['Matlab','C++'])
info_ordered

OrderedDict([('surname', 'Drumea'),
             ('languages', ['Matlab', 'C++']),
             ('lab', 'Diamond'),
             ('name', 'Adina')])

In [40]:
for key, value in info_ordered.items():
    print(key, value)

surname Drumea
languages ['Matlab', 'C++']
lab Diamond
name Adina
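
For readers on a recent Python, here is a sketch that assumes Python 3.7 or later. It shows that a plain `dict` keeps insertion order there, and two things that `OrderedDict` still adds: order-sensitive equality and explicit reordering with `move_to_end`. All variable names are made up for this example.

```python
from collections import OrderedDict

# Python 3.7+: a plain dict keeps insertion order
plain = {}
plain['name'] = 'Adina'
plain['surname'] = 'Drumea'
plain['lab'] = 'Diamond'
print(list(plain))   # ['name', 'surname', 'lab']

# two plain dicts with the same pairs are equal regardless of order...
a = {'x': 1, 'y': 2}
b = {'y': 2, 'x': 1}
print(a == b)        # True

# ...but two OrderedDicts are equal only if the order matches too
oa = OrderedDict([('x', 1), ('y', 2)])
ob = OrderedDict([('y', 2), ('x', 1)])
ordered_equal = (oa == ob)
print(ordered_equal)  # False

# OrderedDict can also reorder entries explicitly
oa.move_to_end('x')
print(list(oa))      # ['y', 'x']
```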


## `for` loops and list comprehensions

Python has a number of syntactic shortcuts that can speed up coding. You will learn them over time, but one particularly useful shortcut is the *list comprehension*, which not only speeds up coding but also significantly improves code readability. As a consequence, it is used ubiquitously.

It has to do with how we write `for` loops. Consider the following (real-life) example. Let's say I recorded behavior in a bunch of rats, and each session has a name that contains the year, month, and day of the session followed by the codename of the rat, in the format YYYYMMDDratcode. Example: `20170114S8`, where `S8` is the name of the rat.

In [41]:
sessions = ['20160701S8', '20160702S9', '20160702S8','20160703S10', '20160703S9', '20160703S8']
sessions

['20160701S8',
 '20160702S9',
 '20160702S8',
 '20160703S10',
 '20160703S9',
 '20160703S8']

Now I just want to get the dates of the session, so that I can see on which days I recorded at least 1 rat. I could construct the following loop:

>**Syntax tip**: `append(x)` is a method of type `list`, which adds `x` to the end of the `list`.

In [42]:
# create a new empty list, which we will append to later
sessions_date = []
# iterate through every session
for s in sessions:
    # append the first 8 characters of the session name to the new list
    sessions_date.append(s[:8])
    
sessions_date

['20160701', '20160702', '20160702', '20160703', '20160703', '20160703']

Possible, but a bit too tedious. Especially the part where you have to create an empty `list` and then append values there. There is a better way in Python:

In [43]:
[s[:8] for s in sessions]

['20160701', '20160702', '20160702', '20160703', '20160703', '20160703']

This produces exactly the same output, but instead of taking several lines it takes just one. It is also quite easy to read once you get the hang of it. The part `for s in sessions` is exactly the same as in the long `for` loop above, and it does the same thing: it iterates through the values of `sessions`, and on each iteration `s` takes the next value from the list. For each iteration, the expression `s[:8]` (the first 8 characters of `s`) is evaluated, and the results are automatically collected in a new list. You can assign it to a variable in the same way as any other list:

In [44]:
sessions_date = [s[:8] for s in sessions]
sessions_date

['20160701', '20160702', '20160702', '20160703', '20160703', '20160703']

**Side note**: To follow through with the example, if I wanted to get the unique days of recording, I could use the function `unique` from the `numpy` module, which returns only the unique values:

In [None]:
from numpy import unique
print(unique(sessions_date))
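
If `numpy` is not available, a similar result can be sketched with built-ins only. This is an alternative to `unique`, not what the cell above does, and `unique_dates` is a made-up name.

```python
sessions_date = ['20160701', '20160702', '20160702',
                 '20160703', '20160703', '20160703']

# a set keeps each value once; sorted() turns it back into an ordered list
unique_dates = sorted(set(sessions_date))
print(unique_dates)   # ['20160701', '20160702', '20160703']
```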

*List comprehensions* (or *listcomps* for short) will save you a lot of time and space in your scripts. You can even add conditions inside them. Let's say I wanted to return the dates ONLY for the rat `S9`. I can use `in` to check for the presence of a sub-string in a larger string like so:

In [None]:
s = '20160701S9'
'S9' in s

However to go through all sessions and check, I would need a `for` loop with `if` inside (if you want, try to implement it like that as an exercise). Instead we can do the same with listcomp:

In [None]:
[s[:8] for s in sessions if 'S9' in s]
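
For comparison, here is one possible version of the explicit loop that the listcomp replaces, followed by a caution about sub-string checks. The names `dates_loop`, `dates_listcomp`, `loose_match` and `exact_match` are made up for this example, and the rat code `S1` is hypothetical.

```python
sessions = ['20160701S8', '20160702S9', '20160702S8',
            '20160703S10', '20160703S9', '20160703S8']

# the explicit loop with an `if` inside
dates_loop = []
for s in sessions:
    if 'S9' in s:
        dates_loop.append(s[:8])

# the equivalent list comprehension
dates_listcomp = [s[:8] for s in sessions if 'S9' in s]
print(dates_loop == dates_listcomp)   # True

# caution: a sub-string test can match more than intended,
# e.g. 'S1' is also contained in the rat code 'S10'
loose_match = [s for s in sessions if 'S1' in s]
print(loose_match)    # ['20160703S10']

# comparing the whole rat code (everything after the 8-character date) avoids this
exact_match = [s for s in sessions if s[8:] == 'S1']
print(exact_match)    # []
```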

## Functions with default arguments

We saw how to create basic functions. One additional useful trick is to give a function default arguments. That is, when you call the function, you can specify the argument if you want, but you can also skip it, and it will take on some default value. This is very easy to do in Python: when you define the function's arguments, simply write `<argument-name> = <default-value>`, and the argument will take this value if no other value is specified when the function is called.

Let's see an example. The following function raises `x` to the power `p`. However, there is a twist: you can specify an additional argument `verbose`, and if it is `True`, the function will print out what it does. (It is good practice to add these kinds of options to your functions, because they can help you debug errors in your code.)

In [54]:
def power(x, p, verbose=False):
    if verbose:
        print("evaluating power for x = " + str(x) + " using exponent p = " + str(p))
    return x**p

In [55]:
# we can use the function to just raise 5 to power 2 (equivalent to 5**2)
power(5,2)

25

In [56]:
# but we can also make the function "verbose", so it tells us what it does
power(5,2,True)

evaluating power for x = 5 using exponent p = 2


25

Any number of arguments can have default values, even all of them. Try to understand which values each argument takes and what the function outputs in the following cells:

In [57]:
def power(x=5, p=2, verbose=False):
    if verbose:
        print("evaluating power for x = " + str(x) + " using exponent p = " + str(p))
    return x**p

In [58]:
power()

25

In [59]:
power(6)

36

In [60]:
power(6,3)

216

In [61]:
power(6,3,True)

evaluating power for x = 6 using exponent p = 3


216
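
One pitfall worth knowing about, which ties back to the note on assignment above: a default value is created once, when the function is defined, so a mutable default such as a list is shared between calls. The sketch below uses two made-up functions, `add_session_shared` and `add_session`, to show the problem and a common way around it.

```python
def add_session_shared(name, sessions=[]):
    # the default list is created once, when the function is defined,
    # and is then reused by every call that omits `sessions`
    sessions.append(name)
    return sessions

first = add_session_shared('20160701S8')
second = add_session_shared('20160702S9')
print(second)            # ['20160701S8', '20160702S9']
print(first is second)   # True: both calls returned the same list


def add_session(name, sessions=None):
    # use None as the default and build a fresh list inside the function
    if sessions is None:
        sessions = []
    sessions.append(name)
    return sessions

print(add_session('20160701S8'))   # ['20160701S8']
print(add_session('20160702S9'))   # ['20160702S9']
```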

## Positional arguments as keyword arguments

When you're creating the function, you always specify the names of the arguments so that you can use them inside the function. These names have another use: when you call the function, you can pass values by using argument names, for example:

In [62]:
power(x=10, p=3, verbose=False)

1000

In [63]:
power(x=3, p=10, verbose=False)

59049

Why would we do it? First, it leads to clearer function calls: you don't need to remember that the first argument was `x`, the second was `p`, etc. More importantly, if you could only pass arguments by position, you could not keep the default value of one argument while specifying a non-default value for an argument that comes after it.

Consider the following example: what if I wanted to call our function `power()` with default `x` and `p`, but specifying `verbose=True`? I cannot do `power(True)`: in that case `True` will become the value of `x`, because it is the first argument. But what I can do is the following:

In [64]:
power(verbose=True)

evaluating power for x = 5 using exponent p = 2


25

In that case both `x` and `p` keep their default values. This is extremely useful for functions with many parameters.

Moreover, if I use this type of passing arguments (which is called *keyword arguments* as opposed to *positional arguments*), I don't need to care about the order at all:

In [65]:
power(verbose=True, x=9, p=2)

evaluating power for x = 9 using exponent p = 2


81

In [66]:
# same call as before: order doesn't matter
power(p=2, verbose=True, x=9)

evaluating power for x = 9 using exponent p = 2


81

You can also combine *keyword* and *positional* arguments:

In [67]:
# in this case p will keep default value 
power(10, verbose=True)

evaluating power for x = 10 using exponent p = 2


100

The only thing you cannot do is to pass positional arguments after keyword:

In [68]:
power(verbose=True, 10)

SyntaxError: positional argument follows keyword argument (<ipython-input-68-88ea12f59210>, line 1)

The reason this is impossible is logically sound: in that case it is ambiguous which argument you're trying to pass as the "positional" `10`.
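
Two related behaviors, sketched under the assumption of Python 3. The function `power_kw` is a made-up variant of `power`: the bare `*` in its signature makes `verbose` keyword-only, so a stray positional `True` cannot be mistaken for it. The sketch also shows that passing the same argument both by position and by keyword is rejected.

```python
def power_kw(x=5, p=2, *, verbose=False):
    # parameters after the bare * can only be passed by keyword
    if verbose:
        print("evaluating power for x = " + str(x) + " using exponent p = " + str(p))
    return x**p

result = power_kw(6, 3, verbose=True)   # prints the message
print(result)                           # 216

# passing verbose by position is now rejected with a TypeError
try:
    power_kw(6, 3, True)
    positional_rejected = False
except TypeError:
    positional_rejected = True
print(positional_rejected)   # True

# giving x twice (once by position, once by keyword) is also a TypeError
try:
    power_kw(10, x=3)
    duplicate_rejected = False
except TypeError:
    duplicate_rejected = True
print(duplicate_rejected)    # True
```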

### Function encapsulation

**Pro-tip**: Python doesn't have complete encapsulation like some other languages. This means that while definitions from inside the function do not "leak" outside (just like I couldn't find variable `D` when I asked for it above), the definitions from **outside** can be freely used inside the functions even if we didn't specify them as inputs to the functions. Example:

In [69]:
def test_function(x):
    local_variable = 'this is local'
    print(x)
    print(outside_variable) # note that it is not part of the input!

outside_variable = 'this is outside'
input_variable = 'this is input'

test_function(input_variable)

print(local_variable)

this is input
this is outside


NameError: name 'local_variable' is not defined

However, this is only true of functions you define in the same script (or notebook) as the one where you use them, like in our case. If you import a function from another place (we will learn how to do it in the next section), it sees only the global names of the module where it was defined, not those of your script, so it is effectively encapsulated from your variables. This is why, when you're building a complex script, it is **good practice** to keep your functions in a separate script and import them as needed.
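
Two details of this behavior, sketched with made-up names (`counter`, `read_counter`, `try_to_reset`): an outside variable is looked up when the function runs, not when it is defined, and assigning to the same name inside a function creates a new local variable instead of changing the outside one.

```python
counter = 0

def read_counter():
    # reading a name that is not defined locally falls back to the outside one
    return counter

def try_to_reset():
    # assignment creates a new LOCAL variable; the outside counter is untouched
    counter = 100
    return counter

counter = 5
print(read_counter())   # 5: the outside value is looked up when the function runs
print(try_to_reset())   # 100
print(counter)          # 5: still unchanged outside
```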