# All of Python - The Basics

This notebook will introduce enough of Python and programming concepts to enable you to write basic programs.

We are taking a "breadth first" approach, rather than "depth first."

## Built-In Data Types

At a high level, there are two main data types: numbers and words. However, there are some complexities which novice programmers often miss. We will also introduce the boolean type and mention `None`. For the purpose of this class, we will skip types such as bytes, complex data types, etc.

### Numbers
There are actually _two_ common number types: integer style numbers and decimal style numbers (aka floating point numbers). In Python, these are called `int` and `float`. Integers don't have any decimal parts while floating point numbers do.

As we will see in the computer architecture lecture, the algorithms and the actual, physical hardware used to do math on integers and on floating point numbers are different.

**Warning** Floating point numbers are not exact. This becomes extremely important when we need to compare numbers. If you add 10 cents and 20 cents, do you get 30 cents? We will look at the surprising answers later.
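
A minimal sketch of the 10 cents plus 20 cents question, assuming the amounts are stored as floats in dollars. The variable names are illustrative, and `math.isclose` (from the standard `math` module) is one common way to compare floats, not the only one.

```python
import math

total = 0.10 + 0.20                        # ten cents plus twenty cents, in dollars

exactly_equal = (total == 0.30)            # exact comparison
close_enough = math.isclose(total, 0.30)   # comparison that allows a tiny tolerance

print(total)
print(exactly_equal)   # False
print(close_enough)    # True
```

The exact comparison fails because neither 0.10 nor 0.20 can be stored exactly in binary, so the sum ends up slightly different from 0.30.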

Some languages refer to floating point numbers as doubles. Lower level programming languages have a whole hierarchy of numeric types to represent the range of values numeric types can take on. As we will see in the computer architecture lecture, numbers stored in a fixed amount of memory cannot be arbitrarily large (or small).


Use the function `type()` to find the type of an expression:

In [None]:
type(100)

In [None]:
type(3.14)

In [None]:
type( 100 + 3.14 )

**Exercise** What is the type of `100.000`?

In [None]:
#Try it here

**Extended exercise** Use the function `dir()` to get a list of operations which can be done on an expression

In [None]:
dir(100) # Why are there so many __XXX__ style entries?

### Words and characters
Python has a single type to represent words and characters. Many lower level languages have one type to represent single characters (usually called `char`) and another to represent collections of characters, such as words, sentences, web pages, books, etc. This data type is usually called a `string` (sometimes abbreviated as `str` or capitalized as `String`). Imagine a `string` as a long strand, connecting characters.

![](images/best-mommy-ever-jewelry.jpg)

Python has a `str` type to represent strings (and characters).

In [None]:
type("hello!")

In [None]:
type('hello!')

**Exercise**

In [None]:
type(hello) #<== What is going on here?

In [None]:
type("A")

In [None]:
type('A')

**Exercise** What is the type of 1 vs "1"?

In [None]:
#Try it here

**Exercise** What happens if we add two strings?

In [None]:
"hello" + "world"

In [None]:
"1" + "2" # What is going on here?

**Extended exercise**

In [None]:
dir("hello!")

### Boolean (True and False)
Python is among those languages which provide a way to represent `True` and `False` directly. This data type is unexpectedly common in programming languages. For example, if you ask Python if 5 is greater than 3, the answer will be a boolean value (hint: True). Numeric values have associated operations we are used to from school: addition, subtraction, etc. String types have natural functions associated with them, such as upper case, lower case, combining strings, etc. Similarly, there is a "boolean algebra." We will introduce this later in the lecture.

In some languages there is no dedicated boolean type; the widely used `C`, for example, had none before the C99 standard. Instead, the number `0` is used to represent false and any nonzero number (typically `1`) is used to represent true.

Although Python has `True` and `False` keywords, they behave like the numbers `1` and `0`, respectively: `bool` is a subclass of `int`, so `True == 1` and `False == 0`!
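
A quick sketch of how `True` and `False` relate to `1` and `0`. The variable names are illustrative. `True` compares equal to `1` and can be used in arithmetic, yet it keeps its own type, `bool`.

```python
# Compare the boolean constants with the numbers 1 and 0
true_equals_one = (True == 1)
false_equals_zero = (False == 0)
bool_is_a_kind_of_int = isinstance(True, int)
sum_of_trues = True + True   # booleans can take part in arithmetic

print(true_equals_one, false_equals_zero)  # True True
print(bool_is_a_kind_of_int)               # True
print(sum_of_trues)                        # 2
print(type(True))                          # <class 'bool'>
```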

**Real world examples**

While numbers and strings are natural to us, the boolean type needs some context. As you will see in the example below, comparing things requires an answer that is either true or false.

Since computers are so good at doing repetitive tasks (executing loops), we need to tell the computer when to stop executing a loop. This is done by using booleans: keep doing something, until the value of a boolean is set to false.

Although such statements haven't been introduced yet, you have probably heard of if/else statements. This is one of the most common ways computers choose between options. Booleans are an integral part of such decisions: `if` a boolean value is `True`, then do this thing, `else` do something else. The operations carried out by your program depend on the value of a boolean.

In [None]:
type(True)

In [None]:
type(False)

In [None]:
5 > 3 # Is 5 greater than 3?

In [None]:
type(5 > 3)

**Basic operations**

Python provides the following comparison operators:

In [None]:
1 < 2 # is 1 less than 2?

In [None]:
1 > 2 # is 1 greater than 2?

In [None]:
1 <= 2 # is 1 less than or equal to 2?

In [None]:
1 >= 2 # what is this?

Keep in mind that you obviously don't need to compare 1 and 2; you already know the answer. However, such comparisons are useful when one or both terms are variables. Recall that in an earlier lecture we used comparison operators to check if the record we were processing belonged to Arya Stark or other characters of interest.

**Possibly confusing**

In [None]:
1 == 2 # is one EQUAL to 2?

Notice that `=` is used for assignment of variables, such as setting x equal to 10 with `x = 10`. However, checking if one thing is equal to another is done via two equal signs: `1 == 2`

In [None]:
1 != 2 #is 1 NOT EQUAL to 2? or is 1 different from 2?

Boolean statements, such as the ones above, can be combined using `and`, `or` and `not`:

In [None]:
homer_age = 34
marge_age = 32
bart_age  = 10
lisa_age  = 8
maggie_age = 1

In [None]:
marge_age < homer_age and marge_age > bart_age # Marge's age is between Homer and Bart

In [None]:
#Is there anyone who is younger than lisa?
anyone_younger_than_lisa = lisa_age > homer_age or lisa_age > marge_age or lisa_age > bart_age or lisa_age > maggie_age
anyone_younger_than_lisa

In [None]:
is_lisa_youngest = not anyone_younger_than_lisa
is_lisa_youngest

**Exercise**

In [None]:
dir(True) # Does this list look similar to a dir(123)?

**Boolean algebra**

If you have a true statement, such as "Homer Simpson is a dad," and you combine it with another true statement, "Marge Simpson is a mom," is the resulting statement True or False? (hint: you already know the right answer)

In [None]:
homer_is_a_dad = True
marge_is_a_mom = True

marge_is_a_dad = False
homer_is_a_mom = False


In [None]:
print(homer_is_a_dad and marge_is_a_mom)
print(homer_is_a_dad and marge_is_a_dad)
print(marge_is_a_dad and homer_is_a_dad)
print(homer_is_a_mom and marge_is_a_dad)

In [None]:
print(homer_is_a_dad or marge_is_a_mom)
print(homer_is_a_dad or marge_is_a_dad)
print(marge_is_a_dad or homer_is_a_dad)
print(homer_is_a_mom or marge_is_a_dad)

Here is the full **truth table**:

|Statement A | Statement B| A and B | A or B|
| ---        | ---        | ---     | ---   |
| True       | True       | True    | True  |
| True       | False      | False   | True  |
| False      | True       | False   | True  |
| False      | False      | False   | False |
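
The table can be reproduced with a short sketch that tries every combination of two boolean values. The names `values` and `rows` are illustrative, and the nested loops use the same `for` syntax seen in the earlier lecture.

```python
values = [True, False]
rows = []  # one tuple per row: (A, B, A and B, A or B)

for a in values:
    for b in values:
        rows.append((a, b, a and b, a or b))
        print(a, b, a and b, a or b)
```

Each printed line corresponds to one row of the table, in the same order.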

## Variables

Recall from the 'first programs' lecture that variables are necessary to answer questions, like the number of people who met their match in Arya Stark. Variables store values which your program needs to recall. You may also think of a variable as a name assigned to a value. You can type out 3.14, again and again in your program. Wouldn't it be more informative to use `pi` instead? 

You could go through a file (which may have thousands of lines) and assign a variable `current_line` to whichever line you are processing.

In [None]:
x = 100

In [None]:
x

In [None]:
x + 2

Once a variable is assigned a value, you should be able to substitute the variable any place that value is expected

In [None]:
type(x) # Is there a difference between type(x) and type(100) or dir(x) and dir(100)?

Notice that different values can be assigned to variables (variables can _vary_).

In [None]:
x = 200

In [None]:
x

You can create an expression where a variable's new value is computed from its current value. This is used when you are aggregating values in a loop.

In [None]:
x = x + 1

In [None]:
x
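
A minimal sketch of that aggregation pattern, assuming we want to add up a handful of numbers. The list `values` and the name `total` are illustrative.

```python
values = [34, 32, 10, 8, 1]   # illustrative numbers

total = 0
for value in values:
    total = total + value     # same pattern as x = x + 1

print(total)  # 85
```

Python also accepts the shorthand `total += value`, which does the same thing.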

**Exercise** The formula for maximum safe heart rate is `220 - age`. In the cell below, set the `age` variable and execute the cell to find out your maximum heart rate.

In [None]:
age = 30 # Replace 30 with your own age
MHR = 220 - age
MHR

## Functions

We will break up the study of functions into two subjects: how to use functions and how to create them. In this tutorial, let's see how to use functions.

Functions are so fundamental to programming that we have already used them several times in this tutorial.

When we find the absolute value of an integer `abs(-10)`, we are using the `abs` function. When we check the `type("hello")` of a string or an integer, we are using the `type` function. When we want to find out which operations are valid for a data type, using `dir(100)`, we are using the `dir` function.

Think of a function as a machine. We provide some input to the machine and it provides us with some output. We don't always concern ourselves with how this machine works. How does `abs` remove the negative sign? How does `dir` figure out which operations are valid? We can safely ignore these details until we have an actual need to understand these functions.

In [None]:
abs(-10)

Functions are _called_, _invoked_ or _executed_ in the following manner:

`result = function_name(argument1, argument2, argument3)`

A function has a name (we will study nameless functions later). We pass it some input, more often called _arguments_ or _parameters_.

Once the function is done executing, it _returns_ a value.
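
A short sketch of this calling pattern using three built-in functions, each taking a different number of arguments. The variable names on the left are illustrative.

```python
rounded = round(3.14159, 2)   # two arguments: the number and the digits to keep
largest = max(3, 7, 5)        # three arguments
length = len("hello")         # one argument

print(rounded)  # 3.14
print(largest)  # 7
print(length)   # 5
```

In each case the value the function _returns_ is stored in the variable on the left of the `=` sign.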

Recall from an earlier lecture that, in Jupyter notebook, executing `function_name?` gives you documentation for that function. In some cases, `function_name??` gives you the source code for the same function.

#### Importing functions

Python has a number of built-in functions. We have already seen `type` and `dir`. 

This page describes the full list, along with their docs: https://docs.python.org/3/library/functions.html

Notice that math functions such as `abs` (absolute value) and `round` (return rounded number) are available. Surely Python has other functions, such as ceiling, floor, log, power, sin, cos and many other middle school level functions? Where are they?

In order to keep functions organized, Python puts them in relevant modules (also called libraries). For example, Python provides a module called `math`, which contains _many_ useful math functions. In order to use those functions, you have to tell your program that you would like to `import` the relevant module.

In [None]:
sqrt(100) # What happened?

In [None]:
floor(3.14)

In [None]:
import math #import the math module, all function accessible by prefixing the module name

In [None]:
math.sqrt(100)

In [None]:
floor(3.14)

In [None]:
math.floor(3.14)

In [None]:
from math import * # import all functions from math and make them accessible without a prefix

In [None]:
floor(3.14)

In [None]:
import math as m # import the math module, but change the prefix name to m

In [None]:
m.floor(3.14)
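
One more form worth knowing is importing only the names you need. This is a small sketch rather than a rule, although Python's style guide (PEP 8) does recommend avoiding `from math import *`, because it makes it unclear where a name came from.

```python
from math import floor, sqrt  # import just these two functions

floored = floor(3.14)
root = sqrt(100)

print(floored)  # 3
print(root)     # 10.0
```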

This link provides information about all built-in modules: https://docs.python.org/3/py-modindex.html

#### Methods

Notice that so far in this class, we have called functions in two ways: by themselves and using the _dot_ notation. For example:

In [None]:
abs(-100)

In [None]:
"hello world".split(" ")

Hopefully, you can tell from context that `"hello world".split(" ")` calls the `split` function, which understands that it is operating on the string `"hello world"`. 

When a function is called via the dot notation, it is called a method, and it is operating on an object. We will introduce objects and object oriented programming a little further down the line. For now, understanding this usage from context should be enough.
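
A small side-by-side sketch of the two calling styles, using a string. The variable names are illustrative; `upper` and `split` are string methods, while `len` is a built-in function.

```python
greeting = "hello world"

shouted = greeting.upper()     # method: called on the string with dot notation
words = greeting.split(" ")    # method: also called on the string
n_chars = len(greeting)        # function: the string is passed as an argument

print(shouted)   # HELLO WORLD
print(words)     # ['hello', 'world']
print(n_chars)   # 11
```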

#### Third-party modules

When we discuss the relative quality of programming languages, their ecosystem of libraries is often extremely important. As powerful as programming languages are, if one had to rely only on built-in functions, thousands of programmers would find themselves re-inventing the wheel countless times.

Luckily, programmers, such as you and me, can create our own modules (aka libraries) and distribute them for others to use. Recall from an earlier lecture that getting information about Game Of Thrones characters took a bit of work. However, when we used the Pandas library, we were able to answer the same question in a single line of code! This is because there were others before us who were annoyed enough at having to write so much code that they decided to _abstract_ away the complexity for us.

People have created libraries for practically every branch of mathematics, data science, physics, game development, communication. Whatever you can think of, someone has created a library for it.

In a later lecture, we will learn more about this ecosystem.

## Container types (aka data structures): lists and dictionaries

We often need to keep track of several items. In an earlier lecture, we saw that we needed to keep a list of names. In another scenario, we need to keep track of numbers per person. This section describes some basic container types, also known as data structures (in computer science literature).

### Dictionaries
Recall this program from an earlier lecture:

In [None]:
#How many people did Arya Stark and Jon Snow kill

jon = 0 #variable containing Jon's score
arya = 0 #variable containing Arya's score

#Open file
file = open("../../datasets/deaths-in-gameofthrones/game-of-thrones-deaths-data.csv", encoding='utf8')

#Go through each line in file
for line in file:
  tokens = line.split(',') #separate line into columns
  if tokens[4]=="Arya Stark":
    arya = arya + 1
  if tokens[4]=="Jon Snow":
    jon = jon + 1

file.close()
print("Arya killed", arya, "people")
print("Jon killed", jon, "people")

We have a variable for jon and for arya. But what if we wanted to calculate how many people _everyone_ killed (evidence for their trials)? We need a way to _dynamically_ create variables. These variables will need to be assigned values and those values will need to be updated.

Almost all programming languages provide a way to do this via _dictionaries_. Some languages call them maps (map a key to a value), some call them associative arrays. In Python, this type is called `dict`.

In [None]:
got_killers = {"arya": 1278, "jon": 112}

In [None]:
got_killers["arya"]

You can create an empty dictionary via `my_dict = {}` or `my_dict = dict()`.

In [None]:
simpson_ages = dict()

In [None]:
simpson_ages

At this point, `simpson_ages` is a dictionary with nothing in it:

| Keys | Values |
|------|--------|
|      |        |


In [None]:
simpson_ages["Homer"] = 36 #Set the value of key "Homer" to 36

The dictionary is no longer empty

| Keys | Values |
|------|--------|
|  Homer    |  36      |

In [None]:
simpson_ages["Homer"] # Get the value of key "Homer"

We haven't modified the dictionary

| Keys | Values |
|------|--------|
|  Homer    |  36      |

In [None]:
simpson_ages["Marge"] # What happened?
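
Asking for a key that is not in the dictionary raises a `KeyError`. A hedged sketch of two common ways to avoid that error: checking with `in` first, or using the dictionary's `get` method, which returns `None` (or a default you supply) when the key is missing. The dictionary is re-created here so the example is self-contained, and the result names are illustrative.

```python
simpson_ages = {"Homer": 36}

has_marge = "Marge" in simpson_ages            # is the key present?
maybe_age = simpson_ages.get("Marge")          # None when the key is missing
age_or_zero = simpson_ages.get("Marge", 0)     # 0 is the default we chose

print(has_marge)    # False
print(maybe_age)    # None
print(age_or_zero)  # 0
```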

In [None]:
person_name = "Marge"
person_age = 34

simpson_ages[person_name] = person_age
simpson_ages[person_name]

The dictionary now looks like this:

| Keys | Values |
|------|--------|
|  Homer    |  36      |
|  Marge    |  34      |

You have seen how to create an empty dictionary, then add items.
You can also create a dictionary with items already in it:

In [None]:
my_second_dictionary = {"key1": "value1", "key2": "value2", "key3": "value3", 4:"value4"}

In [None]:
my_second_dictionary["key1"]

In [None]:
my_second_dictionary[4]
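
One way to answer the "everyone" question raised earlier is to use each killer's name as a dictionary key. This is a minimal sketch, assuming (as in the program above) that column 4 holds the killer's name. `sample_lines` is made-up placeholder data standing in for the real file, and `kills` is a hypothetical variable name.

```python
# Made-up placeholder rows; only column 4 (the killer) matters here.
sample_lines = [
    "x,x,x,victim 1,Arya Stark",
    "x,x,x,victim 2,Jon Snow",
    "x,x,x,victim 3,Arya Stark",
]

kills = {}  # key: killer's name, value: number of kills seen so far

for line in sample_lines:
    tokens = line.split(',')   # separate line into columns
    killer = tokens[4]
    if killer in kills:
        kills[killer] = kills[killer] + 1   # seen before: add one
    else:
        kills[killer] = 1                   # first time: create the entry

print(kills)  # {'Arya Stark': 2, 'Jon Snow': 1}
```

No variable had to be created in advance for any particular person; the dictionary grows as new names appear.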

**Exercise** Add 2 years to Homer's age in `simpson_ages` (use the `+` operator)

In [None]:
# Try it here

**Exercise** Create a new dictionary, call it `last_names` and use this table to fill it:


| Keys | Values |
|------|--------|
|  Homer    |  Simpson      |
|  Marge    |  Simpson      |
|  Ned    |  Flanders      |
|  Barney    |  Gumble      |

**Exercise** Say we want to create a combined database of The Simpsons and The Flintstones. What happens when you add "Barney Rubble" to this dictionary? (Hint: Try it and look for Mr. Gumble and Mr. Rubble)

**Exercise** What is the `type` of `last_names`?

**Extended exercise** What operations are possible on `last_names`? (hint: `dir`)

### Lists

Now that you understand dictionaries, what if you want a list of all the names in the dictionary `simpson_ages`? Notice that you can call `simpson_ages.keys()` to get these names.

In [None]:
simpson_ages.keys()

The list is an extremely important container type, so important that some languages are built around it.

Recall that dictionaries are created either using the `dict()` function or curly braces `{}`. Similarly, lists can be created via `list()` or square brackets `[]`.

In [None]:
programming_languages = list() # or programming_languages = []

At this point, we have created an empty list:

[  ]

In [None]:
programming_languages.append("python")

The list now has an element:

["python"]

In [None]:
programming_languages.append("R")
programming_languages.append("julia")

The list now has three items:

["python", "R", "julia"]

If we knew exactly what we wanted to put in this list, we could have created it like this:

In [None]:
programming_languages = ["python", "R", "julia"]
programming_languages

In the next section, we will learn how to `loop` through each item in a list (or dictionary). If you want to access a single item at a known location, you can `index` into the list.

In [None]:
programming_languages[1]

In the previous statement, you asked for the item at location 1. Did you get what you expected? Try the next line:

In [None]:
programming_languages[0]

Python is a zero-indexed language! Many programming languages (and programmers) start counting at zero instead of one! R is one of the exceptions, as it starts counting at 1 (like most humans).

If you want to get a range of items from a list (say, the first 2), you can use the following syntax

In [None]:
programming_languages[0:2]

In [None]:
programming_languages[0:] #Start at zero (first item), go until the end

Here, instead of indexing with a single number, you are using `starting_point:ending_point` syntax, where `ending_point` is not inclusive (so the value at `ending_point` will NOT be included).

Finally (for this quick tutorial), you can access items from the end of a list with negative numbers.

In [None]:
programming_languages[-1] #This should get you the last item

In [None]:
programming_languages[-2] #This should get you second from the last item

In [None]:
programming_languages[-2:] #This should get you the last two items (start at -2, end at the end)

Recall this example from an earlier lecture, which recorded the names of people who were killed by Jaime. Notice where and how `list` is being used.

In [None]:
killed = [] # list data type

file = open("../../datasets/deaths-in-gameofthrones/game-of-thrones-deaths-data.csv", "r", encoding='utf8')

for line in file:
  tokens = line.split(',')
  if tokens[4]=="Jaime Lannister":
    name_of_killed = tokens[3]
    killed.append(name_of_killed)

file.close()
print(killed)

**Exercise** What is the type of `programming_languages`?

Did you notice that we used a type of list at the beginning of the lecture? Strings!
While Python strings are not identical to lists, their _interface_ (the operations and functions which can be used) is very similar! Everything you learn about accessing items from a list will apply to strings (but strings can't be modified).

In [None]:
homer = "Homer Simpson"

In [None]:
homer[0]

In [None]:
homer[0:5]
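
A brief sketch of the one difference mentioned above: list items can be replaced in place, while strings cannot. The names `family` and `lowercase_start` are illustrative. The failing line is left as a comment so the example runs from top to bottom.

```python
homer = "Homer Simpson"
family = ["Homer", "Marge"]

family[0] = "Bart"     # works: list items can be replaced in place
# homer[0] = "h"       # would raise TypeError: strings cannot be changed in place

lowercase_start = "h" + homer[1:]   # build a new string instead

print(family)           # ['Bart', 'Marge']
print(lowercase_start)  # homer Simpson
print(homer)            # Homer Simpson (the original is unchanged)
```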

Many functions return lists. One you have already seen is the `split` function.

In [None]:
"All models are wrong, some are useful".split(" ")

**Extended exercise** What operations can be done on lists, such as `programming_languages`?

**Exercise** Which function finds the length of a list? (hint: it is a built-in function, use google to help you find this)

### Type conversion

Say you read a file which contains a line containing this: "Homer Simpson". You know that is a string type. What if the line contains 100? When Python reads text from the outside world, it has no idea if it is dealing with an integer, a float, a picture or an audio file of a song. It treats everything as a string. Lucky for us, there are several functions which convert from one type to another.

In [None]:
type(100)

In [None]:
type("100")

In [None]:
type(int("100"))

Notice that the `int` function converted a string to an integer. What if we hadn't converted "100" to a numeric value?

In [None]:
"100" + "200"

In [None]:
int("100") + int("200")

In [None]:
int("100.2")

In [None]:
float("100.2")

In [None]:
float("100")

In [None]:
type(float("100")) # Do you expect the output of this cell to be 'int' or 'float'?
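
`int` refuses a string such as "100.2" and raises a `ValueError`. A hedged sketch of one way around this, assuming a whole number is really what you want: convert to `float` first, then to `int`. The variable names are illustrative.

```python
text = "100.2"

as_float = float(text)              # string to float
as_int = int(as_float)              # float to int: the fractional part is dropped
nearest = round(float("100.7"))     # round to the nearest whole number instead

print(as_float)  # 100.2
print(as_int)    # 100
print(nearest)   # 101
```

Note that `int` drops the fractional part rather than rounding, which is why `round` is shown as an alternative.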

How about the other way, given an integer or a float, how do we convert it to a string?

In [None]:
"Homer is " + 34 + " years old"

In [None]:
"Homer is " + str(34) + " years old"

In [None]:
str(1234.567)

## Control flow: if/else and loops

Programs which never make any decisions and never execute a line of code more than once are not very interesting. It can be argued that what sets a programming language apart from a calculator is its ability to jump over some lines (using if/else statements) or operate on some value a number of times (loops).

### Loops

Once you have seen lists, loops are an obvious next step. Given a list of things, you need to _go through them, one by one, and operate on each item_. This process is called a `loop`.

For example, given a list of items, how would you capitalize each entry?

In [None]:
for name in ['homer', 'marge', 'bart', 'lisa', 'maggie']:
    print(name.capitalize())

print("Done with the loop")

What you just saw is a _for loop_. Since the list above has 5 entries, the indented line inside the loop executes 5 times; the final `print`, which is not indented, runs only once, after the loop ends.

The syntax looks like this:

```python
for VARIABLE in LIST:
   EXECUTE_SOME_STATEMENT
   EXECUTE_SOME_OTHER_STATEMENT
   ...
```

Some important things to notice:
* The first line of the loop ends in the ':' character
* All _indented_ lines immediately below the loop are considered part of the loop.
* The value of _VARIABLE_ changes with each _iteration_ of the loop

![](images/loop_diagram.png)

**Exercise** Take a look at programs from notebook _first_programs_ and find all the loops

**Exercise** Describe the output of the code below before executing it (in a later lecture, we will see a cleaner method of combining text and variables):

In [None]:
for word in "this sentence has a few words in it".split(" "):
    print("Length of the word", word, "is", len(word))

**Exercise** What is the `range` function and how can you use it in a loop?

**Exercise** What gets returned when you call `.keys()` on a dictionary? How can you use it in a loop?

### Conditional statements: if/else

Popular media seems to have already made "if, else statements" part of everyday parlance. Let's see how to use them in Python.

In [None]:
homer_age = 34
marge_age = 32
bart_age  = 10
lisa_age  = 8
maggie_age = 1

is_maggie_youngest_child = maggie_age < bart_age and maggie_age < lisa_age

if is_maggie_youngest_child:
    print("I'm a baby!!!")
else:
    print("I'm NOT a baby")

The general syntax of a conditional is:

```python
if BOOLEAN_VALUE:
  Execute if BOOLEAN_VALUE is true
elif BOOLEAN_VALUE2:
  Execute if BOOLEAN_VALUE2 is true
elif ...
else:
  Execute if none of the above boolean values was true
```

![](images/ifelse_diagram.png)

* Keep in mind that `elif` and `else` statements are optional
* Indentation is necessary
* You can execute as many statements after an if/elif/else clause as you like
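
A runnable sketch of the full if/elif/else form. The age cut-offs and the name `description` are made up for illustration; only the first branch whose condition is true gets executed.

```python
age = 10   # illustrative value; try changing it

if age < 2:
    description = "baby"
elif age < 13:
    description = "child"
elif age < 20:
    description = "teenager"
else:
    description = "adult"

print(description)  # child
```

Even though `age < 20` is also true for a 10 year old, Python stops at the first matching branch, so the result is "child".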

**Exercise** Find all instances of a conditional statement in notebook "first_programs"

**Homework**
1. Please list all locations in the file game-of-thrones-deaths-data.csv
2. Please list all allegiances in the same file (second to last column)
3. Please show the number of killings per season.
4. Print out this cheatsheet and keep it handy: https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf
5. Bookmark this page and start to go through it (you may ignore, for now, things you don't understand): https://learnxinyminutes.com/docs/python3/


**Reference**

Necklace image retrieved from https://www.designsbyleigha.com/name-necklace-snake.html

Diagram editor: https://mermaidjs.github.io/mermaid-live-editor