# Part 1a: An Introduction to Python

Let's get started with some Python fundamentals. A fill-in-the-blank version of this notebook is available on [Google Colab](https://colab.research.google.com/drive/1eoKBDSuBBIuroIhSChXZnAhHNmccPi-Z).

### Some Fun Facts

<img src="https://cms.qz.com/wp-content/uploads/2019/05/Guido-van-Rossum-e1558635088256.jpg?quality=75&strip=all&w=410&h=231" align="right" width="20%"/>

- created by Guido van Rossum, a programmer from the Netherlands
- officially released in 1991
- Guido named the language "Python" after the British comedy group Monty Python
- companies that use Python on their backends: Instagram, Netflix, Dropbox, Reddit, Pinterest, and many more

### Main Features

- general purpose, the "Swiss Army knife" of programming languages
- relies heavily on whitespace (indentation) to structure code; unlike JavaScript, it doesn't need semicolons to end statements
- open source and free to use
- has a large ecosystem of packages (e.g., pandas, NumPy) that make it easier to write code

### Python 2 vs. 3

- Python 2 has been discontinued as of January 1, 2020
- Python 3.7 is currently the gold standard





## Data Types

In this tutorial, we're going to walk through the most commonly used data types in Python:

- integers
- floats
- booleans
- strings

If you're interested in learning more about Python's built-in data types, check out the official Python 3 documentation [here](https://docs.python.org/3/library/stdtypes.html).

### 1. Numeric Types

The most common numeric types in Python are `int` (integers) and `float` (floating point numbers). Integers are numbers without a fractional component. 

In [1]:
x = 10

type(x)

int

Floats are numbers with a fractional component, written with a decimal point (or in scientific notation, such as `1.5e3`). They are useful when whole numbers aren't precise enough.

In [2]:
y = 3.14159

type(y)

float

Let's create two variables, `n_apples` and `cost_per_apple`, and assign a different numeric value to each variable name.

In [3]:
n_apples = 10

assert type(n_apples) == int

In [4]:
cost_per_apple = 3.55

assert type(cost_per_apple) == float

Now, let's print the value of these variables, along with their data type. We can get the data type of a value by wrapping it inside Python's built-in `type()`.

In [5]:
print(f"n_apples, {n_apples}, is type {type(n_apples)}")
print(f"cost_per_apple, {cost_per_apple}, is type {type(cost_per_apple)}")

n_apples, 10, is type <class 'int'>
cost_per_apple, 3.55, is type <class 'float'>


As expected, variable `n_apples` is an `int` while `cost_per_apple` is a `float`. 

The print statements above use f-strings. The name comes from the `f` that precedes the opening quote, which puts the string in "f-string mode". To embed an expression in your string, wrap it inside curly braces `{ }`. F-strings are available in Python 3.6 and later, and they let you embed Python expressions inside string literals in a readable way. Before Python 3.6, you had to use `%`-formatting or `.format()` to embed expressions inside strings, which was more verbose and more error-prone. If you want to learn more about f-strings, check out this [link](https://realpython.com/python-f-strings/).
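
To make the comparison concrete, here is a small sketch that builds the same sentence with all three approaches. The variable names `old_style`, `format_style`, `f_string`, and `total_line` are made up for this illustration, and the `:.2f` format specifier (two decimal places) is an extra that the cells above don't use.

```python
n_apples = 10
cost_per_apple = 3.55

# Three ways to build the same string
old_style = "%d apples cost $%.2f each" % (n_apples, cost_per_apple)
format_style = "{} apples cost ${:.2f} each".format(n_apples, cost_per_apple)
f_string = f"{n_apples} apples cost ${cost_per_apple:.2f} each"

print(f_string)

# An f-string can hold any expression, not just a variable name
total_line = f"Total: ${n_apples * cost_per_apple:.2f}"
print(total_line)
```

All three strings come out identical; the f-string simply keeps each value next to the place where it appears in the text.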

If we want to convert a float to an integer (or vice versa), we can easily do so by wrapping the variable in `int()` or `float()`. Let's try this out:

In [6]:
print(f"{cost_per_apple} --> {int(cost_per_apple)}")
print(f"{n_apples} --> {float(n_apples)}")

3.55 --> 3
10 --> 10.0
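
Notice that `int()` did not round 3.55 up to 4. It drops the fractional part, which means it truncates toward zero. The short sketch below illustrates this with a negative number too, and contrasts it with the built-in `round()`. The variable names are only for illustration.

```python
# int() discards everything after the decimal point (truncates toward zero)
truncated_pos = int(3.55)
truncated_neg = int(-3.55)

# round() goes to the nearest whole number instead
rounded = round(3.55)

print(truncated_pos, truncated_neg, rounded)  # 3 -3 4
```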


In Python, it's possible to mix integers and floats in an arithmetic operation. So you don't need to worry about converting these numeric types into a common format. Let's test it out with our variables `n_apples` (an integer) and `cost_per_apple` (a float).

In [7]:
total_cost = n_apples*cost_per_apple
total_cost

35.5

We can see that the output of `n_apples * cost_per_apple` is a float. This is because `n_apples`, which was originally an integer, gets converted to a `float` when it gets multiplied with `cost_per_apple`. 

Here are the most common arithmetic operations in Python:

- **Addition:** gets the sum of the operands

```
x + y
```

- **Subtraction:** gets the difference of the operands

```
x - y
```

- **Multiplication:** gets the product of the operands

```
x * y
```

- **Division:** produces the quotient of the operands and returns a float

```
x / y
```

- **Division with floor:** produces the quotient of the operands rounded down to the nearest whole number (the result is an `int` when both operands are integers)

```
x // y
```

- **Exponent:** raises the first operand to the power of the second operand

```
x ** y
```
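
As a quick worked example of the operators above, the sketch below applies each one to `x = 7` and `y = 2` (values chosen only for illustration). It also shows two details of floor division that are easy to miss: it rounds toward negative infinity, and it returns a float if either operand is a float.

```python
x = 7
y = 2

results = {
    "x + y": x + y,    # 9
    "x - y": x - y,    # 5
    "x * y": x * y,    # 14
    "x / y": x / y,    # 3.5 (always a float)
    "x // y": x // y,  # 3
    "x ** y": x ** y,  # 49
}

for expression, value in results.items():
    print(f"{expression} = {value}")

# Floor division rounds down, so negative results move away from zero
negative_floor = -7 // 2   # -4, not -3

# With a float operand, the result is a whole number stored as a float
float_floor = 7.0 // 2     # 3.0
print(negative_floor, float_floor)
```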

### 2. Boolean Type

Python's boolean (`bool`) can have one of two values: `True` and `False`. 

In [8]:
is_vegetarian = True 

type(is_vegetarian)

bool

In [9]:
is_vegan = False

type(is_vegan)

bool

#### Comparing Values with Boolean Expressions

A `boolean` expression evaluates a statement and results in a boolean value. For example, the operator `==` tests if two values are equal.

In [10]:
is_vegetarian == is_vegan

False

The `!=` operator tests if two values are **not** equal.

In [11]:
is_vegetarian != is_vegan

True

You can also compare two numeric values using:
 - `>` (greater than)
 - `<` (less than)
 - `>=` (greater than or equal to)
 - `<=` (less than or equal to)

In [12]:
n_donuts = 10
n_muffins = 5

In [13]:
n_donuts >= n_muffins

True

In [14]:
n_donuts < n_muffins

False

#### Comparing Strings with Boolean Expressions

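Interestingly, you can also compare two strings. Strings are compared character by character in alphabetical order, so the "larger" string is the one that comes later in the alphabet. The comparison is case-sensitive: every uppercase letter sorts before every lowercase letter.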

In [15]:
server = 'Anne'
host = 'Jim'

In [16]:
server > host

False

In [17]:
server < host

True
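
Under the hood, Python compares strings by the numeric code of each character, which is why case matters. Here is a small sketch with made-up example words. Converting both sides to lower case first is one common way to get a case-insensitive comparison.

```python
lowercase_check = 'apple' < 'banana'
mixed_case_check = 'Zebra' < 'apple'

print(lowercase_check)    # True
print(mixed_case_check)   # True, because 'Z' sorts before 'a'
print(ord('Z'), ord('a'))  # 90 97

# Compare lower-cased copies to ignore case
case_insensitive_check = 'Zebra'.lower() < 'apple'.lower()
print(case_insensitive_check)  # False
```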

### 3. Strings

Text is stored as type `str` (string). We think of a string as a sequence of characters. We write strings as characters enclosed in single or double quotes.

In [18]:
his_name = "John Doe"

In [19]:
her_name = 'Jane Smith'

In [20]:
print(f"{his_name} and {her_name} are friends.")
print(type(his_name), type(her_name))

John Doe and Jane Smith are friends.
<class 'str'> <class 'str'>


If a string contains an apostrophe, we can use double quotes to define the string and use a single quote character in the string.

In [21]:
phrase = "It's snowing outside"
print(phrase)

It's snowing outside


#### Adding Strings

We can also join strings together by "adding" them.

In [22]:
'ab' + 'cd'

'abcd'

#### Multiplying Strings

We can also create multiple copies of a string by "multiplying" it by an integer.

In [23]:
'Hello'*3

'HelloHelloHello'

#### Strings within Strings

If we want to see if a shorter string is inside a longer string, we can use the `in` operator.

In [24]:
'el' in 'Hello'

True

In [25]:
'le' in 'Hello'

False

We can also see if something is `not in` a string.

In [26]:
'test' not in 'Python is cool'

True

#### Strings as Lists

Strings behave like lists in several ways, because both are sequences. As with a list, you can get the length of a string, which is the number of characters it contains (spaces included).

In [27]:
text = 'I like Monty Python'
len(text)

19

You can also iterate over a string, which treats each character as a separate element of a list.

In [28]:
for c in 'string':
    print(c + ' :)')

s :)
t :)
r :)
i :)
n :)
g :)


#### Built-in String Functions

Strings also have some built-in methods that are useful when you're analyzing text. Each of these methods returns a new string and leaves the original unchanged.

You can replace part of a string with another string.

In [29]:
text = 'I like Monty Python'
text.replace('like', 'love')

'I love Monty Python'
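
Strings are immutable, so `replace` gives back a new string rather than editing `text` in place. A minimal sketch (the name `new_text` is just for illustration):

```python
text = 'I like Monty Python'
new_text = text.replace('like', 'love')

print(text)      # the original is unchanged
print(new_text)  # the replacement lives in the new variable
```

If you want to keep the result, assign it to a variable as shown here.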

You can convert a string to all upper case.

In [30]:
text.upper()

'I LIKE MONTY PYTHON'

You can also convert it to all lower case.

In [31]:
text.lower()

'i like monty python'

If your string has leading or trailing whitespace, you can clean it up using `strip`. Whitespace in the middle of the string is left alone.

In [32]:
whitespace_string = '    hello  .  '
whitespace_string.strip()

'hello  .'
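
If you only want to clean one side, `lstrip` and `rstrip` work the same way on the left and right ends. The sketch below uses `repr()` so that the remaining spaces are visible in the output. The variable names and the `'--hello--'` string are only for illustration.

```python
whitespace_string = '    hello  .  '

both_sides = whitespace_string.strip()
left_only = whitespace_string.lstrip()
right_only = whitespace_string.rstrip()

print(repr(both_sides))   # 'hello  .'
print(repr(left_only))    # 'hello  .  '
print(repr(right_only))   # '    hello  .'

# Passing characters to strip() removes those instead of whitespace
dashes_removed = '--hello--'.strip('-')
print(dashes_removed)     # hello
```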

Want to learn more about Python strings? Check out this [article](https://realpython.com/python-strings/). 

## Sequence Types

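This section covers three of Python's built-in collection types: lists, tuples, and sets. Strictly speaking, lists and tuples are sequence types, while a set is an unordered collection.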

### 1. Lists

Lists are constructed with square brackets, separating items with commas. 

In [33]:
tally = [10,12,5,17]

To get the length of a list, you can use `len()`.

In [34]:
len(tally)

4

You can get the max and min values of a list using `max()` and `min()`.

In [35]:
print(f"Max value: {max(tally)}, Min value: {min(tally)}")

Max value: 17, Min value: 5


You can also sort elements within a list using the `sorted()` function.

In [36]:
sorted(tally)

[5, 10, 12, 17]
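
One detail worth knowing: `sorted()` returns a new list and leaves the original untouched, while the list method `.sort()` reorders the list in place. A short sketch (the name `descending` is just for illustration):

```python
tally = [10, 12, 5, 17]

# sorted() builds a new list; reverse=True sorts from highest to lowest
descending = sorted(tally, reverse=True)
print(descending)  # [17, 12, 10, 5]
print(tally)       # [10, 12, 5, 17] (unchanged)

# .sort() changes the list itself and returns None
tally.sort()
print(tally)       # [5, 10, 12, 17]
```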

#### Lists are Ordered

Lists are ordered, which means that the order of elements within a list is part of the list's identity. You can have two lists with the exact same elements, but if the order of the elements is different, the lists are not the same. Let's demonstrate this with an example.

In [37]:
list1 = [1,2,3,4]
list2 = [4,3,2,1]

list1 == list2

False

`list1` and `list2` are not equal to one another since the order of their elements is different.

#### Accessing Elements within a List

You can access an element in a list by referencing its index. The index of a list starts at 0, which is probably different from what you're used to if you come from an R or MATLAB background.

In [38]:
shopping_bag = ['apple', 'carrots', 'milk', 'berries', 'grapes']

print(f"index 0: {shopping_bag[0]}")
print(f"index 2: {shopping_bag[2]}")

index 0: apple
index 2: milk


A list can have negative indices too. A negative list index counts from the end of a list. The figure below shows the default positive list indices on the bottom and the negative list indices on the top. 

<img src="assets/imgs/indexing_list.png" width="65%"/>
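
Here is how negative indices look in code, using the same `shopping_bag` list. The variable names `last_item` and `second_to_last` are only for illustration.

```python
shopping_bag = ['apple', 'carrots', 'milk', 'berries', 'grapes']

last_item = shopping_bag[-1]
second_to_last = shopping_bag[-2]

print(f"index -1: {last_item}")       # index -1: grapes
print(f"index -2: {second_to_last}")  # index -2: berries

# Index -1 is shorthand for the last position, len(list) - 1
print(shopping_bag[-1] == shopping_bag[len(shopping_bag) - 1])  # True
```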

You can slice a list using list indices. The expression `shopping_bag[m:n]` returns the portion of `shopping_bag` from index `m` up to, but not including, index `n`. Let's try this out.

In [39]:
shopping_bag[1:3]

['carrots', 'milk']

The code above returns 'carrots' and 'milk', which are represented by indices 1 and 2. It didn't return index 3 because the second number of the slice is non-inclusive. To include index 3, we would have to update the slice to `[1:4]`:

In [40]:
shopping_bag[1:4]

['carrots', 'milk', 'berries']
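
Slices have a few more forms that are handy in practice. The sketch below, which goes slightly beyond the cells above, shows what happens when you leave out one of the bounds, use a negative index, or add a step. The variable names are only for illustration.

```python
shopping_bag = ['apple', 'carrots', 'milk', 'berries', 'grapes']

first_two = shopping_bag[:2]       # start omitted: begin at index 0
from_three = shopping_bag[3:]      # end omitted: go to the end of the list
last_two = shopping_bag[-2:]       # negative start: count from the end
every_other = shopping_bag[::2]    # third number is the step size

print(first_two)    # ['apple', 'carrots']
print(from_three)   # ['berries', 'grapes']
print(last_two)     # ['berries', 'grapes']
print(every_other)  # ['apple', 'milk', 'grapes']
```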

#### Finding Elements in a List

You can check to see if an element exists inside a list using the `in` operator.

In [41]:
'carrots' in shopping_bag

True

In [42]:
'c' in shopping_bag

False

A side note on tuples, which are covered later in this section: a tuple is constructed with commas rather than square brackets, with or without enclosing parentheses, e.g., `a, b, c`. The one exception is the empty tuple, which must be written with parentheses: `()`.

#### Iterating Over Lists

There are several ways to iterate over a list. The traditional approach is to use a for loop.

In [43]:
for item in shopping_bag:
    print(f"-{item}")

-apple
-carrots
-milk
-berries
-grapes


If you also need the element's index in your for loop, you can access it using `enumerate`.

In [44]:
for i, item in enumerate(shopping_bag):
    print(f"{i+1}) {item}")

1) apple
2) carrots
3) milk
4) berries
5) grapes
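
Instead of writing `i+1`, you can ask `enumerate` to start counting at 1 with its optional `start` argument. A minimal sketch (the list `numbered` and the name `position` are only for illustration):

```python
shopping_bag = ['apple', 'carrots', 'milk', 'berries', 'grapes']

numbered = []
for position, item in enumerate(shopping_bag, start=1):
    line = f"{position}) {item}"
    numbered.append(line)
    print(line)
```

This prints the same numbered list as the cell above.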


Another way to iterate over a list is to use a list comprehension. This is a one-liner that is useful when you're applying a simple operation to each element in your list and want the results in a new list. For example, let's make all elements inside `shopping_bag` upper case.

In [45]:
[item.upper() for item in shopping_bag]

['APPLE', 'CARROTS', 'MILK', 'BERRIES', 'GRAPES']
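
To see what the list comprehension is doing, it can help to write out the equivalent for loop. The sketch below shows both, plus an optional `if` clause that filters elements. The names `upper_loop`, `upper_comprehension`, and `long_names` are only for illustration.

```python
shopping_bag = ['apple', 'carrots', 'milk', 'berries', 'grapes']

# The long way: start with an empty list and append inside a for loop
upper_loop = []
for item in shopping_bag:
    upper_loop.append(item.upper())

# The same thing as a list comprehension
upper_comprehension = [item.upper() for item in shopping_bag]
print(upper_loop == upper_comprehension)  # True

# An `if` at the end keeps only the elements that pass the test
long_names = [item for item in shopping_bag if len(item) > 5]
print(long_names)  # ['carrots', 'berries', 'grapes']
```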

#### Lists are Mutable

An important feature of a list is that it's mutable. This means that elements within a list can be added, deleted, or changed after being defined.  

In [46]:
friends = ['Max', 'Mike', 'Mindy', 'Moore']

In [47]:
friends[0] = 'Wayne'
friends

['Wayne', 'Mike', 'Mindy', 'Moore']

To remove the last element of a list, you can "pop" it:

In [48]:
friends.pop()

'Moore'

In [49]:
friends

['Wayne', 'Mike', 'Mindy']

If you want to remove a specific element from your list, you can use the `remove()` method, which deletes the first element equal to the value you pass in.

In [50]:
friends.remove('Mike')
friends

['Wayne', 'Mindy']
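
Two behaviours of `remove()` are worth a quick sketch: it only removes the first match, and it raises a `ValueError` if the value isn't in the list. The example list below (with `'Mike'` appearing twice) and the name `'Zed'` are made up for illustration.

```python
friends = ['Wayne', 'Mike', 'Mindy', 'Mike']

# Only the first 'Mike' is removed
friends.remove('Mike')
print(friends)  # ['Wayne', 'Mindy', 'Mike']

# Removing a value that isn't there raises a ValueError
try:
    friends.remove('Zed')
except ValueError as err:
    print(f"Could not remove 'Zed': {err}")

# Checking with `in` first avoids the error
if 'Mike' in friends:
    friends.remove('Mike')
print(friends)  # ['Wayne', 'Mindy']
```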

You can also insert a new element using `insert`. The first parameter is the index where you want the new element to be inserted. The second parameter is the element to be inserted.

In [51]:
friends.insert(0, 'Cindy')

In [52]:
friends

['Cindy', 'Wayne', 'Mindy']

You can add all the elements of another list to the end of your list using the `+=` operator.

In [53]:
more_friends = ['Lanny', 'Manny']

In [54]:
friends += more_friends

In [55]:
friends

['Cindy', 'Wayne', 'Mindy', 'Lanny', 'Manny']
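
The `+=` operator does the same job as the list method `extend()`. It is easy to confuse with `append()`, which adds its argument as a single element. The sketch below shows the difference, using short made-up lists.

```python
friends = ['Cindy', 'Wayne']

# extend() adds each element of the other list, just like +=
friends.extend(['Lanny'])
# append() adds exactly one element to the end
friends.append('Manny')
print(friends)  # ['Cindy', 'Wayne', 'Lanny', 'Manny']

# Appending a list nests it as one element instead of merging it
nested = ['Cindy', 'Wayne']
nested.append(['Lanny', 'Manny'])
print(nested)       # ['Cindy', 'Wayne', ['Lanny', 'Manny']]
print(len(nested))  # 3
```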

### 2. Tuples

Tuples are another sequence type in Python. They are written with parentheses (round brackets) instead of the square brackets used for lists.

In [56]:
t = (10,12,13)

The main difference between a list and tuple is that a tuple is immutable. This means that you can't change elements within a tuple after it's defined. Let's look at the first element of our tuple.

In [57]:
t[0]

10

If we try to assign it another number, we'll get an error:

In [58]:
t[0] = 3

TypeError: 'tuple' object does not support item assignment
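
If you want to see the error without stopping your notebook, you can catch it with `try`/`except`. The sketch below also shows a common workaround, building a new tuple instead of changing the old one, and the trailing comma needed for a one-element tuple. The names `updated`, `single`, and `not_a_tuple` are only for illustration.

```python
t = (10, 12, 13)

try:
    t[0] = 3
except TypeError as err:
    print(f"TypeError: {err}")

# Tuples can't be edited, but you can build a new one from the pieces
updated = (3,) + t[1:]
print(updated)  # (3, 12, 13)
print(t)        # (10, 12, 13) (unchanged)

# A one-element tuple needs a trailing comma
single = (3,)
not_a_tuple = (3)
print(type(single), type(not_a_tuple))  # <class 'tuple'> <class 'int'>
```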

#### Tuple Assignment

One cool feature of tuples is that you can assign a tuple of variables to a tuple of values. If we have a tuple containing several pieces of information, we can assign each one to a different variable.

In [59]:
(patient_id, last_name, birth_year, heart_rate, has_flu) = (2103, "Taylor", 1952, 120, True)

In [60]:
print(f"Patient {patient_id} was born in {birth_year} and has a heart rate of {heart_rate}.")

Patient 2103 was born in 1952 and has a heart rate of 120.


This type of multiple variable assignment is called "tuple unpacking". It only works if we have the same number of values and variables. If we try to unpack a larger tuple of values into a smaller number of variables, we'll get an error. We'll also get an error when we try to unpack a smaller tuple of values into a larger number of variables.
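
Here is a minimal sketch of the mismatch error, reusing the patient record from above (the name `record` is just for illustration). As an aside that goes slightly beyond this tutorial, Python 3 also lets you put a `*` in front of one variable to collect any leftover values into a list.

```python
record = (2103, "Taylor", 1952, 120, True)

# Five values but only two variables: Python raises a ValueError
try:
    patient_id, last_name = record
except ValueError as err:
    print(f"ValueError: {err}")

# A starred variable collects whatever is left over, as a list
patient_id, last_name, *rest = record
print(patient_id, last_name)  # 2103 Taylor
print(rest)                   # [1952, 120, True]
```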

Tuple unpacking is especially useful when we write a function that returns more than one value. Check out how it works in the example below.

In [61]:
def calculate_temperature(celsius):
    """Converts a temperature measured in Celsius
    and returns the Kelvin and Fahrenheit equivalents.
    """
    kelvin = celsius + 273.15
    fahrenheit = celsius*9/5 + 32
    return kelvin, fahrenheit

toronto_temp = -10

temp_k, temp_f = calculate_temperature(toronto_temp)
print(f"Temperature in Toronto is {toronto_temp}C, {temp_k}K, {temp_f}F.")

Temperature in Toronto is -10C, 263.15K, 14.0F.


### 3. Sets

A set is a collection type which is constructed with curly brackets. Unlike lists and tuples, elements in a set must be unique, so duplicates are discarded. A set also has no defined order: Python does not keep track of the order in which you wrote the elements. The output below happens to be displayed from lowest to highest value, but that is a display detail and not something to rely on. Let's try it out here.

In [62]:
numbers = {3,9,2,4,3}
numbers

{2, 3, 4, 9}

Sets are unordered which means that the order in which the elements are arranged does not matter. If the elements in two sets are identical, then we can say that Set A is equivalent to Set B. 

In [63]:
set_a = {3,2,4,3,9}
set_b = {4,9,4,3,2}
set_a == set_b

True

Sets are great for finding the number of unique elements in a sequence. Let's say you have a list of beverage orders.

In [64]:
beverages = ['coffee', 'tea', 'tea', 'coffee', 'espresso', 
             'latte', 'latte', 'latte', 'latte', 'coffee']

But you only want the unique beverages in the list. You can easily convert the list to a set to filter out the repeated beverages. 

In [65]:
unique_beverages = set(beverages)
unique_beverages

{'coffee', 'espresso', 'latte', 'tea'}

In [66]:
print(f"There are {len(unique_beverages)} unique beverages.")

There are 4 unique beverages.
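
Because sets are unordered, converting to a set loses the order in which the beverages were first ordered. If that order matters, one option (not used elsewhere in this tutorial) is `dict.fromkeys()`, which drops duplicates while keeping first-seen order in Python 3.7 and later. You can also simply sort the set. The variable names below are only for illustration.

```python
beverages = ['coffee', 'tea', 'tea', 'coffee', 'espresso',
             'latte', 'latte', 'latte', 'latte', 'coffee']

# Dictionary keys are unique and remember insertion order
unique_in_order = list(dict.fromkeys(beverages))
print(unique_in_order)      # ['coffee', 'tea', 'espresso', 'latte']

# sorted() turns a set into an alphabetically ordered list
unique_alphabetical = sorted(set(beverages))
print(unique_alphabetical)  # ['coffee', 'espresso', 'latte', 'tea']
```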


#### Unions and Intersections

Let's say we're working with two sets. One contains beverages from `Office A` while the other is from `Office B`.

In [67]:
office_a = {'coffee', 'latte', 'espresso', 'iced tea'}
office_b = {'coffee', 'cappucino', 'frappucino', 'chai tea', 'latte'}

There is some overlap in beverages between the two sets. If we want to combine these sets and look at all unique beverages between the two offices, we can use the [union](https://docs.python.org/3/library/stdtypes.html#frozenset.union) operation.

In [68]:
office_a.union(office_b)

{'cappucino',
 'chai tea',
 'coffee',
 'espresso',
 'frappucino',
 'iced tea',
 'latte'}

Which beverages were ordered by both `Office A` and `Office B`? We can get the answer to this using the [intersection](https://docs.python.org/3/library/stdtypes.html#frozenset.intersection) operation.

In [69]:
office_a.intersection(office_b)

{'coffee', 'latte'}

Both offices ordered coffee and a latte.

Which beverages were ordered by `Office A` but not `Office B`? 

In [70]:
office_a - office_b

{'espresso', 'iced tea'}

How about beverages that were ordered by `Office B` but not `Office A`?

In [71]:
office_b - office_a

{'cappucino', 'chai tea', 'frappucino'}
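
The `-` used above is the operator form of the set method `difference()`. Union and intersection have operator forms too (`|` and `&`), and `^` gives the symmetric difference: the elements that appear in exactly one of the two sets. The sketch below reuses the office sets as they are defined above, and wraps results in `sorted()` only to make the printed order predictable.

```python
office_a = {'coffee', 'latte', 'espresso', 'iced tea'}
office_b = {'coffee', 'cappucino', 'frappucino', 'chai tea', 'latte'}

# Operator forms give the same results as the methods
print((office_a | office_b) == office_a.union(office_b))         # True
print((office_a & office_b) == office_a.intersection(office_b))  # True
print((office_a - office_b) == office_a.difference(office_b))    # True

# Beverages ordered by exactly one of the two offices
only_one_office = office_a ^ office_b
print(sorted(only_one_office))
```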