<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Session 2: Python Foundations

## Agenda

#### Part 1: Intro to Python & Data Types
#### BREAK
#### Part 2: Control Flow (loops and functions)

### Learning Objectives (Part 1)

**After this lesson, you will be able to:**
- Discuss Python as a programming language
- Define integers, strings, tuples, lists, and dictionaries
- Demonstrate arithmetic operations and string operations
- Demonstrate variable assignment

### Part 1: Data Types

- [Get the course materials](#get-materials)
- [Why Python?](#why_py)
- [Introduction to Data Types](#intro)
- [Jupyter Notebook](#jupyter_nb)
- [Python Variables](#variables)
- [Operators](#operators)
- [Integers and Floats](#numbers)
- [Strings](#strings)
- [String Indexing](#slicing)
- [Lists](#lists)
- [Tuples](#tuples)
- [Dictionaries](#dictionary)
- [Practise With a Partner](#exercise)
- [Importing Packages and Documentation](#import)

----

<a id="get-materials"> </a>

# Get the course materials

- make sure you can access https://git.generalassemb.ly/GADS-BOH/dat24
- open a terminal or command prompt where you want to save the material
    - the materials will be created in a new `dat24` folder
- type the following:

`git clone https://git.generalassemb.ly/GADS-BOH/dat24.git`

- it may ask you for your username and password for GitHub Enterprise

<a id='why_py'></a>

## Why Python?

Python was created by Guido van Rossum and first released in 1991. Since then, it has grown into a widely used high-level, general-purpose programming language with a huge open-source community supporting it. The language was designed to emphasize code readability (specifically, through its use of whitespace and its syntax).

"The Zen of Python" is a short poem that summarizes the design philosophy behind Python. You can display it by running:

```python
import this
```

---

```
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
```

The ones that are perhaps most important are:

- Readability counts.
- There should be one-- and preferably only one --obvious way to do it.
    - This is often referred to as being "Pythonic"
- Explicit is better than implicit.

## Why Use Python for Data Science?

These are some of the more prominent reasons Python has been so widely adopted for data science.

**General purpose**

**Open source** 

**Rich ecosystem**

Here are a few examples:
- `pandas`, `numpy`, `matplotlib`, `scikit-learn`: the "Data Science toolkit"
- `requests`: Interacting with websites.
- `django`: Python web framework.
- `pyglet`: GUI application building.
- `tensorflow`: Google's machine learning library.

**Readability**

<a id='intro'></a>
## Introduction: Python Data Types

There are several _standard_ data types within Python, the seven most common being:

Actually, you tell me!

**Integers:** Whole numbers from negative infinity to infinity, such as 1, 0, -5, etc.

**Floats:** Short for "floating point number," usually used with decimals such as 2.8 or 3.14159.

**Strings:** A set of letters, numbers, or other characters, e.g., "The fox is quick."

**Booleans:** True or False values

**Tuples:** An ordered sequence with a fixed number of elements, e.g., in `x = (1, 2, 3)`, the parentheses mark it as a tuple. Another example: `x = ("Kirk", "Picard", "Spock")`.

**Lists:** An ordered sequence without a fixed number of elements, e.g., `x = [1, 2, 3]`. Note the square brackets. `x = ["Lord", "of", "the", "Rings"]`

**Dictionaries:** A collection of key-value pairs, e.g., `x = {'Mark': 'Twain', 'Apples': 5}`. To retrieve each value (the part after each colon), use its key (the part before each colon). For example, `x['Apples']` retrieves the value 5. Since Python 3.7, dictionaries preserve insertion order, but items are still looked up by key rather than by position.

Throughout this lesson, we will review each data type more in depth and discuss common ways of interacting with each of them.
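
As a quick check, the built-in `type()` function reports which of these types a value belongs to. The variable names and values below are made up for illustration only.

```python
# One illustrative value for each of the seven types described above.
whole = 42
decimal = 3.14159
text = "The fox is quick."
flag = True
captains = ("Kirk", "Picard", "Spock")
words = ["Lord", "of", "the", "Rings"]
stock = {"Mark": "Twain", "Apples": 5}

# type() returns the class that each value belongs to.
print(type(whole))     # <class 'int'>
print(type(decimal))   # <class 'float'>
print(type(text))      # <class 'str'>
print(type(flag))      # <class 'bool'>
print(type(captains))  # <class 'tuple'>
print(type(words))     # <class 'list'>
print(type(stock))     # <class 'dict'>
```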

For more detail, see [Python's basic data types](https://en.wikiversity.org/wiki/Python/Basic_data_types).

<a id='jupyter_nb'></a>
## Jupyter Notebook

Before we get started, let's go over interacting with Python in the Jupyter Notebook.

Launch the notebook server by typing `jupyter notebook` inside your `dat24` folder.

Code cells are run by pressing `shift + enter` or using the Play button in the toolbar. You can also run a cell using `ctrl + enter` in which case the "cursor" stays on the current cell.

In [1]:
# This is a cell.

In [2]:
# Assigning a variable:
v = 1

In [3]:
# Assign another:
ds_ga = 'Data Science is awesome!'

In [4]:
# Run this!
ds_ga

'Data Science is awesome!'

In [5]:
# Print this:
print(v)

1


You can also perform basic maths.

In [6]:
45 - 19

26

You can basically evaluate any Python expression in a notebook cell!

<a id='variables'></a>
## Variables

Variables are names that have been assigned to specific values or data. These names can be almost anything you want, but there are some restrictions and best practices.

**Restrictions**
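- Variable names cannot start with a digit, so they cannot be just a number (e.g., `2`, `0.01`, `10000`).
- Variable names cannot be reserved keywords (e.g., `for`, `if`, `while`). Names of built-in functions such as `type` and `print` can technically be reassigned, but doing so hides the original function, so avoid it.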
- Variable names cannot contain spaces.

**Best Practices**
- Variable names should be lowercase.
- A variable's name should be representative of the value(s) it has been assigned.
- If you must use multiple words in your variable name, use an underscore to separate them.
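
One way to test a candidate name before using it is sketched below. It relies on the standard library's `keyword` module and the string method `.isidentifier()`; the candidate names themselves are hypothetical.

```python
import keyword

# Reserved keywords can never be used as variable names.
for_is_keyword = keyword.iskeyword("for")
print(for_is_keyword)        # True

# Built-in function names are not keywords, so Python will let you
# reassign them (which hides the built-in, so it is best avoided).
print_is_keyword = keyword.iskeyword("print")
print(print_is_keyword)      # False

# .isidentifier() checks the spelling rules: no spaces, no leading digit.
good_name = "student_count".isidentifier()
leading_digit = "2fast".isidentifier()
has_space = "my var".isidentifier()
print(good_name, leading_digit, has_space)   # True False False
```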

In [7]:
# Assigning a float:
x = 1.0
type(x)

float

In [8]:
# Assigning an int:
y = 1
type(y)

int

In [9]:
# Assigning a string:
z = '1'
type(z)

str

**It is critical to remember that, when we're assigning variables, we are not stating that "_x equals 1_"**

**We're stating that "_x has been assigned the value of 1_."**
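
A small sketch of the difference between assigning (`=`) and comparing (`==`). The line `x = x + 1` would make no sense as a mathematical equation, but it is perfectly valid as an assignment.

```python
x = 1          # assignment: the name x now refers to the value 1
x = x + 1      # the right side is evaluated first (1 + 1), then x is reassigned
print(x)       # 2

# A double equals sign asks a question instead of assigning anything.
still_one = (x == 1)
print(still_one)   # False
```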

<a id='operators'></a>
## Operators

"Operators are the constructs (that) can manipulate the value of operands." — [Tutorials Point: Python](https://www.tutorialspoint.com/python/python_basic_operators.htm)

Operators can be used in a mathematical sense to calculate (or create) the sum, difference, or product of values or variables.

In [10]:
# Addition:
print(1 + 2)

# Subtraction:
print(1 - 2)

# Multiplication:
print(1 * 2)

# Division:
print(1 / 2)

3
-1
2
0.5


The `=` sign in Python is known as the assignment operator. It is the means by which we can assign values to variables.

In [11]:
number = 2.0
type(number)

float

In [12]:
# Exponent power operator:
2 ** 2

4

In [13]:
# Modulo/modulus can be used to get the remainder:
5%2

1
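
Modulo is often paired with floor division (`//`), which keeps the whole-number part of a division. The sketch below uses a made-up duration to show how the two fit together.

```python
total_minutes = 135

hours = total_minutes // 60      # floor division: whole hours
minutes = total_minutes % 60     # modulo: minutes left over
print(hours, minutes)            # 2 15

# Quotient times divisor, plus the remainder, gives back the original number.
rebuilt = hours * 60 + minutes
print(rebuilt == total_minutes)  # True

# A common use of modulo: checking whether a number is even.
is_even = (10 % 2 == 0)
print(is_even)                   # True
```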

**Booleans and Boolean Evaluation Operators** 

Booleans exist as either true or false and are generally used as a means of evaluation.

In [14]:
True and False

False

In [15]:
not False

True

In [16]:
True or False

True

**Comparison Operators**

- Less than: **`<`**
- Greater than: **`>`**
- Less than or equal to: **`<=`**
- Greater than or equal to: **`>=`**
- Equals: **`==`**
- Does not equal: **`!=`**


In [17]:
2 > 1, 2 < 1, 2 > 2, 2 < 2, 2 >= 2, 2 <= 2

(True, False, False, False, True, True)

In [18]:
# equality
[1,2] == [1,2], [1,2] != [2,1]

(True, True)
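
Comparison operators return Booleans, so they combine naturally with `and`, `or`, and `not`. A minimal sketch, using a hypothetical `temperature` value:

```python
temperature = 21

# Two comparisons joined with "and": both must be True.
is_comfortable = temperature >= 18 and temperature <= 24
print(is_comfortable)    # True

# Python also allows the comparisons to be chained, as in maths.
is_comfortable_chained = 18 <= temperature <= 24
print(is_comfortable_chained)   # True

# == compares values: an int and a float can be equal, but a string is not a number.
int_equals_float = (1 == 1.0)
int_equals_str = (1 == '1')
print(int_equals_float, int_equals_str)   # True False
```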

<a id='numbers'></a>
## Numbers in Python

Integers are whole numbers. 
- 1
- 200
- 100009 

Floats are numbers with decimals. The name "float" comes from "floating point," as the decimal can _float_ the length of the number.
- 1.11
- 26.006
- 3.0

In [19]:
x_int = 1
x_float = 1.0

type(x_int), type(x_float)

(int, float)

If an integer or float is compatible, it can be converted to the other type.

In [20]:
float(x_int)

1.0

In [21]:
type(int(x_float))

int
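
What "compatible" means in practice can be sketched as follows. Note that `int()` drops the decimal part rather than rounding, and that text can only be converted when it looks like a number of the target type.

```python
# int() truncates toward zero; it does not round.
truncated = int(2.9)
truncated_negative = int(-2.9)
rounded = round(2.9)
print(truncated, truncated_negative, rounded)   # 2 -2 3

# Strings that look like numbers can be converted too.
from_text_float = float("3.5")
from_text_int = int("7")
print(from_text_float, from_text_int)           # 3.5 7

# An incompatible conversion raises a ValueError.
try:
    int("3.5")
except ValueError as error:
    print("Cannot convert:", error)
```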

<a id='strings'></a>

## Strings

Strings are essentially any character combination in between quotes. They are most often used as a way of storing text.

In [22]:
s = "Hello world"
type(s)

str

Strings have a lot of associated methods and attributes that allow us to better understand and manipulate them.

In [23]:
# Length of the string:
len(s)

11

In [24]:
# Replace part of a string (this returns a new string; s itself is unchanged):
s2 = s.replace("world", "test")
print(s2)

Hello test
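
A few more string methods, sketched on the same `s`. Each one returns a new value and leaves the original string untouched, because strings cannot be modified in place.

```python
s = "Hello world"

shouted = s.upper()                    # all uppercase
whispered = s.lower()                  # all lowercase
pieces = s.split()                     # split on whitespace into a list
starts_hello = s.startswith("Hello")   # True or False

print(shouted)        # HELLO WORLD
print(whispered)      # hello world
print(pieces)         # ['Hello', 'world']
print(starts_hello)   # True

# The original string is unchanged.
print(s)              # Hello world
```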


<a id='slicing'></a>


**String Indexing**  

We can extract characters at specific index locations in a string using indexing.

In [25]:
# Indexing the first (index 0) character in the string:
s[0]

'H'

The number you enter after the variable name in brackets (the `[0]`) is called the **index** (its plural is **indices**).

_Counting in Python and many other programming languages begins at zero, as opposed to one. This is called **zero-based indexing**._

In [26]:
# This is called *slicing*. We start at the left index 
#   and go up to but don't include the right index:

# Objects at indexes 0, 1, and 2
s[0:3]

'Hel'

Most ranges or functions with ranges have upper ends that are not inclusive. So, a range of `[0:5]` starts at `0` and stops before `5`.

In [27]:
# From index 6 up to the end of the string:
s[6:]

'world'

In [28]:
# No start or end specified:
s[:]

'Hello world'

What happens when we specify a **negative** index?

In [29]:
s[-1]

'd'

In addition to specifying a range, you can add a step size or character skip rate.

In [30]:
# Define a step size of 2, i.e., every other character:
s[::2]

'Hlowrd'
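
Negative indices and negative steps can be combined with slicing as well. A short sketch on the same string:

```python
s = "Hello world"

last_five = s[-5:]       # from the fifth-to-last character to the end
without_tail = s[:-6]    # everything except the last six characters
backwards = s[::-1]      # a step of -1 walks through the string in reverse

print(last_five)      # world
print(without_tail)   # Hello
print(backwards)      # dlrow olleH
```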

#### Concatenating
To add two strings together, type the first string, an addition sign, and then the second string.

In [31]:
print('Hello' + ' world')

Hello world


You can do the same with variables that refer to strings.

In [32]:
x = 'Hello'
y = ' world'

x + y

'Hello world'

In [33]:
# Conversion from int to str is required!

dice_roll = 3

print('You rolled a ' + str(dice_roll) + '.')  

You rolled a 3.


String formatting offers an alternative: we create a string with placeholders and then populate them with values.

Before Python 3.6, it looked something like this:

In [34]:
s3 = 'value1 = {0}, value2 = {1}'.format(3.1415, 1.5)

print(s3)

value1 = 3.1415, value2 = 1.5


But since Python 3.6, we can do (notice the `f` before the quotes, and the use of curly brackets):

In [35]:
value_1 = 3.1415
value_2 = 1.5

print(f"Value 1 is {value_1} and value 2 is {value_2}, {value_1 + value_2}")

Value 1 is 3.1415 and value 2 is 1.5, 4.641500000000001
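
The long tail of digits in the sum comes from how floats are stored. One way to tidy the output is a format specifier inside the curly brackets; the sketch below assumes two decimal places are wanted.

```python
value_1 = 3.1415
value_2 = 1.5
total = value_1 + value_2

# ":.2f" means: format as a fixed-point number with 2 decimal places.
short_value = f"{value_1:.2f}"
message = f"Total: {total:.2f}"

print(short_value)   # 3.14
print(message)       # Total: 4.64
```

The specifier only changes how the number is displayed; `total` itself keeps its full precision.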


<a id='lists'></a>


## Lists

Lists are a means of storing ordered data.

Lists can be composed of ints, floats, strings, or other lists, as well as other data types we haven't covered yet.

In [36]:
l = [1, 2, 3, 4]

print(type(l))
print(l)

<class 'list'>
[1, 2, 3, 4]


In [37]:
# A second variable can be assigned to the same list (no copy is made):
a = l

In [38]:
print(a)

[1, 2, 3, 4]
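
Assigning a list to a second name does not copy it: both names refer to the same list. The sketch below contrasts this with the list's `.copy()` method; the name `b` is introduced here for illustration.

```python
l = [1, 2, 3, 4]
a = l           # a and l are two names for the same list
b = l.copy()    # b is a separate list with the same contents

l[0] = 99       # change the first item through the name l

print(a)        # [99, 2, 3, 4]  (a sees the change)
print(b)        # [1, 2, 3, 4]   (the copy does not)

same_object = a is l
copy_is_same = b is l
print(same_object, copy_is_same)   # True False
```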


In [39]:
# List of strings:
names = ['Joseph', 'Bob', 'Rick']
print(names)

['Joseph', 'Bob', 'Rick']


Lists also have several methods that allow us to alter them, such as the `.append()` method, which allows us to add another element to the end of a list.

In [40]:
names.append('John')

In [41]:
names

['Joseph', 'Bob', 'Rick', 'John']

In [42]:
# We can slice a value in a list as well:
names[1][1:]

'ob'

`names[1][1:]`

Note that we always read indexing from left to right. In the example above, the interpreter looks up `names` and gets the element at index 1 (the second element), which is the string `"Bob"`. Then, the slice (`[1:]`) takes every character of that string from index 1 to the end, evaluating to `"ob"`.

Interestingly, the following works in the same way. Instead of having to look up the value of `names`, the list is directly specified (just read the line from left to right!).

In [43]:
['Joseph', 'Bob', 'Rick', 'John'][1][1:]

'ob'
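
To make the left-to-right reading concrete, the same lookup can be split into two steps. The intermediate names `second_name` and `rest` are introduced here only for illustration.

```python
names = ['Joseph', 'Bob', 'Rick', 'John']

second_name = names[1]    # step 1: index the list, giving the string 'Bob'
rest = second_name[1:]    # step 2: slice that string from index 1 onward

print(second_name)        # Bob
print(rest)               # ob

# The chained form gives the same result in one expression.
print(names[1][1:] == rest)   # True
```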

In [44]:
# Lists don't have to be the same type:
l = [1, 'a', 1.0, 1-1j]
print(l)

[1, 'a', 1.0, (1-1j)]


In [45]:
# We can create a list of values in a range using the "range" function:
start = 10
stop = 30
step = 2
range(start, stop, step)

# range() produces a lazy "range" object rather than a list.
# It is often convenient to see all of its values
#    by converting it to a list:
list(range(start, stop, step))

[10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
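
Even without converting it to a list, the object returned by `range()` already supports length, indexing, and membership tests. A brief sketch:

```python
evens = range(10, 30, 2)

print(evens)               # range(10, 30, 2)

count = len(evens)         # how many values it will produce
first = evens[0]           # indexing works like a list
last = evens[-1]
has_thirty = 30 in evens   # the stop value is not included

print(count, first, last, has_thirty)   # 10 10 28 False

# With a single argument, range starts at 0 and steps by 1.
first_five = list(range(5))
print(first_five)          # [0, 1, 2, 3, 4]
```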

Here's how we create a list from scratch:

In [46]:
# Create a new empty list:
l = []

# Add an element using append():
l.append("A")
l.append("d")
l.append("d")

print(l)

['A', 'd', 'd']


You can count the number of occurrences of an item in a list with the `.count()` method.

In [47]:
l.count("d")

2

In [48]:
# Reassign a range of values with another list:
l[1:3] = ["b", "c"]
print(l)

['A', 'b', 'c']


The `del` statement can be used with a list and an index to delete values. (`del` is a keyword rather than a function, so no parentheses are needed.)

In [49]:
del l[1]

print(l)

['A', 'c']
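
Lists offer a few other ways to add and remove items. The sketch below uses a hypothetical `letters` list to show `.pop()`, `.remove()`, and `.insert()`.

```python
letters = ["A", "b", "c", "d"]

removed = letters.pop()     # remove and return the last item
print(removed)              # d
print(letters)              # ['A', 'b', 'c']

letters.remove("b")         # remove the first item equal to "b"
print(letters)              # ['A', 'c']

letters.insert(1, "B")      # insert "B" so that it sits at index 1
print(letters)              # ['A', 'B', 'c']
```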


<a id='tuples'></a>


## Tuples

Tuples are similar to lists in that they store a sequence of various separate values. However, tuples are not mutable in that, once they are created, their values cannot be changed.

In [50]:
point = (10, 20)
print(point)
print(type(point))

(10, 20)
<class 'tuple'>


Uh oh! Trying to change a value inside a tuple raises an error:

In [51]:
point[0] = 2

TypeError: 'tuple' object does not support item assignment

In [52]:
# They can be indexed (and sliced) just like lists and strings:
point[0]

10

Unpacking is a common practice when working with Python sequences. It lets us assign each item in a list or tuple (or each key in a dictionary) to its own variable in a single statement.

In [53]:
# Unpacking:
x, y = point

print("x = {}".format(x))
print("y = {}".format(y))

x = 10
y = 20
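
Two more things worth knowing about unpacking, sketched below: it can swap variables without a temporary one, and the number of names on the left must match the number of items on the right.

```python
point = (10, 20)
x, y = point

# Swap two variables in one line.
x, y = y, x
print(x, y)       # 20 10

# Unpacking works on lists as well as tuples.
first, second, third = ["a", "b", "c"]
print(first, second, third)   # a b c

# A mismatch between names and items raises a ValueError.
try:
    x, y = (1, 2, 3)
except ValueError as error:
    print("Unpacking failed:", error)
```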


<a id='dictionary'></a>


## Dictionaries

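Dictionaries store data as key-value pairs. Instead of using a positional index to access data stored in a dictionary, we look up each value by its key. (Since Python 3.7, dictionaries remember the order in which items were inserted, but they still cannot be indexed by position.)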

- A key is similar to a variable name. 
- A value is similar to the value assigned to the variable.

Curly braces ({ }) enclose dictionaries. Note: You can also use curly braces to construct a set. The first input in a dictionary pair is the "key." The second input in a dictionary pair is the "value." The general format looks like this:

In [54]:
params = {"key1" : 1.0,
          "key2" : 2.0,
          "key3" : 3.0,}

print(type(params))
print(params)

<class 'dict'>
{'key1': 1.0, 'key2': 2.0, 'key3': 3.0}


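Keys must be unique: a given key can appear only once in a dictionary, although many keys can share the same value. Keys also need to be of an immutable type, such as strings or numbers. The value stored under a key, on the other hand, can be reassigned at any time.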

In [55]:
# Value stored under "key2" in the params dictionary:
params["key2"]

2.0

In [56]:
# Adding a new dictionary entry:
params["key4"] = "D"

In [57]:
# Print the entirety of the dictionary:
print(params)

{'key1': 1.0, 'key2': 2.0, 'key3': 3.0, 'key4': 'D'}


In [58]:
# Reassigning the value of a key-value pair in the dictionary:
params["key1"] = "A"
params["key2"] = "B"

In [59]:
print("Key 1 = " + str(params["key1"]))
print("Key 2 = " + str(params["key2"]))
print("Key 3 = " + str(params["key3"]))
print("Key 4 = " + str(params["key4"]))

Key 1 = A
Key 2 = B
Key 3 = 3.0
Key 4 = D


In [60]:
# Dictionaries also have methods.

# Convert a dictionary to a list of tuples (key-value pairs).
# This is later used to conveniently loop through a dictionary:
list(params.items())

[('key1', 'A'), ('key2', 'B'), ('key3', 3.0), ('key4', 'D')]

To check if an item is in your dictionary:

In [61]:
"key2" in params

True

In [62]:
"key5" in params

False
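
Looking up a missing key with square brackets raises a `KeyError`. The `.get()` method is a gentler alternative, sketched below together with `.keys()` and `.values()`. The dictionary is re-created here so the example runs on its own.

```python
params = {"key1": "A", "key2": "B", "key3": 3.0, "key4": "D"}

found = params.get("key2")             # key exists: returns its value
missing = params.get("key5")           # key absent: returns None instead of an error
with_default = params.get("key5", 0)   # key absent: returns the default we supplied

print(found, missing, with_default)    # B None 0

# Keys and values can be listed separately.
all_keys = list(params.keys())
all_values = list(params.values())
print(all_keys)     # ['key1', 'key2', 'key3', 'key4']
print(all_values)   # ['A', 'B', 3.0, 'D']
```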

<a id='import'></a>

## Importing Packages and Documentation

Not everything we will use is readily available in Python. Sometimes, we'll need to import packages, which are assemblies of functions or additional data types.

In [63]:
import math

x = math.cos(2 * math.pi)
print(x)

1.0


To not have to write `math.` each time, you *can* import the whole module into the current namespace...

In [64]:
from math import *
x = cos(2 * pi)
print(x)

1.0


But you **shouldn't**, because different packages can have functions named the same, which will cause clashes and confusion.

Remember: *explicit is better than implicit*.

A nice compromise is to import only what you need, and then you can use those without the `math.` prefix:

In [65]:
from math import pi, cos

pi

3.141592653589793
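
Another common pattern is to import a module under a shorter alias, which keeps the prefix explicit while saving some typing. A minimal sketch; the alias `m` is chosen here purely for illustration.

```python
import math as m          # the whole module, under a shorter name
from math import sqrt     # a single function, usable without a prefix

floored = m.floor(2.7)
root = sqrt(16)
print(floored, root)      # 2 4.0

# Floats are best compared with a tolerance rather than with ==.
cosine_is_one = m.isclose(m.cos(2 * m.pi), 1.0)
print(cosine_is_one)      # True
```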

There are several ways to look at a module's documentation. Within the Jupyter Notebook, we can use the `help()` function, or you can place your cursor inside of a function and press `shift + tab`.

In [66]:
import math

help(math.cos)

Help on built-in function cos in module math:

cos(...)
    cos(x)
    
    Return the cosine of x (measured in radians).



<a id="exercise"> </a>

## Exercise

In pairs, go through the `02_exercise_data_types` notebook.