# Introduction to Python

## First - what are Jupyter Notebooks?

This course makes extensive use of Jupyter Notebooks hosted on Microsoft Azure. Azure-hosted Jupyter Notebooks provide an easy way for you to experiment with programming concepts in an interactive fashion that requires no installation of software by students on local computers.

Jupyter Notebooks are divided into cells. Each cell contains either text written in the Markdown markup language or a space in which to write and execute computer code. Because all the code resides inside code cells, you can run each code cell inline rather than using a separate Python interactive window.

> **Note**: This notebook is designed to have you run code cells one by one, and several code cells contain deliberate errors for demonstration purposes. As a result, if you use the **Cell** > **Run All** command, some code cells past the error won't be run. To resume running the code in each case, use **Cell** > **Run All Below** from the cell after the error.

### How to run cells

In [1]:
print("Press 'Run' button or shift + enter")

Press 'Run' button or shift + enter


### Command mode vs. Edit mode

There are two main modes of navigating. **Command mode** is the default and lets you navigate and edit at the page level.

**Edit mode** lets you modify what's inside of individual cells. Enter Edit mode by pressing "Enter" and exit back into Command mode by pressing "Escape".

### How to create new cells

Create new cells by pressing the "+" button or, in Command mode, by pressing "a" to create a cell above or "b" to create one below.

this is my cell

In [None]:
## Markdown

### Markdown vs. code cells

There are two main cell types in Jupyter notebooks: Markdown and Code. 

[**Markdown**](https://daringfireball.net/projects/markdown/) cells contain text stored in Markdown format.

**Code** cells contain code and are run using whatever interpreter the kernel is set for.

You can switch cells to Markdown by pressing "m" and to code by pressing "y."

Find out more about how to use Jupyter notebooks (and helpful keyboard shortcuts) by pressing "h" to bring up the help menu.

## Comments

In [None]:
# this is the first comment
spam = 1  # and this is the second comment
          # ... and now a third!
text = "# This is not a comment because it's inside quotes."
print(text)

## Python basics

### Arithmetic and numeric types

> **Learning goal:** By the end of this subsection, you should be comfortable with using numeric types in Python arithmetic.

Python is an interpreted language, which means that you can interactively use the interpreter to get immediate results. You can see this by using the Python interpreter as a simple calculator: type an expression, and you can see the output immediately.

Jupyter notebooks run code using "kernels": environments tied to a particular programming language. We're running Python 3.6 in this kernel. That means the Python 3.6 interpreter runs every time we execute a cell.

#### Python numeric operators

In [None]:
2 + 3

**Share**: What is the answer? Why?

Order of operations still works in Python!

In [None]:
30 - 4 * 5

**Share**: What is the answer? Why?

In [None]:
7 / 5

What programming languages are you familiar with? Was the result a surprise?

**Floor Division**

You can perform a type of division that returns a whole number: [floor division](https://docs.python.org/3.6/glossary.html#term-floor-division). Floor division uses the `//` operator, rounds the result down to the nearest whole number, and returns an `int` when both operands are integers.

In [None]:
7 // 5
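
As a small sketch of how floor division behaves beyond this one case, the cell below tries a negative operand and a `float` operand. The variable names are only illustrative and are not part of the course material.

```python
# Floor division rounds down (toward negative infinity), not toward zero.
positive = 7 // 5        # 1
negative = -7 // 5       # -2, because -1.4 rounds down to -2
float_result = 7.0 // 5  # 1.0: a float operand gives a float result

print(positive, negative, float_result)
```

Rounding down rather than toward zero is what makes the result for negative numbers differ from simply dropping the fractional part.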

**Mixed type operations**

Python (like other programming languages) has different numeric types. Integer numbers (such as `1`, `3`, and `20`) have type [`int`](https://docs.python.org/3.6/library/functions.html#int). Numbers with a fractional component (such as `3.0` or `1.6`) have type [`float`](https://docs.python.org/3.5/library/functions.html#float).

In [None]:
3 * 3.5

In [None]:
7.0 / 5
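
The following sketch checks the result types of mixed operations with `type()`. The built-in `int()` conversion is not used elsewhere in this section and appears here only for illustration.

```python
product = 3 * 3.5        # int * float gives a float
quotient = 7 / 5         # true division gives a float
whole_quotient = 6 / 3   # still a float, even though the result is whole
back_to_int = int(whole_quotient)  # convert explicitly when you need an int

print(type(product), type(quotient), type(whole_quotient), type(back_to_int))
```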

**Remainder (modulo)**

In [None]:
5 % 5
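
The `%` operator returns what is left over after floor division. The sketch below shows a non-zero remainder, a common use (testing for even numbers), and the built-in `divmod()` function, which is not covered in this notebook but returns the quotient and remainder together.

```python
remainder = 7 % 5              # 2, because 7 = 1 * 5 + 2
no_remainder = 5 % 5           # 0, because 5 divides evenly by 5
is_even = 10 % 2 == 0          # True: even numbers leave no remainder
quotient, rest = divmod(7, 5)  # same as (7 // 5, 7 % 5)

print(remainder, no_remainder, is_even, quotient, rest)
```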

**Exponents**

For exponents, use the `**` operator. For example, you can write $5^2$ as:

In [1]:
5 ** 2

25

In [2]:
2 ** 5

32

**Share**: What is the answer? Why?

In [3]:
-5 ** 2

-25

In [4]:
(-5) ** 2

25
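
The difference between the last two results comes from operator precedence: `**` binds more tightly than the unary minus. A short sketch of that rule follows, along with the related fact that repeated `**` operators are evaluated right to left. The variable names are illustrative.

```python
negated_square = -5 ** 2       # parsed as -(5 ** 2)
squared_negative = (-5) ** 2   # parentheses apply the minus first
right_to_left = 2 ** 3 ** 2    # parsed as 2 ** (3 ** 2)
left_to_right = (2 ** 3) ** 2  # parentheses force the other order

print(negated_square, squared_negative, right_to_left, left_to_right)
```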

In [5]:
(30 - 4) * 5

130

### Variables

**Share**: What is the answer? Why?

In [6]:
length = 15
width = 3 * 5
length * width

225

**You don't need to declare variable types**

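**Dynamic typing** in Python: a variable takes on the type of whatever value you assign to it. This goes hand in hand with **duck typing**: "If it walks like a duck and it quacks like a duck, then it must be a duck".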

In [7]:
length = 15
length

15

In [8]:
type(length)

int

We can also change the variable type:

In [9]:
length = 15.0
length

15.0

In [10]:
type(length)

float

In [11]:
length = 'fifteen'
length

'fifteen'

In [12]:
type(length)

str

What if we try to use a variable we haven't assigned yet?

**Share**: What will happen? Why?

In [13]:
n

NameError: name 'n' is not defined

**Previous Output** - A Jupyter feature:

In [14]:
tax_rate = 11.3 / 100
price = 19.95
price * tax_rate

2.25435

In [None]:
price + _

Note that you should always treat the `_` variable as read-only. Explicitly assigning a value to it will create an independent local variable with the same name and will mask the built-in variable (and its behavior).

In [None]:
round(_, 2)

**Multiple Variable Assignment**, **Variable augmentation**

In [None]:
a, b, c = 3.2, 1, 6
a, b, c

In [None]:
x = 5
x = x + 1  # Un-pythonic variable augmentation
x -= 3  # Pythonic variable augmentation
x

Augmented assignment works with amounts other than 1, and with other operations as well: `+=`, `-=`, `*=`, `/=`, `%=`, and `**=`.

Try making a cell below this and playing around with different augmentation assignments until this concept makes sense.
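
As one possible starting point, here is a sketch that applies each augmented assignment operator in turn to a single variable. The name `total` is just an example.

```python
total = 10
total += 5    # 15
total -= 3    # 12
total *= 2    # 24
total /= 4    # 6.0 (true division always yields a float)
total %= 4    # 2.0
total **= 3   # 8.0

print(total)
```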

### Expressions

**Share**: What is the answer? Why?

In [None]:
2 < 5

*(Run after learners have shared the above)*

**Python Comparison Operators**:
![all of the comparison operators](./Images/Screen%20Shot%202019-09-10%20at%207.15.49%20AM.png)

In [None]:
type(2 < 5)

**Complex Expressions**

In [None]:
a, b, c = 1, 2, 3
a < b < c
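
A chained comparison such as `a < b < c` behaves like two comparisons joined with `and`. The sketch below shows the equivalence, plus what changes if you add parentheses (which turns it into a comparison between a Boolean and a number). Variable names are illustrative.

```python
a, b, c = 1, 2, 3

chained = a < b < c              # True
expanded = (a < b) and (b < c)   # the same test written out in full

descending = 3 > 2 > 1           # True: 3 > 2 and 2 > 1
grouped = (3 > 2) > 1            # False: this compares True with 1

print(chained, expanded, descending, grouped)
```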

**Built-In Functions**

In [None]:
min(3, 2.4, 5)

In [None]:
max(3, 2.4, 5)

**Compound Expressions**

In [None]:
1 < 2 and 2 < 3



### Exercise:

**Think, Pair, Share**
1. Quietly think about what would happen if you flipped one of the `<` to a `>`. 
2. Share with the person next to you what you think will happen. 
3. Try it out in the code cell below. 
4. Share anything you thought was surprising.

In [None]:
# Now flip around one of the simple expressions and see if the output matches your expectations:
1 < 2 and 3 < 2

**Or and Not**  
**Share**: What is the answer? Why?

In [None]:
1 < 2 or 1 > 2

In [None]:
not not (2 < 3)

### Exercise:
**Think, Pair, Share**
1. Quietly think about what the results would be. *Tip: Use paper!*
2. Share with the person next to you what you think will happen. 
3. Try it out in the code cell below. 
4. Share anything you thought was surprising.
5. Instructor Demo

In [None]:
# Play around with compound expressions.
# Set i to different values to see what results this complex compound expression returns:
i = 7
(i == 2) or not (i % 2 != 0 and 1 < i < 5)


> **Takeaway:** Arithmetic operations on numeric data form the foundation of data science work in Python. Even sophisticated numeric operations are predicated on these basics, so mastering them is essential to doing data science.

## Strings

> **Learning goal:** By the end of this subsection, you should be comfortable working with strings at a basic level in Python.

Besides numbers, Python can also manipulate strings. Strings can be enclosed in single quotes (`'...'`) or double quotes (`"..."`) with the same result. Use `\` to escape quotes; that is, use `\` in order to use quotation marks within the string itself:

In [None]:
'spam eggs'  # Single quotes.

In [None]:
'doesn\'t'  # Use \' to escape the single quote...

In [None]:
"doesn't"  # ...or use double quotes instead.

The output string is enclosed in quotes and special characters are escaped with backslashes. 

Even though the output sometimes looks different, the two strings are equivalent. 

In the displayed output, the string is enclosed in double quotes if it contains a single quote and no double quotes; otherwise, it’s enclosed in single quotes. The [`print()`](https://docs.python.org/3.6/library/functions.html#print) function produces a more readable output by omitting the enclosing quotes and by printing escaped and special characters:

In [None]:
'"Isn\'t," she said.'

In [None]:
print('"Isn\'t," she said.')

**Pause**

Notice the difference between the previous two code cells when they are run. 

We can use *raw strings* by adding an `r` before the first quote. In a raw string, characters prefaced by `\` aren't interpreted as special characters:

In [None]:
print('C:\some\name')  # Here \n means newline!

In [None]:
print(r'C:\some\name')  # Note the r before the quote.
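
Another way to see the difference is to compare lengths. In this sketch, a raw string is checked against the same text written with doubled backslashes, which is the other common way to keep a literal `\` in a string.

```python
escaped = 'C:\\some\\name'   # each \\ is one literal backslash
raw = r'C:\some\name'        # raw string: backslashes are kept as typed

same_text = escaped == raw   # True
newline_length = len('\n')   # 1: a single newline character
raw_length = len(r'\n')      # 2: a backslash followed by the letter n

print(same_text, newline_length, raw_length)
```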

### Notes about quotes

New features are added to Python via a review process called [Python Enhancement Proposals (PEPs)](https://www.python.org/dev/peps/pep-0001/), a form of [Requests for Comments (RFC)](https://en.wikipedia.org/wiki/Request_for_Comments). Like RFCs, some PEPs are informational. In Python, style guidelines are laid out by [PEP 8](https://www.python.org/dev/peps/pep-0008/). PEP 8 sets standards for many aspects of Python development, including casing for variables and functions, tabs versus spaces (it recommends spaces), and comments.

PEP 8 does not put forth an opinion on single versus double quotes, however. As a general rule, it's best to choose one style for your development and deviate from it only if you need to use that quote character in your string. (In other words, if you are using single quotes, and need to put a single quote into a string literal, use double quotes for that string as an exception.)

### String literals

**Think, Pair, Share**

In [None]:
3 * 'un' + 'ium'

### Concatenating strings

In [None]:
'Py' 'thon'

In [None]:
prefix = 'Py'
prefix + 'thon'

In [None]:
prefix 'thon'

### String indexes

**Think, Pair, Share**

In [None]:
word = 'Python'
word[2]

In [None]:
word[0]
type(word[0])
len(word[0])

|P|y|t|h|o|n|
|-|-|-|-|-|-|
|0|1|2|3|4|5|

**Share**

In [None]:
word = 'Python'
word[5]

**Share**

In [None]:
word[-1]

In [None]:
word[-2]

In [None]:
word[-6]

### Slicing strings

Python also supports *slicing*, which extracts a substring. To slice, you indicate a *range* in the format `start:end`, where the start position is included but the end position is excluded:

|P|y|t|h|o|n|
|-|-|-|-|-|-|
|0|1|2|3|4|5|

**Think, Pair, Share**

In [None]:
word = 'Python'
word[0:2]

**Share**

In [None]:
word[2:5]

If you omit either position, the default start position is 0 and the default end is the length of the string:

|P|y|t|h|o|n|
|-|-|-|-|-|-|
|0|1|2|3|4|5|

**Share**

In [None]:
word[:2]

In [None]:
word[4:]

In [None]:
word[:]

In [None]:
word[-2:]

This characteristic means that `s[:i] + s[i:]` is always equal to `s`:

**Share**

In [None]:
word[:2] + word[2:]

In [None]:
word[:4] + word[4:]
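
The identity `s[:i] + s[i:] == s` also holds at the edges and for negative positions. A brief sketch, with illustrative variable names:

```python
word = 'Python'

split_at_start = word[:0] + word[0:]    # '' + 'Python'
split_at_end = word[:6] + word[6:]      # 'Python' + ''
split_negative = word[:-2] + word[-2:]  # 'Pyth' + 'on'

print(split_at_start, split_at_end, split_negative)
```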

**Share**

In [None]:
word[1:2] + " " + word[2:5]

![](https://i.kym-cdn.com/photos/images/newsfeed/001/255/097/022.jpg)

**TIP**

Think of indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of *n* characters has index *n*. The next cells explore what happens when an index is larger than that:

**Share**

In [None]:
word = 'Python'
word[42]

**Share**

In [None]:
print(word)

In [None]:
word[4:42]

In [None]:
word[42:]
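
To summarize the cells above in one place: indexing past the end raises an `IndexError`, while slicing clips out-of-range positions to the ends of the string. The sketch below uses `try`/`except`, which this notebook has not introduced, only so that the cell runs to completion.

```python
word = 'Python'

try:
    word[42]                  # a single out-of-range index is an error
except IndexError as error:
    message = str(error)

clipped = word[4:42]          # the end is clipped to len(word)
empty = word[42:]             # a start past the end gives an empty string

print(message, repr(clipped), repr(empty))
```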

**Strings are [Immutable](https://docs.python.org/3.6/glossary.html#term-immutable)**

In [None]:
word[0] = 'J'

In [None]:
word[2:] = 'py'

**Share** 

Why is this different?

In [None]:
word2 = 'J' + word[1:]
print(word2)

In [None]:
word[:2] + 'Py'

In [None]:
_

**Built-In Function: len**

In [None]:
s = 'supercalifragilisticexpialidocious'
len(s)

**Built-In Function: str**

In [None]:
type(2)

In [None]:
type(2.5)

In [None]:
str(2)

In [None]:
type(str(2))

In [None]:
str(2.5)

In [None]:
type(str(2.5))

> **Takeaway:** Operations on string data form the other fundamental task you will do in data science in Python. Becoming comfortable with strings now will pay large dividends to you later as you work with increasingly complex data.

## Other data types

> **Learning goal:** By the end of this subsection, you should have a basic understanding of the remaining fundamental data types in Python and an idea of how and when to use them.

So far, we've just looked at strings and numbers - common data types in any language.

The other data types that we will now look at -- **lists**, **tuples**, and **dictionaries** -- set Python apart from C++ or Java by providing powerful and easy-to-use built-in data structures.

### Lists

Python has a number of compound data types, used to group other values. The most versatile is the [*list*](https://docs.python.org/3.5/library/stdtypes.html#typesseq-list), which can be written as a sequence of comma-separated values (items) between square brackets. 

Lists can contain items of different types, but usually the items all have the same type.

In [None]:
squares = [1, 4, 9, 16, 25]
squares

In [None]:
squares = [1, 4, 9, 16.0, '25']

**Indexing and Slicing Work the Same as for Strings**

In [None]:
squares[0] 

In [None]:
squares[-1]

In [None]:
squares[-3:]

All slice operations return a new list containing the requested elements. This means that the following slice returns a new (shallow) copy of the list:

In [None]:
new_list = squares[:]
new_list[0] = "new"
new_list

In [None]:
squares
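
The word "shallow" matters once a list contains other lists (nested lists are covered later in this section). In this sketch, with made-up names `grid` and `copy_of_grid`, replacing a slot in the copy leaves the original alone, but changing a shared inner list shows up in both.

```python
grid = [[1, 2], [3, 4]]
copy_of_grid = grid[:]          # new outer list, same inner lists

copy_of_grid[0] = 'replaced'    # rebinding a slot affects only the copy
copy_of_grid[1].append(5)       # mutating a shared inner list affects both

print(grid)          # [[1, 2], [3, 4, 5]]
print(copy_of_grid)  # ['replaced', [3, 4, 5]]
```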

**Think, Pair, Share**

In [None]:
new_list = squares + [36, 49, 64, 81, 100]
print(squares)

**Lists are [Mutable](https://docs.python.org/3.5/glossary.html#term-mutable)**

In [None]:
cubes = [1, 8, 27, 65, 125] # This has a mistake!
4 ** 3 

**Think, Pair, Share**

In [None]:
# How would we replace the incorrect value?
cubes[3] = 64
cubes

**Replace Many Values**

**Share**

In [None]:
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
letters

In [None]:
letters[2:5] = ['C', 'D', 'E']
letters

**Share**

In [None]:
letters[2:5] = [] # replacing a slice with an empty list removes those items
letters

In [None]:
letters[:] = [] # replacing the whole slice with an empty list clears the list
letters

**Built-In Functions: len**

In [None]:
letters = ['a', 'b', 'c', 'd']
len(letters)

**Nesting Lists**

**Think, Pair, Share**

In [None]:
a = ['a', 'b', 'c']
n = [1, 2, 3]
x = [a, n]
x

`x` is a list of lists, and you can access its constituent lists through the same indexing you use with simpler lists:

**Share**

In [None]:
x[0]

**Share**

In [None]:
x[0][0]

### Exercise:

Nested lists come up a lot in programming, so it pays to practice.

In [None]:
# Which indices would you include after x to get ‘c’?
# x[0][0] gives 'a'


In [None]:
# How about to get 3?


### List object methods

Python includes a number of handy methods that are available on all lists.

For example, [`append()`](https://docs.python.org/3.6/tutorial/datastructures.html) and [`extend()`](https://docs.python.org/3.6/tutorial/datastructures.html) enable you to add to the end of a list, much like the `+=` operator:

**Share**

In [None]:
beatles = ['John', 'Paul']
beatles.append('George')
beatles

**Share** - how is this different?

In [None]:
beatles2 = ['John', 'Paul', 'George']
beatles2.append(['Stuart', 'Pete'])
beatles2

Let's try this instead:

In [None]:
beatles.extend(['Stuart', 'Pete'])
beatles
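
Side by side, the three ways of adding to a list look like this. The sketch uses fresh lists (the names `appended`, `extended`, and `augmented` are illustrative) so that it does not disturb `beatles`.

```python
appended = ['John', 'Paul']
appended.append(['Stuart', 'Pete'])   # adds the list itself as ONE item

extended = ['John', 'Paul']
extended.extend(['Stuart', 'Pete'])   # adds each item of the list

augmented = ['John', 'Paul']
augmented += ['Stuart', 'Pete']       # behaves like extend()

print(appended)   # ['John', 'Paul', ['Stuart', 'Pete']]
print(extended)   # ['John', 'Paul', 'Stuart', 'Pete']
print(augmented)  # ['John', 'Paul', 'Stuart', 'Pete']
```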

[`index()`](https://docs.python.org/3.6/tutorial/datastructures.html) returns the index of the first matching item in a list (if present):

In [None]:
beatles.index('George')

The [`count()`](https://docs.python.org/3.6/tutorial/datastructures.html) method returns the number of items in a list that match objects you pass in:

In [None]:
beatles.count('John')

**Share**

There are two methods for removing items from a list. The first is [`remove()`](https://docs.python.org/3.6/tutorial/datastructures.html), which locates the first occurrence of an item in the list and removes it (if present):

In [None]:
beatles.remove('Stuart')
beatles

The other method for removing items from lists is the [`pop()`](https://docs.python.org/3.6/tutorial/datastructures.html) method. If you supply `pop()` with an index number, it will remove the item from that location in the list and return it; otherwise, `pop()` removes the last item in a list and returns that:

In [None]:
beatles

In [None]:
removed_item = beatles.pop(0)

### Obtaining a method signature

In Jupyter notebooks, you can obtain information about a method by putting a `?` after its name:

In [None]:
beatles.pop?

In [None]:
removed_item

The [`insert()`](https://docs.python.org/3.6/tutorial/datastructures.html) method enables you to add an item to a specific location in a list:

**Share**

In [None]:
beatles

In [None]:
beatles.insert(1, 'Shana')
beatles

**Share**

In [None]:
beatles.reverse()
beatles

In [None]:
beatles.sort()
beatles

Note that you can supply your own function (often a **lambda function**) as the `key` argument to `sort()` to control how the items in a list are ordered. We will cover lambda functions in Section 2.
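
As a preview, here is a minimal sketch of sorting with a `key` function. The list `names` is a made-up example; the lambda returns the length of each name, so the list ends up ordered from shortest to longest.

```python
names = ['Paul', 'Shana', 'George', 'Pete']

names.sort()                              # default: alphabetical order
alphabetical = names[:]                   # keep a copy of this ordering

names.sort(key=lambda name: len(name))    # order by length instead
by_length = names[:]

print(alphabetical)  # ['George', 'Paul', 'Pete', 'Shana']
print(by_length)     # ['Paul', 'Pete', 'Shana', 'George']
```

Because `sort()` is stable, names of equal length ('Paul' and 'Pete') keep the relative order they had before the second sort.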

### Exercise:

In [None]:
# What happens if you run beatles.extend(beatles)?
beatles.extend(beatles)
beatles

In [None]:
# How about beatles.append(beatles)?
beatles.append(beatles)
beatles

### Tuples

**Tuples** are an immutable data type. Immutable data types are useful for protecting constant data from being overwritten by accident, or for improving performance when iterating over data.

You create tuples much as you do lists, only using parentheses instead of brackets.

In [None]:
t = (1, 2, 3)
t

**Tuples are Immutable**

**Share** - What will happen?

In [None]:
t[1] = 2.0

**Share** - What will happen?

In [None]:
t[1]

**Share** - What will happen?

In [None]:
t[:2]

**Creating Tuples from Lists**

In [None]:
l = ['baked', 'beans', 'spam']
l = tuple(l)
l

**Creating Lists from Tuples**

In [None]:
l = list(l)
l

### Membership testing

The `in` operator lets you test lists and tuples for the membership of specific data. 

**Share**

In [None]:
tup = ('a', 'b', 'c')
'b' in tup

In [None]:
lis = ['a', 'b', 'c']
'a' not in lis

### Exercise:

What happens if you run: 

In [None]:
lis in lis

Is that the behavior you expected? If not, think back to the nested lists we’ve already encountered.

Feel free to experiment for a bit to understand why this is.

### Dictionaries

**Dictionaries** map unique keys to values.

You create dictionaries by listing zero or more key-value pairs inside of braces, like this:

In [None]:
example_dict = {'key1': 'value1'}  # avoid naming a variable dict: that would hide the built-in dict type
example_dict

Keys for dictionaries are most commonly one of three things:
1. strings
1. numbers
1. tuples (that contain only strings, numbers, or other tuples). 

Dictionary *keys* must be immutable, so lists cannot be used as keys in dictionaries, for example.
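
A short sketch of that rule, using a hypothetical dictionary named `grid_labels`: tuples work as keys, while a list is rejected with a `TypeError`. The `try`/`except` is there only so the cell runs to completion.

```python
grid_labels = {(0, 0): 'origin', (1, 0): 'east'}
grid_labels[(0, 1)] = 'north'        # a tuple of numbers is a valid key

list_key_rejected = False
try:
    grid_labels[[1, 1]] = 'northeast'  # a list is mutable, so it cannot be a key
except TypeError as error:
    list_key_rejected = True
    message = str(error)

print(len(grid_labels), list_key_rejected)
```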

In [None]:
capitals = {'France': ('Paris', 2140526)}
capitals

**Share**

In [None]:
capitals['Nigeria'] = ('Lagos', 6048430)
capitals

### Exercise:

Now try adding another country (or something else) to the capitals dictionary:

In [None]:
# your code here:
capitals['England'] = ('London', 6000000)
capitals

**Interacting with Dictionaries**

In [None]:
capitals['France']

In [None]:
capitals['Nigeria'] = ('Abuja', 1235880)
capitals

In [None]:
len(capitals)

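Similar to the `pop()` method for lists, the `popitem()` method removes a key from the dictionary, along with its associated value, and returns the pair. In Python 3.6 and earlier the choice of pair is arbitrary; from Python 3.7 on, it is the most recently inserted pair: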

In [None]:
capitals.popitem()

In [None]:
capitals
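
A few other dictionary operations are worth knowing, although this notebook does not cover them: the `in` operator tests for a key, `get()` looks up a key without raising an error if it is missing, and `pop()` removes a specific key. The sketch below rebuilds a small `capitals` dictionary from the values used above; 'Atlantis' is a made-up key that is deliberately absent.

```python
capitals = {'France': ('Paris', 2140526), 'Nigeria': ('Abuja', 1235880)}

has_france = 'France' in capitals                    # membership tests the keys
missing = capitals.get('Atlantis')                   # None instead of a KeyError
fallback = capitals.get('Atlantis', ('unknown', 0))  # supply your own default
removed = capitals.pop('France')                     # remove one specific key

print(has_france, missing, fallback, removed, len(capitals))
```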

> **Takeaway:** Regardless of how complex and large the data you work with, these basic data structures will be your means for handling and manipulating it. Comfort with these basic data structures is essential to being able to understand and use Python code written by others.

## List comprehensions

> **Learning goal:** By the end of this subsection, you should understand how to create lists programmatically and concisely.

Sometimes, it makes more sense to generate a list algorithmically. Imagine we wanted a list of the squares of the numbers from 1 to 1,000. Rather than type those out, we can generate the list in code, either with a loop or with a *list comprehension*.

This is how we would generate the list algorithmically with a for loop:

In [None]:
numbers = []
for x in range(1,1001):
    numbers.append(x ** 2)

numbers
# numbers[990:]

This is how it looks using a list comprehension:

In [None]:
numbers = [x ** 2 for x in range(1,1001)]

numbers[990:]

Here's how we break down what's happening:

In [None]:
numbers = [x for x in range(1,11)] # range() excludes its end value, so use 11 to stop at 10.
numbers

numbers = [x for x in range(1,11)]
numbers = [x for x in [1,2,3,4,5,6,7,8,9,10]]
numbers = [1,2,3,4,5,6,7,8,9,10]

We can do operations on the items generated as well:

In [None]:
for x in range(1,11):
    print(x*x)

In [None]:
squares = [x*x for x in range(1,11)]
squares

In [None]:
squares = [x*x for x in range(1,11)]
squares = [x*x for x in [1,2,3,4,5,6,7,8,9,10]]
squares = [1*1, 2*2, 3*3, 4*4, 5*5, 6*6, 7*7, 8*8, 9*9, 10*10]
# squares = [1, 4, 9, ...]

**Think, Pair, Share**

In [None]:
odd_squares = [x*x for x in range(1,11) if x % 2 != 0]
odd_squares

In [None]:
odd_squares = [x*x for x in range(1,11) if x % 2 != 0]
odd_squares = [x*x for x in [1,2,3,4,5,6,7,8,9,10] if x % 2 != 0]
# Keep only the odd values of x:
odd_squares = [x*x for x in [1,3,5,7,9]]
odd_squares = [1*1, 3*3, 5*5, 7*7, 9*9]
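
For comparison, this sketch writes the same filter as an ordinary `for` loop with an `if` statement and checks that both approaches build the same list. The name `odd_squares_loop` is illustrative.

```python
# Loop version: test each x, then append its square.
odd_squares_loop = []
for x in range(1, 11):
    if x % 2 != 0:
        odd_squares_loop.append(x * x)

# Comprehension version: the same steps in a single expression.
odd_squares = [x * x for x in range(1, 11) if x % 2 != 0]

print(odd_squares_loop == odd_squares, odd_squares)
```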

### Exercise:

In [None]:
# Now use a list comprehension to generate a list of odd cubes
# from 1 to 2,197
odd_cubes = [x**3 for x in range(1,15) if x % 2 != 0]
odd_cubes

> **Takeaway:** List comprehensions are a popular tool in Python because they enable the rapid, programmatic generation of lists. The economy and ease of use therefore make them an essential tool for you (in addition to a necessary topic to understand as you try to understand Python code written by others).

## Importing modules

> **Learning goal:** By the end of this subsection, you should be comfortable importing modules in Python.

If you quit the Python interpreter and open it again, the definitions you have made (your functions and variables) will be lost. Similarly, you might also want to use a handy function that you’ve written in several programs without copying its definition into each program.

To support this, Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a [*module*](https://docs.python.org/3/tutorial/modules.html). Definitions from a module can be imported into other programs or modules.

For example, the `factorial()` function is not one of Python's built-in functions. It is part of the Python [`math`](https://docs.python.org/3/library/math.html) module. So, when we run `factorial()` before importing `math`, we get an error:

In [None]:
factorial(5)

In [None]:
import math
math.factorial(5)

Notice that we still have to prefix the `factorial()` function with `math.`. We can use a different method to import that specific function from the `math` module and use it as if it were defined in our program:

In [None]:
from math import factorial
factorial(5)

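Avoid wildcard imports like the following. They pull every public name from the module into your program, can silently overwrite names you have already defined, and make it unclear where a name came from: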

In [None]:
from math import *
factorial(5)


> **Takeaway:** There are several Python modules that you will regularly use in conducting data science in Python, so understanding how to import them will be essential (especially in this training).

You can add more cells to your notebook by clicking the **insert cell below (+)** button at the top of the window. The Python [`math`](https://docs.python.org/3/library/math.html) module has many functions in it. Try importing some of the other math functions and playing around with them.
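
As one possible starting point, the sketch below uses both import styles with a handful of `math` functions. The particular functions chosen here are just examples of what the module offers.

```python
import math
from math import sqrt, pi

root = sqrt(16)                     # imported by name, so no math. prefix
circle_area = pi * 2 ** 2           # area of a circle with radius 2
rounded_down = math.floor(2.7)      # largest whole number <= 2.7
rounded_up = math.ceil(2.1)         # smallest whole number >= 2.1
common_divisor = math.gcd(12, 18)   # greatest common divisor

print(root, circle_area, rounded_down, rounded_up, common_divisor)
```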