# Intro to Python

## UTM Coders, Feb 5 2019

**Authors**: Ahmed Hasan and James Santangelo, borrowing heavily from Madeleine Bonsma-Fisher, Lina Tran, Charles Zhu, and Amanda Easson

---

## The interpreter

### Math with _integers_

In [None]:
# Like most languages, Python can be used as a calculator
2 + 2

In [None]:
2 * 3

In [None]:
2 ** 3

In [None]:
10 / 6

In [None]:
# % is the modulus operator
# x % y -> returns the remainder of x divided by y
10 % 3

### Math with _floats_

Floats = floating point numbers

In [None]:
# Arithmetic operations involving floats always return floats
2.0 * 3

In [None]:
2.5 * 3
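
Mixing an integer with a float gives a float, and because floats are stored in binary, some decimal values are only approximated. The sketch below illustrates both points; the variable names `mixed` and `total` are made up for this example, and `round()` is just one possible way to compare floats.

```python
# Mixing an int and a float produces a float
mixed = 2 + 3.0
print(mixed, type(mixed))

# Floats are binary approximations, so some decimal sums are slightly off
total = 0.1 + 0.2
print(total == 0.3)

# Rounding both sides to a fixed number of decimal places is one way to compare
print(round(total, 10) == round(0.3, 10))
```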

## Variables

In [None]:
year = 2019 # This is an integer
name = 'James' # This is a string

In [None]:
print(year) # print is one of Python's built-in functions

print(name)

In [None]:
print(type(name))

In [None]:
print(type(2.0))

In [None]:
print(type(year))
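
A variable's type follows whatever value it currently holds, and a variable can be updated using its own current value. A small sketch, assuming the same `year` variable as above (`year_text` is a name introduced only for this example):

```python
year = 2019
print(type(year))

# A variable can be reassigned using its current value
year = year + 1
print(year)

# str() converts a value to a string; the original variable is unchanged
year_text = str(year)
print(type(year_text))
```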

In [None]:
# how exactly does print work again?

help(print)

## Lists and tuples

In [None]:
# lists - square brackets!

fruits = ['apple', 'orange', 'mango']

In [None]:
print(fruits)
print(type(fruits))

In [None]:
# Lists can hold different data types
misc = [42, 'python!', 2.57]
print(misc)

In [None]:
# concatenating lists
print(fruits + fruits)
print(fruits + misc)

In [None]:
# Tuples - parentheses!

fruit_tuple = ('apple', 'orange', 'mango')
print(fruit_tuple)

## Indexing lists and strings

In [None]:
# indexing in python begins at 0! 

print(fruits[0]) # Retrieve first element in the list
print(type(fruits[0]))

In [None]:
# slicing multiple things
# start : end (exclusive!)

# recall that fruits = ['apple', 'orange', 'mango']

print(fruits[0:2])
print(fruits[0:])
print(fruits[:3])

print(fruits[-1]) # last item of a list
print(fruits[-2]) # second to last item

In [None]:
# reassigning item in list
fruits[2] = 'banana'
print(fruits)

In [None]:
# Unlike lists, tuples do not support reassignment
# The next line raises a TypeError, so the print below is never reached
fruit_tuple[2] = 'banana'
print(fruit_tuple)
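
To see the tuple error without stopping the notebook, one option is to catch it with `try`/`except` (a feature not otherwise covered in this session). The sketch below also shows one way to get a "modified" tuple by building a new one; `message` and `new_tuple` are names made up for this example.

```python
fruit_tuple = ('apple', 'orange', 'mango')

try:
    fruit_tuple[2] = 'banana'  # not allowed: tuples cannot be changed in place
except TypeError as error:
    message = str(error)
    print('TypeError:', message)

# To "change" a tuple, build a new one from slices of the old one
new_tuple = fruit_tuple[:2] + ('banana',)
print(new_tuple)
```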

In [None]:
# slicing and indexing strings

my_string = 'This is a string.'

print(my_string[1:])

In [None]:
# strings do not support reassignment
# try running this: my_string[0] = 't'
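
Since a string cannot be changed in place, a common workaround is to build a new string out of pieces of the old one. A minimal sketch using only slicing and `+` (`lowercase_first` is a name chosen for this example):

```python
my_string = 'This is a string.'

# Combine a new first character with everything after index 0
lowercase_first = 't' + my_string[1:]

print(lowercase_first)
print(my_string)  # the original string is unchanged
```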

## Dictionaries

In [None]:
# dictionaries allow us to store key value pairs

fruit_colors = {'banana': 'yellow',
                'apple': 'red',
                'orange': 'orange'}

In [None]:
# keys are 'looked up' using square brackets

print(fruit_colors['banana'])

In [None]:
# additional keys can be added after the fact
fruit_colors['lemon'] = 'yellow'
print(fruit_colors)

In [None]:
# Dictionaries cannot have duplicate keys: assigning to an existing key overwrites its value
fruit_colors['lemon'] = "red"
print(fruit_colors)
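
Looking up a key that is not in the dictionary with square brackets raises a `KeyError`. Two ways to guard against that are the `in` operator and the `.get()` method, sketched below; the variable names and the `'unknown'` fallback value are assumptions made for this example.

```python
fruit_colors = {'banana': 'yellow', 'apple': 'red', 'orange': 'orange'}

# 'in' checks whether a key exists
has_apple = 'apple' in fruit_colors
has_grape = 'grape' in fruit_colors
print(has_apple, has_grape)

# .get() returns a fallback value instead of raising an error
grape_color = fruit_colors.get('grape', 'unknown')
print(grape_color)
```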

## If statements

In [None]:
# python can check whether certain statements are true or false

2 > 1

In [None]:
# use == to test for equality
1 == 1

In [None]:
x = 5
print(isinstance(x, int)) # A function to verify the type of a variable
print(isinstance(x, str))

In [None]:
# with these expressions, we can construct if statements
# if statements allow our scripts to encode more complex instructions

x = 5
if isinstance(x, int):
    print(x, 'is an integer')


In [None]:
# if-else

if isinstance(x, str):
    print(x, 'is a string')
else:
    print(x, 'is not a string')

In [None]:
# if-elif-else
# useful if we have multiple conditions to test

if isinstance(x, str):
    print(x, 'is a string')
elif isinstance(x, int):
    print(x, 'is an integer')
else:
    print(x, 'is neither a string nor an integer')

## For loops

In [None]:
# for loops allow us to automate repetitive operations

# how do we check which values in this list are even?
nums = [1, 2, 3, 4]

# could check them individually?
print(nums[0] % 2)
print(nums[1] % 2)

In [None]:
# for loops simplify this
# here, 'number' is a placeholder variable for each of the items in the list

for number in nums:
    if number % 2 == 0:
        print(number, 'is even')
    else:
        print(number, 'is not even')
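
Instead of printing inside the loop, the results can also be collected for later use. A small sketch, assuming the same `nums` list; `evens` is a name introduced here, and `.append()` adds an item to the end of a list:

```python
nums = [1, 2, 3, 4]
evens = []  # start with an empty list

for number in nums:
    if number % 2 == 0:
        evens.append(number)  # keep only the even values

print(evens)
```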

In [None]:
# we can also loop over the contents of a string
vowels = 'aeiou'
for letter in vowels:
    print(letter)

In [None]:
# You can also iterate through dictionaries

# Recall fruit_colors = {'banana': 'yellow', 'apple': 'red', 'orange': 'orange', 'lemon': 'red'}

for key in fruit_colors.keys():
    print(key, ':', fruit_colors[key])
    
print('')

# Alternative approach
for key, value in fruit_colors.items():
    print(key, ':', value)

## Functions

In [None]:
# functions allow us to generalize operations
# what is the sum of squares of two numbers?

x = 5
y = 7

print((x ** 2) + (y ** 2))

In [None]:
def sum_of_squares(num1, num2):
    ''' (int, int) -> int
    input: two integers
    output: the sum of the squares of the two numbers
    '''
    ss_out = num1 ** 2 + num2 ** 2
    return ss_out

# def is the keyword to define functions
# each function typically ends with a return statement

In [None]:
output = sum_of_squares(x, y) # The function's return is assigned to the variable 'output'
print(output) # our operation from above
print(sum_of_squares(50, 42)) # works with any values we want!

In [None]:
# checking on our docstring
help(sum_of_squares)
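
Printing a value and returning it are different things: a function with no `return` statement hands back `None`. The hypothetical `print_sum_of_squares()` below is a variation written only to illustrate that contrast.

```python
def print_sum_of_squares(num1, num2):
    '''Print the sum of the squares of two numbers; nothing is returned.'''
    print(num1 ** 2 + num2 ** 2)

result = print_sum_of_squares(5, 7)  # the value is printed inside the function
print(result)                        # result holds None, not the sum
```

This is why `sum_of_squares()` above uses `return`: its output can be stored in a variable and reused.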

## Some useful packages

In [None]:
# use a package by importing it
# these can be given a shorter alias

import numpy as np

In [None]:
# packages provide all sorts of useful functionality
# numpy allows for efficient numerical calculations in python

np_array = np.arange(15)
list_array = list(range(15))

print(np_array)
print(type(np_array))
print(list_array)
print(type(list_array))

In [None]:
# numpy arrays also allow for vectorized operations

print(np_array * 2) # Multiplies each array element by two
print(list_array * 2) # Concatenates lists
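
To get element-wise doubling from a plain list, an explicit loop is needed, whereas the array version is a single expression. A side-by-side sketch, assuming `numpy` is installed (the variable names are made up for this example):

```python
import numpy as np

list_values = [0, 1, 2, 3, 4]
np_values = np.array(list_values)

# With a list, doubling each element needs an explicit loop
doubled_list = []
for value in list_values:
    doubled_list.append(value * 2)

# With a numpy array, the same operation is one vectorized expression
doubled_array = np_values * 2

print(doubled_list)
print(doubled_array)
```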

In [None]:
# numpy arrays also have helpful 'methods'
# a method is a special function 'attached' to an object, to be used on the object itself

# what's the mean of our array?
print(np_array.mean())

In [None]:
# the max value in our array?
print(np_array.max())

In [None]:
import pandas as pd
import seaborn as sns # we will use this for plotting
# This is a 'magic function' that allows for special line or cell functionality
%matplotlib inline 
iris = sns.load_dataset('iris')
print(type(iris))

In [None]:
# Run this only if above doesn't work (e.g. due to SSL certificate error)
iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')

In [None]:
iris.head()

In [None]:
iris.columns

In [None]:
# pull out specific rows with the .loc indexer
# .loc slices by label, so the end label (2) is included
iris.loc[0:2]

In [None]:
# or rows AND columns
# use a list for multiple columns!

iris.loc[0:2, 'petal_length']

In [None]:
iris.loc[0:2, ['petal_length', 'species']]

In [None]:
sns.relplot(x='petal_length', y='petal_width', data=iris)

In [None]:
sns.relplot(x='petal_length', y='petal_width', hue='species', data=iris)

## Practice questions

1. Division of integers and floats can be done using either the `/` or `//` operators. Divide 10 by 6 using both operators. How do the outputs of `/` and `//` differ?
2. Copy the dictionary below into a code cell. Iterate through the dictionary and print only the key:value pairs for animals that have 0 legs. 
    ```
    animal_legs = {'dog': 4,
                   'cat': 4,
                   'fish': 0,
                   'human': 2,
                   'insect': 6,
                   'spider': 8,
                   'sponge': 0}
    ```
3. One of the best things about Python is the wealth of documentation and tutorials available online. With a bit of Googling and reading around, one can find a solution for nearly any Python problem or 'how do I do ___ in Python?' scenario. With this in mind, using our `iris` data frame object, find a way to create a new column called `sepal_area` that is the product of the `sepal_length` and `sepal_width` columns. (Hint: Search results from a website called Stack Overflow are usually a good place to look!)
4. There exists a method for string objects that allows users to replace all instances of a given character with something else. Try to find this method on the Internet and see if you can complete question 2 above with it.
5. A key part of making functions extra powerful and usable is ensuring they handle 'wrong inputs' instead of just blindly ignoring them or breaking. Write a function called `filter_list()` that takes in two arguments -- a list of integers and a cutoff value -- and returns a list that only contains values greater than the cutoff. However, if your function is provided a list that contains even one instance of something that's not an integer, have it print out some sort of error message instead. Remember to remind users what the correct inputs are in your function's docstring! (Note: mastering error handling is a key part of becoming a more versatile Python programmer. If you'd like to have a more advanced look at error handling, [this tutorial](https://en.wikibooks.org/wiki/Python_Programming/Exceptions) is a useful reference)
6. The `numpy` function `np.random.random_sample()` returns a random float in the half-open interval [0, 1). Use this to write a function called `coin_toss()` that performs a fair coin toss and returns 'heads' or 'tails'.
    - Use your `coin_toss()` function to perform 1000 coin tosses and save their results to a list called `coin_tosses`. Hint: Lists have a helpful method called `.append()` that modifies a list in place by adding the specified input to it. For example, if `cars` is a list, `cars.append('toyota')` will add 'toyota' to the list. Because `.append()` changes the list directly and returns `None`, do not write `cars = cars.append('toyota')`; that would replace your list with `None`.
    - Lists also have a method called `.count()` that will count instances of a given input. Use `.count()` to count how many heads and tails were flipped in your `coin_tosses` list. Do the values approach what you would expect? What if you go back and modify your `coin_toss()` function to bias the coin and then perform your 1000 tosses?
7. Similarly, the `numpy` function `np.random.randint()` returns randomly drawn integers from a specified range. For instance, `np.random.randint(0, 10, size=5)` will return an array of 5 values between 0 and 9 (the higher value is exclusive). Use this to create a function called `roll_dice()` that performs n die rolls.
    - The `seaborn` function `sns.distplot` is used to plot histograms (it is deprecated as of seaborn 0.11, where `sns.histplot` is the replacement). Use your `roll_dice()` function to create an array of 1000 dice rolls, and find a way to plot them using `sns.distplot` or `sns.histplot`.