# Introduction to Python
In this module, you will learn the basics of Python programming that are useful for data science. This course assumes that you are familiar with basic programming, but it is possible to pick up these skills as you go, provided that you are motivated.

There are many Python tutorials available online, so this module is meant only as a quick-start guide. The Python documentation is an invaluable reference for when you need to know how to do something. The standard library documentation can be found [here](https://docs.python.org/3/library/index.html).

The version of python that we will be using for this course is **Python 3**. Please make sure that you have downloaded the correct version. If you are using Docker (we will cover this later in the course), then be sure you are using a recommended container such as `jupyter/datascience-notebook` or `jupyter/scipy-notebook`

## Importing modules

One of the reasons why Python is such a good language for data science is that it has many rich and powerful libraries. Two examples are `numpy`, a library for fast numerical programming that provides easy array and matrix functionality, and `pandas`, one of the most important modules for data science, which allows us to work with `DataFrames`. We can import these modules by executing the following code:

In [None]:
import numpy as np
import pandas as pd

That's it! Notice how we used the `as <alias>` syntax here. This is because in order to call functions from these libraries, you have to prefix them with the name of the library (or its alias) so that python knows where the functions are coming from:

In [None]:
np.mean([1, 50, 100])

As a matter of convention, `numpy` is almost always aliased as `np` and `pandas` is similarly aliased as `pd`. We will try to follow standard practices in this course, so that your code will be readable by others in the Data Science community.

## Calculations and Variables

One of the simplest things that python can do is act as a calculator. We can do simple mathematical operations as you might expect:

In [None]:
13 + 12, 11 - 6, 2 * 7 # Addition, Subtraction, and Multiplication

Just a quick note: when we separate our operations by commas `,` the values are returned as a `tuple`, which is an immutable (unable to be changed) sequence of objects in python. We'll talk more about these later. For now, it's simply a way to concisely display some information

In [None]:
12/5, 12//5, 12.//5., 12./5. # Notice that // is floor division, even for floats (12.//5. gives 2.0)!

In [None]:
12 % 5 # Modulo (remainder)
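
To make the difference between these operators concrete, here is a small sketch that uses only built-in Python. The variable names are illustrative, and `divmod` is a built-in function that returns the floor quotient and the remainder together.

```python
# True division always returns a float, even when the result is whole.
true_quotient = 12 / 4

# Floor division rounds down toward negative infinity.
floor_positive = 12 // 5
floor_negative = -12 // 5     # rounds down to -3, not toward zero
floor_float = 12.0 // 5       # a float, because one operand is a float

# divmod returns the floor quotient and the remainder as a tuple.
quotient, remainder = divmod(12, 5)

print(true_quotient, floor_positive, floor_negative, floor_float)
print(quotient, remainder)
```

Note that floor division and truncation only differ for negative results, which is an easy detail to miss.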

You can figure out the type of any object in python by using the `type` function. For example,

In [None]:
type(12), type(12.), type('12') # integer, float, string

We can also assign values to variables, as you might imagine

In [None]:
a = 12
print(type(a))
print(a)

Here's something interesting. What happens when we assign one variable to another?

In [None]:
a = [1,2,3]
b = a
print(b)


It may look like we made a copy of `a`, and stored it in `b`. However, what happens when we modify `a` and then check `b`?

In [None]:
a.append(4)
print('the value of a is: {}'.format(a))
print('the value of b is: {}'.format(b))

Here `b` changed as well. This is because `a` and `b` are two names for the same list object; `b = a` does not make a copy. If we *modify* the list through `a`, the change is visible through `b` as well. However, if we *re-assign* `a` to a new value, `b` still refers to the original list. Be careful about this in your code!

In [None]:
a = [1,2,3]
b = a
# Now completely re-assign a
a = 4

print(a)
print(b)
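
If you do want an independent copy, one option is the list's `.copy()` method. The sketch below is a minimal illustration; the name `c` is introduced here only for the example, and the `is` operator checks whether two names refer to the same object.

```python
a = [1, 2, 3]
b = a            # b is another name for the same list object
c = a.copy()     # c is a new list with the same elements (a shallow copy)

a.append(4)

print(b is a)    # both names refer to one object
print(c is a)    # c is a separate list
print(b)
print(c)
```

The copy is shallow, so nested lists inside `a` would still be shared between `a` and `c`.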

### Boolean operators

Comparison operators return the Boolean values `True` or `False`. For instance,

In [None]:
12 < 11

In [None]:
12 > 11

You can of course do this with variables as well:

In [None]:
a = 12
b = 11

a > b, a < b, a == b, a != b # Greater than, less than, equal to, not equal to
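
Comparisons like these can be combined with Python's `and`, `or`, and `not` keywords. The following is a minimal sketch; the result names (`both`, `either`, `negated`, `in_range`) are made up for illustration.

```python
a = 12
b = 11

both = a > b and b > 0        # True only if both comparisons are True
either = a < b or a == 12     # True if at least one comparison is True
negated = not a > b           # flips the result of (a > b)

# Comparisons can also be chained, as in mathematical notation.
in_range = 10 < a < 20

print(both, either, negated, in_range)
```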

## Lists

Lists are some of the most useful objects in python. Many of the operations that are available on lists are also available on other python objects. Let's start out by creating some lists

In [None]:
empty_list = []
integer_list = [1, 2, 3, 4, 5]
float_list = [1., 2., 3., 4., 5.]
string_list = ['1', '2', '3', '4', '5']

In `Python`, lists can contain objects of multiple types. This makes them flexible, but also potentially dangerous. We can combine lists by simply adding them together:

In [None]:
combined_list = integer_list + float_list + string_list
combined_list

Lists in `Python` are 0-indexed, meaning that the "first" element of a list is actually accessed by `list[0]`:

In [None]:
combined_list[0]

You can get the total length of a list by using `len(list)`

In [None]:
len(combined_list)

What if we try to access index 15 of this list, which has 15 elements?

In [None]:
combined_list[15]

Oops! Remember, Python is 0-indexed, so the valid indices run from 0 to `len(list) - 1`, and the last element is actually given by

In [None]:
combined_list[len(combined_list) - 1]

Alternatively (and preferably), you can access the last element by using `list[-1]`

In [None]:
combined_list[-1], combined_list[-2]

### List slicing

What if you only want to get out certain elements in a list? These operations are known as *slicing*. You can slice a list in many ways -- but here are the main ones:

In [None]:
combined_list[2:6] # Get the elements from index 2 UP TO, but not including, index 6

In [None]:
combined_list[:5] # Get the first 5 elements (indices 0 through 4)

In [None]:
combined_list[:-4] # Get every element except the last four

In [None]:
combined_list[4:] # Every element from index 4 onward

In [None]:
combined_list[:] # get all elements

In [None]:
combined_list[::2] # Get all elements, but only grab every 2nd
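
Because slice boundaries are easy to get wrong by one, it can help to try them on a short list where the result is easy to check by eye. The list `letters` and the result names below are illustrative only.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

middle = letters[2:6]           # indices 2, 3, 4, 5
first_three = letters[:3]       # indices 0, 1, 2
drop_last_two = letters[:-2]    # everything except the last two
from_index_four = letters[4:]   # index 4 to the end
every_other = letters[::2]      # step of 2
reversed_copy = letters[::-1]   # a negative step walks backwards

print(middle)
print(first_three)
print(drop_last_two)
print(from_index_four)
print(every_other)
print(reversed_copy)
```

Every slice returns a new list, so `letters` itself is left unchanged.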

### Iteration

One of the most important properties of lists is that they can be iterated over. For example:

In [None]:
for element in combined_list:
    print(element)
    print('the type of this element is {}'.format(type(element)))

We can also iterate through the indices by using the `range` constructor

In [None]:
for i in range(len(combined_list)):
    print(combined_list[i])

However, python offers a better way to do this, which is the `enumerate` function:

In [None]:
for index, value in enumerate(combined_list):
    print('the index is {}'.format(index))
    print('the value is {}'.format(value))
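
`enumerate` also accepts an optional `start` argument, which is handy when you want human-friendly numbering that begins at 1. This is a small sketch with a made-up list of steps.

```python
steps = ['load data', 'clean data', 'fit model']

numbered = []
for number, step in enumerate(steps, start=1):
    # number counts from 1 here instead of the default 0
    numbered.append('{}. {}'.format(number, step))

print(numbered)
```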

### Building lists

Oftentimes, we want to be able to build lists up according to some criteria. There are two main recommended ways to do this in Python: the `.append` method and `list comprehensions`. We will cover both here:

In [None]:
my_list = [] # start with an empty list:

for i in range(10):
    my_list.append(i) # We are appending the value of i to our list for each iteration

print(my_list)

In [None]:
# Here is a similar version, but we are going to be more picky about what we append
my_list = []

for i in range(10):
    if i < 5:
        my_list.append(i)
    else:
        my_list.append('>= 5')

print(my_list)

>## Exercise 1:
> #### Construct a list using the above method that contains all odd numbers between 1 and 19 inclusive and print it out

In [None]:
# ====================
# Your code here:

### List Comprehensions

List comprehensions are a concise and very useful way to construct lists. Here are comprehension versions of the two loops above (the second uses a longer placeholder string for values of 5 or more).

In [None]:
my_list = [i for i in range(10)]
print(my_list)

In [None]:
my_list = [i if i < 5 else 'Greater than or equal to 5' for i in range(10)]
print(my_list)
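
A comprehension can also *filter* elements by placing an `if` after the `for` clause. This differs from the `if ... else` expression before the `for`, which chooses a value for every element. The word list below is invented purely for illustration.

```python
words = ['numpy', 'pandas', 'os', 're', 'matplotlib']

# An `if` after the `for` keeps only the elements that pass the test.
long_words = [word for word in words if len(word) > 2]

# Transforming and filtering can be done in one step.
short_upper = [word.upper() for word in words if len(word) <= 2]

print(long_words)
print(short_upper)
```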

>## Exercise 2:
>#### Construct the same list that you did in Exercise 1, but this time in list comprehension form, and print it out

In [None]:
# ====================
# Your code here:

## Strings

Character `strings` in python are another base type that you may encounter whenever you are dealing with text data. Strings also exhibit many of the characteristics that lists do, in that they have lengths and can be iterated over. Let's see an example of this!

In [None]:
my_string = 'hello world'

In [None]:
len(my_string)

In [None]:
for char in my_string:
    print(char)

In [None]:
my_string[0:7] # We can also slice strings the same way

In [None]:
my_string.append('!')

As you can see, `str` objects don't have an `append` method. Strings are immutable, so they cannot be changed in place the way lists can. Here is an excellent resource on [using strings in python](https://realpython.com/python-strings/).
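
To add to a string, you build a new one instead. The sketch below shows two common ways, concatenation with `+` and the `str.join` method; the names `with_bang`, `pieces`, and `joined` are illustrative.

```python
my_string = 'hello world'

# "Appending" to a string means building a new string.
with_bang = my_string + '!'

# str.join combines many pieces in one step.
pieces = ['hello', 'world']
joined = ' '.join(pieces)

print(with_bang)
print(joined)
print(my_string)   # the original string is unchanged
```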

### A note about searching strings

Oftentimes, we are interested in searching through text in some way. In order to do this, it is recommended that you use `regular expressions`. We will learn about these in a later module!

## Dictionaries

Another way to store information in Python is a dictionary. Dictionaries are collections of key, value pairs. You can spot a dictionary in Python when you see the curly braces: `{}`. Dictionaries are accessed by key rather than by position (although since Python 3.7 they remember the order in which keys were inserted), so you cannot slice them in the same way that you can slice lists. In general, dictionaries are very fast data structures for looking up a value by its key.

You can construct a dictionary in the following way:


In [None]:
my_dict = {'key1': 1, 'key_two': 2, 3:'value 3'}

Note that keys do not have to be the same type, and similarly we can store anything we want in the value portion. The dictionary simply provides a map from a key to a value. To see the values of a dictionary, we can access them via


In [None]:
my_dict.values()


However, sometimes we want to look at the keys as well. To get the key, value pairs of the dictionary as `tuples`, we can use the `.items()` method. In Python 3 this returns a view object rather than a `list`; wrap it in `list()` if you need an actual list.


In [None]:
my_dict.items()


When you iterate over a dictionary, you get back the keys. For example,


In [None]:
for i in my_dict:
	print(i)

Let's say that you want to check whether a value is one of your keys. You can do so with the `in` operator:


In [None]:
keylist = ['key1', 'key_two', 'value', 'value2', 'notakey']

for i in keylist:
	if i in my_dict:
		print('{} is already in your keys!'.format(i))
	else:
		my_dict[i] = 'a new value'

my_dict.items()
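
Two related patterns are often useful: looking up a key with a fallback via `.get`, and looping over keys and values together via `.items()`. This is a minimal sketch using a made-up `inventory` dictionary.

```python
inventory = {'apples': 3, 'pears': 0}

# .get returns a fallback instead of raising KeyError for a missing key.
apples = inventory.get('apples', 0)
plums = inventory.get('plums', 0)

# Iterating over .items() gives each key together with its value.
in_stock = []
for fruit, count in inventory.items():
    if count > 0:
        in_stock.append(fruit)

print(apples, plums, in_stock)
```

Unlike assignment with `my_dict[key] = value`, calling `.get` never adds the missing key to the dictionary.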

## Functions

Functions play a key role in Python, as in every other programming language. You have already seen many functions in the code above. For example, the `print()` and `len()` functions take in some input and return something. In addition, things like `.append()` are functions that are associated with some object, in this case lists. These are known as `methods`. The following section will teach you how to define your own functions and how to do so in a way that makes it clear to users of your functions what it is that you're trying to do.

Let's begin by defining a very simple function:

In [None]:
def my_func():
	return 'this is my function'

Notice that we didn't write anything in the parentheses. This means that the function takes no `arguments`. We can call the function by simply writing its name followed by parentheses:


In [None]:
my_func()


Ok, that's a pretty boring function. What if we want to do something slightly more interesting? We can add `arguments` to our functions. Let's write a function that takes two numbers as input, squares each of them, and then sums the results.

In [None]:
def square_then_sum(a, b):
	return a**2 + b**2

Try this function out yourself!


In [None]:
# Your code here:



Note that if we do _not_ specify any arguments, python yells at us:


In [None]:
square_then_sum()


Sometimes, it makes sense for an argument to have a value that is used when the caller does not supply one. These are called `default arguments`. For example:

In [None]:
def exponentiate_then_square(a, b, exponent=2):
    """
    Sometimes you will see functions which are documented in this way. There
    are several style guides that dictate what good conventions are around
    documenting functions.

    This function raises each value to an exponent and then sums the result.

    Args:
        a (int or float): the first value
        b (int or float): the second value
        exponent (int or float): the power to raise a and b to (default 2)

    Returns:
        result (float): the result of performing the above operation

    This may seem overly verbose, but for more complex functions, it can be
    EXTREMELY helpful to have this type of documentation in line
    """
    result = a**exponent + b**exponent

    return float(result)

Now we can call this function with whatever value we want for `exponent`, but if we don't specify one, it'll use the default that we chose in the function definition:


In [None]:
exponentiate_then_square(2, 3)

In [None]:
exponentiate_then_square(2,3, exponent = 4)
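
Arguments can be supplied by position or by name, and named arguments may appear in any order. The sketch below uses a hypothetical stand-in function, `power_sum`, so that it can run on its own; it is not part of the course code.

```python
def power_sum(a, b, exponent=2):
    # Hypothetical example function: raise each value to a power, then add.
    return a**exponent + b**exponent


default_result = power_sum(2, 3)                    # exponent falls back to 2
positional_result = power_sum(2, 3, 3)              # third position fills exponent
keyword_result = power_sum(b=3, a=2, exponent=3)    # names allow any order

print(default_result, positional_result, keyword_result)
```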

We can also accept a variable number of `positional` and `keyword` arguments. Sometimes, you are unsure how many arguments will be passed in. In this case, we can add an asterisk `*` in front of a parameter name in the function definition. All remaining positional values passed in are collected into a tuple that can be accessed inside the function. This is known as a variable-length `positional` argument, conventionally named `*args`. Let's see an example of this:

In [None]:
def add_then_square(*args):
	"""
	This function takes in a variable number of positional arguments, adds
	them all together, and then squares the sum:

	Args:
		*args (ints or floats) values to be added then squared

	Returns:
		init (int or float) The result of performing the above operation

	"""
	init = 0
	if len(args) == 0:
		print('No arguments given!, please specify arguments')
		pass
	else:
		for arg in args:
			init += arg

	init = init**2

	return init


You may notice a few things that are unusual about this function. First, the `pass` statement does nothing at all: it is a placeholder for places where Python requires a statement, and it does not stop the function or change what it returns. With no arguments, this function prints the message, skips the loop, and still returns `0`. Sometimes, it is better to have Python raise an error so that your function fails loudly rather than continuing quietly. There are also cases where a function only needs to print something; a function that finishes without a `return` statement returns `None`, and some authors put `pass` at the end of such a function to signal to readers that there is intentionally no return value.
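
As a small illustration of what a function returns when it has no `return` statement, and of failing loudly instead of printing a message, consider the sketch below. Both function names (`announce` and `strict_add_then_square`) are hypothetical and are not part of the course code; `sum` is a Python built-in.

```python
def announce(message):
    # No return statement: calling this function evaluates to None.
    print('announcement: {}'.format(message))


def strict_add_then_square(*args):
    # Raise an error instead of printing when no values are given.
    if len(args) == 0:
        raise ValueError('at least one argument is required')
    return sum(args) ** 2


result = announce('hello')
print(result is None)

print(strict_add_then_square(1, 2, 3))

try:
    strict_add_then_square()
except ValueError as error:
    caught = str(error)
    print('caught: {}'.format(caught))
```

Raising an exception makes the mistake visible to the caller, who can then decide how to handle it with `try` and `except`.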

Also, the `+=` operator is short for 'add then reassign'. For example:

In [None]:
a = 1

a += 12

a

This construct is especially useful inside loops just to make things a bit more concise. 

Now we can test this function to see how it works.

In [None]:
add_then_square()

In [None]:
add_then_square(1)

In [None]:
add_then_square(1,2,3) # (1+2+3)**2

Keyword arguments work similarly, except that you pass them in as `named` arguments and mark the parameter with two asterisks `**`. You can then access them inside the function as a dictionary. For example:

In [None]:
def print_grades(**grades):
	"""
	This function takes in keyword arguments. The names for these arguments are
	the name of classes that you are taking and the value represents what you 
	think your grade will be. The function will then print all of the results:
	"""

	for class_ in grades: # Avoid using keywords like `class`, which actually mean something in python
		print('I expect to receive a {} in {}'.format(grades[class_], class_))

	pass # I'm not doing anything else in this function

In [None]:
print_grades(math = 'b+', bme590 = 'a', spanish = 'b')
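
The same `**` syntax works in the other direction: when calling a function, `**` unpacks an existing dictionary into keyword arguments. The sketch below uses a hypothetical helper, `describe_grades`, which returns its sentences rather than printing them so that the two calls can be compared.

```python
def describe_grades(**grades):
    # Hypothetical helper: returns one string per course instead of printing.
    return ['{}: {}'.format(course, grade) for course, grade in grades.items()]


# Passing keyword arguments directly.
direct = describe_grades(math='b+', spanish='b')

# Unpacking an existing dictionary with ** gives the same result.
expected_grades = {'math': 'b+', 'spanish': 'b'}
unpacked = describe_grades(**expected_grades)

print(direct)
print(direct == unpacked)
```

This is convenient when the names and values are built up elsewhere in your program, for example read from a file.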
