# Introduction to Python

Python is a programming language. If you have come from the world of point-and-click interfaces for data analysis the programming aspect alone can be challenging, but by writing your own code you will ultimately have more flexibility available to you. This section aims to introduce you to the main programmatic concepts of Python that you will need.

If you are not new to programming or Python but are new to Pandas then you can skip this section and start at the next section, Pandas Data Structures, without an issue.

### Variables
The two most fundamental aspects of programming in Python are variables and functions. In very simple terms variables are things and functions do things. We will start with a simple example.

To run this example click on the cell with the code and then click Run. The Run symbol is the solid triangle pointing right, that looks like the "Play" symbol on most TV remotes, and will be located at the top of this notebook.

In [None]:
first_variable = 'hello world!'
second_variable = 20

print(first_variable)
print(second_variable)

first_variable and second_variable are both variables. first_variable is a word and second_variable is a number but both are declared the same way: you write the name of the variable, an equals sign, and the value you want to assign to the variable. You have now essentially named those values with the variable names.

(Legal names for variables are anything that is not already in use and doesn't contain special characters, like spaces, quotation marks, or the # used to mark a comment.)


In some programming languages we would need to specify the type in advance, but in Python we don't. However, these variables do have types, and this limits what can be done with them. Run the example below.

In [None]:
third_variable = 'chicken'
fourth_variable = 4

result_1 = first_variable + third_variable  
print(result_1)
result_2 = second_variable + fourth_variable
print(result_2)

These run correctly because the variable is now just a name for the value it contains. Since we can add numbers, adding two numeric variables works fine, and adding two text variables concatenates them (chains them together). These are exactly the same results you would get from writing `'hello world!' + 'chicken'` and `20 + 4`.

What happens when we mix variable types? Below, we are adding a text variable to a number. When this runs it will give an error.

In [None]:
first_variable + second_variable

Every operation on a variable requires that the variable be of the right type or types. (And, again, if you simply wrote something like `'a' + 3` you would get this sort of error as well.) You can check what a variable is using the type() function.

In [None]:
type(first_variable)

Numbers are a bit tricky, because numbers can be a numeric type (`int` or `float`) or a text type (`str`). If you put the number in quotes Python knows you want the symbol for the number (`str` type), not a numeric type. What's tricky about this is that you can have things that look like numbers but are actually text. We'll discuss changing the types of data later.
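
As a small sketch of this distinction (the variable names here are made up for illustration and are not used elsewhere in this notebook), compare a number written with and without quotes:

```python
# Hypothetical names, used only for this illustration.
numeric_twenty = 20      # no quotes: an int
text_twenty = '20'       # quotes: a str that merely looks like a number

print(type(numeric_twenty))
print(type(text_twenty))

# Adding two ints does arithmetic; adding two strs concatenates them.
numeric_sum = numeric_twenty + 4
text_sum = text_twenty + '4'

print(numeric_sum)
print(text_sum)
```

The two sums look very different even though both started from something that reads as "20", which is why it is worth checking the type when a result surprises you.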

A brief, but important note about text: if you want a variable to be some text enclose the text with either double or single quotes. For example:

In [None]:
a = 'table'
b = "table"

If you skip these quotes you will run into trouble. For example:

In [None]:
c = table

What happened? Since `table` wasn't in quotes, Python went looking for the variable named table, rather than setting the variable c equal to the word "table". Because this distinction is important, text in quotes automatically gets a special color in the Jupyter notebook.

Another way to break code is to change the capitalization of variables. Python cares about capitalization. By convention, variables are lowercase with underscores between words, like test_variable. Other formats will work (although conventions keep code readable for other people) but you can't change the case of a variable and expect it to be recognized. The code below will try exactly this and produce an error.

In [None]:
capitalization_test = "This is a test"
capitalization_Test = "THIS IS ALSO A TEST"

print(capitalization_test)
print(capitalization_Test)
print(Capitalization_Test)

As you can see, both the lowercase capitalization_test and the one with a single uppercase letter worked, but when we tried to retrieve the value of one of them using Capitalization_Test, with a capital C, Python had no idea what we wanted.

There isn't much we can do with variables before we get to functions, so I'll hold off having you write your own code until then.

### Functions
We have already used the functions print() and type(), but let's discuss functions more. When we have used print and type we have used them with things printed inside the parentheses. Run the examples below and we will explain why.

In [None]:
print

In [None]:
print()

In [None]:
print("hello")

When we simply run "print" we get the text "\<function print>". This is just telling us that yes, print is a function. We didn't run it, though. print() runs the print function, but prints nothing because we didn't give the function anything to print. print("hello") prints "hello". We would call "hello" an argument, which is something we give a function to do something with.

Arguments are separated with commas. I will also use spaces, but this is for readability only. The code will still run with just commas. 

Print is an unusual function in that it takes any number of arguments. If you run the code below you will see what print does with five arguments.

In [None]:
print('These', 'are', 'my', 'five', 'arguments')

Most other functions take a set number of arguments and break if you give them the wrong number. len() is a function that tells us the length of whatever argument we give it.

In [None]:
len('This is a short sentence')

You should see that len() has told you that "This is a short sentence" has 24 characters in it. Below, try using len with the wrong number of arguments.

In [None]:
# type your code here


To get deeper into functions we need to start writing our own functions. This is the real power of using Python instead of a point-and-click interface. We can define custom functions that do exactly what we want. Below I have written out a simple function that uses the Pythagorean theorem. I am also using comments, which begin with #, to explain what is happening. Comments are ignored by Python, so you can write yourself (or others) notes with them.

In [None]:
# name the function and the arguments it gets
def pythagorean(a, b):
    # do the math
    c_squared = (a ** 2) + (b ** 2)
    c = c_squared ** 0.5
    # return sends the result back. We'll discuss this more in a moment.
    return c

When we run this nothing happens. Why? Well, we only defined the function. We haven't used it yet. (Using a function is often referred to as "calling" it.) We just said to Python, "When I ask you for the pythagorean function this is what you should do." Let's use it below.

In [None]:
pythagorean(2, 3)

If we attempted to use pythagorean() before defining it this call would fail. Always define your functions first, then use them!  

To define a function we always use the keyword def. You can remember this as being short for define, but always use def, not define! Directly after def comes the name of the function, and then parentheses with arguments, and then a colon. Forgetting the colon is a pretty common beginner mistake!

An important note about arguments: you can name the arguments whatever you want. In this example our arguments are what are called positional arguments, which means that Python determines the names of the arguments by position. In this case whatever argument comes first is `a` and whichever is second is now `b`. I'm renaming whatever gets passed to the functions, not trying to guess what they are already called.
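
To see that position is what matters, here is a minimal sketch using a made-up function, `subtract`, that is not part of the rest of this notebook. Swapping the order of the arguments swaps which value becomes `a` and which becomes `b`:

```python
# subtract is a hypothetical function used only to show that position matters.
def subtract(a, b):
    return a - b

first_result = subtract(10, 3)   # a is 10, b is 3
second_result = subtract(3, 10)  # a is 3, b is 10

print(first_result)
print(second_result)
```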

The next lines are the meat of the function. They are all indented (one tab or four spaces). This indentation means that these lines "belong" to the less indented line above them that ends with a colon. If you don't indent the function won't work correctly. Another common beginner mistake is to "fix" the weird indenting, breaking the function. Incidentally, if you typed the colon on the end of the line that starts with `def` the notebook will automatically indent your next line. If it doesn't you probably forgot the colon!

#### If you wish to, take this copy of the function (under a new name) and change some of these things to see what happens. I suggest trying the following, minimally:
1) Rename the arguments
2) Change the indentation

In [None]:
# you can break this one
def pythagorean_2(a, b):
    c_squared = (a ** 2) + (b ** 2)
    c = c_squared ** 0.5
    return c

pythagorean_2(2, 3)

The last line is the return statement. To see how this works let's run a version of the function without it.

In [None]:
def pythagorean_no_return(a, b):
    c_squared = (a ** 2) + (b ** 2)
    c = c_squared ** 0.5

pythagorean_no_return(2, 3)

It appears that nothing happened. This is because the return statement is missing. In the first version of the function, running the function produced a result which was then returned to the main Python program. In the second version the result is calculated but it doesn't get sent back. (Instead, a "None" is sent back to indicate that there's nothing returned.)
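
One way to see the None for yourself is to save the result of the call in a variable and print it. This is only a sketch; `result` is a name chosen for the illustration.

```python
def pythagorean_no_return(a, b):
    c_squared = (a ** 2) + (b ** 2)
    c = c_squared ** 0.5
    # there is no return statement, so the function hands back None

# result is a hypothetical variable name used for this illustration
result = pythagorean_no_return(3, 4)
print(result)
```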

Another way to think of this is that a function call "becomes" the return value. Below I have added two len() calls together. Run this and see what happens.

In [None]:
len('cat') + len('bat')

You should see that each len() call becomes a number (3, in both cases) and then Python can add those two numbers together. This is also useful when we chain functions together.

In [None]:
print(pythagorean(len('bat'), len('chiroptera')))

Here, instead of passing pythagorean two numbers as arguments, we passed it two len() calls that became numbers. It is possible to chain functions in very deeply nested ways, but this can be hard to read. Often, for readability, you will want to save some intermediate steps as variables and then pass those variables on to the next functions.

Now it is time to write your own functions. 
#### Below, write a simple function that takes three arguments and adds the first two together and then divides that sum by the third argument. Make sure to return the result! And, of course, call your function and check that it works.

In [None]:
# Your code goes here


Some functions have optional arguments, called keyword arguments. A simple example is below:

In [None]:
def keyword_add(a, b=3, c=2):
    return a + b + c

b and c are keyword arguments, that take default values. If no argument is passed to the function the default is used. Keyword arguments always go after positional arguments. To pass a keyword argument you use the name (hence keyword) as shown in several different ways below.

In [None]:
keyword_add(2)

In [None]:
keyword_add(2, b=4)

In [None]:
keyword_add(2, b=5, c=10)

In [None]:
keyword_add(2, c=8)

The big advantage of keyword arguments is that you don't have to mention them if you are fine with the default.
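
Because keyword arguments are matched by name rather than by position, they can also be given in any order after the positional arguments. The sketch below repeats the definition of `keyword_add` so that it runs on its own; the variable names holding the results are made up for illustration.

```python
def keyword_add(a, b=3, c=2):
    return a + b + c

# b and c both fall back to their defaults
default_total = keyword_add(2)

# c is named before b here; Python matches them by name, not position
reordered_total = keyword_add(2, c=8, b=4)

print(default_total)
print(reordered_total)
```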

Take your earlier function and modify it so that the third argument, the one that you divide the sum of the other two with, is a keyword argument defaulting to 2.

In [None]:
# your code goes here


It is also possible to write a function that takes no arguments. These must still be called using parentheses, but there are no arguments in the parentheses.
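
A minimal sketch of such a function is below. `say_hello` is a hypothetical name used only for this illustration.

```python
# say_hello takes no arguments, so the parentheses stay empty
def say_hello():
    return 'hello world!'

# the parentheses are still required to call it
greeting = say_hello()
print(greeting)
```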

### Scope
Python has two scopes for variables that you will encounter frequently: global and local. Variables in the global scope are accessible to anyone, but can't be modified without special permission by anything working in local scope. Local variables are accessible only to their functions. Run the example below (which includes two functions without either arguments or a return statement).

In [None]:
test_var = 1

def local_test_var():
    test_var = 100
    print('local_test_var test_var is', test_var)
    
def other_local_test_var():
    test_var = 50
    print('other_local_test_var test_var is', test_var)
    
print(test_var)
local_test_var()
other_local_test_var()
print(test_var)
test_var = 14
print(test_var)

In this case scoping worked as it should, although maybe not as you expected. While there appears to be one variable, test_var, there are three: a global test_var, a test_var that belongs to local_test_var, and a test_var that belongs to other_local_test_var.

Python treats global variables as "read only" without special permission, and so when we set test_var to 100 inside a function Python says, effectively, "You don't have permission to alter global test_var, so we're making you a local test_var instead."

We could, however, read the value of test_var inside a function, as shown below.

In [None]:
test_var = 1

def local_test_var():
    print('test_var is', test_var)
    
print(test_var)
local_test_var()
print(test_var)

How do we give ourselves permission to alter a global variable? With the keyword "global", as shown below.

In [None]:
test_var = 1

def modify_global_test_var():
    global test_var
    test_var = 100
    print("test_var is", test_var)
    
print(test_var)
modify_global_test_var()
print(test_var)

By adding the line `global test_var` before modifying `test_var` we told the function that `test_var` is not a local variable but the global one and gave it permission to modify it, which it did.

#### Try this yourself! I have provided a skeleton of a function below which you should fix so that it prints the total number of times it has been called. (E.g., calling it once should print 1, the second time it should print 2, and so on.)

You will need to add two lines of code, minimum.

In [None]:
# this is a global variable
counter = 0

def count():
    # because print doesn't alter counter we can print the global counter without special permission
    print(counter)
    
count()
count()
count()
count()

### Lists
Storing single variables is nice, but for data analysis we need data structures that hold lots of variables. We will ultimately use Pandas data structures, but these are based on lists and dictionaries, and you will probably want to use lists and/or dictionaries to hold data in mid-analysis. We will start with lists.

Lists are exactly what they sound like. They are ordered collections of data. Ordered means each item has a fixed position and can be retrieved by that position, which is not how dictionaries work (they look items up by key). Lists are created with square brackets []. They can either be created empty or full, and there are methods to add and remove items from them. The code below demonstrates all of these features.


In [None]:
# create a full list
a_list = [1, 2, 3]

# create an empty list
b_list = []

print(a_list)
print(b_list)

# add to an empty list
b_list.append('Boston')

print(b_list)

# add more places to this list
b_list.append('New York')
b_list.append('Khazad-Dum')
b_list.append('Los Angeles')
b_list.append('Baton Rouge')

print(b_list)

# remove a place from the list
b_list.remove('Khazad-Dum')

print(b_list)

Note that the append and remove functions have slightly different notation than the functions we have seen before. 
This is because these are technically methods, a distinction that isn't very easy to understand when you are starting out. However, methods are "attached" to their objects and so they always know about their own object. This is how the append in `b_list.append('New York')` knows to append 'New York' to `b_list`. Calling `b_list.append` or `b_list.remove` will always append or remove its argument from `b_list`.

This means of referencing a method is called dot notation, and we'll use it quite a lot later. You write the name of the owner object, then a period (hence dot notation), then the name of the method.

We also need to be able to read items from lists. To read an item from a list we use its index, which is its position in the list. Python, like many programming languages, indexes from zero, which means that the first item in the list is at index 0, not index 1, and the second is at index 1, not index 2, and so forth. The general form to read an item from a list is `list_name[item_index]`. E.g., to access 'Boston' from `b_list` above you would write `b_list[0]`.
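
Here is a short sketch of indexing from zero, using a made-up list called `cities` so that it runs on its own:

```python
# cities is a hypothetical list used only for this illustration
cities = ['Boston', 'New York', 'Los Angeles']

first_city = cities[0]   # the first item is at index 0
third_city = cities[2]   # the third item is at index 2

print(first_city)
print(third_city)
```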

#### In the block below, write a line that will retrieve "h" from the list under the comment.

In [None]:
alphabet_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# write your code here


It's not uncommon that we want the end of a list. Python takes negative indices, which helps with this. The code below will retrieve "z".

In [None]:
alphabet_list[-1]

You can also take pieces of a list. These are called slices, and are of the form `list_name[start:stop]`. E.g., `alphabet_list[0:3]` will return `['a', 'b', 'c']`.

A blank represents the start or end of the whole list. `alphabet_list[:3]` will also return `['a', 'b', 'c']` because 0 is the start of the list. `alphabet_list[3:]` will return the whole alphabet except for a, b, and c.
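
The same slicing rules can be sketched on a shorter, made-up list (`colors` is not used elsewhere in this notebook). Note that the item at the stop index is not included in the slice.

```python
# colors is a hypothetical list used only for this illustration
colors = ['red', 'orange', 'yellow', 'green', 'blue']

middle = colors[1:3]    # index 1 up to, but not including, index 3
first_two = colors[:2]  # blank start means "from the beginning"
rest = colors[2:]       # blank stop means "through the end"

print(middle)
print(first_two)
print(rest)
```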

#### In the blank below write a slice that gives you x, y, z.

In [None]:
# write your code here


### Dictionaries
While lists "remember" items by position, dictionaries remember items by keys. This is often called a key-value pair, where a key is used to look up a value. The dictionary below has a set of key-value pairs where the key is the name of a city and the value is the country the city is in.

In [None]:
city_dict = {'Lagos': 'Nigeria',
             'Canberra': 'Australia',
             'Washington DC': 'USA',
             'Kuala Lumpur': 'Malaysia'}

We can look up a value by using the form `dictionary_name[key]`. Run the code below to see.

In [None]:
city_dict['Kuala Lumpur']

We cannot do the reverse. The code below will throw an error, because we are trying to look up the value, not the key.

In [None]:
city_dict['Australia']

Creating and adding to dictionaries is shown below. Note the use of {} instead of \[].

In [None]:
# create an empty dictionary
empty_dict = {}

# create a dictionary with things in it
other_dict = {'Cat': 'Felis catus', 'Rat': 'Rattus norvegicus'}

# add a key: value pair to the empty dictionary
empty_dict[1] = 'A'

print(empty_dict)
print(other_dict)

#### Create a dictionary with a few famous people in it. Set it up so that we can look up their first names using their last names. Test it, of course!

A quick warning: don't name the dictionary `dict` (or a list `list`). These are both functions that make dictionaries or lists, and so these names are already in use.

In [None]:
# put your code here


Add someone else to the dictionary of famous people. Print the dictionary to make sure it worked.

In [None]:
# put your code here


We can get fancy and make a dictionary of lists. Let's revisit our city dictionary and try to get both country and whether the city is the current capital of the country in the same value. (Kuala Lumpur is the seat of some branches of government but not others, so we will code this as a partial capital.)

In [None]:
city_dict = {'Lagos': ['Nigeria', 'No'],
             'Canberra': ['Australia', 'Yes'],
             'Washington DC': ['USA', 'Yes'],
             'Kuala Lumpur': ['Malaysia', 'Partly']}

We can now access the list by using the key and a particular item in the list using an index. For instance, if we wanted to know if Canberra is the capital of its country we could access that information using the code below:

In [None]:
city_dict['Canberra'][1]

`city_dict['Canberra']` gives us the list `['Australia', 'Yes']`. Is/is not capital is the second item (index 1), so we just ask for the index 1 item in `city_dict['Canberra']`.

#### Below, write a line of code that retrieves what country Washington DC is in.

In [None]:
# your code goes here


### For Loops
Now that we have introduced lists we can discuss for loops. One common pattern is that you want to perform the same operation on every item in a list. A for loop will do this. An example will be helpful to explain this, so we'll start with this code.

In [None]:
num_list = [2, 56, 17, 278]

for x in num_list:
    print(x + 1)

In this for loop we simply print a number 1 higher than the number in the list. This isn't particularly useful, but the point is to demonstrate the loop.

The for loop is created in the line `for x in num_list:`. Note that this line ends with a colon and that the next section is indented below it. We saw this in defining functions and will see it for if statements. This pattern means that the indented text belongs to the for loop.

`for x in num_list:` itself follows the general pattern `for variable_name in iterable:`. Anything with elements that can be pulled up in order one at a time is an iterable. A list is an iterable, where we iterate over items. A word is an iterable, where we iterate over letters. (See below.)

In [None]:
for x in 'platyhelminth':
    print(x)

A dictionary is only sort of iterable. If we attempt to iterate over this dictionary we will just get the keys.

In [None]:
for x in {1: 'a', 2: 'b', 3: 'c'}:
    print(x)
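
Since the loop hands us each key, one way to reach the values is to look each key up inside the loop using the `dictionary_name[key]` form. This is only a sketch; `letters` and `values_seen` are names made up for the illustration.

```python
# letters and values_seen are hypothetical names used for this illustration
letters = {1: 'a', 2: 'b', 3: 'c'}
values_seen = []

for key in letters:
    # look up the value that belongs to this key
    values_seen.append(letters[key])
    print(key, letters[key])

print(values_seen)
```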

The for loop also names the things that are iterated over. The statement `for x in num_list:` means, effectively, "Take the first item from num_list, name it x, do the code below on it. Take the next item from num_list, name it x, do the code below on it." This means we keep resetting what x is. See below. 

("x += 1" is a compact way to write "x = x + 1".)

In [None]:
for x in num_list:
    x += 1
    
print(x)

Only the last value was retained! This is what we expect. If we wanted to save these values we would need to put them somewhere, like this:

In [None]:
new_nums = []

for x in num_list:
    new_nums.append(x + 1)
    
print(new_nums)

#### It's time to try this yourself. In the first block of code below, make a for loop that doubles the original number and stores it in a new list.

#### In the second block of code make a for loop that prints every item in the list the first block of code made.

In [None]:
# put your first loop here


In [None]:
# put your second loop here


Another important thing to know in for loops: the `range()` function. The `range()` function is set up with these arguments: `range(start, stop, step)`. It produces an iterable of numbers starting at start and ending just before stop. If no step is provided the numbers count up by 1; if a step is provided they go up by that step number.

Technically, stop is the number at which range says "Oh, I'm done now" so the stop value will not be part of the range.
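
A quick way to see exactly which numbers a range contains is to turn it into a list with `list()`. The sketch below uses made-up variable names to hold the results.

```python
# the variable names here are hypothetical, chosen for this illustration
first_five = list(range(0, 5))      # stops before 5
evens = list(range(0, 10, 2))       # steps by 2, stops before 10
countdown = list(range(3, 0, -1))   # counts down, stops before 0

print(first_five)
print(evens)
print(countdown)
```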

Run the examples below.

In [None]:
# print every number between 0 and 9
for x in range(0, 10):
    print(x)

In [None]:
# print every even number between 0 and 9
for x in range(0, 10, 2):
    print(x)

In [None]:
# print every number from 9 to 0. Note that the step must be negative to count down; without it the range would be empty!
for x in range(9, -1, -1):
    print(x)

One of the most common uses for range() is to do something a certain number of times. Let's go back to num_list. What if we wanted every entry to be triplicated? We could use range(0, 3) to do something three times. The example below shows this and also how to nest for loops.

In [None]:
triplicated_list = []
for x in num_list:
    for y in range(0, 3):
        triplicated_list.append(x)
        
print(triplicated_list)

Here the for loop with the `range()` call is nested inside the first for loop. We signal this with indentation. Everything that is indented one tab/four spaces occurs inside one iteration of `for x in num_list`. This includes `for y in range(0, 3)`. Everything indented with two tabs or eight spaces occurs inside a loop of `for y in range(0, 3)`.

Note, also, that we use different variables in each loop. I have said that variable names in loops don't matter, but we wouldn't want to set x equal to the first element of num_list and then immediately set it to 0 (the first element of range(0, 3)). That could easily cause issues.

The loop below shows how to use the indenting rules to operate code inside the first for loop, but not the second, after the second for loop runs.

In [None]:
triplicated_list = []
for x in num_list:
    for y in range(0, 3):
        triplicated_list.append(x)
    print(triplicated_list)
        
print('Final list:', triplicated_list)

#### Use the space below to write a loop that takes each element from num_list and then successively adds 1, 2, and then 3 to it, saving it to a new list each time.

In [None]:
# put your code here


### If statements
One reason to introduce nested loops is so that we can discuss if statements. You frequently want code that does one thing in one condition and something different in another. If statements (sometimes called if-else statements) provide this functionality. Run the example below:

In [None]:
for num in range(1, 11):
    if num < 5:
        print(num, 'is small')
    else:
        print(num, 'is large')

num < 5 is our condition. The condition must evaluate to either True or False. Run the code below to see how this works.

In [None]:
1 < 5

In [None]:
56839 < 9

The line `if num < 5:` tests num < 5 and if it is true the code indented below it runs. This is a familiar pattern by now: a line that ends in a colon and has an indented block under it. In this example the `if` is also paired with an `else`. The else block (the code indented under `else:`) runs if the condition is False.

In Python it is not necessary to write an else. Imagine that we want to check a list of systolic blood pressures and flag any over 300, perhaps because we think that this is impossibly high and is an indicator of data entry error. No else statement is needed because we don't need anything to happen if the pressure is lower.

In [None]:
systolic_pressure = [140, 210, 378, 97, 163, 550]

for p in systolic_pressure:
    if p > 300:
        print('Very high pressure detected')

Let's now imagine that we want to take a continuous scale and split it into categories. Our scale will run 0-100 and we want to split every third, roughly, into a category. The code below will do this:

In [None]:
for num in range(0, 100, 10):
    if num > 66:
        print(num, 'is large')
    if 66 >= num > 33:
        print(num, 'is medium')
    if num <= 33:
        print(num, 'is small')

In the middle if statement we introduce a compound condition. `num` must be less than or equal to 66 and also greater than 33. These can be very useful.  
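
The chained comparison is shorthand for two separate comparisons joined with Python's `and` keyword. The sketch below, using made-up variable names, shows that both forms give the same answer for a single value.

```python
# chained and spelled_out are hypothetical names used for this illustration
num = 50

chained = 66 >= num > 33
spelled_out = (num <= 66) and (num > 33)

print(chained)
print(spelled_out)
```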

Another way to do this same task is below:

In [None]:
for num in range(0, 100, 10):
    if num > 66:
        print(num, 'is large')
    else:
        if num > 33:
            print(num, 'is medium')
        else:
            print(num, 'is small')

In this case we first check if the number is large. If it isn't, and only if it isn't, we check if it is medium or small. Python provides a more compact way to do this, the `elif`, which means "else (previous ifs/elifs), if (new condition)". The code below is the same as the code above, but rewritten using elif.

In [None]:
for num in range(0, 100, 10):
    if num > 66:
        print(num, 'is large')
    elif num > 33:
        print(num, 'is medium')
    else:
        print(num, 'is small')

#### Now try some if statements yourself. We'll start simple: write an if statement that prints "Good" if you give it a number between 80-100 and "Bad" otherwise. You may want to put it inside a for loop that feeds it some numbers to test it.

In [None]:
# your code goes here


#### Now try a more complex one, where you probably want elifs. Write a block of code that takes a numeric grade and prints the letter grade. If you really want to test your skills, make this a function you can call using the numeric grade as an argument.

In [None]:
# your code goes here


### Errors
The last general Python topic to cover before we jump into Pandas and data analysis is error codes (which are often called "exceptions"). The reality is that any coder will make mistakes, and the way you find and fix these mistakes is by looking at error codes. First, let's make an error.

In [None]:
for x in range(0, 10):
    y = x + 'a'
    if y > 3:
        print('Yes')

This code is atrocious, with at least two problems, but let's look at the error message. The first thing to note is that the error message tells us that the error occurred in line 2 of the code, `y = x + 'a'`. Now, maybe that error occurred because of something you did earlier, but the Python interpreter isn't having an issue running your code until line 2.

The second thing to see is that this is a TypeError. This means you are trying to do something with your variables that can't be done because of their type. In this case, as the error message explains, you can't add an integer (x is an integer) to a string ('a'). We could fix this error by making x a string. Let's try that.

In [None]:
for x in range(0, 10):
    y = str(x) + 'a'
    if y > 3:
        print('Yes')

As you can see, we now get a new TypeError. On line three we are asking whether y, which will be '0a' the first time the loop runs, is greater than 3. This doesn't mean anything, so we get an error.

Why didn't we get both errors up front? Because the code broke on line 2, so the interpreter never got to 3.

Four classes of errors are common enough to review them here. You can always look one up if it is not familiar, but I want to introduce you to the `TypeError`, `IndexError`, `KeyError`, and `NameError`.  

We have already covered `TypeError`s.

`IndexError` and `KeyError` are similar. Both involve trying to retrieve items from iterables and failing. In the case of `IndexError` you are trying to retrieve an item by its index but no item exists at that index. The code below demonstrates this.

In [None]:
q = [1, 2, 3]
q[12]

KeyError works similarly, but in cases where you are retrieving an item by its key. See below.

In [None]:
example_dict = {'Apples': 20, 'Oranges': 13, 'Mangos': 34}
example_dict['Pears']

Finally, NameError means that you are trying to do something with a variable that doesn't exist. This often means that you misspelled something or forgot to capitalize correctly. Example below:

In [None]:
Loop_Counter = 0
for x in range(0, 10):
    loop_counter += 1

Below is a longer example. This is where recognizing errors and reading the output really helps. We'll assume that I have data coming in from a source where numbers are being treated as strings (i.e., 100 is the symbol 100, not the number 100). I will use the `float()` function to turn the third item in each line into a number and then save the resulting line in a new list. (`float()`, unlike `int()`, will leave decimals intact.)

In [None]:
input_data = [['June', 'B47', '123.8', 'Dehydration'],
              ['April', 'B22', '145.2', 'UTI'],
              ['September', 'B15', '211.9', 'Hallucinations'],
              ['January', 'B12', '156.9', 'Memory loss'],
              ['October', '132.0', 'B11', 'Fall injury'],
              ['November', 'B19', '101.3', 'Vampire bite']]

fixed_data = []

for data in input_data:
    for n in range(0, 5):
        if n == 3:
            d = float(data[n])
        else:
            d = data[n]
        fixed_data.append(d)
        
print(fixed_data)

We immediately get a ValueError on the line `d = float(data[n])`. Why? Because Python can't turn 'Dehydration' into a float value. However, we wanted '123.8' to be passed to float() on that line. This is a classic case of forgetting to index from zero. The third item, 123.8, isn't index 3, it's index 2. Below I have fixed the code. (There's no need to re-create input_data, so I haven't.)

In [None]:
fixed_data = []

for data in input_data:
    for n in range(0, 5):
        if n == 2:
            d = float(data[n])
        else:
            d = data[n]
        fixed_data.append(d)
        
print(fixed_data)

Now we get an IndexError instead. This is also a case of indexing incorrectly, but instead of just fixing it I want to show you how to approach debugging it. We know that `data[n]` is failing with an IndexError, but what is n when `data[n]` fails? Let's print n on every pass to see.

In [None]:
fixed_data = []

for data in input_data:
    for n in range(0, 5):
        print(n)
        if n == 2:
            d = float(data[n])
        else:
            d = data[n]
        fixed_data.append(d)
        
print(fixed_data)

We can see that the `data[n]` failure occurs when n is 4. This is probably another indexing-from-zero error. `range(0, 5)` outputs 0-4 (5 is the stop point, not the last point), but a four-item list only has indices 0-3. So let's fix that. (I am also removing the print statement.)

In [None]:
fixed_data = []

for data in input_data:
    for n in range(0, 4):
        if n == 2:
            d = float(data[n])
        else:
            d = data[n]
        fixed_data.append(d)
        
print(fixed_data)

Now we have a ValueError again. Somehow, the wrong data is still being passed. In this case I want to print the whole data line to see what's happening.

In [None]:
fixed_data = []

for data in input_data:
    print(data)
    for n in range(0, 4):
        if n == 2:
            d = float(data[n])
        else:
            d = data[n]
        fixed_data.append(d)
        
print(fixed_data)

We can see here that one of the lines has mis-entered data, with our number swapped with some other code. In these cases I simply want to skip these lines.

The easiest way to handle this is the try-except pair. This looks a bit like an if-else in format, but instructs Python to try something and then what to do if there is an error.

In [None]:
fixed_data = []

for data in input_data:
    for n in range(0, 4):
        if n == 2:
            try:
                d = float(data[n])
            except ValueError:
                pass
        else:
            d = data[n]
        fixed_data.append(d)
        
print(fixed_data)

The `except` portion takes an error type, which specifies which kind of error the `except` should handle. It is possible to skip this specifier, but I don't recommend it. If `d = float(data[n])` throws an error that isn't a ValueError we might want to know this.
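
To see why naming the error matters, here is a minimal sketch. The helper `to_number` is hypothetical and exists only for this illustration. A ValueError is handled quietly, while an IndexError from the same line is not caught by `except ValueError:` and still surfaces, which is how we find out about it.

```python
def to_number(row, position):
    # hypothetical helper: convert one item of a row to a float
    try:
        return float(row[position])
    except ValueError:
        # only reached when float() can't read the text as a number
        return None

row = ['October', '132.0', 'B11', 'Fall injury']

print(to_number(row, 1))  # '132.0' converts fine
print(to_number(row, 2))  # 'B11' raises a ValueError, which is handled

# an out-of-range position raises an IndexError, which the except above ignores
try:
    to_number(row, 10)
except IndexError:
    print('IndexError was not hidden by except ValueError')
```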

The `pass` after `except ValueError:` means "do nothing". You have to put something there, and `pass` is a way to have an instruction that doesn't do anything.

This isn't a key pattern to know right off the bat, but you may eventually want to use this to "keep moving" despite some errors.
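
One thing to watch in the loop above: when `float()` fails, `pass` leaves `d` holding the previous item, so that item gets appended a second time. If the goal is to drop a bad line entirely, one possible approach is to build each line separately and only keep it when the conversion works. This is a sketch rather than the only way to do it. `sample_data` is a shortened two-row stand-in for `input_data`, `clean_rows` and `new_row` are names I made up, and `continue` (which jumps to the next pass of the loop) may be new to you.

```python
sample_data = [['June', 'B47', '123.8', 'Dehydration'],
               ['October', '132.0', 'B11', 'Fall injury']]

clean_rows = []

for data in sample_data:
    try:
        number = float(data[2])
    except ValueError:
        # the third item isn't a number, so skip this whole line
        continue
    # rebuild the line with the converted number in place of the string
    new_row = [data[0], data[1], number, data[3]]
    clean_rows.append(new_row)

print(clean_rows)
```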

#### Below are some blocks of code with issues. Use the error messages to figure out how to fix the code.

In [None]:
# this block of code is meant to produce a dictionary that reads {1: 'a', 2: 'b'} and so forth
# one fix will prevent the error from occurring, but will be one off in the mapping

alphabet_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
alphabet_positions = {}

for n in range(0, 27):
    alphabet_positions[n] = alphabet_list[n]
    
print(alphabet_positions)

In [None]:
# this is supposed to print "combined codes", which are things like 1a, 2b, 3c, using the previously-defined dictionary

for x in alphabet_positions:
    combined = x + alphabet_positions[x]
    print(combined)

### Putting it all together
That's it! We're ready to start using Pandas.

Or, rather, as soon as you feel comfortable with these skills and putting them together we're ready to start using Pandas. To check your skills I have included some advanced problems below that require putting together all the different things we have learned in this worksheet.

I have also provided some hints in the collapsible format below, where you can click to expand the hint but it is, by default, hidden.
<details>
<summary>Click this to see a hint (which you will need!).</summary>

The method `.lower()` makes any letter lowercase. So if `a = "M"` then `a.lower()` will be "m". If you don't use this hint you'll need to only write lowercase in your sentences! (Also, running `a.lower()` doesn't make `a` lowercase forever. It just gives you a lowercase version of `a` now.)
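
A quick sketch of that last point (the name `lowered` is just for illustration):

```python
a = "M"
lowered = a.lower()  # a new, lowercase string

print(lowered)
print(a)  # a itself is unchanged
```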
</details>

In [None]:
# write a function in this space that takes a sentence as input, 
# changes all letters to numbers (a = 1, b = 2, and so forth)
# and then adds these numbers together. If the sentence already 
# contains numbers leave them alone, but sum them with the others.

# here's your alphabet again. You will want it.
alphabet_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']



<details>
<summary>Click this to see a second hint.</summary>

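The first thing you need to do is create a "code dictionary" where you set the key to be the letter and the value to be the letter's position in the alphabet (a = 1, b = 2, and so forth). Remember that list indices start at zero, so the position is the index plus one.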
</details>

<details>
<summary>Click this to see a third hint.</summary>

A `for` loop will create your dictionary. Loop over the numbers, and use the numbers to look up the letter, then insert them into an empty dictionary.
</details>

<details>
<summary>Click this to see a solution to the parts referenced in the second and third hints.</summary>

    encoding_dict = {}
    for n in range(0, len(alphabet_list)):
        # add 1 so that a = 1, b = 2, and so forth (indices start at zero)
        encoding_dict[alphabet_list[n]] = n + 1
</details>

<details>
<summary>Click this to see a hint about the next sub-topic.</summary>

Once you have your code dictionary make a variable to hold the sum, and use `+=` to add values to it as you loop through the sentence.
</details>

<details>
<summary>Click this to see a hint if you're having issues with the letters.</summary>
    
If a letter can't be looked up it must not be lowercase. Make sure you're using `.lower()` correctly. If `letter` is the variable with the letter you need to write `letter.lower()` everywhere you want a lowercase letter.
</details>

<details>
<summary>Click this to see a hint if you're having issues with the numbers.</summary>

`int()` and `float()` will turn your string numbers (like "3") into real numbers. `int()` handles integers (no decimals), `float()` handles decimals. E.g., `int("3.1")` breaks, `float("3.1")` is 3.1.
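
As a small sketch of the difference (the variable names are illustrative):

```python
whole = int("3")        # the integer 3
decimal = float("3.1")  # the decimal number 3.1

# int() can't read a string with a decimal point in it
try:
    int("3.1")
except ValueError:
    print("int('3.1') raises a ValueError")

print(whole)
print(decimal)
```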
</details>

<details>
<summary>Click this to see a hint if you're having issues with other characters.</summary>

Try-except is magic! We can just skip spaces, commas, periods, exclamation marks, and so on. Look at what error you're getting. Use `try` on your current line and then `except` your error and just `pass` if you get it.
</details>

<details>
<summary>Click this to see a full solution.</summary>


    def encode_and_sum(sentence):
        encoding_dict = {}
        for n in range(0, len(alphabet_list)):
            # add 1 so that a = 1, b = 2, and so forth (indices start at zero)
            encoding_dict[alphabet_list[n]] = n + 1
        summed_sentence = 0
        for letter in sentence:
            if letter.lower() in alphabet_list:
                summed_sentence += encoding_dict[letter.lower()]
            else:
                try:
                    summed_sentence += int(letter)
                except ValueError:
                    pass
        print(summed_sentence)

    encode_and_sum('Hello world 3!')
    
</details>