#### Copyright 2018 Google LLC.

In [0]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Introduction to Python

Python is a language commonly used for machine learning. It is an approachable, yet rich, language that can be used for a variety of tasks.

It can take years to truly become an expert in the language, but luckily you can learn enough Python to become proficient in machine learning in a much shorter period of time.

This colab is a quick introduction to the core pieces of Python that you'll need to know to get started. This is only a brief peek into parts of the language that you'll commonly encounter as a data scientist.  As you progress through this course we'll introduce new Python concepts along the way.

If you already know Python, this lesson should just be a quick refresher. You might be able to simply skip to the challenges at the bottom of the document.

If you know another programming language, you will want to pay close attention because Python differs markedly from most popular languages in use today.

If you are new to programming, welcome! Hopefully this lesson will give you the tools to get started on your adventures in data science.

## Overview

### Learning Objectives

* Create, use, and troubleshoot variables and data structures.
* Read and write Python statements, expressions, conditionals, and loops.
* Use functions.

### Prerequisites

* CS1
* CS2 (optional)

### Estimated Duration

60 minutes

## Variables
 
One of the most important features in any programming language is the ability to use variables. Variables are how we store data to use later.
 
When we have an expression like `2 + 3` Python calculates the value of `5` and then forgets that the value ever existed. This makes for a very limited and heavyweight calculator! We need some way to store the values of expressions so that they can be used later in the program. To do that we use variables.
 
In the example below the value of `2 + 3` is stored in the variable `x`.

In [0]:
x = 2 + 3
x

Variables store values and those values can be used in expressions.

In [0]:
num1 = 2
num2 = 3
num1 + num2

Variable values can also be changed... hence the name variable.

In [0]:
num = 123
num = 456
num

There are very few limits on what you can name a variable. The first character needs to be a letter or an underscore. After the first character any alphanumeric character can be used. Underscores are also okay to use anywhere in a variable name, but stay away from naming a variable with two underscores at the beginning, since Python uses leading double-underscores for internal things. (We will talk about this more in a later lesson.)

Here are a few valid variable names:

In [0]:
number = 1
my_number = 2
YourNumber = 3
_the_number_four = 4
n5 = 5
NUMBER = 6

number + my_number + YourNumber + _the_number_four + n5 + NUMBER

Notice that `number` and `NUMBER` are different variables. Case matters.

Although Python will accept other styles, it is common convention to name constants in all-caps (e.g. THE_NUMBER) and variables using lower_with_underscore syntax (e.g. a_number).
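As a small sketch of that convention, the names `SECONDS_PER_MINUTE` and `elapsed_minutes` below are made up for illustration. Python does not enforce the difference; the capitalization only signals to readers that the value is not meant to change.

```python
# All-caps name: a constant by convention (Python will not stop you from changing it)
SECONDS_PER_MINUTE = 60

# lower_with_underscore names: ordinary variables
elapsed_minutes = 3
elapsed_seconds = elapsed_minutes * SECONDS_PER_MINUTE
elapsed_seconds
```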

---

Variable names are an important aspect of computer programming. Variables serve as a form of documentation within your code. Good names will help your teammates and your future self understand what your code is doing when they are trying to modify it. Take some time to think about your variable names as you create new variables.
 
Also, keep your variable style consistent. Don't mix variable styles like `this_variable` and `thisVariable` together unless you have good reason. Python has a [guide to naming variables in an idiomatic manner](https://www.python.org/dev/peps/pep-0008/#naming-conventions). Adhere to the guide when you can. It will help others understand your code and it will train you to be able to read other programmers' Python code.

## Printing and Strings

### Strings

Strings are a data type that can contain a sequence of zero or more characters. In order for Python to know that they are data and not part of the code, they have to be wrapped in quotes.

Some example strings are:

In [0]:
"Python is a "
'useful programming language'

Single and double-quotes are common and interchangeable. Pick one and stick with it unless you need to use that type of quote in a string.

In [0]:
"I've been learning Python"

But what if you need to use both? In that case you can *escape* the embedded quote.

In [0]:
'They\'ve had a "good" trip'

You probably noticed that the escape character, `\`, shows up in the output. This is just a side-effect of how Colab is printing the string. We'll learn how to print a cleaner looking string soon.

The triple-quote is another type of quote that you can use to create a string. The triple-quote allows you to have a multi-line string. It is often used when writing documentation directly in your code.

In [0]:
"""This is a string
surrounded by three quotes
and spanning multiple lines
"""

You can see in the output that the string shows up on one line with `\n` added where the line breaks were. `\n` is a special escape sequence that means "line feed", which is typewriter-speak for moving to the next line. `\t` is another common escape sequence that represents a tab. `\\` adds a backslash to the string.
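The short sketch below puts all three escape sequences into one string. It uses the `print` function, which is covered in the Printing section, so that the escapes are shown as a real tab, line break, and backslash. The variable name `escaped` is just an illustrative choice.

```python
# \t becomes a tab, \n a line break, and \\ a single backslash
escaped = "name\tvalue\nback\\slash"
print(escaped)

# Each escape sequence counts as one character in the string
print(len(escaped))
```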

Strings can be stored in variables. The `+` operator also works on strings. It concatenates them together. Operators like `-` and `/` don't work with strings.

In [0]:
s1 = "The Python "
s2 = "programming"
s3 = " language is easy to learn"
s1 + s2 + s3

Interestingly enough `*` works with strings. It causes a string to be repeated multiple times.

In [0]:
'ABC ' * 5

Python has a handy built-in function, `len`, for finding the length of a string when you need it.

In [0]:
len("pneumonoultramicroscopicsilicovolcanoconiosis")

If you need to extract a specific character from a string you can specify that character by *indexing* it. Notice that in Python the first character of a string is at index 0. Most popular programming languages start counting at 0. You'll see this throughout your career as a data scientist.

In [0]:
"abcdefghijklmnopqrstuvwxyz"[1]

You can also extract a *slice* of a string. A slice is a portion of a string referenced by starting and ending index. The starting index is inclusive and the ending index is exclusive as you can see below where indexing a slice starting at 1 and ending at 5 returns a four-character string.

In [0]:
alphabet = "abcdefghijklmnopqrstuvwxyz"
alphabet[1:5]

Slices have some handy shortcuts if you want to start at the beginning or go all the way to the end of a string.

In [0]:
alphabet[:3]

In [0]:
alphabet[23:]

Strings are *objects* in Python. We won't get into the details of objects in this tutorial other than to mention that objects can have functions called on them using **dot notation**. There is an example object call on the `alphabet` string below that converts the entire string to upper-case. Notice how the function is called like `alphabet.upper()` instead of `upper(alphabet)` as we saw before with the `len()` function.

In [0]:
alphabet.upper()

We've barely scratched the surface of what you can do with strings in Python. More information can be found in the [Python string](https://docs.python.org/3.7/library/string.html) documentation.

### Printing

So far we have relied on Colab to print out the data that we have been working with. This works fine for simple examples, but doesn't work if you want to print multiple times in a code snippet.

The **`print`** function allows you to print any data structure to the screen.

In [0]:
my_variable = "I'm a variable"
print(my_variable)
print("Hello class!")
print(12345)
print(123.45)
print(['a', 3, 'element list'])
print(('a', 3, 'element tuple'))
print({"my": "dictionary"})

You'll notice that `print` adds a new line to the output every time it is called. We can add an `end=` argument that takes away the new line.

In [0]:
print("The magic number is ", end='')
print(42)

There is another Python feature that, though not strictly just for printing, is commonly used when printing: text formatting.

What happens if you want to print out multiple values in a print statement? Or mix strings and numbers? To do that we can use the formatting operator **`%`**.

In [0]:
print("%s says that the numbers %d and %f sum to %f" % ("Bob", 3, 5.1, 3 + 5.1))

There is quite a bit going on in the code above, so let's break it down into pieces.
 
The first string we see in the call to print is `"%s says that the numbers %d and %f sum to %f"`. This string contains `%s`, `%d`, and `%f` placeholders. These placeholders tell Python that we expect a string (`%s`), integer (`%d`), or floating-point number (`%f`) that we'll put in the string.
 
Next we see a percentage sign after the string: `%`. This is the string formatting operator. It comes after a string and lets Python know that it is about to get a tuple containing data to put into a string.
 
And finally comes the tuple containing the data. The data in the tuple should be in the same order that the placeholders appear in the string. For example the second placeholder is `%d` and the second item in the tuple is the integer `3`.

There is a more modern and object-oriented formatting function that can be used to achieve the same goal. Notice that in the target string in this function the placeholders are all curly braces. Also, the floating-point values are printed in their shortest form in the `format` output (`5.1` instead of the `5.100000` that `%f` produces).

In [0]:
print("{} says that the numbers {} and {} sum to {}".format("Bob", 3, 5.1, 3 + 5.1))

When the values you want to print are saved as variables, there is also an easier way to do the same thing by typing a lowercase `f` before the quotation marks and putting the variable names in the curly braces.

In [0]:
name = "Bob"
value1 = 3
value2 = 5.1
print(f"{name} says that the numbers {value1} and {value2} sum to {value1 + value2}")
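To compare the three formatting approaches side by side, here is a minimal sketch that builds the same message each way and stores the results in variables. The variable names (`who`, `percent_style`, and so on) are illustrative only.

```python
who = "Bob"
count = 3
amount = 5.1

# The % operator with %f pads floats to six decimal places
percent_style = "%s: %d and %f" % (who, count, amount)

# format() and f-strings print floats in their shortest form
format_style = "{}: {} and {}".format(who, count, amount)
f_style = f"{who}: {count} and {amount}"

print(percent_style)  # Bob: 3 and 5.100000
print(format_style)   # Bob: 3 and 5.1
print(f_style)        # Bob: 3 and 5.1
```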

#### Printing Practice

In [0]:

#@markdown Run this cell before starting the exercise.
#@markdown It initializes the values of several hidden variables

name = "Alex"
fav_food = "chocolate"

We have hidden the values of variables `name` and `fav_food`. Use a print statement to find out their values.

In [0]:
# Use a print statement to show the values of the hidden variables 
# "name" and "fav_food"
print("My name is ____. I love eating ____.")

## Basic Datatypes

Even the simplest computer program performs some action on some type of data. The data that a program processes can be arbitrarily complex, but even the most complex data structures are built out of a few foundational data types. This section will introduce you to some of the more common data types that you'll encounter.

### Integers

How hard can numbers be? 1, 2, 3, 4, and so on.

To some extent numbers are that simple. Python is certainly capable of working with whole numbers (also known as integers). Take for instance the code example below that adds two numbers.

In [0]:
42 + 8

Python can do all sorts of common math operations on numbers as you can see below.

Subtraction

In [0]:
4 - 2

Multiplication

In [0]:
2 * 3

Raise to the power of an exponent

In [0]:
2 ** 3

Division is a bit more complicated. In Python 3 the `/` operator always produces what is called a floating-point number, even when the numbers divide evenly, so `13 / 5` gives `2.6`. We will talk about floating-point numbers in a bit, but for now we can use `//` when dividing, also known as "floor division." It rounds the result down to a whole number, discarding any fractional part. To find the remainder, we can use the modulo (`%`) operator.

In [0]:
quotient = 14 // 4
remainder = 14 % 4

print(f"Quotient: {quotient}, Remainder: {remainder}")
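One way to check that floor division and modulo fit together is to rebuild the original number from the quotient and remainder. This is only a sketch, and the names `dividend`, `divisor`, and `rebuilt` are made up for the example.

```python
dividend = 14
divisor = 4

quotient = dividend // divisor   # 3
remainder = dividend % divisor   # 2

# Multiplying back and adding the remainder recovers the dividend
rebuilt = quotient * divisor + remainder
print(rebuilt == dividend)  # True
```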

Mathematical operations can be arbitrarily combined.

In [0]:
7 ** 7 + 7 * 7 - 7 * 7 // 7

The code above can be a bit difficult to read. Which numbers get processed first?

Python enforces an order of precedence where operations like taking the exponent come before multiplication and division, which come before addition and subtraction. There are actually more operators than what we've seen so far and the order can be tricky to remember, so when in doubt you can wrap parentheses around operations to make things clearer.

In [0]:
(7 ** 7) + (7 * 7) - ((7 * 7) // 7)

You can change the precedence by wrapping different expressions in parentheses. Run the code snippets above and below and notice they have different results despite having the same numbers and operators.

In [0]:
(7 ** (7 + 7)) * 7 - (7 * (7 // 7))

#### Integers Practice

**Four Fours Problem**

Don't spend more than 15 minutes on this problem but give it a few tries.

Change the operations and add parentheses as necessary to try to come up with as many numbers between 1 and 42 as you can.

In [0]:
4 + 4 + 4 + 4



---

### Floating-point Numbers

Why are there multiple ways to represent numbers in Python?

Rest assured that this is not the fault of Python or any other programming language. To understand this you have to remember that at the most basic level computers only recognize whether an electrical signal is above or below some threshold. You can think of it like a light switch that can be in one of two states. Even though a switch might move across many small positions to get from one state to another, all that matters in the end is whether the lights are on or off.

Computer scientists had to figure out some way to take this binary signal and turn it into numbers that you and I would recognize. In the end a few schemes won out. Those schemes clustered into two primary buckets: schemes that didn't lose data but were limited to whole numbers and schemes that could represent arbitrary decimal numbers, but might have some rounding error.

Each type of number has a place in the world of computing that it shines. In some languages you have to go through great pains (we might be exaggerating here) to move from one type of number to another. In Python the mixing of integers and floating-point numbers is fluid. This is great most of the time, but can be problematic at times. It is important to be aware, especially when doing division, if you are working with floating-point numbers or integers.

Under the hood, floating-point numbers work much like scientific notation, which generally takes the form of a small number with a decimal part multiplied by 10 raised to some exponent. (The computer stores the number in base 2 rather than base 10, but the idea is the same.) For example, $1.23 \times 10^2$ could be written as a floating-point number in Python as `1.23e2`, `12.3E1`, `123.0`, etc. The `E` or `e` stands for "times 10 raised to the power of" whatever number follows it. This notation is useful for very large and very small numbers, but it is not necessary for writing floating-point numbers in general.
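As a quick sketch of the `e` notation, the cell below writes the same value three ways and also shows one very large and one very small number. The large and small values are arbitrary examples.

```python
# Three spellings of the same floating-point number
print(1.23e2)                      # 123.0
print(1.23e2 == 12.3E1 == 123.0)   # True

# The notation is most useful at the extremes
print(6.02e23)  # 6.02e+23
print(5e-3)     # 0.005
```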

Earlier we looked at floor division. Let's now take a look at regular division.

In [0]:
50 / 10

Here we can see that there is a decimal point (`.`) in the output. This is generally how we'll know that we're working with floating-point numbers. 

You can force operations to be floating-point operations by including one floating-point number in the expression.

In [0]:
2.0 + 3
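If you are ever unsure which kind of number an expression produced, Python's built-in `type` function will tell you. `type` is not covered elsewhere in this lesson, so treat this as an optional sketch; the variable names are illustrative.

```python
true_division = 50 / 10    # / always gives a float
floor_division = 50 // 10  # // on two integers gives an integer
mixed = 2.0 + 3            # one float operand makes the result a float

print(true_division, type(true_division))    # 5.0 <class 'float'>
print(floor_division, type(floor_division))  # 5 <class 'int'>
print(mixed, type(mixed))                    # 5.0 <class 'float'>
```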

First, let's do regular division and make sure that we get a floating-point number with decimal precision:

In [0]:
322.231 / 0.03

`10741.033333333333` definitely has a decimal portion. Now apply floor division:

In [0]:
322.231 // 0.03

Note that the resultant `10741.0` loses its fractional part, but remains a floating-point number.

As we showed earlier, we can use the **modulus** operator to find the remainder:

In [0]:
14 % 4

In the code sample above `14 % 4` returned `2`, which is the remainder of `14 // 4`.

Does the **modulus** operator work with floating-point numbers? Yes, it does:

In [0]:
14.2 % 6.0

But notice that the result of modulus is `2.1999999999999993`. We expect it to be `2.2`. This is an example where the underlying floating-point representation of the data can lose some precision. (I'd prefer to keep this away from my bank account.)
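Because of this, comparing floating-point results with `==` can be surprising. A common workaround, sketched below, is to check that two values are within a small tolerance of each other, or to round before displaying. The tolerance `1e-9` is an arbitrary choice for this example, and `abs` and `round` are Python built-ins that this lesson does not otherwise cover.

```python
result = 14.2 % 6.0

# Exact comparison fails because of the tiny rounding error
print(result == 2.2)             # False

# Comparing within a small tolerance works
print(abs(result - 2.2) < 1e-9)  # True

# Rounding is handy when displaying the value
print(round(result, 2))          # 2.2
```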

Python has a very useful package with more advanced math tools that is built in and easily added to any program. All you need to do is add the following line to your code.

In [0]:
import math

This package gives us access to a bunch of useful things like sines, cosines, and mathematical constants like $\pi$ and $e$. To use them, we just type `math.` followed by the specific function we want to use. In Colab, if you've already imported the library, typing `math.` should show a list of available functions.

In [0]:
math.pi

Here, Pi ($\pi$) is represented as a floating-point number, and as we saw before, it is not exactly the correct value. The real value of $\pi$ is irrational, so its representation as a decimal goes on forever, but for the computer to use it, it has to be rounded. Similarly to how we got a rounding error earlier, by virtue of the way the value is stored in bits on the computer, `math.pi` is about as close to the real value of $\pi$ as `14.2 % 6.0` is to `2.2`.
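Here is a brief sketch of a few other `math` functions in use. The last line ties back to the rounding discussion: the sine of the true $\pi$ is exactly 0, but because `math.pi` is rounded the computed result is a tiny nonzero number.

```python
import math

print(math.sqrt(16))      # 4.0
print(math.cos(0))        # 1.0
print(math.floor(2.7))    # 2

# Very close to zero, but not exactly zero
print(math.sin(math.pi))
```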

---


While computers cannot generate truly random numbers, computer scientists have come up with some very clever ways to get pseudo-random numbers, which aren't perfectly random, but are usually close enough that it won't matter for whatever purpose we might need. In Python, it is very easy to get a variety of pseudo-random values using the `random` library. Like the math library, we can just import it into our code.

In [0]:
import random

As we saw with the math library, any time we want to use it we have to write the library name followed by a dot and then the specific function we want to use. To get a floating-point value uniformly chosen from between 0 and 10, we can type the following command:

In [0]:
random.uniform(0, 10)

Alternatively, we can get a random value between 0 and 1 using `random.random()` and simply multiply it by 10 to achieve basically the same thing. There are cases where one method might be preferable over the other, so it is useful to have both.

In [0]:
random.random() * 10
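Since the values change on every run, one way to convince yourself that both approaches stay in range is to test the results with comparisons. The sketch below also calls `random.seed`, which makes the pseudo-random sequence repeatable; the seed value `42` is an arbitrary choice for illustration.

```python
import random

random.seed(42)  # optional: makes the "random" values repeat on each run

from_uniform = random.uniform(0, 10)
from_scaling = random.random() * 10

# Both values land between 0 and 10
print(0 <= from_uniform <= 10)  # True
print(0 <= from_scaling < 10)   # True
```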

There are a lot more kinds of random numbers that we can use, but we'll talk about that more a bit later.



---

That wraps it up for our introduction to numbers in Python. This tutorial only scratched the surface, but hopefully gave you a gentle introduction to how Python represents numbers and how to perform mathematical operations on those numbers.
 
Don't be intimidated by the differences and interactions between integers and floating-point numbers. Most of the time you won't have to be concerned about them. However, if you see surprising results when doing math in Python, be sure to double-check the type of numbers your expressions are operating on.

### Booleans

Boolean values are another core data type in Python. Bools, as they are sometimes called, are found throughout most computer programs. They are the bit of data that directs the computer to make decisions. Boolean values can be represented simply by **True** and **False**.

In [0]:
True

In [0]:
False

Don't let the seemingly limited applications of **True** and **False** lead you to discount the power and prevalence of these data values. For one, they can be combined with logical operators such as **and**:


In [0]:
True and True

In [0]:
True and False

In [0]:
False and True

In [0]:
False and False

There is also the **or** operator that works with boolean values:

In [0]:
True or True

In [0]:
True or False

In [0]:
False or True

In [0]:
False or False

Notice that the presence of **False** in any **and** operation turns the expression **False**. Likewise, the presence of **True** in any **or** operation turns the expression **True**.

These expressions of *truthiness* can be expanded beyond two operands:

In [0]:
False and True or True and False or True

**and** has higher precedence than **or**, so the expression above is evaluated as `(False and True) or (True and False) or True`. That grouping can be changed with parentheses just like with numbers:

In [0]:
False and (True or True and False or True)
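A smaller example makes the precedence easier to see. In Python, **and** is evaluated before **or**, much like multiplication is evaluated before addition. The variable names below are illustrative.

```python
# and is evaluated first: True or (True and False)
without_parentheses = True or True and False

# Parentheses force the or to be evaluated first
with_parentheses = (True or True) and False

print(without_parentheses)  # True
print(with_parentheses)     # False
```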

Truthiness can be flipped with the **not** operator:

In [0]:
not True

In [0]:
not False

You will only work directly with **True** and  **False** on occasion. Most of the time these values will be returned from other expressions. Take for instance the greater than, less than, greater than or equal to, and less than or equal to expressions below:

In [0]:
2 > 1

In [0]:
2 < 1

In [0]:
1 >= 1

In [0]:
2 <= 1

Then there are equality and inequality checks:

In [0]:
1 == 2

In [0]:
1 != 2

Why is "equals" **==** instead of just a single equal sign? The single equal sign is already taken: as we saw in the Variables section, `=` assigns a value to a variable in Python (and in most languages).

Of course you can combine the logical **and**, **or**, and **not** expressions and the **>**, **>=**, **<**, **<=**, **==**, **!=** expressions as needed. Parentheses change the order of operations as expected.

In [0]:
(1 < 2) and (3 == 3) or ((4 > 1) and (not 1 < 2))


We can also use the **>**, **>=**, **<**, **<=**, **==**, **!=** expressions with two strings to determine whether they are in alphabetical order.

In [0]:
'apple' < 'banana'

Of note is that capital letters are sorted before lowercase e.g. `'A' < 'a'`.
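The sketch below shows how capitalization can affect a comparison, and one way to compare while ignoring case by lowercasing both sides first. It uses the string method `lower`, the counterpart of the `upper` method shown earlier; the example words are arbitrary.

```python
capital_first = 'A' < 'a'
mixed_case = 'Zebra' < 'apple'                     # capital Z sorts before lowercase a
ignoring_case = 'Zebra'.lower() < 'apple'.lower()  # compares 'zebra' with 'apple'

print(capital_first)  # True
print(mixed_case)     # True
print(ignoring_case)  # False
```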

#### Booleans Practice

In [0]:
#@markdown We've hidden the values of variables `name`, `fav_number`, 
#@markdown and `fav_animal`. Use boolean expressions to find the values of
#@markdown the number and the two strings.

#@markdown Run this cell to initialize the hidden 
#@markdown variables

name = "Taylor"
fav_number = 42
fav_animal = "koala"

In [0]:
name == 'AA'

In [0]:
fav_number > 100

In [0]:
fav_animal >= 'zz'

## Conditional Decisions

One of the most common tasks that you'll do as you program is ask some question and, depending on the answer, perform some action. The most common way to do this in Python is with the **`if`** statement. The `if` statement looks at a boolean value and if that value is `True`, the statement runs some code.

Let's look at an example.

In [0]:
if 1 > 3:
  print("One is greater than three")

if 1 < 3:
  print("One is less than three")

Only the second print statement was called. The condition `1 < 3` is true so the code under the second `if` executed.
 
Also notice that the two print statements are indented beneath each if statement. This isn't by accident. Python creates "blocks" of code using the code's indentation level. This indentation can be done with tabs or with spaces, but it must be consistent throughout your code file.
 
block 1
> block 1.1
>> block 1.1.1
 
>> block 1.1.1
 
> block 1.1
 
 
The code below shows blocks in action.

In [0]:
if False:
  print("This shouldn't print")
print("But this always will")
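Blocks can also be nested, with each deeper level indented further. In the sketch below the inner `if` is only reached when the outer condition is true. The variables `temperature`, `is_raining`, and `advice` are made up for this example.

```python
temperature = 25
is_raining = False

advice = "stay in"
if temperature > 20:
  # This block runs only when it is warm
  advice = "go outside"
  if is_raining:
    # This nested block runs only when it is warm and raining
    advice = "go outside with an umbrella"

print(advice)  # go outside
```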

You will run into situations where you need to check a condition and if it is true do something and if it is not true do something else. You could write code to check both conditions or you can use the `else` condition.

In [0]:
if 1 > 3:
  print("Math is broken as we know it")
else:
  print("Everything looks normal")

You might also want to check several conditions in order and execute only the code for the first condition that passes. For that you can use the `elif` clause.

For example, these would be useful if we wanted to do a simple rock, paper, scissors game.

In [0]:
# Choose a random option from the list for the computer player
import random
computer = random.choice(["rock", "paper", "scissors"])

my_choice = "paper" # Feel free to change this

print(f"You chose {my_choice}!")
print(f"The computer chose {computer}!")

if my_choice == computer:
    print("Draw! Go again!")

elif my_choice == "rock" and computer == "paper":
    print("The computer wins. Try again?")

elif my_choice == "rock" and computer == "scissors":
    print("You smashed the computer's scissors!")

elif my_choice == "paper" and computer == "rock":
    print("You wrapped up the computer's rock!")

elif my_choice == "paper" and computer == "scissors":
    print("The computer wins. Try again?")

elif my_choice == "scissors" and computer == "rock":
    print("The computer wins. Try again?")

elif my_choice == "scissors" and computer == "paper":
    print("You sliced up the computer's paper!")
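An `if`/`elif` chain stops at the first condition that is true, and a final `else` can catch everything that is left. The grading thresholds in this sketch are invented purely for illustration.

```python
score = 85

if score >= 90:
    grade = "A"
elif score >= 80:
    # 85 also passes the next test, but only the first match runs
    grade = "B"
elif score >= 70:
    grade = "C"
else:
    grade = "F"

print(grade)  # B
```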

## Basic Data-Structures

### Lists

So far the data types we've seen can be thought of as singular entities: strings, integers, floating-point numbers, and booleans.

Working with individual pieces of data can be useful, but many times you'll find yourself needing to work with multiple data elements. There are several options for organizing a collection of data into a data-structure. One option is to use a list.

A list is just a sequence of other data types.

In [0]:
[9, 8, 7, 6, 5]

You can mix different data types.

In [0]:
[True, "Shark!", 3.4, False, 6]

You can assign a list to a variable.

In [0]:
my_list = [True, "Shark!", 3.4, False, 6]
my_list

You can also index a list and take slices from it just like you can from a string. Conceptually you can think of a string as a sequence of characters, similar to a list.

In [0]:
my_list[3:]
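For comparison with the string examples earlier, here is a short sketch of single-item indexing, a two-ended slice, and `len` applied to a list. The list is redefined under the illustrative name `sample_list` so the cell is self-contained.

```python
sample_list = [True, "Shark!", 3.4, False, 6]

first_item = sample_list[0]      # indexing starts at 0
middle_items = sample_list[1:3]  # start is inclusive, end is exclusive
item_count = len(sample_list)    # len works on lists too

print(first_item)    # True
print(middle_items)  # ['Shark!', 3.4]
print(item_count)    # 5
```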

Indexing can even be used to selectively replace items in a list.

In [0]:
my_list[1] = "Wolf!"
my_list

Lists have other interesting features. For example you can sort the list in place.

In [0]:
number_list = [4, 2, 7, 9, 3, 5, 3, 2, 9]
number_list.sort()
number_list

You can also have lists within lists.

In [0]:
["List 1", ["List 2", 3, 4], False]

Lists-of-lists come in really handy, especially in data science since much of the data that you'll work with will be in a tabular format. In these cases the internal lists are typically the same size. For example, you might have a list of data points about a customer, such as their age, income, and the amount they spent at your company last month.

In [0]:
customers = [
    ["C0", 42, 56000, 12.30],
    ["C1", 19, 15000, 43.21],
    ["C2", 35, 123000, 45.67],
]
customers

How do you get data out of nested lists? You can stack indexes. In the example below we pull out the income of our second customer.

In [0]:
customers[1][2]
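Stacked indexes are just two lookups in a row. The sketch below does the same thing in two steps, first pulling out the inner list and then indexing into it. The names `second_customer` and `second_income` are illustrative.

```python
customers = [
    ["C0", 42, 56000, 12.30],
    ["C1", 19, 15000, 43.21],
    ["C2", 35, 123000, 45.67],
]

second_customer = customers[1]      # the whole inner list for customer C1
second_income = second_customer[2]  # the third field of that list

print(second_customer)                   # ['C1', 19, 15000, 43.21]
print(second_income == customers[1][2])  # True
```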

We will explore lists more deeply and other data structures in future tutorials, but lists are a key tool of machine learning work.

### Tuples

Tuples look and feel a whole lot like lists in Python. They can contain a sequence of data including lists and other tuples. The primary difference between lists and tuples is that you can't modify a tuple like you can a list.

Before we get too deep into immutability let's take a look at a tuple.

In [0]:
my_tuple = (1, "dog", 3.987, False, ["a", "list", "inside", 1.0, "tuples"])
my_tuple

Looks pretty list-like doesn't it? The visible difference is that we create a tuple with parentheses instead of square brackets.

You can index a tuple and take a slice from a tuple just like you can from a list. You just can't change a tuple.
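To see what "can't change" means in practice, the sketch below tries to replace an item in a tuple. It uses `try` and `except`, which this lesson does not cover, only to catch the error so the cell keeps running. The names `point` and `changed` are made up for the example.

```python
point = (3, 4)

print(point[0])   # indexing works: 3
print(point[:1])  # slicing works: (3,)

try:
  point[0] = 10   # assigning to an index raises a TypeError
  changed = True
except TypeError as error:
  changed = False
  print(f"Tuples can't be modified: {error}")

print(point)  # still (3, 4)
```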

This is useful because Python can perform some optimizations when it knows a data structure can't change. It can also do a few tricks. We'll take a peek at one of the tricks now and learn more later in this tutorial.

The trick that we are going to learn is variable swapping. In most languages you need three variables to swap the value of two variables. Here is an example.

In [0]:
var1 = "Python"
var2 = "Perl"

tmp = var1
var1 = var2
var2 = tmp

var1, var2

We had to introduce a `tmp` variable to perform the swap and needed three lines of easy to mess-up code. With tuples we can do this more cleanly.

**Note:** you might have noticed that when we put `var1, var2` at the bottom of the last code section a tuple was printed out. Separating values with commas automatically creates a tuple in Python, even without parentheses.

In [0]:
var1 = "Perl"
var2 = "Python"

(var1, var2) = (var2, var1)

var1, var2

As you can see, swapping variables using tuples is much easier to read and less error-prone than having to use three variables.
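The swap works because Python builds a tuple from the comma-separated values on the right and then unpacks it into the names on the left. Here is a minimal sketch of those two steps on their own; the variable names are illustrative.

```python
# Commas alone are enough to build a tuple
point = 3, 4

# A tuple can be unpacked into separate variables
x_coord, y_coord = point

# A trailing comma makes a one-element tuple
single = "solo",

print(point)             # (3, 4)
print(x_coord, y_coord)  # 3 4
print(single)            # ('solo',)
```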

---

You will encounter tuples throughout your Python programs. Sometimes you won't even realize that you are working with a tuple since they are so integrated with the language.

As we continue our journey into Python you'll see a few more places where tuples are vital, but for now let us move on into one of the more powerful data structures in Python: dictionaries.

### Dictionaries

Dictionaries are an invaluable data structure that you'll find yourself using often in Python programming. If you have experience with other programming languages you might have encountered a similar data structure with a different name such as hash, map, or hashmap.

Dictionaries contain key/value pairs. You can look up keys in a dictionary in roughly constant time on average, regardless of the number of elements in the dictionary.

Let's take a look at some code that creates a dictionary and accesses a value in the dictionary by key.

In [0]:
my_dictionary = {
    "pet": "cat",
    "car": "Tesla",
    "lodging": "apartment",
}

my_dictionary["pet"]

Notice that we used the *indexing* notation that should be familiar to you from strings, lists, and tuples. Instead of a numeric index, the lookup is done by key.

A key can be any immutable (more precisely, hashable) data value. Keys can be numbers, strings, and even tuples, as long as the tuple itself contains only immutable values. You can't use a dictionary or list as a key, but you can use them as values.

In [0]:
the_dictionary = {
    57: "the sneaky fox",
    "many things": [1, "little list", " of ", 5.0, "things"],
    (8, "ocho"): "Hi there",
    "KEY_ONE": {
        "a": "dictionary",
        "as a": "value"
    },
}

the_dictionary[(8, "ocho")]

The dictionary above is much more unstructured than dictionaries that you'll typically encounter in practice, but it illustrates the broad range of key types and value types that a dictionary can store.
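The restriction on keys can be seen directly. In this sketch a tuple is accepted as a key while a list is rejected with a `TypeError`. The `try` and `except` keywords are not covered in this lesson and are used here only to catch the error; the name `lookup` is made up for the example.

```python
lookup = {}

# A tuple of immutable values is a valid key
lookup[(1, 2)] = "tuple keys work"

try:
  # A list is mutable, so Python refuses to use it as a key
  lookup[[1, 2]] = "list keys do not"
except TypeError as error:
  print(f"Can't use a list as a key: {error}")

print(lookup)  # {(1, 2): 'tuple keys work'}
```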

You can also index many levels down in a dictionary. For example in `the_dictionary` above there is a sub-dictionary at the `KEY_ONE` key. Let's pull something out of the sub-dictionary.

In [0]:
the_dictionary["key_one".upper()]['as a']

You can also index into sub-lists.

In [0]:
the_dictionary["many things"][1]

Dictionaries, lists, tuples, and other data structures can nest as much as you want or need to nest them.

Dictionaries store their values by key. Only one value can exist per key, so if you write a new value to a key, the old value goes away.

In [0]:
my_dictionary = {
    "k1": "name",
    "k2": "age"
}

my_dictionary["k1"] = "surname"

my_dictionary

You can add entries to a dictionary by assigning them to a key.

In [0]:
my_dictionary["k3"] = "rank"

my_dictionary

And you can remove entries from a dictionary using the **`del`** statement.

In [0]:
del my_dictionary["k2"]

my_dictionary

To see if a key exists in a dictionary use the **`in`** operator. Notice that it returns a boolean value.

In [0]:
"k2" in my_dictionary

It is advisable to check if a key exists in a dictionary before trying to index that key. If you try to access a key that doesn't exist using square brackets your program will throw an exception and possibly crash.
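Combining `in` with an `if` statement gives a simple way to guard a lookup. The dictionary `inventory` and its contents are invented for this sketch.

```python
inventory = {"apples": 3}

# Only index the key when it is known to exist
if "pears" in inventory:
  pears = inventory["pears"]
else:
  pears = 0

print(pears)  # 0
```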

There is also a safer **`get`** method on the dictionary object that you can access using the dot notation referenced earlier. You provide `get` with a key and a default value to return if the key isn't present.

In [0]:
my_dictionary.get("k2", "I can't find it")

Dictionaries are a powerful data structure with many uses. We've only mentioned the most common things to do with a dictionary. For more information check out the [official Python dictionary documentation](https://docs.python.org/3/tutorial/datastructures.html#dictionaries).

---

We've learned about the most fundamental data types and data structures in Python: strings, numbers, booleans, lists, tuples, and dictionaries. Along the way we learned how to store data in variables and how to change data in variables, dictionaries, and lists. Each of these data types has more functionality than we have gone over in this tutorial, so please do take some time to see what more we can do with these data types in Python.
 
There are also many data types that we did not cover. Some are low-level data types dealing with bits and bytes on your computer. Some are higher-level, using the core data structures that we just studied to build rich representations of queues, dates, times, and more.
 
As we encounter the need for other types of data in our study of machine learning and data science we will introduce and explain them. But now, it is time to move on and learn how to make decisions about and with our data and control the flow of program execution.

## For Loops

We've seen lists, tuples, and dictionaries, but we haven't seen how to do something with everything in them in one operation. That is where `for` loops come in. `for` loops are a powerful tool that lets us look at every item in a data structure in order, and perhaps do some operations on it. Let's look at an example.

In [0]:
my_list = ['a', 'b', 'c']

for item in my_list:
  print(item)

As you can see, the `for` loop executes `print` three times, once for each item in the list.

The `for` loop works for tuples too.

In [0]:
my_tuple = (5, 3, 1, -1, -3, -5)
for x in my_tuple:
  print(x)

Dictionaries are a little more interesting. By default the loop works in terms of keys.

In [0]:
my_dictionary = {
    "first_name": "Jane",
    "last_name": "Doe",
    "title": "Dr."
}

for k in my_dictionary:
  print(f"{k}: {my_dictionary[k]}")

If only the values are interesting to you, you can ask the dictionary to return its `values`.

In [0]:
for v in my_dictionary.values():
  print(v)

If you want both keys and values without a lookup, you can ask the dictionary for its `items`.

In [0]:
for (k, v) in my_dictionary.items():
  print(f"{k}: {v}")

You can also loop over a string character by character. Each iteration gives you a one-character string containing the current character.

In [0]:
for c in "this string":
  print(c)

If you need to iterate over a list or tuple and also need the index of each item, you can use the `range` function along with the `len` function to get the indices of the list or tuple.

In [0]:
for i in range(len(my_list)):
  print(f"{i}: {my_list[i]}")
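
Python's built-in `enumerate` function is another way to get both the index and the item, without indexing into the list yourself. The sketch below uses a made-up `letters` list and collects the pairs into `pairs` only so the result is easy to inspect.

```python
letters = ['a', 'b', 'c']  # hypothetical example data

pairs = []
# enumerate yields (index, item) tuples, which we unpack into i and item.
for i, item in enumerate(letters):
    print(f"{i}: {item}")
    pairs.append((i, item))
```

Which form to use is mostly a matter of taste; `enumerate` tends to read more clearly when you need the item itself on every iteration.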

`range` is a function that returns a sequence of numbers. It can take one argument, two arguments, or three arguments.

When it takes one argument it considers that argument to be the end of the range (exclusive).

In [0]:
for i in range(5):
    print(f"{i}",end=" ")

If there are two arguments, they are considered to be the start (inclusive) and end (exclusive) of the sequence.

In [0]:
for i in range(6, 12):
    print(f"{i}",end=" ")

If there are three arguments, they are considered to be the start (inclusive), end (exclusive), and step of the sequence.

In [0]:
for i in range(20, 100, 10):
    print(f"{i}",end=" ")

Ranges are lazily evaluated so even very large ranges will not occupy a significant amount of memory.
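
One way to see all three forms at a glance is to convert each range to a list. The sketch below also shows a negative step and a very large range whose length can be read without creating a billion numbers. The variable names are chosen just for this example.

```python
# Converting a range to a list makes its contents visible.
one_arg = list(range(5))               # end only
two_args = list(range(6, 12))          # start and end
three_args = list(range(20, 100, 10))  # start, end, and step
countdown = list(range(10, 0, -2))     # a negative step counts down

print(one_arg)
print(two_args)
print(three_args)
print(countdown)

# A large range is cheap to create because its numbers are produced on demand.
huge = range(1_000_000_000)
print(len(huge))
print(huge[-1])
```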

---

`for` loops can also be useful for building a list. For example, if we wanted to generate a list of random numbers, we could use the `random` library in a `for` loop.

In [0]:
import random

random_numbers = []
for i in range(10):
    random_numbers += [random.randint(0,10)]
print(random_numbers)

If we want floating-point values instead, we could change the above code slightly.

In [0]:
import random

random_numbers = []
for i in range(10):
    random_numbers += [random.random()*10]
print(random_numbers)
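
Python also has a more compact way to build a list from a loop, called a list comprehension. The sketch below is one possible equivalent of the integer example above. The call to `random.seed(0)` is an assumption added only so that repeated runs produce the same numbers; it is not required.

```python
import random

random.seed(0)  # assumption: fixed seed so the run is repeatable

# The expression before `for` is evaluated once per loop iteration,
# and the results are collected into a new list.
random_integers = [random.randint(0, 10) for _ in range(10)]

print(random_integers)
```

The underscore is a conventional name for a loop variable whose value is never used.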

## Functions

Functions are a way to organize and re-use your code. Functions allow you to take a block of your code, give it a name, and then call that code by name as many times as you need to.

Functions are defined by the `def` statement.

In [0]:
def my_function():
  print("I wrote a function")

my_function()
my_function()
my_function()

Standard function definitions always begin with the `def` keyword followed by the name of the function. Function naming follows the same rules as variable naming, which we covered earlier in this tutorial.

After the name of the function come opening and closing parentheses. These parentheses can have more code between them, but we'll get to that in a bit.

Finally there is a trailing colon that signals that the following code will be part of the function.

The function's code is indented under the function definition.

Let's revisit those opening and closing parentheses. They hold the names of variables that are considered *arguments* to the function. Function arguments, also called parameters, are used to provide the function with data.

In [0]:
def my_function(adj1, color, animal1, verb, animal2):
  print(f"The {adj1} {color} {animal1} {verb} over the lazy {animal2}")

my_function("quick", "brown", "fox", "jumped", "dog")

my_function("smelly", "fuchsia", "chipmunk", "cartwheeled", "koala")

Functions can also return data. To illustrate this we'll write a function that doubles a number.

In [0]:
def doubler(n):
  return n * 2

print(doubler(42))

Functions can return multiple values as a tuple.

In [0]:
def min_max(numbers):
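  # Start from the first element so lists that are all positive or all
  # negative still give the right answer. The names avoid shadowing the
  # built-in min and max functions.
  lowest = numbers[0]
  highest = numbers[0]
  for n in numbers:
    if n > highest:
      highest = n
    if n < lowest:
      lowest = n
  return lowest, highest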

print(min_max([-6, 78, -102, 45, 5.98, 3.1243]))
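
When a function returns a tuple, you can unpack it into separate variables in a single assignment. The sketch below uses a hypothetical helper named `low_high` that leans on Python's built-in `min` and `max` functions; it is only an illustration of unpacking, not a replacement for the function above.

```python
def low_high(numbers):
  # Returning two values separated by a comma produces a tuple.
  return min(numbers), max(numbers)

# Unpack the returned tuple into two variables.
lowest, highest = low_high([-6, 78, -102, 45, 5.98, 3.1243])

print(lowest)
print(highest)
```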

It is important to note how data is passed to a function. The parameter is a new name that refers to the same data you passed in. Assigning a new value to the parameter inside the function only changes what that local name refers to; it never changes the caller's variable. Numbers, booleans, and strings can't be changed in place, so the function can't directly modify the data you passed in. For lists and dictionaries it is a little more complicated. The function gets a copy of the location/address of the data structure. While the function can't change that address for the caller, it can modify the data structure stored there.

Let's see some examples to solidify the point. In this first example we can see that the number changer can't make any changes to `my_number`.

In [0]:
def number_changer(n):
  n = 42

my_number = 24
number_changer(my_number)
print(my_number)

The same is true for booleans. The function can't modify `my_bool`.

In [0]:
def boolean_changer(b):
  b = False

my_bool = True
boolean_changer(my_bool)
print(my_bool)

We can see the same for strings.

In [0]:
def string_changer(s):
  s = "Got you!"

my_string = "Can't get me"
string_changer(my_string)
print(my_string)

Lists can be modified though.

In [0]:
def list_changer(list_parameter):
  list_parameter[0] = "pwned"

my_list = [1, 2, 3]
list_changer(my_list)
print(my_list)

However, if the function assigns a completely new list to its parameter, that change doesn't stick. The caller's list is unaffected.

In [0]:
def list_changer(list_parameter):
  list_parameter = ["this is my list now"]

my_list = [1, 2, 3]
list_changer(my_list)
print(my_list)

Dictionaries interact with functions exactly like lists do.

In [0]:
def dictionary_changer(d):
  d["my_entry"] = 100

my_dictionary = {"a": 100, "b": "bee"}
dictionary_changer(my_dictionary)
print(my_dictionary)

In [0]:
def dictionary_changer(d):
  d = {"this is": "my dictionary"}

my_dictionary = {"a": 100, "b": "bee"}
dictionary_changer(my_dictionary)
print(my_dictionary)
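
If you want to keep a function from changing your list, one option is to pass it a copy. The sketch below is a minimal illustration with a hypothetical function `first_item_changer`. Note that `list.copy()` makes a shallow copy, so lists nested inside the list would still be shared.

```python
def first_item_changer(items):
  # Modifies whichever list object was passed in.
  items[0] = "changed"

original = [1, 2, 3]

# Passing a copy: the function modifies the copy, not the original.
first_item_changer(original.copy())
after_copy_call = list(original)

# Passing the list itself: the original is modified.
first_item_changer(original)

print(after_copy_call)
print(original)
```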

So how do you get a function to modify a number, bool, or string? You simply assign the return value of the function to the original variable.

In [0]:
def number_changer(n):
  return n * n

def boolean_changer(b):
  return not b

def string_changer(s):
  return s.upper()

my_number = 42
my_bool = False
my_string = "Python"

my_number = number_changer(my_number)
my_bool = boolean_changer(my_bool)
my_string = string_changer(my_string)

print(my_number)
print(my_bool)
print(my_string)

There are many more fun things we can do with functions, but this is enough to get us started on our exploration into machine learning.

### Functions Practice

In [0]:
def rock_paper_scissors(player_choice):
    # Add code here that takes in the player's choice of rock, paper, or
    # scissors and plays a game against the computer
    pass

In [0]:
rock_paper_scissors("rock")
rock_paper_scissors("paper")
rock_paper_scissors("scissors")

## While Loops

`for` loops aren't the only loops in town. There is also a `while` loop that you can use to repeat a block of code for as long as some condition remains true.

In [0]:
counter = 0
while counter < 5:
  print(counter)
  if counter == 1:
    counter += 2
  else:
    counter += 1


`while` loops can be useful in many situations, especially when you don't know for sure how many times you might need to loop.
 
You might have also noticed the `+=` operator in the example above. This is a shortcut that Python provides so that we don't have to write out `counter = counter + 1`. There are equivalents for subtraction, multiplication, division, and more.
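
As a small sketch of both ideas, the loop below counts how many times a number can be halved before it reaches 1. We don't need to know the number of iterations in advance, and the update uses `//=`, the shortcut for integer division. The starting value is arbitrary.

```python
number = 100  # hypothetical starting value
halvings = 0

while number > 1:
  number //= 2    # same as number = number // 2
  halvings += 1   # same as halvings = halvings + 1

print(halvings)
```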

### Break

There are times when you might want to exit a loop before it is complete. For this you can use the `break` statement.

In the example below, the loop only prints 5 numbers despite having a range of 1,000,000 numbers to iterate over.

In [0]:
for x in range(1000000):
  if x >= 5:
    break
  print(x)
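
A common use of `break` is to stop searching once you have found what you are looking for. The sketch below looks for the first even number in a made-up list and counts how many items were examined before the loop exited.

```python
numbers = [3, 8, 15, 22, 31]  # hypothetical example data

first_even = None
checked = 0
for n in numbers:
  checked += 1
  if n % 2 == 0:
    first_even = n
    break  # no need to look at the remaining items

print(first_even)
print(checked)
```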

### Continue

`continue` is similar to `break`, but instead of exiting the loop entirely it just skips the current iteration.

Let's see this in action with a loop that only prints out even numbers.

In [0]:
for x in range(10):
  if x % 2 > 0:
    continue
  print(x)
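
`continue` is also handy for skipping values you don't want to process. The sketch below adds up a made-up list of readings while ignoring the negative ones.

```python
readings = [4, -1, 7, -3, 2]  # hypothetical example data

total = 0
for r in readings:
  if r < 0:
    continue  # skip negative readings and move on to the next item
  total += r

print(total)
```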

We now have a pretty good toolset for creating programs in Python. We know about data, how to make decisions about it, and how to operate on it in a loop.

However, if we were to write a large program it would quickly become difficult to understand if all of the code were jammed together in one big block. Organizing code into functions is how we combat this.

## Comments

So far we have only explored code that is to be consumed by the Python interpreter. We did discuss how naming variables is important to the human reader and how functions can help code be more programmer-friendly, but if you want to leave a note behind about what you did or why without changing the flow of the program, how do you do that? You use comments.

Comments are simply pieces of your code that will be skipped over when the program is running.

Python considers the hash symbol, `#`, to be the start of a comment. This symbol can be just about anywhere in a line. Anything after it on the same line won't be executed.

Let's look at an example.

In [0]:
# This is a comment used to document.
# If I need more than one line
# then I need to add more hash symbols.

print("Hello") # comments don't have to be at the start of a line

# print("This won't run") # and this code won't run because it is 'commented out'

## Pass

`pass` is a Python keyword that does nothing. It is used as a placeholder in places where code is required but hasn't been written yet. You'll see `pass` often in your exercises as a placeholder for the code you'll need to write.

In [0]:
def do_nothing_function():
  pass

do_nothing_function()

# Exercises

## Practice Problems

In case you want to revisit the practice problems, here are some links to make them easy to find.

*   [Printing](#scrollTo=ED0MmXO9Ytkl)
*   [Integers](#scrollTo=fqQQ2FVTN0__)
*   [Booleans](#scrollTo=l1kunoRTl5GQ)
*   [Functions](#scrollTo=o9rgbpgRNRpy)

Once you feel comfortable with the concepts we've covered, you can move on to the challenge problems below.

## Exercise 1

In the code block below complete the function by making it return the number cubed.

### Student Solution

In [0]:
def find_the_cube(n):
  pass # your code goes here

print(find_the_cube(5))

### Answer Key

**Solution**

In [0]:
def find_the_cube(n):
  return n*n*n

print(find_the_cube(5))

**Validation**

In [0]:
assert list(map(find_the_cube, range(11))) == [0,1,8,27,64,125,216,343,512,729,1000], "Wrong result"

"LGTM"

## Exercise 2

In the code block below complete the function by making it return the sum of the even numbers of the provided sequence (list or tuple).

### Student Solution

In [0]:
def sum_of_evens(seq):
  pass # your code goes here

print(sum_of_evens([5, 14, 6, -2, 0, 45, 66]))

### Answer Key

**Solution**

In [0]:
def sum_of_evens(seq):
  return sum([x for x in seq if x%2==0])

print(sum_of_evens([5, 14, 6, -2, 0, 45, 66]))

**Validation**

In [0]:
test_list = [5, 14, 6, -2, 0, 45, 66]
assert sum_of_evens(test_list) == 84, "Wrong Result"

"LGTM"

## Exercise 3

We've provided a helper function for you that will take a random step, returning either -1 or 1. Your job is to use this function in another function that takes a starting value `start`, the number of random steps to take on each random walk `num_steps`, and the total number of walks to make `num_trials`. The function should return a list containing the final value of each walk.

In [0]:
import random  

def random_step():
    """
    returns either -1 or 1 at random 
    """
    return random.choice([-1, 1])

### Student Solution

In [0]:
def random_walks(start, num_steps, num_trials):
    pass # your code goes here

print(random_walks(42,12,5))

### Answer Key

**Solution**

In [0]:
def random_walks(start, num_steps, num_trials):
    trials = []
    for trial in range(num_trials):
        pos = start
        for step in range(num_steps):
            pos += random_step()
        trials += [pos]
    return trials

print(random_walks(42,12,5))

**Validation**

In [0]:
walks = random_walks(42,12,5)
print(walks)

success = True

if len(walks) == 5:
    for i in walks:
        if i < 30 or 54 < i:
            success = False
else:
    success = False

assert success, "Wrong Result"

"LGTM"