In [1]:
from __future__ import print_function, unicode_literals

## Lesson Outline
* **types: integers (ints), floats, booleans (bools), strings (strs)**
* **Python is dynamically typed**
* **types and type conversion**
* **using comments**
* **lists**
* **lists and strings are very similar but they're not the same type**
* **dictionaries**
* **functions**
* **for loops**
* **list comprehensions and for loops and how to convert one into the other**
* **loading things in from a file or the internet**

#### Outcomes

After this lesson you will be able to:

* use and understand the differences between the basic types in Python
* use lists and dictionaries to store collections of primitive Python objects
* implement simple mathematical functions and apply them to collections
* use for loops and comprehensions to traverse collections
* load data in from a file and off the internet (as long as the data is in a consistent, structured format)

### primitive types, objects, and variables

Python has a variety of basic (called **primitive**) data types:

In [2]:
x = 5

The command

`x = 5`

creates an object, an int (short for integer) with the name `x`. Let's check the `type` (what kind of an object `x` is), as it's not declared explicitly in `Python`:

In [3]:
type(x)

int

Assigning an object like `5` to a variable name **is not required**, so you can check the `type` of anything that is syntactically allowed in `Python`:

In [4]:
type(5)

int

Here's another numeric type in `Python`, a `float`:

In [5]:
print(type(5.0)) # the object 5.0 doesn't have an explicit name, but it's still an object!

<type 'float'>


In [6]:
type(5.0)

float

In general, if you add a period after a number (even without trailing digits), `Python` interprets it as a `float`:

In [7]:
print(type(5.))       #see??

<type 'float'>


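Here's a `string` (called `str` in `Python`). Because this notebook runs Python 2 with `unicode_literals` imported, string literals are reported as `unicode`; in Python 3 the same call reports `str`: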

In [8]:
type('five')

unicode

`Python` also has a `Boolean` type, called a `bool`, [named after this really smart dude](https://en.wikipedia.org/wiki/George_Boole).

A `bool` can take only one of two values, either `True` or `False`:

In [9]:
type(True) # bool is short for Boolean

bool

In [11]:
is_valid = True

In [12]:
is_valid

True

### Python is a dynamically typed language

`Python` is a dynamically typed language. This means that a variable name is not tied to one type: you can rebind the same name to objects of different types, again and again, and nothing will break (no `Exception` is raised):

In [13]:
print("x is:",x)       # x is the name of an int object
print(type(x)) # check to make sure I'm not lying
x = 'dude'      #we can make x be a string now!
print("x is now:", x)
print(type(x))

x is: 5
<type 'int'>
x is now: dude
<type 'unicode'>


### types and type conversion

Objects in Python can also be converted between different types in certain cases (this conversion is often called **casting**):

In [14]:
y = "5.0"                    #y is a str
print("y is:", y,type(y))     #check
y = float(y)                 #now y is a float
print("y now is:", y,type(y)) #check

y is: 5.0 <type 'unicode'>
y now is: 5.0 <type 'float'>


Here we converted `y` from a `string` to a `float`.

Now let's try another conversion:

In [15]:
dude = "True"                        #dude is a str
print("dude is:",dude,type(dude))     #check
dude = bool(dude)                    #now dude is a Boolean
print("now dude is:",dude,type(dude)) #check again
dude = int(dude)                 #now dude is an int!
print("now dude is:",dude,type(dude)) #check again

dude is: True <type 'unicode'>
now dude is: True <type 'bool'>
now dude is: 1 <type 'int'>
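
Note that `bool("True")` came out `True` only because the string is non-empty, not because Python read the word inside it. The sketch below shows the difference; `parse_bool` is a hypothetical helper written for this illustration, not a built-in:

```python
# bool() on a string only tests whether it is empty; it does not parse the text.
print(bool("True"))    # True  (non-empty string)
print(bool("False"))   # True  (still a non-empty string!)
print(bool(""))        # False (empty string)


def parse_bool(text):
    """Hypothetical helper: map the text 'true'/'false' to a bool."""
    lowered = text.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError("not a boolean string: %r" % text)


print(parse_bool("False"))  # False
```

If you ever read `"True"`/`"False"` values out of a file, an explicit comparison like this is usually what you want.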


However, type conversions don't always work.

You can't convert a `string` whose contents are not numeric into a number (`Python` will raise, or "throw", an `Exception`):

In [16]:
bad_dude = "hello"       #bad_dude is a string
bad_dude = int(bad_dude) #bad_dude can't be cast into a number, because there is no numeric representation for "hello"

ValueError: invalid literal for int() with base 10: 'hello'

You also can't convert a `string` directly into an `int` if it should be converted into a `float` first:

In [17]:
float_string = "47.486"
float_string = int(float_string)

ValueError: invalid literal for int() with base 10: '47.486'

So you have to cast it to a `float` and then an `int`:

In [18]:
float_string = float(float_string)
print("float_string is:",float_string,type(float_string))
float_string = int(float_string)
print("float_string is:",float_string,type(float_string)) # No ValueError thrown! (This is an inline comment)

float_string is: 47.486 <type 'float'>
float_string is: 47 <type 'int'>


Why do you think `Python` would behave in this way?
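
One possible answer: `int()` refuses `"47.486"` rather than silently dropping the `.486`, so any loss of information has to be requested explicitly. The sketch below assumes you really do want to throw the fractional part away; the helper name `to_int` is made up for this example:

```python
def to_int(text):
    """Hypothetical helper: convert a numeric string to an int, truncating any fraction."""
    try:
        return int(text)          # works for strings like "47"
    except ValueError:
        return int(float(text))   # falls back for strings like "47.486"


print(to_int("47"))       # 47
print(to_int("47.486"))   # 47
print(int(-47.9))         # -47 (int() truncates toward zero, it does not round)
```

A string such as `"hello"` still raises a `ValueError`, because `float("hello")` fails too.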

### Comments and why you should be using them

I've been using comments a lot so far (the words after the `#` in the code), without explaining what they are.

**Comments simply describe code and are not executed.** 

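There are two common ways to write comments: one-line and multi-line. One-line comments start with `#` and are ignored by the interpreter. Multi-line "comments" are really triple-quoted string literals that are not assigned to anything; they are evaluated and then discarded, so they have no effect on the program. Both exist only to clarify or explain code. Use them wisely: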

In [None]:
'''
Multi-line comments go between 3 quotation marks.
You can use single or double quotes.
'''

"""
This is a multiline
comment with triple quote
"""

# One-line comments are preceded by the pound symbol

### Lists

`Lists` are another very useful data type.

You can think of them as containers for the objects we looked at above.

Here's an example list. **Notice that it can contain objects of varying type**:

In [19]:
nums = [5, 5.0, 'five']     # an int, a float, and a string

Lists can be printed, just like the other objects we've looked at:

In [20]:
print(nums)                  # print the list
type(nums)                  # check the type: list

[5, 5.0, u'five']


list

Lists have lots of useful properties, like a length (the number of elements in the list) and the ability to have their contents modified:

In [21]:
print(len(nums))             # check the length: 3
print(nums[0])               # print first element
nums[0] = 6                 # replace a list element, we will get back to this shortly
print(nums)                  # it's changed!

3
5
[6, 5.0, u'five']


You can add to the back of a list using the `append` method and remove a specific element from a list using the `remove` method (a method is a function that belongs to an object):

In [22]:
nums.append(7)
print(nums)
nums.remove('five')
print(nums)

[6, 5.0, u'five', 7]
[6, 5.0, 7]


In [23]:
print(dir(list))

['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__delslice__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__setslice__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']


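In IPython and Jupyter, you can put a `?` after a name to look up its documentation. The lookup below failed only because `nums` had not yet been defined in that session:

In [1]:
nums.append?

Object `nums.append` not found.


Lists can also be `sorted`: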

In [24]:
print(sorted(nums))          # 'function' that does not modify the list
print(nums)                  # the original list remains the same!

[5.0, 6, 7]
[6, 5.0, 7]


However, if you're going to sort a list, its elements need to be either all numeric (`float` or `int`) or all `string`s. Otherwise, Python 2 gives what seems to be weird behavior (here, the numbers end up before the strings), and Python 3 raises a `TypeError`:

In [27]:
sorted([5,'two','ten',3])

[3, 5, u'ten', u'two']

Because the `sorted` function doesn't modify the original list, you need to assign its result to a variable (either a new one or, to overwrite the original, the same one):

In [28]:
nums = sorted(nums)         # overwrite the original list
print(nums)

[5.0, 6, 7]
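
Lists also have their own `sort` method (it appears in the `dir(list)` output above). As a rough sketch of how it differs from `sorted`: it reorders the list in place and returns `None`, so its result should not be assigned back to the variable:

```python
nums = [6, 5.0, 7]

result = nums.sort()      # sorts the list in place...
print(result)             # None (...and returns nothing useful)
print(nums)               # [5.0, 6, 7]

nums.sort(reverse=True)   # the same optional argument works here too
print(nums)               # [7, 6, 5.0]
```

Writing `nums = nums.sort()` is a common slip: it replaces the list with `None`.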


You can also sort the list in reverse order by passing an **optional** argument to the `sorted` function:

In [29]:
sorted(nums, reverse=True)  # optional argument

[7, 6, 5.0]

Lists have other interesting properties that allow you to select only **parts** of them.

Let's create the list `a`: 

In [30]:
a = [10, "hello", 304,3.631,True]     # create lists using brackets

You can select individual elements of a list (this is called **indexing**) or a range of elements (this is called **slicing**):

In [31]:
# slicing
print(a[0], type(a[0]))        # returns the element 10 (Python is zero indexed)
print(a[1:3], type(a[1:3]))    # returns the sublist ["hello", 304] (inclusive of first index but exclusive of second)
print(a[-1], type(a[-1]))      # returns True (last element)

10 <type 'int'>
[u'hello', 304] <type 'list'>
True <type 'bool'>


In [36]:
val="datascience"
val[5:len(val)]

u'cience'

In [37]:
val[-3:]

u'nce'

In [38]:
val[:-3]

u'datascie'

In [39]:
val[100]

IndexError: string index out of range
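
Indexing and slicing treat out-of-range positions differently. A short sketch, reusing the `val` string from the cells above: a slice quietly stops at the end of the sequence, while a single index that doesn't exist raises an `IndexError`:

```python
val = "datascience"

print(val[5:100])    # cience  (the slice is clipped to the end of the string)
print(val[100:])     # prints an empty line: the slice is the empty string

try:
    val[100]         # a single index must exist
except IndexError as err:
    print("IndexError:", err)
```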

In [34]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


In [41]:
items =[10, 20, 30, 40]

In [42]:
items[:3]

[10, 20, 30]

It is very important to always remember when you're writing code in `Python` that its indexes are always **zero-based**. 

This means that **any object that you can index (select individual components of) always has an index that starts at zero**.

You can also add to the back of (`append`) or to the front of (`prepend`) lists:

In [43]:
print("a before appending:", a)
a.append(6)                   # list method that appends 6 to the end
print("a after appending:", a)
a = a + [7]                   # use plus sign to combine lists
print("a after appending [7]:", a)
a = ["dude"]+ a
print("a after prepending [dude]:", a)

a before appending: [10, u'hello', 304, 3.631, True]
a after appending: [10, u'hello', 304, 3.631, True, 6]
a after appending [7]: [10, u'hello', 304, 3.631, True, 6, 7]
a after prepending [dude]: [u'dude', 10, u'hello', 304, 3.631, True, 6, 7]


Keep in mind, you can't assign outside the existing range of the list (`Python` will throw an `IndexError` `Exception`):

In [44]:
print(len(a))
a[len(a)] = "this wont be added"

8


IndexError: list assignment index out of range

### Lists and strings and how they're very similar and very different

There are actually a lot of similarities between `strings` and `lists` in `Python`.

`Strings` can be indexed (and sliced) just like lists:

In [45]:
my_string = 'awesome'
print(my_string)
print(my_string[1:4])
print(my_string[0])
print(my_string[-1])

awesome
wes
a
e


And combined like `lists`:

In [46]:
print(my_string)
my_string = "dude" + my_string
print(my_string)

awesome
dudeawesome


You can even check their `length`, just like `lists`:

In [47]:
print(len(my_string))

11


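But strings can also be `split`. If the separator never occurs in the string, you get back a list containing the whole string:

In [49]:
"welcome to DS. This is our first class".split(",")

[u'welcome to DS. This is our first class']

When the separator does occur, the string is broken into pieces: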

In [48]:
my_string = "Hello my name is Sergey!"
my_string.split(" ") #split on spaces

[u'Hello', u'my', u'name', u'is', u'Sergey!']

And, remember that you always have to think about types when trying to combine `strings` with other objects:

In [50]:
print(my_string + ' there') # you can totally do this because both objects are strings

Hello my name is Sergey! there


So the following won't work (`Python` will throw a `TypeError` `Exception`):

In [51]:
my_string + 5 # error because 5 is an int type and my_string is a string type

TypeError: coercing to Unicode: need string or buffer, int found

In [53]:
my_string + str(5)

u'Hello my name is Sergey!5'

However, although `lists` and `strings` appear very similar, **there is a critical difference between them**.

You can replace subcomponents of `lists`, but you can't do the same for `strings`; you have to create new ones:

In [55]:
a_string = 'bros'
a_string = 'rose'
b_string = a_string
a_string == b_string

True

In [54]:
a_list = [1,45,"other_bros"]
a_string = "bros"

print(a_list)
a_list[0] = 20               #you can do this
print(a_list)
a_string[0] = 'c'            #this will throw a TypeError Exception

[1, 45, u'other_bros']
[20, 45, u'other_bros']


TypeError: 'unicode' object does not support item assignment

This happens because of the kinds of objects `lists` and `strings` are in `Python`.

`Lists` are **mutable** objects in `Python`. This means you can change the same `list` object over and over again (this is called **mutating** it). 

You can't do that with `strings`. They're called **immutable** objects. So, when you "change" a string, you actually have to create a new version of it (with the changed part):

In [56]:
print(a_string)
a_string = "c" + a_string[1:] # build a new string rather than assigning to a_string[0] (a_string is currently 'rose')
print(a_string)

rose
cose


You can do other stuff with `strings` that you can't do with `lists`:

In [57]:
print(dir(a_string))

['__add__', '__class__', '__contains__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'islower', 'isnumeric', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']


In [58]:
print(a_string.upper())
print(a_string.lower())
print(a_string.capitalize())

COSE
cose
Cose


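Before we start talking about functions and what they are, let's try a simple built-in one called `range`, which generates sequences of evenly spaced `ints`. In the Python 2 used here it returns a `list`; in Python 3 it returns a lazy `range` object that you can pass to `list()`: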

In [59]:
a = range(2, 10)  # creates a list of integers that includes first value but excludes second value and is ordered
b = range(3)      # when you only pass a single number, it generates a list starting at zero
c = range(2,10,2) # when you pass in 3 numbers, it generates a list where the "step" is the 3rd parameter
print(a)
print(b)
print(c)

[2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2]
[2, 4, 6, 8]
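
If you run this notebook under Python 3, `print(range(2, 10))` shows `range(2, 10)` rather than a list. A small sketch that behaves the same under both versions wraps the call in `list()`; the negative step is an extra illustration that is not used elsewhere in this lesson:

```python
evens = list(range(2, 10, 2))        # start, stop (excluded), step
countdown = list(range(10, 0, -2))   # a negative step counts downwards

print(evens)            # [2, 4, 6, 8]
print(countdown)        # [10, 8, 6, 4, 2]
print(len(range(3)))    # 3
```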


#### Exercise Time!

* create a variable `my_new_list` and set it to contain "dude" and the string "55"
* create a new variable `dude55` that is the concatenation of "dude" and "55"
* create a variable `my_int` that is the int representation of "55"
* create a new string called `my_substring` that is the 3rd through 5th characters of `dude55`
* create a list called `my_range` that is all the multiples of 3 from 3-26 

In [4]:
my_new_list = ["dude","55"]
dude55 = "dude"+"55"
my_int = int("55")
my_substring = dude55[2:5]
my_range = range(3, 27, 3)
print(my_new_list)
print(dude55)
print(my_int)
print(my_substring)
print(my_range)

['dude', '55']
dude55
55
de5
[3, 6, 9, 12, 15, 18, 21, 24]


### Dictionaries

The last data type we are going to talk about in this whirlwind tour of `Python` is the dictionary, or `dict`.

Dictionaries are `Python` objects that behave like real dictionaries in meatspace:
* dictionaries are made of key-value pairs (word and definition)
* dictionary keys must be unique (each word is only defined once)
* you can use the key to look up the value, but not the other way around (you can't ask a dictionary for the word whose definition is "furry domesticated animal that's sometimes an asshole" and get "cat")

Dictionaries are similar to lists in that:
* they can contain multiple data types
* you can move through (iterate) them
* they are mutable (meaning you can change, or mutate, them).

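However, dictionaries are different from lists in that they:
* are unordered in the Python 2 used here (lists are ordered by their index); since Python 3.7 dicts remember insertion order, but they are still looked up by key, not by position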

Ok, enough of that, here's an actual `Python` dict:

In [11]:
nyc = {'manhattan':'work', 'brooklyn':'play', 'best borough':'brooklyn'}

# examine the dictionary
print(type(nyc))
print(nyc['brooklyn'])
print(len(nyc))
print(nyc.keys())
print(nyc.values())
print(nyc.items())
print(nyc[nyc["best borough"]]) #what happened here?

<type 'dict'>
play
3
['brooklyn', 'manhattan', 'best borough']
['play', 'work', 'brooklyn']
[('brooklyn', 'play'), ('manhattan', 'work'), ('best borough', 'brooklyn')]
play


Remember, `dicts` are looked up by key, not by position. So, indexing a dict like a list FAILS (unless that number happens to be a key):

In [5]:
print(nyc[0])

KeyError: 0
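
A `KeyError` is what you get for any key that isn't in the dictionary. As a minimal sketch, here are two standard ways to avoid it: test membership with `in`, or use the `get` method, which returns a default instead of raising. The `nyc` dict is re-created so the example runs on its own, and the default string is arbitrary:

```python
nyc = {'manhattan': 'work', 'brooklyn': 'play', 'best borough': 'brooklyn'}

print('queens' in nyc)                # False
print('brooklyn' in nyc)              # True

print(nyc.get('queens'))              # None (no KeyError)
print(nyc.get('queens', 'unknown'))   # unknown
print(nyc.get('brooklyn', 'unknown')) # play
```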

However, you can modify dictionaries in a similar way to how you can modify lists:

In [12]:
nyc['queens'] = 'baller status'                    # add a new entry
nyc['best borough'] = 'queens'                     # edit an existing entry
del nyc['manhattan']                               # delete an entry using its key
nyc['other boroughs'] = ['bronx', 'staten island'] # value can be a list

# accessing a list element within a dictionary
# print(nyc['other boroughs'][1])
print(nyc)

{'brooklyn': 'play', 'other boroughs': ['bronx', 'staten island'], 'best borough': 'queens', 'queens': 'baller status'}


#### Exercise Time!
* print the name of the best borough (in the dictionary).
* create a new key-value pair for `SF` (give it any value you like)

In [13]:
print(nyc['best borough'])

queens


In [69]:
sf = {
    "temperature": 70,
    "transit_options": ['muni', 'bart', 'caltrain'],
    "fav_city": True,
    "sports_team": {"baseball": "Giants", "basketball": "Warriors"},
}

In [70]:
print(sf['sports_team'])

{u'basketball': u'Warriors', u'baseball': u'Giants'}


### Functions

Ok, so we've just been using functions (like `print`, `len`, `append`, `upper`, `capitalize`) without really explaining what they are or how they work. 

If you did the pre-work (you did it, right?), you should be pretty familiar with functions in Python. Nonetheless, here are some examples:

In [71]:
def give_me_one(): # function definition begins with the keyword def, ends with colon
    return 1       # indentation required for function body to create proper function scope

This is a really dumb function. It simply returns the `int` 1. Let's look at how it would work:

In [72]:
print(give_me_one())

1


To make a function "do its work", you need to **call** (or invoke) it. That's what we did when we wrote `give_me_one` with `()` at the end.

However, you can also refer to a function without calling it:

In [73]:
print("The type of give_me_one is:",type(give_me_one))
give_me_one

The type of give_me_one is: <type 'function'>


<function __main__.give_me_one>

And of course you can assign the result of a function to a new variable:

In [74]:
dude = give_me_one()
print(dude)

1


You can also assign a function itself (without calling it) to a variable, and then call it through that variable:

In [75]:
another_dude = give_me_one
print(another_dude)
print("Now we are going to call another_dude:")
another_dude()

<function give_me_one at 0x10462e398>
Now we are going to call another_dude:


1

Pay attention to the syntax of `give_me_one`. 

Every function definition starts with the keyword `def`, and its first line ends with a colon (:). The function body **must be indented** to work properly. Furthermore, a function can contain a `return` statement, which specifies the value the function hands back to its caller (in this case the `int` 1).

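Functions can also take parameters. Despite its name, this one doesn't print anything; it returns a new string built from its argument:

In [81]:
def print_values(values):
    return values + ".  Functions are great"

In [86]:
values = print_values("Hello World")
values

u'Hello World.  Functions are great'

Functions don't have to return anything explicitly. Here's a dumb function that just prints "Python is awesome!" and has no `return` statement. Calling it still produces a value, `None`, but nothing useful that you can store in a variable: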
In [87]:
def print_python():
    print("Python is awesome!")

In [88]:
print(type(print_python())) # the type of the function's return value is NoneType because the function didn't return anything
print(type(print_python))
print_python()
a_var = print_python()
print(a_var)

Python is awesome!
<type 'NoneType'>
<type 'function'>
Python is awesome!
Python is awesome!
None


Here's another function. It's a bit more complicated, and has parameters that you pass into it:

In [89]:
def calc(x, y, op):         # three parameters (without any defaults)
    if op == 'add':         # conditional statement
        return x + y
    elif op == 'subtract':
        return x - y
    else:
        print('Valid operations: add, subtract')

Let's test it on some input:

In [90]:
print(calc(5, 3, 'add'))
print(calc(5, 3, 'subtract'))
print(calc(5, 3, 'multiply'))
calc(5, 3)

8
2
Valid operations: add, subtract
None


TypeError: calc() takes exactly 3 arguments (2 given)

Why did we get a `TypeError Exception` for the last call here?

This is because we had to pass 3 arguments to the `calc` function, as all of them were required (none of the parameters had default values). Let's make one of the parameters optional (by adding a default value) to fix this problem:

In [91]:
def calc_default_add(x, y, op="add"): # three parameters (with addition being the default value for the 3rd parameter)
    if op == 'add':         # conditional statement
        return x + y
    elif op == 'subtract':
        return x - y
    else:
        print('Valid operations: add, subtract')

In [14]:
sorted()

TypeError: Required argument 'iterable' (pos 1) not found

In [92]:
calc_default_add(5, 3)

8

Voila! It works as intended!
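
Optional parameters can also be passed by name. The variation below is only a sketch: it reuses the body of `calc_default_add` but, as an assumption about what you might prefer, raises a `ValueError` for an unknown operation instead of printing a message and returning `None`. The name `calc_strict` is hypothetical:

```python
def calc_strict(x, y, op="add"):
    """Like calc_default_add, but fails loudly on an unknown operation."""
    if op == "add":
        return x + y
    elif op == "subtract":
        return x - y
    raise ValueError("Valid operations: add, subtract")


print(calc_strict(5, 3))                  # 8 (op falls back to its default)
print(calc_strict(5, 3, op="subtract"))   # 2 (op passed by keyword)

try:
    calc_strict(5, 3, op="multiply")
except ValueError as err:
    print("Error:", err)
```

Raising an exception means the caller can't mistake a failed call for a real result.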

Also, some more explanation. 

This function includes some **conditional** statements (`if`, `elif`, `else`) and tests for equality (the double equals sign `==`). Again, **pay attention to the indentation**. The indentation tells `Python` which statements belong to the function body and to each branch of the conditional, and it is required for your programs to work. If you don't indent properly, your code will raise an `IndentationError` or, worse, run with different logic than you intended.

#### Exercise Time!

Write a couple of functions:
* One called `compute_pay` that takes two parameters (`hours` and `rate`), and returns the total pay.
* One called `get_hours_worked` that takes two parameters (`total_pay` and `rate`) and returns the total hours worked.

In [93]:
def compute_pay(hours, rate):
    return hours*rate

print(compute_pay(40, 10.50))

420.0


In [15]:
def get_hours_worked(total_pay,rate):
    return total_pay / rate

print(get_hours_worked(400,10.5))
print(get_hours_worked(500,10))

38.0952380952
50
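
The second call printed `50` rather than `50.0` because, in Python 2 without `from __future__ import division`, dividing two `ints` performs integer division, which would silently turn `505 / 10` into `50`. A hedged sketch of one fix converts the rate to a `float` first (Python 3 does true division by default, so there the conversion is harmless):

```python
def get_hours_worked(total_pay, rate):
    # float(rate) forces true division even when both arguments are ints
    return total_pay / float(rate)


print(get_hours_worked(505, 10))   # 50.5
print(get_hours_worked(500, 10))   # 50.0
```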


### For loops

Ok, now we are going to talk about loops, which allow you to iterate (move across) objects like lists and strings.

The basic loop in `Python` is the `for` loop. It starts with the reserved keyword `for`:

In [98]:
for item in sf.keys():
    print(item)

fav_city
sports_team
temperature
transit_options


In [100]:
for item in sf.values():
    print(item)

True
{u'basketball': u'Warriors', u'baseball': u'Giants'}
70
[u'muni', u'bart', u'caltrain']


In [101]:
x = 5
y = 10

In [102]:
x, y=y, x

In [103]:
sports={"baseball": "Giants", "basketball": "Warriors"}

In [104]:
sports.items()

[(u'basketball', u'Warriors'), (u'baseball', u'Giants')]

In [105]:
for key , val in sports.items():
    print("Sports", key, "Team", val)

Sports basketball Team Warriors
Sports baseball Team Giants


In [96]:
for item in ['apples','bananas','oranges']:
    print(item)

apples
bananas
oranges


In [97]:
for i in range(5):
    print(i)

0
1
2
3
4


Here we are simply printing the numbers 0-4 inclusive to the screen, one per line. 

Here's another example. This time, we are doing something to each `string` object we get from a `list` of `strings`:

In [106]:
# print each list element in uppercase
people = ['TJ', 'Ramesh', 'Sergey']
for i in range(len(people)):
    print(people[i].upper())

# does the same thing, BUT WAY WAY BETTER
for i in people:
    print(i.upper())

# for loop to print 1 through 5
nums = range(1, 6)      # create a list of 1 through 5
for num in nums:        # num 'becomes' each list element for one loop
    print(num)

# for loop to print 1, 3, 5
other = [1, 3, 5]       # create a different list
for x in other:         # name 'x' does not matter
    print(x)             # this loop only executes 3 times (not 5)

TJ
RAMESH
SERGEY
TJ
RAMESH
SERGEY
1
2
3
4
5
1
3
5
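
Looping directly over the list is the better style when you only need the elements. When you also need each element's position, the built-in `enumerate` gives you both without going back to `range(len(...))`. A minimal sketch, reusing the `people` list from above:

```python
people = ['TJ', 'Ramesh', 'Sergey']

# enumerate yields (index, element) pairs, starting at index 0
for position, person in enumerate(people):
    print(position, person.upper())
```

The `position, person` part uses the same unpacking as `for key, val in sports.items()` earlier.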


#### Exercise Time!

* write a for loop that adds 7 to each of the numbers 1-10 inclusive and prints the result to the screen

In [18]:
nums = range(1,11)
for num in nums:
    num7=num+7
    print(num7)
    

8
9
10
11
12
13
14
15
16
17


### List comprehensions

Many `for` loops can actually be converted into what are called list comprehensions, especially when you're trying to store the result of the `for` loop in a list.

Imagine you wanted to collect all of the squares of the numbers 1-10 in a `list`. You could accomplish this in several ways.

Here's a way to do it with `for` loops (the first version squares 0-9, the second squares 1-10):

In [20]:
# range(10)
squares = []
for value in range(10):
    squares.append( value ** 2)
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


In [107]:
squares = []                # create empty list to store results
for num in range(1,11):     # loop through nums (will execute 10 times)
    squares.append(num*num) # append the square of the current value of num
print("For loop result:",squares)

For loop result: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]


And here's the equivalent list comprehension:

In [19]:
better_squares = [num*num for num in range(1,11)] #exact same computation as above
print("List comprehension result:", better_squares)

('List comprehension result:', [1, 4, 9, 16, 25, 36, 49, 64, 81, 100])
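
The conversion between the two forms is mechanical, even when the loop contains an `if`. As a sketch (the variable names are just for illustration), the expression being appended moves to the front of the comprehension, and the `for` and `if` follow in the same order as in the loop:

```python
# for-loop version: squares of the even numbers from 1 to 10
even_squares_loop = []
for num in range(1, 11):
    if num % 2 == 0:
        even_squares_loop.append(num * num)

# equivalent list comprehension: [expression  for ...  if ...]
even_squares = [num * num for num in range(1, 11) if num % 2 == 0]

print(even_squares_loop)   # [4, 16, 36, 64, 100]
print(even_squares)        # [4, 16, 36, 64, 100]
```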


#### Exercise Time!

* Given `words = ['yo','hello','awesome']` write a list comprehension that returns `["YO","HELLO","AWESOME"]`
* Given `word = "fancy"` write a list comprehension that returns `['F','A','N','C','Y']`
* Write a function called `awesome_sauce` that prints the numbers from 1 to 100. However, for multiples of 2 it should print 'awesome' instead of the number, for multiples of 7 it should print 'sauce', and for numbers that are multiples of both 2 and 7 it should print 'awesome sauce!'.

In [21]:
words = ['yo', 'hello', 'awesome']
capital_words = [word.upper() for word in words]
print(capital_words)

['YO', 'HELLO', 'AWESOME']


In [35]:
# Here's one way of doing this, without a list comprehension
word = "fancy"
list(word.upper())

['F', 'A', 'N', 'C', 'Y']

In [36]:
# Another way of doing this, with a list comprehension
word ="fancy"
[w for w in word.upper()]


['F', 'A', 'N', 'C', 'Y']

In [68]:
def awesome_sauce():
    for i in range(1, 101):
        if i % 2 == 0:
            print("awesome")
        elif i % 7 == 0:
            print("sauce")
        elif i % 2 == 0 and i % 7 == 0:  # never reached: see the note below
            print("awesome sauce!")
        else:
            print(i)

awesome_sauce()
# Bug: multiples of both 2 and 7 (14, 28, ...) are also multiples of 2, so the
# first branch catches them and 'awesome sauce!' is never printed.

1
awesome
3
awesome
5
awesome
sauce
awesome
9
awesome
11
awesome
13
awesome
15
awesome
17
awesome
19
awesome
sauce
awesome
23
awesome
25
awesome
27
awesome
29
awesome
31
awesome
33
awesome
sauce
awesome
37
awesome
39
awesome
41
awesome
43
awesome
45
awesome
47
awesome
sauce
awesome
51
awesome
53
awesome
55
awesome
57
awesome
59
awesome
61
awesome
sauce
awesome
65
awesome
67
awesome
69
awesome
71
awesome
73
awesome
75
awesome
sauce
awesome
79
awesome
81
awesome
83
awesome
85
awesome
87
awesome
89
awesome
sauce
awesome
93
awesome
95
awesome
97
awesome
99
awesome


In [69]:
def awesome_sauce():
    for i in range(1, 101):
        if i % 14 == 0:  # multiple of both 2 and 7: test the most specific case first
            print("awesome sauce!")
        elif i % 7 == 0:
            print("sauce")
        elif i % 2 == 0:
            print("awesome")
        else:
            print(i)

awesome_sauce()
# This version works because an if/elif chain runs only the first branch whose
# condition is true, so the combined case has to be checked before the others.
# See https://www.rosettacode.org/wiki/FizzBuzz#Python for further possibilities

1
awesome
3
awesome
5
awesome
sauce
awesome
9
awesome
11
awesome
13
awesome sauce!
15
awesome
17
awesome
19
awesome
sauce
awesome
23
awesome
25
awesome
27
awesome sauce!
29
awesome
31
awesome
33
awesome
sauce
awesome
37
awesome
39
awesome
41
awesome sauce!
43
awesome
45
awesome
47
awesome
sauce
awesome
51
awesome
53
awesome
55
awesome sauce!
57
awesome
59
awesome
61
awesome
sauce
awesome
65
awesome
67
awesome
69
awesome sauce!
71
awesome
73
awesome
75
awesome
sauce
awesome
79
awesome
81
awesome
83
awesome sauce!
85
awesome
87
awesome
89
awesome
sauce
awesome
93
awesome
95
awesome
97
awesome sauce!
99
awesome
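
Because `awesome_sauce` only prints, it is awkward to check automatically. One possible refactoring, offered as a sketch rather than the expected solution, moves the decision into a helper that returns the label for a single number; `awesome_label` is a name invented here:

```python
def awesome_label(i):
    """Hypothetical helper: return what awesome_sauce would print for i."""
    if i % 14 == 0:          # multiple of both 2 and 7: check first
        return "awesome sauce!"
    elif i % 7 == 0:
        return "sauce"
    elif i % 2 == 0:
        return "awesome"
    return str(i)


labels = [awesome_label(i) for i in range(1, 101)]
print(labels[6])     # sauce           (the number 7)
print(labels[13])    # awesome sauce!  (the number 14)
```

Building the list with a comprehension also ties this exercise back to the previous section.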


### Loading things in from a file or the internet

The data you work with as a data scientist needs to come from somewhere.

Here's a simple way to get some data from a file:

In [2]:
#imports to make things work properly
import csv
import requests

with open('../data/vertebral_column_2_categories.dat', 'r') as f:
    vertebral_data = [row for row in csv.reader(f)]

for line in vertebral_data:
    print(line)

['63.03 22.55 39.61 40.48 98.67 -0.25 AB']
['39.06 10.06 25.02 29 114.41 4.56 AB']
['68.83 22.22 50.09 46.61 105.99 -3.53 AB']
['69.3 24.65 44.31 44.64 101.87 11.21 AB']
['49.71 9.65 28.32 40.06 108.17 7.92 AB']
['40.25 13.92 25.12 26.33 130.33 2.23 AB']
['53.43 15.86 37.17 37.57 120.57 5.99 AB']
['45.37 10.76 29.04 34.61 117.27 -10.68 AB']
['43.79 13.53 42.69 30.26 125 13.29 AB']
['36.69 5.01 41.95 31.68 84.24 0.66 AB']
['49.71 13.04 31.33 36.67 108.65 -7.83 AB']
['31.23 17.72 15.5 13.52 120.06 0.5 AB']
['48.92 19.96 40.26 28.95 119.32 8.03 AB']
['53.57 20.46 33.1 33.11 110.97 7.04 AB']
['57.3 24.19 47 33.11 116.81 5.77 AB']
['44.32 12.54 36.1 31.78 124.12 5.42 AB']
['63.83 20.36 54.55 43.47 112.31 -0.62 AB']
['31.28 3.14 32.56 28.13 129.01 3.62 AB']
['38.7 13.44 31 25.25 123.16 1.43 AB']
['41.73 12.25 30.12 29.48 116.59 -1.24 AB']
['43.92 14.18 37.83 29.74 134.46 6.45 AB']
['54.92 21.06 42.2 33.86 125.21 2.43 AB']
['63.07 24.41 54 38.66 106.42 15.78 AB']
['45.54 13.07 30.3 32.47 117.
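
Each row above is a list holding one long string because `csv.reader` splits on commas by default and this file is separated by spaces. As a sketch of one way around that, `csv.reader` accepts a `delimiter` argument. The two sample lines are copied from the output above and wrapped in `io.StringIO` so the example runs without the data file (it is written for Python 3):

```python
import csv
import io

# Two lines copied from the output above, standing in for the real file.
sample = io.StringIO(
    "63.03 22.55 39.61 40.48 98.67 -0.25 AB\n"
    "39.06 10.06 25.02 29 114.41 4.56 AB\n"
)

rows = [row for row in csv.reader(sample, delimiter=" ")]
print(rows[0])   # ['63.03', '22.55', '39.61', '40.48', '98.67', '-0.25', 'AB']

# Every field is still a string; convert all but the trailing label to floats.
measurements = [float(value) for value in rows[0][:-1]]
print(measurements)
```

With the real file you would pass the open file object `f` to `csv.reader` in place of `sample`.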

And here's a way to get data from the internet:

In [3]:
r = requests.get('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data') #iris dataset on the internet
iris_data = [row.decode() for row in r.iter_lines()]

for line in iris_data:
    print(line)

5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
4.6,3.6,1.0,0.2,Iris-setosa
5.1,3.3,1.7,0.5,Iris-setosa
4.8,3.4,1.9,0.2,Iris-setosa
5.0,3.0,1.6,0.2,Iris-setosa
5.0,3.4,1.6,0.4,Iris-setosa
5.2,3.5,1.5,0.2,Iris-setosa
5.2,3.4,1.4,0.2,Iris-setosa
4.7,3.2,1.6,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
5.2,4.1,1.5,0.1,Iris-setosa
5.5,4.2,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,0.2,Iris

[**Requests**](http://docs.python-requests.org/en/master/) is an incredibly useful library for getting data from the internet (or just handling HTTP requests generally). We will use it a lot to fetch example data to play with.

In [10]:
# Here's an example
r = requests.get('https://api.github.com/markyashar', auth=('user', 'pass'))
r.status_code

401

In [11]:
r.headers['content-type']

'application/json; charset=utf-8'

In [12]:
r.encoding

'utf-8'

In [13]:
r.text

u'{"message":"Bad credentials","documentation_url":"https://developer.github.com/v3"}'

In [14]:
r.json()

{u'documentation_url': u'https://developer.github.com/v3',
 u'message': u'Bad credentials'}
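
Calling `r.json()` gives roughly what you would get by parsing `r.text` yourself with the standard-library `json` module. Here is a minimal sketch of that idea. It reuses the response body shown above as a plain string so that it runs without a network connection; the names `body` and `payload` are only for illustration.

```python
import json

# The body returned by the failed request above, copied in as a plain string
body = '{"message":"Bad credentials","documentation_url":"https://developer.github.com/v3"}'

# json.loads turns a JSON string into ordinary Python objects (here, a dict)
payload = json.loads(body)

print(type(payload).__name__)
print(payload['message'])
```

The `401` status code means the server rejected the credentials, which matches the message in the body.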

#### Exercise Time!
* Split every item in `iris_data` on the commas
* Split every item in `vertebral_data` on the spaces
* Get only the numeric entries in each item in `iris_data`

In [54]:
# pass #your code here
r = requests.get('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data') #iris dataset on the internet
iris_data = [row.decode() for row in r.iter_lines()]

for line in iris_data:
    print(line.split(","))


[u'5.1', u'3.5', u'1.4', u'0.2', u'Iris-setosa']
[u'4.9', u'3.0', u'1.4', u'0.2', u'Iris-setosa']
[u'4.7', u'3.2', u'1.3', u'0.2', u'Iris-setosa']
[u'4.6', u'3.1', u'1.5', u'0.2', u'Iris-setosa']
[u'5.0', u'3.6', u'1.4', u'0.2', u'Iris-setosa']
[u'5.4', u'3.9', u'1.7', u'0.4', u'Iris-setosa']
[u'4.6', u'3.4', u'1.4', u'0.3', u'Iris-setosa']
[u'5.0', u'3.4', u'1.5', u'0.2', u'Iris-setosa']
[u'4.4', u'2.9', u'1.4', u'0.2', u'Iris-setosa']
[u'4.9', u'3.1', u'1.5', u'0.1', u'Iris-setosa']
[u'5.4', u'3.7', u'1.5', u'0.2', u'Iris-setosa']
[u'4.8', u'3.4', u'1.6', u'0.2', u'Iris-setosa']
[u'4.8', u'3.0', u'1.4', u'0.1', u'Iris-setosa']
[u'4.3', u'3.0', u'1.1', u'0.1', u'Iris-setosa']
[u'5.8', u'4.0', u'1.2', u'0.2', u'Iris-setosa']
[u'5.7', u'4.4', u'1.5', u'0.4', u'Iris-setosa']
[u'5.4', u'3.9', u'1.3', u'0.4', u'Iris-setosa']
[u'5.1', u'3.5', u'1.4', u'0.3', u'Iris-setosa']
[u'5.7', u'3.8', u'1.7', u'0.3', u'Iris-setosa']
[u'5.1', u'3.8', u'1.5', u'0.3', u'Iris-setosa']
[u'5.4', u'3.4', u'1
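
The loop above prints the split rows but does not keep them. One possible way to store them is a list comprehension. This is only a sketch: `sample_iris` is a hypothetical stand-in holding two of the lines printed earlier (plus an empty string, on the assumption that the downloaded file may end with a blank line), and `iris_rows` is an illustrative name.

```python
# Two lines copied from the output above; the empty string mimics a blank trailing line
sample_iris = [
    '5.1,3.5,1.4,0.2,Iris-setosa',
    '4.9,3.0,1.4,0.2,Iris-setosa',
    '',
]

# Split each non-empty line on commas and keep the results in a list
iris_rows = [line.split(',') for line in sample_iris if line]

for row in iris_rows:
    print(row)
```

With the real data, replacing `sample_iris` with `iris_data` should give one list of five strings per flower.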

In [57]:
with open('../data/vertebral_column_2_categories.dat', 'r') as f:
    vertebral_data = [row for row in csv.reader(f)]
for line in vertebral_data:
    print(line)
    
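# Note: print(line.split()) fails here with
# "AttributeError: 'list' object has no attribute 'split'"
# because csv.reader yields each row as a list. This file has no commas, so
# each row is a list holding one string; line[0].split() splits that string.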


['63.03 22.55 39.61 40.48 98.67 -0.25 AB']
['39.06 10.06 25.02 29 114.41 4.56 AB']
['68.83 22.22 50.09 46.61 105.99 -3.53 AB']
['69.3 24.65 44.31 44.64 101.87 11.21 AB']
['49.71 9.65 28.32 40.06 108.17 7.92 AB']
['40.25 13.92 25.12 26.33 130.33 2.23 AB']
['53.43 15.86 37.17 37.57 120.57 5.99 AB']
['45.37 10.76 29.04 34.61 117.27 -10.68 AB']
['43.79 13.53 42.69 30.26 125 13.29 AB']
['36.69 5.01 41.95 31.68 84.24 0.66 AB']
['49.71 13.04 31.33 36.67 108.65 -7.83 AB']
['31.23 17.72 15.5 13.52 120.06 0.5 AB']
['48.92 19.96 40.26 28.95 119.32 8.03 AB']
['53.57 20.46 33.1 33.11 110.97 7.04 AB']
['57.3 24.19 47 33.11 116.81 5.77 AB']
['44.32 12.54 36.1 31.78 124.12 5.42 AB']
['63.83 20.36 54.55 43.47 112.31 -0.62 AB']
['31.28 3.14 32.56 28.13 129.01 3.62 AB']
['38.7 13.44 31 25.25 123.16 1.43 AB']
['41.73 12.25 30.12 29.48 116.59 -1.24 AB']
['43.92 14.18 37.83 29.74 134.46 6.45 AB']
['54.92 21.06 42.2 33.86 125.21 2.43 AB']
['63.07 24.41 54 38.66 106.42 15.78 AB']
['45.54 13.07 30.3 32.47 117.
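
The `AttributeError` comes from the fact that `csv.reader` yields each row as a list, and lists have no `split` method. Because this file contains no commas, each row is a list holding a single string, so the string to split is `row[0]`. A minimal sketch, with `sample_vertebral` as a hypothetical stand-in for the first two rows of `vertebral_data` shown above:

```python
# Two rows copied from the output above, in the shape csv.reader produced them
sample_vertebral = [
    ['63.03 22.55 39.61 40.48 98.67 -0.25 AB'],
    ['39.06 10.06 25.02 29 114.41 4.56 AB'],
]

# Take the single string inside each row and split it on whitespace
vertebral_rows = [row[0].split() for row in sample_vertebral]

for row in vertebral_rows:
    print(row)
```

Another option would be to pass `delimiter=' '` to `csv.reader` so that the rows come back already split, assuming the fields are always separated by a single space.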

In [83]:
# Get only the numeric entries in each item in iris_data
r = requests.get('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data') #iris dataset on the internet
iris_data = [row.decode() for row in r.iter_lines()]
for line in iris_data:
    print(line[0:15])
# This is a bit of a "cheat": the slice only works because every numeric field is
# exactly three characters wide. Splitting on commas and keeping the first four
# fields, line.split(',')[:4], would not depend on the field widths.

5.1,3.5,1.4,0.2
4.9,3.0,1.4,0.2
4.7,3.2,1.3,0.2
4.6,3.1,1.5,0.2
5.0,3.6,1.4,0.2
5.4,3.9,1.7,0.4
4.6,3.4,1.4,0.3
5.0,3.4,1.5,0.2
4.4,2.9,1.4,0.2
4.9,3.1,1.5,0.1
5.4,3.7,1.5,0.2
4.8,3.4,1.6,0.2
4.8,3.0,1.4,0.1
4.3,3.0,1.1,0.1
5.8,4.0,1.2,0.2
5.7,4.4,1.5,0.4
5.4,3.9,1.3,0.4
5.1,3.5,1.4,0.3
5.7,3.8,1.7,0.3
5.1,3.8,1.5,0.3
5.4,3.4,1.7,0.2
5.1,3.7,1.5,0.4
4.6,3.6,1.0,0.2
5.1,3.3,1.7,0.5
4.8,3.4,1.9,0.2
5.0,3.0,1.6,0.2
5.0,3.4,1.6,0.4
5.2,3.5,1.5,0.2
5.2,3.4,1.4,0.2
4.7,3.2,1.6,0.2
4.8,3.1,1.6,0.2
5.4,3.4,1.5,0.4
5.2,4.1,1.5,0.1
5.5,4.2,1.4,0.2
4.9,3.1,1.5,0.1
5.0,3.2,1.2,0.2
5.5,3.5,1.3,0.2
4.9,3.1,1.5,0.1
4.4,3.0,1.3,0.2
5.1,3.4,1.5,0.2
5.0,3.5,1.3,0.3
4.5,2.3,1.3,0.3
4.4,3.2,1.3,0.2
5.0,3.5,1.6,0.6
5.1,3.8,1.9,0.4
4.8,3.0,1.4,0.3
5.1,3.8,1.6,0.2
4.6,3.2,1.4,0.2
5.3,3.7,1.5,0.2
5.0,3.3,1.4,0.2
7.0,3.2,4.7,1.4
6.4,3.2,4.5,1.5
6.9,3.1,4.9,1.5
5.5,2.3,4.0,1.3
6.5,2.8,4.6,1.5
5.7,2.8,4.5,1.3
6.3,3.3,4.7,1.6
4.9,2.4,3.3,1.0
6.6,2.9,4.6,1.3
5.2,2.7,3.9,1.4
5.0,2.0,3.5,1.0
5.9,3.0,4.2,1.5
6.0,2.2,
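
Slicing with `line[0:15]` works here only because every numeric field happens to be three characters wide. A sketch of an approach that does not depend on field width is to split on commas, keep the first four fields, and convert them to floats. As before, `sample_iris` is a hypothetical stand-in for `iris_data` (the empty string assumes the file may end with a blank line), and `iris_numeric` is an illustrative name.

```python
# Two lines from the dataset as printed earlier, plus a blank line to skip
sample_iris = [
    '5.1,3.5,1.4,0.2,Iris-setosa',
    '4.9,3.0,1.4,0.2,Iris-setosa',
    '',
]

# For each non-empty line: split on commas, keep the first four fields,
# and convert each of them from a string to a float
iris_numeric = [
    [float(value) for value in line.split(',')[:4]]
    for line in sample_iris
    if line
]

for row in iris_numeric:
    print(row)
```

Converting to `float` also makes the entries usable in arithmetic, which the sliced strings are not.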

Awesome, that's all we have for today. Except...did you finish the **Python** prework?

If not:

* Go to [Learn Python the Hard Way](http://learnpythonthehardway.org/book/) and do all of the exercises you haven't done.
* [Codecademy's Python course](http://www.codecademy.com/en/tracks/python): Good beginner material, including tons of in-browser exercises.