<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Numeric-Data-Types-in-Python" data-toc-modified-id="Numeric-Data-Types-in-Python-1">Numeric Data Types in Python</a></span><ul class="toc-item"><li><span><a href="#Built-in-operations-for-numeric-data-types" data-toc-modified-id="Built-in-operations-for-numeric-data-types-1.1">Built in operations for numeric data types</a></span></li><li><span><a href="#Importing-Additional-Methods-from-the-Math-Module" data-toc-modified-id="Importing-Additional-Methods-from-the-Math-Module-1.2">Importing Additional Methods from the Math Module</a></span></li></ul></li><li><span><a href="#Strings" data-toc-modified-id="Strings-2">Strings</a></span><ul class="toc-item"><li><span><a href="#Indexing-and-Slicing-Strings" data-toc-modified-id="Indexing-and-Slicing-Strings-2.1">Indexing and Slicing Strings</a></span></li><li><span><a href="#Working-with-Strings" data-toc-modified-id="Working-with-Strings-2.2">Working with Strings</a></span></li></ul></li><li><span><a href="#Converting-between-string-and-numeric-types" data-toc-modified-id="Converting-between-string-and-numeric-types-3">Converting between string and numeric types</a></span></li><li><span><a href="#Lists" data-toc-modified-id="Lists-4">Lists</a></span><ul class="toc-item"><li><span><a href="#Creating-Lists" data-toc-modified-id="Creating-Lists-4.1">Creating Lists</a></span></li><li><span><a href="#Indexing-Lists" data-toc-modified-id="Indexing-Lists-4.2">Indexing Lists</a></span></li><li><span><a href="#Working-with-Lists" data-toc-modified-id="Working-with-Lists-4.3">Working with Lists</a></span></li><li><span><a href="#Practice:-Creating-and-Sorting-List" data-toc-modified-id="Practice:-Creating-and-Sorting-List-4.4">Practice: Creating and Sorting List</a></span></li><li><span><a href="#Practice:-Inserting-Elements-in-a-List" data-toc-modified-id="Practice:-Inserting-Elements-in-a-List-4.5">Practice: Inserting Elements in a List</a></span></li><li><span><a 
href="#Practice:-List-Indexing" data-toc-modified-id="Practice:-List-Indexing-4.6">Practice: List Indexing</a></span></li></ul></li><li><span><a href="#Tuples" data-toc-modified-id="Tuples-5">Tuples</a></span></li><li><span><a href="#Dictionaries" data-toc-modified-id="Dictionaries-6">Dictionaries</a></span><ul class="toc-item"><li><span><a href="#Creating-Dictionaries" data-toc-modified-id="Creating-Dictionaries-6.1">Creating Dictionaries</a></span></li><li><span><a href="#Working-with-Dictionaries" data-toc-modified-id="Working-with-Dictionaries-6.2">Working with Dictionaries</a></span></li><li><span><a href="#Dictionaries-are-mutable" data-toc-modified-id="Dictionaries-are-mutable-6.3">Dictionaries are mutable</a></span></li><li><span><a href="#Use-dictionary-keys-to-access-the-values" data-toc-modified-id="Use-dictionary-keys-to-access-the-values-6.4">Use dictionary keys to access the values</a></span></li><li><span><a href="#Dictionaries-compared-to-lists" data-toc-modified-id="Dictionaries-compared-to-lists-6.5">Dictionaries compared to lists</a></span></li><li><span><a href="#You-can-easily-extend-a-dictionary-by-adding-new-keys" data-toc-modified-id="You-can-easily-extend-a-dictionary-by-adding-new-keys-6.6">You can easily extend a dictionary by adding new keys</a></span></li><li><span><a href="#You-can-loop-through-dictionaries" data-toc-modified-id="You-can-loop-through-dictionaries-6.7">You can loop through dictionaries</a></span></li><li><span><a href="#Dictionary-Summary" data-toc-modified-id="Dictionary-Summary-6.8">Dictionary Summary</a></span></li></ul></li><li><span><a href="#What-You-Learned" data-toc-modified-id="What-You-Learned-7">What You Learned</a></span></li></ul></div>

*******
# The Python Standard Library: Built-in Data Types, Functions, & Modules
*******

In this session, we explore basic data types and modules that operate on them using Python methods associated with each type.  In this and subsequent notebooks, we draw on material from various sources, including Jean Mark Gawron's book "Python for Social Science", available [here](http://www-rohan.sdsu.edu/~gawron/python_for_ss/course_core/book_draft/index.html).

Today we will be covering material in sections 3.3 - 3.4.2

First, some terminology:

- **Function**: a Python object that stores a generic, reusable _statement_ or series of statements which define an operation to perform on some future input for the purpose of generating some future output. A function definition starts with `def`, as in `def do_something():`, and usually ends with a `return` statement. For example:

In [None]:
def add_2(x):
    return x + 2

- **Module**: a .py file that is _not_ a script. Instead it stores a combination of statements, functions, and variable definitions that can be referenced by other scripts or interactive sessions. Sometimes called a library or a package. All modules must be _imported_ before you can use them. Some modules come pre-installed with Python, while others must be installed manually with a package manager like `pip` or `conda`.
- **Method**: a function that is defined within a module. It's a bit more complicated than that in reality, but it's OK to think of it that way for now.
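
To make the three terms concrete, here is a minimal sketch. It reuses `add_2` from above, the standard `math` module, and the built-in string method `upper`. The variable names (`result`, `root`, `shout`) are chosen only for illustration.

```python
import math  # a module from the Python Standard Library

# A function: defined with def, and it returns a value
def add_2(x):
    return x + 2

result = add_2(3)         # calling our own function
root = math.sqrt(16)      # calling a function that lives in the math module
shout = "hello".upper()   # calling a method attached to a string object

print(result, root, shout)
```

Notice the dot notation in the last two calls: the name before the dot says where the function lives, whether that is a module or an object.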

The Python Standard Library refers to a collection of modules and standalone functions that are included with every Python installation. Because there is no need to install these separately, we say they are "built-in". You can find a full list of built-in Python functions [here](https://docs.python.org/3/library/functions.html).

## Numeric Data Types in Python

We have already seen some of the basic interactions with numbers in Python.  The two main numeric types are `int` and `float`.  In Python 2 there were two versions of integers (int and long), but these have been unified in Python 3.

In [None]:
# Integers are the simplest numeric type
type(12)

In [None]:
# Float or Floating Point numbers enable more precision
type(12.0000000000001)

In [None]:
# We can assign values to a variable to reuse them
x = 12.000000000000001
y = 12
print(x)
print(y)

Why not use floats all the time?  They are more precise, after all.  A couple of reasons.  One is that it can be more complicated to do certain things, like compare numbers to see if they are equivalent.

In [None]:
# Test whether x is equal to y.  
x == y

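**Try the above by changing the number of decimal places in the original x and test it again...**

Floats are stored with finite precision (roughly 15 to 17 significant digits). If you add enough decimal places, the number you typed is rounded to the nearest value the computer can store, which may be exactly 12.0, and the comparison then returns `True`. Python does not apply a tolerance when you use `==`; it compares the stored values exactly.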
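
Here is a small sketch of how float comparison behaves in practice. The variable names are illustrative, and `math.isclose` comes from the standard `math` module.

```python
import math

x = 12.000000000000001          # stored as a float slightly above 12
y = 12
exact_equal = (x == y)
print(exact_equal)              # False: the stored values differ

x_long = 12.0000000000000001    # too many digits: stored as exactly 12.0
rounded_equal = (x_long == y)
print(rounded_equal)            # True

# A classic surprise: 0.1 + 0.2 is not stored as exactly 0.3
total = 0.1 + 0.2
sum_equal = (total == 0.3)
sum_close = math.isclose(total, 0.3)
print(sum_equal, sum_close)     # False True
```

When you need to compare floats, testing whether they are close is usually safer than testing whether they are exactly equal.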

A second reason is that floating point numbers can require more space in memory and on disk than integers, depending on how they are stored.  This is not a problem for a single value, but if you were working with really large databases, it adds up, and could cause you to run out of memory or disk if you used float as the type for all your numeric data.

You can **cast** the type of a number to convert it to a specified type, like converting from float to int:


In [None]:
print(x)
y = int(x)
print(y)

In [None]:
float(y)
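
One detail worth knowing: `int()` drops the fractional part rather than rounding to the nearest integer. A minimal sketch, with variable names chosen for illustration:

```python
truncated_pos = int(2.9)     # the fractional part is dropped
truncated_neg = int(-2.9)    # truncation goes toward zero
rounded_pos = round(2.9)     # round() goes to the nearest integer
rounded_neg = round(-2.9)

print(truncated_pos, truncated_neg)   # 2 -2
print(rounded_pos, rounded_neg)       # 3 -3
```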

### Built in operations for numeric data types

Reviewing some of the built-in methods in Python that apply to numeric data types:

In [None]:
x = 200
y = 12

In [None]:
# subtraction
x - y

In [None]:
# multiplication
x * y

In [None]:
# division
x / y

In [None]:
# integer division -- floored quotient
x // y

In [None]:
# remainder (modulus) of x / y
x % y

Question: How could you use the modulo (%) operator to find out if a variable is an even number?

In [None]:
79 % 2
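
One possible answer to the question above, as a sketch: a number is even when dividing it by 2 leaves a remainder of 0. The helper name `is_even` is hypothetical, not a built-in.

```python
def is_even(n):
    """Return True when n is divisible by 2 (hypothetical helper)."""
    return n % 2 == 0

odd_check = is_even(79)
even_check = is_even(200)
print(odd_check, even_check)   # False True
```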

In [None]:
# Flipping the sign
y = -x
y

In [None]:
# Works in the other direction as well
-y

In [None]:
# Raising x to the power of y
x = 10
y = 5
x ** y

In [None]:
pi = 3.141592653589793
round(pi,4)

### Importing Additional Methods from the Math Module ###

In addition to the functions and operators above, many more are available from the `math` library, which is always available because it is part of the Python Standard Library. But `math` is its own module, so you must explicitly import it before you have access to the methods. A few examples below.

In [None]:
import math
math.sqrt(x)

You can see the full list of functions available in the math library by using tab after the name of the library and a dot:

In [None]:
math.

And you can get more documentation on a specific function by asking for it:

In [None]:
math.log?

In [None]:
math.log(x)

In [None]:
# What happens if we take the log of a number with a value of 0?
x = 0
math.log(x)

In [None]:
# We could add a 1 to x to avoid this problem
math.log(x+1)

In [None]:
# Or we could use one of the other log functions in math that does this and avoids returning an error
math.log1p(x)

In [None]:
# A common problem is division where the denominator has a value of zero
y / x

In [None]:
y / (x + 1)

In [None]:
# Comparing two values to see if they are approximately the same, within some tolerance:
x = 12.1
z = 12.2
math.isclose(x, z, rel_tol=.01)

In [None]:
# Of course you can put several operations together to compute things, like a quadratic equation.  
# We will see how to do this on a set of numbers a bit later.
a = 2
b = 3
c = 4
y = a + b * x + c * x**2 
y

Note that the order of operations matters.  Evaluation is not just left to right.  There is a mnemonic that might help you remember the order in which calculations are done: Parentheses, Exponents, Multiplication, Division, Addition, Subtraction (PEMDAS).  Multiplication and division share the same precedence and are evaluated left to right, as are addition and subtraction.  It is often helpful to use parentheses to group operations even just for readability, but it can make the difference between getting the result right or wrong.

In [None]:
a + b * x

In [None]:
(a + b ) * x

## Strings

Strings are just text, like in the introductory "Hello World!" example.  

Let's explore some methods that operate on them, and explain an important distinction between data types.  Let's review quickly what we already know about strings.  We can assign any string to a variable like we would assign an integer or a float to a variable:

In [2]:
# Try this first just with a text string assigned to a variable
a = CP255

NameError: name 'CP255' is not defined

In [3]:
# The string needs to be in quotes for this variable assignment to work
a = "CP255"
type(a)

str

In [4]:
# The quotes can be single or double, but have to match, 
# or you will get an error as Python can't find the end of the string.
a = 'CP255"

SyntaxError: EOL while scanning string literal (<ipython-input-4-ecad0abe4292>, line 3)

What if you need to create a string that has multiple lines?  There are two ways to create such a string.  The first uses triple quotes.

In [5]:
X = """
  The Zen of Python:
  
  Beautiful is better than ugly.
  Explicit is better than implicit.
  Simple is better than complex.
  Complex is better than complicated.
"""

print(X)


  The Zen of Python:
  
  Beautiful is better than ugly.
  Explicit is better than implicit.
  Simple is better than complex.
  Complex is better than complicated.



In [6]:
# The second way uses \n to insert the line endings
X = "\n   Beautiful is better than ugly.\n   Explicit is better than implicit.\n   Simple is better than complex.\n   Complex is better than complicated."
print(X)


   Beautiful is better than ugly.
   Explicit is better than implicit.
   Simple is better than complex.
   Complex is better than complicated.


In [None]:
# Notice that the string object X actually has \n line endings as part of it. 
# The print function does not print those characters, it just starts a new line.
# But if you just type X, its built-in function to print itself shows its contents:
X

### Indexing and Slicing Strings

We can get individual elements of a string (characters) by using indexes, that give us pointers to the positions within a string.  

**Notice that counting in Python starts from zero -- essentially all counters are offsets from the first position. This can take a bit of getting used to -- think of it like the way building floors in Europe generally start with zero.  The first floor in Europe would be a second floor in the U.S.**

In [7]:
a[0]

'C'

We can use the string indexing method to extract a range, or a specific section of a string, beginning from any position and ending in any position.  

Python uses a syntax that separates the starting from the ending index position by a colon.  If we leave out the first or last, then the indexing gives all the values up to (but not including) the second value, or all the ones from the first value to the end.  Some examples should make this clearer: 

In [8]:
a[1:5]

'P255'

In [9]:
a[:5]

'CP255'

In [10]:
a[8:]

''

Let's see how this is going... How would we get a slice of a string 'a' that contains the first two elements?

In [12]:
a[:2]

'CP'
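
The slicing rules above can be summarized in a short sketch. The variable names are illustrative; the negative index counts from the end of the string, which we will see again with lists.

```python
course = "CP255"

first_two = course[:2]     # from the start up to (not including) index 2
rest = course[2:]          # from index 2 to the end
last_three = course[-3:]   # the final three characters

print(first_two, rest, last_three)   # CP 255 255

# The two halves of a slice at the same index always rebuild the original
rebuilt = first_two + rest
print(rebuilt == course)             # True
```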

### Working with Strings

In [13]:
# A variable containing a string is still an object, and can do things like print itself
a

'CP255'

Print works with strings the same way as with numbers, suppressing the quotes

In [None]:
print(a)

In [14]:
a = 'This is CP255!'

We can find the length of a string using the built-in len function

In [15]:
len(a)

14


Related to indexing, here is a string function to look up a specific substring within a string, and return its index, or position:

In [16]:
str.find(a, 'C')

8

Let's see what other string functions are available, using tab completion after str.:

In [None]:
str.

Some of these function names are pretty self-explanatory, like 'capitalize', but others are less so.  As usual, you can look up some quick help on any of those functions:

In [17]:
str.expandtabs?

Note that since we assigned a string to a variable, a, that variable is now an object of type string, and it has access to the string methods directly:

In [18]:
print(a)
a.find('T')

This is CP255!


0

We can check whether a string contains a character or substring:

In [19]:
'R' in a

False

We can remove specific characters from the beginning and end of a string with the strip method:

In [20]:
a.strip('!')

'This is CP255'
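
Note that `strip` only touches the two ends of a string. A minimal sketch with a made-up string shows the difference from `replace`, which acts on every occurrence:

```python
text = "!wow!yes!"

stripped = text.strip("!")         # removes '!' from both ends only
replaced = text.replace("!", "")   # removes every '!'

print(stripped)   # wow!yes
print(replaced)   # wowyes
```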

To remove any leading and trailing spaces from a string, just use the strip function with no argument:

In [21]:
b = ' ' + a
print(b)
print(b.strip())

 This is CP255!
This is CP255!


It is often helpful to put several operations together on one line, chaining them.  Going from left to right, we first take the characters from index 8 to the end of the string, then strip the '!' from that result, and then convert the result to lowercase:

In [22]:
a[8:].strip('!').lower()

'cp255'

Another handy function lets you capitalize each word:

In [None]:
a.title()

Note that we cannot assign a new letter to part of the string by its index location.  This is because in Python, strings are an **immutable** data type.  As we will see shortly, other data types like lists are **mutable**.

In [23]:
a[0] = 't'

TypeError: 'str' object does not support item assignment

There is a function that will let you replace string values, however:

In [24]:
print(a)
print(a.replace('!', '?'))

This is CP255!
This is CP255?


We can also convert strings to lists of strings by splitting on a delimiter:

In [25]:
c = 'lastname,firstname,streetnumber,streetname,city'
c.split(',')

['lastname', 'firstname', 'streetnumber', 'streetname', 'city']

In [26]:
c.split(',')[0]

'lastname'

## Converting between string and numeric types

In [27]:
rent = '2500'
type(rent)

str

Let's say we have a string object that contains numeric values and we want to do mathematical operations on it.  What happens?

In [28]:
rent * 2

'25002500'

In [29]:
rent * 1.5

TypeError: can't multiply sequence by non-int of type 'float'

If we need to do mathematical operations, we really need to convert this string object to a numeric type -- either an integer or a float.

In [30]:
rent_int = int(rent)
type(rent_int)

int

In [31]:
rent_int * 2

5000

In [32]:
rent_float = float(rent)
rent_float

2500.0

Recall that you can also convert an integer to a float by a mathematical operation that involves a floating point component so that the result is forced to type float:

In [33]:
rent_flt = rent_int * 1.5
rent_flt

3750.0

But notice that the int function won't convert a string that looks like a floating point number:

In [34]:
rent_i = int('2500.0')

ValueError: invalid literal for int() with base 10: '2500.0'

But you can do this if you first convert to float and then convert to int:

In [35]:
rent_i = int(float('2500.0'))
print(rent_i)
type(rent_i)

2500


int
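
If you need this two-step conversion often, it could be wrapped in a small function. This is only a sketch; `to_int` is a hypothetical helper name, and it assumes the string really does contain a number.

```python
def to_int(text):
    """Convert a numeric string such as '2500' or '2500.0' to an int.

    Hypothetical helper: any fractional part is dropped, not rounded.
    """
    return int(float(text))

from_plain = to_int('2500')
from_decimal = to_int('2500.0')
from_fraction = to_int('2500.7')

print(from_plain, from_decimal, from_fraction)   # 2500 2500 2500
```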

Of course, you sometimes may need to convert data from numeric to string type.  It works the same way:

In [36]:
rent_str = str(rent_int)
rent_str

'2500'

## Lists

You can think of strings as an ordered list of characters.  In Python, **lists** are another basic data type. Lists can contain any kind of object: strings, integers, floats, and others -- in any combination.  The syntax for lists is to include them as a sequence separated by commas, and enclosed in square brackets.  

### Creating Lists

We can create an empty list, and add elements to it:

In [37]:
mylist = []
mylist.append('this')

In [38]:
mylist

['this']

Notice that we can add lists, like we can add strings, to concatenate them:

In [39]:
# Besides using append as above, we can use + to add a list to a list, in this case we are adding a list with 1 item
mylist = mylist + ['that']
mylist

['this', 'that']

In [53]:
# We can also insert items in a specified location in a list
mylist.insert(1, 'and')
mylist

['this', 'and', 'that']

We can also convert a string that might be a sentence, or a line of data, to a list, so we can work with its elements more easily:

In [41]:
a = 'This is CP255!'
print('a = ', a)
b = str.split(a)
print('b = ', b)

a =  This is CP255!
b =  ['This', 'is', 'CP255!']


In [42]:
# And recalling that a is a string object, we can use the split function directly on a 
a.split()

['This', 'is', 'CP255!']

### Indexing Lists

Note that indexing works for lists like it does for strings.  And if you have a list of strings, you can index into both in a nested way.

In [43]:
# What is the content of the first item in the list?
mylist[0]

'this'

In [44]:
# What is the content of the last item in the list? We can use the index value -1 to get the last item
mylist[-1]

'that'

To get a range of values from a list, use a slice of the index values: [0:2] would get the first through the 2nd entry, since the range goes up to, but does not include, the value of the index after the colon.

In [45]:
mylist[0:2]

['this', 'and']

In [46]:
# How would we find the first character of the second word in our list?  We can 'nest' the indexing like this:
mylist[1][0]

'a'

### Working with Lists

What functions are available for list objects?

In [47]:
list.reverse?

In [49]:
# Find out the length of a list using len
len(mylist)

3

In [50]:
# Let's count the number of times a character or substring occurs in the string a (lists have a count method too)
a.count('5')

2

You can check whether a list contains an item, just as we did with strings.

In [48]:
'this' in mylist

True

In [54]:
# Delete the 3rd item in the list (remember it is indexed from 0). Let's make a copy of the list first
# since del is an inplace deletion
print(mylist)
shortlist = mylist.copy()
del shortlist[2]
shortlist

['this', 'and', 'that']


['this', 'and']
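
A word of caution about copying: plain assignment does not copy a list, it only gives the same list a second name. The sketch below uses illustrative variable names to show the difference between an alias and a real copy made with the list's `copy` method.

```python
original = ['this', 'and', 'that']

alias = original               # a second name for the SAME list
independent = original.copy()  # a new, separate list with the same items

del independent[2]             # only the copy changes
after_copy_delete = list(original)
print(after_copy_delete)       # ['this', 'and', 'that']

del alias[2]                   # the original changes too
print(original)                # ['this', 'and']
print(alias is original)       # True
```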

Remember that strings are immutable and we were unable to directly substitute a value of a character based on its index position?  Well, **lists are mutable**, and it does work to replace a value directly by its index value:

In [55]:
b[2] = 'mutable!'
b

['This', 'is', 'mutable!']

and we can put the list of strings together again to make a string from a list, inserting a space between each element:

In [56]:
c = str.join(' ',b)
c

'This is mutable!'

We can reverse the order of the items in a list. Notice that this is an in place operation.  Try it twice.

In [59]:
b.reverse()
b

['mutable!', 'is', 'This']

We can use the sort function to order the list.  Let's try it with a list of numbers first.

In [60]:
nums = [1, 3, 4, 5, 8, 6]
nums.sort()
nums

[1, 3, 4, 5, 6, 8]

In [61]:
nums.reverse()
nums

[8, 6, 5, 4, 3, 1]

And now with a list of words.

In [62]:
words = ['A', 'big', 'apple', 'pie']
words.sort()
print(words)

['A', 'apple', 'big', 'pie']
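
Notice that uppercase letters sort before lowercase ones by default. As a sketch, the built-in `sorted` function with `key=str.lower` gives a case-insensitive order instead. The list `fruits` is made up for illustration.

```python
fruits = ['banana', 'Cherry', 'apple']

default_order = sorted(fruits)                    # uppercase letters come first
ignore_case = sorted(fruits, key=str.lower)       # compare lowercase versions

print(default_order)   # ['Cherry', 'apple', 'banana']
print(ignore_case)     # ['apple', 'banana', 'Cherry']
print(fruits)          # sorted() returns a new list; the original is unchanged
```

Unlike the `sort` method, which reorders the list in place, `sorted` leaves the original list as it was.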


Note that -1 indexes the last item in a list

In [63]:
words[-1]

'pie'

and that adding a second index slices into the string stored in that list item

In [64]:
words[-1][:-1]

'pi'

There is a range function that is often helpful in creating a list of integers.  It requires one argument (the stop value, which is not included in the result) but can optionally accept arguments for the start and step size of the range.

In [65]:
a = list(range(10))
print(a)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [66]:
b = list(range(1,5))
print(b)

[1, 2, 3, 4]


In [67]:
c = list(range(10,100,5))
print(c)

[10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]


In [68]:
# Why did we use the 'list' function combined with 'range'?  Let's see what 'range' does by itself.
range(10,100,5)

range(10, 100, 5)

In [69]:
# It creates a special 'range' object
a = range(10)
type(a)

range

But the Python 'range' function is very helpful in a lot of contexts, including these simple examples used to create a list.

### Practice: Creating and Sorting List ###
Write code that creates a list with even numbers from 0 to 100 (including 100) and print the result in reverse order. 


In [72]:
d = list(range(0,101,2))
d.reverse()
print(d)

[100, 98, 96, 94, 92, 90, 88, 86, 84, 82, 80, 78, 76, 74, 72, 70, 68, 66, 64, 62, 60, 58, 56, 54, 52, 50, 48, 46, 44, 42, 40, 38, 36, 34, 32, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0]


### Practice: Inserting Elements in a List
Write code that adds the name "Norah" to the following list, after the name "Michael": names = ['Akshara','Anna','Aqshems','Chester','Echo','James','Jessica','Matthew','Michael','Philip','Sarah']

In [74]:
names = ['Akshara','Anna','Aqshems','Chester','Echo','James','Jessica','Matthew','Michael','Philip','Sarah']
names.insert(9, "Norah")
print(names)

['Akshara', 'Anna', 'Aqshems', 'Chester', 'Echo', 'James', 'Jessica', 'Matthew', 'Michael', 'Norah', 'Philip', 'Sarah']
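
The solution above counts the position by hand. An alternative sketch looks up the position with the list's `index` method, so it keeps working if the list changes. The variable name `position` is illustrative.

```python
names = ['Akshara', 'Anna', 'Aqshems', 'Chester', 'Echo', 'James',
         'Jessica', 'Matthew', 'Michael', 'Philip', 'Sarah']

# Find where 'Michael' is, then insert one position after it
position = names.index('Michael') + 1
names.insert(position, 'Norah')

print(position)   # 9
print(names)
```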


### Practice: List Indexing

Let's say we have a list called 'thing' containing the integers from 1 to 7 and we have variables low = 2 and high=5:

For each operation below, first think about what you think the answer will be, then write it as code in the cell below and confirm that it does what you expected. For readability add one cell at a time and execute it, starting by creating a list called 'thing' with the integer values 1...7, and variables low and high with values 2 and 5. Then answer each question below.

1- What does thing[low:high] do?

2- What does thing[low:] (without a value after the colon) do?

3- What does thing[:high] (without a value before the colon) do?

4- What does thing[-1] (a negative index) do?

5- What does thing[:-1] (a negative index after the colon) do?

6- What does thing[:] (just a colon) do?

7- How long is the list thing[low:high]?



In [78]:
#start by creating a list called 'thing' with the integers from 1 to 7, without typing them into a list
thing = list(range(1,8))
print(thing)
low = 2
high = 5

[1, 2, 3, 4, 5, 6, 7]


In [79]:
thing[low:high]

[3, 4, 5]

## Tuples

Continuing from numeric types, strings and lists, we now cover two more useful data types in Python: tuples and dictionaries.  We will cover how to create them, what they are used for, and how to use some of their methods.

Tuples are like lists, but are **immutable**.  The syntax is similar except tuples use parentheses instead of square brackets.

In [80]:
d = ('a', 'b', 'c')
print(d)

('a', 'b', 'c')


In [81]:
d[2] = 'z'

TypeError: 'tuple' object does not support item assignment

See?  It really is immutable.  You'll just get a traceback if you try.  Use immutables only when you don't want to allow them to be modified.

In [82]:
d.remove('c')

AttributeError: 'tuple' object has no attribute 'remove'

If you want to remove an element or update it, you could translate the tuple back to a list first.

In [83]:
print(d)
e = list(d)
e.remove('c')
print(e)

('a', 'b', 'c')
['a', 'b']


But notice that e is a list, not a tuple.  If we want the result to be a tuple, we have to convert it back from a list.

In [None]:
f = tuple(e)
print(f)

The zip function takes two or more collections (like lists) and combines them element by element to create tuples of the items with the same index value. If the collections differ in length, zip stops at the end of the shortest one. Here we create two lists of integers and zip them to create a list of tuples:

In [84]:
x = [1,2,3]
y = [4,5,6]
zipped = zip(x,y)
print(list(zipped))

[(1, 4), (2, 5), (3, 6)]
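
Two behaviors of `zip` are worth seeing once, sketched here with made-up lists: it stops at the end of the shorter input, and the object it returns can only be read through one time.

```python
numbers = [1, 2, 3]
letters = ['a', 'b']

# zip stops when the shorter list runs out, so 3 has no partner
pairs = list(zip(numbers, letters))
print(pairs)          # [(1, 'a'), (2, 'b')]

# A zip object is used up after one pass
zipped = zip(numbers, letters)
first_pass = list(zipped)
second_pass = list(zipped)
print(first_pass)     # [(1, 'a'), (2, 'b')]
print(second_pass)    # []
```

This is why the example above wraps `zipped` in `list()` before printing it.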


## Dictionaries

A Dictionary (or "dict") is a way to store data just like a list, but instead of using only numbers to get the data, you can use almost anything. This lets you treat a dict like it's a database for storing and organizing data.

A Python dictionary is a collection of key, value pairs. The **key** is a way to name the data, and the **value** is the data itself. 

Dictionaries are a very handy data type that can be used to manage data you need to look up by a key.  Dictionaries hold key - value pairs, with each key separated from its value by a colon.  Since Python 3.7 they remember the order in which keys were inserted, but you look items up by key rather than by position.  They are much more general than the word : definition kind of pairing, since the value can be many different kinds of objects.  The syntax in this case identifies a dictionary with curly braces, containing a comma-separated sequence of key-value pairs. 

### Creating Dictionaries


There are a few different ways to create dictionaries.  The first two create an empty dictionary.

In [3]:
newdict = {}

In [4]:
newdict=dict()

Another way to create a dictionary is to provide key: value pairs separated by commas, and put these into curly brackets:

In [5]:
antonyms = {'hot': 'cold', 'fast': 'slow', 'good': 'bad'}
print(antonyms)

{'hot': 'cold', 'fast': 'slow', 'good': 'bad'}


We can then add items to a dictionary using update, or assigning a value to a new key:

In [6]:
newdict.update({'new': 'item'})

In [7]:
newdict["next"] = "thing"

In [8]:
newdict

{'new': 'item', 'next': 'thing'}

Another way to create a dictionary is by converting lists.  This is a convenient thing to do with real data that comes from files, compared to the simple data we are using here.  The zip function is a bit advanced -- we will come back to it later when we talk about loops and iterables.  For now, just understand that it creates an iterable (think list) of tuples, containing the paired entries from the Keys and Values lists.

Notice that we can use the zip function to combine the keys and values to make the dictionary, making tuples of key-value pairs:

In [91]:
Keys = ['hot', 'fast', 'good']
Values = ['cold', 'slow', 'bad']
antonyms2 = dict(zip(Keys,Values))
print(antonyms2)

{'hot': 'cold', 'fast': 'slow', 'good': 'bad'}


### Working with Dictionaries
As usual, find the functions available for this class by using its name, dot, and tab:

In [None]:
dict.

We can retrieve the value of any dictionary entry by its key:

In [92]:
antonyms['hot']

'cold'

In [93]:
antonyms.get('hot')

'cold'
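
The two lookups differ when the key is missing: square brackets raise a `KeyError`, while `get` returns `None` or a default you supply. A minimal sketch, where the key `'cool'` and the default `'unknown'` are made up for illustration:

```python
antonyms = {'hot': 'cold', 'fast': 'slow', 'good': 'bad'}

missing = antonyms.get('cool')                # no such key, so get returns None
fallback = antonyms.get('cool', 'unknown')    # or the default we provide
present = 'cool' in antonyms                  # check before using antonyms['cool']

print(missing, fallback, present)   # None unknown False
# antonyms['cool'] would stop with: KeyError: 'cool'
```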

We can get the length, keys, and values of a dictionary:

In [94]:
len(antonyms)

3

To see all the keys in a dictionary, use the keys function:

In [95]:
print(antonyms.keys())

dict_keys(['hot', 'fast', 'good'])


The same thing works to get the values:

In [96]:
print(antonyms.values())

dict_values(['cold', 'slow', 'bad'])


### Dictionaries are mutable

We already saw that we can add elements to a dictionary. We can change the value associated with a particular key by just assigning a value:

In [97]:
antonyms['fast'] = 'gorge'
antonyms

{'hot': 'cold', 'fast': 'gorge', 'good': 'bad'}

As you can see, working with dictionaries is kind of like working with
lists and tuples, except that you can’t join dicts with the plus operator
(+). If you try to do that, you’ll get an error message:

In [98]:
antonyms = {'hot': 'cold', 'fast': 'slow', 'good': 'bad'}

synonyms = {'hot': 'very warm', 'fast': 'quick', 'good': 'fine'}

antonyms+synonyms

TypeError: unsupported operand type(s) for +: 'dict' and 'dict'

OK, merging antonyms and synonyms into a single dictionary doesn't make a lot of sense, but we're just learning how to use dictionaries... Here is one way to merge the dictionaries.  But notice the result has only three elements, not six. Why?

In [99]:
newdict = {}
newdict.update(antonyms)
newdict.update(synonyms)
newdict

{'hot': 'very warm', 'fast': 'quick', 'good': 'fine'}

Maybe the result is different if we ensure the keys are unique?

In [103]:
antonyms = {'hot': 'cold', 'fast': 'slow', 'good': 'bad'}
antonyms2 = {'blue': 'cold', 'red': 'hot'}
newdict = {}
newdict.update(antonyms)
newdict.update(antonyms2)
newdict

{'hot': 'cold', 'fast': 'slow', 'good': 'bad', 'blue': 'cold', 'red': 'hot'}
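
These two results follow from one rule: each key can appear only once, so when two dictionaries share a key, the value added last wins. The sketch below shows the same merge written with dictionary unpacking (`**`), an alternative to calling `update` twice. The variable name `merged` is illustrative.

```python
antonyms = {'hot': 'cold', 'fast': 'slow', 'good': 'bad'}
synonyms = {'hot': 'very warm', 'fast': 'quick', 'good': 'fine'}

# Later entries overwrite earlier ones that use the same key
merged = {**antonyms, **synonyms}

print(len(merged))      # 3, because the keys overlap completely
print(merged['hot'])    # very warm
print(antonyms['hot'])  # cold: the original dictionaries are unchanged
```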

If you want to delete a dictionary entry, use del:

In [104]:
del newdict['red']
newdict

{'hot': 'cold', 'fast': 'slow', 'good': 'bad', 'blue': 'cold'}

What happens if you try to rerun the cell above after you have already run it?

In [105]:
cityPlanners_dict = {"name": "Jane Jacobs",
                     "year of birth": 1916,
                     "year of death": 2006,
                     "place of birth": "Pennsylvania"}

- The keys have to be **unique** and **immutable**. The usual suspects are strings and integers.
- The values can be anything, including lists, and even other dictionaries (nested dictionaries):

In [106]:
cityPlanners_dict = {"name": "Jane Jacobs",
                     "year of birth": 1916,
                     "year of death": 2006,
                     "place of birth": "Pennsylvania",
                     "books": ["The Death and Life of Great American Cities",
                               "Cities and the Wealth of Nations","Dark Age Ahead",
                               "Eyes on the Street: The Life of Jane Jacobs",
                               "The Economy of Cities"]}


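- key/value pairs are looked up **by key, not by position**. Since Python 3.7 a dictionary keeps its keys in the order they were inserted, which is the order you see when it prints, but you cannot index it with a number the way you index a list.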

In [108]:
print(cityPlanners_dict)
print(cityPlanners_dict.keys())

{'name': 'Jane Jacobs', 'year of birth': 1916, 'year of death': 2006, 'place of birth': 'Pennsylvania', 'books': ['The Death and Life of Great American Cities', 'Cities and the Wealth of Nations', 'Dark Age Ahead', 'Eyes on the Street: The Life of Jane Jacobs', 'The Economy of Cities']}
dict_keys(['name', 'year of birth', 'year of death', 'place of birth', 'books'])


### Use dictionary keys to access the values

- Instead of using indices to extract items, dictionaries use keys to find and retrieve values.

In [109]:
print(cityPlanners_dict.keys(),'\n')
print(cityPlanners_dict.values())

dict_keys(['name', 'year of birth', 'year of death', 'place of birth', 'books']) 

dict_values(['Jane Jacobs', 1916, 2006, 'Pennsylvania', ['The Death and Life of Great American Cities', 'Cities and the Wealth of Nations', 'Dark Age Ahead', 'Eyes on the Street: The Life of Jane Jacobs', 'The Economy of Cities']])


- If you wanted the value of a particular key:

In [110]:
cityPlanners_dict["name"]

'Jane Jacobs'

- Or perhaps you wanted the last element of the `books` list

In [111]:
cityPlanners_dict["books"][-1]

'The Economy of Cities'

### Dictionaries compared to lists

In general, if you need data to be ordered or you have only simple data not needing to be subset, use a list.

If the data is complex or hierarchical, the dictionary's `key` / `value` structure can be very helpful. If you are only concerned about membership in a collection, looking up a key in a dictionary is typically much faster than searching a large list, because the dictionary can jump straight to the key instead of checking every element in turn. And to make a hierarchical or nested data structure, you can put a list (or even another dictionary!) inside a dictionary as the `value`.

**Looking Ahead**: when you begin looking at data embedded in websites, it is generally going to be in JSON format, which is comprised of, guess what? Nested Dictionaries!
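
As a small preview, the standard `json` module turns JSON text into Python dictionaries and lists. This is only a sketch: the JSON string below is a shortened, made-up sample based on the dictionary used earlier, not data from a real website.

```python
import json

# A made-up JSON sample, written as a Python string
text = '{"name": "Jane Jacobs", "year of birth": 1916, "books": ["Dark Age Ahead", "The Economy of Cities"]}'

record = json.loads(text)      # parse the JSON text

print(type(record))            # <class 'dict'>
print(record["name"])          # Jane Jacobs
print(record["books"][-1])     # The Economy of Cities
```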

### You can easily extend a dictionary by adding new keys

- Note that dictionaries are "indexed" with square brackets, just like lists--they look the same, even though they're very different.

In [112]:
cityPlanners_dict["place of birth"] = "San Francisco"
print(cityPlanners_dict)

{'name': 'Jane Jacobs', 'year of birth': 1916, 'year of death': 2006, 'place of birth': 'San Francisco', 'books': ['The Death and Life of Great American Cities', 'Cities and the Wealth of Nations', 'Dark Age Ahead', 'Eyes on the Street: The Life of Jane Jacobs', 'The Economy of Cities']}


In [113]:
cityPlanners_dict["gender"] = "Female"
print(cityPlanners_dict)

{'name': 'Jane Jacobs', 'year of birth': 1916, 'year of death': 2006, 'place of birth': 'San Francisco', 'books': ['The Death and Life of Great American Cities', 'Cities and the Wealth of Nations', 'Dark Age Ahead', 'Eyes on the Street: The Life of Jane Jacobs', 'The Economy of Cities'], 'gender': 'Female'}


### You can loop through dictionaries

We haven't gotten to iteration just yet - we'll cover it in more detail in the next couple of sessions, but one of the things you can do with dictionaries is to iterate over their elements.  You can do this with all data types that are *iterable*, including the lists, strings, and tuples we have already covered.

- There are several ways to loop through dictionaries. Looping over `.keys()` using a 'for' loop is an easy method.
- Note the order is not sorted by key.

In [114]:
race = {'white': 0.643, 'african_american': 0.068, 'asian': 0.21, 'other': 0.079}

for key in race.keys():
    print(key, race[key])

white 0.643
african_american 0.068
asian 0.21
other 0.079


Using a for loop makes it really easy to change the value of items in the dictionary, like transforming fractions to percentages:

In [115]:
# translate fractions to percentages 
race = {'white': 0.643, 'african_american': 0.068, 'asian': 0.21, 'other': 0.079}
for key in race.keys():
    race[key] = round(100 * race[key],2)

print(race)

{'white': 64.3, 'african_american': 6.8, 'asian': 21.0, 'other': 7.9}
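
Another common way to loop, sketched below, uses the dictionary's `items` method, which hands you each key together with its value. The second half builds a new dictionary of percentages instead of changing the original. The names `group`, `share`, and `percentages` are illustrative.

```python
race = {'white': 0.643, 'african_american': 0.068, 'asian': 0.21, 'other': 0.079}

# items() yields (key, value) pairs, so no second lookup is needed
for group, share in race.items():
    print(group, share)

# Build a new dictionary, leaving the original fractions untouched
percentages = {}
for group, share in race.items():
    percentages[group] = round(100 * share, 2)

print(percentages)
```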


To see if something is in a collection like a list or a dictionary, use the `in` operator:

In [116]:
countries = ["Afghanistan", "Canada", "Denmark", "Japan"]
race = {'white': 0.643, 'african_american': 0.068, 'asian': 0.21, 'other': 0.079}

print('Japan' in countries)
print('Iran' in countries)
print('asian' in race)
print('asian' not in race)

True
False
True
False


*****

### Dictionary Summary

1. A Python dictionary is a collection of key, value pairs.
2. Use dictionary keys to access the values.
3. Once a dictionary has been created, you can change the values of the data and assign new keys.
4. You can loop through key/value pairs in a dictionary.

## What You Learned


In this session, you learned how Python uses numeric data types like ints and floats, and how string data types work.  You learned that Python indexing starts at 0 and learned how to do some string manipulations and convert data between types.  You also learned how to create and work with lists, tuples, and dictionaries.

Now would be a good time to review and experiment with these data types and methods to get comfortable with them.  They are the first pieces of the foundation you are building to become productive in Python programming and urban data science.

# Sources

This notebook was heavily adapted from previous course material by Prof. Paul Waddell and Samuel Maurer.