## Welcome to Python!

In [None]:
%matplotlib inline

This notebook includes a quick overview of the elements of the Python language we will need to get started.   Each editable box contains a chunk of code that will execute when you press **shift + enter**.  Try it here - try it with and without the "print":

In [None]:
print ("Hello world!")

### Variables

A variable references a piece of data that you can store in memory for later use.  Variables can have one of several **types**, especially **string** (sequence of characters), **integer** (whole number), or **float** (number with decimals). Variables can have any name we want, without spaces, usually consisting of letters and underscores. 

You can put a value into a variable using the equals sign.  More details are <a href="http://www.python-course.eu/variables.php">here</a>.

Try running these commands to see variables in action:

In [None]:
s = "Hello world"
print (s)
print (type(s))

Note that comments are preceded by the hash sign. 

In [None]:
f = 123.45
print (f)
print (type(f))

s = str(f)  # str() forces the floating point number to be converted to a string.
print(s)
print (type(s))

f = float(0)   # here, float() forces the zero to be interpreted as a floating-point number instead of an integer
print (type(f))

print(type(5))    # a whole number written without a decimal point is an integer
print(type(5.0))  # the decimal point forces the value to a float as well
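The conversion functions also work in the other direction. The following is a minimal sketch, with made-up values and variable names, of turning strings into numbers and of the difference between truncating and rounding a float:

```python
whole = int("42")          # string -> integer
price = float("3.14")      # string -> float
truncated = int(3.99)      # float -> integer simply drops the fractional part
rounded = round(3.99)      # round() goes to the nearest whole number instead
label = str(whole) + " items"  # number -> string, so it can be joined to text

print(whole, price, truncated, rounded)
print(label)
```

Note that int("3.14") raises an error, because the string does not look like a whole number.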

Python supports basic arithmetic operators, which are overloaded to support different types, but not all types.  For example:

In [None]:
print (s + "!")
print (f + 0.5)
print (s + 0.5)  # Note that this causes an error! How might you fix this?
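For reference, here is a short sketch of the most common operators and how their meaning changes with the type of the operands. The values are arbitrary examples chosen for illustration:

```python
# Arithmetic operators on numbers
print(7 + 2)    # addition
print(7 / 2)    # true division always returns a float in Python 3
print(7 // 2)   # floor division drops the fractional part
print(7 % 2)    # modulo: the remainder after division
print(7 ** 2)   # exponentiation

# The same symbols do different jobs for other types
greeting = "Hello" + " " + "world"  # + concatenates strings
echo = "ab" * 3                     # * repeats a string
combined = [1, 2] + [3]             # + concatenates lists
print(greeting, echo, combined)
```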

The built-in dir() function can help you keep track of the variables you have created; del deletes them:

In [None]:
test_variable = 0
print (dir())
del(test_variable)
print (dir())

### Lists and Dictionaries 

Lists are the workhorse collection type for managing series of similar variables.  They consist of a series of elements, indicated by square brackets. For more information on lists and dictionaries, see: https://docs.python.org/3/tutorial/datastructures.html.

In [None]:
my_list = [1,2,9,6,2]
print (my_list)

Individual elements are accessed by a sequential index number, also surrounded by square brackets, starting with zero. Negative numbers wrap to the end of the list.  Note the error when we access an index number past the end of the list:

In [None]:
print (my_list[2])
print (my_list[-1])
print (my_list[99])  # Note that this causes an error!
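There are two common ways to avoid an out-of-range error. This sketch uses a hypothetical index variable and shows both; try/except is a Python feature not otherwise covered in this notebook, so treat that part as a preview:

```python
my_list = [1, 2, 9, 6, 2]
index = 99  # hypothetical index, deliberately past the end of the list

# Option 1: check the index against the length before using it
if -len(my_list) <= index < len(my_list):
    print(my_list[index])
else:
    print("Index", index, "is out of range")

# Option 2: attempt the access and handle the IndexError if it is raised
try:
    value = my_list[index]
except IndexError:
    value = None  # fall back to a placeholder value
print(value)
```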

A new list can be created with empty brackets or list():

In [None]:
another_list = []
print (type(another_list))
another_list = list()
print (type(another_list))

There are several useful built-in functions and list methods that let you manipulate and query lists:

In [None]:
lst = [1,9,4,7]
print (lst)

print(lst.index(9))  # Gets the index of an item in the list.

lst.append(5)  # Appends a single item
print(lst)

lst.insert(0, 99)  # Inserts a single item
print(lst)

lst.extend( [25, 24, 23, 22] )  # Appends a list of items
print(lst)

print (sum(lst))  # adds up the items
print (len(lst))  # returns the number of items

In [None]:
a = max(lst)
b = min(lst)
print (a, b) # note that print can handle a flexible number of parameters

my_sorted_list = sorted(lst)  # Returns a sorted copy; the original list is unchanged
print ("Sorted list:", my_sorted_list)
print ("Original unsorted list:", lst)
lst.sort(reverse=True)  # Sorts the list in place, here in descending order
print ("Reverse-sorted in place:", lst)

Slice notation lets you slice and dice lists:

In [None]:
print (lst)
print(lst[0:3])  # The first three elements in the list
print(lst[-2])  # The second-to-last item in the list
print(lst[:3])  # Same as lst[0:3]; an omitted start defaults to zero
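Slices also accept an open end and an optional third "step" value. A small sketch, using a made-up list of letters so the results are easy to read:

```python
letters = ["a", "b", "c", "d", "e", "f"]  # hypothetical sample list

tail = letters[3:]           # from index 3 through the end
last_two = letters[-2:]      # the last two items
every_other = letters[::2]   # every second item, starting at index 0
backwards = letters[::-1]    # a reversed copy; the original is unchanged

print(tail)
print(last_two)
print(every_other)
print(backwards)
```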

Note that strings work similarly:

In [None]:
name_str = 'Foundation Center'
print(name_str[0:3])  # The first three characters in the string
print(name_str[-2])  # The second-to-last character in the string
print(name_str[:3])  # Same as name_str[0:3]

Dictionaries are also very common in Python.  They consist of key / value pairs, indicated by curly braces:

In [None]:
my_dictionary = { "word": "the", "count": 123456789 }
print (my_dictionary)

Individual elements are accessed by key, also using *square* brackets, for example:

In [None]:
print (my_dictionary["word"])
print (my_dictionary["count"])
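The same square-bracket notation is used to add or change entries. The sketch below also shows get(), which returns a fallback value rather than raising an error when a key is missing. The "language" and "rank" keys are invented for illustration:

```python
my_dictionary = {"word": "the", "count": 123456789}

my_dictionary["language"] = "English"                # assigning to a new key adds it
my_dictionary["count"] = my_dictionary["count"] + 1  # assigning to an existing key updates it

# get() returns the default ("unknown" here) if the key is not present
rank = my_dictionary.get("rank", "unknown")

print(rank)
print(my_dictionary)
```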

The keys and values can be accessed as list-like views, which can be passed to list() if you need an actual list:

In [None]:
print (my_dictionary.keys())
print (my_dictionary.values())

With both lists and dictionaries, you can test whether or not they contain individual elements:

In [None]:
if ("word" in my_dictionary.keys()):  # Note colon sets off if/else statements
    print (my_dictionary["word"])
else:
    print ("Word not found")

if (25 in lst):
    print ("25 found!")

Lists and loops go together!

In [None]:
for s in my_dictionary.keys():
    print (my_dictionary[s])

You can also iterate over both keys and values simultaneously:

In [None]:
for k, v in my_dictionary.items():
    print ("{0} is the key; {1} is the value".format(k, v))  # Note the string placeholders and format call.
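In recent versions of Python (3.6 and later, which is an assumption about your environment), the same result can be written with an f-string, where the variable names go directly inside the braces. A minimal sketch:

```python
my_dictionary = {"word": "the", "count": 123456789}

lines = []
for k, v in my_dictionary.items():
    # The leading f marks the string as an f-string; {k} and {v} are filled in
    lines.append(f"{k} is the key; {v} is the value")
print(lines)

# A format specification after a colon controls how numbers are displayed
formatted = f"{1234567.891:,.2f}"  # thousands separators, two decimal places
print(formatted)
```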

The range() built-in function returns a sequential series, which can be cast to a list:

In [None]:
print(list(range(5)))  # 0 to n-1
print(list(range(1, 5)))  # m to n-1
print(list(range(0, 100, 5)))  # m to n-1 step x

*List comprehensions* use a special variation of the bracket notation to iterate over a list.

In [None]:
a = range(10)
print (a)  # In Python 3 this shows range(0, 10) rather than the individual numbers
b = [n/10.0 for n in a]
print (b)
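Comprehensions can also filter items with an if clause, and a similar notation with curly braces builds a dictionary. The word list below is a made-up example:

```python
numbers = range(10)

squares = [n ** 2 for n in numbers]         # transform every item
evens = [n for n in numbers if n % 2 == 0]  # keep only items that pass the test

# A dictionary comprehension produces key: value pairs
lengths = {w: len(w) for w in ["the", "quick", "fox"]}

print(squares)
print(evens)
print(lengths)
```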

Finally, you can zip lists together into a dictionary or a list of tuples:

In [None]:
list1 = ["a", "b", "c"]
list2 = [123, 945, 876]
t = zip(list1, list2) # zip creates an iterable object that combines the two lists
print (dict(zip(list1, list2)))  # This can then be converted to a dictionary...
print (list(zip(list1, list2)))  # ...or to a list of tuples (tuples are like immutable lists)
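A zip object can also be used directly in a for loop, which is often the most convenient way to walk through two lists in parallel. This sketch additionally shows two behaviors worth knowing: zip stops at the shorter input, and the * operator can "unzip" pairs back into separate sequences:

```python
list1 = ["a", "b", "c"]
list2 = [123, 945, 876]

# Each pass through the loop unpacks one pair into two variables
for letter, number in zip(list1, list2):
    print(letter, number)

# zip stops as soon as the shortest input runs out
short = list(zip(list1, [1, 2]))
print(short)

# The * operator passes each pair as a separate argument, reversing the zip
pairs = list(zip(list1, list2))
letters, numbers = zip(*pairs)
print(letters, numbers)
```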

***
### Questions

Here is a list of the populations of the 50 states from geonames.org. Run the cell to load the numbers into the state_populations variable:

In [None]:
state_populations = [
    4530315,
    660633,
    5863809,
    2757631,
    37691912,
    4678630,
    3527249,
    838549,
    552433,
    17671452,
    8975842,
    1284220,
    1416564,
    12772888,
    6265933,
    2955010,
    2740759,
    4206074,
    4515939,
    1325518,
    5624246,
    6433422,
    9883360,
    5141953,
    2901371,
    5768151,
    930698,
    1757399,
    2399532,
    1316216,
    8751436,
    1912684,
    19274244,
    8611367,
    630529,
    11467123,
    3547049,
    3642919,
    12440621,
    1050292,
    4229842,
    770184,
    5935099,
    22875689,
    2427340,
    624501,
    7642884,
    6271775,
    1817871,
    5535168,
    505907
]

***
### 1. Write some code below that prints:
* The number of states
* The total population of all states
* The maximum population
* The average population
* The top 10 populations

***
### 2. Use the data loaded below to:

* figure out what the highest and lowest per-capita arts funding amounts are.
* figure out what states those amounts refer to
* create a dictionary where the keys are state names, and the values are per-capita arts funding.

The arts_funding_data file contains, in the same order:
* state_names
* state_populations
* state_arts_funding

In [None]:
import arts_funding_data as d  # Import runs the content of the reference file; "as" gives it an alias.

print (d.state_names, d.state_populations, d.state_arts_funding)

### Functions

In Python, functions are indicated by parentheses following the name. Functions must first be *defined* before they can be used.  

Note the colon following the function definition. Also, note that in Python indentation is significant - it knows the second line below is part of the function because it is indented following the function name:

In [None]:
def say_hello():
    print ("Hello world")

Functions are *called* by using the parentheses only, without the "def" or colon:

In [None]:
say_hello()

Within the parentheses, *arguments* can be *passed* to the function, which receives them as *parameters* and operates on them.  For example:

In [None]:
def say_hello(name):
    print ("hello, " + name + "!")

This then lets us use the function with multiple different values:

In [None]:
say_hello("there")
say_hello("bunny rabbit")
s = "Horatio Hornblower"
say_hello(s)

Functions can also be set up to return values, using the keyword "return":

In [None]:
def square(n):
    return (n ** 2)

The returned values can then be assigned to variables, for example:

In [None]:
b = square(4)
c = b * square(b)
print (c)

Python functions can return multiple values, separated by commas.  They come back packaged together as a tuple:

In [None]:
def sqr_sqrt(x):
    return x**2, x**.5
sqr_sqrt(100)
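The returned values can be kept together or unpacked into separate variables in a single assignment. A short sketch reusing the function above (the variable names are chosen for illustration):

```python
def sqr_sqrt(x):
    return x**2, x**.5

result = sqr_sqrt(100)
print(type(result))  # the two values arrive as one tuple
print(result[0])     # individual values can be picked out by index

squared, root = sqr_sqrt(100)  # or unpacked, one variable per returned value
print(squared, root)
```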

### Question

Using what we have learned about lists and dictionaries, write three functions:

* One that returns the average value of any given list of numbers.
* Another that returns a dictionary of information containing descriptive data about any given list of numbers, for example:
```
{
  "count": 12,
  "average": 123.45,
  "min": 0,
  "max": 500
}
```
* And a third that returns the top 5 elements in the list.

### Modules

Built-in functions are often grouped into *modules*, for example random, which contains functions for generating random numbers.  The dot notation means that you are referencing something that is contained in something else, in this case the randint function within the random module, which returns a random whole number between the two given values (inclusive).

In [3]:
import random 
random.randint(1, 10)

6
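A few more ways of working with modules are sketched below: importing another standard module (math), importing a single name with from ... import, and fixing the random seed so that a "random" result can be reproduced. The seed value 42 and the color list are arbitrary choices for illustration:

```python
import math
import random
from random import choice  # imports one name so it can be used without the prefix

print(math.sqrt(16))  # square root function from the math module
print(math.pi)        # modules can contain constants as well as functions

random.seed(42)       # fixing the seed makes the sequence repeatable
first = random.randint(1, 10)
random.seed(42)       # resetting to the same seed replays the same sequence
second = random.randint(1, 10)
print(first == second)

pick = choice(["red", "green", "blue"])  # a randomly selected list item
print(pick)
```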

### Flow Control

*for*-*in* iterates over items; *if*-*else* conditionally selects between statements.  Note the colon and indentation define which statements are included in the conditional blocks.

In [4]:
x = range(10)
for a in x:
    if a > random.random() * 10:
        print (a)
    else:
        print ("false")
print ("all done")

false
false
false
false
false
5
6
false
8
9
all done
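Two related constructs are worth knowing, shown here as a minimal sketch with made-up values: *elif* adds further conditions between the if and the else, and *while* repeats a block for as long as a condition remains true.

```python
labels = []
for a in range(5):
    if a == 0:
        labels.append("zero")
    elif a % 2 == 0:      # checked only when the first condition was false
        labels.append("even")
    else:                 # runs when none of the conditions above matched
        labels.append("odd")
print(labels)

countdown = 3
while countdown > 0:      # the condition is re-checked before every pass
    print(countdown)
    countdown -= 1        # without this line the loop would never end
print("all done")
```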


### Sets and Tuples

Lists have several defining features: elements can be looked up by index number, values need not be unique, the contents can be of mixed types, and the list can be changed after it is created.

In contrast, a *set* is a collection of unique items, useful when data needs to be unique, and when you need to carry out set operations on your data. You can use sets to extract unique items from lists, by casting lists as sets.

In [5]:
my_list = [1,2,2,3]
print (my_list)
s1 = set(my_list)
print (s1)
s2 = {3,4,5}
print (s2)
print (type(s2))

[1, 2, 2, 3]
set([1, 2, 3])
set([3, 4, 5])
<type 'set'>


Set operators can occasionally be handy:

In [6]:
print (s1 | s2) # Union
print (s1 & s2) # Intersection
print (s1 - s2) # Difference
print (s1 ^ s2) # Symmetric difference - elements in either set, but not in both

set([1, 2, 3, 4, 5])
set([3])
set([1, 2])
set([1, 2, 4, 5])
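Sets also support adding items, membership tests, and subset checks. The sketch below uses made-up values; note that sets have no defined order, so the example sorts the unique words to get a predictable list:

```python
s1 = {1, 2, 3}

s1.add(3)    # adding an item that is already present changes nothing
s1.add(10)   # adding a new item grows the set
print(s1)

print(2 in s1)               # membership test
print({1, 2}.issubset(s1))   # True if every item of {1, 2} is also in s1

words = ["the", "cat", "the", "hat"]  # hypothetical list with a duplicate
unique_words = sorted(set(words))     # deduplicate, then sort into a list
print(unique_words)
```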


A tuple is an ordered collection of items, not necessarily unique. Tuples are lighter-weight than lists and are *immutable*: once created they cannot be changed, so they do not support operations like inserting or appending.  They are good for data where you know there will always be the same number of items, like coordinate pairs, which always have two numeric elements, versus polygon vertex lists, which can contain an unpredictable number of items.


In [7]:
t1 = (1,2,3)
print (t1)
t2 = (6,) # For python to know this single item is a tuple and not just a number with parens around it, it needs the trailing comma
print (t1 + t2)

(1, 2, 3)
(1, 2, 3, 6)
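The sketch below, using a hypothetical coordinate pair, shows three things tuples are commonly used for or known by: unpacking into separate variables, refusing in-place changes, and serving as dictionary keys (which lists cannot do):

```python
point = (3.5, -2.0)  # hypothetical coordinate pair

x, y = point         # unpack the two elements into two variables
print(x, y)

# Tuples cannot be modified after creation; trying to raises a TypeError
try:
    point[0] = 0
except TypeError as err:
    message = str(err)
print(message)

# Because they never change, tuples can be used as dictionary keys
locations = {point: "survey marker"}
print(locations[(3.5, -2.0)])
```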


***
### Some basic plots

Matplotlib is a basic plotting library commonly used in Python.  The notebook environment lets you display plots inline with your code; outside of notebooks, they come up as separate windows.

In [None]:
import matplotlib
import matplotlib.pyplot as plt
import arts_funding_data as d 

plt.hist(d.state_populations) # Histogram
plt.show()
plt.scatter(d.state_populations, d.state_arts_funding) # Scatterplot
plt.show()
plt.bar(range(len(d.state_arts_funding)), height=d.state_arts_funding)
plt.show()

The subplot command lets you tile plots as needed. See http://matplotlib.org/users/pyplot_tutorial.html for more examples.

In [None]:
plt.subplot(1,3,1)  # Num rows, num columns, current plot index
plt.hist(d.state_populations) # Histogram
plt.subplot(1,3,2)
plt.scatter(d.state_populations, d.state_arts_funding) # Scatterplot
plt.subplot(1,3,3)
plt.bar(range(len(d.state_arts_funding)), height=d.state_arts_funding)
plt.show()

***
### 3. Using this information, create a bar chart of the top 10 states for per-capita arts grant funding.

***
### 4. Write a function that creates a scatterplot between two lists of numbers.  Test it with 5 different sets of random numbers.