In [None]:
#
# SciPy Teen Track Morning Session
# Part 1 (Before Break)
# Emily Quinn Finney
#

# Introduction
# Time estimate: 10 min
"""
Today we are going to review the basics of Python, including some of its
special built-in objects. We will also study other SciPy packages useful
for data analysis. These include NumPy ("Numerical Python"), which helps
analyze numerical data with many-dimensional tables called arrays, and
matplotlib (a MATLAB-style plotting library), which helps visualize data
with various plotting tools. We will go through examples of Python code and
write our own scripts. At the end of the day, we will do a mini-analysis
that puts all those skills together.

We will have a morning session and an afternoon session, each of which
will have a fifteen-minute break in the middle. Bathrooms are located _,
and please no eating and drinking in the conference room(?).

Here's our tentative schedule:
9am-10:30am   Introduction to Jupyter Notebooks. Introduction to basic
              Python syntax (data types, if/else statements, for/while
              loops, indexing, string slicing). 

10:45am-12pm  Introduction to more advanced data types (dictionaries,
              lists, sets) and their functionality. Introduction to
              importing packages. Brief introduction to NumPy. 

1:30pm-3pm    Introduction to functions. Introduction to matplotlib.
              We'll start to learn what's cool in our IMDB data by
              making plots of movie popularity.

3:15pm-5pm    Mini-analysis of the IMDB data set.


I have given each person a green post-it note and a red post-it note. As
we are going through the material, if you have finished the tutorial task,
put up your green post-it note to let me know. If you are confused about
the task, put up your red post-it note. In addition, if you have a question
or for any reason want to talk to someone about the tutorial material, feel
free to raise your hand to ask me or one of the tutorial volunteers for
assistance. That's what we're here for!
"""

In [None]:
# Introduction to Jupyter Notebooks.
# Time estimate: 10 min

"""
Here I'll have them set up their computers for the tutorial: access the
files (via a URL on GitHub), then start Jupyter Notebook from their
Canopy environments. Make sure everything is running smoothly. Explain
a little bit about what a notebook is and does, and how to run basic
commands (like adding a new cell, running a cell, etc.).
"""

In [None]:
# Introduction to basic Python syntax
"""
Many of you are already familiar with the basic syntax of Python. But we're
going to spend this first section reviewing, to make sure everyone is on
the same page for later in the day. I learned some new stuff about Python
while preparing for this tutorial, so even if you are already familiar
with basic Python syntax be on the lookout for cool new tidbits! Again, if
you feel lost or stuck, put up a red post-it note or raise your hand. I
and the tutorial volunteers are happy to help.
"""

# Data types
# Time estimate: 10 min
# ints, floats, booleans, strings (won't mention long, complex)
"""
If you've used a programming language before, you are probably familiar
with some of the basic objects a program works with: floats, ints,
booleans, and strings. The built-in type() function reports the type of
any object.
"""
2.0
print(type(2.0))

2
print(type(2))

True
print(type(True))

"Hi scipy"
print(type("Hi scipy"))
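
"""
As a rough sketch of how these types relate, the cell below converts
between them with the built-in functions int(), float(), str(), and
bool(). The values are made up for illustration and are not part of the
tutorial data.
"""

```python
# Converting between the basic types (illustrative values only).
year_text = "2005"            # a string that happens to contain digits
year = int(year_text)         # str -> int
rating = float("8.3")         # str -> float
label = str(year)             # int -> str

print(type(year_text), type(year), type(rating), type(label))
print(year + 1)               # arithmetic works once we have an int
print(bool(0), bool(1))       # 0 counts as False, other numbers as True
```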

In [None]:
# lists
"""
There are also objects that group other objects together.
"""
[1,2,3]
print(type([1,2,3]))

['h','i',' ','s','c','i','p','y']
print(type(['h','i',' ','s','c','i','p','y']))
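
"""
One more sketch of what "grouping" means: a list remembers how many items
it holds, and each item keeps its own type. The list contents here are
hypothetical examples.
"""

```python
# A hypothetical list of movie titles, used only for illustration.
movies = ["Batman Begins", "Avatar", "Spectre"]
print(len(movies))          # number of items in the list
print(type(movies))         # the container is a list
print(type(movies[0]))      # but each item inside is still a string

# A list can mix types, although one type per list is usually clearer.
mixed = [2005, 8.3, True, "Batman Begins"]
print(type(mixed[0]), type(mixed[1]), type(mixed[2]), type(mixed[3]))
```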

In [None]:
# assigning variables
movie_title = "Batman Begins"
print(movie_title)
print(type(movie_title))

"""
Here have them do a few minutes of examples with types, perhaps involving
the Davis farmers' market because that is ridiculous.
"""

In [62]:
# Concatenating, indexing, slicing
# Time estimate: 20 min
"""
So now that we've got types down, let's start doing more with them:
concatenating, indexing, and slicing.
"""

# indexing
movie_title[0]
print(movie_title[0])

movie_title[1]
print(movie_title[1])

movie_title[-1]
print(movie_title[-1])

# slicing (introduce len)
print(movie_title[:6])

# adding strings and numbers etc.
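
"""
A minimal sketch of len, slicing, and concatenation. The title is defined
again so the cell runs on its own, and release_year is a value added here
just for the example.
"""

```python
movie_title = "Batman Begins"   # same title as in the earlier cell

# len() counts every character, including the space
print(len(movie_title))

# slicing: everything from index 7 to the end
print(movie_title[7:])

# concatenation: + joins strings end to end
print(movie_title[:6] + " Returns")

# a number must be converted with str() before joining it to a string
release_year = 2005             # added for illustration
print(movie_title + " (" + str(release_year) + ")")
```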


In [65]:
# If/else
# Time estimate: 10 min
# create a more introductory example
answer = input("What is your favorite animal? ")  # input() always returns a string
length = len(answer)
# answer[1:length] is everything after the first letter (same as answer[1:])
if answer[1:length] == "at":
    print("Your favorite movie must be " + answer + "man!")
elif answer == "spider":
    print("Your favorite movie must be " + answer + "man!")
else:
    print("I don't think you even like movies.")
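
"""
A possible more introductory example, with no typing required: Python
checks the conditions from top to bottom and runs only the first branch
that is true. The score and the cutoffs are hypothetical.
"""

```python
# A hypothetical IMDB score, chosen for illustration.
score = 7.9

if score >= 8.0:
    verdict = "great"
elif score >= 6.0:
    verdict = "decent"
else:
    verdict = "skippable"

print("This movie is " + verdict + ".")
```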

"""
Have them practice, with examples.
"""

What is your favorite animal? Bat
Your favorite movie must be Batman!



In [None]:
# Importing packages
# Time estimate: 10 min
"""
There are also packages that provide operations beyond Python's built-ins.
A package (or module) is a collection of Python functions and objects that
you import only when you need them. For instance, I may not need to
calculate a square root every day. But if I do, I can use the math module,
which ships with Python.
"""
import math
print(math.sqrt(9))

"""
If I don't want to type a long name in front of my function every time I use
it, I can do this:
"""
from math import sqrt
print(sqrt(9))

"""
If I want to keep track of where a function comes from but still type
less, I can give the package a shorter nickname:
"""
import math as m
print(m.sqrt(9))

"""
Or I can rename a single function as I import it:
"""
from math import sqrt as sq
print(sq(9))

"""
Here are some other cool packages. Have them do examples with random.
"""

In [57]:
"""
Later in this tutorial, we are going to be working with numerical data 
about movies (like the movie's budget, average IMDB score, number of
Facebook likes, etc.). When we want to store numerical data in a table,
one of the best packages to use is NumPy, which stands for Numerical
Python. So we're going to import the NumPy package and spend the rest
of our time in this section of the tutorial playing with NumPy data
types.
"""

import numpy as np

"""
We load a NumPy table by using the loadtxt command. The documentation
is here: https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html
There are various options we can use to load the text. The delimiter
option tells NumPy which character separates the columns in our table;
by default, loadtxt splits columns on whitespace, which is what our
file uses, so we don't need to set it here.
"""
table = np.loadtxt('imdb_numerical.txt')
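
"""
To see what the delimiter option does without touching the real file,
here is a sketch that reads two tiny made-up tables from text held in
memory. io.StringIO stands in for a file on disk, and the numbers are
invented for illustration.
"""

```python
import io
import numpy as np

# A tiny stand-in for a data file (made-up numbers, not the IMDB data).
# Lines starting with '#' are treated as comments and skipped by default.
text = io.StringIO("# id score votes\n0 7.9 100\n1 6.5 250\n")
small = np.loadtxt(text)            # default: split columns on whitespace
print(small.shape)                  # (rows, columns)

# The same numbers separated by commas need the delimiter option.
csv_text = io.StringIO("0,7.9,100\n1,6.5,250\n")
small_csv = np.loadtxt(csv_text, delimiter=",")
print((small == small_csv).all())   # both tables hold the same values
```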

In [61]:
# NumPy rows
print(table[0])

# NumPy columns
print(table[:,0])

# accessing a single element
data_point = table[5,3]
print(data_point)

"""
You may be wondering: how exactly do we know what columns are what? 
The short answer is, NumPy doesn't have a very straightforward way 
to label columns. (There is another package, Pandas, that is great for
this, but we won't have time to cover it.) Use your text editor to 
open up the file 'imdb_numerical.txt' and read the first line of the 
file to get the column names. In the future, we may choose to name 
individual columns so we don't have to remember that column 1 is the 
number of critic reviews, for instance.
"""

movie_id = table[:,0]  # "id" is already the name of a Python built-in
num_critic_for_reviews = table[:,1]
# etc.

[  0.00000000e+00   7.23000000e+02   1.78000000e+02   0.00000000e+00
   8.55000000e+02   1.00000000e+03   7.60505847e+08   8.86204000e+05
   4.83400000e+03   3.05400000e+03   2.37000000e+08   2.00900000e+03
   9.36000000e+02   7.90000000e+00   3.30000000e+04]
[  0.00000000e+00   1.00000000e+00   3.00000000e+00 ...,   5.03500000e+03
   5.03700000e+03   5.04200000e+03]
15.0
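
"""
Once a column has a name, we can work with it directly. A small sketch on
a made-up table (three movies, with columns assumed to be id, score, and
votes), so it runs without the IMDB file:
"""

```python
import numpy as np

# Made-up table: each row is a movie; columns are id, score, votes.
small = np.array([[0, 7.9, 100],
                  [1, 6.5, 250],
                  [2, 8.4, 900]])

movie_id = small[:, 0]    # first column
score = small[:, 1]       # second column
votes = small[:, 2]       # third column

print(score)              # the whole score column
print(score.mean())       # average score across the three movies
print(votes.max())        # largest vote count
```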


In [None]:
"""
That's all we'll go over for now. In the next section, we'll talk about
how to add and remove items from lists, and we'll learn about a new data
type, dictionaries. We'll also learn how to create loops, which allow us
to perform an operation on a lot of data quickly. 
"""