# Python Basics

This chapter covers the Python component of the class. We focus exclusively on using Python for interactive data science work rather than on "programming" per se.

## Python Book
VanderPlas, J. <i>Python Data Science Handbook: Essential Tools for Working with Data</i>, O'Reilly Media, 2016.  Available on 
Amazon and at https://jakevdp.github.io/PythonDataScienceHandbook/.

Note that this is not a book that teaches Python.  Instead, it's focused on using Python for Data Science. 

## Key Python Modules
We will use several common Python <i>modules</i>. A module is similar to a library in other languages. The modules that we will use include:
<ul>
<li>NumPy</li>
<li>Pandas</li>
<li>SciPy</li>
<li>Matplotlib and Seaborn</li>
<li>Python MySQL Connector</li>
</ul>
We will also learn how to create and use our own modules.


## Introduction to Python

In [None]:
print ("Hello World!")

In [None]:
# Unlike many languages that use {} and other start/end identifiers
#   to identify blocks, Python uses indentation.
for i in range(5):
    j = i + 1
    print("[{:}] Hello World!".format(j))
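
To make the indentation rule concrete, here is a minimal sketch. The list name `lines` is an assumption made for illustration; it only records the order in which statements run.

```python
lines = []
for i in range(3):
    # indented: this statement is part of the loop body and runs on every pass
    lines.append("inside {:}".format(i))
# not indented: this statement runs once, after the loop has finished
lines.append("after")
print(lines)
```

Moving the last `append` four spaces to the right would make it part of the loop body, so it would run three times instead of once.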

In [None]:
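# i % 2 is 1 (truthy) for odd i and 0 (falsy) for even i,
#   so this prints the odd numbers below 20.
for i in range(20):
    if i % 2:
        print(i)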

In [None]:
# simple implementation of the pseudo code from Slide 5 of the Intro to 
#   Python slide set.
Numbers = [123, 87, 96, 24, 104, 16, 55, 24, 19, 86, 776, 1945, 87.5, 12.34]
Total = 0
Count = 0
for Num in Numbers:
    Total = Total + Num
    Count = Count + 1
if Count > 0:
    Average = Total / Count
else:
    Average = "Can't compute the average of a sample of size 0."
print(Average)
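
The loop above spells out every step of the pseudo code. As a hedged alternative, the same average can be computed with the built-in functions `sum()` and `len()`. The lowercase names and the use of `None` for an empty sample are assumptions of this sketch, not part of the slide set.

```python
# Same numbers as in the cell above
numbers = [123, 87, 96, 24, 104, 16, 55, 24, 19, 86, 776, 1945, 87.5, 12.34]

# sum() and len() replace the explicit running total and counter
if len(numbers) > 0:
    average = sum(numbers) / len(numbers)
else:
    average = None  # assumption: None marks "no data" instead of a message string
print(average)
```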


### Strings

In [None]:
# Create a string object, a variable (s), and a reference from s to
#    the string object.
s = "The dog is hungry."
s

In [None]:
# Individual elements of the string
s[0], s[1], s[2], s[12]

In [None]:
# string length ... Why doesn't s[len(s)] == '.'?
len(s)
#s[len(s)]
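
One way to answer the question in the comment above: indices are 0-based, so the last valid index is `len(s) - 1`. The sketch below (the variable name `last_index` is introduced only for illustration) shows the last character two ways and catches the error raised by `s[len(s)]`.

```python
s = "The dog is hungry."

last_index = len(s) - 1   # valid indices run from 0 to len(s) - 1
print(s[last_index])      # the final character
print(s[-1])              # negative indices count from the end

try:
    s[len(s)]             # one position past the end
except IndexError as err:
    print("IndexError:", err)
```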

In [None]:
# Strings are immutable
s[0] = 'x'
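
The assignment above raises a `TypeError`. A minimal sketch of the usual workaround is to build a new string from pieces of the old one; the name `t` is an assumption used for illustration.

```python
s = "The dog is hungry."

try:
    s[0] = 'x'            # item assignment is not allowed on strings
except TypeError as err:
    print("TypeError:", err)

# Build a new string instead; s itself is left unchanged
t = 'x' + s[1:]
print(t)
print(s)
```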

In [None]:
# slices - s[i:j] - give me everything from i up to (but not including) j
s[4:9]

In [None]:
a = 6
b = len(s) - 1
s[a:b]

In [None]:
# If the initial number is blank, start at the front;
#   if the second is blank, go to the end.
s[:3]
#s[3:]

In [None]:
# concatenation
s + " So give her some food."
# note that this does not change s
#s

In [None]:
# find substring
s1 = s + " So give her some food."
s1.find('gry')

In [None]:
s1[s1.find('gry'):]

In [None]:
# replace
s1.replace('dog', 'cat')
# note that this creates a new object -- it doesn't change s1
#s1

In [None]:
# split - splits a string into substrings
s2 = "eight, nine, 12, seventy, four"
s2.split(',')
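
Note that splitting on `','` alone keeps the space that follows each comma. As a small sketch, splitting on `', '` (comma plus space) removes it; this assumes every separator in the string is exactly a comma followed by one space. The names `raw` and `clean` are introduced for illustration.

```python
s2 = "eight, nine, 12, seventy, four"

raw = s2.split(',')     # the space after each comma stays in the pieces
clean = s2.split(', ')  # splitting on comma + space drops it

print(raw)
print(clean)
```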

In [None]:
# chaining functions/methods
(s + " So give her some food.").split()

In [None]:
(s + " So give her some food.").replace('.','').split()

### The format() method for string objects

https://docs.python.org/3.1/library/string.html#format-specification-mini-language

In [None]:
a = 'Jim'
b = 'Carl'
c = 'Nancy'
"{:}, {:}, and {:} are going on a trip".format(a, b, c)

In [None]:
# Can use modifiers to format the variable display
salary = 122000
"{:}'s salary is ${:} per year.".format(c, salary)
#"{:}'s salary is ${:,.2f} per year.".format(c, salary)
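
To show what the modifier in the commented-out line does, the sketch below formats the same salary with and without `,.2f` (thousands separator, two decimal places). The last line uses an f-string, which accepts the same format specification; f-strings require Python 3.6 or later and are not otherwise used in this chapter. The names `plain`, `pretty`, and `fstr` are assumptions for illustration.

```python
c = 'Nancy'
salary = 122000

plain = "{:}'s salary is ${:} per year.".format(c, salary)
pretty = "{:}'s salary is ${:,.2f} per year.".format(c, salary)
fstr = f"{c}'s salary is ${salary:,.2f} per year."   # same format spec in an f-string

print(plain)
print(pretty)
print(fstr)
```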

### Introduction to Lists

In [None]:
# define a list and then show the list
l1 = [1, 2, 3, 17, 967, 45, "dog", 'cat']
l1

In [None]:
# element referencing: indices are 0-based; l1[i] counts from the front,
#   l1[-i] counts from the back
l1[1]
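
A short sketch of positive and negative indexing on the same list; the variable names `first`, `second`, `last`, and `next_to_last` are introduced only for illustration.

```python
l1 = [1, 2, 3, 17, 967, 45, "dog", 'cat']

first = l1[0]          # indices start at 0
second = l1[1]
last = l1[-1]          # -1 is the last element
next_to_last = l1[-2]  # -2 is the one before it

print(first, second, last, next_to_last)
```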

In [None]:
# Nested lists
l2 = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
l2

In [None]:
# this one and the previous are equivalent -- which is clearer?
l2 = [ [1, 2, 3]
     , [4, 5, 6]
     , [7, 8, 9]
     ]
l2

In [None]:
# to view the nested list in matrix form:
for r in l2:
    print(r)

In [None]:
# This is important and often trips up beginners.
# Lists are mutable, and l2 = l1 does not copy the list: both names
#   refer to the same list object.
l1 = [1, 2, 3]
l2 = l1
l1, l2

In [None]:
l1[0] = "The dog ate my homework."
# Now show both lists.  
l1, l2

In [None]:
# compare to the similar actions with simple objects: assigning to x below
#   rebinds x to a new object, so y still refers to 123
x = 123
y = x
x, y

In [None]:
x = "The dog ate my homework."
x, y
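
If an independent list is needed, one option is to copy it explicitly. The sketch below contrasts a second name for the same list with a copy made by `list.copy()`; the names `alias` and `l3` are assumptions for illustration. The copy is shallow, so for nested lists only the outer list is duplicated.

```python
l1 = [1, 2, 3]
alias = l1        # second name for the same list object
l3 = l1.copy()    # new list with the same elements (shallow copy)

l1[0] = "The dog ate my homework."

print(alias)      # follows the change made through l1
print(l3)         # unaffected
print(alias is l1, l3 is l1)
```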

### Introduction to List Comprehensions
This is an important concept -- comprehensions are your friends!

In [None]:
# Define a matrix
m = [ [1, 2, 3]
    , [4, 5, 6]
    , [7, 8, 9]
    ]

# show the third row
m[2]


In [None]:
# to show the third column - use a list comprehension
[r[2] for r in m]
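
For readers new to comprehensions, the sketch below writes the same third-column extraction as an explicit loop and as a comprehension, to show that the two produce the same list. The names `col_loop` and `col_comp` are introduced only for illustration.

```python
m = [ [1, 2, 3]
    , [4, 5, 6]
    , [7, 8, 9]
    ]

# explicit loop: start with an empty list and append one element per row
col_loop = []
for r in m:
    col_loop.append(r[2])

# list comprehension: the same steps in a single expression
col_comp = [r[2] for r in m]

print(col_loop, col_comp)
```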

In [None]:
# to show the diagonal (upper left to lower right)
[m[i][i] for i in [0, 1, 2]]

In [None]:
# more generally ...
[m[i][i] for i in range(len(m))]

In [None]:
# even elements of the second column
[r[1] for r in m if r[1] % 2 == 0]

In [None]:
# 18 random dice rolls
import random
[random.randint(1, 6) for i in range(18)]
# note that the iterator (i) is only used to iterate and isn't used in the expression.
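
When the loop variable is not used, a common convention is to name it `_`. The sketch below also calls `random.seed()` so that the rolls repeat from run to run; the seed value 42 is an arbitrary assumption, and `rolls` is a name introduced for illustration.

```python
import random

random.seed(42)  # assumption: any fixed seed works; 42 is arbitrary
# "_" signals that the loop variable is intentionally unused
rolls = [random.randint(1, 6) for _ in range(18)]
print(rolls)
```

Removing the `random.seed()` call gives different rolls on each run, as in the cell above.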

### Sample Data Structure and Accessing/Processing/Comprehensions

In [None]:
# creating a list to define a person
person = ["Tom Howard", 54, 6.0]

# creating a list of lists to define a team
people = [
    ["Tom Howard",          54,  6.0],
    ["Jane Grimm",          19,  4.9],
    ["Sam Brown",           25,  6.2],
    ["Sarah Joan Spade",    26, 5.25],
    ["Blaine Jones",        62,  5.8],
    ["Devin Callahan",      32, 5.92],
]

In [None]:
person

In [None]:
people

In [None]:
# How many people on the team
len(people)

In [None]:
# Print each person's name and age
for p in people:
    print("{:} is {:} years old".format(p[0], p[1]))

In [None]:
# Create a list of all names
[p[0] for p in people]

In [None]:
# Create a list of all last names
[p[0].split()[-1] for p in people]

In [None]:
# Create a list of all ages
[p[1] for p in people]

In [None]:
# Compute the average age
sum([p[1] for p in people]) / len(people)

In [None]:
# Find the oldest person
# max age
ages = [p[1] for p in people]
max(ages)
# which is max? - 62
#ages.index(62)
# who - person 4
#people[4][0]

In [None]:
# all together
people[[p[1] for p in people].index(max([p[1] for p in people]))][0]

In [None]:
# Find the youngest person
people[[p[1] for p in people].index(min([p[1] for p in people]))][0]
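
The one-liners above work, but they build the list of ages twice. As a hedged alternative, the built-in `max()` and `min()` accept a `key` argument that selects what to compare. The `lambda` expression used for the key has not been introduced in this chapter; it is a small anonymous function that returns a person's age. The names `oldest` and `youngest` are assumptions for illustration.

```python
people = [
    ["Tom Howard",          54,  6.0],
    ["Jane Grimm",          19,  4.9],
    ["Sam Brown",           25,  6.2],
    ["Sarah Joan Spade",    26, 5.25],
    ["Blaine Jones",        62,  5.8],
    ["Devin Callahan",      32, 5.92],
]

# key=... tells max()/min() to compare people by age (element 1)
oldest = max(people, key=lambda p: p[1])
youngest = min(people, key=lambda p: p[1])

print(oldest[0])
print(youngest[0])
```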