# Session 1: Crash Course on Python

## Programming Basics

**Our end goal:**

* Input data
* Manipulate data until it's in the form we want
* Produce a compelling graphic

**Jupyter Notebook Basics**

* Entering and executing Python
* Documenting what you've done

**Python Basics Pt 1**

* Numbers and Calculations
* Comments
* Assignments
* Built-in functions (print, type)
* Strings and Quotes
* Flow of Control

**Python Basics Pt 2**

* Lists
* Extending Python by Importing Packages
* Objects and Methods
* Tab Completion
* Getting help

## Numbers

Python has various "types" of numbers (numeric literals). We'll mainly focus on integers and floating point numbers.

* Integers are just whole numbers, positive or negative. For example: 2 and -2 are examples of integers.

* Floating point numbers in Python are notable because they have a decimal point in them, or use an exponential (e) to define the number. For example 2.0 and -2.1 are examples of floating point numbers. 4E2 (4 times 10 to the power of 2) is also an example of a floating point number in Python.
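
One way to check which kind of number Python sees is the built-in `type` function. The following is a minimal sketch; the variable names are only illustrative, and `print` is written with parentheses so that the cell should run under both Python 2 and Python 3.

```python
# Illustrative names; any valid variable name would do.
whole = 2          # an integer literal
decimal = 2.0      # a floating point literal (note the decimal point)
scientific = 4E2   # exponent notation: 4 times 10 to the power of 2

print(type(whole))     # reports an integer type
print(type(decimal))   # reports a floating point type
print(scientific)      # 400.0
```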

## Calculations

The most basic way to use Python is to treat it as a calculator. 
As a starting point, let's take a few minutes to run through the most basic arithmetic operations. 
In each of the cells below, enter an expression that will return the requested result. 
Remember, after you type the expression, press Ctrl-Return to execute the cell. 

To get you started, here are some of the most fundamental arithmetic operations:

| Operator | Function |
|:--------:|:--------:|
| + | Addition |
| - | Subtraction | 
| * | Multiplication |
| / | Division |

<div class="alert alert-info" role="alert">
You may have noticed that each of the cells in this notebook starts with a line that begins with a hash mark, "#". The hash mark tells Python that the line is a comment and shouldn't be executed. 
</div>

In [1]:
# add two integers, say 1 and 2



In [2]:
# multiply 2 integers, say 2 and 3



In [3]:
# now divide 2 integers, say 3 by 2 as 
# 3 / 2
# pay attention to what happens here



In [4]:
# In Python 2, when we divide two integers, we get what's called "classic" division within the Python community.
# We can avoid this "classic" division by making sure that one of the numbers is a floating point number. For example, try

# 3.0 / 2   



In [5]:
# And, in this case, the order in which we use the floating point number doesn't matter.

# 3 / 2.0



In [6]:
# We can force the issue by using the "float()" function to "cast" one of the integers to be a floating point number. 
# Note: we'll talk more about functions in a little while, but for now
# float(3) / 2
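
To summarize the division cells so far, here is a small sketch that should give the same results in Python 2 and Python 3. The `//` operator (floor division) is not covered elsewhere in this session; it is included only as an assumption-free way to show the truncating behavior explicitly. The variable names are illustrative.

```python
float_first = 3.0 / 2       # 1.5, the first operand is a float
float_second = 3 / 2.0      # 1.5, the second operand is a float
cast_first = float(3) / 2   # 1.5, float() converts the integer 3 to 3.0
floored = 3 // 2            # 1, floor division drops the fractional part

print(float_first)
print(floored)
```

Note that a plain `3 / 2` gives 1 in Python 2 but 1.5 in Python 3, which is why the sketch avoids it.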



In [7]:
# Finally, we can control the order in which operations are executed using parentheses, for instance, try

# 2 * 3 + 4


In [8]:
# and conversely, notice the difference if we use the following

# 2 * ( 3 + 4 )


So what happened here? By default, Python has a specific order in which it will execute operations. As a rule, it will perform multiplication before addition. 
In these final examples, the parentheses are introduced to force the expression "3 + 4" to be executed first.
If you are completely new to programming, this can be a bit confusing and I recommend that you take a look at the Python documentation for more detail on the order in which operations are executed. 
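
The two expressions from the cells above can be compared side by side. This is only a sketch with illustrative variable names:

```python
no_parens = 2 * 3 + 4       # multiplication happens first: 6 + 4 = 10
with_parens = 2 * (3 + 4)   # parentheses happen first: 2 * 7 = 14

print(no_parens)
print(with_parens)
```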

## Assigning Variables

If we could only use Python as a calculator to evaluate expressions, then we'd be pretty limited in terms of the types of problems that we could solve. As a first step in moving towards more interesting computations, we can store results by assigning a label, or a variable name, to our results as we move along. For example, let's say that we want to store the value "10" using a variable named "a". 

Python does this as:

``` python
    a = 10
```

Try it in the cell below:

If you typed that in exactly as shown, then you shouldn't have gotten any type of output. There are a couple of ways in which we can test whether or not the variable was actually set, though. 

First, you can simply evaluate a cell with the name of the variable:

```python
    a
```

Alternatively, you can use the "print" function in Python, which will evaluate an expression (in this case the variable) and print the result to the screen. In Python 2, the syntax for this is:

```python
    print a
```

Try both techniques here so you can verify that the variable **a** was set to 10:

In terms of creating variable names, there are some rules on what's allowable:

    1. Names cannot start with a number.
    2. Names cannot contain spaces.
    3. Names cannot use any of these symbols :'",<>/?|\()!@#$%^&*~-+


Well-chosen variable names are a very useful way to keep track of different values in Python. There are also a couple of conventions worth following:

    1. It's considered best practice that the names are lowercase.
    2. Also, if you need to create a long name, maybe comprised of several words strung together, Python best practice is to connect these words using underscores, e.g. this_is_johns_variable
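
As a rough illustration of the rules and conventions above, the sketch below uses made-up names and values. The invalid names are left as comments because each one would stop the cell with a SyntaxError.

```python
# Valid names: lowercase words joined by underscores (hypothetical values).
home_price = 250000
interest_rate_2024 = 0.065   # digits are fine as long as they are not first

# Invalid names, commented out because they raise a SyntaxError:
# 2nd_price = 100      (starts with a number)
# home price = 100     (contains a space)
# home-price = 100     (the minus sign is an operator, not part of a name)

print(home_price)
```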

## Review

Construct a series of expressions to compute the taxes you would need to pay on a typical data scientist's salary. 
Assume the following:

* My income is $ 20,000,000
* My tax rate is 30%

As a simple exercise, please set up variables called my_income and my_tax_rate, and compute the taxes you owe using simple multiplication. Finally, print out the taxes you owe.

### Answer

Your answer should look something like the following
```python
    my_income = 20000000
    my_tax_rate = 0.30
    my_taxes = my_income * my_tax_rate
    print my_taxes
```

When you evaluate your results, you should get something like:

```python
    Out [ ]: 6000000.0
```    
Depending on how you actually entered the lines of code, you may not have gotten the "Out" prompt.

## Strings

* Strings can use double quotes or single quotes
* Indexing starts from 0
* Slicing
    * The right index is not included ("up to but not including")
    * Omitting both indices includes everything
* Negative indexing counts from the end of the string
* A step size controls the frequency of the slice; a step size of -1 reverses a string
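
The indexing and slicing rules listed above can be sketched as follows. The variable names are illustrative only.

```python
s = 'Hello world'

first_char = s[0]      # 'H', indexing starts at 0
last_char = s[-1]      # 'd', negative indices count from the end
first_word = s[0:5]    # 'Hello', up to but not including index 5
second_word = s[6:]    # 'world', from index 6 to the end
whole = s[:]           # 'Hello world', everything
every_other = s[::2]   # 'Hlowrd', a step size of 2
backwards = s[::-1]    # 'dlrow olleH', a step size of -1 reverses

print(first_word)
print(backwards)
```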

### String Properties
Strings are immutable, i.e. once they are created, their contents can't be changed in place.

* Concatenation using + 
* Duplication using *
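
A short sketch of what immutability means in practice: assigning to a position inside a string raises a TypeError, so the usual workaround is to build a new string. The names below are illustrative.

```python
greeting = 'hello'

try:
    greeting[0] = 'J'          # not allowed: strings cannot be edited in place
    changed_in_place = True
except TypeError:
    changed_in_place = False   # this branch runs

# Build a new string instead, using slicing and concatenation.
new_greeting = 'J' + greeting[1:]   # 'Jello'

print(greeting)
print(new_greeting)
```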

### String methods
Methods are actions performed on the objects themselves.

In [9]:
len('Hello world!')

12

In [10]:
s = 'Hello world'

In [11]:
s[::-1]

'dlrow olleH'

In [12]:
s + ' there'

'Hello world there'

In [13]:
s*2

'Hello worldHello world'

In [14]:
s = 'hello'

In [15]:
s.upper()

'HELLO'

In [16]:
s.capitalize()

'Hello'

In [17]:
s.find('l')

2

In [18]:
s.split('e')

['h', 'llo']

## Lists

In [19]:
my_list = [1,2,3]

In [20]:
my_list

[1, 2, 3]

In [21]:
my_list = ['string', 23, 1.234, '0']

In [22]:
len(my_list)

4

## Index and Slicing

* Works just like strings
* can do list concatenation

* no fixed sizes in list
* no type constraints on contents
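
The points above can be sketched with a small list. The names and values are made up for illustration.

```python
numbers = [10, 20, 30, 40]

first = numbers[0]        # 10, indexing starts at 0 just like strings
last = numbers[-1]        # 40, negative indexing also works
middle = numbers[1:3]     # [20, 30], the right index is not included

# Concatenation builds a new, longer list; types can be mixed freely.
combined = numbers + ['five', 6.0]

print(middle)
print(len(combined))
```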

## Methods

In [23]:
l = [1,2,3]

In [24]:
l.append('append me')

In [25]:
l

[1, 2, 3, 'append me']

In [26]:
l.pop()

'append me'

In [27]:
l

[1, 2, 3]

In [28]:
l.pop(0)

1

In [29]:
l

[2, 3]

In [30]:
x = l.pop(0)

In [31]:
x

2

In [32]:
l

[3]

In [33]:
l[99]

IndexError: list index out of range

In [None]:
new_list = ['a', 'e', 'x', 'b', 'c']

In [None]:
new_list

In [None]:
new_list.reverse()

In [None]:
new_list

In [None]:
new_list.sort()

In [None]:
new_list

In [None]:
l_1 = [1,2,3]
l_2 = [4,5,6]
l_3 = [7,8,9]

In [None]:
matrix = [l_1, l_2, l_3]

In [None]:
matrix # nested data structures

In [None]:
matrix[0]

In [None]:
matrix[1][2]

## List Comprehensions

In [None]:
first_column = [row[0] for row in matrix]

In [None]:
first_column
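
A list comprehension is a compact way of writing a loop that builds a list. As a sketch, the comprehension used for `first_column` should be equivalent to the explicit loop below (`first_column_loop` is an illustrative name, and `matrix` is redefined here so the cell stands on its own).

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Explicit loop: take element 0 of every row.
first_column_loop = []
for row in matrix:
    first_column_loop.append(row[0])

# The same thing as a list comprehension.
first_column = [row[0] for row in matrix]

print(first_column)
```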

## Dictionaries

A dictionary is basically a mapping from keys to values. What we've seen so far are sequences, objects that we index by position; a dictionary is indexed by key instead. Keys must be unique.
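
One consequence of unique keys is that assigning to a key that already exists replaces its value rather than adding a second entry. A minimal sketch, with made-up keys and values:

```python
prices = {'apple': 0.50, 'pear': 0.75}

prices['apple'] = 0.60   # 'apple' already exists, so its value is replaced
prices['plum'] = 0.25    # 'plum' is new, so an entry is added

has_pear = 'pear' in prices   # membership tests look at the keys

print(len(prices))
print(prices['apple'])
```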

In [66]:
my_dict = {'key1': 'value', 'key2': 'value2'}

In [67]:
my_dict

{'key1': 'value', 'key2': 'value2'}

In [None]:
my_dict['key1']

In [None]:
my_dict = {'key1': 'value', 'key2': 123, 'key3': [1,2,3]}

In [None]:
my_dict['key1']

In [None]:
my_dict['key3']

In [None]:
my_dict['key3'].pop()

In [None]:
my_dict

In [None]:
my_dict['key1'].upper()

In [None]:
my_dict

In [None]:
my_dict['key2'] *= 2

In [None]:
my_dict

In [None]:
d = {}
d

In [None]:
d['animal'] = 'Dog'
d['answer'] = 42

In [None]:
d

In [None]:
d = {'k1': {'nestKey': {'subnest': 'value'}}}

In [None]:
d

In [None]:
d['k1']

In [None]:
d['k1']['nestKey']['subnest'].upper()

In [None]:
d = {}

In [None]:
d['k1'] = 1
d['k2'] = 2
d['k3'] = 3

In [None]:
d

In [None]:
d.keys()       # note the order. Dictionaries do not guarantee any order

In [None]:
d.values()

In [None]:
d.items()       # this returns tuples of all of the dictionary key-value pairs
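
Because `items()` yields key-value pairs, it is commonly used to loop over a dictionary. The sketch below sorts the pairs first so that the output order does not depend on how the dictionary happens to be stored; `pairs` is an illustrative name.

```python
d = {'k1': 1, 'k2': 2, 'k3': 3}

# sorted() returns a list of (key, value) tuples ordered by key.
pairs = sorted(d.items())

for key, value in pairs:
    print('%s -> %s' % (key, value))
```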

## Tuples

Tuples are similar to lists, but they are immutable.

In [37]:
t = (1,2,3)

In [38]:
t

(1, 2, 3)

In [39]:
len(t)

3

In [40]:
t[1]

2

In [41]:
t = ('one', 2)

In [42]:
t[0]

'one'

In [43]:
t[1]

2

In [44]:
t[-1]

2

In [45]:
t.index('one')

0

In [46]:
t.count('one')

1

In [47]:
t = (1,1,2,3)

In [48]:
t.count(1)

2

Tuples are immutable, whereas lists can be changed in place:

In [49]:
l = [1,2,3]

In [50]:
l

[1, 2, 3]

In [51]:
t = (1,2,3)

In [52]:
l[0] = 'string'

In [53]:
l

['string', 2, 3]

In [54]:
t[0] = 'string'

TypeError: 'tuple' object does not support item assignment

## Files

Python uses a file object to interact with the files on your disk drive.

In [55]:
ls

01a_jupyter_notebooks.ipynb  02b_input_and_output.ipynb  Untitled1.ipynb
01b_fundamentals.ipynb	     'Notebook Basics.ipynb'	 data/
02a_dataframes.ipynb	     Untitled.ipynb		 images/


In [56]:
file = open('data/test.txt')

IOError: [Errno 2] No such file or directory: 'data/test.txt'

In [None]:
file.read()

In [None]:
file.read()

In [None]:
file.seek(0)

In [None]:
file.read()

In [None]:
file.seek(0)

In [None]:
file.readlines()     # reads every line into a list held entirely in memory. Could be problematic for large files

In [None]:
%%writefile new.txt
First line
Second line

In [None]:
for line in open('new.txt'): 
    print line
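
Looping over the file object reads one line at a time, which avoids holding the whole file in memory. The sketch below wraps the same idea in a `with` block, which closes the file automatically. It writes its own two-line file into a temporary directory, which is an assumption made so the example does not depend on `new.txt` or touch your own files.

```python
import os
import tempfile

# Hypothetical scratch location, created only for this example.
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, 'new.txt')

# Write two lines; the file is closed when the with block ends.
with open(path, 'w') as out_file:
    out_file.write('First line\nSecond line\n')

# Read the file back one line at a time.
lines = []
with open(path) as in_file:
    for line in in_file:
        lines.append(line.rstrip('\n'))   # drop the trailing newline

print(lines)
```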

## Sets

An unordered collection of *unique* objects

In [None]:
x = set()

In [None]:
x.add(1)

In [None]:
x

In [None]:
x.add(2)

In [None]:
x

In [None]:
x.add(1)

In [None]:
x

In [None]:
# can use the set function to convert a list into a set of its unique elements
l = [1,1,1,2,2,2,2,2,3,3,3,3]

In [None]:
l

In [None]:
set(l)
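
A common use of this is counting or listing the distinct values in a list. A small sketch, using illustrative names:

```python
readings = [1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3]

unique_readings = set(readings)      # duplicates are dropped
how_many = len(unique_readings)      # 3 distinct values
ordered = sorted(unique_readings)    # [1, 2, 3], sorted() returns a list

print(how_many)
print(ordered)
```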

## Booleans

True / False, plus the special value None

In [57]:
a= True

In [58]:
# can be created as the result of comparison operators

In [59]:
b= None   # essentially a placeholder

In [60]:
b
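
As a sketch of how Booleans arise from comparisons, and of how a `None` placeholder can be tested, consider the following. The variable names are illustrative.

```python
is_bigger = 3 > 2     # True
is_equal = 1 == 2     # False, note the double equals sign for comparison

placeholder = None                 # no value assigned yet
is_unset = placeholder is None     # True

print(is_bigger)
print(is_unset)
```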

### Using built-in functions to create an expression
If we want to use other mathematical functions, such as trigonometric functions or 
essentially anything more complex than arithmetic, then we'll need to import the functions from Python's **math** package. 
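
A minimal sketch of importing **math** and calling a few of its functions. The variable names and the choice of functions are only for illustration.

```python
import math

root = math.sqrt(16)                # 4.0
cosine = math.cos(0)                # 1.0, angles are given in radians
circumference = 2 * math.pi * 1.0   # circle of radius 1, roughly 6.283

print(root)
print(cosine)
```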

By itself, the core Python language is pretty simple. Fortunately, however, an enormous number of packages are available that add functionality. We'll come back to this later in more detail, but for now let's just introduce this concept so that we can use
