Helper Syntax
---

In my notes and questions, you will come across some syntax you should be aware of:

1. `<Shift-Enter>`: This is a keystroke. Hold `Shift`, press `Enter`. Key names are always capitalized (e.g. `<Tab>` means press `Tab`).
2. `<var1> <operator> <var2>`: This means I want you to enter something, or I am demonstrating a pattern.
    - In this case, I am showing a binary operator (e.g. multiplication) between 2 variables `var1,var2`

Python Syntax
---

Comments
---

Everything after a `#` on a line is ignored by the interpreter, so a line beginning with `#` is ignored entirely. Comments help increase the readability of code. In Jupyter, you can highlight blocks of code with the mouse and type `<Ctrl-/>` to comment or uncomment them.

In [None]:
# print(1)
print(1) # this line will print 1

Whitespace
---

Unlike in many programming languages, whitespace is important in Python! Try running the following cell:

In [None]:
print("hello")
 print("world")

Indentation
---

In this case, you have 2 options:

In [None]:
 print("hello")
 print("world")

In [None]:
print("hello")
print("world") 

Notes
---

```python
print("hello")
print("world")
```

The above is called a _block_ of code. All blocks must share the same indentation. Although you **can** put spaces in front of both lines and the code runs, it is strongly discouraged!

- Rules of thumb:
    - code blocks shouldn't have any leading space.
    - nested code blocks should be indented by 4 spaces (we'll look at nested blocks later)    

Documentation
---

By now, you probably realize what `print` is doing. However, if you aren't sure, Jupyter has built-in documentation! Run this cell:

In [None]:
?print

If you can't get the documentation from Jupyter:

1. Google it with the search string "python <thing you are looking for>". Places you might end up:
    - [Stack Overflow](https://stackoverflow.com) - Best place for programming questions
    - [Python Documentation](https://docs.python.org/3) - Good place for Python

Notes
---

- At the bottom you should see: `Type: builtin_function_or_method`
    - We'll come back to functions and methods later. First, we should discuss types

Variables and Types
---

- Defining variables is as simple as using the assignment operator `<name> =`
- Determine the type of the object with `type(<object>)`
- You can write to output using `print(<something>)`
- Write commands on separate lines:

In [None]:
x = 1
print(x)

Notes about Printing Variables
---

There are 3 primary ways to print to output in Python (the code cell above uses the simplest form of the first).

1. `print(<something>, <other>)`: separate variables by commas
2. `print("{} {}".format(<something>, <other>))`: formatted print (this is my preferred style)
3. Jupyter specific: The final line of a cell displays whatever it evaluates to. In the above code cell you could have written the below (note the difference from above!):

In [None]:
x = 1
x

#### Tasks:

1. Define a variable called `x` equal to `1`, print `x` and the type of `x`
2. Define a variable called `y` equal to `"a"`, print `y` and the type of `y`
3. Define a variable called `z` equal to `1.0`, print `z` and the type of `z`
4. Define a variable called `b` equal to `True`, print `b` and the type of `b`
5. Try the different style of `print` statements

### Notes

- Objects:
    - Everything in Python is an object
    - Objects carry methods which can be accessed using `.`
        - e.g. `format` is a method of the `str` (string) object
        - You can see all of the methods of `str`:
            - Visit the documentation: [String Methods](https://docs.python.org/3/library/stdtypes.html#string-methods)
            - Or, type `y.<Tab>` to see the names, many are self-explanatory
- Variables persist between cells, i.e. `x` should still return `1`. Does it?   

How would I figure out what a method does?
---

Guided Example:

- Let's try to figure out what the `center` method does. Run the cells below:

In [None]:
y.center()

- Okay, `center` expects at least one argument. We don't know what it is. Maybe it centers one string within another?

In [None]:
y.center("I am a string")

- Interesting, the method takes an integer.

In [None]:
y.center(10)

- Cool, we figured out what the `center` method does by listening to the interpreter!
- If you get one thing from this workshop, it should be that the interpreter is your friend!

#### Tasks

1. Try some of the methods of `y`, e.g. `y.<Tab>` or choose some from https://docs.python.org/3/library/stdtypes.html#string-methods

- Pro Tip: You could also just type `?str.center`

Operators
---

**We'll skip "Tasks" for this section because you have done some programming**

- Binary arithmetic operators
    - addition (+), subtraction (-), multiplication (\*), division (/)
    - floor division (//): quotient without fractional parts
    - modulus (%): remainder of the division
    - exponentiation (\*\*)
- Precedence
    - Just like in mathematics, the order of operations is obeyed, and parentheses can be used to change the order
    
### Notes

- For the operations between `float` and `int`, the value which is returned is a `float`
- For `int`-`int` operations, one result was promoted to a `float` (which one?)
- It's important to understand that the returned types aren't necessarily what goes into the calculations!
- Additionally, sometimes operations aren't smart enough to promote values and you end up with a result you didn't expect
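
A minimal sketch of the operators and precedence rules above. The variable names and values are hypothetical, chosen only for illustration:

```python
a, b = 7, 2  # hypothetical values

quotient = a // b    # floor division: quotient without the fractional part
remainder = a % b    # modulus: what is left over after the division
power = a ** b       # exponentiation

# Floor division and modulus fit together: a == quotient * b + remainder
print(a == quotient * b + remainder)

# Multiplication binds tighter than addition; parentheses change the order
no_parens = 2 + 3 * 4
with_parens = (2 + 3) * 4
print(no_parens, with_parens)

# An operation between an int and a float returns a float
mixed = a + 1.0
print(mixed, type(mixed))
```

Try swapping in your own values and checking `type(<result>)` for each operator.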

Binary Operators with Assignment
---

- The binary operators can be combined with an `=` sign, i.e. `<variable> <operator>= <value>` is expanded to: `<variable> = <variable> <operator> <value>`
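
As a quick illustration of the expansion rule, here is a small sketch (the name `total` and the numbers are made up):

```python
total = 10   # hypothetical starting value
total += 5   # same as: total = total + 5
total *= 2   # same as: total = total * 2
total -= 4   # same as: total = total - 4
print(total)
```

Each line reads the current value of `total`, applies the operator, and stores the result back in `total`.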

Comparison Operators
---

- Comparison operators return Boolean values, i.e. `True` or `False`
    - `a == b`: `a` equals `b`?
    - `a != b`: `a` not equals `b`?
    - `a >= b`: `a` greater than or equals `b`?
    - `a < b`: `a` less than `b`?
    
Boolean Operators
---

- Compare Boolean values:
    - `<bool> and <bool>`: Both must be `True` to evaluate to `True`
    - `<bool> or <bool>`: Either must be `True` to evaluate to `True`
    - `not <bool>`: Invert Boolean
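
Comparison and Boolean operators are usually combined. A short sketch, assuming a hypothetical `temperature` variable:

```python
temperature = 72  # hypothetical value

# Each comparison returns a Boolean, and `and` / `or` combine them
is_comfortable = temperature >= 65 and temperature < 80
is_extreme = temperature < 0 or temperature > 100

# `not` inverts a Boolean
is_ordinary = not is_extreme

print(is_comfortable, is_extreme, is_ordinary)
```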

Data Structures (Containers)
---

- Lists (`list`):
    - e.g. `[1, 2, 3]`
- Tuple (`tuple`):
    - e.g. `(1, 2, 3)`
- Dictionary (`dict`):
    - `<key>: <value>` store e.g. `{'a': 1, 'b': 2}`
- Set (`set`)
    - e.g. `{1, 2, 3}`
    
#### Definitions

- Ordered:

Consider a container called `a`, if the elements can be accessed using `[n]` syntax where `0 <= n < len(a)` the container is ordered. As an example, try the following cell:

In [None]:
a = [0]
print("Length: {}".format(len(a)))
print("a[0] is: {}".format(a[0]))

`0` is in the range `0 <= n < len(a)` and I can access it, `a` is ordered.

- Mutable:

Consider a container called `a`, if the indexed item can be changed the container is mutable. As an example, try the following cell:

In [None]:
a = [0]
a[0] = 1
print("a[0] is: {}".format(a[0]))

We know `a` is ordered, therefore I can access the elements. I can also change the elements which means `a` is mutable.

#### Tasks

1. Create a `list` called `a` which contains the numbers 4, 4, 8, and 12.
    - Are `list`s ordered?
    - Are they mutable?
    - Try accessing the element `a[len(a)]`, why doesn't this work? How do you access the last element?
2. Create a `tuple` called `b` which contains the number 2, 3, 4.
    - Are `tuple`s ordered? mutable?
    - When would you want to use a `tuple` over a `list`?
3. Create a `dict` called `c` which contains the following key-value pairs `'one': 1`, `'three': 3`, `'five': 5`.
    - Note: Dictionaries are accessed by their keys, e.g. `c['one']` would yield 1
    - Are `dict`s ordered? mutable?
4. Create a `set` called `d` using the same data as `a`
    - Are `set`s ordered?
    - Print `a` and `d`, what is the difference? What does this imply about `set`s
    - Are they mutable?
        - try adding an integer to the set with `<set>.add(<value>)`
5. Keep the definitions of `a,b,c,d`, but comment any code which you used to test order and mutability. Rerun the cell

- Pro tip: You can access the last element of a list with `<list>[-1]`

Membership Operators
---

- Return `True` or `False` based on membership, e.g.

``` python
1 in [1, 2, 3] # evaluates to True
0 in [4, 5, 6] # evaluates to False

'key' in {'key': 'value'} # True
'value' in {'key': 'value'} # False

# by values:
'value' in {'key': 'value'}.values() # True
```

Membership-based `for` Loops
---

#### Syntax

```python
for <temporary> in <iterable variable>:
    # Do something with <temporary>, e.g.
    print(<temporary>)
```

- An `iterable variable` could be a `list`, `set`, `dict`, etc.
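
Filling in the placeholders gives something runnable. This is only a sketch: the list `colors` and the counter `total_letters` are hypothetical names.

```python
colors = ["red", "green", "blue"]  # hypothetical data

total_letters = 0
for color in colors:
    # `color` takes the value of each element in turn
    print(color)
    total_letters += len(color)

print("Letters in total: {}".format(total_letters))
```

Note the indentation: both indented lines belong to the loop's block and run once per element, while the final `print` runs once after the loop finishes.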

#### Tasks

1. Write membership based `for` loops for `a`, `b`, `c`, and `d` which `print` the contents on a separate line
    - Notice anything interesting?
2. Use the `range` function to print the even integers from 2 to 20 (hint: `?range` will bring up documentation)
    - Definitions: In mathematics, inclusivity (edges are included) is denoted with brackets `[]` and exclusivity (edges are not included) with parentheses `()`
    - Do all of the following: `[2, 20]`, `(2, 20)`, `[2, 20)`, `(2, 20]`
    - What is the default in the above notation?

Unpacking Complicated Data Structures
---

You probably noticed that your loop over the dictionary only printed the keys. What if you wanted both keys and values?

Python allows you to "unpack" dictionaries (using the method `items`) as well as lists of lists, e.g.

```python
for key, value in {'a': 1, 'b': 2}.items():
    print("{}: {}".format(key, value))
```

or,

```python
for i, j in [[1, 2], [3, 4]]:
    print("{}: {}".format(i, j))
```

#### Tasks

1. Write a membership based `for` loop for `c` (the dictionary you made previously) which prints both keys and values
2. Generate the following list of lists: `[[9, 8], [6, 5], [3, 2]]`. Write a membership based `for` loop which multiplies the pairs.

Flow Control
---

1. Conditional Statements (`if-elif-else`)
    - Use comparison operators and Boolean operators to build a condition; when the condition evaluates to `True`, the block of code beneath it runs.
    - `elif` stands for `else if`
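
Here is a minimal sketch of the pattern, using a hypothetical `score` and made-up grade cutoffs:

```python
score = 85  # hypothetical value

# Conditions are checked from top to bottom; only the first one
# that evaluates to True has its block executed
if score >= 90:
    grade = "A"
elif score >= 80:
    grade = "B"
else:
    grade = "C"

print("A score of {} earns a {}".format(score, grade))
```

The `else` block has no condition; it runs only when none of the conditions above it were `True`.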
  
#### Tasks:

```python
x = 12

if x == 0:
    print("{} is zero".format(x))
elif x > 0:
    print("{} is negative".format(x))
elif x > 0:
    print("{} is positive".format(x))
else:
    print("I don't recognize {}".format(x))
```

1. Given the code above, what do you think is printed?
2. Copy and paste the code into the cell below, what is actually printed? Why?
3. Will the `else` block ever trigger?

#### Bugs vs Errors

- "Bugs" are unintended consequences of running code. For example, the original code tells us that 12 is negative
- "Errors" are produced by the interpreter because something can't be completed

#### Tasks

1. Fix the bugs to print whether a number is positive, negative, or zero

- Unfortunately, we don't have time to talk about error handling now. Check out this documentation at a later time: https://docs.python.org/3/tutorial/errors.html.

Functions
---

Let's start with a definition for a function which squares the input:

```python
def square(x):
    return x * x
```

Notes:

- `def` is the start of the function definition
- `square` is the name of the function. Example use: `a = square(2)`
- `return` passes back information to the caller of the function (i.e. `a` above is 4)
- Functions don't have to return anything (behind the scenes they return `None`)
- When the interpreter evaluates a `return` it will ignore anything else in the function
- Meaningful function names help you 

#### Tasks

1. Verify that a function without a `return` statement returns `None`
2. Verify that a function with multiple `return` statements only returns the first instance
3. Write the definition of a cube function
4. Write the definition of the area for a circle given a radius
    - $ A = \pi * r^2 $
5. Write the definition of pressure from the ideal gas law (don't worry about the units, this isn't a chemistry class):
    - $ PV = nRT $
        - `P` is pressure
        - `V` is volume
        - `n` is the number of moles of gas
        - `R` is the universal gas constant
        - `T` is a temperature

Anonymous Functions
---

These are unnamed functions which are referred to as "syntactic sugar". You will see this term a lot. It basically means that when you use them, you write less code. An example:

```python
square = lambda x: x ** 2
var = square(2)
```

which is exactly equivalent to the square function we wrote before, but in a more compact syntax. These are really helpful in the next section.
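
Anonymous functions are most useful when handed directly to another function. As a sketch, the built-in `sorted` accepts a `key` function; the list `words` here is hypothetical:

```python
words = ["banana", "fig", "cherry"]  # hypothetical data

# The lambda is never given a name; it is passed straight to `sorted`,
# which calls it on each element to decide the ordering
by_length = sorted(words, key=lambda word: len(word))
print(by_length)
```

Writing a full `def` for a one-off helper like this would work just as well; the `lambda` simply saves a few lines.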

#### Tasks

1. Write an anonymous `cube` function
2. Write an anonymous `addition` function

Iterators
---

Previously, I called `for value in iterator` a membership based for loop. However, I didn't explain the `iterator` part at all. Some iterators operate a little differently. For example, `range` doesn't actually build a `list` in memory even though it looked very similar to our `for value in list` example previously. Instead, a `range` produces its values one at a time as they are requested, so the space it occupies in memory stays constant no matter how many values it covers. For example,

```python
n = 10 ** 12 # a really huge number
for x in range(n):
    if x >= 10: break
    else: print(x)
```

If the above code did build an object to store all of the numbers from 0 to `10 ** 12` it would fail, but iterators are special.
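
One way to see this for yourself is `sys.getsizeof` from the standard library, which reports the size of an object in bytes. This is a sketch assuming the standard CPython interpreter; the exact byte counts vary between versions, so none are shown here:

```python
import sys

small = range(10)
huge = range(10 ** 12)

# Both range objects only store start, stop, and step,
# so they report the same size despite the difference in length
print(sys.getsizeof(small), sys.getsizeof(huge))

# Converting to a list stores every element, so the size grows
small_list = list(small)
print(sys.getsizeof(small_list))
```

Do not try `list(huge)`; that really would attempt to store every number.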

Other iterators:

- `enumerate`, e.g.

```python
L = [1, 2, 3, 4]
for index in range(len(L)):
    print(index, L[index])

for index, item in enumerate(L):
    print(index, item)
```

- `zip`:

```python
L = [2, 4, 6]
R = [3, 6, 9]
for l, r in zip(L, R):
    print(l * r)
```

- `map`:

```python
for squares in map(lambda x: x ** 2, range(10)):
    print(squares)
```

- `filter`:

```python
for evens in filter(lambda x: x % 2 == 0, range(10)):
    print(evens)
```
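
Like `range`, `map` and `filter` produce their values only when asked. If you want to look at everything at once, wrap them in `list`. A short sketch with made-up variable names:

```python
squares = map(lambda x: x ** 2, range(5))
print(squares)  # shows a map object, not the numbers themselves

# `list` pulls every value out of the iterator
squares_list = list(squares)
print(squares_list)

# An iterator can only be consumed once; a second pass finds nothing left
leftover = list(squares)
print(leftover)

evens_list = list(filter(lambda x: x % 2 == 0, range(5)))
print(evens_list)
```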

#### Tasks

1. Use `enumerate` and `range` to print the index and value of even integers between [0, 10).
2. Use `enumerate` and `zip` to sum the two lists into a new list `c`:
    - `a = [1, 2, 3]`
    - `b = [3, 4, 5]`
    - Initialize `c = [None] * len(a)`
    - Try `list(enumerate(zip(a, b)))` to see the shape of what to unpack
3. Use `map` to print the square of the cubes in the range [0, 10) using only anonymous functions
4. Use `filter` to print the cube of even squares in the range [0, 10) using only anonymous functions

Comprehensions
---

In "2." of the previous tasks we had to build a list prior to the loop. We could have also used `c = []` and `c.append(x + y)`, but building a list with repeated `append` calls inside a loop is more verbose and usually slower than the alternative below. Python also provides some syntactic sugar for this case, e.g.

```python
a = [1, 2, 3]
b = [3, 4, 5]
c = [x + y for x, y in zip(a, b)]
```

This is extremely terse and readable. This is called a "List Comprehension" and is probably my favorite Python feature. You can also append filter clauses! e.g.

```python
a = [1, 2, 3]
b = [3, 4, 5]
c = [x + y for x, y in zip(a, b) if x % 2 == 0]
```
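
To make the shorthand concrete, here is a sketch of the filtered comprehension next to the explicit loop it replaces. The names `c_loop` and `c_comprehension` are only for illustration:

```python
a = [1, 2, 3]
b = [3, 4, 5]

# Explicit loop: start with an empty list and append matching sums
c_loop = []
for x, y in zip(a, b):
    if x % 2 == 0:
        c_loop.append(x + y)

# The same logic as a list comprehension
c_comprehension = [x + y for x, y in zip(a, b) if x % 2 == 0]

print(c_loop, c_comprehension)
```

Only the pair `(2, 4)` passes the filter, so both lists hold a single element.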

There are also dictionary comprehensions! e.g.

```python
values_dict = {'one': 1, 'three': 3, 'five': 5}
squares_dict = {k: v ** 2 for k, v in values_dict.items()}

# or

songs = ["Tom Sawyer", "Money", "Simple Man"]
artists = ["Rush", "Pink Floyd", "Lynyrd Skynyrd"]
songs_dict = {k: v for k, v in zip(artists, songs)}
```

#### Tasks

1. Using the `songs` and `artists`, create the following dictionaries:
    - keys: songs, values: artists, only if the artist has an `n` in the name
2. Create a list that contains the cubes divided by the squares in the range [1, 10) only if the division is divisible by 2
    - Note: you likely already have `square` and `cube` defined as anonymous functions!
    - Predefine the range! i.e. `vals = range(<something>)`
3. What function would you need to make "2." a dictionary with the index as the key? You are welcome to try this, but the hard work is already done :)

Modules and Packages
---

Python has a rich ecosystem of packages for all kinds of research domains!

- Syntax variants
``` python
import <package>
from <package> import <member>
```
- A nice feature of Jupyter Notebooks is that they autocomplete member names
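
Both variants in action, sketched with the standard library's `math` module (chosen here only as an example):

```python
# Variant 1: import the whole module and access members with `.`
import math
root_a = math.sqrt(16)

# Variant 2: import a single member and use it without the prefix
from math import sqrt
root_b = sqrt(16)

print(root_a, root_b)
```

The first variant keeps it obvious where `sqrt` comes from; the second saves typing when you use a member often.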

#### Tasks

1. Try to import just the `sub` function from the `re` module.

Reading and Writing Files
---

It turns out we can use `open` to get a file iterator. We can work through a file line by line and collect just the content into a list in memory (`strip` removes surrounding whitespace, including the newline character):

```python
lines = [line.strip() for line in open(<filename>)]
```

or, maybe you need to process each line and write a new file (by default `'w'` overwrites the file):

```python
with open(<otherfile>, 'w') as f:
    for line in open(<filename>):
        f.write(<modify line in some way>) # Write doesn't append a newline automatically, therefore you may need to write a newline
        f.write('\n') # Or, on one line: f.write("{}\n".format(<modify line in some way>))
```

To append to a file use `open(<filename>, 'a')`.
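
The snippets above use placeholders, so here is a self-contained sketch you can run as-is. The file names `source.txt` and `target.txt` are hypothetical, and I use the standard library's `tempfile` module so nothing is left behind on disk:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "source.txt")  # hypothetical file names
    target = os.path.join(folder, "target.txt")

    # 'w' creates the file (or overwrites an existing one)
    with open(source, "w") as f:
        f.write("first line\nsecond line\n")

    # Read line by line, modify each line, and write it to a new file
    with open(source) as infile, open(target, "w") as outfile:
        for line in infile:
            outfile.write("{}\n".format(line.strip().upper()))

    # 'a' appends to the end instead of overwriting
    with open(target, "a") as f:
        f.write("THIRD LINE\n")

    with open(target) as f:
        lines = [line.strip() for line in f]

print(lines)
```

Opening files with `with` closes them automatically once the block ends, even if something goes wrong inside it.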

Project 1
---

#### Description

You were given a "Comma Separated Value" (csv) file, but the program you are interested in using only reads data in "Tab Separated Value" (tsv) format. The file is massive and crashes Excel. You tried Google Sheets too, but the browser freezes when opening the file. Now what?

#### Hints

- To convert one character to something else, use `str.replace("<old>", "<new>")`:

In [None]:
a = "Hello World"
a.replace("l", ":)")

- The tab character is represented by `\t`.

#### Tasks

1. Use the idioms you already know to convert `example.csv` to `example.tsv`
    - Typically the first line of a csv file contains some descriptors of the data inside. I removed that for simplicity, but would it have mattered?

Project 2
---

#### Description

Recently I was asked to figure out the average wait times for jobs on H2P, our advanced research computer at Pitt.

#### Hints

- Times and dates are notoriously annoying to deal with. Luckily in this case, the times are in a consistent format. Each time is given in this shape: `"2017-06-30T14:57:29"`, which in Python is described by the format string `"%Y-%m-%dT%H:%M:%S"`. You can convert these to `datetime` objects with the function `strptime`. To find the difference between 2 times in hours:

In [None]:
from datetime import datetime
time_start = "2017-06-30T14:57:29"
time_end = "2017-07-02T14:59:45"
time_format = "%Y-%m-%dT%H:%M:%S"
delta = datetime.strptime(time_end, time_format) - datetime.strptime(time_start, time_format)
print(delta.total_seconds() / (60 * 60))

- This file is bar (`|`) separated (for whatever reason). To process each line:

In [None]:
line_of_file = "don't need|time_start|time_end"
line_of_file.split('|')

#### Tasks

1. Write a function (a regular `def` is fine) which takes a line in the format above, subtracts the start time from the end time, and converts the difference to hours
    - Hint: don't forget about the newline
2. Write a list comprehension to apply your function to each line in the file
3. Determine the mean wait times

#### Additional Resources

1. Me! Don't hesitate to email me (bmooreii@pitt.edu) and we can make an appointment
2. This workshop is loosely modeled after [A Whirlwind Tour of Python](http://nbviewer.jupyter.org/github/jakevdp/WhirlwindTourOfPython/blob/master/Index.ipynb)
3. [Stack Overflow](https://stackoverflow.com). If you "Google it", the answer is probably coming from Stack Overflow. Pro tip, search for: `python <search phrase>`

#### [Python Docs](https://docs.python.org/3/library/index.html)

#### Useful Python Packages

1. [docopt](http://docopt.org) - command line arguments (not helpful in a Jupyter Notebook)
2. [pandas](http://pandas.pydata.org) - great for processing data
3. [matplotlib](https://matplotlib.org) - plotting tool, can view inside Jupyter Notebooks!
4. [requests](http://docs.python-requests.org/en/master) - HTTP Requests
5. [numba](https://numba.pydata.org) - just-in-time compilation for Python
6. [subprocess](https://docs.python.org/3/library/subprocess.html) - run external commands

#### Quick Survey

In [None]:
from IPython.display import IFrame
IFrame("", width=760, height=500)