# Learning Python

The Jupyter framework is a rather safe place to learn by trying things. 

The way to try things is to engage in the *scientific method:* 
* Predict what will happen. 
* Try an experiment. 
* Correct your prediction and iterate. 

Consider the following: 

In [None]:
x = 1
print(x)
x = "hello"
print(x)
x = 4.7
print(x)
x

Before you run this, 
1. Consider what you think it should do. 
2. Write that down here: 

___your answer:___

Now run it and see if you are correct. 

# Formatting

There are many ways to print things. 
One of the most useful is string formatting. Consider the following code. 

In [None]:
print("the square of {} is {}".format(5, 5*5))

Before you run this, 
1. Consider what you think it should do. 
2. Write that down here: 

___your answer:___

Now run it and correct your answer.

# How formatting works
1. Arguments are *positional:* the order of the arguments passed to `format` determines the order in which they're printed. 
2. *Placeholders* `{}` in the format string determine where they're printed. 
3. If the thing to be printed isn't a string, it's *converted* to a string before it is printed. 
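
The three points above can be checked with a small sketch. The strings and values here are made up for illustration.

```python
# Arguments fill the placeholders in order, left to right.
first = "{} then {}".format("a", "b")
swapped = "{} then {}".format("b", "a")
print(first)    # a then b
print(swapped)  # b then a

# Non-string values are converted to strings before printing.
mixed = "{} and {} and {}".format(42, 4.5, [1, 2])
print(mixed)    # 42 and 4.5 and [1, 2]
```

Swapping the arguments swaps what appears in each placeholder; the format string itself does not change.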

# Types in Python

There are several basic types in Python: 
1. Integers, e.g., `42`
2. Floating point numbers, e.g., `42.57`
3. Strings, e.g., `'hello'` or `"hello"`. 
4. Lists, e.g., `[1, 2, 3]`
5. Tuples, e.g., `(1, 2, 3)`

There are others as well. 
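
One way to check the type of a value is the built-in `type` function, which is not used elsewhere in this notebook. A minimal sketch using the example values from the list above:

```python
# One example value for each type listed above.
values = [42, 42.57, "hello", [1, 2, 3], (1, 2, 3)]

# Print each value next to its type.
for value in values:
    print(value, type(value))
```

As before, predict what each line of output will look like, then run it.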

# Variables 
* In Python, the statement `x = {something}` sets the variable `x` to some value. 
* The `{something}` can be anything above: an integer, a string, a list, etc. 
* The type of a variable is the type of the value to which it is set. 
* The type of a variable can change over time, as it is set to different things. 
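
The last two points can be observed directly. This is a small sketch; the names `first_type` and `second_type` are only there for illustration.

```python
x = 1
first_type = type(x)   # type of x while it holds an integer

x = [1, 2, 3]
second_type = type(x)  # type of x after it is set to a list

print(first_type)
print(second_type)
```

The variable name `x` stays the same; only the value it refers to, and therefore its type, changes.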

# pprint
Your secret weapon for understanding variables: *pretty-printing.* Consider this code: 

In [None]:
from pprint import pprint
pprint(2.5)
pprint([1, 2, 3])
x = [1, 2, 3, [4, [5]]]
pprint(x)

* `from pprint import pprint` -- load the `pprint` function from a *library.* This is additional code that ships with Python but is not part of the core language. 
* `pprint(x)` -- print a pretty representation of `x`, where `x` is any value whatsoever. 
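
The difference between `print` and `pprint` shows up most clearly on values that are too long to fit on one line. Here is a sketch with made-up data; `inventory` is a hypothetical variable, not part of any exercise.

```python
from pprint import pprint

# A made-up nested value that is too long for one line.
inventory = {
    "fruits": ["apple", "banana", "cherry", "date", "elderberry"],
    "vegetables": ["carrot", "leek", "onion"],
}

print(inventory)   # plain print: everything on one long line
pprint(inventory)  # pretty-print: broken across several lines
```

Run both and compare the two outputs.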

# Application: understand a mysterious piece of code
* I will very commonly feed you a piece of code that you don't immediately understand. 
* But with code in hand, and the ability to run it, you can figure out what it does! 
* Two basic steps: *printing* and *refinement*. 
* *pprint* things that you can easily print. 
* *refine* the code so you can print other things. 

Consider the following code, which you might recognize from the 'Workflows' exercise: 

In [None]:
# take a look at the file
f = open('data.txt', 'r')
lines = []
for line in f: 
    lines.append(line.strip()) 
f.close()
lines

To investigate what this does, let's add `pprint` statements to look at its variables. Copy the code from the cell above into the next cell, and add `pprint` statements to understand how `line` and `lines` are changing. 

In [None]:
# add pprint statements to this
from pprint import pprint
f = open('data.txt', 'r')
lines = []
for line in f: 
    lines.append(line.strip()) 
f.close()
lines

Based upon this output, what does `append` do? 

___your answer:___

A second trick is *refinement*. Let's try to understand what `strip()` does. We can proceed in two ways: 
* (a) remove it and see what happens, or 
* (b) change the code so we can see what it does. 

Let's do (b) now: 

There's a simple refinement that we can do. 

Instead of writing 
```
lines.append(line.strip())
```
write
```
foo = line.strip()
lines.append(foo)
```
These two snippets of code do exactly the same thing. The difference is that, in the second, you can print `line` and `foo` to compare the value before and after `strip()`. 

This is one application of the *substitution principle*: If you set `x` to something, you can use `x` wherever you'd use the something. 
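
Here is a sketch of the same principle on an unrelated expression, so it doesn't give away the exercise. The names `n`, `doubled`, and `doubled_refined` are made up for illustration.

```python
# Original: everything happens in one expression.
doubled = len("hello") * 2

# Refined: name the intermediate value so it can be printed.
n = len("hello")
print(n)            # 5
doubled_refined = n * 2

# Both versions compute the same result.
print(doubled == doubled_refined)  # True
```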

Rewrite the code below in that manner, and then pprint both `line` and `foo`. 

In [None]:
# take a look at the file
f = open('data.txt', 'r')
lines = []
for line in f: 
    lines.append(line.strip()) 
f.close()
lines

Based upon this, what does `strip()` do? 

___your answer:___ 

# Now let's investigate a more complex piece of code

Consider the following: 

In [None]:
count = {}
for l in lines: 
    words = l.split(' ')
    for w in words: 
        w = w.lower()
        if w.endswith('.') or w.endswith(','): 
            w = w[0:-1]
        if w in count: 
            count[w] += 1
        else: 
            count[w] = 1
count

Use printing and refinement to infer the answers to the following questions: 

1. What does `lower()` do? Provide evidence. 

___your answer:___

2. What does `w = w[0:-1]` do? Provide evidence.  

___your answer:___

# When you're done, submit the notebook

You can submit a notebook by saving it as a PDF. In the cluster environment, use File | Print (Save as PDF). On other versions, it may be File | Download As (PDF). Then submit the PDF to Gradescope: https://www.gradescope.com/courses/182658

To submit to Gradescope, log into the [website](https://www.gradescope.com/courses/182658), add course **9W7PW3** (if not already added) and submit. The assignment name should match the name of this notebook.