# Patterns

## Notebook Content
Enter data here for the histogram below.

In [None]:
data = [1,4,5,2,3,4,4,7,8,2,2,2,3,1,1,1,2,20,2,2,9,10,1,2,3]
num_bins = 3

# Cell that plots a histogram

- Preconditions
    - `data` is a list of numbers.
    - `num_bins` is the number of bins to use.
- Postconditions
    - The output is a histogram of `data`, binned into `num_bins` equal-sized bins.
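
The postcondition above can also be inspected as numbers rather than as a picture. The following is a minimal sketch, not part of the original notebook: `check_data_precondition` and `sample` are hypothetical names introduced for illustration, and it assumes that `np.histogram` computes its counts and bin edges the same way `ax.hist` does.

```python
import numpy as np

def check_data_precondition(data):
    """Hypothetical helper: checks only the stated precondition on `data`."""
    assert isinstance(data, list), "data must be a list"
    assert all(isinstance(x, (int, float)) for x in data), "data must contain numbers"

# A small illustrative sample (not the notebook's data).
sample = [1, 4, 5, 2, 3, 4, 4, 7, 8]
check_data_precondition(sample)

# np.histogram returns the counts per bin and the bin edges,
# so "equal-sized bins" can be verified directly.
counts, edges = np.histogram(sample, bins=3)
widths = np.diff(edges)  # width of each bin

print("counts:", counts)
print("edges: ", edges)
print("widths:", widths)
```

Every value in `sample` should land in exactly one bin, and all entries of `widths` should agree up to floating-point rounding.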

In [None]:
%matplotlib notebook
import matplotlib
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
n, bins, patches = ax.hist(data, num_bins)
ax.set_title(r'Histogram')
plt.show()

# A few exercises to get you started. 

1. Change the statement above to `num_bins = 10` and run both cells. What happens?

___Your answer:___

2. Change the statement above to `num_bins = None` and run both cells. What happens?

___Your answer:___ 

3. Change the data statement to `data = [1,1,1,2,2,2,3,3,3,4,4,4,5,5,5]` and run both cells. What happens? 

___Your answer:___ 

4. Based upon what we've observed from these experiments, are the preconditions complete? What should we write instead?

___Your answer:___ 

5. Based upon what we've observed from these experiments, are the postconditions complete? What should we write instead?

___Your answer:___ 

# Going further

Programming for Data Science is often a matter of meeting preconditions and then letting someone else's code do the actual work. 

For example, suppose that instead of a list of data, we have a string: 

    dstring = '1 1 2 1 3 3 1 4 5 2 3 4'
    
How would we go about converting this to a list of numbers in `data`? 

Hints: 

1. string to list of strings: `string.split(' ')`
2. string to integer: `int(s)`
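
As a small illustration of the two hints taken separately (a sketch on a different, made-up string; `example`, `pieces`, and `first` are names introduced here, and combining the two steps for the whole list is left to you):

```python
example = '10 20 30'

# Hint 1: split the string at spaces to get a list of strings.
pieces = example.split(' ')

# Hint 2: convert a single string to an integer.
first = int(pieces[0])

print(pieces)
print(first)
```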

Write your answer in the following cell: 

In [None]:
dstring = '1 1 2 1 3 3 1 4 5 2 3 4'
# type your solution here 
data = ...

# What are the preconditions for your code in the previous cell?

Hint: Does 

    dstring = '1,1,2,1,3,3,1,4,5,2,3,4' 

work properly? 

___Your answer:___

# What are the postconditions for your code in the previous cell?

___Your answer:___ 

# The principle of weakest preconditions. 

> A sequence of program fragments is *correct* if, for each fragment, 
> the postconditions for the previous fragment are the same as or 
> more restrictive than the current fragment's preconditions.

1. Correctness depends upon order of execution. 
2. Thus, the order in which cells are executed is crucial. 
3. The principle only guarantees what will happen if all preconditions and postconditions align. 
4. If preconditions are not met by prior postconditions, anything can happen. 
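
One way to picture the principle is to write each fragment's preconditions and postconditions as assertions. This is a minimal sketch under assumptions: `parse_readings` and `mean` are hypothetical fragments invented for illustration, not functions used elsewhere in this notebook.

```python
def parse_readings(text):
    """Fragment 1 (hypothetical).
    Precondition:  text holds one or more numbers separated by whitespace.
    Postcondition: returns a non-empty list of floats.
    """
    readings = [float(token) for token in text.split()]
    assert len(readings) > 0, "postcondition violated: no readings parsed"
    return readings

def mean(values):
    """Fragment 2 (hypothetical).
    Precondition:  values is a non-empty list of numbers.
    Postcondition: returns the arithmetic mean of values.
    """
    assert len(values) > 0, "precondition violated: values is empty"
    return sum(values) / len(values)

# Fragment 1's postcondition (non-empty list of floats) is more restrictive
# than fragment 2's precondition (non-empty list of numbers), so the chain is safe.
readings = parse_readings('2.0 4.0 6.0')
average = mean(readings)
print(average)

# Skipping fragment 1 and passing something that breaks the precondition
# is caught here only because we wrote the assertion.
try:
    mean([])
except AssertionError as err:
    failure = str(err)
    print(failure)
```

Without the assertion in `mean`, the empty list would instead cause a `ZeroDivisionError`, which is one concrete form of "anything can happen".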


# Programming with "black boxes" 

Several times during this course, I will be giving you recipes or patterns that utilize "black boxes" for which only the preconditions and postconditions are known. I won't be telling you much about what's inside the box. You will have to piece together these boxes into a reasonable data flow. 

For example, consider the following situation: 

1. A function `splitter` translates a string into a list of the substrings it contains. If `x` is a string, then `splitter(x)` is the list of strings obtained by splitting `x` at spaces.
2. A function `plotter` plots a histogram. It has parameters `data` and `num_bins`, where `data` is a list of integers and `num_bins` is the number of bins to use, as before. 
3. A function `numeric` translates a list of strings into a list of integers. If `x` is a list of strings, `numeric(x)` is the corresponding list of integers. 

In what order should these be executed to provide the plot above? 

___Your answer:___ 

In [None]:
# here's the code for our functions 
def splitter(s): 
    return s.split(' ')

def plotter(data, num_bins):
    fig, ax = plt.subplots()
    n, bins, patches = ax.hist(data, num_bins)
    ax.set_title(r'Histogram')
    plt.show()

def numeric(s): 
    return [ int(x) for x in s ]

Write the actual code to do this in the cell below. Use these functions. Don't worry about what they do, yet. 

In [None]:
dstring = '1 1 2 1 3 3 1 4 5 2 3 4'
num_bins = 5
# { replace this with code to plot dstring as a histogram }

# Challenge problem: do it in one line of code
Using the substitution principle, make that plot in one line of code in the following cell:
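
Reading "the substitution principle" here as "a variable may be replaced by the expression that was assigned to it" (an assumption about the intended meaning), the idea can be sketched with two unrelated, hypothetical functions `double` and `increment`:

```python
def double(x):
    """Hypothetical function: returns twice its argument."""
    return 2 * x

def increment(x):
    """Hypothetical function: returns its argument plus one."""
    return x + 1

# Two statements, with an intermediate variable...
step = double(5)
two_step = increment(step)

# ...and the same computation with `step` replaced by its defining expression.
one_line = increment(double(5))

print(two_step, one_line)
```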

In [None]:
# { replace this with code to plot dstring as a histogram }

# An afterword on rituals and patterns
The difference between a professional data scientist and a dilettante is a complete understanding of this lesson. Many people who think of themselves as programmers do not know how to properly apply a pattern, nor do they understand even the questions to ask in order to apply a pattern correctly. This leads to inaccuracy of results and -- in extreme cases -- financial losses. 

# When you're done, submit the notebook

You can submit a notebook by saving it as PDF. In the cluster environment, use File | Print (Save as PDF); on other versions, it may be File | Download As (PDF). Then submit the PDF to Gradescope: https://www.gradescope.com/courses/182658

To submit to Gradescope, log into the [website](https://www.gradescope.com/courses/182658), add course **9W7PW3** (if not already added) and submit. The assignment name should match the name of this notebook.