<font size = 6> <b>Programming: putting things together</b></font>

In this notebook we go through some of the steps in solving problems and packaging the solutions into Python scripts.

# Exercise 1

Our goal in this exercise is to write a function that removes all elements of a sequence that do not occur at least two times.   

Note: This should remind you of an exercise from a previous notebook but refresh your memory and try it anyway.  We are going to proceed by steps and in the final step in this exercise we are going to write the function described above. 

We'll break the problem into two parts.

Using the string `test_str`, write a `for` loop that **counts**
the number of times each item in `test_str` occurs. We will store these counts in
a dictionary.  We are going to do this using a special kind of dictionary called a `Counter`.  If you know enough about `Counter`s to do this without a `for`-loop, go ahead and do so.  Otherwise, write a `for`-loop that loops through the string using `ctr` (defined in line 4 below) to  keep count of how many times each character has occurred.  Use the `test_str` defined below to test your code.  

In [None]:
test_str = 'abracadabra'
from collections import Counter

ctr = Counter()
#[Your for loop to get all the counts for items in test_str]
ctr

If you defined `ctr` correctly, it should look like this:

```
Counter({'a': 5, 'r': 2, 'b': 2, 'c': 1, 'd': 1})
```

Moreover, if you want to know the count of some character in `test_str`, you can look it up like this:

```
In [5]: ctr['a']
Out[5]: 5
```
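
A small sketch of a few other `Counter` behaviors, using a different string (`'mississippi'` is just an illustrative choice, not part of the exercise). It assumes only the standard `collections` module. A `Counter` can also be built directly from a sequence, and asking for an item that was never counted gives 0 rather than raising a `KeyError`.

```python
from collections import Counter

# Build the counts directly from a sequence (no explicit loop).
word_counts = Counter('mississippi')

print(word_counts['s'])   # count of a character that occurs
print(word_counts['z'])   # a missing item counts as 0, no KeyError

# most_common(n) lists the n most frequent items with their counts.
top_two = word_counts.most_common(2)
print(top_two)
```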

Having now counted, we are going to write a line of code that removes all elements of `test_str` that do not occur at least 2 times.  Hint:  You should use a list-comprehension with a test:

```
[x for x in test_str if **test**]
```

This collects only the members of `test_str` that pass the test.
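
As an illustration of the pattern with a different test, here is a minimal sketch that keeps only the vowels of a word. The names `word`, `vowels`, and `kept` are made up for this example; the test you need for the exercise will involve `ctr` instead.

```python
word = 'programming'
vowels = 'aeiou'

# Keep only the characters that pass the test "is a vowel".
kept = [ch for ch in word if ch in vowels]
print(kept)

# The result is always a list; join it to get a string back.
print(''.join(kept))
```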

In the final step we put all the code we've written into a function that not only works on `test_str`, but on any sequence, returning a version of the sequence with all singleton elements removed.  We've started the function definition below.  Remember to use `return`.

In [None]:
def remove_singletons (seq):
    #[Your code here]
    pass


# Some items to test on follow
test_str1  = 'abracadabra'
# Make sure to be okay on a boundary case
test_list1 = []
# Make sure to be able to do nothing
test_list2 = list(range(7))
# Another kind of boundary case
test_str2 = test_str1 + test_str1

print(remove_singletons(test_str1))
print(remove_singletons(test_list1))
print(remove_singletons(test_list2))
print(remove_singletons(test_str2))

The steps we took:


1.  **Analysis**. Break the problem down into steps you know how to do in Python (yes, this is the hard part).  We know how to count the number of occurrences of the elements of a container.  We know how to filter out things that don't meet some criterion.  We broke the task of producing a sequence with the singletons removed into those two doable pieces.  
2.  **Be example based.**  Write the code to execute one of your steps on one of your examples.  Interact with Python at this stage.  Get the basic idea working.
3.  **Write a function**. Turn your code into a reusable function.
4.  **Test**.  Test on a variety of cases.  Frequently I'll give you a set of cases to test on in an exercise. As you get more experienced you'll be able to generate your own useful test items.  If this is going to be a reusable piece of code you're going to change and maintain for a while, this is an extremely important step.  You're building your first test suite.
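
A test suite can start out as nothing more than a handful of `assert` statements. The sketch below shows the idea on a hypothetical function `dedupe` (invented for this illustration, not part of the exercise), with test cases chosen along the same lines as the ones above: a typical case, an empty input, an input where nothing should change, and an extreme case.

```python
def dedupe(seq):
    """Hypothetical example function: drop repeated items, keeping first occurrences."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

# A tiny test suite: each assert states one expectation.
assert dedupe('abracadabra') == ['a', 'b', 'r', 'c', 'd']   # typical case
assert dedupe([]) == []                                      # boundary: empty input
assert dedupe(list(range(7))) == list(range(7))              # nothing to remove
assert dedupe('aaaa') == ['a']                               # everything repeated
print('all tests passed')
```

If an `assert` fails, Python raises an `AssertionError` pointing at the offending line, so rerunning the cell after every change tells you quickly whether something broke.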

# Exercise 2: For loops with Sudoku

Just execute the code cell below.  We need it for the discussion and exercises that follow.

In [None]:
digits   = '123456789'
rows     = 'ABCDEFGHI'
cols     = digits
squares  = [r+c for r in rows for c in cols]
unitlist = ([[r+c for r in rows] for c in cols] + 
           [[r+c for c in cols] for r in rows] + 
           [[r+c for r in rs for c in cs] for rs in ('ABC','DEF','GHI') for cs in ('123','456','789')])

# Assign to each square s, 3 sets of squares, the unique row and column and box s belongs to.
units = dict((s, [u for u in unitlist if s in u])
             for s in squares)
## Assign to each square s a set of squares, namely those that can't have the same value as s.
peers = dict((s, set(sum(units[s],[]))-set([s]))
             for s in squares)
grid1  = '003020600900305001001806400008102900700000008006708200002609500800203009005010300'
grid1_soln = '483921657967345821251876493548132976729564138136798245372689514814253769695417382'
grid2  = '003020600900305001001806400008102900700000008006708200002689500800203009005010300'
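
To get a feel for what these data structures contain, it may help to inspect a few of them. The sketch below repeats the relevant definitions so that it runs on its own (written with dictionary comprehensions, which should be equivalent to the `dict(...)` calls above); the square `'C2'` is an arbitrary choice.

```python
digits = '123456789'
rows = 'ABCDEFGHI'
cols = digits
squares = [r + c for r in rows for c in cols]
unitlist = ([[r + c for r in rows] for c in cols] +
            [[r + c for c in cols] for r in rows] +
            [[r + c for r in rs for c in cs]
             for rs in ('ABC', 'DEF', 'GHI') for cs in ('123', '456', '789')])
units = {s: [u for u in unitlist if s in u] for s in squares}
peers = {s: set(sum(units[s], [])) - {s} for s in squares}

print(len(squares))        # number of squares on the board
print(len(unitlist))       # columns + rows + boxes
print(units['C2'][0])      # the column unit containing C2
print(len(peers['C2']))    # squares that cannot share C2's value
```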

The code above defines the row labels in a Sudoku puzzle  in a variable `rows`. Write a `for`-loop that prints out the row labels in a Sudoku puzzle.

The code above defines the column labels in a Sudoku puzzle  in a variable `cols`. Write a `for`-loop that prints out the column labels in a Sudoku puzzle.

Write a list comprehension that returns a list of the squares in rows `ABC` of a Sudoku puzzle.  For a hint, look at the list comprehension that defines `squares` in the code above.

In [None]:
ABC  = [] # Put your list comprehension inside the square brackets

In [None]:
ABC

If the variable `squares` had not been precomputed for you, you could also have computed it yourself with a **double loop** in a list comprehension, as discussed in [the section on loops](http://www-rohan.sdsu.edu/~gawron/python_for_ss/course_core/book_draft/programming_intro/list_comprehension.html) in the online text.

In [None]:
[r+c for r in rows for c in cols]

This is an example of a very important idea with applicability to a lot of computations.  You're pairing every element in `rows` with every element in `cols`. That's called taking the cross-product of the two containers.  So now let's turn this idea into a function that won't just apply to `rows` and `cols`, but to any two containers with elements that can be combined with `+`. Let's call the function `cross` and the two containers that are its arguments `A` and `B`.  Finish the definition below.

In [None]:
def cross (A, B):
    return [your list comprehension code here]

Try writing another function `cross2tuple` that works as follows:

In [None]:
cross2tuple(('rock','paper','scissors'),('rock','paper','scissors'))

[('rock', 'rock'),
 ('rock', 'paper'),
 ('rock', 'scissors'),
 ('paper', 'rock'),
 ('paper', 'paper'),
 ('paper', 'scissors'),
 ('scissors', 'rock'),
 ('scissors', 'paper'),
 ('scissors', 'scissors')]

One of the most powerful things about functions is that they allow you to **simplify**.  Once you identify an important operation like `cross` you can see it in other computations.  Consider the definition of `unitlist` shown below.  We want to create a list that contains all
of the **units**, that is, all
of the columns (9 groups of squares), all of the rows (9 groups of squares), and
all of the boxes (again, 9 groups of squares).  In the code above (adapted from
Norvig's code) this was done by concatenating together three lists, as shown
in the snippet below.  Line 1 computes a list of the 9 columns; line 2 a list of the 9 rows;
and lines 3 & 4 a list of the 9 boxes.
```
1 unitlist = ([[r+c for r in rows] for c in cols] +
2             [[r+c for c in cols] for r in rows] +
3             [[r+c for r in rs for c in cs] for rs in ('ABC','DEF','GHI') 
4                                            for cs in ('123','456','789')])
```
Notice that each line involves a `cross` operation.  In line 1, we take a column `c`
and *cross* all the things in `rows` with all the things in `c` (namely `c`).  Similarly in
line 2, we take a row `r` and *cross* it with all things in `cols`.  Finally, to compute the contents of each box unit, we *cross* the rows in that box unit with the cols in that box unit. So with `cross` defined, we can rewrite the code above as
```
1 unitlist = ([cross(rows,c) for c in cols] +
2             [cross(r,cols) for r in rows] +
3             [cross(rs,cs) for rs in ('ABC','DEF','GHI') 
4                           for cs in ('123','456','789')])
```
Notice the code's much easier to read and understand.  We don't have to rethink the parts of a *cross* operation each time we get to one.  That simplification of the comprehension process is one of the huge benefits of functions.
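
One way to convince yourself that the rewrite is faithful is to build both versions and compare them. The sketch below assumes one possible definition of `cross` (yours may differ in detail) and repeats the definitions of `rows` and `cols` so that it runs on its own; the names `unitlist_long` and `unitlist_short` are made up for the comparison.

```python
rows = 'ABCDEFGHI'
cols = '123456789'

def cross(A, B):
    """One possible definition: every a + b with a from A and b from B."""
    return [a + b for a in A for b in B]

# Original version, with the comprehensions written out.
unitlist_long = ([[r + c for r in rows] for c in cols] +
                 [[r + c for c in cols] for r in rows] +
                 [[r + c for r in rs for c in cs]
                  for rs in ('ABC', 'DEF', 'GHI') for cs in ('123', '456', '789')])

# Rewritten version, using cross.
unitlist_short = ([cross(rows, c) for c in cols] +
                  [cross(r, cols) for r in rows] +
                  [cross(rs, cs) for rs in ('ABC', 'DEF', 'GHI')
                   for cs in ('123', '456', '789')])

print(unitlist_long == unitlist_short)
```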

The code for `cross` is also quite general.  Thinking about what the code does, answer the following questions.

1. Suppose `A` is of length 3.  Does `B` have to also be of length 3 in order for `cross(A,B)`
to make sense?
2. Suppose `A` is of length `M` and `B` is of length `N`.  What is the length of `cross(A,B)`?
3. Try to guess what `cross([1,1,1],[-1,-1,-1,-1])` will be and describe your answer in a single sentence.  Verify it using Python.
4. Try to guess what `cross(['1','1','1'],['-1','-1','-1','-1'])` will be and describe your answer in a single sentence.  Verify it using Python.

[Your answers to questions 1-4 in prose in this markdown cell.]

In [None]:
[Test your answers to question 3 & 4 in this code cell]

# Imports & namespaces

In this exercise we try to diagnose and fix namespace-related errors.  As discussed in the `running_python` notebook, `numpy` is a module that defines a `log` function, called as follows in that notebook:

```
log(2)
```

But note that the function doesn't work here:

In [None]:
log(2)

NameError: name 'log' is not defined

Of course in the running_python notebook we had automagically imported numpy functions with our 

```
%pylab inline
```

cell at the beginning of the notebook.  So let's try importing `numpy` and see if that works.

In [None]:
import numpy 
log(2)

NameError: name 'log' is not defined

Still doesn't work!

Describe what happened when you evaluate the cell above and why.  What does it have to do with namespaces?  To answer this question, you might need to review [the online book draft section on name spaces](http://gawron.sdsu.edu/python_for_ss/book_draft/anatomy/nme_space.html).

For a little extra credit, explain why `log(2)` did work in the running_python notebook.

[Your answer in this markdown cell]

The code cell below is a copy of the cell above.  Edit it and fix the problem

In [None]:
import numpy
log(2)

The `string` module defines a character string `ascii_letters`.  The line in the code looks like this:

```
ascii_letters = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
```

Fix the problem in the next cell.


In [None]:
import string
ascii_letters

NameError: name 'ascii_letters' is not defined

Finally, `numpy` is a popular module, frequently invoked by the programs that use
it.  Hence it's convenient to give that namespace an easier-to-type *nickname* with the following
import command:

In [None]:
import numpy as np

**Exercise**:  What would the proper fix to the namespace problem above be if you imported
`numpy` this way?

Put your answer in this markdown cell.  Note  that if you've executed both `import numpy`
and `import numpy as np`, both names for the module will work.

There's one more variant of the `import` command worth knowing, since you'll see it a lot in other people's code
even if you don't use it yourself.

In [None]:
from numpy import log

This imports just the log function and the name `log`. So the following works.

In [None]:
log(2)

0.6931471805599453

Recall that Python's `math` module also defines a version of the `log` function, albeit a slightly different one.

What do you think will happen when we now execute the following cell?

In [None]:
from math import log

Surprised?  Welcome to the wonderful world of powerful customization.

You now know enough to cause yourself serious grief.

**Exercise**: Which version  of `log` do you think will now be called when you execute
`log(2)`?

Confirm or disconfirm your guess by trying out one of the features defined for one function but not
the other, as discussed in the running_python notebook.

Put your demonstration in the cell below.

The moral here is that when you do `from <module> import <name>`, you are importing `<name>` into your global namespace.  You are actually circumventing the usual namespace machinery, but it's
not bad coding, because you are declaring what you've done in a prominent place in your
code (where all your `import` statements are).

Just as when you assign any object to a name, that name is now used up. And you need to be careful
not to abuse it by giving that name another meaning.
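
The same rebinding can be seen with two standard-library modules that both define a function named `sqrt`. This is only a sketch using `math` and `cmath` (chosen for illustration; they are not the modules from the exercise above), and the names `first` and `second` are made up: whichever `from ... import` runs last decides what the bare name means.

```python
from math import sqrt
first = sqrt            # remember which function the name refers to now

from cmath import sqrt  # rebinds the same global name
second = sqrt

print(first is second)  # do both names refer to the same function?
print(first(4))         # math version: works on non-negative reals
print(second(-4))       # cmath version: accepts negative numbers

import math
print(math.sqrt(4))     # the qualified name is unaffected by the rebinding
```

Using the qualified name (`math.sqrt`, `cmath.sqrt`) avoids the ambiguity altogether, at the cost of a little more typing.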

# Customized class example

Namespace issues also arise  whenever we have `class` definitions and instances
of classes.  The following  class definition defines the `Point`
class (this example is also used in the Modules, Namespaces, and Classes notebook,
but the details are a little different). A `Point` instance represents a point in the `xy`-plane;  `Point`
instances  have methods like `distance_from_origin` and `distance` (from another
point instance).

In [None]:
import math
## Not used in class dfn below but helpful for discussion
import numpy as np

class Point:

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return "Point({0}, {1})".format(self.x, self.y)

    def __eq__ (self, other):
        if self.x == other.x and self.y == other.y:
            return True
        else:
            return False

    def distance_from_origin (self):
        return math.sqrt(self.x**2 + self.y**2)

    def distance(self, p2):
        return math.sqrt((p2.x - self.x)**2 + (p2.y - self.y)**2)

## Creating instances of the class

In [None]:
p1 = Point(3,4)
p2 = Point(1,2)
p3 = Point(3,4)
print((p1.x))
print((p2.y))
p1 == p3

3
2


True

To find the string to print for a class instance, `print` calls the instance's `__str__` method.

In [None]:
print(p1)

Point(3, 4)


To find the string to print out when the value of a cell is a class instance, the Python interpreter calls the instance's `__repr__` method.

Since we didn't define one above, a generic inherited `__repr__` method is called.

In [None]:
p1

<__main__.Point at 0x7fa9ffc16750>
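
To see the two methods side by side, here is a minimal sketch with a hypothetical class `Temperature` (invented for this illustration, unrelated to `Point`) that defines both `__str__` and `__repr__`.

```python
class Temperature:
    """Hypothetical class used only to contrast __str__ and __repr__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        # Used by print() and str(): meant to be readable.
        return "{0} degrees".format(self.degrees)

    def __repr__(self):
        # Used when a cell's value is displayed, and by repr().
        return "Temperature({0})".format(self.degrees)

t = Temperature(21)
print(str(t))
print(repr(t))
print([t])   # containers display their items with __repr__
```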

**Exercise**:  There is an error raised by the code in the next cell

In [None]:
p4 = Point()

TypeError: __init__() missing 2 required positional arguments: 'x' and 'y'

Make a very small modification in the definition of `Point`   so that the following code cell works:

In [None]:
p4 = Point()
print(p4)

Point(0, 0)


After fixing the problem, explain what went
wrong and why your fix works.  To answer this question, you might need to review [the online book draft section on classes](http://gawron.sdsu.edu/python_for_ss/book_draft/anatomy/classes.html).

Your explanation here

In [None]:
class Point1:

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return "Point1({0}, {1})".format(self.x, self.y)

    def __eq__ (self, other):
        if self.x == other.x and self.y == other.y:
            return True
        else:
            return False

    def distance_from_origin (self):
        return math.sqrt(self.x**2 + self.y**2)

    def distance(self, p2):
        return math.sqrt((p2.x - self.x)**2 + (p2.y - self.y)**2)

**Exercise**  Modify the above class definition (which is a copy of `Point`
that has been renamed `Point1`) so that a point `P1` has a new method `average`
that returns a new point at the midpoint of the line segment connecting `P1` and `P2`. 
For example,

```
In [ ]: P1 = Point1(3, 4)
In [ ]: P2 = Point1(-1, 5)
In [ ]: P3 = P1.average(P2)
In [ ]: print(P3)
Point1(1.0, 4.5)
```

Note: Test your method and be sure to re-execute the class definition and create
new points using the redefined class when you do.  Otherwise, `P1` won't have
the method defined for it.

In [None]:
# Test your solution here after modifying and re-executing the def of Point1 above

**Optional extended exercise**.  Edit the class `Point` to create a new class `Point2`, which
allows points to be created either with
Cartesian coordinates (x, y) or with polar coordinates (angle $\theta$ and line length $\rho$).
Creating points should look like this:

```
>>> from numpy import pi
>>> p1 = Point2(cartesian = (2,3))
>>> p2 = Point2(polar = (pi/3, 2))
```

You want any point, regardless of how it was created, to support the following

```
>>> p1.x
2
>>> p1.y
3
>>> p1.rho()
3.605551275463989  # np.sqrt of 3**2 + 2**2
>>> p1.theta()
0.982793723247329 # np.arccos(2/p1.distance_from_origin())
```

The easiest way to do this is just to store the x and y
values in the `__init__` function.

```
self.x = x
self.y = y
```

Of course, if `Point2(polar = (pi/3, 2))` has been used,
supplying `theta` and `rho` instead of `x` and `y`,
you will have to compute `x` and `y` from `theta` and `rho`.
For that you can use the relationship

```
x = rho * np.cos(theta)
y = rho * np.sin(theta)
```

What this means is that you will need to define
the methods `rho` and `theta` as in, for example
`p1.rho()` above.  The comments above should
help.
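
The conversions themselves can be tried out before building them into a class. The sketch below uses two hypothetical helper functions, `polar_to_cartesian` and `cartesian_to_polar` (names invented here), and the standard `math` module instead of `numpy`. Note that it computes the angle with `math.atan2` rather than the arccosine mentioned in the comment above; that is a substitution on my part, made because `atan2` gives the right angle in every quadrant, whereas the arccosine formula only does so when `y` is not negative.

```python
import math

def polar_to_cartesian(theta, rho):
    """Hypothetical helper: angle theta (radians) and length rho to (x, y)."""
    return rho * math.cos(theta), rho * math.sin(theta)

def cartesian_to_polar(x, y):
    """Hypothetical helper: (x, y) to (theta, rho)."""
    rho = math.hypot(x, y)      # same as sqrt(x**2 + y**2)
    theta = math.atan2(y, x)    # angle in radians, correct in every quadrant
    return theta, rho

x, y = polar_to_cartesian(math.pi / 3, 2)
theta, rho = cartesian_to_polar(x, y)
print(x, y)
print(theta, rho)               # should recover pi/3 and 2, up to rounding
```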