# Day 4 Reading Journal – Responses

### Exercise 10.3 
Write a function called `middle` that takes a list and returns a new list that contains all but the first and last elements. So `middle([1,2,3,4])` should return `[2,3]`.

In [6]:
import doctest

def middle(lis):
    '''
    Takes a list 'lis' and returns a copy of the list with the first and last elements removed
    >>> middle([1,2,3,4])
    [2, 3]
    '''
    
    return lis[1:len(lis)-1]
   
doctest.testmod()

Kudos for documentation!

`lis[1:len(lis)-1]` can be abbreviated to `lis[1:-1]`. A negative list index counts backwards from the end of the list. For example:

In [1]:
lis = [1, 2, 3, 4]
print('lis[-1] =', lis[-1])

s = "Hello"
print('s[-1] = ', s[-1])

lis[-1] = 4
s[-1] =  o


In [2]:
def middle (list):
    a=len(list)
    new_list=list[1:a-1] #we included 1 to exclude 0, or the first number, and if the list continues on,then we just need to subtract 1 from the list
    return new_list

middle ([1,2,3,4])

This is similar to the first solution, but with a temporary variable to hold the length of the list.

Whether to use a temporary variable here is a judgement call. On the plus side, it makes it easier to debug while you're developing the code.

Improvements:
* [Python style](https://www.python.org/dev/peps/pep-0008/) is to surround operators with spaces: `a = len(list)`, `new_list = list[1:a - 1]`
* Choose a different name from `list`. (`lst` is conventional.) `list` is a Python built-in, and a parameter named `list` hides the built-in from code inside `middle`.

In [None]:
def middle(l):
    return l[1:-1]

Short and to the point.

In [3]:
def middle(myList):
    return myList[1:-1]

print(middle([1,2,3,4]))

Almost as short, but `myList` makes it easier to read what the function does even in the absence of a docstring, and the `print` line shows what it does and that it has been tested.

In [9]:
def middle(items):
    """
    Removes first and last item in a list
    Returns modified list
    >>> middle([1,2,3,4,5])
    [2, 3, 4]
    """
    del items[0]
    del items[len(items)-1]
    return items


import doctest
doctest.testmod(verbose=True)

This deviates from the instructions in that it returns the existing list instead of creating a new list.

How could the doc tests be extended to test whether the function returns a *new* list?
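One possible approach to the question above, offered as a sketch rather than the only answer: a doctest can keep a reference to the input, call the function, and then check both that the input is unchanged and that the result is a different object. The name `middle_copy` is a hypothetical stand-in used only for this illustration.

```python
import doctest

def middle_copy(items):
    """Return a new list without the first and last items of `items`.

    >>> original = [1, 2, 3, 4, 5]
    >>> result = middle_copy(original)
    >>> result
    [2, 3, 4]
    >>> original            # the argument must be left untouched
    [1, 2, 3, 4, 5]
    >>> result is original  # and the result must be a distinct object
    False
    """
    return items[1:-1]

doctest.run_docstring_examples(middle_copy, globals())
```

Run against the `del`-based version above, the second and third checks would fail, which is exactly the deviation being tested for.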

In [13]:
def middle(l):
    
    if len(l) == 2:
        return None
    elif len(l):
        return(l[1:len(l)-1])


print(middle([1,2,3,4]))
print(middle([1, 2, 3, 4, 5, 6]))
print(middle([0, 1, 2, 3, 4, 5]))
print(middle([1,2,3,4,45]))
print(middle([1,2]))
print(middle([1,2,3]))

In [14]:
def middle(list):
    '''
        DEFINITION: returns a list with all elements except the first and the last that were in the previous list.
        ARGUMENTS:
            list: list of objects
        RETURNS: list of objects without the first and last objects
        
        >>> middle([1,2,3,4])
        [2, 3]
        >>> middle([1,2,4])
        [2]
        >>> middle([1,4])
        []
        >>> middle([])
        []
        '''
    newlist = []
    if(len(list)>2):
        newlist = list[1:-1]
    return newlist

import doctest
doctest.run_docstring_examples(middle, globals(), verbose=True)

An excellent set of test cases.

Each programming language has conventions about how to document a function's parameters and return value.
This has the right information in a non-Python accent. [This document](https://www.python.org/dev/peps/pep-0257/#multi-line-docstrings) shows the convention for Python. [Relevant xkcd](https://xkcd.com/927/).
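As a rough sketch of what a PEP 257-style docstring might look like for this function (a one-line summary, a blank line, then details). The parameter name `lst` and the exact wording are illustrative assumptions, not a required layout:

```python
def middle(lst):
    """Return a new list without the first and last elements of lst.

    Lists with fewer than three elements yield an empty list.
    The argument itself is not modified.

    >>> middle([1, 2, 3, 4])
    [2, 3]
    >>> middle([1, 4])
    []
    >>> middle([])
    []
    """
    # Slicing already handles short lists, so no length check is needed.
    return lst[1:-1]
```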

### Exercise 10.4 
Write a function called `chop` that takes a list, modifies it by removing the first and last elements, and returns `None`.

What is the difference between `middle` and `chop`? Sketch out the program state or take a look at each in Python Tutor and answer the question in the Markdown cell below.

In [28]:
def chop(items):
    """
    Removes first and last item in a list
    Returns None
    
    >>> chop([1,2,3,4,5])
    
    """
    del items[0]
    del items[len(items)-1]
    return None

import doctest
doctest.testmod(verbose=True)

This is correct. It removes the first and last element.

`return None` is unnecessary but harmless. A function without an explicit return value implicitly returns `None`.

In this case I might `return None` as this code does, to make explicit the match between the code and what the instructions ask for.

In general, a function that is executed "for effect" (it modifies its arguments or other non-local values) is assumed to be fruitless, so you would usually omit the explicit `return None`.

What does `chop([1])` do? What *should* it do?

In [None]:
def chop(lis):
    lis = lis[1:len(lis)-1]

Function parameters such as `lis` are variables that are local to the function – assignments to them change the value of the variable, but don't change what's going on outside the function.

This is a problem for code that uses the above definition as in:

    evens = [2, 4, 6, 8]
    chop(evens)

It means that `evens` after all this is done still has the value `[2, 4, 6, 8]` – chop changed the value of `lis` (into a new list), but it didn't modify the list itself. 
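A minimal sketch of the difference between rebinding a parameter and modifying the list it refers to. The function names `chop_rebind` and `chop_in_place` are hypothetical, chosen for this illustration; slice assignment is one of several ways to modify a list in place.

```python
def chop_rebind(lis):
    # Rebinds the local name only; the caller's list is unaffected.
    lis = lis[1:-1]

def chop_in_place(lis):
    # Slice assignment replaces the contents of the same list object.
    lis[:] = lis[1:-1]

evens = [2, 4, 6, 8]
chop_rebind(evens)
print(evens)   # [2, 4, 6, 8]
chop_in_place(evens)
print(evens)   # [4, 6]
```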

In [4]:
def chop(oldList):
    oldList = oldList[1:(len(oldList)-1)]
    return 'None'

middle([1, 2, 3, 4])

Be careful of the distinction between the special value `None`, and the string `"None"`.

(Also `False` versus `"False"`.)
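A short sketch of why the distinction matters. The two helper functions are hypothetical and exist only to contrast the value `None` with the string `'None'`:

```python
def returns_none():
    return None

def returns_string():
    return 'None'

print(returns_none() is None)    # True
print(returns_string() is None)  # False
print(bool('False'))             # True: any non-empty string is truthy
print(repr(returns_none()), repr(returns_string()))  # None 'None'
```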

In [16]:
def chop(l):
    del l[0]
    del l[len(l)-1]

    return None

print(chop([1,2,3,4]))
print(chop([1, 2, 3, 4, 5, 6]))
print(chop([0, 1, 2, 3, 4, 5]))
print(chop([1,2,3,4,45]))
print(chop([1,2]))
print(chop([1,2,3]))

A good set of test cases. I would also test the extreme: `chop([])`.

In [47]:


def chop(x):
    """
    This function should accept list x and remove the first and last terms. it returns None
    
    >>> chop([1, 2, 3, 4])
    None
    """
    x.pop(0)
    x.pop(len(x)-1)
    return None


v = [1, 2, 3, 4, 5]
chop(v)
print(v)

`lst.pop(i)` is an alternative to `del lst[i]`. Unlike the `del` statement, `pop` is a method that returns the removed item, although the return value isn't used here.



In [9]:
def chop(list):
    list.remove(list[0])
    list.remove(list[len(list)-1])
    return None

`lst.remove(n)` removes by value, instead of position. It removes the first *n* in *lst*.
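To make the by-value versus by-position distinction concrete, here is a small sketch with an illustrative list (the variable names are assumptions for this example only):

```python
letters = ['a', 'b', 'c', 'b']
removed = letters.pop(3)        # by position: removes the item at index 3
print(removed, letters)         # b ['a', 'b', 'c']

letters = ['a', 'b', 'c', 'b']
result = letters.remove('b')    # by value: removes the first 'b' it finds
print(result, letters)          # None ['a', 'c', 'b']
```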

Questions:
* Given `lst = [1, 2, 3, 2]`, for what values of `i` is `lst.remove(lst[i])` *not* equivalent to `lst.pop(i)`?
* Given this, what's a test value that breaks the above definition of `chop`?

In [11]:
def chop (strand):
    del strand[0]
    del strand[-1]
    return (strand)
chop ([1,2,3,4,5,6,7])

#The difference between middle and chop is that in middle, you are not
#making modifications to the string in any way. Alternatively, in chop,
#you are modifying the string. This means that if you have to reference
#the string later on, the input will not be the same. This can be good
#or bad depending on your application. While middle is a fruitful
#function, chop merely modifies an input.

Thumbs up on the explanation.

### Exercise 10.6 
Two words are anagrams if you can rearrange the letters from one to spell the other. Write a function called `is_anagram` that takes two strings and returns `True` if they are anagrams.

In [16]:
import doctest

def is_anagram(str1, str2):
    """Takes two strings and returns true if they are anagrams of one another.
    
    >>> is_anagram("anagram","nag a ram")
    True
    >>> is_anagram("aaaaa", "abbbb")
    False
    """
    for c in str1:
        if str2.find(c) == -1:
            return False
        str2 = str2.replace(c, '', 1)
    return True

doctest.testmod()

This verifies that each letter in `str1` is present in `str2`. It also "uses up" the letters in `str2`, so that a letter that occurs twice in `str1` has to be used twice in `str2`, and so on.

The last line should return `str2 == ''` or `not str2` instead of `True`. Can you see why, and write a test case that demonstrates this?

In [13]:
def is_anagram(word1, word2):
    listA = list(word1)
    listB = list(word2)
    listA.sort()
    listB.sort()
    if listA == listB:
        return True 
    else:
        return False

is_anagram("bananab", "ananabb")

Sometimes it's easier to "normalize" two values and then compare the normalized values for equality. This solution uses the fact that there's a "normal form" for a string, for anagram purposes: the sorted list of its letters.

The last statement could be simply `return listA == listB`. In general, `if test: return True else: return False` and `if test: return False else: return True` can be replaced by `return test` and `return not test`, respectively.

The following function uses this fact. It also uses `sorted`, which is the fruitful version of `list.sort` – `sorted(lst)` does *not* modify `lst`, and *does* return a value. (How does this compare to `middle` and `chop`, above?)

In [None]:
def is_anagram(word1, word2):
    listA = list(word1)
    listB = list(word2)
    return sorted(listA) == sorted(listB)

is_anagram("bananab", "ananabb")

Since `listA` and `listB` are only used once, we can use the values that they're initialized with instead.

In [None]:
def is_anagram(word1, word2):
    return sorted(list(word1)) == sorted(list(word2))

is_anagram("bananab", "ananabb")

### Exercise 10.8  
The (so-called) Birthday Paradox: <br /><br />
1\. Write a function called `has_duplicates` that takes a list and returns `True` if there is any element that appears more than once. It should not modify the original list.

### Solutions

There were a variety of different strategies, many of which were new to me.

In [17]:
def has_duplicates(lst):
    """Return true iff lst contains duplicate elements.
    
    >>> has_duplicates([])
    False
    >>> has_duplicates([1])
    False
    >>> has_duplicates([1, 2])
    False
    >>> has_duplicates([1, 2, 1])
    True
    """
    for i in range(len(lst)):
        for j in range(i + 1, len(lst)):
            if lst[i] == lst[j]:
                return True
    return False

import doctest
doctest.run_docstring_examples(has_duplicates, globals())

In [20]:
def has_duplicates(lst):
    lst = sorted(lst)
    for i in range(1, len(lst)):
        if lst[i - 1] == lst[i]:
            return True
    return False

# or, using a generator expression:
def has_duplicates(lst):
    lst = sorted(lst)
    return any(lst[i - 1] == lst[i] for i in range(1, len(lst)))

def has_duplicates(lst):
    lst = sorted(lst)
    return any(a == b for a, b in zip(lst[1:], lst[:-1]))

In [None]:
def has_duplicates(lst):
    seen = set()
    for item in lst:
        if item in seen:
            return True
        seen.add(item)
    return False

In [28]:
def has_duplicates(lst):
    for i, x in enumerate(lst):
        if x in lst[i + 1:]:
            return True
    return False

In [24]:
def has_duplicates(seq):
    seen = set()
    seen_twice = {x for x in seq if x in seen or seen.add(x)}
    return bool(seen_twice)

has_duplicates([1,2,3,2,1,5,6,5,5,5])

# Learned from a post by user Ritesh Kumar on stackoverflow.com

Kudos for finding this, and for appropriate use of attribution.

I modified the last line to return a boolean instead of a list of seen elements.
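The comprehension is compact because `set.add` returns `None`, which is falsy: `x in seen or seen.add(x)` is truthy only when `x` was already seen, and otherwise records `x` as a side effect. As a sketch of the same logic written out as a loop (the name `has_duplicates_unrolled` is hypothetical):

```python
def has_duplicates_unrolled(seq):
    seen = set()
    seen_twice = set()
    for x in seq:
        if x in seen:
            seen_twice.add(x)   # x has appeared before
        else:
            seen.add(x)         # first sighting; add() itself returns None
    return bool(seen_twice)

print(has_duplicates_unrolled([1, 2, 3, 2, 1, 5, 6, 5, 5, 5]))  # True
print(has_duplicates_unrolled([1, 2, 3]))                       # False
```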

In [17]:
import collections

def has_duplicates(l):
    return collections.Counter(l) != collections.Counter(set(l))

has_duplicates([1,2,5,4])

In [20]:
def has_duplicates(myList):
    for x in myList:
        if myList.count(x) != 1:
            return True
    return False

print(has_duplicates(['a','a','b']))

### Part 2

2\. If there are 23 students in your class, what are the chances that two of you have the same birthday? Put your answer in the Markdown cell below. You can estimate this probability by generating random samples of 23 birthdays and checking for matches. Hint: you can generate random birthdays with the randint function from the [random module](https://docs.python.org/2/library/random.html).

You can read about this problem at http://en.wikipedia.org/wiki/Birthday_paradox, and you can download Allen's solution from http://greenteapress.com/thinkpython2/code/birthday.py.

In [42]:
import random

def generate_birthdays(num):
    days = [0] * num
    for i in range(num):
        days[i] = random.randint(1, 365)
    return(days)

def estimate_probability(student_count, trials):
    positives = 0
    for _ in range(trials):
        if has_duplicates(generate_birthdays(student_count)):
            positives += 1
    return positives / trials

print(estimate_probability(23, 1000))

0.511


In [45]:
# using comprehension and a nested function

import random

def generate_birthdays(num):
    return [random.randint(1, 365) for i in range(num)]
                           
def estimate_probability(student_count, trials):
    def trial():
        return has_duplicates(generate_birthdays(student_count))

    return sum(trial() for _ in range(trials)) / trials

print(estimate_probability(23, 1000))

0.519
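As a cross-check on the simulated estimates, the probability can also be computed directly. This sketch assumes 365 equally likely birthdays, no leap days, and independent students; the function name `exact_probability` is hypothetical. It multiplies the chances that each successive student avoids all earlier birthdays, then takes the complement.

```python
def exact_probability(student_count, days=365):
    # Probability that all birthdays are distinct: the i-th student
    # (counting from 0) must avoid the i birthdays already taken.
    p_all_distinct = 1.0
    for i in range(student_count):
        p_all_distinct *= (days - i) / days
    return 1 - p_all_distinct

print(round(exact_probability(23), 4))  # 0.5073
```

The simulated values above (0.511 and 0.519 over 1000 trials) are within ordinary sampling noise of this figure.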
