#### More intro to python...

We continue where we left off in the previous intro worksheet.

The following is a *try-except* block.  

You use it when you want to do something you know might not work.

This snippet uses the built-in list method **index**, which raises a `ValueError` when the item is not in the list.


In [1]:
L = ["1","2","3"]
a = input()
try:
    print("The index of {} in L is {}".format(a,L.index(a)))
except ValueError:  # raised by L.index(a) when a is not in L
    print("Uh oh, not there.")

 5

Uh oh, not there.
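
A bare `except:` clause catches every exception, including ones you did not anticipate, so it is usually safer to name the exception you expect. As a minimal sketch (the helper name `find_index` is made up for illustration), the function below catches only the `ValueError` that `list.index` raises for a missing item:

```python
def find_index(items, target):
    """Return the index of target in items, or None if it is absent."""
    try:
        return items.index(target)
    except ValueError:
        # list.index raises ValueError when target is not in the list
        return None

L = ["1", "2", "3"]
print(find_index(L, "2"))  # "2" is present
print(find_index(L, "5"))  # "5" is absent, so the function returns None
```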


#### Dictionaries

Python has a built-in associative array type called a `dict`.  This is similar to `map` (or, more precisely, the hash-based `unordered_map`) in the C++ standard library.

It is based on key-value pairs.

In a standard array (or list) you can think of indexes as _keys_ and the array element at the indexed location as the _value_.

For example if we do:

    L = ['a','b','c','banana']
    
Then for the key 3, the value is 'banana'.  The rule for access is:
    
    L[key] == value

Dictionaries are similar. The difference is that the keys do not have to be integers. They can be any "[hashable](https://stackoverflow.com/questions/14535730/what-do-you-mean-by-hashable-in-python)" object.
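
As a rough illustration of what hashable means in practice: immutable built-in types such as strings, numbers, and tuples can serve as keys, while mutable containers such as lists cannot. The variable name `table` is just a placeholder for this sketch.

```python
table = {}
table["alice"] = 31            # a string key
table[(2020, 1)] = "january"   # a tuple key
table[42] = "answer"           # an integer key

# Lists are mutable, hence not hashable, so they are rejected as keys.
try:
    table[[1, 2]] = "oops"
except TypeError as err:
    print("Cannot use a list as a key:", err)

print(table)
```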

Here are some examples.


In [57]:
D = dict()
D[3.14] = "pi"
D[2.71] = "e"
D

{3.14: 'pi', 2.71: 'e'}

If you know JSON format you will recognize the similarity to the presentation of the dictionary `D`.
In python it is very easy to go from dictionaries to JSON and back again.
This might be relevant at some point if we need to load JSON data.
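
A minimal sketch of the round trip, using the standard library `json` module (the dictionary `record` is made up for illustration). One caveat worth knowing: JSON object keys are always strings, so non-string keys such as the floats in `D` do not survive the round trip unchanged.

```python
import json

record = {"name": "pi", "digits": [3, 1, 4]}
text = json.dumps(record)      # dictionary -> JSON string
restored = json.loads(text)    # JSON string -> dictionary
print(text)
print(restored == record)

# Float keys are converted to strings when encoding to JSON.
D = {3.14: "pi", 2.71: "e"}
round_trip = json.loads(json.dumps(D))
print(round_trip)
```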


In [4]:
D[3.14]

'pi'

In [5]:
## Exercise:

##  Try to find the value in D for a key that doesn't exist (eg 5).
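
One way to explore this exercise, sketched under the assumption that `D` is defined as above: indexing with a missing key raises a `KeyError`, while the `get` method and the `in` operator let you check without an exception.

```python
D = {3.14: "pi", 2.71: "e"}

try:
    D[5]
except KeyError as err:
    # Indexing with a key that is not present raises KeyError
    print("KeyError:", err)

missing = D.get(5)               # returns None instead of raising
fallback = D.get(5, "unknown")   # returns the supplied default
print(missing, fallback, 5 in D)
```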

In [6]:
D.keys()

dict_keys([3.14, 2.71])

In [7]:
D.items()

dict_items([(3.14, 'pi'), (2.71, 'e')])

In [8]:
D.values()

dict_values(['pi', 'e'])

In [9]:
#By default iteration is over keys
for d in D:
    print(d)

3.14
2.71


In [58]:
# But you can also iterate over key,value pairs, aka items.
for e,d in D.items():
    print(f"{e} is the key for the value {d}")

3.14 is the key for the value pi
2.71 is the key for the value e


In [59]:
# Exercise

##  Create the inverse dictionary E of D.
## That is, where D[key] = value, have E[value] = key.
E = dict()
for key,val in D.items():
    E[val] = key
E    


{'pi': 3.14, 'e': 2.71}

You will often see dictionaries defined in the following way:
    

In [12]:
F = {"hey":"nonny", "hi":"ninny", "ho":"monny"}
F

{'hey': 'nonny', 'hi': 'ninny', 'ho': 'monny'}

One common use of a dictionary in data applications is to specify a pandas dataframe.

In [13]:
import pandas as pd

data = {'a':[1,2,3],'b':[4,5,6],'c':[7,8,9]}
df = pd.DataFrame(data)
df

   a  b  c
0  1  4  7
1  2  5  8
2  3  6  9


#### Sets 

A set is an unordered collection of distinct elements.  It supports the usual set-theoretic operations (union, intersection, etc.).



In [8]:
L = [1,2,3,3,3,3]
S = set(L)
S

{1, 2, 3}

In [11]:
T = {2,3,4,5}
S.intersection(T)

{2, 3}
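
The other set operations work the same way, and each also has an operator form. A short sketch using the same `S` and `T`:

```python
S = {1, 2, 3}
T = {2, 3, 4, 5}

union = S | T          # same as S.union(T)
common = S & T         # same as S.intersection(T)
only_in_S = S - T      # same as S.difference(T)
in_one_only = S ^ T    # same as S.symmetric_difference(T)

print(union, common, only_in_S, in_one_only)
print({2, 3} <= T)     # subset test, same as {2, 3}.issubset(T)
```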

In [17]:
s = """
You may sometimes want to find the number of distinct words in a body of text.
Even in this short bit of text many words occur multiple times.
The set datatype is very useful for text processing.
As an exercise, let us produce a list of the distinct words occurring in this text.
"""
s

'\nYou may sometimes want to find the number of distinct words in a body of text.\nEven in this short bit of text many words occur multiple times.\nThe set datatype is very useful for text processing.\nAs an exercise, let us produce a list of the distinct words occurring in this text.\n'

In [18]:
s = s.lower()  #  first convert all words to lowercase.
s = s.replace(".","")  # delete all periods
s = s.replace("\n","")  # delete newlines (note: this fuses the words on either side of each line break)
s

'you may sometimes want to find the number of distinct words in a body of texteven in this short bit of text many words occur multiple timesthe set datatype is very useful for text processingas an exercise, let us produce a list of the distinct words occurring in this text'

In [21]:
## now we use split() to convert s into a list of words

L = s.split()
L

['you',
 'may',
 'sometimes',
 'want',
 'to',
 'find',
 'the',
 'number',
 'of',
 'distinct',
 'words',
 'in',
 'a',
 'body',
 'of',
 'texteven',
 'in',
 'this',
 'short',
 'bit',
 'of',
 'text',
 'many',
 'words',
 'occur',
 'multiple',
 'timesthe',
 'set',
 'datatype',
 'is',
 'very',
 'useful',
 'for',
 'text',
 'processingas',
 'an',
 'exercise,',
 'let',
 'us',
 'produce',
 'a',
 'list',
 'of',
 'the',
 'distinct',
 'words',
 'occurring',
 'in',
 'this',
 'text']

In [27]:
print("The number of words occurring in L is {}".format(len(L)))
S = set(L)
print("The number of distinct words occurring in L is {}".format(len(S)))
counts = {w:L.count(w) for w in S}
counts

The number of words occurring in L is 50
The number of distinct words occurring in L is 37


{'sometimes': 1,
 'bit': 1,
 'exercise,': 1,
 'is': 1,
 'useful': 1,
 'set': 1,
 'us': 1,
 'an': 1,
 'multiple': 1,
 'number': 1,
 'occurring': 1,
 'short': 1,
 'very': 1,
 'processingas': 1,
 'words': 3,
 'to': 1,
 'in': 3,
 'many': 1,
 'a': 2,
 'find': 1,
 'the': 2,
 'body': 1,
 'of': 4,
 'this': 2,
 'you': 1,
 'may': 1,
 'for': 1,
 'occur': 1,
 'list': 1,
 'timesthe': 1,
 'produce': 1,
 'want': 1,
 'texteven': 1,
 'text': 3,
 'distinct': 2,
 'let': 1,
 'datatype': 1}
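
Notice that the output above contains fused words such as 'texteven' and 'timesthe', and that 'exercise,' still carries its comma. This happens because the newlines were deleted rather than treated as separators. The following is one possible alternative, not the only way to do it: `split()` with no arguments already splits on any whitespace, `str.strip` removes punctuation from the ends of each word, and `collections.Counter` does the counting.

```python
import string
from collections import Counter

s = """
You may sometimes want to find the number of distinct words in a body of text.
Even in this short bit of text many words occur multiple times.
The set datatype is very useful for text processing.
As an exercise, let us produce a list of the distinct words occurring in this text.
"""

# Lowercase, split on whitespace (including newlines), then trim punctuation
words = [w.strip(string.punctuation) for w in s.lower().split()]
counts = Counter(words)

print("The number of words is {}".format(len(words)))
print("The number of distinct words is {}".format(len(counts)))
print(counts.most_common(5))
```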

#### Functions

Here we give examples of how to define functions in python.  Notice that indentation is crucial.

The basic syntax for defining a function is

```
def func_name(parameter1, parameter2):
    # code block
    # optional return statements
```

Below we implement selection sort as an example.

In [34]:
def selection_sort(L):
    _L = L[:]  ## A shallow local copy, so the caller's list is not modified
    Lsorted = []
    while len(_L)!=0:
        # pop the least element of _L and append it to Lsorted
        Lsorted.append(_L.pop(_L.index(min(_L))))
    return Lsorted    

## Because min() works on any elements that can be compared with <, the same function can sort words or numbers

print("Using the function to sort alphabetically:")
L = "what a great dog you have".split()
print(selection_sort(L))

print("\n\nUsing the exact same function to sort numerically:")
import random
L = [random.randint(0,10) for i in range(13)]
print(selection_sort(L))

Using the function to sort alphabetically:
['a', 'dog', 'great', 'have', 'what', 'you']


Using the exact same function to sort numerically:
[0, 1, 1, 1, 3, 4, 6, 7, 7, 7, 8, 8, 9]


You can pass functions to functions.

In [37]:
def sort_wrapper(L, sort_method):
    return sort_method(L)
L = "borrowed books must be promptly returned".split()
sort_wrapper(L,selection_sort)

['be', 'books', 'borrowed', 'must', 'promptly', 'returned']
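
Built-in functions and anonymous `lambda` functions can be passed in the same way. A small sketch reusing `sort_wrapper` from above, with the built-in `sorted` and its `key` parameter:

```python
def sort_wrapper(L, sort_method):
    return sort_method(L)

words = "borrowed books must be promptly returned".split()

# Pass the built-in sorted function directly
alphabetical = sort_wrapper(words, sorted)

# Pass a lambda that sorts by word length instead (ties keep their original order)
by_length = sort_wrapper(words, lambda L: sorted(L, key=len))

print(alphabetical)
print(by_length)
```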

As in C++, python functions can have optional parameters with default values.  Unlike C++, python also lets you pass arguments by keyword.

In [38]:
def bab_sqrt(S,iterations=3,initial_value=2):
    """Approximate the square root of S using the Babylonian method (Newton's method)"""
    x = initial_value
    for i in range(iterations):
        x = 1/2*(x + S/x)
    return x

a = bab_sqrt(40)
    
a, a**2

(6.392010163749294, 40.85779393347428)

In [39]:
a = bab_sqrt(40,6,2)
a

6.324555320336758

In [40]:
# If you don't remember the order of the parameters, you can pass them by keyword:

a = bab_sqrt(55, initial_value = 10, iterations = 100)
a, a**2

(7.416198487095663, 55.0)
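
Rather than guessing a fixed number of iterations, you can stop once successive estimates agree. This is a sketch of that idea, not part of the original function: the name `bab_sqrt_tol` and the parameters `tol` and `max_iterations` are assumptions introduced here, and the code assumes `S` and `initial_value` are positive.

```python
import math

def bab_sqrt_tol(S, initial_value=2.0, tol=1e-12, max_iterations=100):
    """Babylonian method that stops when the relative change drops below tol."""
    x = initial_value
    for _ in range(max_iterations):
        x_new = 0.5 * (x + S / x)
        if abs(x_new - x) <= tol * abs(x_new):
            return x_new
        x = x_new
    return x  # best estimate if the tolerance was never reached

approx = bab_sqrt_tol(40)
print(approx, math.sqrt(40))
```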

In [41]:
# In Jupyter, appending ? to a name displays its docstring; help(bab_sqrt) does the same in plain python
bab_sqrt?

In [51]:

# Exercise:
## implement GCD as a function
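
If you want to check your answer, here is one possible solution sketch using Euclid's algorithm. The standard library already provides `math.gcd`, which is used below only for comparison.

```python
import math

def gcd(a, b):
    """Greatest common divisor of two integers via Euclid's algorithm."""
    a, b = abs(a), abs(b)
    while b != 0:
        # Replace (a, b) by (b, a mod b) until the remainder is zero
        a, b = b, a % b
    return a

print(gcd(48, 18), math.gcd(48, 18))
```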

In [1]:
#Exercise:

## implement the dot product of two lists as a function.
## make sure they are the same length!
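
Again as one possible solution sketch (the function name `dot` is a choice made here): the length check raises a `ValueError`, and `zip` pairs up the corresponding entries.

```python
def dot(u, v):
    """Dot product of two equal-length lists of numbers."""
    if len(u) != len(v):
        raise ValueError("lists must have the same length")
    return sum(a * b for a, b in zip(u, v))

print(dot([1, 2, 3], [4, 5, 6]))

try:
    dot([1, 2], [1, 2, 3])
except ValueError as err:
    print("Error:", err)
```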

### FileIO

We will mostly be using the very convenient "pandas" module to load and export datasets.

However we should cover at least the basics of FileIO in python. 

Below are examples of how to write and read simple text files.  

In real applications you would also add code to handle failures, such as a missing file or insufficient permissions.

I'll be ignoring all of that in these simple examples.



In [48]:
## Writing files

letter = "Dear John Jay,\n You were great on the first Supreme Court of the USA."

filename = "Dear_John"

file_pointer = open(filename,"w")  #open a file for writing.  The "a" mode would append and not clobber.

file_pointer.write(letter)  # write the string

file_pointer.close()   # close the file


### Shell commands from Jupyter

This is a convenient place to mention that you can use shell commands from Jupyter by adding a ! before the command.

For instance the "cat" command dumps the contents of a file to the terminal.  

The "ls" command lists the contents of a directory.

Below I will use cat and ls to check whether the code in the above cell worked.  

In [45]:
!ls  ## The file Dear_John appears

Dear_John  python.ipynb  s.term


In [46]:
!cat Dear_John  ## The contents of the file

Dear John Jay,
 You were great on the first Supreme Court of the USA.

In [54]:
!cat {filename}  ## This is how a python variable can interact with the shell in Jupyter

Dear John Jay,
 You were great on the first Supreme Court of the USA.

In [56]:
files = !ls   ## You can also assign the result of a shell command to a python variable
files

['Dear_John', 'python.ipynb', 's.term']

In [57]:
current_dir = !echo $PWD
current_dir

['/home/hunter/Documents/ML_course/ML_Spr_20/Slides/Intro2Python2']

In [61]:
### Reading files
## Reading files is just as easy as writing to files.

fp = open(filename,'r') #open in read mode
content = fp.read()
fp.close()
content

'Dear John Jay,\n You were great on the first Supreme Court of the USA.'
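
The examples above close the file by hand and skip error handling. A common pattern, sketched here, is a `with` block, which closes the file automatically even if an error occurs, combined with a `try`/`except` for a missing file. The temporary directory and the file name `no_such_file` are used only to keep this sketch self-contained.

```python
import os
import tempfile

letter = "Dear John Jay,\n You were great on the first Supreme Court of the USA."

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Dear_John")

    # The file is closed automatically when the with block ends
    with open(path, "w") as f:
        f.write(letter)

    with open(path, "r") as f:
        content = f.read()

    # Opening a file that does not exist for reading raises FileNotFoundError
    try:
        with open(os.path.join(tmp_dir, "no_such_file"), "r") as f:
            f.read()
    except FileNotFoundError as err:
        print("Could not open file:", err)

print(content == letter)
```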