# Chapter Fourteen Workshop and Exercises
## Exercise 14.1

The Downey text provides the following function that recursively lists files in the directory passed as a string argument. Note that it thus lists the contents of all subdirectories. Try the code and ensure that you understand it.

In [7]:
import os

def walk(dirname):
    for name in os.listdir(dirname):
        path = os.path.join(dirname, name)
        if os.path.isfile(path):
            print(path)
        else:
            walk(path)
        
walk("Enter your own directory name")

C:\Users\admin\JNotebooks
C:\Users\admin\JNotebooks\.ipynb_checkpoints--DIR--
C:\Users\admin\JNotebooks\bmi.ipynb
C:\Users\admin\JNotebooks\Chapter Five Exercise.ipynb
C:\Users\admin\JNotebooks\ChapterEightExercises.ipynb
C:\Users\admin\JNotebooks\ChapterEightSolutions.ipynb
C:\Users\admin\JNotebooks\ChapterElevenExercises.ipynb
C:\Users\admin\JNotebooks\ChapterElevenSolutions.ipynb
C:\Users\admin\JNotebooks\ChapterFifteenExercises.ipynb
C:\Users\admin\JNotebooks\ChapterFifteenSolutions.ipynb
C:\Users\admin\JNotebooks\ChapterFiveExercises.ipynb
C:\Users\admin\JNotebooks\ChapterFiveSolutions.ipynb
C:\Users\admin\JNotebooks\ChapterFourteenExercises.ipynb
C:\Users\admin\JNotebooks\ChapterFourteenWorkshopAndExercises.ipynb
C:\Users\admin\JNotebooks\ChapterOneExercises.ipynb
C:\Users\admin\JNotebooks\ChapterOneSolutions.ipynb
C:\Users\admin\JNotebooks\ChapterSevenExercises.ipynb
C:\Users\admin\JNotebooks\ChapterSevenSolutions.ipynb
C:\Users\admin\JNotebooks\ChapterSixExamples.ipynb
C:\Users

Copy and modify the above code to provide a function **listext** which takes a full path directory name and a file extension. It should recursively traverse all directories listing files with the specified extension. Provide a suitable docstring.

In [None]:
# Exercise 14.1


## Exceptions

Exceptions are a feature of most contemporary programming languages. Exceptions are **objects** that are **raised** when an error occurs. When an error is likely (or at least possible), we can protect the code concerned in a way that allows us to recover from the error rather than let the program fail with an unhelpful message. To protect a block of code we use the **try:** statement (terminated by : because a block of code follows). After the code block, an **except:** clause introduces a block that is executed if an exception is raised. Following this there is *optionally* a **finally:** clause, which is executed whether or not an error occurred. The finally clause is often used to close open files.

Run the following example:

In [12]:
x = int(input("Enter an Integer Divisor "))
y = (22/7)/x
print(y)

Enter an Integer Divisor 0


ZeroDivisionError: float division by zero

If you enter the value 0 as the requested integer divisor, the program stops at the second statement reporting a **ZeroDivisionError**. Now run the following example:

In [13]:
try:
    x = int(input("Enter an Integer Divisor "))
    y = (22/7)/x
except:
    print("We have a problem")
    y=0
    
print(y)    

Enter an Integer Divisor 0
We have a problem
0


This time the code executes until the end as we have 'caught' the error and ensured that y has a value (which might or might not be a good thing!). The program does not fail and we could attempt to do something to correct the error.
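The examples above use a bare **except:** and do not show the optional **finally:** clause. As a minimal sketch (the function name `safe_divide` is a hypothetical helper, not taken from the Downey text), the following catches only the specific exception we expect and uses **finally:** for code that must run in either case:

```python
def safe_divide(numerator, divisor):
    """Return numerator / divisor, or None if the divisor is zero.

    safe_divide is a hypothetical helper used only for illustration.
    """
    try:
        result = numerator / divisor
    except ZeroDivisionError as err:
        # Only division by zero is caught here; other errors still propagate.
        print("We have a problem:", err)
        result = None
    finally:
        # This block runs whether or not an exception was raised.
        print("Division attempted")
    return result


print(safe_divide(22 / 7, 2))
print(safe_divide(22 / 7, 0))
```

Naming the exception type means that unrelated mistakes, such as a misspelled variable name, are not silently hidden by the handler.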

### Exceptions and File Handling

Programs are especially likely to go wrong when they access external files. The file may not contain the data that was expected, the file could be corrupt, the device may be missing or faulty, and so on. When accessing files it is thus common to place the relevant code in a **try** protected block and, at a minimum, inform the user of the error in the **except** clause.
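A protected file read might be sketched as follows. The helper name `read_text` and the file name are assumptions made for illustration; the file is deliberately one that should not exist, so that the **except** clause runs:

```python
def read_text(filename):
    """Return the contents of filename, or None if it cannot be read.

    read_text is a hypothetical helper used only for illustration.
    """
    try:
        with open(filename) as f:
            return f.read()
    except OSError as err:
        # OSError covers missing files, permission problems and device errors.
        print("Could not read", filename, "-", err)
        return None


contents = read_text("no_such_file_for_this_demo.txt")
print(contents)
```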

## JSON

Pickle is mentioned in the Downey text. Since the text was written (2014) a data representation language known as JSON (JavaScript Object Notation) has been adopted by many as a means of storing data structures to files and of transmitting data across networks. It is supported by many different languages - a big advantage over the pickle method which is Python specific. It is therefore important that you are aware of it and the support provided for it by Python. 

First, take a look at this simple JSON example:

{
    "Currency":"EUR",
    "Super":"Euro",
    "Sub":"Cent"
}

Now that you are familiar with Python, you will recognise that this is also a Python Dictionary! Lists are also essentially identical. These similarities in format have made it straightforward to include support for JSON storage within Python through the **json** module.
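To make the correspondence concrete, here is a small sketch that parses the JSON text above with `json.loads` (which reads from a string rather than a file). It also shows where the two notations differ, so "essentially identical" should be taken with some care:

```python
import json

# The JSON example from above, held as ordinary text.
text = '{"Currency": "EUR", "Super": "Euro", "Sub": "Cent"}'

record = json.loads(text)   # parse JSON text into Python objects
print(type(record))
print(record["Super"])

# JSON is stricter than Python: strings need double quotes, and the
# constants are true/false/null rather than True/False/None.
constants = json.loads('[true, false, null]')
print(constants)
```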

JSON provides structured, nested storage, and the **json** module provides the functions necessary to save and load JSON data. Consider this reasonably complex data structure:

```
cur = {
 "EUR":{"Name":"Euro","Super":"Euro","Sub":"Cent","coins":[200,100,50,20,10,5,2,1]},
 "GBP":{"Name":"Pound Sterling","Super":"Pound","Sub":"Penny","coins":[200,100,50,20,10,5,2,1]},
 "USD":{"Name":"United States Dollar","Super":"Dollar","Sub":"Cent","coins":[100,50,25,10,5,1]},
 "AUD":{"Name":"Australian Dollar","Super":"Dollar","Sub":"Cent","coins":[200,100,50,20,10,5,1]}, 
 "JPY":{"Name":"Japanese Yen","Super":"Yen","Sub":"Yen","coins":[500,100,50,10,5,1]}       
}
```
This is a dictionary in which each value is itself a dictionary. The inner dictionaries map keys to strings or, in the case of "coins", to a list.

Let's examine how we might read and write such a data structure using the JSON format:

In [15]:
import json

cur = {
 "EUR":{"Name":"Euro","Super":"Euro","Sub":"Cent","coins":[200,100,50,20,10,5,2,1]},
 "GBP":{"Name":"Pound Sterling","Super":"Pound","Sub":"Penny","coins":[200,100,50,20,10,5,2,1]},
 "USD":{"Name":"United States Dollar","Super":"Dollar","Sub":"Cent","coins":[100,50,25,10,5,1]},
 "AUD":{"Name":"Australian Dollar","Super":"Dollar","Sub":"Cent","coins":[200,100,50,20,10,5,1]}, 
 "JPY":{"Name":"Japanese Yen","Super":"Yen","Sub":"Yen","coins":[500,100,50,10,5,1]}       
}

with open('currencies.json', 'w') as f:
    json.dump(cur, f)




Note that the use of the **with..as** statement when opening a file ensures that it is automatically closed when the subsequent statement block ends.
Run the above code. If you check your Jupyter working directory you should see the file *currencies.json*. If you open and view the file in a text editor, you will see the *cur* data structure in JSON format. Close the text editor. We will now attempt to read the file back into a different variable:

In [17]:
import json

f = open('currencies.json')
newcur = json.load(f)
f.close()
print(newcur['EUR']['coins'])

[200, 100, 50, 20, 10, 5, 2, 1]
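One caveat when reading data back: JSON object keys are always strings. The sketch below uses a small made-up dictionary with integer keys (not part of the currency data) and `json.dumps`/`json.loads`, which work on strings rather than files, to show that the round trip does not return an identical structure in that case:

```python
import json

# A small hypothetical structure with integer keys, for illustration only.
coins = {200: "two euro", 100: "one euro"}

text = json.dumps(coins, indent=4)   # indent makes the output easier to read
print(text)

restored = json.loads(text)
print(restored)
print(restored == coins)   # the integer keys have come back as strings
```

The *cur* dictionary above is unaffected because all of its keys are already strings.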


## Exercise 14.2

In the resources folder for the module you will find a file **txtsample.txt** containing a simple text narrative from a children's book. Download and save the file to either your Jupyter document directory or somewhere you can access it from Spyder. Write a program to read the file, keeping a tally of the frequency with which each unique word occurs (as a dictionary). The file access should be protected in a **try:** block and any exception should lead to an explanatory message to the user.

Write the dictionary to a file **samplewords.json** using the JSON format.

In order to solve this problem you may need some extra input from me:

### The **sorted** Function and Lambda Functions

Dictionaries are not sorted. They use a **hash function**, which allows a key to be found in constant time (otherwise we would have to traverse the keys sequentially, which would be very time consuming). In this case, once we have constructed the word frequency dictionary, we can sort it once. We need to sort it by value, which is a little trickier than sorting by key.

Python has a function **sorted**. Here is its syntax:

```sorted(iterable, key=key, reverse=reverse)```

- iterable is anything that can be iterated over, such as strings, lists and dictionaries
- key is a function that is called on each element and returns the value to sort on
- reverse is a boolean: False sorts ascending, True sorts descending

Sorted iterates over the iterable using a for..in loop. Unfortunately, when you iterate over a dictionary, only the key of each key-value pair is returned on each iteration. This is illustrated below: 



In [None]:
d = {"A":88,"B":22,"C":66,"D":32,"E":95}
print(d)
print(sorted(d))

As you can see, there is a further problem: the result is a list of the dictionary's keys, sorted by key. We could reverse the order thus:

In [None]:
d = {"A":88,"B":22,"C":66,"D":32,"E":95}
print(d)
print(sorted(d, reverse=True))

To sort on values, we have to specify the value to sort on using a function, which will automatically be passed the value returned on each iteration (the key of the dictionary):

In [None]:
def getval(x):
    global d
    return d[x]

d = {"A":88,"B":22,"C":66,"D":32,"E":95}
print(d)
print(sorted(d, key = getval))

Placing a print statement within the getval function may make this easier to understand:

In [None]:
def getval(x):
    global d
    print(x)
    return d[x]

d = {"A":88,"B":22,"C":66,"D":32,"E":95}
print(d)
print(sorted(d, key = getval))

So, now we have the keys sorted in order of the values. There is an improvement we could make, though. Python supports lambda functions, which are anonymous and can be defined inline. Given the value x returned on each iteration of sorted, we can define a function of x using a simple notation:

In [None]:
d = {"A":88,"B":22,"C":66,"D":32,"E":95}
print(d)
print(sorted(d, key = lambda x:d[x]))

The expression lambda x:d[x] simply means that for each value of x (the dictionary key) we will return the corresponding value d[x] to be used as the sort value. Lambda functions are very useful for short, single-use functions like this. Note that sorted returns a list.

There is one further improvement that we could make. The dictionary class has a method **items** that returns the key-value pairs as tuples (in Python 3 this is a view object rather than a list, but it can be iterated over in the same way). Sorting these pairs avoids a dictionary lookup for each key and gives us both keys and values in the return value of sorted:

In [None]:
d = {"A":88,"B":22,"C":66,"D":32,"E":95}
print(d)
print(sorted(d.items(), key = lambda x:x[1]))

The value available to the function assigned to key is now a tuple containing the key and value of each item in the dictionary. Thus for each x (tuple) we return the value (x[1]).
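As an alternative sketch, the standard library's `operator.itemgetter` can stand in for the lambda, and the `reverse` parameter described earlier can be combined with `key`. The variable names `by_value` and `ordered` are assumptions made for this example:

```python
from operator import itemgetter

d = {"A": 88, "B": 22, "C": 66, "D": 32, "E": 95}

# itemgetter(1) behaves like lambda x: x[1] for this purpose.
by_value = sorted(d.items(), key=itemgetter(1), reverse=True)
print(by_value)

# The sorted list of pairs can be turned back into a dictionary;
# dictionaries keep insertion order in Python 3.7 and later.
ordered = dict(by_value)
print(list(ordered))
```

Whether to use a lambda or itemgetter is largely a matter of taste; both give the same ordering here.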

### Text Files

Not all text files are created equal! Many different encodings are used to represent the characters stored in files. Python source code is UTF-8 by **default**, but the encoding that open() uses when none is given depends on the platform and Python version, and on Windows it is often not UTF-8. The text file **sample.txt** was prepared on a Central European Windows 10 machine and hence has **cp1250** encoding. The encoding can be specified when you open a file. Assuming that **sample.txt** is in the current working directory, this is how you might open it:

file = open("sample.txt",'r',encoding ='cp1250')
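The effect of the encoding argument can be seen in a small self-contained sketch. It does not use **sample.txt**: it writes its own temporary file (the file name and the sample text are assumptions made for illustration) in cp1250 and then reads it back, first with the matching encoding and then with the wrong one:

```python
import os
import tempfile

text = "Příliš žluťoučký kůň"   # sample text with Central European characters
path = os.path.join(tempfile.mkdtemp(), "encoding_demo.txt")  # hypothetical file

with open(path, "w", encoding="cp1250") as f:
    f.write(text)

# Reading with the same encoding recovers the original text.
with open(path, "r", encoding="cp1250") as f:
    round_trip = f.read()
print(round_trip == text)

# Reading the same bytes as UTF-8 fails, because they are not valid UTF-8.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    wrong_encoding_failed = False
except UnicodeDecodeError as err:
    wrong_encoding_failed = True
    print("Wrong encoding:", err)
```

A wrong encoding does not always raise an exception; with some combinations the file is read without error but the accented characters come out wrong.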

You should now be in a position to solve the problem!

In [None]:
# Exercise 14.2


## Exercise 14.3

Write a program that reads the file **samplewords.json** written in Exercise 14.2, ensuring that the access is protected in a **try:** block. If the read was successful, print the five most and five least frequently occurring words.

In [None]:
# Exercise 14.3
