# Introduction to Python - Day 06 (25 July 2017)
### Agenda for today:
+ Basic file handling
+ Basic exception handling
+ A useful module: os and os.path
+ Higher-order functions, lambda functions, sorting complex objects

# Data Persistence

+ Files
    + <font color='blue'>**\*.txt**</font>, \*.xml, \*.json
    + \*.csv, \*.tab, \*.xlsx (covered later, with pandas)
+ Databases (not covered in this course)

# Built-in *<font color='blue'>file</font>* object

+ Basic format:
```python
fh = open('<filename>', '<mode>')   # Creates a file object fh
```
+ *filename* can be _**absolute**_ or _**relative**_
+ *mode*: {'r', 'w', 'a'}; Default='r'
+ 'r': open file for reading if exists, else **<font color='blue'>FileNotFoundError</font>**
+ 'w': open new file for writing; overwrite if exists
+ 'a': open file for appending; new data is added at the end of an existing file instead of overwriting it

```python
fh = open('data/data.txt')
type(fh)
dir(fh)
```

+ If the file does not exist, open will raise a **FileNotFoundError** with a traceback

**Notes**:
+ *fh* is not the file itself, but a handle/reference to it. Use it to do desired operations (read/write).
<br />
![alt text](filehandle.svg)
<br />
+ <font color='blue'>Some additional mode options: 'rb', 'wb' for reading and writing binary files; '+' to open the file for both reading and writing.</font>
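
The modes above can be compared side by side with a small sketch. It assumes a hypothetical scratch file `modes_demo.txt` created in a temporary directory, so no course data is touched; the variable names are illustrative only.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the course data)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'modes_demo.txt')

fh = open(path, 'w')          # 'w' creates the file (or empties an existing one)
fh.write('first\n')
fh.close()

fh = open(path, 'a')          # 'a' keeps the existing content and writes at the end
fh.write('second\n')
fh.close()

fh = open(path, 'r')          # text mode: read() returns a str
text = fh.read()
fh.close()

fh = open(path, 'rb')         # binary mode: read() returns bytes
raw = fh.read()
fh.close()

fh = open(path, 'w')          # reopening with 'w' discards the old content
fh.close()
size_after_w = os.path.getsize(path)

print(repr(text))
print(type(raw))
print(size_after_w)
```

Reopening with 'w' leaves an empty file even though nothing was written, which is why 'a' is the safer choice when existing data must be kept.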

In [2]:
fh = open('data/data.txt')
type(fh)
dir(fh)

['_CHUNK_SIZE',
 '__class__',
 '__del__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__enter__',
 '__eq__',
 '__exit__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__next__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '_checkClosed',
 '_checkReadable',
 '_checkSeekable',
 '_checkWritable',
 '_finalizing',
 'buffer',
 'close',
 'closed',
 'detach',
 'encoding',
 'errors',
 'fileno',
 'flush',
 'isatty',
 'line_buffering',
 'mode',
 'name',
 'newlines',
 'read',
 'readable',
 'readline',
 'readlines',
 'seek',
 'seekable',
 'tell',
 'truncate',
 'writable',
 'write',
 'writelines']

# Reading from Files in "text mode"
+ File content is always read in as strings  
<br />
+ Here are the **most common approaches**:

    - **Read all data at once as a string**

    ```python
    import pprint
    fh = open('data/data.txt', 'r')
    data = fh.read()              # to read in all data as one big string
    print(type(data), '\n\n')
    pprint.pprint(data)

    pprint.pprint(data.split('\n'))   # split the big string on new line character (\n)
    ```

In [5]:
import pprint
fh = open('data/data.txt', 'r')
data = fh.read()              # to read in all data as one big string
print(type(data), '\n\n')
pprint.pprint(data)

pprint.pprint(data.split('\n')) 

<class 'str'> 


('Writing programs or programming is a very creative\n'
 'and rewarding activity  You can write programs for\n'
 'many reasons ranging from making your living to solving\n'
 'a difficult data analysis problem to having fun to helping\n'
 'someone else solve a problem  This book assumes that\n'
 '{\\em everyone} needs to know how to program and that once\n'
 'you know how to program, you will figure out what you want\n'
 'to do with your newfound skills\n'
 '\n'
 'We are surrounded in our daily lives with computers ranging\n'
 'from laptops to cell phones  We can think of these computers\n'
 'as our personal assistants who can take care of many things\n'
 'on our behalf  The hardware in our current-day computers\n'
 'is essentially built to continuously ask us the question\n'
 'What would you like me to do next\n'
 '\n'
 'Our computers are fast and have vasts amounts of memory and \n'
 'could be very helpful to us if we only knew the language to \n'
 'speak to explain t

+ **file pointers and *<font color='blue'>seek</font>* operation**

    ```python
    data_empty = fh.read()        # can't read more without resetting the read pointer
    print("length of empty_data is: ", len(data_empty))
    help(fh.seek)
    fh.seek(0, 0)          # reset the pointer to beginning of the file
    data_now = fh.read()
    ```
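
To see how the read pointer moves, here is a minimal sketch on a hypothetical ten-character scratch file (the file name and variable names are assumptions made for illustration). It uses `tell()` to report the current position and `read(n)` to read at most `n` characters.

```python
import os
import tempfile

# Hypothetical scratch file with known content
path = os.path.join(tempfile.mkdtemp(), 'seek_demo.txt')
fh = open(path, 'w')
fh.write('abcdefghij')
fh.close()

fh = open(path, 'r')
start = fh.tell()         # position right after opening the file
first = fh.read(3)        # reads at most 3 characters
rest = fh.read()          # continues from where the previous read stopped
at_end = fh.read()        # pointer is at the end of the file, nothing left to read
fh.seek(0)                # second argument (whence) defaults to 0, the start of the file
again = fh.read(3)        # same characters as the first read
fh.close()

print(start, repr(first), repr(rest), repr(at_end), repr(again))
```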

In [10]:
data_empty = fh.read()        # can't read more without resetting the read pointer
print("length of empty_data is: ", len(data_empty))
fh.seek(100, 0)          # move the pointer to offset 100 from the beginning of the file
data_now = fh.read()
print(data_now)

length of empty_data is:  0
r
many reasons ranging from making your living to solving
a difficult data analysis problem to having fun to helping
someone else solve a problem  This book assumes that
{\em everyone} needs to know how to program and that once
you know how to program, you will figure out what you want
to do with your newfound skills

We are surrounded in our daily lives with computers ranging
from laptops to cell phones  We can think of these computers
as our personal assistants who can take care of many things
on our behalf  The hardware in our current-day computers
is essentially built to continuously ask us the question
What would you like me to do next

Our computers are fast and have vasts amounts of memory and 
could be very helpful to us if we only knew the language to 
speak to explain to the computer what we would like it to 
do next If we knew this language we could tell the 
computer to do tasks on our behalf that were reptitive  
Interestingly, the kinds of thin

+ **Read individual lines as strings**
```python
fh.seek(0, 0)
data = fh.readlines()       # returns list of strings
print(type(data), "\n\n")   # check out the \n newline character at the end of lines
pprint.pprint(data)         # (in text mode, '\r\n' from Windows files is read as '\n')
```
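
The trailing newline characters are often unwanted. The sketch below compares three ways of getting lines without them, using a hypothetical scratch file; the file name and variable names are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file that ends with a newline, like most text files
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
fh = open(path, 'w')
fh.write('one\ntwo\n\nthree\n')
fh.close()

fh = open(path, 'r')
lines = fh.readlines()                    # every element keeps its trailing '\n'
fh.seek(0)
whole = fh.read()
fh.close()

stripped = [line.rstrip('\n') for line in lines]   # remove the newline from each line
via_splitlines = whole.splitlines()       # splits on line boundaries, drops the newlines
via_split = whole.split('\n')             # leaves an extra '' after the final newline

print(lines)
print(stripped)
print(via_splitlines)
print(via_split)
```

Note that `split('\n')` yields one extra empty string when the file ends with a newline, whereas `splitlines()` does not.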

In [11]:
fh.seek(0, 0)
data = fh.readlines()       # returns list of strings
print(type(data), "\n\n")   # check out the \n newline character at the end of lines
pprint.pprint(data)

<class 'list'> 


['Writing programs or programming is a very creative\n',
 'and rewarding activity  You can write programs for\n',
 'many reasons ranging from making your living to solving\n',
 'a difficult data analysis problem to having fun to helping\n',
 'someone else solve a problem  This book assumes that\n',
 '{\\em everyone} needs to know how to program and that once\n',
 'you know how to program, you will figure out what you want\n',
 'to do with your newfound skills\n',
 '\n',
 'We are surrounded in our daily lives with computers ranging\n',
 'from laptops to cell phones  We can think of these computers\n',
 'as our personal assistants who can take care of many things\n',
 'on our behalf  The hardware in our current-day computers\n',
 'is essentially built to continuously ask us the question\n',
 'What would you like me to do next\n',
 '\n',
 'Our computers are fast and have vasts amounts of memory and \n',
 'could be very helpful to us if we only knew the language to \n',
 

+ **Iterate over large files**
```python
fh.seek(0, 0)
for line in fh:           # 'fh' is iterable; use in iteration context for efficient
    print(len(line), line)  # reading of large files
fh.close()                # close file; good practice (esp. when writing files)
```
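
As a rough illustration of why iteration suits large files, the sketch below computes line and word counts while holding only one line in memory at a time. The scratch file and the counter names are hypothetical. Passing `end=''` to `print` avoids the blank line that otherwise appears after each line, because every line already ends with `'\n'`.

```python
import os
import tempfile

# Hypothetical scratch file standing in for a large data file
path = os.path.join(tempfile.mkdtemp(), 'iterate_demo.txt')
fh = open(path, 'w')
fh.write('alpha beta\ngamma\n\ndelta epsilon zeta\n')
fh.close()

n_lines = 0
n_words = 0
fh = open(path)
for line in fh:                      # only the current line is held in memory
    n_lines += 1
    n_words += len(line.split())     # split() with no argument splits on any whitespace
    print(len(line), line, end='')   # line already ends with '\n'; suppress print's own
fh.close()

print(n_lines, n_words)
```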

In [15]:
fh = open('data/data.txt')
fh.seek(0, 0)
for line in fh:           # 'fh' is iterable; use in iteration context for efficient
  print(len(line), line)  # reading of large files
fh.close()  

51 Writing programs or programming is a very creative

51 and rewarding activity  You can write programs for

56 many reasons ranging from making your living to solving

59 a difficult data analysis problem to having fun to helping

53 someone else solve a problem  This book assumes that

58 {\em everyone} needs to know how to program and that once

59 you know how to program, you will figure out what you want

32 to do with your newfound skills

1 

60 We are surrounded in our daily lives with computers ranging

61 from laptops to cell phones  We can think of these computers

60 as our personal assistants who can take care of many things

57 on our behalf  The hardware in our current-day computers

57 is essentially built to continuously ask us the question

34 What would you like me to do next

1 

61 Our computers are fast and have vasts amounts of memory and 

61 could be very helpful to us if we only knew the language to 

59 speak to explain to the computer what we would like it 

+ **Context manager**
```python
with open('data/data.txt', 'r') as fh:  # Context-manager; automatically closes the file
    for line in fh:
        print(line)
print('\nfh status: ', fh.closed)
```
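
Roughly speaking, the `with` block behaves like a `try`/`finally` pair that always closes the file (exception handling is introduced in the next section). The sketch below shows both forms on a hypothetical scratch file; the variable names are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')
with open(path, 'w') as out:
    out.write('line 1\nline 2\n')

# Manual version: close the file even if reading fails
fh = open(path, 'r')
try:
    manual_first = fh.readline()
finally:
    fh.close()
manual_closed = fh.closed

# Context-manager version: the file is closed when the block ends
with open(path, 'r') as fh:
    with_first = fh.readline()
with_closed = fh.closed

# A closed handle can no longer be read from
try:
    fh.read()
except ValueError:
    read_after_close_failed = True
else:
    read_after_close_failed = False

print(manual_closed, with_closed, read_after_close_failed)
```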

In [17]:
with open('data/data.txt', 'r') as fh:  # Context-manager; automatically closes the file
  for line in fh:
      pass
print('\nfh status: ', fh.closed)


fh status:  True


## Very Brief Introduction to Exception Handling
+ In programs, things don't always go as planned (KeyError, IndexError, FileNotFoundError, ZeroDivisionError, etc.)
+ Exceptions are events that are triggered on errors (or raised manually, for signalling)
+ Python's exception machinery allows jumping out of an arbitrary code block when an exception occurs
+ Errors need not always terminate the script execution
    - they can be handled appropriately
    - the program can recover gracefully, if possible
        * maybe we want to ignore errors
        * maybe we want to handle errors in a specific way (e.g., using a default response, or closing all resources such as files and database connections before we exit)
        * maybe we want to give the user an informative message and carry on
        
```python
while True:
    x = input('Enter a number')
    print(int(x)*2)
```
<br />
```python
while True:
    x = input('Enter a number')
    try:    
        print(int(x)*2)
    except ValueError:
        print('Please enter a valid number')
```
<br />
```python
import pprint
try:
    fh = open('wrong_filename.txt', 'r')
    data = fh.read()              # to read in all data as one big string
    print(type(data), '\n\n')
    pprint.pprint(data)
except FileNotFoundError:
    print('Please enter a valid filename')
```
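
The "default response" and "close all resources" ideas from the list above can be combined in one sketch. The helper `read_or_default` is hypothetical (it is not part of the course material or of any library), and the missing file path is constructed so that it is guaranteed not to exist.

```python
import os
import tempfile

def read_or_default(filename, default=''):
    """Hypothetical helper: return the file content, or `default` if the file is missing."""
    try:
        fh = open(filename, 'r')
    except FileNotFoundError as err:
        # Informative message, then carry on with a default response
        print('Could not open file:', err.filename)
        return default
    else:
        # Runs only if open() succeeded
        try:
            return fh.read()
        finally:
            fh.close()            # always release the file, even if read() fails

tmp_dir = tempfile.mkdtemp()
missing = os.path.join(tmp_dir, 'missing.txt')     # does not exist
present = os.path.join(tmp_dir, 'present.txt')
with open(present, 'w') as out:
    out.write('hello\n')

from_missing = read_or_default(missing, default='N/A')
from_present = read_or_default(present)
print(repr(from_missing), repr(from_present))
```

Catching the specific `FileNotFoundError` rather than a blanket `Exception` keeps unrelated bugs (a typo in a variable name, for example) from being silently reported as a missing file.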

In [20]:
import pprint
try:
    fh = open('wrong_filename.txt', 'r')
    data = fh.read()              # to read in all data as one big string
    print(type(data), '\n\n')
    pprint.pprint(data)
except FileNotFoundError:
    print('Please enter a valid filename')

Please enter a valid filename


# Writing to Files in "text mode"
+ Like reading, writing is also done as strings
<br />
+ Here are the **most common approaches**:

```python
fh = open('data/fresh.txt', 'w')           # Open a new file in write mode
fh.write('This is the 1st line\n')    # Write a line; 
                                      # Note that newline chars must be explicitly added
fh.close()
```



In [21]:
fh = open('data/fresh.txt', 'w')           # Open a new file in write mode
fh.write('This is the 1st line\n')    # Write a line; 
                                      # Note that newline chars must be explicitly added
fh.close()

+ **Flushing buffers**

```python
fh = open('data/fresh.txt', 'a')           # Open the earlier file in 'append' mode
                                      #    to avoid overwriting
fh.write('This is the 2nd line\n')    # Write a line; 
                                      # Note that newline chars must be explicitly added
fh.flush()                            # Flush the buffer: push pending writes out to the file
```

In [22]:
fh = open('data/fresh.txt', 'a')
fh.write('This is the 2nd line\n')

21

In [23]:
fh.flush()

+ **Write multiple lines at once**

```python
fh.writelines(['This is the 3rd line\n', 'This is the 4th line\n'])   # Note the newline
fh.flush()
```

+ **Write iteratively**

```python
more_lines = ['5th line', '6th line', '7th line']

for line in more_lines:                   # iteration context
    fh.write(line + '\n')
    fh.flush()
fh.close()
```
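
The same writes can be done inside a context manager, which flushes and closes the file when the block ends. This is a sketch on a hypothetical scratch file rather than `data/fresh.txt`; it also uses `print(..., file=fh)`, which appends the newline automatically.

```python
import os
import tempfile

# Hypothetical scratch file (stands in for data/fresh.txt)
path = os.path.join(tempfile.mkdtemp(), 'write_demo.txt')
more_lines = ['5th line', '6th line', '7th line']

with open(path, 'w') as fh:
    for line in more_lines:
        print(line, file=fh)                   # print adds the '\n' for us
    fh.writelines(['8th line\n'])              # writelines does not add newlines

with open(path, 'r') as fh:
    content = fh.read()

print(content, end='')
```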



## os.path module
```python
import os
dir(os.path)
```
+ **path parsing:**
    - os.path.split(<path_str>): splits a path into (directory, filename)
    - os.path.splitext(<path_str>): splits a path into (root, extension)
```python
# Ex.
print(os.path.split('/Users/groveh01/Documents/my_data.txt'))
print(os.path.splitext('my_data.txt'))
```
+ **path building:**
    - os.path.join(<path_components>)
```python
# Ex.
print(os.path.join('/Users', 'groveh01', 'Documents', 'Teaching'))
```
+ **common tests:**
    - os.path.<test>, where test = {isdir(), isfile(), exists(), ...}
```python
# Ex.
print(os.path.isdir('data/data.txt'))
print(os.path.isfile('data/data.txt'))
print(os.path.exists('data'))
print(os.path.exists('data.txt'))
print(os.path.exists('/Users/groveh01'))
```
+ **listing contents of a dir:**
    - os.listdir
```python
# Ex.
import pprint
pprint.pprint(os.listdir('/Users/groveh01'))
pprint.pprint(os.listdir('.'))
pprint.pprint(os.listdir('..'))
```
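
The functions above are often used together. The sketch below builds a small hypothetical directory tree in a temporary location (all file names are made up for illustration) and collects only the `.txt` files. Since `os.listdir` returns bare names, each name is joined back onto the directory before testing it.

```python
import os
import tempfile

# Hypothetical directory with a few empty files and one sub-directory
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, 'subdir'))
for name in ['a.txt', 'b.csv', 'c.txt']:
    open(os.path.join(base, name), 'w').close()

txt_files = []
for name in sorted(os.listdir(base)):        # listdir order is arbitrary, so sort it
    full_path = os.path.join(base, name)     # listdir gives names, not full paths
    root, ext = os.path.splitext(name)
    if os.path.isfile(full_path) and ext == '.txt':
        txt_files.append(name)

print(txt_files)
```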

In [25]:
import os
print(os.path.split('/Users/groveh01/Documents/my_data.txt'))
print(os.path.splitext('my_data.txt'))
print(os.path.join('/Users', 'groveh01', 'Documents', 'Teaching'))

('/Users/groveh01/Documents', 'my_data.txt')
('my_data', '.txt')
/Users/groveh01/Documents/Teaching


In [27]:
print(os.path.isdir('data/data.txt'))
print(os.path.isfile('data/data.txt'))
print(os.path.exists('data'))
print(os.path.exists('data.txt'))
print(os.path.exists('/Users/groveh01'))

False
True
True
False
False


In [32]:
import os
print(os.path.splitext('my_file.txt'))

('my_file', '.txt')


# Some useful higher-order functions
+ Take other functions as input
+ <font color='blue'>**map**</font>: Apply a function to each element of an iterable
```python
# Syntax: map(<some fn>, <some iterable>)  returns a lazy map object
help(map)
```

```python
# Ex. 1
from math import sqrt
x = [1,2,3,4]
z = map(sqrt, x)
print(type(z), z)      # lazy map object; great for working with large iterables
print(list(z))
```
```python
# Ex. 2
def fn(a, b):
    return a+b
x = [1,2,3,4]
y = [10,11,12,13]
z = map(fn, x, y)
print(list(z))
```
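
Two properties of `map` are easy to miss: the lazy object can be consumed only once, and with several iterables it stops at the shortest one. A minimal sketch, using the hypothetical helper functions `square` and `add`:

```python
def square(n):
    return n * n

def add(a, b):
    return a + b

squares = map(square, [1, 2, 3])
first_pass = list(squares)      # consumes the map object
second_pass = list(squares)     # nothing left: the map object is exhausted

# With iterables of unequal length, map stops at the shortest one
sums = list(map(add, [1, 2, 3, 4], [10, 11, 12]))

print(first_pass, second_pass, sums)
```

If the results are needed more than once, store them in a list first.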

In [34]:
from math import sqrt
x = [1,2,3,4]
z = map(sqrt, x)
print(type(z), z)      # lazy map object; great for working with large iterables
print(list(z))

<class 'map'> <map object at 0x1104c9f60>
[1.0, 1.4142135623730951, 1.7320508075688772, 2.0]


In [36]:
def fn(a, b):
    return a+b
x = [1,2,3,4]
y = [10,11,12]
z = map(fn, x, y)
print(list(z))

[11, 13, 15]


+ <font color='blue'>**reduce**</font>: reduce an iterable to a single value
+ **Syntax**:
```python
from functools import reduce
reduce(<some fn>, <some sequence>)
help(reduce)
```

+ Ex:

```python
from functools import reduce
def sum_(a, b):
    return a+b
    
x = [1,2,3,4]
z = reduce(sum_, x)
print(z)
```

[1, 2, 3, 4]<br />
&nbsp;&nbsp;&nbsp;\/<br />
&nbsp;&nbsp;[3, 3, 4]<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\/<br />
&nbsp;&nbsp;&nbsp;&nbsp;[6, 4]<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\/<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;10<br />
     
+ <font color='blue'>_**reduce**_</font> can also take an initial value (check with _**help(reduce)**_)
```python
print(reduce(sum_, x, 100))
```
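
To make the pairwise reduction visible, the sketch below records each call in a list; `sum_traced` and `steps` are hypothetical names introduced for illustration. It also shows one practical reason for the initial value: without it, reducing an empty sequence raises a `TypeError`.

```python
from functools import reduce

steps = []

def sum_traced(a, b):
    steps.append((a, b))      # record the pair passed in at each step
    return a + b

total = reduce(sum_traced, [1, 2, 3, 4])
print(steps)                  # the running result is always the first argument
print(total)

with_initial = reduce(sum_traced, [1, 2, 3, 4], 100)   # starts from 100
empty_with_initial = reduce(sum_traced, [], 0)         # the initial value is returned

try:
    reduce(sum_traced, [])    # no initial value and nothing to reduce
except TypeError:
    empty_failed = True
else:
    empty_failed = False

print(with_initial, empty_with_initial, empty_failed)
```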

+ **<font color='blue'>anonymous (lambda) functions</font>**
    + useful shortcut for small functions to be created at run-time
    + single expression, whose value is returned
    + use in places where a _**'def'**_ statement can't appear (e.g., inline as a function argument)
```python
from functools import reduce
x = [1,2,3,4]
z = reduce(lambda a, b: a+b, x)       # same result as reduce(sum_, x)
print(z)
```

In [37]:
from functools import reduce
def sum_(a, b):
    return a+b

x = [1,2,3,4]
z = reduce(sum_, x)
print(z)

10


In [39]:
print(reduce(sum_, x, 100))

110


# Another useful example (sorting complex objects)
```python
x = [('d', 2), ('a', 4), ('c', 1), ('b', 3)]
x.sort()
print(x)
x.sort(key=lambda z: z[1])      # Sort list objects (tuples) using 2nd item as the key
print(x)
```
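
A few variations on the tuple example, as a sketch: `sorted()` returns a new list and leaves the original untouched, `reverse=True` flips the order, and a key that returns a tuple sorts by several fields at once. The list `records` is hypothetical data added for illustration.

```python
x = [('d', 2), ('a', 4), ('c', 1), ('b', 3)]

by_second = sorted(x, key=lambda z: z[1])                  # new list; x is unchanged
descending = sorted(x, key=lambda z: z[1], reverse=True)   # largest 2nd item first

# Hypothetical data with ties in the 2nd item:
# sort by the 2nd item, then break ties using the 1st item
records = [('b', 2), ('a', 2), ('c', 1)]
by_both = sorted(records, key=lambda z: (z[1], z[0]))

print(x)
print(by_second)
print(descending)
print(by_both)
```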


```python
import pprint
x = [{'firstname': 'Frodo', 'lastname': 'Took'}, 
     {'firstname': 'Samwise', 'lastname': 'Brandybuck'}, 
     {'firstname': 'Pippin', 'lastname': 'Gamgee'}, 
     {'firstname': 'Merry', 'lastname': 'Baggins'}]
x.sort(key=lambda name: name['firstname'])  # Sort list objects (dicts) using  value for
                                            # 'firstname' as the sorting key
pprint.pprint(x)
x.sort(key=lambda name: name['lastname'])   # Sort list objects by value for 'lastname'
pprint.pprint(x)
```