# Lecture 14 2018-10-04: I/O, exceptions, style

This worksheet accompanies the lecture notes.

## Input/Output


### getting user input

To show "prompt" on STDOUT and then read one line from STDIN (the terminal by default), use *input*, which returns what the user typed as a string:

>input('prompt')

In [None]:
while True:
    ans = input('continue? (Y/N)')
    if ans == 'Y': break

Why is this better?

In [43]:
while True:
    ans = input('continue? (Y/N)')
    if ans.upper()[0] == 'Y': break

KeyboardInterrupt: 
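
The second version accepts 'y' and 'yes' as well as 'Y', but `ans.upper()[0]` raises an IndexError if the user just presses Enter, because an empty string has no first character. Here is a minimal sketch of one way around that. The names *is_yes* and *ask_until_yes* are made up for this example, and the *ask* parameter is only there so the loop can be tried without a terminal.

```python
def is_yes(answer):
    """Return True if the answer starts with 'y' or 'Y', ignoring surrounding spaces."""
    return answer.strip().upper().startswith('Y')


def ask_until_yes(ask=input):
    """Keep prompting until the answer is a yes; ask defaults to the built-in input."""
    while not is_yes(ask('continue? (Y/N) ')):
        pass


# Try the check on a few answers, including the empty one.
for answer in ['Y', 'yes', ' y ', 'N', '']:
    print(repr(answer), '->', is_yes(answer))
```

*startswith* simply returns False for an empty string, so no special case is needed.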

### Using files

We have been providing input to our code using variables, such as 
>dna = 'AGGGCCTA'

This is an easy way to set up a small test case, but it is poor practice in general, because you would have to change your code every time you look at new data. 

Usually, you want to read your data from a file, most often a text file.
To do that you need to open the file, read from it, and close it.

#### Digression: bash revisited

It is often unpleasant to use a relative or full path to access a file, especially if it is in a different directory than your program (or notebook). 

There are two ways around this, in the bash shell (from an open terminal):
* use a *link*
* use a *system variable*

For example, to use the *file* command from a given directory, you could use something like

>file /Users/jamesafoster/Computer_Skills_F18/Class_Resources/Datasets/TSE.txt

or

>file ~/Computer_Skills_F18/Class_Resources/Datasets/TSE.txt

or, if you were in the sandbox directory, something like

>file ../../Computer_Skills_F18/Class_Resources/Datasets/TSE.txt

Or you could change to the other directory first

>cd ../../Computer_Skills_F18/Class_Resources/Datasets
>file TSE.txt

But this is inconvenient. 
Suppose you are in the *sandbox* directory, and your datasets are elsewhere. In this example, I use my own directory structure, which is probably different from yours. 

##### links

A link is a special file that gives enough information to the operating system to find and access the file to which it links. (This is like an "alias" in OSX.)

You will usually use a "soft" link. (We won't discuss "hard" links.) 

To create a link named l for file f, use
>ln -s f l

(note the order: from left to right, or from existing file to link)

From my sandbox directory

```
ln -s ../Class_Resources/Datasets/TSE.txt tse.txt
ls -l
file tse.txt
```
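
The same idea can be checked from python. This is only a sketch: it works in a temporary directory with made-up *Datasets* and *sandbox* folders rather than the real class folders, and it assumes an operating system that allows soft links (macOS and Linux do).

```python
import os
import tempfile

# Work in a throwaway directory so nothing in your own folders is touched.
with tempfile.TemporaryDirectory() as tmp:
    datasets = os.path.join(tmp, 'Datasets')   # stand-in for the real data folder
    sandbox = os.path.join(tmp, 'sandbox')     # stand-in for the sandbox folder
    os.makedirs(datasets)
    os.makedirs(sandbox)

    target = os.path.join(datasets, 'TSE.txt')
    link = os.path.join(sandbox, 'tse.txt')

    with open(target, 'w') as fh:
        fh.write('not with a bang\n')

    os.symlink(target, link)                   # same order as: ln -s f l

    link_is_link = os.path.islink(link)
    same_file = os.path.realpath(link) == os.path.realpath(target)
    with open(link) as fh:                     # read through the link
        text_via_link = fh.read()

print(link_is_link, same_file, repr(text_via_link))
```

Reading through the link gives the contents of the file it points to, which is why the rest of the worksheet can open *tse.txt* from the sandbox.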

Any variation of *../Class_Resources/Datasets/*, such as */Users/jamesafoster/Documents/Teaching/_Comp-skills-course/_2018/Class_Resources/Datasets* or *~/Documents/Teaching/_Comp-skills-course/_2018/Class_Resources/Datasets* would also work.

##### System variables

We briefly mention bash system variables, such as *$PATH*. 
By convention, system variables are upper case, though this isn't required.

Adding a "$" to the beginning of a variable name returns the **value** of that variable.

To see all variables that exist in a given bash session:

>set | less

You create a variable by assigning a value to its name (the *set* command only lists variables). To create a CLASS_DATASETS variable with the full path to that directory:

```
CLASS_DATASETS="/Users/jamesafoster/Documents/Teaching/_Comp-skills-course/_2018/Class_Resources/Datasets"

echo $CLASS_DATASETS
```

**Note:**
* there are no spaces around "="! (python is more forgiving than bash)
* bash "evaluates" what is inside the double quotes in the above assignment, so you can use other variables inside a value. 
For example:

```
CLASS_REPO="/Users/jamesafoster/Documents/Teaching/_Comp-skills-course/_2018/Class_Resources/"
CLASS_DATASETS="$CLASS_REPO/Datasets/"
CLASS_ASSIGNMENTS="$CLASS_REPO/Homework/"
CLASS_HOMEWORK="$CLASS_REPO/../_jamesafoster/"

set | grep 'CLASS'
```
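
A python program can read these variables through *os.environ*, but only if the shell passes them on, which means running `export CLASS_DATASETS` in bash first. The sketch below assumes that has been done. The function name *dataset_path* and its fallback to the current directory are made up for this example.

```python
import os


def dataset_path(filename, env=os.environ, default_dir='.'):
    """Build the path to a data file from the CLASS_DATASETS variable.

    Falls back to default_dir when the variable is not set.
    """
    directory = env.get('CLASS_DATASETS', default_dir)
    return os.path.join(directory, filename)


# Uses the real environment of this python session.
print(dataset_path('TSE.txt'))

# Passing a plain dictionary instead shows the behavior without touching bash.
print(dataset_path('TSE.txt', env={'CLASS_DATASETS': '/data/Datasets'}))
```

This keeps the long path out of your code, so the same program works for anyone who sets the variable to their own directory.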

#### Back to using files in python

I assume that I have a soft link named tse.txt in sandbox, pointing to the TSE.txt file in the Class_Resources/Datasets directory.

All access to a file in python is through the methods of a *file handle*. 
To create a file handle, use the *open* function with the name of the file and a mode: 'r', 'w', or 'a', depending on whether you want to read from, write to, or append to the file. If the file doesn't exist and you use 'w' or 'a', the system will create it. If the file does exist, 'w' erases its contents first.

>input_file_handle = open('input filename','r')
>output_file_handle = open('output filename','w')
>append_file_handle = open('append filename','a')
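
The difference between 'w' and 'a' is easy to miss, so here is a small sketch of it. The file *demo.txt* is made up and lives in a temporary directory that is deleted afterwards.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')   # made-up file name

    handle = open(path, 'w')               # 'w' creates the file
    handle.write('one\n')
    handle.close()

    handle = open(path, 'a')               # 'a' adds to the end
    handle.write('two\n')
    handle.close()

    handle = open(path, 'r')
    after_append = handle.read()
    handle.close()

    handle = open(path, 'w')               # 'w' again: old contents are erased
    handle.write('three\n')
    handle.close()

    handle = open(path, 'r')
    after_rewrite = handle.read()
    handle.close()

print(repr(after_append))    # 'one\ntwo\n'
print(repr(after_rewrite))   # 'three\n'
```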

Notice that a file handle "remembers" where you stopped reading, so that the next *readline* or *read* begins there. 
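
A short sketch of that "remembering". It uses *io.StringIO*, which behaves like an open text file but lives in memory, so that the example does not depend on any file on disk. The three lines of text are made up.

```python
import io

# io.StringIO stands in for a file handle opened with 'r'.
handle = io.StringIO('line one\nline two\nline three\n')

first = handle.readline()    # reads up to and including the first newline
second = handle.readline()   # continues where the first call stopped
rest = handle.read()         # everything that is left
at_end = handle.readline()   # an empty string means end of file

print(repr(first), repr(second), repr(rest), repr(at_end))
```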

In [37]:
# from sandbox, assuming link to Class_Resources/Datasets/TSE.txt named tse.txt
sandbox_directory = '/Users/jamesafoster/Documents/Teaching/_Comp-skills-course/_2018/sandbox/'

input_file_handle = open(sandbox_directory+'tse.txt','r')
output_file_handle = open(sandbox_directory+'output.txt','w')
append_file_handle = open(sandbox_directory+'append.txt','a')

To read from a file, use the *readline* method to read the next line or *read* to read the entire file (including newline characters) at once.
You rarely need *readline*, because you can just iterate through the file handle (see the first example below).

To write string s to a file, use the *write(s)* method. It writes exactly the characters in s: it does not add a newline at the end.

In [38]:
output_file_handle.write('first new line')
output_file_handle.write('second new line')

append_file_handle.write('first append line')
append_file_handle.write('second append line')

# go look at files


18
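
The 18 is the return value of the last *write*: the number of characters it wrote. Here is a sketch of that, and of the fact that *write* adds no line breaks of its own. It uses an in-memory *io.StringIO* in place of a real file, and the extra strings are made up.

```python
import io

out = io.StringIO()                        # stands in for a file opened with 'w'

count = out.write('second append line')   # returns the number of characters written
out.write('no newline was added')         # lands on the same line as the text before
out.write('\n')                           # line breaks are up to you
out.write('a real second line\n')

print(count)            # 18
print(out.getvalue())
```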

In [39]:
# read one line at a time, easy way
input_file_handle = open(sandbox_directory+'tse.txt','r')
for next_line in input_file_handle:
    print('next line>', next_line.strip())  # show what .strip() does

input_file_handle.close()

next line> this is the way the world ends
next line> this is the way the world ends
next line> this is the way the world ends
next line> 
next line> not with a bang
next line> 
next line> but a whimper


In [40]:
input_file_handle = open(sandbox_directory+'tse.txt','r')

x=input_file_handle.read()      # get everything
file_content = x.split('\n')    # split into a list
for next_line in file_content:  # walk through the list
    print('next line...', next_line)

input_file_handle.close()

next line... this is the way the world ends
next line... this is the way the world ends
next line... this is the way the world ends
next line... 
next line... not with a bang
next line... 
next line... but a whimper


In [41]:
file_content

['this is the way the world ends',
 'this is the way the world ends',
 'this is the way the world ends',
 '',
 'not with a bang',
 '',
 'but a whimper']
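
One thing to watch with `split('\n')`: if the text ends with a newline, as most text files do, you get an extra empty string at the end of the list. The *splitlines* method avoids that. A small sketch, with a made-up string in place of the file contents:

```python
text = 'not with a bang\n\nbut a whimper\n'   # note the final newline

by_split = text.split('\n')
by_splitlines = text.splitlines()

print(by_split)        # ['not with a bang', '', 'but a whimper', '']
print(by_splitlines)   # ['not with a bang', '', 'but a whimper']
```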

Always close a file when you are done. Closing writes out anything still waiting in the buffer and releases the file back to the operating system.

>file_handle.close()

Python can also close the file for you. 
Open the file with *with...as*. 
When the indented block ends, python closes the file handle automatically, even if an exception interrupted the block. 
This is a common idiom.

In [42]:
with open(sandbox_directory+'tse.txt','r') as input_file_handle:
    for next_line in input_file_handle:
        print('next line>', next_line.strip())

next line> this is the way the world ends
next line> this is the way the world ends
next line> this is the way the world ends
next line> 
next line> not with a bang
next line> 
next line> but a whimper
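
A small check that the handle really is closed once the *with* block ends, even when an error interrupts the block. This is a sketch using a made-up *demo.txt* in a temporary directory. The *closed* attribute of a file handle is True once the file has been closed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')   # made-up file name

    with open(path, 'w') as out_handle:
        out_handle.write('but a whimper\n')
    closed_after_block = out_handle.closed   # the block ended normally

    try:
        with open(path, 'r') as in_handle:
            1 / 0                            # force an error inside the block
    except ZeroDivisionError:
        pass
    closed_after_error = in_handle.closed    # the block ended with an exception

print(closed_after_block, closed_after_error)   # True True
```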


## Exceptions

When something goes wrong, python "throws an exception" with a "traceback". The type of the exception tells you what went wrong and the traceback shows you where it went wrong. 

Sometimes you have to look at the instruction *before* the one where the exception occurred to find the problem. 

The class preparation has many examples: https://docs.python.org/3/tutorial/errors.html

### syntax errors

The python interpreter can't figure out what you want. 
(These are errors rather than run-time exceptions, because python couldn't even get far enough to take exception.)

In [None]:
if True
   print('that was true')

my_string = 'this isn't the way it should be
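
Python reports both of these as a SyntaxError while it is still reading the code, before any line runs. A minimal sketch of that, using the built-in *compile* to parse each snippet from a string so that the error can be caught and shown (the name `'<example>'` is just a label for the messages):

```python
bad_sources = [
    "if True\n    print('that was true')",            # missing colon
    "my_string = 'this isn't the way it should be",   # unbalanced quotes
]

error_names = []
for source in bad_sources:
    try:
        compile(source, '<example>', 'exec')   # parse only, nothing is run
    except SyntaxError as err:
        error_names.append(type(err).__name__)
        print(type(err).__name__, '-', err.msg)
```

The exact wording of the messages depends on your version of python.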

### Exceptions
There are specific types of exceptions, depending on the problem that python encountered.
When the python interpreter raises an exception that your code does not handle, it **stops** immediately. 
So you end up fixing your exceptions one at a time, which is good because it helps you focus.

Type of Exception  | Reason for exception
:----------------  | :-------------------
ZeroDivisionError  | you divided by zero
NameError          | you referenced a name that has not been defined
TypeError          | you did something that is not allowed for that type of object
EOFError           | *input* reached the end of its input without reading any data
IOError            | you tried to do something with a file that can't be done (in python 3 this is another name for OSError)
ImportError        | you tried to import a module that can't be found
IndexError         | you used an index that is outside the range of the sequence

Let's come up with examples.
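
Here is one possible set, as a sketch. The helper *trigger* and the labels are made up for this example. It uses *try...except* from the class preparation to catch each exception and report its name instead of stopping. Python reports the missing file as FileNotFoundError and the missing module as ModuleNotFoundError, which are more specific kinds of IOError and ImportError.

```python
import importlib


def trigger(action):
    """Run action() and return the name of the exception it raises, or None."""
    try:
        action()
    except Exception as err:
        return type(err).__name__
    return None


examples = {
    'divide by zero':     lambda: 1 / 0,
    'undefined name':     lambda: undefined_variable,
    'wrong type':         lambda: 'abc' + 1,
    'missing file':       lambda: open('/no/such/directory/file.txt'),
    'missing module':     lambda: importlib.import_module('no_such_module_xyz'),
    'index out of range': lambda: [1, 2, 3][10],
}

results = {label: trigger(action) for label, action in examples.items()}
for label, name in results.items():
    print(label, '->', name)
```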

## Style

Read the preparation url (just ignore stuff that we haven't discussed): https://www.python.org/dev/peps/pep-0008/

This is also a good site: https://docs.python-guide.org/writing/style/

Here are some examples of bad style. Discuss and correct them.

But first, check out the *zen of python*

In [None]:
import this

In [None]:
def cc(s, c): # this is my function
    x,y=[],0
    for nc in c:x.append(s.count(nc))
    return(x) 

print(cc('ACNTKT','ACGTN?XYK'))
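
One possible cleanup, offered as a sketch rather than the only right answer. The names *count_characters*, *sequence*, and *alphabet* are guesses at what the original meant. The unused variable is gone, the comment became a docstring, and the loop became a list comprehension.

```python
def count_characters(sequence, alphabet):
    """Return how many times each character of alphabet occurs in sequence."""
    return [sequence.count(character) for character in alphabet]


counts = count_characters('ACNTKT', 'ACGTN?XYK')
print(counts)   # [1, 1, 0, 2, 1, 0, 0, 0, 1]
```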

In [None]:
c = {'ATA':'I', 'ATC':'I', 'ATT':'I', 'ATG':'M','ACA':'T', 'ACC':'T', 'ACG':'T', 'ACT':'T','AAC':'N', 'AAT':'N', 'AAA':'K', 'AAG':'K','AGC':'S', 'AGT':'S', 'AGA':'R', 'AGG':'R','CTA':'L', 'CTC':'L', 'CTG':'L', 'CTT':'L','CCA':'P', 'CCC':'P', 'CCG':'P', 'CCT':'P','CAC':'H', 'CAT':'H', 'CAA':'Q', 'CAG':'Q','CGA':'R', 'CGC':'R', 'CGG':'R', 'CGT':'R','GTA':'V', 'GTC':'V', 'GTG':'V', 'GTT':'V','GCA':'A', 'GCC':'A', 'GCG':'A', 'GCT':'A','GAC':'D', 'GAT':'D', 'GAA':'E', 'GAG':'E','GGA':'G', 'GGC':'G', 'GGG':'G', 'GGT':'G','TCA':'S', 'TCC':'S', 'TCG':'S', 'TCT':'S','TTC':'F', 'TTT':'F', 'TTA':'L', 'TTG':'L','TAC':'Y', 'TAT':'Y', 'TAA':'_', 'TAG':'_','TGC':'C', 'TGT':'C', 'TGA':'_', 'TGG':'W',}

In [None]:
# not using methods 
d = {'hello': 'world'}
if d.has_key('hello'):
    print d['hello']    # prints 'world'
else:
    print 'default_value'
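
This cell will not even run in python 3: *print* is now a function and dictionaries no longer have a *has_key* method. Here is a sketch of two idiomatic replacements, the *in* operator and the *get* method. The 'goodbye' key is made up to show the fallback.

```python
d = {'hello': 'world'}

# Ask "is the key there?" with the in operator ...
if 'hello' in d:
    print(d['hello'])        # prints 'world'
else:
    print('default_value')

# ... or let get() supply the fallback in one step.
greeting = d.get('hello', 'default_value')
missing = d.get('goodbye', 'default_value')
print(greeting, missing)     # world default_value
```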

In [None]:
# not using for or while (while i in a, for x in a)
a = [3, 4, 5]
for i in a:
    if i > 4:
        a.remove(i)

i = 0
while i < len(a):
    print(a)
    i=i+1
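
The first loop only looks correct because the item it removes is the last one. Removing from a list while you iterate over it makes python skip the item after each removal. A sketch with a different, made-up list shows the problem, and a list comprehension that avoids it. A plain *for* loop also replaces the counter-driven *while* loop, assuming the intent was to print each item.

```python
a = [3, 5, 6, 4]

# Removing while iterating: after 5 is removed, 6 slides into its place and is skipped.
buggy = list(a)
for i in buggy:
    if i > 4:
        buggy.remove(i)

# Building a new list leaves the one being iterated over untouched.
kept = [i for i in a if i <= 4]

print(buggy)   # [3, 6, 4]
print(kept)    # [3, 4]

# Walk through the items directly, no index needed.
for item in kept:
    print(item)
```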