# [CptS 111 Introduction to Algorithmic Problem Solving](http://eecs.wsu.edu/~gsprint/cpts111/)
[Washington State University](http://wsu.edu)

[Gina Sprint](http://eecs.wsu.edu/~gsprint/)
# L10-1 More File I/O and Practice

## The File "Cursor"
When you open a file for reading ("r" mode), the cursor that marks the current read position starts at the beginning of the file (position 0). Each call to `readline()` moves the cursor forward through the file. To find the position of the cursor, you can call `tell()`:

In [13]:
in_file = open("files\\transactions.txt", "r")

print("File cursor is at position: %d" %(in_file.tell()))

# reading data from the file advances the cursor by a certain number of bytes, depending on the number of characters in the line
transaction = in_file.readline()
print("File cursor is at position: %d" %(in_file.tell()))
# the %r placeholder displays all characters in a string. we use it to see the newline character as \n
print("First line contains: %r which contains %d characters (including newline)" %(transaction, len(transaction)))
in_file.close()

File cursor is at position: 0
File cursor is at position: 7
First line contains: '13.42\n' which contains 6 characters (including newline)


To move the cursor back to the beginning of the file, you can either:
1. Close the file and re-open it
1. Use `seek(0,0)`:

In [27]:
in_file = open("files\\transactions.txt", "r")

print("File cursor is at position: %d" %(in_file.tell()))

# reading data from the file advances the cursor by a certain number of bytes, depending on the number of characters in the line
transaction = in_file.readline()
print("File cursor is at position: %d" %(in_file.tell()))
# the %r placeholder displays all characters in a string. we use it to see the newline character as \n
# len() returns the number of characters in the string
print("First line contains: %r which contains %d characters (including newline)" %(transaction, len(transaction)))
# move the cursor back to the beginning of the file
in_file.seek(0,0) 
print("File cursor is at position: %d" %(in_file.tell()))
in_file.close()

File cursor is at position: 0
File cursor is at position: 7
First line contains: '13.42\n' which contains 6 characters (including newline)
File cursor is at position: 0


Note: In the code above I used a built-in function called [`len()`](https://docs.python.org/3/library/functions.html#len). `len()` accepts a string as an argument and returns the number of characters in the string.

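Digression: On Windows, newlines are actually represented by \r\n (carriage return and newline). When reading a file in text mode, Python translates the \r\n pair into a single \n for us so we don't have to worry about this. Knowing this at least helps explain the cursor position of 7 above: the first line occupies 7 bytes in the file (5 characters plus \r\n), even though the string we read contains 6 characters.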

|Position|0|1|2|3|4|5|6|7|8|...|
|-|-|-|-|-|-|-|-|-|-|-|
|Character|1|3|.|4|2|\r|\n|2|7|...|
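
To see the \r\n for ourselves, we can compare text mode with binary mode ("rb"), which does no newline translation. The following is a minimal sketch, not part of the original lesson: it assumes a small throwaway file (the hypothetical name `transactions_demo.txt`) written into a temporary folder with Windows-style line endings, so it behaves the same on any operating system.

```python
import os
import tempfile

# hypothetical demo file, written in binary mode so the \r\n bytes
# are stored exactly as given (no translation on any platform)
demo_path = os.path.join(tempfile.mkdtemp(), "transactions_demo.txt")
out_file = open(demo_path, "wb")
out_file.write(b"13.42\r\n27.00\r\n")
out_file.close()

# text mode ("r"): \r\n is translated to \n when reading
text_file = open(demo_path, "r")
text_line = text_file.readline()
text_file.close()

# binary mode ("rb"): no translation, so both bytes are visible
binary_file = open(demo_path, "rb")
binary_line = binary_file.readline()
binary_position = binary_file.tell()
binary_file.close()

print("Text mode read: %r (%d characters)" %(text_line, len(text_line)))
print("Binary mode read: %r (%d bytes)" %(binary_line, len(binary_line)))
print("Binary file cursor is at position: %d" %(binary_position))
```

The binary read shows the extra \r that accounts for the difference between the string length and the cursor position.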

We can remove leading and trailing whitespace characters (like \n and \r) with a call to the string method `strip()`:

In [31]:
print("With whitespace characters: %r without: %r" %(transaction, transaction.strip()))

With whitespace characters: '13.42\n' without: '13.42'
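
A common reason to strip a line is to convert it to a number afterwards. Here is a small sketch, assuming `transaction` holds a line as returned by `readline()`; the variable names `amount` and `padded` are only for illustration. Note that `strip()` returns a new string and only removes whitespace from the two ends, not from the middle.

```python
# a line as it would be returned by readline()
transaction = "13.42\n"

# strip() returns a new string; the original is unchanged
cleaned = transaction.strip()
amount = float(cleaned)
print("Original: %r, cleaned: %r, as a float: %.2f" %(transaction, cleaned, amount))

# whitespace in the middle of the string is left alone
padded = "  13 42 \n"
print("Both ends stripped: %r" %(padded.strip()))
# rstrip() only removes whitespace from the right end
print("Right end stripped: %r" %(padded.rstrip()))
```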


## Common Errors when Working with Files
* Using the wrong file handle to refer to a file
* Opening a nonexistent file for reading
* Opening a file for reading or writing without the appropriate access rights
* Opening a file for writing when no disk space is available
* Opening a file for writing ("w") when the user wants to preserve the previous contents of the file ("w" discards all contents of the file)
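
Two of these errors are easy to reproduce. The sketch below is an illustration under a few assumptions: it uses throwaway files in a temporary folder (the hypothetical names `log_demo.txt` and `does_not_exist.txt`), it uses append mode ("a") as one way to keep previous contents, and it uses `try`/`except`, which may not have been covered yet in this course.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), "log_demo.txt")

fout = open(demo_path, "w")
fout.write("first run\n")
fout.close()

# "a" (append) keeps the existing contents and adds to the end
fout = open(demo_path, "a")
fout.write("second run\n")
fout.close()

fin = open(demo_path, "r")
after_append = fin.read()
fin.close()

# "w" discards everything that was in the file before
fout = open(demo_path, "w")
fout.write("third run\n")
fout.close()

fin = open(demo_path, "r")
after_write = fin.read()
fin.close()

print("After appending: %r" %(after_append))
print("After re-opening with w: %r" %(after_write))

# opening a nonexistent file for reading raises FileNotFoundError
missing_path = os.path.join(tempfile.mkdtemp(), "does_not_exist.txt")
try:
    fin = open(missing_path, "r")
    fin.close()
    found = True
except FileNotFoundError:
    found = False
    print("Could not open the file for reading")
```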

## Revisiting `print()`
There are several ways to print strings with the `print()` function. We already know how to use placeholders, such as:

In [60]:
print("Integer: %d, Float: %f, Float with 1 decimal: %.1f, String: %s" %(7, 8.4898899, 3.14, ":)"))

Integer: 7, Float: 8.489890, Float with 1 decimal: 3.1, String: :)


It is helpful to be aware of other printing approaches, especially when you want to format output a particular way. Check out these alternatives:

In [76]:
print(4, 5.5, ":P", 8)
print("A string without the added newline character", end="")
print("This sentence runs into the previous", end="!\n")
print("A comma", "separated", "list", sep=", ")

# https://docs.python.org/3/library/string.html
print("A {} form of placeholders {:.1f}. You can also use keywords {name}".format("alternative", 9.99, name="cpts111"))

# alternative way to write to a file using print()
fout = open("files\\out_demo.txt", "w")
print("Writing this output via print()", file=fout)
fout.close()

4 5.5 :P 8
A string without the added newline characterThis sentence runs into the previous!
A comma, separated, list
A alternative form of placeholders 10.0. You can also use keywords cpts111
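
To see what `print()` with the `file` argument actually puts in a file, we can compare it with `write()`. This is a minimal sketch that assumes two throwaway files in a temporary folder (the hypothetical names `print_demo.txt` and `write_demo.txt`). With `print()`, the separating space, the conversion to a string, and the trailing newline are added for us; with `write()` we have to build the whole string ourselves.

```python
import os
import tempfile

demo_dir = tempfile.mkdtemp()
print_path = os.path.join(demo_dir, "print_demo.txt")
write_path = os.path.join(demo_dir, "write_demo.txt")

# print() converts each argument to a string, separates them with a space,
# and adds a newline at the end
fout = open(print_path, "w")
print("Total:", 13.42, file=fout)
fout.close()

# write() only accepts one string, so we build it by hand
fout = open(write_path, "w")
fout.write("Total: " + str(13.42) + "\n")
fout.close()

fin = open(print_path, "r")
printed = fin.read()
fin.close()

fin = open(write_path, "r")
written = fin.read()
fin.close()

print("Via print(): %r" %(printed))
print("Via write(): %r" %(written))
print("Same contents? %s" %(printed == written))
```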


## Practice
For the following problems, we will need to download a file: [words.txt](http://thinkpython2.com/code/words.txt). This file contains 113,809 official crossword words, one per line. Using words.txt, write a program with the following functionality:
1. A function called `open_input_file()` that opens words.txt for reading and returns the file object associated with words.txt
1. A function called `close_file()` that accepts the file object as an argument and closes the file
1. A function called `first_five_words()` that displays the first 5 words of the file. Try to display the words one on each line, without an extra newline between the words like:
```
aa
aah
aahed
aahing
aahs
```
Hint: read the [Python input/output tutorial](https://docs.python.org/3/tutorial/inputoutput.html) for more info about how to do this with `print()`.
1. Display the word nearest to the 1 millionth character
1. Challenge: Display the character at the end of the file. Hint: read the [Python input/output tutorial](https://docs.python.org/3/tutorial/inputoutput.html) for more info about how to do this with `seek()`. Also, use the %r placeholder to print this character.

In [57]:
def open_input_file(fname):
    '''
    Opens the file named fname for reading and returns the file object.
    '''
    in_file = open(fname, "r")
    return in_file

def close_file(file_2_close):
    '''
    Closes the file object file_2_close.
    '''
    file_2_close.close()
    
def first_five_words(in_file):
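    '''
    Displays the next five lines of in_file, one word per line.
    Each line already ends with a newline, so end="" avoids a blank line between words.
    '''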
    print(in_file.readline(), end="")
    print(in_file.readline(), end="")
    print(in_file.readline(), end="")
    print(in_file.readline(), end="")
    print(in_file.readline())
    
def main():
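    '''
    Opens words.txt, displays the first five words, the word nearest
    location 1000005, and what is read at the end of the file, then closes the file.
    '''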
    fin = open_input_file("files\\words.txt")
    first_five_words(fin)
    
    location = 1000005
    fin.seek(location, 0)
    word = fin.readline()
    print("The word nearest location %d: %s" %(location, word))
    
    location = 0
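    # the second argument 2 means the offset is relative to the end of the file,
    # so seek(0, 2) places the cursor just past the last character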
    fin.seek(location, 2)
    word = fin.readline()
    # from the tutorial
    # f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string
    # and is only omitted on the last line of the file if the file doesn’t end in a newline.
    # the cursor is already at the end of the file here, so readline() returns an empty string
    print("The last character of the file: %r" %(word))
    
    close_file(fin)
    
main()

aa
aah
aahed
aahing
aahs

The word nearest location 1000005: teledus

The last character of the file: ''
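
The empty string above shows that seeking to the very end leaves nothing to read. To read the actual last character, the cursor has to sit one position before the end. According to the tutorial, a file opened in text mode only supports seeks relative to the beginning (apart from `seek(0, 2)`), so one possible approach is to open the file in binary mode ("rb"). The sketch below is an alternative under assumptions, not the solution above: it uses a small throwaway file (the hypothetical name `words_demo.txt`) in place of words.txt and assumes the file contains only ASCII characters.

```python
import os
import tempfile

# small stand-in for words.txt
demo_path = os.path.join(tempfile.mkdtemp(), "words_demo.txt")
fout = open(demo_path, "w")
fout.write("aa\naah\naahed\n")
fout.close()

# binary mode allows a negative offset relative to the end of the file
fin = open(demo_path, "rb")
# move the cursor to 1 byte before the end of the file
fin.seek(-1, 2)
last_byte = fin.read(1)
fin.close()

# convert the byte we read into a one-character string
last_character = last_byte.decode("ascii")
print("The last character of the file: %r" %(last_character))
```

Since the demo file ends with a newline, the last character is the newline itself, which is why the %r placeholder is useful here.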


