<h1>8. Text I/O</h1>

<h2>10/19/2020</h2>

<h2>8.0 Last Time...</h2>
<ul>
    <li>You can use functions within NumPy to find the shape, number of dimensions, and number of elements in any array.</li>
    <li><b>numpy.where()</b> will let you find the locations of elements within an array that meet a specified criterion.</li>
    <li>Some functions within NumPy allow you to manipulate arrays, such as reshaping, transposing, "unraveling", concatenation, and repeating individual array elements.</li>
    <li>When you're in a situation with multiple nested loops, often you can instead perform all operations a lot more easily by using array syntax.</li>
    <li>Mathematical operations act on arrays elementwise.</li>
    <li>When testing within an array, <b>and</b> and <b>or</b> cannot be used; instead, use NumPy's built-in functions.</li>
</ul>

<h2>8.1 File Objects</h2>

A file object is just a variable that represents the file within Python. The process of creating a file object is the same general idea as creating any variable: you create it by assignment.

For a text file, you can create a file object with the built-in <b>open()</b> function. The first argument in <b>open()</b> gives the filename, and the second sets the mode for the file:
<ul>
    <li><b>'r'</b>: sets the file to read-only.</li>
    <li><b>'w'</b>: sets the file to writing mode (the file is created if it doesn't exist, and any existing contents are erased).</li>
    <li><b>'a'</b>: sets the file to append mode (you can only add new things to the end).</li>
</ul>

In [None]:
# Try opening the 'test.txt' file that's been added to your server.

fileobj = open('test.txt','r')

When you're done with a file, close it with the <b>close()</b> method.

In [None]:
# Close that file back up.

fileobj.close()
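As a minimal sketch of the three modes, the cell below uses a <b>with</b> block, which closes the file automatically when the block ends. The scratch file <b>demo.txt</b> and the variable names are made up for this illustration (so that 'test.txt' is left alone), and <b>read()</b>, which returns the whole file as one string, is used here only to check the result.

```python
import os
import tempfile

# Hypothetical scratch file so this sketch does not touch 'test.txt'.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Mode 'w' creates the file (or empties it if it already exists).
with open(demo_path, 'w') as fileobj:
    fileobj.write('first line\n')

# Mode 'a' adds to the end without erasing what is already there.
with open(demo_path, 'a') as fileobj:
    fileobj.write('second line\n')

# Mode 'r' is read-only; the file is closed when the with block ends.
with open(demo_path, 'r') as fileobj:
    text = fileobj.read()

print(text)
print(fileobj.closed)
```

Using <b>with</b> is optional; calling <b>close()</b> yourself does the same job, but the <b>with</b> form still closes the file if an error occurs partway through.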

<h2>8.2 Text Input/Output</h2>

To read a line from a file into a variable, you can use the <b>readline()</b> method.

In [None]:
# First, open the file.
my_file = open('test.txt','r')

# Assign the first line of text to the variable aline.
aline = my_file.readline()

# Calling readline() again returns the next line in the file.
bline = my_file.readline()

# Print those first two lines of text.
print(aline)
print(bline)

# Close the file. (This is good practice!)
my_file.close()

You can also loop over the file object directly to go through the whole file one line at a time!

In [None]:
my_file = open('test.txt','r')

for line in my_file:
    print(line)
    
my_file.close()

Reading one line at a time can be fairly limiting. More often, you'll want to read the whole file into a list, with each line as one element; this can be done using <b>readlines()</b> (note the plural!).

In [None]:
# Let's open the file again.
my_file = open('test.txt','r')

# Save the file's contents to a list.
contents = my_file.readlines()

print(contents)

# Close that file!
my_file.close()

Note that there's a newline (<b>\n</b>) character at the end of each line (the last line only has one if the file itself ends with a newline).
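If the trailing newlines get in the way, one option is to strip them with the string method <b>rstrip()</b>. The sketch below assumes a small made-up file (<b>demo.txt</b>, written in the same cell so the example stands alone); the variable names are illustrative only.

```python
import os
import tempfile

# Hypothetical scratch file with three lines and no final newline.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'w') as f:
    f.write('alpha\nbeta\ngamma')

# readlines() keeps the newline characters that are in the file.
with open(demo_path, 'r') as f:
    raw_lines = f.readlines()

# rstrip('\n') removes a trailing newline if there is one.
clean_lines = [line.rstrip('\n') for line in raw_lines]

print(raw_lines)
print(clean_lines)
```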

To write to a file, you can use the <b>write()</b> method (obviously this doesn't work if a file is in read-only mode).

In [None]:
# Let's open a file in writing mode.
my_file = open('test.txt','w')

# Write a phrase to the file.
my_file.write('Hello world!')

my_file.close()

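Note that opening a file in <b>'w'</b> mode overwrites everything currently inside the file! To write a list of strings to a file, use <b>writelines()</b>. It does not add newlines for you, so each string should already end with <b>\n</b> if you want it on its own line.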

In [None]:
my_file = open('test.txt','w')

# Earlier in this notebook we saved the contents of our file to a variable 'contents'.
my_file.writelines(contents)

my_file.close()
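Despite its name, <b>writelines()</b> writes the strings exactly as given, without inserting line breaks. The sketch below illustrates this with a made-up list and a hypothetical scratch file (<b>demo.txt</b>); none of these names come from the notebook.

```python
import os
import tempfile

# Hypothetical scratch file so this sketch does not touch 'test.txt'.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
words = ['one', 'two', 'three']

# Without newline characters, everything lands on a single line.
with open(demo_path, 'w') as f:
    f.writelines(words)
with open(demo_path, 'r') as f:
    joined = f.readlines()

# Adding '\n' to each string puts each one on its own line.
with open(demo_path, 'w') as f:
    f.writelines([word + '\n' for word in words])
with open(demo_path, 'r') as f:
    separated = f.readlines()

print(joined)
print(separated)
```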

<h2>8.3 Processing File Contents</h2>

As you might imagine, the contents of files can be pretty unwieldy. Luckily, there are a lot of methods that will make data easier to read!

Sometimes (as with .csv files) you'll want to take a string and break it into a list using a particular separator. <b>split()</b> is a useful tool for this!

In [None]:
# Let's create a single string that has three pieces of data in it.
a = '3.4 2.1 -2.6'

# The obvious choice for a separator is a space.
print(a.split(' '))

# You might also run into comma-separated strings.
a = '3.4,2.1,-2.6'

print(a.split(','))
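Two details are worth a quick sketch, using made-up strings: splitting on a single space produces empty strings when values are separated by several spaces, while <b>split()</b> with no argument splits on any run of whitespace; and a line read from a file usually still carries its newline, which <b>strip()</b> removes before splitting.

```python
# Values separated by an uneven number of spaces.
spaced = '3.4  2.1   -2.6'

# Splitting on exactly one space leaves empty strings behind.
by_single_space = spaced.split(' ')

# With no argument, split() treats any run of whitespace as one separator.
by_whitespace = spaced.split()

# A line as it might come from readline(), newline included.
csv_line = '3.4,2.1,-2.6\n'
unstripped = csv_line.split(',')
stripped = csv_line.strip().split(',')

print(by_single_space)
print(by_whitespace)
print(unstripped)
print(stripped)
```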

If everything we read from a file is a string, we're sometimes going to have to convert to integers or floats.

In [None]:
# We'll need NumPy for this!
import numpy as np

# Let's look at a typical situation: we've grabbed some numbers from a csv file.
a = '3.4,2.1,-2.6'
a = a.split(',')

# Note that these are still strings.
print(a)

In [None]:
# We can convert these to floats the way we did before!
anum = np.zeros(len(a))
for i in range(len(a)):
    anum[i] = float(a[i])

print(anum)
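The same conversion can be written more compactly as a list comprehension. This is only a sketch of one alternative; it produces a plain Python list rather than a NumPy array, and the name <b>anum_list</b> is made up for the example.

```python
# Start from the same list of strings as above.
a = '3.4,2.1,-2.6'.split(',')

# Convert each string to a float in a single expression.
anum_list = [float(item) for item in a]

print(anum_list)
```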

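Alternatively, we can convert the list to an array and use the array's <b>astype()</b> method. <b>'d'</b> is a float (double-precision), and <b>'l'</b> is an integer (long integer). Converting strings with <b>'l'</b> only works when every string looks like an integer; a string such as '3.4' raises an error.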

In [None]:
bnum = np.array(a).astype('d')
print(bnum)
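Putting the pieces of this section together, the sketch below reads a small comma-separated file and turns it into a 2-D array of floats. The file <b>demo_data.csv</b> and its contents are invented for the illustration and written in the same cell so the example stands alone.

```python
import os
import tempfile

import numpy as np

# Hypothetical data file with two rows of three values each.
data_path = os.path.join(tempfile.mkdtemp(), 'demo_data.csv')
with open(data_path, 'w') as f:
    f.writelines(['3.4,2.1,-2.6\n', '1.0,0.5,4.2\n'])

# Read every line into a list of strings.
with open(data_path, 'r') as f:
    lines = f.readlines()

# Strip the newline from each line, then split on commas.
rows = [line.strip().split(',') for line in lines]

# Convert the nested list of strings to a float array.
data = np.array(rows).astype('d')

print(data)
print(data.shape)
```

This approach assumes every row has the same number of values; a ragged file would need extra handling before the conversion to an array.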

<h2>8.4 Take-Home Points</h2>
<ul>
    <li>The <b>open()</b> function lets you open a file in read, write, or append mode.</li>
    <li>Files should always be closed using the <b>close()</b> method.</li>
    <li>You can read a single line with <b>readline()</b>, and all lines into a list with <b>readlines()</b>.</li>
    <li>The <b>write()</b> method allows you to write a single string, and the <b>writelines()</b> method allows you to write a list of strings; neither adds newlines for you.</li>
    <li><b>split()</b> lets you break strings based on defined separators.</li>
</ul>