# Review on Python Files
Like any other modern high-level programming language, Python provides developers with built-in functions and methods that allow them to read, append to, and write to files. Without further ado, let's dive in and test them out!

## Open Method
Since we do not have a file to work with yet, it will be a good exercise to bring one into existence, though only temporarily on Google Colab, and write content to it using Python. Let's see how we do this.

In [1]:
# Name your file
filename = "america.txt"
# Create a file object with the built-in open() function,
# which takes two args: 1. file name 2. mode
file_obj = open(filename, 'w')

## Write Method
Above, we have created a file handler which we will use to write to our file. Although we could have passed the file name string directly as the first argument to open(), storing it in a variable is cleaner and better for code maintainability, as we may want to change the name of our file in the future. You may notice the second argument and wonder what it means. It is the mode, which tells our file handler what we want to do with our file. In summary, we can tell it to read our file's content ('**r**'), append to it ('**a**'), or write to it ('**w**'). There is also an option to both read and write: '**r+**'. Explore other [modes](https://docs.python.org/3/library/functions.html#open) in your free time! One important note about modes: write mode differs from append mode in that it will overwrite whatever you may have had within your file. Be careful when you write to your file; know what is in your file and whether you want to start anew with it! With that said, let's write to our file!
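
To make the difference between write mode and append mode concrete, here is a minimal sketch. It uses a scratch file in a temporary directory (the name `modes_demo.txt` is made up for this illustration) so that `america.txt` is left untouched.

```python
import os
import tempfile

# Hypothetical scratch file, kept separate from america.txt
demo_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'w' creates the file (or empties it if it already exists)
with open(demo_path, "w") as f:
  f.write("first\n")

# 'a' keeps the existing content and adds to the end
with open(demo_path, "a") as f:
  f.write("second\n")

with open(demo_path, "r") as f:
  after_append = f.read()

# Opening with 'w' again discards everything written so far
with open(demo_path, "w") as f:
  f.write("third\n")

with open(demo_path, "r") as f:
  after_overwrite = f.read()

print(repr(after_append))     # 'first\nsecond\n'
print(repr(after_overwrite))  # 'third\n'
```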





In [2]:
# Write to file
file_obj.write('''    America
Centre of equal daughters, equal sons,
All, all alike endear'd, grown, ungrown, young or old,
Strong, ample, fair, enduring, capable, rich,
Perennial with the Earth, with Freedom, Law and Love,
A grand, sane, towering, seated Mother,
Chair'd in the adamant of Time.
    -Walt Whitman-''')
# Close file object
file_obj.close()

## Close Method
It is always good practice to close your file whenever you are done with it, and you can do that with close(). Or, if you never want to forget to close it, you can try this cool trick: use the _with_ keyword when working with file objects, as shown below:

In [3]:
with open('funny.txt', 'w') as f:
  f.write("A friend once told me, 'When one door closes, another one opens.' Great guy. Terrible cabinet maker.\nKnock knock. Who's there? Tank. Tank who? You're welcome.")

With that, you have freed up the resource you were using, just as you did with _file_obj.close()_. Any attempt to use the file object again after closing will raise a **ValueError**, which lets you know that you cannot do any I/O operation on a closed file.
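
The **ValueError** can be observed directly. The following is a small sketch that writes to a scratch file (the name `closed_demo.txt` is an assumption made for this example) and then tries to use the file object after the _with_ block has ended.

```python
import os
import tempfile

# Hypothetical scratch file for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "closed_demo.txt")

with open(demo_path, "w") as f:
  f.write("hello")

# The with block has ended, so the file is already closed
print(f.closed)  # True

try:
  f.write("more")
  error_message = None
except ValueError as err:
  # Python refuses any I/O on a closed file
  error_message = str(err)

print(error_message)
```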

## Text Mode vs. Binary Mode
So far, we have only written text to our files, and we will open files in _text mode_ most of the time (it is the default). However, you should know that there is other data besides text, and, to work with files containing such data, we can append **'b'** to the mode (discussed above), telling our program to open the file in _binary mode_. This will come in handy in our upcoming demo, but here's an example of how to combine modes!

In [4]:
# Show how to open file in read and binary mode
with open('america.txt', 'rb') as f:
  # do something interesting here!
  data = f.read()
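
To see what the **'b'** flag changes, the sketch below reads the same scratch file twice, once in text mode and once in binary mode. The file name and its content are chosen only for illustration, and UTF-8 is assumed as the encoding.

```python
import os
import tempfile

# Hypothetical scratch file for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "binary_demo.txt")

with open(demo_path, "w", encoding="utf-8") as f:
  f.write("Law and Love")

# Text mode returns a str
with open(demo_path, "r", encoding="utf-8") as f:
  text_data = f.read()

# Binary mode returns a bytes object
with open(demo_path, "rb") as f:
  binary_data = f.read()

print(type(text_data), text_data)      # <class 'str'> Law and Love
print(type(binary_data), binary_data)  # <class 'bytes'> b'Law and Love'
# Decoding the bytes gives back the same text
print(binary_data.decode("utf-8") == text_data)  # True
```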

## Read Method
So far we have only discussed the write method of a *file* object. We can also read from our file. Let's see how to do that!

In [5]:
with open('funny.txt', 'r') as f:
  f.read()

When we execute our cell, we don't see any result because the read method returns data instead of outputting it. So let's save it into a variable and print it out.

In [6]:
with open('funny.txt', 'r') as f:
  content = f.read()
  print(content)

A friend once told me, 'When one door closes, another one opens.' Great guy. Terrible cabinet maker.
Knock knock. Who's there? Tank. Tank who? You're welcome.


There we go! We see what is in our text file. We also notice that the read method returns the whole file content, but we can tell it to return a specific amount of text by giving it a _size_ number:

In [7]:
with open('funny.txt', 'r') as f:
  content = f.read(8) # size is a quantity of data
  print(content) # content can be either string or bytes object

A friend


Here, _size_ is the number of characters to read in text mode (in binary mode, it is the number of bytes). Because we pass 8 as our argument to the read method, we get back a string of 8 characters. There are other [ways](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects) to read your files too! But now let's turn our attention to two other methods that will be essential in our demo later.
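
As a quick taste of those other ways, here is a sketch of `readline()`, `readlines()`, and looping over the file object itself. The scratch file and its three lines are made up for this example.

```python
import os
import tempfile

# Hypothetical scratch file with three short lines
demo_path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
with open(demo_path, "w") as f:
  f.write("line one\nline two\nline three\n")

with open(demo_path, "r") as f:
  # readline() returns one line, including its newline character
  first = f.readline()
  # readlines() returns the remaining lines as a list
  rest = f.readlines()

with open(demo_path, "r") as f:
  # A file object can be looped over line by line
  stripped = [line.rstrip("\n") for line in f]

print(repr(first))  # 'line one\n'
print(rest)         # ['line two\n', 'line three\n']
print(stripped)     # ['line one', 'line two', 'line three']
```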

## Seek & Tell 
These two methods are powerful because they give us free rein to move around our file and examine the data. In binary mode, the tell method returns an integer, the number of bytes from the start of the file, which indicates the file object's current position. Meanwhile, the seek method moves that position relative to a reference point of our choosing: the beginning of the file, the current file position, or the end of the file. All we need to do is specify the _offset_ and _whence_ arguments of the seek method. We don't have to specify _whence_ if we are measuring from the beginning of the file (_whence_ is 0 by default, with 1 being the current file position and 2 being the end of the file). Note that the offset for moving backwards is always a negative value, and that we must open our file in binary mode to seek with a nonzero offset from anywhere other than the beginning of the file. For more information, check out the [docs](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects) for methods of file objects!

In [8]:
with open(filename, 'rb') as f:
  # Go to position 12 in text
  f.seek(12)
  # Read in 6 characters
  data = f.read(6)
  # Display data
  print("Display what we read in binary:")
  print(data)
  # Tell me where we are currently at
  print(f'Where we are inside our file: position {f.tell()}\n') # position 18
  # From current position go 10 characters ahead
  f.seek(10, 1)
  # Read in 9 characters
  data = f.read(9)
  # Display data
  print("Display what we read in binary:")
  print(data)
  # Tell me where we are currently at
  print(f'Where we are inside our file: position {f.tell()}\n') # position 37
  # Go to position 8 from end of file
  # in a backwards fashion
  f.seek(-8, 2)
  # Read in 8 characters
  data = f.read(8)
  # Display data
  print("Display what we read in binary:")
  print(data)
  # Tell me where we are currently at
  print(f'Where we are inside our file: position {f.tell()}') # position 296

Display what we read in binary:
b'Centre'
Where we are inside our file: position 18

Display what we read in binary:
b'daughters'
Where we are inside our file: position 37

Display what we read in binary:
b'Whitman-'
Where we are inside our file: position 296
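
The binary-mode requirement can be checked with a short experiment. This sketch assumes a scratch file with made-up content; in text mode, a nonzero seek from the end of the file is rejected with `io.UnsupportedOperation`, while the same call works in binary mode.

```python
import io
import os
import tempfile

# Hypothetical scratch file for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")
with open(demo_path, "w") as f:
  f.write("Strong, ample, fair")

with open(demo_path, "r") as f:
  # A zero offset from the end is still allowed in text mode
  f.seek(0, 2)
  try:
    # A nonzero offset from the end is not
    f.seek(-4, 2)
    text_mode_error = None
  except io.UnsupportedOperation as err:
    text_mode_error = err

with open(demo_path, "rb") as f:
  # The same seek is fine in binary mode
  f.seek(-4, 2)
  tail = f.read()

print(text_mode_error)
print(tail)  # b'fair'
```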


# It's Hacking Time!
Our problem is as follows: Write a program that reads a file and writes out a new file with the lines in reversed order; in other words, the first line in the old file becomes the last one in the new file.  
Let's create milestones of achievement for this assignment:
  1. Open existing file in read and binary mode
  2. Bring the reference point to the end of file, read and display the last character
  3. This time, read and display a group of characters. Try to get "Whitman-" from the file.
  4. This time, read and display the last line of file
  5. This time, display the last two lines of file in reverse
  6. While reading file, write the last two lines in reverse, which must not be concatenated to each other, to a new file
  7. Update step 6 to read each line of the existing file and write it to the new file  

In the cells below, we will be using `america.txt` referenced by variable `filename`. As you go along with these exercises, remember to make a copy of this Colab notebook and create new code cells to test your coding knowledge. **NOTE**: Run all cells above to avoid runtime errors.  
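
For checking your final result later, it may help to have a simple baseline to compare against. The sketch below is not the seek-based approach that the milestones build toward: it reads the whole file into memory and reverses the list of lines, which assumes the file is small. The helper name `reverse_lines_in_memory` and the sample file are made up for this illustration.

```python
import os
import tempfile

def reverse_lines_in_memory(src_path, dst_path):
  """Hypothetical helper: write the lines of src_path to dst_path in reverse order."""
  with open(src_path, "r") as fread:
    # splitlines() drops the line endings, so it does not matter
    # whether the last line ends with a newline
    lines = fread.read().splitlines()
  with open(dst_path, "w") as fwrite:
    fwrite.write("\n".join(reversed(lines)))
  return len(lines)

# Try it on a small sample file
work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, "sample.txt")
dst = os.path.join(work_dir, "sample_reversed.txt")
with open(src, "w") as f:
  f.write("one\ntwo\nthree")

count = reverse_lines_in_memory(src, dst)
with open(dst, "r") as f:
  reversed_text = f.read()

print(count)                # 3
print(repr(reversed_text))  # 'three\ntwo\none'
```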
  

## Milestone #1

In [9]:
#@title Exercise 1
# Let's tackle milestone 1
# First accomplishment
with open(filename, 'rb') as fread:
  #TODO
  pass

## Milestone #2

In [10]:
#@title Exercise 2
# Second accomplishment
with open(filename, 'rb') as fread:
  # bring the reference point to 
  # end of file, situating at last character
  fread.seek(-1, 2)
  # read in one character
  one_char = fread.read(1)
  # display result
  print(one_char) 

b'-'


## Milestone #3

In [11]:
#@title Exercise 3
# Third accomplishment
with open(filename, 'rb') as fread:
  # seek 8 characters away 
  # from end of file
  fread.seek(-8, 2)
  # read in 8 characters
  group_char = fread.read(8)
  # display result
  print(group_char) 

b'Whitman-'


## Milestone #4
There are a few things you may want to keep in mind:  
  1. We need a programming construct that helps us move through the file content character by character. NOTE: Even though we use the word character, what we are actually looking at is a bytes object, or sequence of bytes. In our case, we want to look at a bytes object that holds only one character.
  2. With the advice above, we will need a way to track where we are in the file. Hint: use a variable.
  3. Be careful when you update your tracker variable. Re-read the section on the seek method in **Review on Python Files** if you run into problems.
  4. Also be sure to know what argument the read method accepts. Review the section on the read method in **Review on Python Files** if you need to.
  5. If you run into an Invalid Argument error, think back to point #1. Ask yourself, what have you done wrong? Investigate your tracker variable. Does it stop where you think it should?
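
The Invalid Argument error from hint 5 typically shows up when the tracker moves past the beginning of the file. Here is a minimal sketch that triggers it on a three-byte scratch file (the file name and content are assumptions for this example).

```python
import os
import tempfile

# Hypothetical three-byte scratch file
demo_path = os.path.join(tempfile.mkdtemp(), "tiny.txt")
with open(demo_path, "wb") as f:
  f.write(b"abc")

with open(demo_path, "rb") as f:
  # Three bytes back from the end is exactly the start of the file
  f.seek(-3, 2)
  start_pos = f.tell()
  try:
    # One byte further would land before the start of the file
    f.seek(-4, 2)
    seek_error = None
  except OSError as err:
    seek_error = err

print(start_pos)  # 0
print(seek_error)
```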

In [12]:
#@title Exercise 4
# Fourth accomplishment
with open(filename, 'rb') as fread:
  # position of our reference pointer
  current_pos = 0
  char = None
  while char != b'\n':
    #print(current_pos) # debugging code
    # seek the file at current_pos
    fread.seek(current_pos, 2)
    # read in data 
    char = fread.read(1)
    # look at next char by moving 
    # our position to the left by one
    current_pos -= 1
  # read in the last line
  last_line = fread.read(-current_pos) # size must be positive; it overshoots by 2, but read() stops at end of file
  # display the last line
  print(last_line)


b'    -Walt Whitman-'


Hopefully, you have gotten to the point where you can produce the last line by running your code cell. If you have noticed that, across the last three milestones, we repeatedly specify 1 when reading in one character and 2 when seeking from the end of the file, you are not alone! Here is a piece of advice that has been passed down by many generations of great developers, and it is a good one: Don't Repeat Yourself (DRY). So, how do we apply this principle in our code? Since these literal values are the same in all three milestones, we can store them in named constants.

In [13]:
#@title Updated Version of Ex. 4
ONE_CHAR = 1 
END_OF_FILE = 2

with open(filename, 'rb') as fread:
  # position of our reference pointer
  current_pos = 0
  char = None
  while char != b'\n':
    #print(current_pos) # debugging code
    # seek the file at current_pos
    fread.seek(current_pos, END_OF_FILE)
    # read in data 
    char = fread.read(ONE_CHAR)
    # look at next char by moving 
    # our position to the left by one
    current_pos -= 1
  # read in the last line
  last_line = fread.read(-current_pos) # size must be positive; it overshoots by 2, but read() stops at end of file
  # display the last line
  print(last_line)

b'    -Walt Whitman-'
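
As a side note, the standard library already ships named constants for the _whence_ values, so a constant like `END_OF_FILE` can also be spelled `os.SEEK_END`. The sketch below shows them in use on a scratch file whose name and content are made up for this example.

```python
import os
import tempfile

# The three whence values as named constants
print(os.SEEK_SET, os.SEEK_CUR, os.SEEK_END)  # 0 1 2

# Hypothetical scratch file for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "whence_demo.txt")
with open(demo_path, "wb") as f:
  f.write(b"Walt Whitman")

with open(demo_path, "rb") as f:
  # Same as f.seek(-7, 2)
  f.seek(-7, os.SEEK_END)
  surname = f.read()

print(surname)  # b'Whitman'
```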


## Milestone #5

In [14]:
#@title Exercise 5
# Fifth accomplishment

ONE_CHAR = 1 
END_OF_FILE = 2

with open(filename, 'rb') as fread:
  # position of our reference pointer
  current_pos = 0
  # track position at start of line
  start_line_pos = 0
  # track position at end of line
  end_line_pos = 0
  # update char with current character
  char = None

  # Loop until we see ',' (comma): the third-to-last line
  # ends with one, so we stop after the last two lines
  while char != b',': 
    # seek from end of file
    fread.seek(current_pos, END_OF_FILE)
    # read in character at current_pos           
    char = fread.read(ONE_CHAR)
    # Go into if block if we see '\n'
    if char == b'\n':
      # update tracker start_line_pos
      start_line_pos = -current_pos
      # read in the line
      # what's passed into read is the size of the line
      line = fread.read(start_line_pos - end_line_pos)
      # display current line
      print(line)
      # update tracker end_line_pos
      # which is now at start_line_pos
      end_line_pos = start_line_pos
    # move reference point position to the left by one
    current_pos -= 1

b'    -Walt Whitman-'
b"Chair'd in the adamant of Time.\n"


## Milestone #6

In [15]:
#@title Exercise 6
# Sixth accomplishment

ONE_CHAR = 1 
END_OF_FILE = 2

# Open two files whose modes may or may not be the same
# We will open one file for reading, & the other for writing
with open(filename, 'rb') as fread, open('rev_doc.txt', 'w') as fwrite:
  # current position of our reference pointer
  current_pos = 0
  # tracking position for start of a line
  start_line_pos = 0
  # tracking position for end of a line
  end_line_pos = 0
  # current character that we are looking at
  char = None
  # track how many lines have been read
  line_number = 0 

  # while current character is not a comma (the third-to-last
  # line ends with one, so we stop after the last two lines)
  while char != b',':
    # seek from end of file
    fread.seek(current_pos, END_OF_FILE)
    # read in character at current position
    char = fread.read(ONE_CHAR)
    # if current character is a newline character
    if char == b'\n':
      # track position for start of a line
      start_line_pos = -current_pos
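      # read the line as a bytes object, then decode it so that
      # it can be written to a file opened in text mode
      bytes_char = fread.read(start_line_pos-end_line_pos)
      line = bytes_char.decode()
      # write the line to the new file; the last line of the
      # original file has no newline character, so add one
      if line_number == 0: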
        fwrite.write(line + '\n')
      else:
        fwrite.write(line)
      # track position for end of a line, which 
      # will be at the position of start of a line
      end_line_pos = start_line_pos
      # increment line_number
      line_number += 1
    # decrement current position (move left) on every iteration
    current_pos -= 1
  print(f"Number of lines that we have written to the new file is {line_number}.")

Number of lines that we have written to the new file is 2.
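
The call to `decode()` is needed because we read bytes but write to a file opened in text mode. This small sketch, using a scratch file with a made-up name, shows what happens without it: a text-mode file rejects a bytes object with a **TypeError**.

```python
import os
import tempfile

# Hypothetical scratch file for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "decode_demo.txt")

raw_line = b"Chair'd in the adamant of Time.\n"
# decode() turns bytes into str (UTF-8 by default)
line = raw_line.decode()

with open(demo_path, "w") as fwrite:
  try:
    # A file opened in text mode only accepts str
    fwrite.write(raw_line)
    write_error = None
  except TypeError as err:
    write_error = err
  # The decoded string is written without complaint
  fwrite.write(line)

print(type(line))  # <class 'str'>
print(write_error)
```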


## Milestone #7

In [30]:
#@title Exercise 7
# Seventh Accomplishment

ONE_CHAR = 1 
END_OF_FILE = 2
RES_FILE = 'result.txt'

# Open two files whose modes may or may not be the same
# We will open one file for reading, & the other for writing
with open(filename, 'rb') as fread, open(RES_FILE, 'w') as fwrite:
  # track current position of our reference pointer
  current_pos = 0
  # track position of character with tell method
  tell_pos = None
  # track start of the line position
  start_line_pos = 0 
  # track end of the line position
  end_line_pos = 0
  # number of lines that have been processed
  num_of_lines = 0
  # track current character
  char = None

  # continue if tell does not return 0
  while tell_pos != 0:
    # seek from end of file with current position
    fread.seek(current_pos, END_OF_FILE)
    # read in a character
    char = fread.read(ONE_CHAR)

    # if character is a newline character
    if char == b'\n':
      # update start of the line position
      start_line_pos = -current_pos
      # read in the line as a bytes object containing characters
      bytes_char = fread.read(start_line_pos-end_line_pos)
      # decode bytes object
      line = bytes_char.decode()
      # write the line to a new file
      if num_of_lines == 0: # if we see last line for first time
        fwrite.write(line + '\n') # add newline char
      else:
        fwrite.write(line)
      # update end of the line position
      end_line_pos = start_line_pos
      # update number of processed lines
      num_of_lines += 1 

    # update current position
    current_pos -= 1
    # We seek to the next position (to the left)
    # This seek is important: it lets us know if we
    # have reached 0 when we call the tell method below
    fread.seek(current_pos, END_OF_FILE)
    # update character position indicated by tell method
    tell_pos = fread.tell()
  
  # the first line of the file has not been written yet
  # update start of the line position
  start_line_pos = -current_pos
  # read in the bytes characters
  bytes_char = fread.read(start_line_pos-end_line_pos)
  # decode the bytes object
  first_line = bytes_char.decode()
  # write the first line of the file to the new file
  fwrite.write(first_line)
  # update number of processed lines
  num_of_lines += 1 

  print(f'Number of processed lines is {num_of_lines}.')

Number of processed lines is 8.
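
Once the exercise works, the same idea can be packaged as a reusable function. The following is one possible variant, not the only way to do it: it tracks absolute positions instead of negative offsets, and it writes in binary mode so that no decoding is needed. The name `reverse_lines` and the sample file are assumptions, and, like `america.txt`, the input is assumed not to end with a newline character.

```python
import os
import tempfile

def reverse_lines(src_path, dst_path):
  """Hypothetical helper: copy lines to dst_path in reverse order,
  scanning src_path backward one byte at a time."""
  num_of_lines = 0
  with open(src_path, "rb") as fread, open(dst_path, "wb") as fwrite:
    # os.SEEK_END is the standard name for whence value 2
    fread.seek(0, os.SEEK_END)
    # absolute position just past the end of the current line
    line_end = fread.tell()
    pos = line_end
    while pos > 0:
      pos -= 1
      fread.seek(pos)
      if fread.read(1) == b"\n":
        # the line starts right after this newline character
        fwrite.write(fread.read(line_end - pos - 1))
        fwrite.write(b"\n")
        line_end = pos
        num_of_lines += 1
    # whatever precedes the first newline is the first line
    fread.seek(0)
    fwrite.write(fread.read(line_end))
    num_of_lines += 1
  return num_of_lines

# Try it on a small sample file
work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, "sample.txt")
dst = os.path.join(work_dir, "sample_reversed.txt")
with open(src, "wb") as f:
  f.write(b"one\ntwo\nthree")

processed = reverse_lines(src, dst)
with open(dst, "rb") as f:
  reversed_bytes = f.read()

print(processed)       # 3
print(reversed_bytes)  # b'three\ntwo\none'
```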
