### Writing Files

<p>We can open a file object and use the method <code>write()</code> to save text to a file. To write to a file, the mode argument must be set to <b>w</b>. Let’s write a file <b>example2.txt</b> with the line: <b>"This is line A."</b></p>

In [1]:
import os

dir_name = os.path.join(".", "data")
os.makedirs(dir_name, exist_ok=True)

In [2]:
example = os.path.join(dir_name, "example2.txt")
with open(example, "w") as file:
    file.write("This is line A.")

<p>We can read the file to see if it worked:</p>

In [3]:
with open(example, "r") as file:
    print(file.read())

This is line A.


<p>We can write multiple lines:</p>

In [4]:
with open(example, "w") as file:
    file.write("This is line A. \n")
    file.write("This is line B. \n")

<p>The method <code>write()</code> writes the string it is given at the current position in the file. Unlike <code>print()</code>, it does not add a newline, which is why each string above ends with <code>\n</code>. You can check the file to see if your results are correct.</p>

In [5]:
with open(example, "r") as file:
    print(file.read())

This is line A. 
This is line B. 
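
<p>As a minimal sketch of how <code>write()</code> treats newlines, the cell below writes two strings to a scratch file and reads them back. The file name <code>scratch.txt</code> and the temporary directory are assumptions made for illustration; they are not part of the lab's data folder.</p>

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the lab data).
scratch = os.path.join(tempfile.mkdtemp(), "scratch.txt")

with open(scratch, "w") as file:
    # write() returns the number of characters written and adds no newline.
    first = file.write("No newline here, ")
    second = file.write("so this stays on the same line.\n")

with open(scratch, "r") as file:
    content = file.read()

print(first, second)
print(repr(content))
```

<p>Both strings end up on a single line because only the second one contains <code>\n</code>.</p>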



<p>We write a list to a <b>txt</b> file as follows:</p>

In [6]:
lines = ["This is line 1. \n", "This is line 2. \n", "This is line 3. \n"]
lines

['This is line 1. \n', 'This is line 2. \n', 'This is line 3. \n']

In [7]:
with open(example, "w") as file:
    for line in lines:
        print(line)
        file.write(line)

This is line 1. 

This is line 2. 

This is line 3. 



<p>We can verify the file is written by reading it and printing out the values:</p>

In [8]:
with open(example, "r") as file:
    print(file.read())

This is line 1. 
This is line 2. 
This is line 3. 
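
<p>The loop above can also be replaced by a single call to <code>writelines()</code>. The sketch below assumes a scratch file named <code>lines.txt</code> in a temporary directory, so it does not touch <b>example2.txt</b>.</p>

```python
import os
import tempfile

# Hypothetical scratch file, separate from example2.txt.
scratch = os.path.join(tempfile.mkdtemp(), "lines.txt")
lines = ["This is line 1. \n", "This is line 2. \n", "This is line 3. \n"]

with open(scratch, "w") as file:
    # writelines() writes each string in turn; it does not add newlines itself.
    file.writelines(lines)

with open(scratch, "r") as file:
    read_back = file.readlines()

print(read_back == lines)
```

<p>Like <code>write()</code>, <code>writelines()</code> relies on the strings already carrying their own <code>\n</code>.</p>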



<p>However, note that setting the mode to <b>w</b> overwrites all the existing data in the file.</p>

In [9]:
with open(example, "w") as file:
    file.write("The contents are overwritten.")

with open(example, "r") as file:
    print(file.read())

The contents are overwritten.
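
<p>If overwriting by accident is a concern, one possible safeguard is the exclusive-creation mode <b>x</b>, which refuses to open a file that already exists. This mode is not used elsewhere in this lab; the sketch below, with its assumed scratch file <code>keep_me.txt</code>, is only an illustration.</p>

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
scratch = os.path.join(tempfile.mkdtemp(), "keep_me.txt")

with open(scratch, "w") as file:
    file.write("Important contents.")

try:
    # "x" creates a new file and fails if the path already exists.
    with open(scratch, "x") as file:
        file.write("This would have replaced the contents.")
    overwritten = True
except FileExistsError:
    overwritten = False
    print("File already exists, nothing was overwritten.")

with open(scratch, "r") as file:
    content = file.read()
print(content)
```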


### Appending Files

<p>We can write to a file without losing any of the existing data by setting the mode argument to append: <b>a</b>. You can append new lines as follows:</p>

In [10]:
with open(example, "w") as file:
    file.write("This is line A. \n")
    file.write("This is line B. \n")
    file.write("This is line C. \n")

In [11]:
with open(example, "a") as file:
    file.write("This is line D. \n")
    file.write("This is line E. \n")
    file.write("This is line F. \n")

<p>You can verify the file has changed by running the following cell:</p>

In [12]:
with open(example, "r") as file:
    print(file.read())

This is line A. 
This is line B. 
This is line C. 
This is line D. 
This is line E. 
This is line F. 
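
<p>Appending is a natural fit for log-style files. The sketch below wraps it in a small helper; <code>append_line</code> and the file <code>log.txt</code> are hypothetical names introduced here, not part of the lab.</p>

```python
import os
import tempfile


def append_line(path: str, text: str) -> None:
    """Append text as one line (hypothetical helper, not part of the lab)."""
    if not text.endswith("\n"):
        text += "\n"
    # Mode "a" creates the file if it is missing and keeps existing data.
    with open(path, "a") as file:
        file.write(text)


log_path = os.path.join(tempfile.mkdtemp(), "log.txt")  # assumed scratch path
append_line(log_path, "first entry")
append_line(log_path, "second entry\n")

with open(log_path, "r") as file:
    entries = file.read().splitlines()
print(entries)
```

<p>The first call creates the file, and the second call adds to it without disturbing the first entry.</p>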



### Additional modes

<p>It's fairly inefficient to open the file in <b>a</b> or <b>w</b> and then reopen it in <b>r</b> to read any lines. Luckily we can access the file in the following modes:</p>

<ul>
    <li><b>r+</b>: Reading and writing. Does not truncate the file on opening; the file must already exist.</li>
    <li><b>w+</b>: Writing and reading. Truncates the file on opening and creates it if none exists.</li>
    <li><b>a+</b>: Appending and reading. Creates a new file if none exists.</li>
</ul>

<p>You don't have to dwell on the specifics of each mode for this lab.</p>

<p>Let's try out the <b>a+</b> mode:</p>

In [13]:
with open(example, "a+") as file:
    file.write("This is line G. \n")
    print(file.read())




<p>There were no errors, but <code>read()</code> did not output anything either. This is because of our location in the file: after the write we are at the very end, so there is nothing left to read.</p>

<p>Most of the file methods we've looked at work at a certain location in the file. <code>write()</code> writes at that location, <code>read()</code> reads from it, and so on. You can think of this as moving your cursor around in a text editor to make changes at a specific location.</p>

<p>Opening the file in <b>w</b> is akin to opening the txt file, moving your cursor to the beginning of the text file, writing new text and deleting everything that follows.</p>

<p>Opening the file in <b>a</b>, by contrast, is similar to opening the txt file, moving your cursor to the very end and then adding the new pieces of text.</p>

<p>It is often very useful to know where the "cursor" is in a file and be able to control it. The following methods allow us to do precisely this.</p>

<ul>
    <li><code>tell()</code> returns the current position in bytes.</li>
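    <li><code>seek(offset, whence)</code> changes the position by "offset" bytes with respect to "whence". <b>whence</b> can take the value 0, 1, or 2, corresponding to the beginning of the file, the current position, and the end of the file; it defaults to 0. In text mode, only seeks relative to the beginning are allowed, apart from <code>seek(0, 2)</code> to jump to the end.</li>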
</ul>
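
<p>To illustrate the three reference points that <code>seek()</code> accepts, here is a minimal sketch on a file opened in binary mode, where offsets from the current position and from the end are permitted. The file <code>digits.bin</code> is a hypothetical scratch file, and the constants <code>os.SEEK_SET</code>, <code>os.SEEK_CUR</code> and <code>os.SEEK_END</code> are the named equivalents of 0, 1 and 2.</p>

```python
import os
import tempfile

# Hypothetical scratch file holding ten known bytes.
scratch = os.path.join(tempfile.mkdtemp(), "digits.bin")
with open(scratch, "wb") as file:
    file.write(b"0123456789")

with open(scratch, "rb") as file:
    file.seek(3, os.SEEK_SET)    # 3 bytes from the beginning (whence 0)
    from_start = file.read(2)    # reads the bytes at positions 3 and 4
    file.seek(2, os.SEEK_CUR)    # skip 2 bytes from the current position (whence 1)
    from_current = file.read(1)
    file.seek(-3, os.SEEK_END)   # 3 bytes before the end (whence 2)
    from_end = file.read()
    final_position = file.tell()

print(from_start, from_current, from_end, final_position)
```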

<p>Now let's revisit <b>a+</b>:</p>

In [14]:
with open(example, "a+") as file:
    print(f"Initial location is {file.tell()}.")

    data = file.read()
    if not data:
        print("Read nothing.")
    else:
        print(data)

    print(f"Location after reading is {file.tell()}.")

    file.seek(0, 0) # move 0 bytes from beginning.

    print(f"New location is {file.tell()}.")
    data = file.read()
    if not data:
        print("Read nothing.")
    else:
        print(data)

    print(f"Location after reading is {file.tell()}.")

Initial location is 126.
Read nothing.
Location after reading is 126.
New location is 0.
This is line A. 
This is line B. 
This is line C. 
This is line D. 
This is line E. 
This is line F. 
This is line G. 

Location after reading is 126.
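
<p>Note that <code>seek()</code> affects where an <b>a+</b> file is read from, but not necessarily where it is written to. Python's documentation notes that on some Unix systems all writes in append mode go to the end of the file regardless of the current position. The sketch below assumes such a platform and uses a hypothetical scratch file <code>append.txt</code>.</p>

```python
import os
import tempfile

scratch = os.path.join(tempfile.mkdtemp(), "append.txt")  # assumed scratch path
with open(scratch, "w") as file:
    file.write("abc")

with open(scratch, "a+") as file:
    file.seek(0)      # move the position to the beginning
    file.write("X")   # in append mode the write is still sent to the end

with open(scratch, "r") as file:
    content = file.read()
print(content)
```

<p>If you need to overwrite text at the start of a file, <b>r+</b> is the more suitable mode.</p>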


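<p>Finally, a note on the difference between <b>w+</b> and <b>r+</b>. Both of these modes allow access to the read and write methods. However, opening a file in <b>w+</b> overwrites it and deletes all pre-existing data, whereas <b>r+</b> keeps the existing data and overwrites it in place from the current position, so any old text beyond what you write remains:</p>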

In [15]:
with open(example, "r+") as file:
    file.seek(0, 0)
    file.write("Line 1. \n")
    file.write("Line 2. \n")
    file.write("Line 3. \n")
    file.write("Finished.\n")

    file.seek(0, 0)
    print(file.read())

Line 1. 
Line 2. 
Line 3. 
Finished.
is line C. 
This is line D. 
This is line E. 
This is line F. 
This is line G. 



<p>To work with the existing data in a file, use <b>r+</b> or <b>a+</b>. When using <b>r+</b>, it can be useful to call the <code>truncate()</code> method after writing your data. This cuts the file off at the current position, deleting everything that follows.</p>

In [16]:
with open(example, "r+") as file:
    file.seek(0, 0)
    file.write("Line 1. \n")
    file.write("Line 2. \n")
    file.write("Line 3. \n")
    file.write("Finished. \n")

    file.truncate()
    file.seek(0, 0)
    print(file.read())

Line 1. 
Line 2. 
Line 3. 
Finished. 
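
<p>The differences between the three modes at the moment of opening can be summarised with a small experiment. The helper <code>describe_mode</code> below is a hypothetical function written for this sketch: it opens a scratch file containing <code>abc</code> in the given mode, records the starting position, and then reports what is left in the file.</p>

```python
import os
import tempfile


def describe_mode(mode: str) -> tuple:
    """Open a file holding "abc" in the given mode and report what happens.

    Hypothetical helper for illustration; returns the starting position and
    the contents left in the file once it has been opened and closed.
    """
    path = os.path.join(tempfile.mkdtemp(), "modes.txt")
    with open(path, "w") as file:
        file.write("abc")
    with open(path, mode) as file:
        start = file.tell()
    with open(path, "r") as file:
        remaining = file.read()
    return start, remaining


for mode in ("r+", "w+", "a+"):
    print(mode, describe_mode(mode))
```

<p>Only <b>w+</b> empties the file on opening, and only <b>a+</b> starts at the end rather than at position 0.</p>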



### Copy a file

<p>Let's copy the file <b>example2.txt</b> to the file <b>example3.txt</b>:</p>

In [17]:
example2 = example
example3 = os.path.join(dir_name, "example3.txt")
with open(example2, "r") as file1:
    with open(example3, "w") as file2:
        for line in file1:
            file2.write(line)

<p>We can read the file to see if everything works:</p>

In [18]:
with open(example3, "r") as file:
    print(file.read())

Line 1. 
Line 2. 
Line 3. 
Finished. 
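
<p>Copying line by line shows how reading and writing fit together. For everyday use, the standard library's <code>shutil.copyfile()</code> does the same job in one call. The sketch below uses hypothetical file names in a temporary directory rather than the lab's files.</p>

```python
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()                  # assumed scratch directory
source = os.path.join(work_dir, "source.txt")  # hypothetical file names
target = os.path.join(work_dir, "target.txt")

with open(source, "w") as file:
    file.write("Line 1. \nLine 2. \n")

# copyfile() copies the contents of source to target,
# replacing target if it already exists.
shutil.copyfile(source, target)

with open(source, "r") as src, open(target, "r") as dst:
    same = src.read() == dst.read()
print(same)
```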



<p>Besides plain text, we can also write data into files and save them in different file formats such as <code>txt</code>, <code>csv</code>, <code>xls</code> (for Excel files), etc. You will come across these in further examples.</p>
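
<p>As a brief preview of one such format, the sketch below writes a few rows with the standard library's <code>csv</code> module and reads them back. The file name <code>members.csv</code> and the sample rows are made up for illustration.</p>

```python
import csv
import os
import tempfile

csv_path = os.path.join(tempfile.mkdtemp(), "members.csv")  # hypothetical file
rows = [
    ["Membership No", "Date Joined", "Active"],
    ["12345", "2019-3-24", "yes"],  # made-up sample rows
    ["67890", "2016-8-11", "no"],
]

# newline="" lets the csv module control line endings itself.
with open(csv_path, "w", newline="") as file:
    csv.writer(file).writerows(rows)

with open(csv_path, "r", newline="") as file:
    read_back = list(csv.reader(file))
print(read_back)
```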

### Exercise

<p>Your local university's Raptors fan club maintains a register of its active members on a txt document. Every month they update the file by removing the members who are not active. You have been tasked with automating this with your Python skills.</p>

<p>Given the file <code>current</code>, remove each member with a "no" in their Active column. Keep track of each of the removed members and append them to the <code>ex</code> file. Make sure that the format of the original files is preserved. <i>Hint: Do this by reading/writing whole lines and ensuring the header remains.</i></p>

<p>Run the code block below prior to starting the exercise. The skeleton code has been provided for you. Edit only the <code>clean_files</code> function.</p>

In [19]:
from random import randint as rnd

mem_reg = os.path.join(dir_name, "members.txt")
ex_reg = os.path.join(dir_name, "inactive.txt")
fee = ("yes", "no")

def gen_files(current: str, old: str) -> None:
    """Create a register of 20 random members and a register of 3 inactive ones."""
    header = "Membership No  Date Joined  Active  \n"
    row_format = "{:^13}  {:<11}  {:<6} \n"

    with open(current, "w+") as written_file:
        written_file.write(header)
        for _ in range(20):
            date = f"{rnd(2015, 2020)}-{rnd(1, 12)}-{rnd(1, 25)}"
            written_file.write(row_format.format(rnd(10000, 99999), date, fee[rnd(0, 1)]))

    with open(old, "w+") as written_file:
        written_file.write(header)
        for _ in range(3):
            date = f"{rnd(2015, 2020)}-{rnd(1, 12)}-{rnd(1, 25)}"
            written_file.write(row_format.format(rnd(10000, 99999), date, fee[1]))


gen_files(mem_reg, ex_reg)

<p>Now that you've run the prerequisite code cell above, which prepared the files for this exercise, you are ready to move on to the implementation.</p>

#### Exercise: Implement the clean_files function in the code cell below

In [20]:
def clean_files(current: str, ex: str) -> None:
    """Move every inactive member from the current register to the ex register."""
    with open(current, "r+") as cur_file, open(ex, "a+") as ex_file:
        header = cur_file.readline()  # keep the header so it can be written back
        members_data = cur_file.readlines()

        # Empty the current register, then restore the header.
        cur_file.seek(0, 0)
        cur_file.truncate()
        cur_file.write(header)

        for member in members_data:
            if "no" in member:
                ex_file.write(member)  # inactive: append to the ex-members file
            else:
                cur_file.write(member)  # active: keep in the current register
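
<p>The check <code>"no" in member</code> matches the text "no" anywhere in the row, which works here because the other columns only contain digits and dashes. A stricter alternative is to compare the last column only. The helper <code>is_inactive</code> below is a hypothetical sketch that assumes the whitespace-separated layout produced by <code>gen_files</code>.</p>

```python
def is_inactive(row: str) -> bool:
    """Return True when the last column of a register row is "no".

    Hypothetical helper: it assumes the whitespace-separated layout produced
    by gen_files, with the Active flag as the final column.
    """
    columns = row.split()
    return bool(columns) and columns[-1] == "no"


sample_rows = [
    "    78829      2015-5-14    yes    \n",
    "    62961      2016-3-20    no     \n",
    "\n",  # blank line: treated as not inactive
]
flags = [is_inactive(row) for row in sample_rows]
print(flags)
```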

In [21]:
# This cell and the two below it check whether the code is correct
clean_files(mem_reg, ex_reg)

In [22]:
print("Active Members:")
with open(mem_reg, "r") as file:
    print(file.read())

Active Members:
Membership No  Date Joined  Active  
    78829      2015-5-14    yes    
    65848      2015-11-24   yes    
    45327      2016-5-5     yes    
    83040      2020-11-19   yes    
    55406      2016-8-21    yes    
    18274      2020-8-18    yes    
    25600      2019-7-12    yes    
    25513      2020-12-19   yes    
    96235      2019-3-24    yes    
    98097      2016-8-11    yes    



In [23]:
print("Inactive Members:")
with open(ex_reg, "r") as file:
    print(file.read())

Inactive Members:
Membership No  Date Joined  Active  
    62961      2016-3-20    no     
    79549      2015-11-23   no     
    11621      2015-2-9     no     
    47204      2020-1-13    no     
    16179      2018-10-1    no     
    60454      2016-8-10    no     
    83765      2020-10-20   no     
    79654      2016-8-16    no     
    70623      2020-3-14    no     
    20316      2015-7-10    no     
    49004      2020-1-3     no     
    58432      2018-8-5     no     
    58413      2015-1-16    no     



<p>The code cell below verifies your solution. Please do not modify the code; run it as is to test your implementation of <code>clean_files</code>.</p>

In [24]:
def test(status: bool) -> str:
    return "Test Passed" if status else "Test Failed"

test_write = "test_write.txt"
test_append = "test_append.txt"
passed = True

gen_files(test_write, test_append)

with open(test_write, "r") as file:
    og_write = file.readlines() # original write

with open(test_append, "r") as file:
    og_append = file.readlines() # original append

try:
    clean_files(test_write, test_append)
except Exception as e:
    print(f"Error: {e}")
    exit(1)

with open(test_write, "r") as file:
    clean_write = file.readlines()

with open(test_append, "r") as file:
    clean_append = file.readlines()

og_len = len(og_write) + len(og_append)
clean_len = len(clean_write) + len(clean_append)

if og_len != clean_len:
    print("The number of rows does not add up. Make sure your final files have the same header and format.")
    passed = False

for line in clean_write:
    if "no" in line:
        passed = False
        print("Inactive members in the file.")
        break
    else:
        if line not in og_write:
            print("Data in the file does not match original file.")
            passed = False

print(test(passed))

os.remove(test_write)
os.remove(test_append)

Test Passed

