## **Objectives**

After completing this lab you will be able to:

* Write to files using Python libraries

## Writing Files

We can open a file object and use its `write()` method to save text to a file. To write to a file, the mode argument must be set to **w**. Let's write a file **Example2.txt** with the line: **"This is line A"**

In [6]:
# Write line to file

exmp2 = 'C:/Users/matth/OneDrive/Personal/Repositories/Coursera/Python for DataScience_AI_Dev/Module 4/Working with Data in Python/Write and Save Files/Example2.txt'

with open(exmp2, 'w') as writefile:
    writefile.write("This is line A")

We can read the file to see if it worked:

In [7]:
# Read the newly created 'Example2.txt' file

with open(exmp2, 'r') as TestWriteFile:
    print(TestWriteFile.read())

This is line A


We can write multiple lines:

In [8]:
# Write multiple lines to a file

with open(exmp2, 'w') as writefile:
    writefile.write("This is line A\n")
    writefile.write("This is line B\n")

The method `.write()` works similarly to the method `.readline()`, except that instead of reading a line it writes one. Note that `.write()` does not add a line break itself; the `\n` at the end of each string is what ends the line. The process is illustrated in the figure. The different color coding of the grid represents a new line added to the file after each method call.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/images/WriteLine.png" width="500">
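
The following sketch shows what happens when the `\n` is left out, and that `write()` returns the number of characters it wrote. It uses a temporary file under the hypothetical name `demo_path` (not part of the lab) so that `Example2.txt` is left untouched.

```python
import os
import tempfile

# Hypothetical scratch file so the lab's Example2.txt is left untouched
demo_path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")

with open(demo_path, "w") as writefile:
    # No "\n" here, so the next write continues on the same line
    writefile.write("This is line A")
    # write() returns the number of characters it wrote
    count = writefile.write(" (still line A)\n")
    writefile.write("This is line B\n")

with open(demo_path, "r") as readfile:
    lines = readfile.readlines()

print(count)
print(lines)
```

The file ends up with two lines rather than three, because only two of the three strings end with `\n`.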

You can check the file to see if your results are correct:

In [9]:
# Check whether write to file results are correct

with open(exmp2, 'r') as TestWriteFile:
    print(TestWriteFile.read())

This is line A
This is line B



We write a list to a **.txt** file as follows:

In [10]:
# Sample list of text

Lines = ["This is line A\n", "This is line B\n", "This is line C\n"]

Lines

['This is line A\n', 'This is line B\n', 'This is line C\n']

In [12]:
# Write the strings in the list to a text file

with open (exmp2, 'w') as WriteFile:
    for line in Lines:
        print(line)
        WriteFile.write(line)

This is line A

This is line B

This is line C



We can verify the file is written by reading it and printing out the values:

In [13]:
# Verify we successfully wrote to the Example2 file

with open(exmp2, 'r') as TestWriteFile:
    print(TestWriteFile.read())

This is line A
This is line B
This is line C
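
The loop above can also be expressed with the file method `writelines()`, which writes every string in a list in one call. This is a minimal sketch on a temporary file; the name `demo_path` is an assumption made for illustration. Like `write()`, `writelines()` does not add line breaks, so each string still needs its own `\n`.

```python
import os
import tempfile

# Hypothetical scratch file and sample data mirroring the list above
demo_path = os.path.join(tempfile.mkdtemp(), "writelines_demo.txt")
Lines = ["This is line A\n", "This is line B\n", "This is line C\n"]

with open(demo_path, "w") as writefile:
    # writelines() writes each string as-is; it does not add line breaks
    writefile.writelines(Lines)

with open(demo_path, "r") as readfile:
    contents = readfile.read()

print(contents)
```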



However, note that setting the mode to **w** overwrites all the existing data in the file:

In [14]:
with open(exmp2, 'w') as WriteFile:
    WriteFile.write("Overwrite\n")

with open(exmp2, 'r') as TestWriteFile:
    print(TestWriteFile.read())

Overwrite



## Appending Files

We can write to files without losing any of the existing data by setting the mode argument to append: **a**. You can append new lines as follows:

In [15]:
# Write a new line to text file

with open(exmp2, 'a') as TestWriteFile:
    TestWriteFile.write("This is line C\n")
    TestWriteFile.write("This is line D\n")
    TestWriteFile.write("This is line E\n")

We can verify the file has changed by running the following cell:

In [19]:
# Verify if the new line is in the text file

with open(exmp2, 'r') as TestWriteFile:
    print(TestWriteFile.read())

Overwrite
This is line C
This is line D
This is line E
This is line E
This is line E
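
To contrast the two modes side by side, here is a small sketch on a temporary file (`demo_path` and the helper `read_all` are hypothetical names, not part of the lab). Because append mode adds to whatever is already in the file, re-running a cell that appends will add its lines again, which is one way repeated lines can end up in a file.

```python
import os
import tempfile

# Hypothetical scratch file for the comparison
demo_path = os.path.join(tempfile.mkdtemp(), "append_demo.txt")

def read_all(path):
    """Return the full contents of a text file."""
    with open(path, "r") as f:
        return f.read()

# 'w' creates the file (or empties an existing one) before writing
with open(demo_path, "w") as f:
    f.write("first\n")

# 'a' keeps the existing contents and writes at the end
with open(demo_path, "a") as f:
    f.write("second\n")
after_append = read_all(demo_path)

# Opening in 'w' again discards everything written so far
with open(demo_path, "w") as f:
    f.write("third\n")
after_write = read_all(demo_path)

print(repr(after_append))
print(repr(after_write))
```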



## Additional Modes

It's fairly inefficient to open the file in **a** or **w** and then reopen it in **r** to read any lines. Fortunately, we can access the file in the following modes:

* **r+**: Reading and writing. Does not truncate the file when it is opened; the file must already exist.
* **w+**: Writing and reading. Truncates the file when it is opened, and creates it if none exists.
* **a+**: Appending and reading. Creates a new file if none exists.

You don't have to dwell on the specifics of each mode for this lab.
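
As a rough illustration of how the three modes treat an existing file, the sketch below opens the same 8-character file in each mode and records the starting position and whatever content survived the open. The file name and the `results` dictionary are assumptions made for this example.

```python
import os
import tempfile

demo_dir = tempfile.mkdtemp()
results = {}

for mode in ("r+", "w+", "a+"):
    path = os.path.join(demo_dir, "mode_demo.txt")
    # Start each trial from the same 8-character file
    with open(path, "w") as f:
        f.write("old data")

    with open(path, mode) as f:
        start = f.tell()      # position right after opening
        f.seek(0, 0)          # go to the beginning so we can read
        existing = f.read()   # whatever survived the open
    results[mode] = (start, existing)

for mode, (start, existing) in results.items():
    print(mode, start, repr(existing))
```

Only **w+** empties the file on open, and only **a+** starts with the position at the end of the file.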

Let's try out the **a+** mode:

In [23]:
with open(exmp2, "a+") as TestWriteFile:
    TestWriteFile.write("This is line E\n")

    # read() returns an empty string here because the position is at the end of the file
    print(TestWriteFile.read())




There were no errors, but `read()` did not output anything either. This is because of our location in the file: after the write, the position is at the end of the file, so there is nothing left to read.

Most of the file methods we've looked at work in a certain location in the file. `.write()` writes at a certain location in the file. `.read()` reads at a certain location in the file and so on. You can think of this as moving your pointer around in the notepad to make changes at a specific location.

Opening the file in **w** is akin to opening the .txt file, moving your cursor to the beginning of the text file, writing new text and deleting everything that follows, whereas opening the file in **a** is similar to opening the .txt file, moving your cursor to the very end and then adding the new pieces of text.

It is often very useful to know where the 'cursor' is in a file and to be able to control it. The following methods allow us to do precisely this:

* `.tell()` - Returns the current position in bytes.
* `.seek(offset,from)` - Changes the position by 'offset' bytes with respect to 'from'. 'from' can take the value 0, 1, or 2, corresponding to the beginning, the current position, and the end of the file. (Python's documentation names this argument `whence`; for files opened in text mode, values 1 and 2 only accept an offset of 0.)
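
A minimal sketch of these two methods on a temporary file follows; `demo_path` and the variable names are hypothetical. It uses plain ASCII text, so one character corresponds to one byte.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")

with open(demo_path, "w+") as f:
    f.write("Hello, world")
    after_write = f.tell()   # the position is at the end after writing
    at_end = f.read()        # nothing is left to read from here

    f.seek(7, 0)             # move to 7 bytes from the beginning
    tail = f.read()          # read from there to the end

    f.seek(0, 2)             # move to 0 bytes from the end
    end_position = f.tell()

print(after_write, repr(at_end), repr(tail), end_position)
```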

Now let's revisit **a+**

In [25]:
with open(exmp2, 'a+') as TestWriteFile:
    print("Initial Location: {}".format(TestWriteFile.tell()))

    data = TestWriteFile.read()

    # Empty strings are falsy in Python
    if (not data):
        print("Read nothing")
    else:
        print(data)

    # Move 0 bytes from the beginning
    TestWriteFile.seek(0,0)

    print("\nNew Location: {}".format(TestWriteFile.tell()))

    data = TestWriteFile.read()

    if (not data):
        print("Read nothing")
    else:
        print(data)

    print("Location after read: {}".format(TestWriteFile.tell()))

Initial Location: 59
Read nothing

New Location: 0
Overwrite
This is line C
This is line D
This is line E

Location after read: 59


Finally, a note on the difference between **w+** and **r+**. Both of these modes allow access to the read and write methods; however, opening a file in **w+** overwrites it and deletes all pre-existing data.

The following code block writes to the file in **r+** mode without calling `.truncate()`. Notice the leftover data from the previous contents that remains after the new lines in the output.

In [26]:
with open(exmp2, 'r+') as TestWriteFile:

    # Write at the beginning of the file
    TestWriteFile.seek(0,0)
    
    TestWriteFile.write("Line 1" + "\n")
    TestWriteFile.write("Line 2" + "\n")
    TestWriteFile.write("Line 3" + "\n")
    TestWriteFile.write("Line 4" + "\n")
    
    TestWriteFile.write("Finished\n")

    TestWriteFile.seek(0,0)

    print(TestWriteFile.read())

Line 1
Line 2
Line 3
Line 4
Finished

This is line E



To work with existing data in a file, use **r+** or **a+**. When using **r+**, it can be useful to call the `.truncate()` method after writing your data. This cuts the file off at the current position, deleting everything that follows.

In [27]:
with open(exmp2, 'r+') as TestWriteFile:

    # Write at the beginning of the file
    TestWriteFile.seek(0,0)

    TestWriteFile.write("Line 1" + "\n")
    TestWriteFile.write("Line 2" + "\n")
    TestWriteFile.write("Line 3" + "\n")
    TestWriteFile.write("Line 4" + "\n")

    TestWriteFile.write("Finished\n")

    TestWriteFile.truncate()

    TestWriteFile.seek(0,0)

    print(TestWriteFile.read())

Line 1
Line 2
Line 3
Line 4
Finished
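
The effect of `.truncate()` can also be checked on a small, predictable file. The helper `overwrite_start` below is a hypothetical function written for this sketch: it resets a temporary file to ten `A` characters, overwrites the start in **r+** mode, and optionally truncates.

```python
import os
import tempfile

# Hypothetical scratch file for the comparison
demo_path = os.path.join(tempfile.mkdtemp(), "truncate_demo.txt")

def overwrite_start(path, text, truncate):
    """Reset the file to known contents, then overwrite its start in r+ mode."""
    with open(path, "w") as f:
        f.write("AAAAAAAAAA")      # 10 characters of old data
    with open(path, "r+") as f:
        f.write(text)              # r+ starts writing at position 0
        if truncate:
            f.truncate()           # cut the file at the current position
        f.seek(0, 0)
        return f.read()

without_truncate = overwrite_start(demo_path, "new", truncate=False)
with_truncate = overwrite_start(demo_path, "new", truncate=True)

print(repr(without_truncate))
print(repr(with_truncate))
```

Without the truncation, the seven old characters that were not overwritten remain in the file.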



## Copy a File

Let's copy the file **Example2.txt** to the file **Example3.txt**:

In [29]:
# Copy one file to another

exmp3 = 'C:/Users/matth/OneDrive/Personal/Repositories/Coursera/Python for DataScience_AI_Dev/Module 4/Working with Data in Python/Write and Save Files/Example3.txt'

with open(exmp2, 'r') as ReadFile:
    with open(exmp3, 'w') as WriteFile:
        for line in ReadFile:
            WriteFile.write(line)

We can read the file to see if everything worked:

In [30]:
# Verify we successfully copied the file

with open(exmp3, 'r') as TestWriteFile:
    print(TestWriteFile.read())

Line 1
Line 2
Line 3
Line 4
Finished
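
For comparison, the standard library can perform the same copy in one call with `shutil.copyfile(src, dst)`. The sketch below uses temporary files with hypothetical names (`source_path`, `copy_path`) rather than the lab's `Example2.txt` and `Example3.txt`.

```python
import os
import shutil
import tempfile

# Hypothetical source and destination paths in a scratch directory
demo_dir = tempfile.mkdtemp()
source_path = os.path.join(demo_dir, "source.txt")
copy_path = os.path.join(demo_dir, "copy.txt")

with open(source_path, "w") as f:
    f.write("Line 1\nLine 2\nFinished\n")

# Copy the contents of source_path to copy_path, replacing it if it exists
shutil.copyfile(source_path, copy_path)

with open(copy_path, "r") as f:
    copied = f.read()

print(copied)
```

The line-by-line loop remains useful when lines need to be filtered or changed while copying.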



After reading files, we can also write data into files and save them in different file formats such as **.txt, .csv, .xls (for Excel files), etc**. You will come across these in further examples.
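
As one example of another format, the sketch below writes a few rows to a **.csv** file with the standard `csv` module and reads them back. The file name and the rows are made-up sample data for illustration only.

```python
import csv
import os
import tempfile

# Hypothetical output path and made-up sample rows
csv_path = os.path.join(tempfile.mkdtemp(), "members_demo.csv")
rows = [
    ["Membership No", "Date Joined", "Active"],
    [12345, "2019-8-5", "yes"],
    [67890, "2016-12-14", "no"],
]

# newline="" lets the csv module control line endings itself
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

with open(csv_path, "r", newline="") as f:
    read_back = list(csv.reader(f))

print(read_back)
```

Note that values read back from a CSV file are always strings, so the membership numbers return as `'12345'` rather than `12345`.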

**NOTE:** If you wish to open and view the `Example3.txt` file, go to the 'Write and Save Files' directory. All associated files for this lab will be located within this directory.

---

## Exercise

Your local university's Raptors fan club maintains a register of its active members on a .txt document. Every month they update the file by removing the members who are not active. You have been tasked with automating this with your Python skills.

Given the file `currentMem`, remove each member with a 'no' in their 'Active' column. Keep track of each removed member and append them to the `exMem` file. Make sure that the format of the original files is preserved. (*Hint: Do this by reading/writing whole lines and ensuring the header remains*)

Run the code block below prior to starting the exercise. The skeleton code has been provided for you. Edit only the `cleanFiles` function.

In [149]:
# Run this section of code prior to starting the exercise

from random import randint as rnd

memReg = 'C:/Users/matth/OneDrive/Personal/Repositories/Coursera/Python for DataScience_AI_Dev/Module 4/Working with Data in Python/Write and Save Files/members.txt'
exReg = 'C:/Users/matth/OneDrive/Personal/Repositories/Coursera/Python for DataScience_AI_Dev/Module 4/Working with Data in Python/Write and Save Files/inactive.txt'
fee = ('yes', 'no')

def genFiles(current,old):
    with open(current, 'w+') as WriteFile:
        WriteFile.write('Membership No  Date Joined  Active  \n')
        
        data = "{:^13}  {:<11}  {:<6}\n"

        for rowno in range(20):
            date = str(rnd(2015,2020)) + '-' + str(rnd(1,12)) + '-' + str(rnd(1,25))
            WriteFile.write(data.format(rnd(10000,99999), date, fee[rnd(0,1)]))

    with open(old, 'w+') as WriteFile:
        WriteFile.write("Membership No  Date Joined  Active  \n")

        data = "{:^13}  {:<11}  {:<6}\n"

        for rowno in range(3):
            date = str(rnd(2015,2020)) + '-' + str(rnd(1,12)) + '-' + str(rnd(1,25))
            WriteFile.write(data.format(rnd(10000,99999), date, fee[1]))


genFiles(memReg, exReg)

After running the prerequisite code cell above we see the files **'members.txt'** and **'inactive.txt'** have been generated for this exercise. We are now ready to move to implementation.

### **Exercise:** Implement the cleanFiles function in the code cell below.

In [150]:
'''
*******************************************************************************************
*                                                                                         *
*            The two arguments for this function are the files:                           *
*                - currentMem: File containing list of current members                    *
*                - exMem: File containing list of old members                             *
*                                                                                         *
*-----------------------------------------------------------------------------------------*
*                                                                                         *
*            This function should remove all rows from currentMem containing 'no'         *
*            in the 'Active' column and append them to exMem                              *
*                                                                                         *
*******************************************************************************************
'''

# Define our function to separate active and inactive members into their respective files
def cleanFiles(currentMem, exMem):

    with open(currentMem, 'r+') as ReadWriteFile:

        # Read the header line first, then the member rows that follow it
        header = ReadWriteFile.readline()
        members = ReadWriteFile.readlines()

        # Rows containing 'yes' stay in members.txt; rows containing 'no' move
        # to inactive.txt
        ActiveMembers = [line for line in members if 'yes' in line]
        InactiveMembers = [line for line in members if 'no' in line]

        # Return to the beginning and rewrite the header followed by the active
        # members. Seeking to 0 avoids hard-coding the header's length in bytes,
        # which differs between platforms because of line endings.
        ReadWriteFile.seek(0, 0)
        ReadWriteFile.write(header)
        ReadWriteFile.writelines(ActiveMembers)

        # Remove whatever is left of the old contents after the rows just written
        ReadWriteFile.truncate()

    # Open the 'inactive.txt' file in append mode so the removed members are
    # added to the end of the existing list
    with open(exMem, 'a') as WriteFile:
        WriteFile.writelines(InactiveMembers)

# Define the paths to the 'members.txt' and 'inactive.txt' files to pass to
# the cleanFiles() function
memReg = "C:/Users/matth/OneDrive/Personal/Repositories/Coursera/Python for DataScience_AI_Dev/Module 4/Working with Data in Python/Write and Save Files/members.txt"
exReg = "C:/Users/matth/OneDrive/Personal/Repositories/Coursera/Python for DataScience_AI_Dev/Module 4/Working with Data in Python/Write and Save Files/inactive.txt"

# Call the cleanFiles() function
cleanFiles(memReg, exReg)

# Define Headers
headers = "Membership No  Date Joined  Active  \n"

# Print list of active members from 'members.txt' file
with open(memReg, 'r') as ReadFile:
    print("Active Members: \n\n")
    print(ReadFile.read())

# Print list of inactive members from 'inactive.txt' file
with open(exReg, 'r') as ReadFile:
    print("Inactive Members: \n\n")
    print(ReadFile.read())

Active Members: 


Membership No  Date Joined  Active  
    24248      2017-11-5    yes   
    19964      2019-8-5     yes   
    71055      2015-1-20    yes   
    91326      2015-5-14    yes   
    16515      2019-7-2     yes   
    53343      2019-6-20    yes   
    92369      2020-5-4     yes   
    24396      2018-9-11    yes   
    72731      2019-5-17    yes   

Inactive Members: 


Membership No  Date Joined  Active  
    39531      2016-12-14   no    
    31773      2017-5-14    no    
    23983      2016-12-10   no    
    43738      2019-1-23    no    
    65893      2015-2-22    no    
    87091      2015-5-2     no    
    18903      2015-2-1     no    
    39088      2018-11-16   no    
    35760      2019-3-15    no    
    21059      2016-5-12    no    
    70334      2016-7-13    no    
    23278      2018-4-25    no    
    41521      2020-6-24    no    
    73425      2015-5-17    no    



The code cell below is to verify my solution. This code was provided in the course and remained unmodified. I ran this code to test my implementation of the `cleanFiles` function.

In [151]:
def testMsg(passed):
    if passed:
       return 'Test Passed'
    else :
       return 'Test Failed'

testWrite = "C:/Users/matth/OneDrive/Personal/Repositories/Coursera/Python for DataScience_AI_Dev/Module 4/Working with Data in Python/Write and Save Files//testWrite.txt"
testAppend = "C:/Users/matth/OneDrive/Personal/Repositories/Coursera/Python for DataScience_AI_Dev/Module 4/Working with Data in Python/Write and Save Files//testAppend.txt" 
passed = True

genFiles(testWrite,testAppend)

with open(testWrite,'r') as file:
    ogWrite = file.readlines()

with open(testAppend,'r') as file:
    ogAppend = file.readlines()

try:
    cleanFiles(testWrite,testAppend)
except:
    print('Error')

with open(testWrite,'r') as file:
    clWrite = file.readlines()

with open(testAppend,'r') as file:
    clAppend = file.readlines()
        
# checking if total no of rows is same, including headers

if (len(ogWrite) + len(ogAppend) != len(clWrite) + len(clAppend)):
    print("The number of rows do not add up. Make sure your final files have the same header and format.")
    passed = False
    
for line in clWrite:
    if  'no' in line:
        passed = False
        print("Inactive members in file")
        break
    else:
        if line not in ogWrite:
            print("Data in file does not match original file")
            passed = False
print ("{}".format(testMsg(passed)))
    



Test Passed
