
**Writing and Saving Files in Python**

#### Writing Files
###### We can open a file object and write text to it with the write() method. To write to a file, the mode argument must be set to "w"; this creates the file if it does not exist.

In [10]:
# Write line "This is line A" to file "Example2.txt"
example2 = "/Example2.txt"
with open(example2, "w") as writeFile:
  writeFile.write("This is line A")

In [11]:
# Read file to see if it worked:
with open(example2, "r") as testwriteFile:
  print(testwriteFile.read())

This is line A


In [12]:
# Write multiple lines to file
with open(example2, "w") as writeFile:
  writeFile.write("This is line A\n")
  writeFile.write("This is line B\n")

###### Unlike print(), the .write() method writes the string exactly as given and does not add a line break. To end a line, include the newline character "\n" at the end of the string yourself.

In [13]:
# Check whether write to file is correct
with open("/Example2.txt", "r") as testwriteFile:
  print(testwriteFile.read())

This is line A
This is line B
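
As a minimal sketch of the point above, the snippet below writes one string without "\n" and one with it. The scratch file name `write_demo.txt` and the variable names are assumptions made for illustration; the file lives in a temporary directory so nothing real is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")

with open(demo_path, "w") as demo_file:
    # No "\n" here, so the next write continues on the same line
    first_count = demo_file.write("This is line A")
    second_count = demo_file.write("This is line B\n")

with open(demo_path, "r") as demo_file:
    demo_content = demo_file.read()

# write() returns the number of characters written
print(first_count, second_count)  # 14 15
print(repr(demo_content))         # 'This is line AThis is line B\n'
```

Both strings end up on one line because only the second one carries a newline character.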



#### Writing a list to a .txt file

In [14]:
# Sample list to write to the file
Lines = ["This is line A\n", "This is line B\n", "This is line C\n"]
Lines

['This is line A\n', 'This is line B\n', 'This is line C\n']

In [15]:
# Write the string in the list to text file as lines
with open(example2, "w") as writeFile:
  for line in Lines:
    print(line)
    writeFile.write(line)

This is line A

This is line B

This is line C



In [16]:
# Verify if writing to file is successfully executed:
with open("/Example2.txt", "r") as testwriteFile:
  print(testwriteFile.read())

This is line A
This is line B
This is line C
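
As a possible alternative to the loop above, file objects also provide a `writelines()` method that writes every string of a list in one call. The sketch below assumes a scratch file called `list_demo.txt` in a temporary directory; like `write()`, `writelines()` does not add "\n" on its own, so the strings must already contain it.

```python
import os
import tempfile

lines = ["This is line A\n", "This is line B\n", "This is line C\n"]

# Hypothetical scratch file in a temporary directory
list_path = os.path.join(tempfile.mkdtemp(), "list_demo.txt")

with open(list_path, "w") as list_file:
    # Each string is written as-is, one after the other
    list_file.writelines(lines)

with open(list_path, "r") as list_file:
    read_back = list_file.readlines()

print(read_back == lines)  # True
```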



###### However, note that setting the mode to 'w' overwrites all the existing data in the file.



In [17]:
with open(example2, 'w') as writeFile:
  writeFile.write("Overwrite\n")

with open("/Example2.txt", 'r') as testwriteFile:
  print(testwriteFile.read())

Overwrite



#### Appending Files
###### We can write to a file without losing its existing data by setting the mode argument to append, "a". New text is added at the end of the file.

In [18]:
# Write a new line to text file with append "a":
with open("/Example2.txt", 'a') as testwriteFile:
  testwriteFile.write("This is line C\n")
  testwriteFile.write("This is line D\n")
  testwriteFile.write("This is line E\n")

In [19]:
# Verify if the new lines are appended in the text file:
with open("/Example2.txt", 'r') as testwriteFile:
  print(testwriteFile.read())

Overwrite
This is line C
This is line D
This is line E



#### Additional modes:
###### It's fairly inefficient to open the file in **a** or **w** and then reopen it in **r** to read any lines.
###### Luckily we can access the file in the following modes:
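- **r+** : Reading and writing. The file must already exist and is not truncated when opened.
- **w+** : Writing and reading. Truncates the file when opened, or creates it if none exists.
- **a+** : Appending and reading. Creates a new file if none exists; new text is added at the end of the file.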
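
The difference between the three modes can be sketched by writing the same text to the same starting file in each mode. The file name `mode_demo.txt` and the helpers `reset` and `contents` are hypothetical, introduced only for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory
mode_path = os.path.join(tempfile.mkdtemp(), "mode_demo.txt")

def reset():
    """Restore the scratch file to a known starting content."""
    with open(mode_path, "w") as f:
        f.write("old data\n")

def contents():
    """Return the full text of the scratch file."""
    with open(mode_path, "r") as f:
        return f.read()

reset()
with open(mode_path, "r+") as f:
    f.write("NEW")            # overwrites the first three characters in place
after_r_plus = contents()

reset()
with open(mode_path, "w+") as f:
    f.write("NEW")            # file was emptied on open
after_w_plus = contents()

reset()
with open(mode_path, "a+") as f:
    f.write("NEW")            # added after the existing text
after_a_plus = contents()

print(repr(after_r_plus))  # 'NEW data\n'
print(repr(after_w_plus))  # 'NEW'
print(repr(after_a_plus))  # 'old data\nNEW'
```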

In [20]:
# try a+ mode
with open("/Example2.txt", 'a+') as testwritefile:
  testwritefile.write("This is new line E\n")
  print(testwritefile.read())




There were no errors but <code>read()</code> also did not output anything. This is because of our location in the file: after the write, the position is at the end of the file, so there is nothing left to read.


###### Most of the file methods we've looked at work at a certain location in the file. <code>.write()</code> writes at a certain location in the file, <code>.read()</code> reads from a certain location in the file, and so on. You can think of this as moving your cursor around in a text editor to make changes at a specific location.

###### Opening the file in **w** is akin to opening the .txt file, moving your cursor to the beginning of the text file, writing new text and deleting everything that follows.
Whereas opening the file in **a** is similar to opening the .txt file, moving your cursor to the very end and then adding the new pieces of text. <br>
It is often very useful to know where the 'cursor' is in a file and be able to control it. The following methods allow us to do precisely this -
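- <code>.tell()</code> - returns the current position in the file. In binary mode this is a byte count from the beginning; in text mode, treat it as a marker that can be passed back to <code>.seek()</code>.
- <code>.seek(offset, whence)</code> - moves the position by 'offset' relative to 'whence'. 'whence' can be 0, 1 or 2, meaning the beginning of the file, the current position and the end of the file; it defaults to 0. In text mode, a non-zero offset is only allowed with whence 0.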
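
A minimal sketch of these two methods on a text file follows. It assumes a scratch file named `seek_demo.txt`; the position saved by `.tell()` is later handed back to `.seek()`, which is the safe way to jump around in text mode.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory
seek_path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")

with open(seek_path, "w+") as seek_file:
    seek_file.write("Line 1\n")
    second_line_start = seek_file.tell()  # remember where line 2 begins
    seek_file.write("Line 2\n")

    seek_file.seek(0)                     # back to the beginning (whence defaults to 0)
    whole_text = seek_file.read()

    seek_file.seek(second_line_start)     # jump to the position saved by tell()
    second_line = seek_file.read()

    seek_file.seek(0, 2)                  # jump to the very end of the file
    tail = seek_file.read()               # nothing is left to read here

print(repr(whole_text))   # 'Line 1\nLine 2\n'
print(repr(second_line))  # 'Line 2\n'
print(repr(tail))         # ''
```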


In [21]:
# Now let's revisit a+:
with open("/Example2.txt", 'a+') as testwritefile:
  print("Initial Location: {}".format(testwritefile.tell()))

  data = testwritefile.read()
  if (not data): # an empty string evaluates to False in Python
    print("Nothing")
  else:
    print(data)

  testwritefile.seek(0,0) # move 0 bytes from beginning

  print("\nNew Location: {}".format(testwritefile.tell()))
  data = testwritefile.read()
  if (not data):
    print("Read Nothing")
  else:
    print(data)

  print("Location after read: {}".format(testwritefile.tell()))

Initial Location: 74
Nothing

New Location: 0
Overwrite
This is line C
This is line D
This is line E
This is new line E

Location after read: 74


In [22]:
# Finally, note: both w+ and r+ allow reading and writing;
# however, opening a file in w+ mode deletes all pre-existing data, while r+ keeps it.

In [23]:
with open("/Example2.txt", 'r+') as testwritefile:
  testwritefile.seek(0,0) # write at the beginning of file

  testwritefile.write("Line 1" + "\n")
  testwritefile.write("Line 2" + "\n")
  testwritefile.write("Line 3" + "\n")
  testwritefile.write("Line 4" + "\n")
  testwritefile.write("finished \n")

  testwritefile.seek(0,0) # read from the beginning of file
  print(testwritefile.read())

Line 1
Line 2
Line 3
Line 4
finished 
D
This is line E
This is new line E



In [24]:
# To work with the existing data in a file, use r+ or a+. When using r+,
# it can be useful to call the .truncate() method after writing your data.
# This cuts the file off at the current position and deletes everything that follows.

In [25]:
with open("/Example2.txt", 'r+') as testwritefile:
  testwritefile.seek(0,0) # write at the beginning of file

  testwritefile.write("Line 1" + "\n")
  testwritefile.write("Line 2" + "\n")
  testwritefile.write("Line 3" + "\n")
  testwritefile.write("Line 4" + "\n")
  testwritefile.write("finished \n")
  testwritefile.truncate()

  testwritefile.seek(0,0) # read from the beginning of file
  print(testwritefile.read())

Line 1
Line 2
Line 3
Line 4
finished 



#### Copy a File

In [26]:
# Copy file Example2.txt to another Example3.txt
with open("/Example2.txt", "r") as readFile:
  with open("Example3.txt" , "w") as writeFile:
    for line in readFile:
      writeFile.write(line)

In [27]:
# Verify that the copy was executed successfully
with open("Example3.txt", "r") as testwritefile:
  print(testwritefile.read())

Line 1
Line 2
Line 3
Line 4
finished 
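
The line-by-line loop above shows how copying works. As a hedged alternative, the standard library's `shutil.copyfile` does the same job in one call. The file names below are hypothetical scratch files in a temporary directory, not the notebook's Example2.txt and Example3.txt.

```python
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, "source_demo.txt")  # hypothetical source file
copy_path = os.path.join(work_dir, "copy_demo.txt")      # hypothetical destination

with open(source_path, "w") as source_file:
    source_file.write("Line 1\nLine 2\nfinished \n")

# Copy the contents of the source file to the destination path
shutil.copyfile(source_path, copy_path)

with open(copy_path, "r") as copy_file:
    copied_text = copy_file.read()

print(copied_text)
```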



After reading files, we can also write data into files and save them in different file formats such as **.txt, .csv and .xls (for Excel files)**.
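
For example, a .csv file can be written with the standard library's `csv` module. This is only a sketch: the file name `members_demo.csv` is an assumption, and the rows are sample values in the style of the membership data used in the exercise below.

```python
import csv
import os
import tempfile

# Hypothetical scratch file in a temporary directory
csv_path = os.path.join(tempfile.mkdtemp(), "members_demo.csv")

# Sample rows for illustration: a header followed by two members
rows = [
    ["Membership No", "Date Joined", "Active"],
    [58737, "2017-12-1", "yes"],
    [69448, "2020-10-15", "no"],
]

# newline="" lets the csv module control line endings itself
with open(csv_path, "w", newline="") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerows(rows)

with open(csv_path, "r", newline="") as csv_file:
    loaded_rows = list(csv.reader(csv_file))

# Values come back as strings when the file is read again
print(loaded_rows[1])  # ['58737', '2017-12-1', 'yes']
```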

#### Exercise

Your local university's Raptors fan club maintains a register of its active members in a .txt document. Every month they update the file by removing the members who are not active. You have been tasked with automating this with your Python skills. <br>
Given the file `currentMem`, remove each member with a 'no' in their Active column. Keep track of each of the removed members and append them to the `exMem` file. Make sure that the format of the original files is preserved. (*Hint: Do this by reading/writing whole lines and ensuring the header remains*)
<br>
Run the code block below prior to starting the exercise. The skeleton code has been provided for you. Edit only the `cleanFiles` function.


In [28]:
# Code block for starting the exercise:
from random import randint as rnd

memReg = "members.txt"
exReg = "inactive.txt"
fee = ("yes", "no")

def genFiles(current, old):
  # creating/writing the current/new file
  with open(current, "w+") as writefile:
    writefile.write("Membership No  Date Joined  Active  \n")
    # format of each row, matching the header above
    data = "{:^13}  {:<11}  {:<6}\n"  # ^ centers, < left-aligns; 13, 11 and 6 are the column widths
    for rowno in range(20):
      date = str(rnd(2015,2020)) + '-' + str(rnd(1,12)) + '-' + str(rnd(1,25))
      writefile.write(data.format(rnd(10000,99999), date, fee[rnd(0,1)]))

  # creating/writing the old file
  with open(old, "w+") as writefile:
    writefile.write("Membership No  Date Joined  Active  \n")
    # format of each row, matching the header above
    data = "{:^13}  {:<11}  {:<6}\n"  # ^ centers, < left-aligns; 13, 11 and 6 are the column widths
    for rowno in range(3):
      date = str(rnd(2015,2020)) + '-' + str(rnd(1,12)) + '-' + str(rnd(1,25))
      writefile.write(data.format(rnd(10000,99999), date, fee[1]))


genFiles(memReg, exReg)
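
To see what the row format string used in `genFiles` produces, the sketch below applies it to one sample row. The variable names `row_format` and `sample_row` and the sample values are chosen for illustration only.

```python
# Same format string as in genFiles: three fixed-width columns separated by two spaces
row_format = "{:^13}  {:<11}  {:<6}\n"

# Sample values for illustration
sample_row = row_format.format(58737, "2017-12-1", "yes")

# The membership number is centered in 13 characters,
# the date and the status are left-aligned in 11 and 6 characters
number_column = sample_row[0:13]
date_column = sample_row[15:26]
active_column = sample_row[28:34]

print(repr(number_column))
print(repr(date_column))
print(repr(active_column))
print(len(sample_row))  # 35 = 13 + 2 + 11 + 2 + 6 + 1 for the newline
```

Because every row has the same width, the columns line up under the header when the file is printed.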

In [29]:
# Now prerequisite code which prepared the files for this exercise is done.
# Do Exercise: Implement the cleanFiles function below

In [30]:
def cleanFiles(currentMem, exMem):
  '''
  The two arguments for this function are the files:
    - currentMem: File containing list of current members
    - exMem: File containing list of old members

  This function should remove all rows from currentMem containing 'no'
  in the 'Active' column and append them to exMem.
  '''
  # ToDo: Open the currentMem file in r+ mode
  # ToDo: Open the exMem file in a+ mode

  # ToDo: Read each member in currentMem (1 member per row) file into a list
  # Hint: Recall that the first line in the file is the header.

  # ToDo: Iterate through the list and create a new list of inactive members.

  # Go to the beginning of the currentMem file
  # ToDo: Iterate through the members list.
  # If a member is inactive, add them to exMem, otherwise write them to currentMem

  pass # Remove this line when implementation done

In [31]:
# Method 1 : as per above description
def cleanFiles(currentMem, exMem):
  with open(currentMem, "r+") as readFile:
    with open(exMem, "a+") as writeFile:
      # print(readFile.tell())
      currentList = []

      # get current txt file as a list of lines
      for line in readFile:
        currentList.append(line)
      header = currentList[0]
      # print(header)

      # create inactive members list
      inactiveMemList = []

      for line in currentList :
        if (line != header) : # Check for header in txt file
          if (line.find('no') != -1) : # check for active status 'no' in each line of currentMem file
            inactiveMemList.append(line)

      # print(inactiveMemList)
      # print(readFile.tell())

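      # Go to the beginning of the currentMem file and rewrite the header first,
      # so that the member rows written below do not overwrite it
      readFile.seek(0,0)
      readFile.write(header)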

      # iterate through members list
      for line in currentList:
        # print(line)
        if (line != header) : # Check for header in txt file
          if line in inactiveMemList: # Check for inactive members
            # print("Inactive to exMem: ", line)
            writeFile.write(line) # append inactive members to exMem
          else:
            # print("Active to currentMem: ", line)
            readFile.write(line) # write active from beginning after header in currentMem

      # after rewriting the currentMem file, truncate the leftover data beyond the write position
      readFile.truncate()
      # print(readFile.tell())

In [None]:
# Method 2: currentList using readlines and using 'in' for 'no' status
def cleanFiles(currentMem, exMem):
  with open(currentMem, "r+") as readFile:
    with open(exMem, "a+") as writeFile:

      # get current txt file as a list of lines
      currentList = readFile.readlines()
      header = currentList[0]

      # create inactive members list
      inactiveMemList = []

      for line in currentList:
      #   print(line)
        if (line != header): # Check for header in txt file
          if ('no' in line): # check for active status 'no' in each line of currentMem file
            # print("Inactive: ", line)
            inactiveMemList.append(line)

      # Go to the beginning of the currentMem file
      readFile.seek(0,0)
      readFile.write(header)

      # iterate through members list
      for line in currentList:
      #   print(line)
        if (line != header): # Check for header in txt file
          if (line in inactiveMemList): # Check for inactive members
            # print("inactive member")
            writeFile.write(line) # append inactive members to exMem
          else:
            # print("active member")
            readFile.write(line) # write active from beginning after header in currentMem

      # after rewriting the currentMem file, truncate the leftover data beyond the write position
      readFile.truncate()

In [6]:
# Most concise method: using readlines, list comprehension, pop header
def cleanFiles(currentMem, exMem):
  with open(currentMem, "r+") as writeFile:
    with open(exMem, "a+") as appendFile:
      # get the data
      writeFile.seek(0)
      members = writeFile.readlines()
      # remove the header from the list and keep it for rewriting
      header = members.pop(0)

      # inactive members list, built with a list comprehension
      inactive = [member for member in members if ('no' in member)]

      # the comprehension above is the same as:
      # inactive = []
      # for member in members:
      #   if ('no' in member):
      #     inactive.append(member)

      # go to the beginning of the write file
      writeFile.seek(0)
      writeFile.write(header)
      for member in members:
        if (member in inactive):
          appendFile.write(member)
        else:
          writeFile.write(member)
      writeFile.truncate()
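
The solutions above test for the substring 'no' anywhere in a row. That works for the generated files because no other column contains a lowercase "no". A stricter check could look at the Active column only. The helper `is_inactive` below is a hypothetical sketch that assumes the Active value is the last whitespace-separated field of each row.

```python
def is_inactive(row):
    """Return True when the last whitespace-separated field of a row is 'no'."""
    fields = row.split()
    return bool(fields) and fields[-1] == "no"

# Sample rows for illustration, in the same layout as the generated files
header_row = "Membership No  Date Joined  Active  \n"
sample_rows = [
    "    58737      2017-12-1    yes   \n",
    "    69448      2020-10-15   no    \n",
]

inactive_rows = [row for row in sample_rows if is_inactive(row)]

print(len(inactive_rows))       # 1
print(is_inactive(header_row))  # False
```

Inside `cleanFiles`, such a helper could replace the `'no' in member` condition without changing the rest of the logic.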

In [7]:
# The code below is to help you view the files.
# Do not modify this code for this exercise.
memReg = 'members.txt'
exReg = 'inactive.txt'
cleanFiles(memReg,exReg)


headers = "Membership No  Date Joined  Active  \n"
with open(memReg,'r') as readFile:
    print("Active Members: \n\n")
    print(readFile.read())

with open(exReg,'r') as readFile:
    print("Inactive Members: \n\n")
    print(readFile.read())


Active Members: 


Membership No  Date Joined  Active  
    58737      2017-12-1    yes   
    10393      2017-4-11    yes   
    21062      2018-8-23    yes   
    85185      2018-12-14   yes   
    64641      2018-1-13    yes   

Inactive Members: 


Membership No  Date Joined  Active  
    69448      2020-10-15   no    
    39525      2018-1-15    no    
    39871      2020-11-14   no    
    78791      2018-2-3     no    
    61330      2020-5-25    no    
    72197      2020-8-17    no    
    40786      2017-6-19    no    
    28303      2019-1-6     no    
    78236      2015-2-16    no    
    71459      2017-1-21    no    
    33161      2017-4-12    no    
    69741      2020-11-11   no    
    47040      2015-12-5    no    
    28063      2017-5-12    no    
    87330      2018-3-5     no    
    58117      2016-10-5    no    
    13066      2017-3-1     no    
    52672      2017-4-3     no    



The code cell below is to verify your solution. Please do not modify the code and run it to test your implementation of `cleanFiles`.


In [8]:
def testMsg(passed):
    if passed:
       return 'Test Passed'
    else :
       return 'Test Failed'

testWrite = "testWrite.txt"
testAppend = "testAppend.txt"
passed = True

genFiles(testWrite,testAppend)

with open(testWrite,'r') as file:
    ogWrite = file.readlines()

with open(testAppend,'r') as file:
    ogAppend = file.readlines()

try:
    cleanFiles(testWrite,testAppend)
except:
    print('Error')

with open(testWrite,'r') as file:
    clWrite = file.readlines()

with open(testAppend,'r') as file:
    clAppend = file.readlines()

# checking if total no of rows is same, including headers

if (len(ogWrite) + len(ogAppend) != len(clWrite) + len(clAppend)):
    print("The number of rows do not add up. Make sure your final files have the same header and format.")
    passed = False

for line in clWrite:
    if  'no' in line:
        passed = False
        print("Inactive members in file")
        break
    else:
        if line not in ogWrite:
            print("Data in file does not match original file")
            passed = False
print ("{}".format(testMsg(passed)))




Test Passed
