# Practice Notebook: Reading and Writing Files

In this exercise, we will test your knowledge of reading and writing files by playing around with some text files. 
<br><br>
Let's say we have a text file listing the current visitors at a hotel.  We'll call it *guests.txt*.  Run the following code to create the file.  The code writes each initial guest's first name to the file on its own line.

In [1]:
guests = open("guests.txt", "w")
initial_guests = ["Bob", "Andrea", "Manuel", "Polly", "Khalid"]

for i in initial_guests:
    guests.write(i + "\n")
    
guests.close()

No output is generated for the above code cell.  To check the contents of the newly created *guests.txt* file, run the following code.

In [47]:
with open("guests.txt") as guests:
    for line in guests:
        print(line)
type(guests)

Bob

Andrea

Manuel

Polly

Khalid



_io.TextIOWrapper

The output shows that our *guests.txt* file is correctly populated with each initial guest's first name on its own line.  Cool!
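
The blank lines between the names in the output above appear because each line read from the file still ends with `"\n"`, and `print` adds a newline of its own. A minimal sketch of two common ways to avoid that, assuming a hypothetical scratch file (`demo_guests.txt` in a temporary directory) so the exercise file is left untouched:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_guests.txt")

with open(demo_path, "w") as demo:
    for name in ["Bob", "Andrea"]:
        demo.write(name + "\n")

with open(demo_path) as demo:
    raw_lines = list(demo)  # each item keeps its trailing "\n"

# Option 1: strip the trailing newline before printing.
stripped = [line.rstrip("\n") for line in raw_lines]
for name in stripped:
    print(name)

# Option 2: keep the line as-is and tell print() not to add a second newline.
for line in raw_lines:
    print(line, end="")
```

Either option prints the names on consecutive lines; which one to use mostly depends on whether the stripped names are needed later.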
<br><br>
Now suppose we want to update our file as guests check in and out.  Fill in the missing code in the following cell to add guests to the *guests.txt* file as they check in.

In [48]:
new_guests = ["Sam", "Danielle", "Jacob"]

with open("guests.txt", "a") as guests:
    for i in new_guests:
        guests.write(i + "\n")

# No explicit close() is needed: the with block closes the file on exit.

To check whether your code correctly added the new guests to the *guests.txt* file, run the following cell.

In [49]:
with open("guests.txt") as guests:
    for line in guests:
        print(line)

Bob

Andrea

Manuel

Polly

Khalid

Sam

Danielle

Jacob



The current names in the *guests.txt* file should be:  Bob, Andrea, Manuel, Polly, Khalid, Sam, Danielle and Jacob.
<br><br>
Were the new guests correctly appended to the *guests.txt* file? If not, go back and edit your code, making sure to fill in the gaps so that the new guests are added to the file.  Once the new guests are successfully added, you have filled in the missing code correctly.  Great!
<br><br>
Now let's remove the guests that have checked out already.  There are several ways to do this. The method we will use for this exercise is as follows:
1. Open the file in "read" mode.
2. Iterate over each line in the file and put each guest's name into a Python list.
3. Open the file once again in "write" mode, which empties it.
4. Write each name in the Python list back to the file one by one, skipping the guests that have checked out.

<br>
Ready? Fill in the missing code in the following cell to remove the guests that have checked out already.

In [50]:
checked_out=["Andrea", "Manuel", "Khalid"]
temp_list=[]

with open("guests.txt", "r") as guests:
    for g in guests:
        temp_list.append(g.strip())

with open("guests.txt", "w") as guests:
    for name in temp_list:
        if name not in checked_out:
            guests.write(name + "\n")

To check whether your code correctly removed the checked out guests from the *guests.txt* file, run the following cell.

In [51]:
with open("guests.txt") as guests:
    for line in guests:
        print(line)

Bob

Polly

Sam

Danielle

Jacob



The current names in the *guests.txt* file should be:  Bob, Polly, Sam, Danielle and Jacob.
<br><br>
Were the names of the checked-out guests correctly removed from the *guests.txt* file? If not, go back and edit your code, making sure to fill in the gaps so that the checked-out guests are removed from the file. Once the checked-out guests are successfully removed, you have filled in the missing code correctly. Awesome!
<br><br>
Now let's check whether Bob and Andrea are still checked in.  How could we do this? We'll just read through each line in the file to see if their name is in there.  Run the following code to check whether Bob and Andrea are still checked in.

In [13]:
guests_to_check = ['Bob', 'Andrea']
checked_in = []

with open("guests.txt","r") as guests:
    for g in guests:
        checked_in.append(g.strip())
    for check in guests_to_check:
        if check in checked_in:
            print("{} is checked in".format(check))
        else:
            print("{} is not checked in".format(check))

Bob is checked in
Andrea is not checked in


We can see that Bob is checked in while Andrea is not.  Nice work! You've learned the basics of reading and writing files in Python!

# <a href="https://docs.python.org/3/library/functions.html#open"> Link to the source below </a> <br>
| Character | Meaning |
|---|---|
| 'r' | open for reading (default) |
| 'w' | open for writing, truncating the file first |
| 'x' | open for exclusive creation, failing if the file already exists |
| 'a' | open for writing, appending to the end of the file if it exists |
| 'b' | binary mode |
| 't' | text mode (default) |
| '+' | open for updating (reading and writing) |
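
The mode characters above can be tried out in a short sketch. It assumes a hypothetical scratch file (`modes_demo.txt`) in a temporary directory; the variable names are illustrative only.

```python
import os
import tempfile

# Hypothetical scratch file for this illustration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "x") as f:   # 'x': create the file, fail if it exists
    f.write("first\n")

try:
    open(path, "x")          # second exclusive creation attempt
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True  # the file is already there

with open(path, "a") as f:   # 'a': append to the end
    f.write("second\n")

with open(path) as f:        # 'r' and 't' are the defaults
    after_append = f.read()

with open(path, "w") as f:   # 'w': truncate first, then write
    f.write("replaced\n")

with open(path, "r+") as f:  # 'r+': read and write without truncating
    before = f.read()        # reading moves the position to the end
    f.write("added\n")

with open(path, "rb") as f:  # 'b': the content comes back as bytes
    raw = f.read()

print(exclusive_failed, repr(after_append), repr(before), type(raw))
```

Note that `'w'` discarded both earlier lines, while `'a'` and `'r+'` kept the existing content.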

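### reading and writing files
* import os
* os.remove("novel.txt") -> delete the file
* os.rename("before_rename.txt", "after_rename") -> rename the file
* os.path.exists("novel.txt") -> True/False; returns True if the file exists
* os.path.getsize("file") -> size of the file in bytes
* import datetime
* timestamp = os.path.getmtime("spider.txt") -> last-modified time in seconds since the epoch; datetime.datetime.fromtimestamp(timestamp) turns it into a readable date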
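
The file calls listed above can be combined into one runnable sketch. It assumes a temporary directory and hypothetical file names so that nothing in the working directory is touched.

```python
import datetime
import os
import tempfile

# Hypothetical scratch paths for this illustration.
workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "before_rename.txt")
new_path = os.path.join(workdir, "after_rename.txt")

with open(old_path, "w") as f:
    f.write("hello")

size = os.path.getsize(old_path)        # size in bytes (5 here)
timestamp = os.path.getmtime(old_path)  # seconds since the epoch, as a float
modified = datetime.datetime.fromtimestamp(timestamp)  # readable date and time

os.rename(old_path, new_path)
old_still_exists = os.path.exists(old_path)  # False after the rename
os.remove(new_path)
new_still_exists = os.path.exists(new_path)  # False after the removal

print(size, modified, old_still_exists, new_still_exists)
```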
## managing files and directories
* new dir: os.mkdir(". . .")
* change dir: os.chdir(". . .")
* current dir: os.getcwd()
* remove dir: os.rmdir(". . .") -> works only if the directory is empty
* os.listdir("directory_name") -> a list of the files and directories inside it
* os.path.join(dir, name) -> e.g. website/images
* file or directory? -> os.path.isdir(name) -> returns True or False, e.g. True for website/images because images is a directory
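
A minimal sketch of the directory calls above, assuming a hypothetical `website` folder created inside a temporary directory. The original working directory is restored at the end.

```python
import os
import tempfile

# Hypothetical layout: <tmp>/website/images and <tmp>/website/index.html
base = tempfile.mkdtemp()
start_dir = os.getcwd()

site = os.path.join(base, "website")
os.mkdir(site)
os.mkdir(os.path.join(site, "images"))
with open(os.path.join(site, "index.html"), "w") as f:
    f.write("<html></html>")

os.chdir(site)
try:
    entries = sorted(os.listdir("."))  # files and directories, as names
    kinds = {name: os.path.isdir(name) for name in entries}
finally:
    os.chdir(start_dir)                # always go back where we started

os.rmdir(os.path.join(site, "images"))  # succeeds because images is empty

print(entries)
print(kinds)
```

Using `os.path.join` rather than concatenating strings keeps the paths valid on operating systems with different separators.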

# generating CSV with lists

In [25]:
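import csv
hosts = [['work.local', 'sai gon'], ['study.local', 'phan lan']]
# newline='' lets the csv module control line endings (avoids blank rows on Windows)
with open('host.csv', 'w', newline='') as hosts_csv:
    writer = csv.writer(hosts_csv)
    # writerows writes every inner list as one row
    writer.writerows(hosts)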

# Put the header (schema) row first, then append the data rows.
# The data here is small, but the same pattern works for large data.
hosts2_with_header = [['this is the schema']]
hosts2 = [['this is the test text 1'], ['this is the test text 2']]
for content in hosts2:
    hosts2_with_header.append(content)
# Write the header and data rows to the CSV file
with open('host2.csv', 'w', newline='') as hosts_csv2:
    writer = csv.writer(hosts_csv2)
    # writerows writes every inner list as one row
    writer.writerows(hosts2_with_header)

#import pandas as pd
#data = pd.read_csv("../Chapter_2/host2.csv")
#x = read_csv('host2.csv')
#x.columns = ['test schema']
#x.to_csv('host2.csv', index = False)
#data

In [27]:
h1 = [['test schema']]
h2 = [['this is the test text 1'], ['this is the test text 2']]
for content in h2:
    h1.append(content)
h1

[['test schema'], ['this is the test text 1'], ['this is the test text 2']]
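
Instead of building one combined list, the header can also be written with `writerow` and the data with `writerows`. A minimal sketch, assuming hypothetical column names (`hostname`, `location`) and a scratch file in a temporary directory:

```python
import csv
import os
import tempfile

# Hypothetical scratch file and column names for this illustration.
csv_path = os.path.join(tempfile.mkdtemp(), "host_demo.csv")
header = ["hostname", "location"]
hosts = [["work.local", "sai gon"], ["study.local", "phan lan"]]

with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)   # one row: the header
    writer.writerows(hosts)   # many rows: the data

# Read the file back to confirm what was written.
with open(csv_path, newline="") as f:
    rows_read = list(csv.reader(f))

print(rows_read)
```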

# generating CSV with dictionaries

In [46]:
import os
import csv

# Create a file with data in it
def create_file(filename):
  with open(filename, "w") as file:
    file.write("name,color,type\n")
    file.write("carnation,pink,annual\n")
    file.write("daffodil,yellow,perennial\n")
    file.write("iris,blue,perennial\n")
    file.write("poinsettia,red,perennial\n")
    file.write("sunflower,yellow,annual\n")

# Read the file contents and format the information about each row
def contents_of_file(filename):
  return_string = ""

  # Call the function to create the file 
  create_file(filename)

  # Open the file
  with open(filename) as file_name:
    # Read the rows of the file into a dictionary
    rows = csv.DictReader(file_name)
    
    # Process each item of the dictionary
    for row in rows:
      return_string += "a {} {} is {}\n".format(row["color"], row["name"], row["type"])
  return return_string

#Call the function
print(contents_of_file("flowers.csv"))

a pink carnation is annual
a yellow daffodil is perennial
a blue iris is perennial
a red poinsettia is perennial
a yellow sunflower is annual



In [45]:
import os
import csv

# Create a file with data in it
def create_file(filename):
  with open(filename, "w") as file:
    file.write("name,color,type\n")
    file.write("carnation,pink,annual\n")
    file.write("daffodil,yellow,perennial\n")
    file.write("iris,blue,perennial\n")
    file.write("poinsettia,red,perennial\n")
    file.write("sunflower,yellow,annual\n")

# Read the file contents and format the information about each row
def contents_of_file(filename):
  return_string = ""

  # Call the function to create the file 
  create_file(filename)

  # Open the file
  with open(filename) as file_name:
    # Read the rows of the file
    rows = csv.reader(file_name)
    # use next to skip the header row
    next(rows)
    # Process each row
    for row in rows:
      name, color, typed = row
      # Format the return string for data rows only

      return_string += "a {} {} is {}\n".format(color, name, typed)
  return return_string

#Call the function
print(contents_of_file("flowers.csv"))

a pink carnation is annual
a yellow daffodil is perennial
a blue iris is perennial
a red poinsettia is perennial
a yellow sunflower is annual
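
The two cells above read a CSV file. For the writing direction, `csv.DictWriter` generates a CSV file from a list of dictionaries. A minimal sketch, assuming a scratch file in a temporary directory and reusing two of the flower rows as sample data:

```python
import csv
import os
import tempfile

# Hypothetical scratch file for this illustration.
flowers_path = os.path.join(tempfile.mkdtemp(), "flowers_demo.csv")

flowers = [
    {"name": "carnation", "color": "pink", "type": "annual"},
    {"name": "iris", "color": "blue", "type": "perennial"},
]
fieldnames = ["name", "color", "type"]  # column order in the file

with open(flowers_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()        # writes "name,color,type"
    writer.writerows(flowers)   # one row per dictionary

# Read the file back with DictReader to check the round trip.
with open(flowers_path, newline="") as f:
    round_trip = [dict(row) for row in csv.DictReader(f)]

print(round_trip)
```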



In [48]:
import csv

def read_employees(csv_file_location):
    csv.register_dialect('empDialect', skipinitialspace=True, strict=True)
    # Open the file in a with block so it is closed once the rows are read
    with open(csv_file_location, newline='') as csv_file:
        employee_file = csv.DictReader(csv_file, dialect='empDialect')
        employee_list = []
        for data in employee_file:
            employee_list.append(data)
    return employee_list

employee_list = read_employees("/Users/phamthailinh/Desktop/ds_research/Google_IT_Python_Prof_Cert/Chapter_2/host.csv")
print(employee_list)

[OrderedDict([('Full Name', 'Audrey Miller'), ('Username', 'audrey'), ('Department', 'Development')]), OrderedDict([('Full Name', 'Arden Garcia'), ('Username', 'ardeng'), ('Department', 'Sales')]), OrderedDict([('Full Name', 'Bailey Thomas'), ('Username', 'baileyt'), ('Department', 'Human Resources')]), OrderedDict([('Full Name', 'Blake Sousa'), ('Username', 'sousa'), ('Department', 'IT infrastructure')]), OrderedDict([('Full Name', 'Cameron Nguyen'), ('Username', 'nguyen'), ('Department', 'Marketing')]), OrderedDict([('Full Name', 'Charlie Grey'), ('Username', 'greyc'), ('Department', 'Development')]), OrderedDict([('Full Name', 'Chris Black'), ('Username', 'chrisb'), ('Department', 'User Experience Research')]), OrderedDict([('Full Name', 'Courtney Silva'), ('Username', 'silva'), ('Department', 'IT infrastructure')]), OrderedDict([('Full Name', 'Darcy Johnsonn'), ('Username', 'darcy'), ('Department', 'IT infrastructure')]), OrderedDict([('Full Name', 'Elliot Lamb'), ('Username', 'ell