# Lecture-6 Files

The open() function accepts two main parameters: the filename and the mode. <br>
A file can be opened in four basic modes: <br>
"r" - Read - Default value. Opens a file for reading and raises an error if the file does not exist. <br>
"a" - Append - Opens a file for appending and creates the file if it does not exist. <br>
"w" - Write - Opens a file for writing, creates the file if it does not exist, and overwrites any existing content. <br>
"x" - Create - Creates the specified file and raises an error if it already exists.
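
The four modes can be compared side by side in a short sketch. It assumes a throwaway temporary directory and a hypothetical file name, `modes_demo.txt`, neither of which is part of the lecture files.

```python
import os
import tempfile

# Assumption: work in a temporary directory so no real files are touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "modes_demo.txt")  # hypothetical file name

# "r" on a missing file raises FileNotFoundError.
try:
    open(path, "r")
except FileNotFoundError:
    missing_error = True
else:
    missing_error = False

# "x" creates the file, but only if it does not exist yet.
with open(path, "x") as f:
    f.write("first\n")

# A second "x" on the same path raises FileExistsError.
try:
    open(path, "x")
except FileExistsError:
    exists_error = True
else:
    exists_error = False

# "a" keeps the existing content and adds to the end.
with open(path, "a") as f:
    f.write("second\n")
with open(path) as f:
    after_append = f.read()

# "w" discards the existing content before writing.
with open(path, "w") as f:
    f.write("third\n")
with open(path) as f:
    after_write = f.read()

print(missing_error, exists_error)
print(repr(after_append))
print(repr(after_write))
```

The difference between "a" and "w" is the one most likely to cause data loss: "w" silently empties a file that already exists.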


### Writing to a file

In [1]:
with open("test.txt", "w") as myfile:
    myfile.write("My first file written from Python\n")
    myfile.write("Hello, world!\n")

In [None]:
# Read a few lines (say N) of text using input() and write them to a file

### Read the whole file

In [2]:
with open("test.txt") as f:
    content = f.read()
print(content)

My first file written from Python
Hello, world!



### Walk Through: Develop a function that displays a file

infile.read() reads the entire file as a single string. infile.readline() reads one line at a time: each call returns the next line, and an empty string once the end of the file is reached.
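
As a minimal sketch of the difference, the cell below writes a two-line file and reads it both ways. The temporary path and the name `readline_demo.txt` are assumptions made for illustration.

```python
import os
import tempfile

# Assumption: a small two-line file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "readline_demo.txt")
with open(path, "w") as f:
    f.write("line one\nline two\n")

# readline() returns one line per call, including its trailing newline.
with open(path) as infile:
    first = infile.readline()
    second = infile.readline()
    at_end = infile.readline()  # empty string: nothing left to read

# read() returns everything as one string.
with open(path) as infile:
    whole = infile.read()

print(repr(first), repr(second), repr(at_end))
print(repr(whole))
```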

In [3]:
def show(infile):
    with open(infile) as f:
        content = f.read()
    print(content)

# call it
show('test.txt')

My first file written from Python
Hello, world!



### Read a file line by line

In [4]:
with open("test.txt", "r") as my_new_handle:
    for line in my_new_handle:
        print(line)

My first file written from Python

Hello, world!
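
The blank lines in the output above appear because each line read from the file still ends with "\n", and print() adds a newline of its own. A small sketch of two common ways to avoid this follows, assuming a temporary copy of the same two lines (the file name is hypothetical).

```python
import os
import tempfile

# Assumption: recreate the two-line file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "spacing_demo.txt")
with open(path, "w") as f:
    f.write("My first file written from Python\n")
    f.write("Hello, world!\n")

stripped = []
with open(path) as handle:
    for line in handle:
        # Option 1: tell print() not to add its own newline.
        print(line, end="")
        # Option 2: remove the newline that came from the file.
        stripped.append(line.rstrip("\n"))

print(stripped)
```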



### Walk Through: Can you count the number of lines in a file?

In [5]:
count = 0

with open("test.txt", "r") as my_new_handle:
    for line in my_new_handle:
        count += 1
        print(line)

print('This file contains ', count, ' lines')

My first file written from Python

Hello, world!

This file contains  2  lines
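
A more compact way to count lines, shown here only as an alternative sketch, is to sum one for every line the file handle yields. The temporary file below is an assumption standing in for test.txt.

```python
import os
import tempfile

# Assumption: a temporary stand-in for test.txt with the same two lines.
path = os.path.join(tempfile.mkdtemp(), "count_demo.txt")
with open(path, "w") as f:
    f.write("My first file written from Python\nHello, world!\n")

# The generator yields 1 per line, so the sum is the line count.
with open(path) as handle:
    line_count = sum(1 for _ in handle)

print("This file contains", line_count, "lines")
```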


### Convert each line to uppercase

In [6]:
with open("test.txt", "r") as my_new_handle:
    for line in my_new_handle:
        line1 = line.upper()
        print(line1)

# NOTE: Remember we did not change the original file

MY FIRST FILE WRITTEN FROM PYTHON

HELLO, WORLD!



### Read all lines of a file and convert to uppercase

In [7]:
with open("test.txt", "r") as input_file:
    all_lines = input_file.readlines()

# convert all lines to uppercase
all_lines = [line.upper() for line in all_lines]

# write them to a new file (the with block closes it automatically)
with open("sortedtest.txt", "w") as output_file:
    for line in all_lines:
        output_file.write(line)

# Now show it
show("sortedtest.txt")

MY FIRST FILE WRITTEN FROM PYTHON
HELLO, WORLD!



### Appending additional text to a file

In [8]:
with open("sortedtest.txt", "a") as myfile:
    # no explicit close() is needed: the with block closes the file
    myfile.write("This is the appended line to an existing file\n")

#now show it
show("sortedtest.txt")

MY FIRST FILE WRITTEN FROM PYTHON
HELLO, WORLD!
This is the appended line to an existing file



### Read and write the same file

In [None]:
with open("sortedtest.txt", "r+") as myfile:
    
    contents = myfile.read()
    print(contents)
    
    # after read(), the file position is at the end, so this line is appended
    myfile.write("This line is written to the same file that is opened for reading\n")

#now show it
show("sortedtest.txt")
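
In "r+" mode, where a write lands depends on the current file position. The sketch below, which assumes a temporary file with hypothetical contents, contrasts writing after a full read() with writing immediately after opening.

```python
import os
import tempfile

# Assumption: a temporary file with two short lines.
path = os.path.join(tempfile.mkdtemp(), "rplus_demo.txt")
with open(path, "w") as f:
    f.write("AAAA\nBBBB\n")

# Reading first moves the position to the end, so the write appends.
with open(path, "r+") as f:
    f.read()
    f.write("CCCC\n")
with open(path) as f:
    after_read_then_write = f.read()

# Writing straight away starts at position 0 and overwrites characters.
with open(path, "r+") as f:
    f.write("ZZ")
with open(path) as f:
    after_overwrite = f.read()

print(repr(after_read_then_write))
print(repr(after_overwrite))
```

Unlike "w", "r+" never empties the file on open, but it does overwrite in place, so seek() or a preceding read() decides which characters are replaced.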

### Walk Through

#### Develop a function that copies the contents of a source file to a target file, excluding comment lines that start with '#'

In [None]:
# Let us first write a few lines to a file
with open("sourcefile.txt", "w") as myfile:
    myfile.write("#This is a sample file\n")
    myfile.write("Python is a fantastic language\n")
    myfile.write("Never forget to learn\n")

#show it
show("sourcefile.txt")

In [1]:
# function filtercopy to copy a file 
def filtercopy(oldfile, newfile):
    with open(oldfile, "r") as infile, open(newfile, "w") as outfile:
        for line in infile:
            if not line.startswith('#'):
                outfile.write(line)

# call function
filtercopy('sourcefile.txt', 'targetfile.txt')

# show targetfile
with open("targetfile.txt") as f:
    content = f.read()
print(content)

Python is a fantastic language
Never forget to learn



## Read data from web

In [2]:
import urllib.request

# response.read() returns bytes, so the page is printed as a bytes literal (b'...')
with urllib.request.urlopen('http://www.bhc.edu.in') as response:
    htmldata = response.read()

print(htmldata)
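
Because response.read() returns bytes, the page usually needs decoding before it can be handled as text. The sketch below does not fetch anything: it uses a hypothetical bytes value as a stand-in for htmldata and assumes the page is encoded as UTF-8, which may not hold for every site.

```python
# Assumption: a stand-in for the bytes returned by response.read().
raw = "<html><title>Caf\u00e9</title></html>".encode("utf-8")

# Assumption: the page is UTF-8; a real page may declare another charset.
text = raw.decode("utf-8")

print(type(raw).__name__, type(text).__name__)
print(text)
```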



## Walk Through: Email Files Processing

### Print first 50 lines from the file

In [5]:
with open('mbox-short.txt') as fhand:
    data = fhand.readlines()

# print each of the first 50 lines; end='' avoids doubling the newline
for line in data[:50]:
    print(line, end='')
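
readlines() loads the whole file into memory before slicing. For large files, one alternative is itertools.islice, which stops reading after the requested number of lines. The sketch below assumes a generated 100-line temporary file rather than mbox-short.txt.

```python
import os
import tempfile
from itertools import islice

# Assumption: a generated 100-line file standing in for mbox-short.txt.
path = os.path.join(tempfile.mkdtemp(), "islice_demo.txt")
with open(path, "w") as f:
    for i in range(100):
        f.write(f"line {i}\n")

# islice pulls only the first 50 lines from the file handle.
with open(path) as fhand:
    first_lines = list(islice(fhand, 50))

for line in first_lines[:3]:
    print(line, end="")
print(len(first_lines))
```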



### Count the number of emails in a file

In [6]:
# Print all lines that start with "From:" and count them
count = 0
with open('mbox-short.txt') as fhand:
    for line in fhand:
        if line.startswith('From:'):
            print(line)
            count += 1
print("Number of emails: ", count)

From: stephen.marquard@uct.ac.za

From: louis@media.berkeley.edu

From: zqian@umich.edu

From: rjlowe@iupui.edu

From: zqian@umich.edu

From: rjlowe@iupui.edu

From: cwen@iupui.edu

From: cwen@iupui.edu

From: gsilver@umich.edu

From: gsilver@umich.edu

From: zqian@umich.edu

From: gsilver@umich.edu

From: wagnermr@iupui.edu

From: zqian@umich.edu

From: antranig@caret.cam.ac.uk

From: gopal.ramasammycook@gmail.com

From: david.horwitz@uct.ac.za

From: david.horwitz@uct.ac.za

From: david.horwitz@uct.ac.za

From: david.horwitz@uct.ac.za

From: stephen.marquard@uct.ac.za

From: louis@media.berkeley.edu

From: louis@media.berkeley.edu

From: ray@media.berkeley.edu

From: cwen@iupui.edu

From: cwen@iupui.edu

From: cwen@iupui.edu

Number of emails:  27


### Display all email IDs of a particular domain, say 'uct.ac.za'

#### Solution: Use the find() method. For example, find('@uct.ac.za') returns the position of the argument if it is found, otherwise -1.

In [7]:
with open('mbox-short.txt') as fhand:
    for line in fhand:
        line = line.rstrip()
        # find() returns -1 when the domain is not in the line
        if line.find('@uct.ac.za') == -1:
            continue
        print(line)

From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008
From: stephen.marquard@uct.ac.za
Author: stephen.marquard@uct.ac.za
From david.horwitz@uct.ac.za Fri Jan  4 07:02:32 2008
From: david.horwitz@uct.ac.za
Author: david.horwitz@uct.ac.za
r39753 | david.horwitz@uct.ac.za | 2008-01-04 13:05:51 +0200 (Fri, 04 Jan 2008) | 1 line
From david.horwitz@uct.ac.za Fri Jan  4 06:08:27 2008
From: david.horwitz@uct.ac.za
Author: david.horwitz@uct.ac.za
From david.horwitz@uct.ac.za Fri Jan  4 04:49:08 2008
From: david.horwitz@uct.ac.za
Author: david.horwitz@uct.ac.za
From david.horwitz@uct.ac.za Fri Jan  4 04:33:44 2008
From: david.horwitz@uct.ac.za
Author: david.horwitz@uct.ac.za
From stephen.marquard@uct.ac.za Fri Jan  4 04:07:34 2008
From: stephen.marquard@uct.ac.za
Author: stephen.marquard@uct.ac.za
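
When only a yes/no answer is needed, the `in` operator expresses the same test as find() more directly. The sketch below uses three sample lines copied from the output above instead of opening mbox-short.txt.

```python
# Sample lines taken from the output above.
lines = [
    "From: stephen.marquard@uct.ac.za",
    "From: louis@media.berkeley.edu",
    "From: david.horwitz@uct.ac.za",
]
domain = "@uct.ac.za"

# find() returns an index, or -1 when the substring is absent.
found_at = lines[0].find(domain)
not_found = lines[1].find(domain)

# `in` returns True or False, which reads more naturally in a condition.
matches = [line for line in lines if domain in line]

print(found_at >= 0, not_found)
print(matches)
```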


### Exercises

#### Basic exercises

Prompt the user for a username and a password. Store the username and password in a file named “security.txt”. Store each value on its own line. <br>

Write a program that opens your “security.txt” file and reads in the username and password stored in the file. Store these values in variables. Prompt the user for a username and password. If they match the values stored in the file, allow the user to continue. Otherwise, present an error message.

#### Writing and reading numeric data

write() only writes strings, not numeric data, so write(55) raises a TypeError. Convert the number to a string first, as in write(str(55)). When reading the value back, convert the string to a number with int() or float().
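
A minimal sketch of the round trip follows, assuming a temporary file and three made-up scores that are not part of any exercise data.

```python
import os
import tempfile

# Assumption: a temporary file and hypothetical scores for illustration.
path = os.path.join(tempfile.mkdtemp(), "numbers_demo.txt")
scores = [55, 72.5, 90]

with open(path, "w") as f:
    # Passing a number directly to write() raises TypeError.
    try:
        f.write(55)
    except TypeError:
        raised_type_error = True
    else:
        raised_type_error = False

    # Convert each number to a string and put it on its own line.
    for score in scores:
        f.write(str(score) + "\n")

# Reading returns strings, so convert back before doing arithmetic.
with open(path) as f:
    values = [float(line) for line in f]

average = sum(values) / len(values)
print(values, average)
```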

Write a program that opens up a file named “testscores.txt”.  This file contains the following information in the following format: <br>
student name<br>
score1<br>
score2<br>
score3<br>
Read in the values and print out the average score for the student specified in the file along with the student’s name.

#### Students performance analysis

Create a text file, 'marks.txt', with 10 marks as floating point numbers. Open the file, read the marks from it, and compute and print the highest mark.

Modify the above program so that it prints the 3 highest marks. (Note: you may need to use a list.)

Modify the above program so that it prints the 3 lowest marks.

#### Average price computation

Programming Challenge <br>
Continually prompt a user for a series of price values<br>
Store these values in a text file called “prices.txt”<br>
When the user enters a price value of 0 or less you can assume they want to end the program.<br>
If the user runs the program more than once you should not overwrite the previous text file – simply append the new price values to the end of the file<br>
Hint: open the file using the ‘a’ flag
<br>
<br>
Open your ‘prices.txt’ file for reading<br>
Read in all values and calculate the average price based on the values that are contained in the file

#### Match Making Program

Write a “matchmaking” program that asks the user to enter their name, favorite color and favorite food. <br>
Store the result in a text file. <br>

Open your matchmaking text file and ask a second user for a favorite color and favorite food. <br>
Compare the results: if they get 0/2 questions correct, they are not a match! With 1/2, they might be a match. With 2/2, they are definitely a match.

#### Email processing

#### 1. Display only the email IDs from lines that start with 'From:'

#### 2. Count all subject lines of emails

#### 3. Write a program to read through a file and print the contents of the file (line by line) all in upper case.

#### 4. Write a program to prompt for a file name, and then read through the file and look for lines of the form: 'X-DSPAM-Confidence: 0.8475'. Add all spam confidence scores and print the average value such as the following:

##### Enter the file name: mbox-short.txt
##### Average spam confidence: 0.750718518519

### Solutions

In [8]:
# Print the email ID from each 'From:' line, without the 'From: ' prefix
fhand = open('mbox-short.txt')
count = 0
for line in fhand:
    if line.startswith('From:'):
        # 'From: ' is 6 characters long, so the address starts at index 6
        print(line[6:])
        count += 1
print("Number of emails: ", count)
fhand.close()

stephen.marquard@uct.ac.za

louis@media.berkeley.edu

zqian@umich.edu

rjlowe@iupui.edu

zqian@umich.edu

rjlowe@iupui.edu

cwen@iupui.edu

cwen@iupui.edu

gsilver@umich.edu

gsilver@umich.edu

zqian@umich.edu

gsilver@umich.edu

wagnermr@iupui.edu

zqian@umich.edu

antranig@caret.cam.ac.uk

gopal.ramasammycook@gmail.com

david.horwitz@uct.ac.za

david.horwitz@uct.ac.za

david.horwitz@uct.ac.za

david.horwitz@uct.ac.za

stephen.marquard@uct.ac.za

louis@media.berkeley.edu

louis@media.berkeley.edu

ray@media.berkeley.edu

cwen@iupui.edu

cwen@iupui.edu

cwen@iupui.edu

Number of emails:  27


In [9]:
#Print all subject lines
fhand = open('mbox-short.txt')
count = 0
for line in fhand:
    if line.startswith('Subject:'):
        print(line)
        count += 1
print("Number of emails: ", count)
fhand.close()   

Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/content-impl/impl/src/java/org/sakaiproject/content/impl

Subject: [sakai] svn commit: r39771 - in bspace/site-manage/sakai_2-4-x/site-manage-tool/tool/src: bundle java/org/sakaiproject/site/tool

Subject: [sakai] svn commit: r39770 - site-manage/branches/sakai_2-5-x/site-manage-tool/tool/src/webapp/vm/sitesetup

Subject: [sakai] svn commit: r39769 - in gradebook/trunk/app/ui/src: java/org/sakaiproject/tool/gradebook/ui/helpers/beans java/org/sakaiproject/tool/gradebook/ui/helpers/producers webapp/WEB-INF webapp/WEB-INF/bundle

Subject: [sakai] svn commit: r39766 - site-manage/branches/sakai_2-4-x/site-manage-tool/tool/src/java/org/sakaiproject/site/tool

Subject: [sakai] svn commit: r39765 - in gradebook/trunk/app: business/src/java/org/sakaiproject/tool/gradebook/business business/src/java/org/sakaiproject/tool/gradebook/business/impl ui ui/src/java/org/sakaiproject/tool/gradebook/ui/helpers/beans ui/src/java/org/saka