# Chapter 9: Reading and Writing Files
Variables are a fine way to store data while your program is running, but if you want your data to persist even after your program has finished, you need to save it to a file. You can think of a file’s contents as a single string value, potentially gigabytes in size. In this chapter, you will learn how to use Python to create, read, and save files on the hard drive.

## Files and File Paths

A file has two key properties: a filename (usually written as one word) and a path. The path specifies the location of a file on the computer. For example, there is a file on my Windows laptop with the filename project.docx in the path C:\Users\Al\Documents. The part of the filename after the last period is called the file’s extension and tells you a file’s type. The filename project.docx is a Word document, and Users, Al, and Documents all refer to folders (also called directories). Folders can contain files and other folders. For example, project.docx is in the Documents folder, which is inside the Al folder, which is inside the Users folder. Figure 9-1 shows this folder organization.

The C:\ part of the path is the root folder, which contains all other folders. On Windows, the root folder is named C:\ and is also called the C: drive. On macOS and Linux, the root folder is /. In this book, I’ll use the Windows-style root folder, C:\. If you are entering the interactive shell examples on macOS or Linux, enter / instead.

Additional volumes, such as a DVD drive or USB flash drive, will appear differently on different operating systems. On Windows, they appear as new, lettered root drives, such as D:\ or E:\. On macOS, they appear as new folders under the /Volumes folder. On Linux, they appear as new folders under the /mnt (“mount”) folder. Also note that while folder names and filenames are not case-sensitive on Windows and macOS, they are case-sensitive on Linux.

### Backslash on Windows and Forward Slash on macOS and Linux

On Windows, paths are written using backslashes (\) as the separator between folder names. The macOS and Linux operating systems, however, use the forward slash (/) as their path separator. If you want your programs to work on all operating systems, you will have to write your Python scripts to handle both cases.

Fortunately, this is simple to do with the Path() function in the pathlib module. If you pass it the string values of individual file and folder names in your path, Path() will return a Path object with a file path using the correct path separators.

In [3]:
from pathlib import Path
Path('spam', 'bacon', 'eggs')

WindowsPath('spam/bacon/eggs')

In [4]:
str(Path('spam', 'bacon', 'eggs'))

'spam\\bacon\\eggs'

Note that the convention for importing pathlib is to run from pathlib import Path, since otherwise we’d have to enter pathlib.Path everywhere Path shows up in our code. Not only is this extra typing redundant, but it’s also redundant.

I’m running this chapter’s interactive shell examples on Windows, so Path('spam', 'bacon', 'eggs') returned a WindowsPath object for the joined path, represented as WindowsPath('spam/bacon/eggs'). Even though Windows uses backslashes, the WindowsPath representation in the interactive shell displays them using forward slashes, since open source software developers have historically favored the Linux operating system.

If you want to get a simple text string of this path, you can pass it to the str() function, which in our example returns 'spam\\bacon\\eggs'. (Notice that the backslashes are doubled because each backslash needs to be escaped by another backslash character.) If I had called this function on, say, Linux, Path() would have returned a PosixPath object that, when passed to str(), would have returned 'spam/bacon/eggs'. (POSIX is a set of standards for Unix-like operating systems such as Linux.)
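If you'd like to see both results without switching computers, pathlib also provides the PureWindowsPath and PurePosixPath classes, which only manipulate the text of a path and so work on any operating system. The following is a small sketch; the variable names are made up for illustration:

```python
from pathlib import PurePosixPath, PureWindowsPath

# "Pure" path classes never touch the disk, so either style can be
# built on any operating system.
windowsStyle = PureWindowsPath('spam', 'bacon', 'eggs')
posixStyle = PurePosixPath('spam', 'bacon', 'eggs')

print(str(windowsStyle))  # spam\bacon\eggs
print(str(posixStyle))    # spam/bacon/eggs
```

In everyday code you would still call Path() and let Python pick the right class for the computer it is running on.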

These Path objects (really, WindowsPath or PosixPath objects, depending on your operating system) will be passed to several of the file-related functions introduced in this chapter. For example, the following code joins names from a list of filenames to the end of a folder’s name:

In [5]:
from pathlib import Path
myFiles = ['account.txt', 'details.csv', 'invite.docx'] # create a list of files
for filename in myFiles:
    print(Path(r'C:\Users\Zac', filename))

C:\Users\Zac\account.txt
C:\Users\Zac\details.csv
C:\Users\Zac\invite.docx


On Windows, the backslash separates directories, so you can’t use it in filenames. However, you can use backslashes in filenames on macOS and Linux. So while Path(r'spam\eggs') refers to two separate folders (or a file eggs in a folder spam) on Windows, the same command would refer to a single folder (or file) named spam\eggs on macOS and Linux. For this reason, it’s usually a good idea to always use forward slashes in your Python code (and I’ll be doing so for the rest of this chapter). The pathlib module will ensure that it always works on all operating systems.
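Here is a short sketch of that difference. It uses the PureWindowsPath and PurePosixPath classes and the parts attribute so that both behaviors can be shown on one computer; the variable names are only for illustration:

```python
from pathlib import PurePosixPath, PureWindowsPath

# On Windows the backslash is a separator, so this path has two parts.
windowsBackslash = PureWindowsPath(r'spam\eggs').parts
print(windowsBackslash)   # ('spam', 'eggs')

# On macOS and Linux the backslash is just another filename character.
posixBackslash = PurePosixPath(r'spam\eggs').parts
print(posixBackslash)     # ('spam\\eggs',)

# A forward slash is treated as a separator by both styles.
windowsSlash = PureWindowsPath('spam/eggs').parts
posixSlash = PurePosixPath('spam/eggs').parts
print(windowsSlash == posixSlash)   # True
```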

Note that pathlib was introduced in Python 3.4 to replace older os.path functions. The Python Standard Library modules support it as of Python 3.6, but if you are working with legacy Python 2 versions, I recommend using pathlib2, which gives you pathlib’s features on Python 2.7. Appendix A has instructions for installing pathlib2 using pip. Whenever I’ve replaced an older os.path function with pathlib, I’ve made a short note. You can look up the older functions at https://docs.python.org/3/library/os.path.html.

### Using the / Operator to Join Paths

We normally use the + operator to add two integer or floating-point numbers, such as in the expression 2 + 2, which evaluates to the integer value 4. But we can also use the + operator to concatenate two string values, like the expression 'Hello' + 'World', which evaluates to the string value 'HelloWorld'. Similarly, the / operator that we normally use for division can also combine Path objects and strings. This is helpful for modifying a Path object after you’ve already created it with the Path() function.

In [1]:
from pathlib import Path
Path('spam') / 'bacon' / 'eggs'

WindowsPath('spam/bacon/eggs')

In [2]:
Path('spam') / Path('bacon/eggs')

WindowsPath('spam/bacon/eggs')

In [3]:
Path('spam') / Path('bacon', 'eggs')

WindowsPath('spam/bacon/eggs')

Using the / operator with Path objects makes joining paths just as easy as string concatenation. It’s also safer than using string concatenation or the join() method, like we do in this example:

In [None]:
# This is not a suggested way to join paths (it only works on Windows because of the backslash)
homeFolder = r'C:\Users\Al'
subFolder = 'spam'
homeFolder + '\\' + subFolder

'C:\\Users\\Al\\spam'

In [None]:
# This is not a suggested way to join paths (it only works on Windows because of the backslash)
'\\'.join([homeFolder, subFolder])

'C:\\Users\\Al\\spam'

A script that uses this code isn’t safe, because its backslashes would only work on Windows. You could add an if statement that checks sys.platform (which contains a string describing the computer’s operating system) to decide what kind of slash to use, but applying this custom code everywhere it’s needed can be inconsistent and bug-prone.

The pathlib module solves these problems by reusing the / math division operator to join paths correctly, no matter what operating system your code is running on. The following example uses this strategy to join the same paths as in the previous example:

In [6]:
homeFolder = Path('C:/Users/Al')
subFolder = Path('spam')
homeFolder / subFolder

WindowsPath('C:/Users/Al/spam')

In [7]:
str(homeFolder / subFolder)

'C:\\Users\\Al\\spam'

The only thing you need to keep in mind when using the / operator for joining paths is that one of the first two values must be a Path object. Python will give you an error if you try entering the following into the interactive shell:

In [9]:
# gives an error: 'spam' / 'bacon' / 'eggs'

Python evaluates the / operator from left to right and evaluates to a Path object, so either the first or second leftmost value must be a Path object for the entire expression to evaluate to a Path object.

If you see the error message TypeError: unsupported operand type(s) for /: 'str' and 'str', you need to put a Path object on the left side of the expression.
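To reproduce the error without stopping your program, you can catch it with a try statement. This sketch stores the message in a variable (errorMessage, a name chosen just for this example) and then shows two ways of fixing the expression:

```python
from pathlib import Path

try:
    'spam' / 'bacon' / 'eggs'   # 'spam' / 'bacon' is evaluated first and fails
except TypeError as error:
    errorMessage = str(error)
    print(errorMessage)

# A Path object as the first or the second value fixes the expression.
pathFirst = Path('spam') / 'bacon' / 'eggs'
pathSecond = 'spam' / Path('bacon') / 'eggs'
print(pathFirst == pathSecond)   # True
```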

The / operator replaces the older os.path.join() function, which you can learn more about from https://docs.python.org/3/library/os.path.html#os.path.join

### The Current Working Directory

Every program that runs on your computer has a current working directory, or cwd. Any filenames or paths that do not begin with the root folder are assumed to be under the current working directory.

You can get the current working directory as a Path object with the Path.cwd() function and change it using os.chdir().

In [16]:
from pathlib import Path
import os
Path.cwd()

WindowsPath('C:/Users/Zac/OneDrive/Python/Practice/ATBS')

In [18]:
os.chdir('C:\\Users\\Zac\\OneDrive\\Python\\Practice\\ATBS')
Path.cwd()

WindowsPath('C:/Users/Zac/OneDrive/Python/Practice/ATBS')

Python will display an error if you try to change to a directory that does not exist.

In [19]:
os.chdir('C:/ThisFolderDoesNotExist')

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'C:/ThisFolderDoesNotExist'

There is no pathlib function for changing the working directory, because changing the current working directory while a program is running can often lead to subtle bugs.
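If you do need to change the working directory, one way to limit the risk is to remember the original directory and always switch back. The following is only a sketch of that idea; it uses a temporary folder from the tempfile module so that it doesn't touch your own files, and notes.txt is a made-up filename:

```python
import os
import tempfile
from pathlib import Path

originalCwd = Path.cwd()

with tempfile.TemporaryDirectory() as tempFolder:
    os.chdir(tempFolder)
    try:
        # A relative path is now interpreted under the temporary folder.
        Path('notes.txt').write_text('hello')
        foundInTemp = (Path(tempFolder) / 'notes.txt').exists()
    finally:
        os.chdir(originalCwd)   # always restore the original directory

print(foundInTemp)                 # True
print(Path.cwd() == originalCwd)   # True
```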

### The Home Directory

All users have a folder for their own files on the computer called the home directory or home folder. You can get a Path object of the home folder by calling Path.home():

In [21]:
Path.home()

WindowsPath('C:/Users/Zac')

The home directories are located in a set place depending on your operating system:

    On Windows, home directories are under C:\Users.
    On macOS, home directories are under /Users.
    On Linux, home directories are often under /home.

Your scripts will almost certainly have permissions to read and write the files under your home directory, so it’s an ideal place to put the files that your Python programs will work with.

### Absolute vs. Relative Paths

There are two ways to specify a file path:

    An absolute path, which always begins with the root folder
    A relative path, which is relative to the program’s current working directory

There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.” Two periods (“dot-dot”) means “the parent folder.”

In [None]:
# Relative path: .\fizz
# Absolute path: C:\bacon\fizz
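To see how the dot and dot-dot names are interpreted, you can ask Python to simplify a path that contains them. This sketch uses normpath() from the ntpath and posixpath modules (the Windows and POSIX versions of os.path), so it gives the same results on any operating system. The example paths are made up, and normpath() only rewrites the text of the path without checking the disk:

```python
import ntpath      # the Windows implementation of os.path
import posixpath   # the macOS and Linux implementation of os.path

# A leading dot means "this directory," so it can simply be dropped.
relativeExample = ntpath.normpath(r'.\fizz')
print(relativeExample)   # fizz

# Dot-dot steps back up to the parent folder (here, from fizz to bacon).
absoluteExample = ntpath.normpath(r'C:\bacon\fizz\..\spam\.\eggs.txt')
print(absoluteExample)   # C:\bacon\spam\eggs.txt

posixExample = posixpath.normpath('/bacon/fizz/../spam/./eggs.txt')
print(posixExample)      # /bacon/spam/eggs.txt
```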

### Creating New Folders Using the os.makedirs() Function

Your programs can create new folders (directories) with the os.makedirs() function. Enter the following into the interactive shell:

In [22]:
import os
os.makedirs('C:\\Users\\Zac\\OneDrive\\Python\\Practice\\ATBS\\Test')

os.makedirs() will create any necessary intermediate folders in order to ensure that the full path exists.

To make a directory from a Path object, call the mkdir() method. For example, this code will create a spam folder under the home folder on my computer:

In [None]:
# make a new directory from a Path object
from pathlib import Path
Path(r'C:\users\Zac\spam').mkdir()

Note that, by default, mkdir() can only make one directory at a time; it won’t make several subdirectories at once like os.makedirs() unless you pass it the parents=True keyword argument.
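The mkdir() method does accept a parents=True keyword argument that makes it create the missing intermediate folders, much like os.makedirs(), and an exist_ok=True argument that keeps it from raising an error if the folder is already there. Here is a sketch that tries both behaviors inside a temporary folder so nothing is left behind on your computer; the folder names are just examples:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tempFolder:
    nested = Path(tempFolder) / 'spam' / 'bacon' / 'eggs'

    try:
        nested.mkdir()   # fails: the spam and bacon folders don't exist yet
    except FileNotFoundError:
        plainMkdirFailed = True
    else:
        plainMkdirFailed = False

    # parents=True creates spam and bacon along the way.
    nested.mkdir(parents=True, exist_ok=True)
    createdNested = nested.is_dir()

print(plainMkdirFailed)   # True
print(createdNested)      # True
```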

### Handling Absolute and Relative Paths

The pathlib module provides methods for checking whether a given path is an absolute path and returning the absolute path of a relative path.

Calling the is_absolute() method on a Path object will return True if it represents an absolute path or False if it represents a relative path. For example, enter the following into the interactive shell, using your own files and folders instead of the exact ones listed here:

In [25]:
Path.cwd()

WindowsPath('C:/Users/Zac/OneDrive/Python/Practice/ATBS')

In [27]:
Path.cwd().is_absolute()

True

In [30]:
Path(r'OneDrive\Python\Practice').is_absolute()

False

To get an absolute path from a relative path, you can put Path.cwd() / in front of the relative Path object. After all, when we say “relative path,” we almost always mean a path that is relative to the current working directory. Enter the following into the interactive shell:

In [31]:
Path(r'OneDrive\Python\Practice')

WindowsPath('OneDrive/Python/Practice')

In [32]:
# Get an absolute path from the relative path
Path.cwd() / Path(r'OneDrive\Python\Practice')

WindowsPath('C:/Users/Zac/OneDrive/Python/Practice/ATBS/OneDrive/Python/Practice')

If your relative path is relative to another path besides the current working directory, just replace Path.cwd() with that other path instead. The following example gets an absolute path using the home directory instead of the current working directory:

In [33]:
Path(r'OneDrive\Python\Practice')

WindowsPath('OneDrive/Python/Practice')

In [35]:
# get an absolute path from the home directory instead of the current working directory
Path.home() / Path(r'OneDrive\Python\Practice')

WindowsPath('C:/Users/Zac/OneDrive/Python/Practice')

The os.path module also has some useful functions related to absolute and relative paths:

    Calling os.path.abspath(path) will return a string of the absolute path of the argument. This is an easy way to convert a relative path into an absolute one.
    Calling os.path.isabs(path) will return True if the argument is an absolute path and False if it is a relative path.
    Calling os.path.relpath(path, start) will return a string of a relative path from the start path to path. If start is not provided, the current working directory is used as the start path.

Try these functions in the interactive shell:

In [36]:
os.path.abspath('.')

'C:\\Users\\Zac\\OneDrive\\Python\\Practice\\ATBS'

In [37]:
os.path.abspath('.\\Scripts')

'C:\\Users\\Zac\\OneDrive\\Python\\Practice\\ATBS\\Scripts'

In [38]:
os.path.isabs('.')

False

In [39]:
os.path.isabs(os.path.abspath('.'))

True

Since C:\Users\Zac\OneDrive\Python\Practice\ATBS was the working directory when os.path.abspath() was called, the “single-dot” folder represents the absolute path 'C:\\Users\\Zac\\OneDrive\\Python\\Practice\\ATBS'.

Enter the following calls to os.path.relpath() into the interactive shell:

In [40]:
os.path.relpath('C:\\Windows', 'C:\\')

'Windows'

In [41]:
os.path.relpath('C:\\Windows', 'C:\\spam\\eggs')

'..\\..\\Windows'

When the start path is not a parent of the target path and the two only share a more distant ancestor, as with 'C:\\Windows' and 'C:\\spam\\eggs' (which share only C:\), the relative path uses the “dot-dot” notation to climb back up to that common folder before going down to the target.

### Getting the Parts of a File Path

Given a Path object, you can extract the file path’s different parts as strings using several Path object attributes. These can be useful for constructing new file paths based on existing ones.

The parts of a file path include the following:

    The anchor, which is the root folder of the filesystem
    On Windows, the drive, which is the single letter that often denotes a physical hard drive or other storage device
    The parent, which is the folder that contains the file
    The name of the file, made up of the stem (or base name) and the suffix (or extension)

Note that Windows Path objects have a drive attribute, but macOS and Linux Path objects don’t. The drive attribute doesn’t include the first backslash.

To extract each attribute from the file path, enter the following into the interactive shell:

In [26]:
from pathlib import Path
# anchor returns the root folder of the file system
p = Path('C:/Users/Al/test/test/spam.txt')
p.anchor

'C:\\'

In [27]:
# parent returns all of the directories and sub-directories up to the file name
p.parent # This is a Path object, not a string

WindowsPath('C:/Users/Al/test/test')

In [28]:
# name returns the file name and extension
p.name

'spam.txt'

In [29]:
# suffix returns the extension of the file name
p.suffix

'.txt'

In [30]:
# drive returns the drive name
p.drive

'C:'

These attributes evaluate to simple string values, except for parent, which evaluates to another Path object.
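The stem attribute wasn't shown above, and these attributes become more useful when you build new paths from them. The sketch below uses PureWindowsPath so that it gives the same output on any operating system; with_suffix() and with_name() are pathlib methods that return a new path, and the filenames are made up for the example:

```python
from pathlib import PureWindowsPath

p = PureWindowsPath('C:/Users/Al/test/test/spam.txt')

print(p.stem)   # spam

# Build new paths based on the existing one.
backupPath = p.with_suffix('.bak')            # same name, new extension
renamedPath = p.with_name('eggs.txt')         # same folder, new filename
copyPath = p.parent / ('copy_of_' + p.name)   # assemble the pieces by hand

print(backupPath)    # C:\Users\Al\test\test\spam.bak
print(renamedPath)   # C:\Users\Al\test\test\eggs.txt
print(copyPath)      # C:\Users\Al\test\test\copy_of_spam.txt
```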

The parents attribute (which is different from the parent attribute) evaluates to the ancestor folders of a Path object with an integer index:

In [31]:
Path.cwd()

WindowsPath('c:/Users/Zac/OneDrive/Python/Practice/ATBS')

In [32]:
Path.cwd().parents[0]

WindowsPath('c:/Users/Zac/OneDrive/Python/Practice')

In [33]:
Path.cwd().parents[1]

WindowsPath('c:/Users/Zac/OneDrive/Python')

In [34]:
Path.cwd().parents[2]


WindowsPath('c:/Users/Zac/OneDrive')

In [35]:
Path.cwd().parents[3]


WindowsPath('c:/Users/Zac')

The older os.path module also has similar functions for getting the different parts of a path written in a string value. Calling os.path.dirname(path) will return a string of everything that comes before the last slash in the path argument. Calling os.path.basename(path) will return a string of everything that comes after the last slash in the path argument.

In [37]:
import os
calcFilePath = 'C:\\Windows\\System32\\calc.exe'
os.path.basename(calcFilePath)

'calc.exe'

In [38]:
os.path.dirname(calcFilePath)

'C:\\Windows\\System32'

If you need a path’s dir name and base name together, you can just call os.path.split() to get a tuple value with these two strings, like so:

In [40]:
calcFilePath = 'C:\\Windows\\System32\\calc.exe'
os.path.split(calcFilePath)

('C:\\Windows\\System32', 'calc.exe')

Notice that you could create the same tuple by calling os.path.dirname() and os.path.basename() and placing their return values in a tuple:

In [41]:
(os.path.dirname(calcFilePath), os.path.basename(calcFilePath))

('C:\\Windows\\System32', 'calc.exe')

But os.path.split() is a nice shortcut if you need both values.

Also, note that os.path.split() does not take a file path and return a list of strings of each folder. For that, use the split() string method and split on the string in os.sep. (Note that sep is in os, not os.path.) The os.sep variable is set to the correct folder-separating slash for the computer running the program, '\\' on Windows and '/' on macOS and Linux, and splitting on it will return a list of the individual folders. This returns all the parts of the path as strings.

In [42]:
calcFilePath.split(os.sep)

['C:', 'Windows', 'System32', 'calc.exe']
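Path objects offer a similar feature through their parts attribute, which evaluates to a tuple instead of a list. This sketch again uses PureWindowsPath so the Windows-style path is interpreted the same way on any computer. Notice that, unlike the split() result, the first item keeps the root backslash:

```python
from pathlib import PureWindowsPath

calcFilePath = 'C:\\Windows\\System32\\calc.exe'
calcParts = PureWindowsPath(calcFilePath).parts

print(calcParts)       # ('C:\\', 'Windows', 'System32', 'calc.exe')
print(calcParts[-1])   # calc.exe
```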

### Finding File Sizes and Folder Contents

Once you have ways of handling file paths, you can then start gathering information about specific files and folders. The os.path module provides functions for finding the size of a file in bytes and the files and folders inside a given folder.

    Calling os.path.getsize(path) will return the size in bytes of the file in the path argument.
    Calling os.listdir(path) will return a list of filename strings for each file in the path argument. (Note that this function is in the os module, not os.path.)


In [43]:
# Return a file size in bytes for the calc file
os.path.getsize('C:\\Windows\\System32\\calc.exe')

27648

In [45]:
# return a list of filename strings for each file in the following path
os.listdir('C:\\Windows\\System32')

['%userprofile%',
 '0409',
 '69fe178f-26e7-43a9-aa7d-2b616b672dde_eventlogservice.dll',
 '6bea57fb-8dfb-4177-9ae8-42e8b3529933_RuntimeDeviceInstall.dll',
 '@AdvancedKeySettingsNotification.png',
 '@AppHelpToast.png',
 '@AudioToastIcon.png',
 '@BackgroundAccessToastIcon.png',
 '@bitlockertoastimage.png',
 '@edptoastimage.png',
 '@EnrollmentToastIcon.png',
 '@language_notification_icon.png',
 '@optionalfeatures.png',
 '@StorageSenseToastIcon.png',
 '@VpnToastIcon.png',
 '@windows-hello-V4.1.gif',
 '@WindowsHelloFaceToastIcon.png',
 '@WindowsUpdateToastIcon.contrast-black.png',
 '@WindowsUpdateToastIcon.contrast-white.png',
 '@WindowsUpdateToastIcon.png',
 '@WirelessDisplayToast.png',
 '@WLOGO_48x48.png',
 'A-Volute',
 'aadauthhelper.dll',
 'aadcloudap.dll',
 'aadjcsp.dll',
 'aadtb.dll',
 'aadWamExtension.dll',
 'AarSvc.dll',
 'AboutSettingsHandlers.dll',
 'AboveLockAppHost.dll',
 'accessibilitycpl.dll',
 'accountaccessor.dll',
 'AccountsRt.dll',
 'AcGenral.dll',
 'AcLayers.dll',
 'acledi

As you can see, the calc.exe program on my computer is 27,648 bytes in size, and I have a lot of files in C:\Windows\system32. If I want to find the total size of all the files in this directory, I can use os.path.getsize() and os.listdir() together.

In [None]:
# Calculate the total size of the files in a folder using a for loop.
# Note: os.listdir() is not recursive, so files inside subfolders are not counted.
totalSize = 0
for filename in os.listdir('C:\\Windows\\System32'):
    filePath = os.path.join('C:\\Windows\\System32', filename) # join folder name with current filename
    print(filePath)
    totalSize = totalSize + os.path.getsize(filePath)
print(totalSize)

C:\Windows\System32\%userprofile%
C:\Windows\System32\0409
C:\Windows\System32\69fe178f-26e7-43a9-aa7d-2b616b672dde_eventlogservice.dll
C:\Windows\System32\6bea57fb-8dfb-4177-9ae8-42e8b3529933_RuntimeDeviceInstall.dll
C:\Windows\System32\@AdvancedKeySettingsNotification.png
C:\Windows\System32\@AppHelpToast.png
C:\Windows\System32\@AudioToastIcon.png
C:\Windows\System32\@BackgroundAccessToastIcon.png
C:\Windows\System32\@bitlockertoastimage.png
C:\Windows\System32\@edptoastimage.png
C:\Windows\System32\@EnrollmentToastIcon.png
C:\Windows\System32\@language_notification_icon.png
C:\Windows\System32\@optionalfeatures.png
C:\Windows\System32\@StorageSenseToastIcon.png
C:\Windows\System32\@VpnToastIcon.png
C:\Windows\System32\@windows-hello-V4.1.gif
C:\Windows\System32\@WindowsHelloFaceToastIcon.png
C:\Windows\System32\@WindowsUpdateToastIcon.contrast-black.png
C:\Windows\System32\@WindowsUpdateToastIcon.contrast-white.png
C:\Windows\System32\@WindowsUpdateToastIcon.png
C:\Windows\System32

As I loop over each filename in the C:\Windows\System32 folder, the totalSize variable is incremented by the size of each file. Notice how when I call os.path.getsize(), I use os.path.join() to join the folder name with the current filename. The integer that os.path.getsize() returns is added to the value of totalSize. After looping through all the files, I print totalSize to see the total size of the C:\Windows\System32 folder.
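The same calculation can be written with pathlib. The following is a sketch rather than a replacement for the code above: folderSize() is a hypothetical helper name, it only counts the files directly inside the folder, and it runs against a temporary folder with files of known sizes so the result is predictable:

```python
import tempfile
from pathlib import Path

def folderSize(folder):
    """Return the combined size in bytes of the files directly inside folder."""
    return sum(item.stat().st_size
               for item in Path(folder).iterdir()
               if item.is_file())   # skip subfolders

with tempfile.TemporaryDirectory() as tempFolder:
    (Path(tempFolder) / 'a.bin').write_bytes(b'x' * 100)
    (Path(tempFolder) / 'b.bin').write_bytes(b'y' * 250)
    (Path(tempFolder) / 'sub').mkdir()
    (Path(tempFolder) / 'sub' / 'c.bin').write_bytes(b'z' * 999)   # not counted
    totalSize = folderSize(tempFolder)

print(totalSize)   # 350
```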

### Modifying a List of Files Using Glob Patterns

If you want to work on specific files, the glob() method is simpler to use than listdir(). Path objects have a glob() method for listing the contents of a folder according to a glob pattern. Glob patterns are like a simplified form of regular expressions often used in command line commands. The glob() method returns a generator object (which is beyond the scope of this book) that you’ll need to pass to list() to easily view in the interactive shell:

In [53]:
p = Path('C:/Users/Zac/OneDrive/Desktop')
p.glob('*')
list(p.glob('*')) # Make a list from the generator.

[WindowsPath('C:/Users/Zac/OneDrive/Desktop/3DMark Demo.url'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Anno 1800 Benchmark (DX11).url'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Anno 1800 Benchmark (DX12).url'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Anno 1800.url'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/desktop.ini'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Gskill Microcenter receipt.pdf'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Long to Wide data practice.xlsx'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Microcenter receipt.pdf'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Microcenter receipt.PNG'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Old Firefox Data'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Oral B receipt.jpg'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Outlaws.url'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Removed Apps.html'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/Saved highlights'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/

The asterisk (*) stands for “multiple of any characters,” so p.glob('*') returns a generator of all files in the path stored in p.

Like with regexes, you can create complex expressions:

In [60]:
p = Path('C:/Users/Zac/OneDrive/Desktop')
list(p.glob('*.txt')) # Lists all text files

[WindowsPath('C:/Users/Zac/OneDrive/Desktop/test.txt'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/VBA long to wide.txt')]

The glob pattern '*.txt' will return files that start with any combination of characters as long as it ends with the string '.txt', which is the text file extension.

In contrast with the asterisk, the question mark (?) stands for any single character:

In [70]:
list(p.glob('tes?.txt')) # the ? represents any single character

[WindowsPath('C:/Users/Zac/OneDrive/Desktop/test.txt')]

The glob expression 'project?.docx' will return 'project1.docx' or 'project5.docx', but it will not return 'project10.docx', because ? only matches to one character—so it will not match to the two-character string '10'.
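You can check this yourself with a few empty files. The sketch below creates the three documents in a temporary folder (so it won't clutter your own folders) and compares the ? pattern with the * pattern:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tempFolder:
    folder = Path(tempFolder)
    for name in ['project1.docx', 'project5.docx', 'project10.docx']:
        (folder / name).write_text('')   # create an empty file

    # ? matches exactly one character, so project10.docx is left out.
    questionMatches = sorted(match.name for match in folder.glob('project?.docx'))
    # * matches any number of characters, so all three files are found.
    starMatches = sorted(match.name for match in folder.glob('project*.docx'))

print(questionMatches)    # ['project1.docx', 'project5.docx']
print(len(starMatches))   # 3
```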

Finally, you can also combine the asterisk and question mark to create even more complex glob expressions, like this:

In [73]:
list(p.glob('*.?x?'))

[WindowsPath('C:/Users/Zac/OneDrive/Desktop/test.txt'),
 WindowsPath('C:/Users/Zac/OneDrive/Desktop/VBA long to wide.txt')]

The glob expression '*.?x?' will return files with any name and any three-character extension where the middle character is an 'x'.

By picking out files with specific attributes, the glob() method lets you easily specify the files in a directory you want to perform some operation on. You can use a for loop to iterate over the generator that glob() returns:

In [74]:
p = Path('C:/Users/Zac/OneDrive/Desktop')
for textFilePathObj in p.glob('*.txt'):
    print(textFilePathObj) # Prints the Path object as a string.
    # Do something with the text file.

C:\Users\Zac\OneDrive\Desktop\test.txt
C:\Users\Zac\OneDrive\Desktop\VBA long to wide.txt


If you want to perform some operation on every file in a directory, you can use either os.listdir(p) or p.glob('*').

### Checking Path Validity

Many Python functions will crash with an error if you supply them with a path that does not exist. Luckily, Path objects have methods to check whether a given path exists and whether it is a file or folder. Assuming that a variable p holds a Path object, you could expect the following:

    Calling p.exists() returns True if the path exists or returns False if it doesn’t exist.
    Calling p.is_file() returns True if the path exists and is a file, or returns False otherwise.
    Calling p.is_dir() returns True if the path exists and is a directory, or returns False otherwise.


In [75]:
winDir = Path('C:/Windows')
notExistsDir = Path('C:/This/Folder/Does/Not/Exist')
calcFile = Path('C:/Windows/System32/calc.exe')

print(winDir.exists())
print(winDir.is_dir())
print(notExistsDir.exists())
print(calcFile.is_file())
print(calcFile.is_dir())

True
True
False
True
False


You can determine whether there is a DVD or flash drive currently attached to the computer by checking for it with the exists() method. For instance, if I wanted to check for a flash drive with the volume named D:\ on my Windows computer, I could do that with the following:

In [76]:
dDrive = Path('D:/') # this is my secondary hard drive
dDrive.exists()

True

The older os.path module can accomplish the same task with the os.path.exists(path), os.path.isfile(path), and os.path.isdir(path) functions, which act just like their Path method counterparts. As of Python 3.6, these functions can accept Path objects as well as strings of the file paths.

## The File Reading and Writing Process
Once you are comfortable working with folders and relative paths, you’ll be able to specify the location of files to read and write. The functions covered in the next few sections will apply to plaintext files. Plaintext files contain only basic text characters and do not include font, size, or color information. Text files with the .txt extension or Python script files with the .py extension are examples of plaintext files. These can be opened with Windows’s Notepad or macOS’s TextEdit application. Your programs can easily read the contents of plaintext files and treat them as an ordinary string value.

Binary files are all other file types, such as word processing documents, PDFs, images, spreadsheets, and executable programs. If you open a binary file in Notepad or TextEdit, it will look like scrambled nonsense, like in Figure 9-6.

Since every different type of binary file must be handled in its own way, this book will not go into reading and writing raw binary files directly. Fortunately, many modules make working with binary files easier—you will explore one of them, the shelve module, later in this chapter.

The pathlib module’s read_text() method returns a string of the full contents of a text file. Its write_text() method creates a new text file (or overwrites an existing one) with the string passed to it. Enter the following into the interactive shell:

In [79]:
from pathlib import Path
p = Path('Files/spam.txt') # new file in the Files folder of this working directory
p.write_text('Hello World')

11

In [80]:
p.read_text() # read the text file at the path

'Hello World'

These method calls create a spam.txt file with the content 'Hello World'. The 11 that write_text() returns indicates that 11 characters were written to the file. (You can often disregard this information.) The read_text() call reads and returns the contents of our new file as a string: 'Hello World'.

Keep in mind that these Path object methods only provide basic interactions with files. The more common way of writing to a file involves using the open() function and file objects. There are three steps to reading or writing files in Python:


    Call the open() function to return a File object.
    Call the read() or write() method on the File object.
    Close the file by calling the close() method on the File object.

We’ll go over these steps in the following sections.
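As a preview, the three steps can be sketched like this. The example first creates its own hello.txt in a temporary folder so that it runs on any computer. The second half uses Python's with statement, which isn't one of the three steps but closes the file for you when the block ends:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tempFolder:
    helloPath = Path(tempFolder) / 'hello.txt'
    helloPath.write_text('Hello, world!')   # set up a file to read

    helloFile = open(helloPath)      # Step 1: open() returns a File object.
    helloContent = helloFile.read()  # Step 2: read the contents.
    helloFile.close()                # Step 3: close the file.

    # The with statement calls close() automatically.
    with open(helloPath) as helloFile:
        sameContent = helloFile.read()
    closedAfterWith = helloFile.closed

print(helloContent)      # Hello, world!
print(closedAfterWith)   # True
```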

### Opening Files with the open() Function

To open a file with the open() function, you pass it a string path indicating the file you want to open; it can be either an absolute or relative path. The open() function returns a File object.

Try it by creating a text file named hello.txt using Notepad or TextEdit. Type Hello, world! as the content of this text file and save it in your user home folder. Then enter the following into the interactive shell:

In [83]:
helloFile = open(r'C:\Users\Zac\OneDrive\Python\Practice\ATBS\Files\hello.txt')

Make sure to replace the path in this example with the location of hello.txt on your own computer. For example, my username is Al, so if I had saved the file in my home folder, I’d enter 'C:\\Users\\Al\\hello.txt' on Windows. Note that the open() function only accepts Path objects as of Python 3.6. In previous versions, you always need to pass a string to open().

This command will open the file in “reading plaintext” mode, or read mode for short. When a file is opened in read mode, Python lets you only read data from the file; you can’t write or modify it in any way. Read mode is the default mode for files you open in Python. But if you don’t want to rely on Python’s defaults, you can explicitly specify the mode by passing the string value 'r' as a second argument to open(). So open('/Users/Al/hello.txt', 'r') and open('/Users/Al/hello.txt') do the same thing.
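Here is a sketch of what happens if you try to write to a file that was opened in read mode. Python raises an io.UnsupportedOperation error and leaves the file untouched. The file is created in a temporary folder, and the variable names are only for illustration:

```python
import io
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tempFolder:
    helloPath = Path(tempFolder) / 'hello.txt'
    helloPath.write_text('Hello, world!')

    helloFile = open(helloPath, 'r')   # same as open(helloPath)
    modeUsed = helloFile.mode
    try:
        helloFile.write('Goodbye')     # not allowed in read mode
    except io.UnsupportedOperation:
        writeRefused = True
    else:
        writeRefused = False
    helloFile.close()

    unchangedContent = helloPath.read_text()

print(modeUsed)           # r
print(writeRefused)       # True
print(unchangedContent)   # Hello, world!
```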

The call to open() returns a File object. A File object represents a file on your computer; it is simply another type of value in Python, much like the lists and dictionaries you’re already familiar with. In the previous example, you stored the File object in the variable helloFile. Now, whenever you want to read from or write to the file, you can do so by calling methods on the File object in helloFile.

### Reading the Contents of Files

Now that you have a File object, you can start reading from it. If you want to read the entire contents of a file as a string value, use the File object’s read() method. Let’s continue with the hello.txt File object you stored in helloFile. Enter the following into the interactive shell:

In [85]:
helloContent = helloFile.read()
helloContent

'Hello, world!'

If you think of the contents of a file as a single large string value, the read() method returns the string that is stored in the file.

Alternatively, you can use the readlines() method to get a list of string values from the file, one string for each line of text. For example, create a file named sonnet29.txt in the same directory as hello.txt and write the following text in it:

When, in disgrace with fortune and men's eyes,
I all alone beweep my outcast state,
And trouble deaf heaven with my bootless cries,
And look upon myself and curse my fate,

Make sure to separate the four lines with line breaks. Then enter the following into the interactive shell:

In [88]:
sonnetFile = open(r'C:\Users\Zac\OneDrive\Python\Practice\ATBS\Files\sonnet29.txt')
sonnetFile.readlines()

["When, in disgrace with fortune and men's eyes,\n",
 'I all alone beweep my outcast state,\n',
 'And trouble deaf heaven with my bootless cries,\n',
 'And look upon myself and curse my fate,']

Note that, except for the last line of the file, each of the string values ends with a newline character \n. A list of strings is often easier to work with than a single large string value.
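If the trailing newlines get in the way, one possible approach is to strip them after calling readlines(). The sketch below is a minimal example that writes two lines of the sonnet to a temporary folder (an assumption made so the example runs anywhere) and then removes the newline from each string with rstrip('\n').

```python
import os
import tempfile

# Hypothetical scratch folder; in the chapter the file lives next to hello.txt.
tempFolder = tempfile.mkdtemp()
sonnetPath = os.path.join(tempFolder, 'sonnet29.txt')

sonnetFile = open(sonnetPath, 'w')
sonnetFile.write("When, in disgrace with fortune and men's eyes,\n"
                 'I all alone beweep my outcast state,\n')
sonnetFile.close()

sonnetFile = open(sonnetPath)
lines = sonnetFile.readlines()  # one string per line, newlines included
sonnetFile.close()

# rstrip('\n') removes the trailing newline, if there is one, from each line.
cleanLines = [line.rstrip('\n') for line in lines]
print(cleanLines)
```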

### Writing to Files

Python allows you to write content to a file in a way similar to how the print() function “writes” strings to the screen. You can’t write to a file you’ve opened in read mode, though. Instead, you need to open it in “write plaintext” mode or “append plaintext” mode, or write mode and append mode for short.

Write mode will overwrite the existing file and start from scratch, just like when you overwrite a variable’s value with a new value. Pass 'w' as the second argument to open() to open the file in write mode. Append mode, on the other hand, will append text to the end of the existing file. You can think of this as appending to a list in a variable, rather than overwriting the variable altogether. Pass 'a' as the second argument to open() to open the file in append mode.

If the filename passed to open() does not exist, both write and append mode will create a new, blank file. After reading or writing a file, call the close() method before opening the file again.

Let’s put these concepts together. Enter the following into the interactive shell:

In [90]:
baconFile = open('Files/bacon.txt', 'w') # open the file in write mode (will create a new file if it does not exist) 
baconFile.write('Hello, world!\n')
baconFile.close()

In [94]:
baconFile = open('Files/bacon.txt', 'a') # append to the existing file
baconFile.write('Bacon is not a vegetable.')
baconFile.close()

In [96]:
# read the bacon file
baconFile = open('Files/bacon.txt')
content = baconFile.read()
baconFile.close()
print(content)

Hello, world!
Bacon is not a vegetable.


First, we open bacon.txt in write mode. Since there isn’t a bacon.txt yet, Python creates one. Calling write() on the opened file and passing write() the string argument 'Hello, world!\n' writes the string to the file and returns the number of characters written, including the newline. Then we close the file.

To add text to the existing contents of the file instead of replacing the string we just wrote, we open the file in append mode. We write 'Bacon is not a vegetable.' to the file and close it. Finally, to print the file contents to the screen, we open the file in its default read mode, call read(), store the resulting string in content, close the file, and print content.

Note that the write() method does not automatically add a newline character to the end of the string like the print() function does. You will have to add this character yourself.

As of Python 3.6, you can also pass a Path object to the open() function instead of a string for the filename.
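The write, append, and read steps can be sketched in one place using a Path object for the filename. This is only an illustration: the temporary folder is an assumption so that the example does not overwrite any existing bacon.txt, and the variable charsWritten is a name introduced here to capture the return value of write().

```python
import tempfile
from pathlib import Path

# Hypothetical scratch folder; Path objects work with open() on Python 3.6+.
baconPath = Path(tempfile.mkdtemp()) / 'bacon.txt'

baconFile = open(baconPath, 'w')                 # write mode creates the file
charsWritten = baconFile.write('Hello, world!\n')  # returns the character count
baconFile.close()

baconFile = open(baconPath, 'a')                 # append mode keeps existing text
baconFile.write('Bacon is not a vegetable.')     # no newline is added for you
baconFile.close()

baconFile = open(baconPath)                      # default read mode
content = baconFile.read()
baconFile.close()
print(content)
```

Because write() adds nothing on its own, the only line break in the file is the '\n' that was written explicitly at the end of the first string.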

## Saving Variables with the shelve Module

You can save variables in your Python programs to binary shelf files using the shelve module. This way, your program can restore data to variables from the hard drive. The shelve module will let you add Save and Open features to your program. For example, if you ran a program and entered some configuration settings, you could save those settings to a shelf file and then have the program load them the next time it is run.

Enter the following into the interactive shell:

In [99]:
import shelve
shelfFile = shelve.open('Files/mydata')
cats = ['Zophie', 'Pooka', 'Simon']
shelfFile['cats'] = cats # Store the cats list in shelfFile as a value associated with the key 'cats' (like in a dictionary)
shelfFile.close()

To read and write data using the shelve module, you first import shelve. Call shelve.open() and pass it a filename, and then store the returned shelf value in a variable. You can make changes to the shelf value as if it were a dictionary. When you’re done, call close() on the shelf value. Here, our shelf value is stored in shelfFile. We create a list cats and write shelfFile['cats'] = cats to store the list in shelfFile as a value associated with the key 'cats' (like in a dictionary). Then we call close() on shelfFile. Note that as of Python 3.7, you have to pass filenames to shelve.open() as strings; you can’t pass it a Path object.

After running the previous code on Windows, you will see three new files in the current working directory: mydata.bak, mydata.dat, and mydata.dir. On macOS, only a single mydata.db file will be created.

These binary files contain the data you stored in your shelf. The format of these binary files is not important; you only need to know what the shelve module does, not how it does it. The module frees you from worrying about how to store your program’s data to a file.

Your programs can use the shelve module to later reopen and retrieve the data from these shelf files. Shelf values don’t have to be opened in read or write mode—they can do both once opened. Enter the following into the interactive shell:

In [100]:
shelfFile = shelve.open('Files/mydata')
type(shelfFile)

shelve.DbfilenameShelf

In [101]:
shelfFile['cats'] # look up the stored value by its key, just like with a dictionary

['Zophie', 'Pooka', 'Simon']

In [102]:
shelfFile.close()

Here, we open the shelf files to check that our data was stored correctly. Entering shelfFile['cats'] returns the same list that we stored earlier, so we know that the list is correctly stored, and we call close().

Just like dictionaries, shelf values have keys() and values() methods that will return list-like values of the keys and values in the shelf. Since these methods return list-like values instead of true lists, you should pass them to the list() function to get them in list form. Enter the following into the interactive shell:

In [103]:
shelfFile = shelve.open('Files/mydata')
list(shelfFile.keys()) # will return the cats key

['cats']

In [104]:
list(shelfFile.values())

[['Zophie', 'Pooka', 'Simon']]

In [105]:
shelfFile.close()

Plaintext is useful for creating files that you’ll read in a text editor such as Notepad or TextEdit, but if you want to save data from your Python programs, use the shelve module.
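To connect this with the configuration settings example mentioned earlier, here is a minimal sketch of saving two values to a shelf and loading them again. The shelf name 'settings', the 'volume' key, and the temporary folder are all hypothetical and chosen only for illustration.

```python
import os
import shelve
import tempfile

# Hypothetical shelf location, passed to shelve.open() as a string.
shelfPath = os.path.join(tempfile.mkdtemp(), 'settings')

# Save two values, as a program might do before it exits.
shelfFile = shelve.open(shelfPath)
shelfFile['cats'] = ['Zophie', 'Pooka', 'Simon']
shelfFile['volume'] = 7
shelfFile.close()

# Reopen the shelf later and restore the values.
shelfFile = shelve.open(shelfPath)
savedKeys = sorted(shelfFile.keys())  # sorted() also turns the keys into a list
savedCats = shelfFile['cats']
hasVolume = 'volume' in shelfFile
shelfFile.close()

print(savedKeys, savedCats, hasVolume)
```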

## Saving Variables with the pprint.pformat() Function

Recall from “Pretty Printing” on page 118 that the pprint.pprint() function will “pretty print” the contents of a list or dictionary to the screen, while the pprint.pformat() function will return this same text as a string instead of printing it. Not only is this string formatted to be easy to read, but it is also syntactically correct Python code. Say you have a dictionary stored in a variable and you want to save this variable and its contents for future use. Using pprint.pformat() will give you a string that you can write to a .py file. This file will be your very own module that you can import whenever you want to use the variable stored in it.

For example, enter the following into the interactive shell:

In [106]:
import pprint
cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}] # list of dictionaries stored in variable cats
pprint.pformat(cats)

"[{'desc': 'chubby', 'name': 'Zophie'}, {'desc': 'fluffy', 'name': 'Pooka'}]"

In [108]:
fileObj = open('Files/myCats.py', 'w') # write file myCats
fileObj.write('cats = ' + pprint.pformat(cats) + '\n') # write the list of dictionaries to the variable cats and save as a python file myCats for future use
fileObj.close()

Here, we import pprint to let us use pprint.pformat(). We have a list of dictionaries, stored in a variable cats. To keep the list in cats available even after we close the shell, we use pprint.pformat() to return it as a string. Once we have the data in cats as a string, it’s easy to write the string to a file, which we’ll call myCats.py.

The modules that an import statement imports are themselves just Python scripts. When the string from pprint.pformat() is saved to a .py file, the file is a module that can be imported just like any other.

And since Python scripts are themselves just text files with the .py file extension, your Python programs can even generate other Python programs. You can then import these files into scripts.

In [124]:
import os
os.chdir(os.path.join(os.getcwd(), 'Files')) # switch to the Files subfolder so that myCats.py can be imported
import myCats
myCats.cats # Prints the list of dictionaries

[{'desc': 'chubby', 'name': 'Zophie'}, {'desc': 'fluffy', 'name': 'Pooka'}]

In [118]:
myCats.cats[0]

{'desc': 'chubby', 'name': 'Zophie'}

In [119]:
myCats.cats[0]['name']

'Zophie'

The benefit of creating a .py file (as opposed to saving variables with the shelve module) is that because it is a text file, the contents of the file can be read and modified by anyone with a simple text editor. For most applications, however, saving data using the shelve module is the preferred way to save variables to a file. Only basic data types such as integers, floats, strings, lists, and dictionaries can be written to a file as simple text. File objects, for example, cannot be encoded as text.
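The difference between values that can and cannot be saved as text can be checked with a short experiment. The sketch below uses ast.literal_eval() from the standard library, which is an assumption of this example rather than something the chapter uses, to turn the pformat() string back into a value. It then tries the same thing with a File object, whose text form is only a description and not valid Python. The scratch file is hypothetical.

```python
import ast
import os
import pprint
import tempfile

cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]

# The pformat() string is valid Python, so it can be turned back into a value.
catsText = pprint.pformat(cats)
restoredCats = ast.literal_eval(catsText)
print(restoredCats == cats)

# A File object has no such text form; its string is only a description.
scratchPath = os.path.join(tempfile.mkdtemp(), 'scratch.txt')  # hypothetical file
scratchFile = open(scratchPath, 'w')
fileText = pprint.pformat(scratchFile)
scratchFile.close()

try:
    ast.literal_eval(fileText)
    fileRoundTrips = True
except (SyntaxError, ValueError):
    fileRoundTrips = False
print(fileRoundTrips)
```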

## Project: Generating Random Quiz Files

Say you’re a geography teacher with 35 students in your class and you want to give a pop quiz on US state capitals. Alas, your class has a few bad eggs in it, and you can’t trust the students not to cheat. You’d like to randomize the order of questions so that each quiz is unique, making it impossible for anyone to crib answers from anyone else. Of course, doing this by hand would be a lengthy and boring affair. Fortunately, you know some Python.

Here is what the program does:

    Creates 35 different quizzes
    Creates 50 multiple-choice questions for each quiz, in random order
    Provides the correct answer and three random wrong answers for each question, in random order
    Writes the quizzes to 35 text files
    Writes the answer keys to 35 text files

This means the code will need to do the following:

    Store the states and their capitals in a dictionary
    Call open(), write(), and close() for the quiz and answer key text files
    Use random.shuffle() to randomize the order of the questions and multiple-choice options


### Step 1: Store the Quiz Data in a Dictionary

The first step is to create a skeleton script and fill it with your quiz data. Create a file named randomQuizGenerator.py, and make it look like the following:

In [None]:
# Change the directory to a new folder to contain the quiz files:
import os
os.getcwd() # get the current working directory
# os.chdir(os.path.join(os.getcwd(), 'Files\\QuizGeneratorFiles'))


'c:\\Users\\Zac\\OneDrive\\Python\\Practice\\ATBS\\Files\\QuizGeneratorFiles'

In [None]:
#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in random order, along with the answer key.

import random

# The quiz data. Keys are states and values are their capitals
capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee',
'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise', 'Illinois':
'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas':
'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine':
'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston', 'Michigan':
'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson', 'Missouri':
'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada':
'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton', 'New Mexico': 'Santa Fe',
'New York': 'Albany','North Carolina': 'Raleigh', 'North Dakota': 'Bismarck', 
'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City', 'Oregon': 'Salem',
'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence',
'South Carolina': 'Columbia', 'South Dakota': 'Pierre', 'Tennessee':
'Nashville', 'Texas': 'Austin', 'Utah': 'Salt Lake City', 'Vermont':
'Montpelier', 'Virginia': 'Richmond', 'Washington': 'Olympia', 
'West Virginia': 'Charleston', 'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'}

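# Generate 35 quiz files.
for quizNum in range(35):
    # TODO: Create the quiz and answer key files.

    # TODO: Write out the header for the quiz.

    # TODO: Shuffle the order of the states.

    # TODO: Loop through all 50 states, making a question for each.
    pass # placeholder so that the empty loop body is valid Python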


Since this program will be randomly ordering the questions and answers, you’ll need to import the random module ➊ to make use of its functions. The capitals variable ➋ contains a dictionary with US states as keys and their capitals as values. And since you want to create 35 quizzes, the code that actually generates the quiz and answer key files (marked with TODO comments for now) will go inside a for loop that loops 35 times ➌. (This number can be changed to generate any number of quiz files.)

### Step 2: Create the Quiz File and Shuffle the Question Order

Now it’s time to start filling in those TODOs.

The code in the loop will be repeated 35 times—once for each quiz—so you have to worry about only one quiz at a time within the loop. First you’ll create the actual quiz file. It needs to have a unique filename and should also have some kind of standard header in it, with places for the student to fill in a name, date, and class period. Then you’ll need to get a list of states in randomized order, which can be used later to create the questions and answers for the quiz.

Add the following lines of code to randomQuizGenerator.py:

In [None]:
#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in random order, along with the answer key.


import random

# The quiz data. Keys are states and values are their capitals
capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee',
'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise', 'Illinois':
'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas':
'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine':
'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston', 'Michigan':
'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson', 'Missouri':
'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada':
'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton', 'New Mexico': 'Santa Fe',
'New York': 'Albany','North Carolina': 'Raleigh', 'North Dakota': 'Bismarck', 
'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City', 'Oregon': 'Salem',
'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence',
'South Carolina': 'Columbia', 'South Dakota': 'Pierre', 'Tennessee':
'Nashville', 'Texas': 'Austin', 'Utah': 'Salt Lake City', 'Vermont':
'Montpelier', 'Virginia': 'Richmond', 'Washington': 'Olympia', 
'West Virginia': 'Charleston', 'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'}

# Generate 35 quiz files.
for quizNum in range(35):
    # Create the quiz and answer key files.
    quizFile = open(f'capitalsquiz{quizNum + 1}.txt', 'w') # create a new file for each quiz
    answerKeyFile = open(f'capitalsquiz_answers{quizNum + 1}.txt', 'w') # create new file for each answer key
    
    # Write out the header for the quiz.
    quizFile.write('Name:\n\nDate:\n\nPeriod:\n\n')
    quizFile.write((' ' * 20) + f'State Capitals Quiz (Form: {quizNum + 1})')
    quizFile.write('\n\n')

    # Shuffle the order of the states.
    states = list(capitals.keys())
    random.shuffle(states)

    # TODO: Loop through all 50 states, making a question for each.




The filenames for the quizzes will be capitalsquiz<N>.txt, where <N> is a unique number for the quiz that comes from quizNum, the for loop’s counter. The answer key for capitalsquiz<N>.txt will be stored in a text file named capitalsquiz_answers<N>.txt. Each time through the loop, the {quizNum + 1} placeholder in f'capitalsquiz{quizNum + 1}.txt' and f'capitalsquiz_answers{quizNum + 1}.txt' will be replaced by the unique number, so the first quiz and answer key created will be capitalsquiz1.txt and capitalsquiz_answers1.txt. These files will be created with calls to the open() function at ➊ and ➋, with 'w' as the second argument to open them in write mode.

The write() statements at ➌ create a quiz header for the student to fill out. Finally, a randomized list of US states is created with the help of the random.shuffle() function ➍, which randomly reorders the values in any list that is passed to it.
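Two details from this step can be tried in isolation: how the f-string builds the numbered filenames, and how random.shuffle() reorders a list in place. The sketch below uses a three-state dictionary as a stand-in for the full capitals dictionary; the variable names filenames and result are introduced only for this example.

```python
import random

# A small stand-in for the full capitals dictionary.
capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix'}

# quizNum starts at 0, so quizNum + 1 gives filenames numbered from 1.
filenames = [f'capitalsquiz{quizNum + 1}.txt' for quizNum in range(3)]
print(filenames)

# random.shuffle() changes the list itself and returns None.
states = list(capitals.keys())
result = random.shuffle(states)
print(states, result)
```

Since shuffle() returns None, writing states = random.shuffle(states) would throw the list away, which is why the program calls it on its own line.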

### Step 3: Create the Answer Options

Now you need to generate the answer options for each question, which will be multiple choice from A to D. You’ll need to create another for loop—this one to generate the content for each of the 50 questions on the quiz. Then there will be a third for loop nested inside to generate the multiple-choice options for each question. Make your code look like the following:

In [None]:
#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in random order, along with the answer key.


import random

# The quiz data. Keys are states and values are their capitals
capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee',
'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise', 'Illinois':
'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas':
'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine':
'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston', 'Michigan':
'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson', 'Missouri':
'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada':
'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton', 'New Mexico': 'Santa Fe',
'New York': 'Albany','North Carolina': 'Raleigh', 'North Dakota': 'Bismarck', 
'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City', 'Oregon': 'Salem',
'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence',
'South Carolina': 'Columbia', 'South Dakota': 'Pierre', 'Tennessee':
'Nashville', 'Texas': 'Austin', 'Utah': 'Salt Lake City', 'Vermont':
'Montpelier', 'Virginia': 'Richmond', 'Washington': 'Olympia', 
'West Virginia': 'Charleston', 'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'}

# Generate 35 quiz files.
for quizNum in range(35):
    # Create the quiz and answer key files.
    quizFile = open(f'capitalsquiz{quizNum + 1}.txt', 'w') # create a new file for each quiz
    answerKeyFile = open(f'capitalsquiz_answers{quizNum + 1}.txt', 'w') # create new file for each answer key
    
    # Write out the header for the quiz.
    quizFile.write('Name:\n\nDate:\n\nPeriod:\n\n')
    quizFile.write((' ' * 20) + f'State Capitals Quiz (Form: {quizNum + 1})')
    quizFile.write('\n\n')

    # Shuffle the order of the states.
    states = list(capitals.keys())
    random.shuffle(states)

    # Loop through all 50 states, making a question for each.
    for questionNum in range(50):
        # Get right and wrong answers
        correctAnswer = capitals[states[questionNum]] # look up the capital of the current state in the shuffled list; this is the correct answer
        wrongAnswers = list(capitals.values()) # duplicate the values in capitals dictionary as wrongAnswers
        del wrongAnswers[wrongAnswers.index(correctAnswer)] # delete the correct answer from the list
        wrongAnswers = random.sample(wrongAnswers, 3) # choose 3 incorrect answers from the list at random
        answerOptions = wrongAnswers + [correctAnswer] # create a list of the 4 possible answer options
        random.shuffle(answerOptions) # Shuffle the answer options
    


The correct answer is easy to get—it’s stored as a value in the capitals dictionary ➊. This loop will loop through the states in the shuffled states list, from states[0] to states[49], find each state in capitals, and store that state’s corresponding capital in correctAnswer.

The list of possible wrong answers is trickier. You can get it by duplicating all the values in the capitals dictionary ➋, deleting the correct answer ➌, and selecting three random values from this list ➍. The random.sample() function makes it easy to do this selection. Its first argument is the list you want to select from; the second argument is the number of values you want to select. The full list of answer options is the combination of these three wrong answers with the correct answer ➎. Finally, the answers need to be randomized ➏ so that the correct response isn’t always choice D.
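The same steps can be wrapped in a small function to experiment with them outside the quiz loop. This is a sketch only: the function name makeAnswerOptions and the five-state dictionary are assumptions made for this example, not part of randomQuizGenerator.py.

```python
import random

# A small stand-in for the full capitals dictionary.
capitals = {'Colorado': 'Denver', 'Idaho': 'Boise', 'Oregon': 'Salem',
            'Texas': 'Austin', 'Utah': 'Salt Lake City'}

def makeAnswerOptions(capitals, state):
    """Return the correct capital and a shuffled list of four options."""
    correctAnswer = capitals[state]
    wrongAnswers = list(capitals.values())               # copy all capitals
    del wrongAnswers[wrongAnswers.index(correctAnswer)]  # drop the right one
    wrongAnswers = random.sample(wrongAnswers, 3)        # pick 3 distinct wrong ones
    answerOptions = wrongAnswers + [correctAnswer]
    random.shuffle(answerOptions)                        # hide the correct position
    return correctAnswer, answerOptions

correctAnswer, answerOptions = makeAnswerOptions(capitals, 'Oregon')
print(correctAnswer, answerOptions)
```

The order of the options changes from run to run, but the list always contains the correct capital plus three different wrong ones.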

### Step 4: Write Content to the Quiz and Answer Key Files

All that is left is to write the question to the quiz file and the answer to the answer key file. Make your code look like the following:

In [15]:
#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in random order, along with the answer key.

import random

# The quiz data. Keys are states and values are their capitals
capitals = {'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
'Connecticut': 'Hartford', 'Delaware': 'Dover', 'Florida': 'Tallahassee',
'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise', 'Illinois':
'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas':
'Topeka', 'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine':
'Augusta', 'Maryland': 'Annapolis', 'Massachusetts': 'Boston', 'Michigan':
'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson', 'Missouri':
'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada':
'Carson City', 'New Hampshire': 'Concord', 'New Jersey': 'Trenton', 'New Mexico': 'Santa Fe',
'New York': 'Albany','North Carolina': 'Raleigh', 'North Dakota': 'Bismarck', 
'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City', 'Oregon': 'Salem',
'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence',
'South Carolina': 'Columbia', 'South Dakota': 'Pierre', 'Tennessee':
'Nashville', 'Texas': 'Austin', 'Utah': 'Salt Lake City', 'Vermont':
'Montpelier', 'Virginia': 'Richmond', 'Washington': 'Olympia', 
'West Virginia': 'Charleston', 'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'}

# Generate 35 quiz files.
for quizNum in range(35):
    # Create the quiz and answer key files.
    quizFile = open(f'capitalsquiz{quizNum + 1}.txt', 'w') # create a new file for each quiz
    answerKeyFile = open(f'capitalsquiz_answers{quizNum + 1}.txt', 'w') # create new file for each answer key
    
    # Write out the header for the quiz.
    quizFile.write('Name:\n\nDate:\n\nPeriod:\n\n')
    quizFile.write((' ' * 20) + f'State Capitals Quiz (Form: {quizNum + 1})')
    quizFile.write('\n\n')

    # Shuffle the order of the states.
    states = list(capitals.keys())
    random.shuffle(states)

    # Loop through all 50 states, making a question for each.
    for questionNum in range(50):
        # Get right and wrong answers
        correctAnswer = capitals[states[questionNum]] # look up the capital of the current state in the shuffled list; this is the correct answer
        wrongAnswers = list(capitals.values()) # duplicate the values in capitals dictionary as wrongAnswers
        del wrongAnswers[wrongAnswers.index(correctAnswer)] # delete the correct answer from the list
        wrongAnswers = random.sample(wrongAnswers, 3) # choose 3 incorrect answers from the list at random
        answerOptions = wrongAnswers + [correctAnswer] # create a list of the 4 possible answer options
        random.shuffle(answerOptions) # Shuffle the answer options

        # Write the question and the answer options to the quiz file.
        quizFile.write(f'{questionNum + 1}. What is the capital of {states[questionNum]}?\n')
        for i in range(4):
            quizFile.write(f"    {'ABCD'[i]}. {answerOptions[i]}\n") # label each option A, B, C, or D

        quizFile.write('\n')

        # Write the answer key to a file.
        answerKeyFile.write(f"{questionNum + 1}. {'ABCD'[answerOptions.index(correctAnswer)]}\n") # find the index of the correct answer in the shuffled options and convert it to a letter
    
    # close the files
    quizFile.close()
    answerKeyFile.close()



A for loop that goes through integers 0 to 3 will write the answer options in the answerOptions list ➊. The expression 'ABCD'[i] at ➋ treats the string 'ABCD' as an array and will evaluate to 'A', 'B', 'C', and then 'D' on each respective iteration through the loop.

In the final line ➌, the expression answerOptions.index(correctAnswer) will find the integer index of the correct answer in the randomly ordered answer options, and 'ABCD'[answerOptions.index(correctAnswer)] will evaluate to the correct answer’s letter to be written to the answer key file.
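The letter lookup is easier to follow with fixed data. In the sketch below, the answerOptions list and the correct answer are hard-coded (hypothetical values chosen for this example) so that the result is predictable.

```python
# Hypothetical, already-shuffled options for one question.
answerOptions = ['Denver', 'Boise', 'Salem', 'Austin']
correctAnswer = 'Salem'

# 'ABCD'[i] picks the letter that labels option number i.
optionLines = [f"{'ABCD'[i]}. {answerOptions[i]}" for i in range(4)]
print(optionLines)

# index() finds where the correct answer ended up; that position selects the letter.
answerLetter = 'ABCD'[answerOptions.index(correctAnswer)]
print(answerLetter)
```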

## Project: Updatable Multi-Clipboard

Let’s rewrite the “multi-clipboard” program from Chapter 6 so that it uses the shelve module. The user will now be able to save new strings to load to the clipboard without having to modify the source code. We’ll name this new program mcb.pyw (since “mcb” is shorter to type than “multi-clipboard”). The .pyw extension means that Python won’t show a Terminal window when it runs this program. (See Appendix B for more details.)

The program will save each piece of clipboard text under a keyword. For example, when you run py mcb.pyw save spam, the current contents of the clipboard will be saved with the keyword spam. This text can later be loaded to the clipboard again by running py mcb.pyw spam. And if the user forgets what keywords they have, they can run py mcb.pyw list to copy a list of all keywords to the clipboard.

Here’s what the program does:

    The command line argument for the keyword is checked.
    If the argument is save, then the clipboard contents are saved to the keyword.
    If the argument is list, then all the keywords are copied to the clipboard.
    Otherwise, the text for the keyword is copied to the clipboard.

This means the code will need to do the following:

    Read the command line arguments from sys.argv.
    Read and write to the clipboard.
    Save and load to a shelf file.

If you use Windows, you can easily run this script from the Run... window by creating a batch file named mcb.bat with the following content

I will write this program as a separate Python file called "mcb.pyw" but will keep my notes here.

### Step 1: Comments and Shelf Setup

Let’s start by making a skeleton script with some comments and basic setup. It’s common practice to put general usage information in comments at the top of the file ➊. If you ever forget how to run your script, you can always look at these comments for a reminder. Then you import your modules ➋. Copying and pasting will require the pyperclip module, and reading the command line arguments will require the sys module. The shelve module will also come in handy: Whenever the user wants to save a new piece of clipboard text, you’ll save it to a shelf file. Then, when the user wants to paste the text back to their clipboard, you’ll open the shelf file and load it back into your program. The shelf file will be named with the prefix mcb ➌.

### Step 2: Save Clipboard Content with a Keyword

The program does different things depending on whether the user wants to save text to a keyword, load text into the clipboard, or list all the existing keywords. Let’s deal with that first case. 

If the first command line argument (which will always be at index 1 of the sys.argv list) is 'save' ➊, the second command line argument is the keyword for the current content of the clipboard. The keyword will be used as the key for mcbShelf, and the value will be the text currently on the clipboard ➋.

If there is only one command line argument, you will assume it is either 'list' or a keyword to load content onto the clipboard. You will implement that code later. For now, just put a TODO comment there ➌.
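Since the program itself lives in a separate file, here is a rough sketch of how the save case could look. To keep the example runnable without the pyperclip module, the clipboard text is passed in as a plain string; in the real script it would come from pyperclip.paste(). The function name saveKeyword, the argument list, and the temporary shelf location are all assumptions made for this example.

```python
import os
import shelve
import tempfile

def saveKeyword(argv, shelf, clipboardText):
    """Store clipboardText under the keyword in argv[2] if the command is 'save'."""
    if len(argv) == 3 and argv[1].lower() == 'save':
        shelf[argv[2]] = clipboardText
        return True
    # TODO: List keywords and load content (handled in the next step).
    return False

# Hypothetical shelf location; the real script would use the prefix 'mcb'
# in its own working folder.
shelfPath = os.path.join(tempfile.mkdtemp(), 'mcb')
mcbShelf = shelve.open(shelfPath)

# Simulates running: py mcb.pyw save spam
saved = saveKeyword(['mcb.pyw', 'save', 'spam'], mcbShelf, 'Hello there')
storedText = mcbShelf['spam']
mcbShelf.close()

print(saved, storedText)
```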

### Step 3: List Keywords and Load a Keyword’s Content

Finally, let’s implement the two remaining cases: the user wants to load clipboard text in from a keyword, or they want a list of all available keywords.

If there is only one command line argument, first let’s check whether it’s 'list' ➊. If so, a string representation of the list of shelf keys will be copied to the clipboard ➋. The user can paste this list into an open text editor to read it.

Otherwise, you can assume the command line argument is a keyword. If this keyword exists in the mcbShelf shelf as a key, you can load the value onto the clipboard ➌.
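The list and load cases can be sketched the same way. Here a plain dictionary stands in for the mcbShelf shelf, which is reasonable for illustration because shelf values behave like dictionaries, and the function returns the text that the real script would hand to pyperclip.copy(). The function name textForClipboard is hypothetical.

```python
def textForClipboard(argv, shelf):
    """Return the text to copy to the clipboard, or None if there is nothing to copy."""
    if len(argv) != 2:
        return None
    if argv[1].lower() == 'list':
        # A string representation of the list of keywords.
        return str(list(shelf.keys()))
    if argv[1] in shelf:
        # The text that was saved under this keyword.
        return shelf[argv[1]]
    return None

# A dictionary standing in for the shelf, with one saved keyword.
fakeShelf = {'spam': 'Hello there'}

listText = textForClipboard(['mcb.pyw', 'list'], fakeShelf)
loadedText = textForClipboard(['mcb.pyw', 'spam'], fakeShelf)
missingText = textForClipboard(['mcb.pyw', 'eggs'], fakeShelf)
print(listText, loadedText, missingText)
```

Returning None for an unknown keyword mirrors the description above, where nothing is copied unless the keyword exists in the shelf.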

And that’s it! Launching this program has different steps depending on what operating system your computer uses. See Appendix B for details.

Recall the password locker program you created in Chapter 6 that stored the passwords in a dictionary. Updating the passwords required changing the source code of the program. This isn’t ideal, because average users don’t feel comfortable changing source code to update their software. Also, every time you modify the source code to a program, you run the risk of accidentally introducing new bugs. By storing the data for a program in a different place than the code, you can make your programs easier for others to use and more resistant to bugs.