
#**Files and File Paths**#
A file has two key properties: a filename (usually written as one word) and a path. The path specifies the location of a file on the computer. For example, there is a file on my Windows laptop with the filename project.docx in the path C:\Users\Al\Documents. The part of the filename after the last period is called the file’s extension and tells you a file’s type. The filename project.docx is a Word document, and Users, Al, and Documents all refer to folders (also called directories). Folders can contain files and other folders. For example, project.docx is in the Documents folder, which is inside the Al folder, which is inside the Users folder.

#**Backslash on Windows and Forward Slash on macOS and Linux**#
On Windows, paths are written using backslashes (\) as the separator between folder names. The macOS and Linux operating systems, however, use the forward slash (/) as their path separator. If you want your programs to work on all operating systems, you will have to write your Python scripts to handle both cases.

Fortunately, this is simple to do with the Path() function in the pathlib module. If you pass it the string values of individual file and folder names in your path, Path() will return a Path object whose string form uses the correct path separators for your operating system.

In [None]:
from pathlib import Path

print(Path('spam', 'bacon', 'eggs'))
print(str(Path('spam', 'bacon', 'eggs')))

spam/bacon/eggs
spam/bacon/eggs


These Path objects (really, WindowsPath or PosixPath objects, depending on your operating system) will be passed to several of the file-related functions introduced in this chapter. For example, the following code joins names from a list of filenames to the end of a folder’s name:

In [None]:
from pathlib import Path

myFiles = ['accounts.txt', 'details.csv', 'invite.docx']
for filename in myFiles:
    print(Path(r'C:\Users\Al', filename))

#**Using the / Operator to Join Paths**#
We normally use the + operator to add two integer or floating-point numbers, such as in the expression 2 + 2, which evaluates to the integer value 4. But we can also use the + operator to concatenate two string values, like the expression 'Hello' + 'World', which evaluates to the string value 'HelloWorld'. Similarly, the / operator that we normally use for division can also combine Path objects and strings. This is helpful for modifying a Path object after you’ve already created it with the Path() function.

In [None]:
from pathlib import Path

print(Path('spam') / 'bacon' / 'eggs')
print(Path('spam') / Path('bacon/eggs'))
print((Path('spam') / Path('bacon', 'eggs')))

#Using the / operator with Path objects makes
#joining paths just as easy as string concatenation.
#It’s also safer than using string concatenation or the join() method
homeFolder = r'C:\Users\Al'
subFolder = 'spam'
print(homeFolder + '\\' + subFolder)
print('\\'.join([homeFolder, subFolder]))

spam/bacon/eggs
spam/bacon/eggs
spam/bacon/eggs
C:\Users\Al\spam
C:\Users\Al\spam
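
One detail worth checking for yourself: the / operator only joins paths when one of the first two operands is a path object. The sketch below is a minimal illustration, not part of the original notebook. It uses PurePosixPath (a pathlib class that always uses forward slashes, chosen here only so that the result is the same on every operating system) and the made-up variable names `joined`, `left`, and `error_name`.

```python
from pathlib import PurePosixPath

# Joining works because the leftmost operand is a path object.
joined = PurePosixPath('spam') / 'bacon' / 'eggs'
print(joined)  # spam/bacon/eggs

# Two plain strings cannot be joined with /, because str does not support it.
left = 'spam'
try:
    left / 'bacon'
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)  # TypeError
```

If you hit this TypeError in your own code, wrapping the leftmost value in Path() is usually enough to fix it.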


#**The Current Working Directory**#
Every program that runs on your computer has a current working directory, or cwd. Any filenames or paths that do not begin with the root folder are assumed to be under the current working directory.

You can get the current working directory as a Path object with the Path.cwd() function and change it using os.chdir().

In [None]:
from pathlib import Path
import os

Path.cwd()


PosixPath('/content')
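
The cell above only reads the current working directory. As a rough sketch of os.chdir() in action, the following code switches into a temporary folder and then switches back. The temporary folder and the names `original_cwd` and `cwd_inside` are assumptions made for this illustration, so that nothing on your computer is changed permanently.

```python
import os
import tempfile
from pathlib import Path

original_cwd = Path.cwd()          # remember where we started

with tempfile.TemporaryDirectory() as temp_dir:
    os.chdir(temp_dir)             # change the current working directory
    cwd_inside = Path.cwd()
    print(cwd_inside)              # a temporary folder; the name differs on every run
    os.chdir(original_cwd)         # go back before the temporary folder is deleted

print(Path.cwd() == original_cwd)  # True
```

Changing back before the `with` block ends matters: the temporary folder is deleted at that point, and a program should not be left sitting in a folder that no longer exists.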

#**The Home Directory**#
All users have a folder for their own files on the computer called the home directory or home folder. You can get a Path object of the home folder by calling Path.home():

In [None]:
Path.home()

PosixPath('/root')

#**Absolute vs. Relative Paths**#
There are two ways to specify a file path:

An absolute path, which always begins with the root folder
A relative path, which is relative to the program’s current working directory

There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.” Two periods (“dot-dot”) means “the parent folder.”
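
To make these ideas concrete, here is a small sketch. It assumes macOS/Linux-style paths under a made-up /home/al folder, and it uses PurePosixPath and the posixpath module (the implementation behind os.path on macOS and Linux) so that it behaves the same on any operating system.

```python
import posixpath
from pathlib import PurePosixPath

# An absolute POSIX path starts at the root folder; a relative one does not.
print(PurePosixPath('/home/al/notes.txt').is_absolute())  # True
print(PurePosixPath('notes.txt').is_absolute())           # False

# normpath() collapses "." (this folder) and ".." (the parent folder).
tidy = posixpath.normpath('/home/al/./documents/../pictures')
print(tidy)  # /home/al/pictures
```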




#**Creating New Folders Using the os.makedirs() Function**#
Your programs can create new folders (directories) with the os.makedirs() function.

In [None]:
import os

# exist_ok=True keeps this cell from failing with FileExistsError when it is run again.
os.makedirs('C:\\delicious\\walnut\\waffles', exist_ok=True)

#To make a directory from a Path object, call the mkdir() method.
#For example, this code will create a spam folder under the home folder on my computer:
from pathlib import Path

Path(r'C:\Users\Al\spam').mkdir(exist_ok=True)

#Note that, by default, mkdir() makes only one directory at a time;
#unlike os.makedirs(), it won’t create missing parent folders unless you pass parents=True.
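
If you would rather stay with Path objects, mkdir() accepts two optional arguments that cover the same ground as os.makedirs(). The sketch below is only an illustration; it works inside a temporary folder (an assumption made so that nothing is left behind on your disk) and reuses the delicious/walnut/waffles names from above.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    target = Path(temp_dir) / 'delicious' / 'walnut' / 'waffles'

    # parents=True creates the missing intermediate folders;
    # exist_ok=True makes a repeated call harmless.
    target.mkdir(parents=True, exist_ok=True)
    target.mkdir(parents=True, exist_ok=True)

    created = target.is_dir()
    print(created)  # True
```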

#**Handling Absolute and Relative Paths**#
The pathlib module provides methods for checking whether a given path is an absolute path and returning the absolute path of a relative path.

Calling the is_absolute() method on a Path object will return True if it represents an absolute path or False if it represents a relative path.

In [None]:
print(Path.cwd())
print(Path.cwd().is_absolute())
print(Path('spam/bacon/eggs').is_absolute())

/content
True
False


To get an absolute path from a relative path, you can put Path.cwd() / in front of the relative Path object. After all, when we say “relative path,” we almost always mean a path that is relative to the current working directory.

In [None]:
print(Path('my/relative/path'))
print(Path.cwd() / Path('my/relative/path'))

my/relative/path
/content/my/relative/path


If your relative path is relative to another path besides the current working directory, just replace Path.cwd() with that other path instead. The following example gets an absolute path using the home directory instead of the current working directory:

In [None]:
print(Path('my/relative/path'))
print(Path.home() / Path('my/relative/path'))

my/relative/path
/root/my/relative/path


The os.path module also has some useful functions related to absolute and relative paths:

Calling os.path.abspath(path) will return a string of the absolute path of the argument. This is an easy way to convert a relative path into an absolute one.
Calling os.path.isabs(path) will return True if the argument is an absolute path and False if it is a relative path.
Calling os.path.relpath(path, start) will return a string of a relative path from the start path to path. If start is not provided, the current working directory is used as the start path.

In [None]:
print(os.path.abspath('.'))
print(os.path.abspath('.\\Scripts'))
print(os.path.isabs('.'))
print(os.path.isabs(os.path.abspath('.')))

/content
/content/.\Scripts
False
True
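
The cell above does not exercise relpath(), so here is a brief sketch of it. It assumes made-up paths under /home/al and calls posixpath, the module that os.path refers to on macOS and Linux, so that the results do not depend on the machine running the code.

```python
import posixpath
from pathlib import PurePosixPath

# Relative path from a start folder down to a target file.
rel = posixpath.relpath('/home/al/docs/report.txt', '/home/al')
print(rel)  # docs/report.txt

# Walking "up" out of the start folder produces ".." components.
up = posixpath.relpath('/home/al', '/home/al/docs/drafts')
print(up)  # ../..

# The pathlib counterpart for the first case.
rel_path = PurePosixPath('/home/al/docs/report.txt').relative_to('/home/al')
print(rel_path)  # docs/report.txt
```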


#**Getting the Parts of a File Path**#
Given a Path object, you can extract the file path’s different parts as strings using several Path object attributes. These can be useful for constructing new file paths based on existing ones.

The parts of a file path include the following:

The anchor, which is the root folder of the filesystem
On Windows, the drive, which is the single letter that often denotes a physical hard drive or other storage device
The parent, which is the folder that contains the file
The name of the file, made up of the stem (or base name) and the suffix (or extension)

Note that only Windows paths have a drive; on macOS and Linux, the drive attribute is always an empty string. The drive attribute doesn’t include the first backslash.


In [None]:
p = Path('C:/Users/Al/spam.txt')

print(p.anchor)
print(p.parent)
print(p.name)
print(p.stem)
print(p.suffix)
print(p.drive)


C:/Users/Al
spam.txt
spam
.txt



These attributes evaluate to simple string values, except for parent, which evaluates to another Path object.
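
The output above came from a Linux machine, where the anchor and drive of this path are empty strings. To see the Windows values on any operating system, one option is PureWindowsPath, a pathlib class that applies Windows path rules without touching the filesystem. The variable name `win_path` is made up for this sketch.

```python
from pathlib import PureWindowsPath

win_path = PureWindowsPath('C:/Users/Al/spam.txt')

print(win_path.anchor)   # prints C:\ (the drive plus the root backslash)
print(win_path.drive)    # C:
print(win_path.parent)   # C:\Users\Al
print(win_path.name)     # spam.txt
print(win_path.stem)     # spam
print(win_path.suffix)   # .txt
```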

The parents attribute (which is different from the parent attribute) evaluates to the ancestor folders of a Path object with an integer index:

In [None]:
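print(Path.cwd())
print(Path.cwd().parents[0])
# An index past the last ancestor (such as parents[2] when the cwd is /content)
# raises IndexError, so list every ancestor instead.
print(list(Path.cwd().parents))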

The older os.path module also has similar functions for getting the different parts of a path written in a string value. Calling os.path.dirname(path) will return a string of everything that comes before the last slash in the path argument. Calling os.path.basename(path) will return a string of everything that comes after the last slash in the path argument.

In [None]:
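calcFilePath = 'C:\\Windows\\System32\\calc.exe'

# On Windows, these print calc.exe and C:\Windows\System32.
# On macOS and Linux, the backslash is not a path separator, so basename()
# returns the whole string and dirname() returns an empty string, as shown below.
print(os.path.basename(calcFilePath))
print(os.path.dirname(calcFilePath))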

C:\Windows\System32\calc.exe



If you need a path’s dir name and base name together, you can just call os.path.split() to get a tuple value with these two strings, like so:

In [None]:
calcFilePath = 'C:\\Windows\\System32\\calc.exe'

os.path.split(calcFilePath)


('', 'C:\\Windows\\System32\\calc.exe')

Notice that you could create the same tuple by calling os.path.dirname() and os.path.basename() and placing their return values in a tuple:

In [None]:
(os.path.dirname(calcFilePath), os.path.basename(calcFilePath))

('', 'C:\\Windows\\System32\\calc.exe')
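
The outputs above were produced on Linux, where a backslash is an ordinary filename character, so the whole string is treated as the base name. To reproduce the Windows results on any operating system, a minimal sketch can call ntpath, the standard library module that os.path refers to on Windows. The variable name `calc_file_path` is made up for this sketch.

```python
import ntpath  # the Windows implementation of os.path, importable on any OS

calc_file_path = 'C:\\Windows\\System32\\calc.exe'

print(ntpath.basename(calc_file_path))  # calc.exe
print(ntpath.dirname(calc_file_path))   # C:\Windows\System32
print(ntpath.split(calc_file_path))     # ('C:\\Windows\\System32', 'calc.exe')
```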

#**Finding File Sizes and Folder Contents**#
Once you have ways of handling file paths, you can then start gathering information about specific files and folders. The os and os.path modules provide functions for finding the size of a file in bytes and the files and folders inside a given folder.

Calling os.path.getsize(path) will return the size in bytes of the file in the path argument.
Calling os.listdir(path) will return a list of filename strings for each file in the path argument. (Note that this function is in the os module, not os.path.)

In [None]:
os.path.getsize('C:\\Windows\\System32\\calc.exe')

os.listdir('C:\\Windows\\System32')

On my Windows computer, the calc.exe program is 27,648 bytes in size, and there are a lot of files in C:\Windows\System32. If I want to find the total size of all the files in this directory, I can use os.path.getsize() and os.listdir() together.

In [None]:
totalSize = 0

for filename in os.listdir('C:\\Windows\\System32'):
    totalSize = totalSize + os.path.getsize(os.path.join('C:\\Windows\\System32', filename))

print(totalSize)
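
The same total can be computed with Path objects. The following is a sketch under stated assumptions rather than a drop-in replacement: it builds a temporary folder with two small files (the names `a.txt`, `b.txt`, and `sub` are made up) so that it runs on any operating system, and it counts only the files directly inside the folder.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    folder = Path(temp_dir)
    (folder / 'a.txt').write_bytes(b'12345')        # 5 bytes
    (folder / 'b.txt').write_bytes(b'1234567890')   # 10 bytes
    (folder / 'sub').mkdir()                        # subfolders are skipped below

    # Add up the sizes of the regular files directly inside the folder.
    total_size = sum(f.stat().st_size for f in folder.iterdir() if f.is_file())

print(total_size)  # 15
```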


#**Modifying a List of Files Using Glob Patterns**#
If you want to work on specific files, the glob() method is simpler to use than listdir(). Path objects have a glob() method for listing the contents of a folder according to a glob pattern. Glob patterns are like a simplified form of regular expressions often used in command line commands. The glob() method returns a generator object (a topic beyond the scope of this book) that you’ll need to pass to list() to easily view in the interactive shell:

In [None]:
p = Path('C:/Users/Al/Desktop')

p.glob('*')
print(list(p.glob('*')))

[]
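
The empty list above only shows that the Desktop folder does not exist on this machine. To see glob patterns actually match something, here is a small sketch that creates a few empty files in a temporary folder first. The filenames are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    folder = Path(temp_dir)
    for name in ['project1.docx', 'project2.docx', 'notes.txt', 'todo.txt']:
        (folder / name).touch()

    # * matches any run of characters; ? matches exactly one character.
    txt_files = sorted(f.name for f in folder.glob('*.txt'))
    projects = sorted(f.name for f in folder.glob('project?.docx'))

print(txt_files)  # ['notes.txt', 'todo.txt']
print(projects)   # ['project1.docx', 'project2.docx']
```

The results are sorted because glob() does not promise any particular order.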


#**Checking Path Validity**#
Many Python functions will crash with an error if you supply them with a path that does not exist. Luckily, Path objects have methods to check whether a given path exists and whether it is a file or folder. Assuming that a variable p holds a Path object, you could expect the following:

Calling p.exists() returns True if the path exists or returns False if it doesn’t exist.
Calling p.is_file() returns True if the path exists and is a file, or returns False otherwise.
Calling p.is_dir() returns True if the path exists and is a directory, or returns False otherwise.

In [None]:
winDir = Path('C:/Windows')

notExistsDir = Path('C:/This/Folder/Does/Not/Exist')
calcFile = Path('C:/Windows/System32/calc.exe')
print(winDir.exists())
print(winDir.is_dir())
print(notExistsDir.exists())
print(calcFile.is_file())

False
False
False
False


You can determine whether there is a DVD or flash drive currently attached to the computer by checking for it with the exists() method. For instance, if I wanted to check for a flash drive with the volume named D:\ on my Windows computer, I could do that with the following:

In [None]:
dDrive = Path('D:/')
dDrive.exists()

False
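
Because the Windows paths above do not exist on a Linux machine, every check there returns False. The sketch below, which assumes a temporary folder and made-up names such as `hello.txt` and `does_not_exist`, shows the three methods returning a mix of True and False on any operating system.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    folder = Path(temp_dir)
    file_path = folder / 'hello.txt'
    file_path.write_text('Hello, world!')
    missing = folder / 'does_not_exist'

    # Run the checks while the temporary folder still exists.
    checks = {
        'folder.is_dir()': folder.is_dir(),        # True
        'file_path.is_file()': file_path.is_file(),  # True
        'file_path.is_dir()': file_path.is_dir(),    # False: it is a file
        'missing.exists()': missing.exists(),        # False: never created
    }

for label, result in checks.items():
    print(label, result)
```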

#**The File Reading/Writing Process**#
Since every different type of binary file must be handled in its own way, this book will not go into reading and writing raw binary files directly. Fortunately, many modules make working with binary files easier; you will explore one of them, the shelve module, later in this chapter. The pathlib module’s read_text() method returns a string of the full contents of a text file. Its write_text() method creates a new text file (or overwrites an existing one) with the string passed to it.

Keep in mind that these Path object methods only provide basic interactions with files. The more common way of writing to a file involves using the open() function and file objects. There are three steps to reading or writing files in Python:

Call the open() function to return a File object.
Call the read() or write() method on the File object.
Close the file by calling the close() method on the File object.

In [None]:
from pathlib import Path

p = Path('spam.txt')
# Write first so that the file exists before it is read.
print(p.write_text('Hello, world!'))  # write_text() returns the number of characters written
print(p.read_text())

#**Opening Files with the open() Function**#

To open a file with the open() function, you pass it a string path indicating the file you want to open; it can be either an absolute or relative path. The open() function returns a File object.

Try it by creating a text file named hello.txt using Notepad or TextEdit. Type Hello, world! as the content of this text file and save it in your user home folder.

In [None]:
helloFile = open(Path.home() / 'hello.txt')

The open() function can also accept strings. If you’re using Windows, enter the following into the interactive shell:

In [None]:
helloFile = open('C:\\Users\\your_home_folder\\hello.txt')

#**Reading the Contents of Files**#
Now that you have a File object, you can start reading from it. If you want to read the entire contents of a file as a string value, use the File object’s read() method. Let’s continue with the hello.txt File object you stored in helloFile.

In [None]:
helloContent = helloFile.read()
helloContent

If you think of the contents of a file as a single large string value, the read() method returns the string that is stored in the file.

Alternatively, you can use the readlines() method to get a list of string values from the file, one string for each line of text. For example, create a file named sonnet29.txt in the same directory as hello.txt and write the following text in it:

When, in disgrace with fortune and men's eyes,
I all alone beweep my outcast state,
And trouble deaf heaven with my bootless cries,
And look upon myself and curse my fate,

Make sure to separate the four lines with line breaks.

In [None]:
sonnetFile = open(Path.home() / 'sonnet29.txt')
sonnetFile.readlines()
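
If you would rather not create sonnet29.txt by hand, the following sketch writes the same four lines to a temporary folder (an assumption made for this illustration) and then reads them back with readlines(). Note that each string in the resulting list keeps its trailing newline character.

```python
import tempfile
from pathlib import Path

sonnet_text = (
    "When, in disgrace with fortune and men's eyes,\n"
    "I all alone beweep my outcast state,\n"
    "And trouble deaf heaven with my bootless cries,\n"
    "And look upon myself and curse my fate,\n"
)

with tempfile.TemporaryDirectory() as temp_dir:
    sonnet_path = Path(temp_dir) / 'sonnet29.txt'
    sonnet_path.write_text(sonnet_text)

    with open(sonnet_path) as sonnet_file:
        lines = sonnet_file.readlines()   # one string per line of the file

print(len(lines))      # 4
print(repr(lines[1]))  # 'I all alone beweep my outcast state,\n'
```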

#**Writing to Files**#
Python allows you to write content to a file in a way similar to how the print() function “writes” strings to the screen. You can’t write to a file you’ve opened in read mode, though. Instead, you need to open it in “write plaintext” mode or “append plaintext” mode, or write mode and append mode for short.

Write mode will overwrite the existing file and start from scratch, just like when you overwrite a variable’s value with a new value. Pass 'w' as the second argument to open() to open the file in write mode. Append mode, on the other hand, will append text to the end of the existing file. You can think of this as appending to a list in a variable, rather than overwriting the variable altogether. Pass 'a' as the second argument to open() to open the file in append mode.

If the filename passed to open() does not exist, both write and append mode will create a new, blank file. After reading or writing a file, call the close() method before opening the file again.

In [None]:
baconFile = open('bacon.txt', 'w')
baconFile.write('Hello, world!\n')
baconFile.close()
baconFile = open('bacon.txt', 'a')
baconFile.write('Bacon is not a vegetable.')
baconFile.close()
baconFile = open('bacon.txt')
content = baconFile.read()
baconFile.close()
print(content)

Hello, world!
Bacon is not a vegetable.
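
A common alternative to calling close() yourself is Python’s with statement, which closes the file automatically when the indented block ends, even if an error occurs inside it. This is a different technique from the explicit open()/close() steps shown above, offered here only as a sketch. It writes bacon.txt into a temporary folder (an assumption for this illustration) and uses made-up variable names.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    bacon_path = Path(temp_dir) / 'bacon.txt'

    with open(bacon_path, 'w') as bacon_file:    # write mode: start from scratch
        bacon_file.write('Hello, world!\n')

    with open(bacon_path, 'a') as bacon_file:    # append mode: add to the end
        bacon_file.write('Bacon is not a vegetable.')

    with open(bacon_path) as bacon_file:         # the default mode is read
        content = bacon_file.read()

print(bacon_file.closed)  # True
print(content)
```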


#**Saving Variables with the shelve Module**#
You can save variables in your Python programs to binary shelf files using the shelve module. This way, your program can restore data to variables from the hard drive. The shelve module will let you add Save and Open features to your program. For example, if you ran a program and entered some configuration settings, you could save those settings to a shelf file and then have the program load them the next time it is run.

In [None]:
import shelve

shelfFile = shelve.open('mydata')
cats = ['Zophie', 'Pooka', 'Simon']
shelfFile['cats'] = cats
shelfFile.close()

After running the previous code on Windows, you will see three new files in the current working directory: mydata.bak, mydata.dat, and mydata.dir. On macOS, only a single mydata.db file will be created.

These binary files contain the data you stored in your shelf. The format of these binary files is not important; you only need to know what the shelve module does, not how it does it. The module frees you from worrying about how to store your program’s data to a file.

Your programs can use the shelve module to later reopen and retrieve the data from these shelf files. Shelf values don’t have to be opened in read or write mode; they can do both once opened.

In [None]:
shelfFile = shelve.open('mydata')
print(type(shelfFile))
print(shelfFile['cats'])
shelfFile.close()

<class 'shelve.DbfilenameShelf'>
['Zophie', 'Pooka', 'Simon']


Just like dictionaries, shelf values have keys() and values() methods that will return list-like values of the keys and values in the shelf. Since these methods return list-like values instead of true lists, you should pass them to the list() function to get them in list form.

In [None]:
shelfFile = shelve.open('mydata')
print(list(shelfFile.keys()))
print(list(shelfFile.values()))
shelfFile.close()

['cats']
[['Zophie', 'Pooka', 'Simon']]


Plaintext is useful for creating files that you’ll read in a text editor such as Notepad or TextEdit, but if you want to save data from your Python programs, use the shelve module.
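
As a compact recap of the save-and-restore cycle, the sketch below stores the cats list and reads it back. It makes two assumptions that differ from the cells above: the shelf lives in a temporary folder so that no mydata files are left behind, and shelve.open() is used in a with statement, which closes the shelf automatically.

```python
import shelve
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as temp_dir:
    shelf_name = str(Path(temp_dir) / 'mydata')

    # Save a value; the shelf is closed automatically at the end of the block.
    with shelve.open(shelf_name) as shelf:
        shelf['cats'] = ['Zophie', 'Pooka', 'Simon']

    # Reopen the shelf later and read the value back.
    with shelve.open(shelf_name) as shelf:
        restored_cats = shelf['cats']
        keys = list(shelf.keys())

print(keys)           # ['cats']
print(restored_cats)  # ['Zophie', 'Pooka', 'Simon']
```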

#**Saving Variables with the pprint.pformat() Function**#
Recall from “Pretty Printing” that the pprint.pprint() function will “pretty print” the contents of a list or dictionary to the screen, while the pprint.pformat() function will return this same text as a string instead of printing it. Not only is this string formatted to be easy to read, but it is also syntactically correct Python code. Say you have a dictionary stored in a variable and you want to save this variable and its contents for future use. Using pprint.pformat() will give you a string that you can write to a .py file. This file will be your very own module that you can import whenever you want to use the variable stored in it.

In [None]:
import pprint

cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]
print(pprint.pformat(cats))
fileObj = open('myCats.py', 'w')
print(fileObj.write('cats = ' + pprint.pformat(cats) + '\n'))
fileObj.close()

[{'desc': 'chubby', 'name': 'Zophie'}, {'desc': 'fluffy', 'name': 'Pooka'}]
83


Here, we import pprint to let us use pprint.pformat(). We have a list of dictionaries, stored in a variable cats. To keep the list in cats available even after we close the shell, we use pprint.pformat() to return it as a string. Once we have the data in cats as a string, it’s easy to write the string to a file, which we’ll call myCats.py.

The modules that an import statement imports are themselves just Python scripts. When the string from pprint.pformat() is saved to a .py file, the file is a module that can be imported just like any other.

And since Python scripts are themselves just text files with the .py file extension, your Python programs can even generate other Python programs. You can then import these files into scripts.

In [None]:
import myCats

print(myCats.cats)
print(myCats.cats[0])
print(myCats.cats[0]['name'])
print(myCats.cats[1]['desc'])


[{'desc': 'chubby', 'name': 'Zophie'}, {'desc': 'fluffy', 'name': 'Pooka'}]
{'desc': 'chubby', 'name': 'Zophie'}
Zophie
fluffy


The benefit of creating a .py file (as opposed to saving variables with the shelve module) is that because it is a text file, the contents of the file can be read and modified by anyone with a simple text editor. For most applications, however, saving data using the shelve module is the preferred way to save variables to a file. Only basic data types such as integers, floats, strings, lists, and dictionaries can be written to a file as simple text. File objects, for example, cannot be encoded as text.
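
One way to convince yourself that the pformat() string really is valid Python for basic data types is to parse it back. The sketch below uses ast.literal_eval(), a standard library function that evaluates only literals; this is a swapped-in technique for illustration, not the import approach described above, and the names `text` and `round_trip` are made up.

```python
import ast
import pprint

cats = [{'name': 'Zophie', 'desc': 'chubby'}, {'name': 'Pooka', 'desc': 'fluffy'}]

text = pprint.pformat(cats)          # a string containing a Python literal
round_trip = ast.literal_eval(text)  # parse it back without running arbitrary code

print(round_trip == cats)  # True
```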

#**Project: Generating Random Quiz Files**#
Say you’re a geography teacher with 35 students in your class and you want to give a pop quiz on US state capitals. Alas, your class has a few bad eggs in it, and you can’t trust the students not to cheat. You’d like to randomize the order of questions so that each quiz is unique, making it impossible for anyone to crib answers from anyone else. Of course, doing this by hand would be a lengthy and boring affair. Fortunately, you know some Python.

Here is what the program does:

Creates 35 different quizzes
Creates 50 multiple-choice questions for each quiz, in random order
Provides the correct answer and three random wrong answers for each question, in random order
Writes the quizzes to 35 text files
Writes the answer keys to 35 text files

This means the code will need to do the following:

Store the states and their capitals in a dictionary
Call open(), write(), and close() for the quiz and answer key text files
Use random.shuffle() to randomize the order of the questions and multiple-choice options
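
Before building the full program, it may help to see the randomizing idea in miniature. The following is a rough sketch under stated assumptions, not the finished quiz generator: it uses a six-state stand-in dictionary, a hypothetical helper named `make_question()`, random.sample() to pick the wrong answers, and a seeded random.Random object so that repeated runs produce the same quiz. It prints the questions to the screen instead of writing files.

```python
import random

# A small stand-in for the full states-to-capitals dictionary.
capitals = {
    'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix',
    'Arkansas': 'Little Rock', 'California': 'Sacramento', 'Colorado': 'Denver',
}

def make_question(state, capitals, rng):
    """Return (correct, options): the right capital plus three wrong ones, shuffled."""
    correct = capitals[state]
    wrong = [c for c in capitals.values() if c != correct]
    options = rng.sample(wrong, 3) + [correct]
    rng.shuffle(options)
    return correct, options

rng = random.Random(42)      # seeded only so that repeated runs give the same quiz
states = list(capitals)
rng.shuffle(states)          # randomize the question order

quiz = [(state, *make_question(state, capitals, rng)) for state in states]
for number, (state, correct, options) in enumerate(quiz, start=1):
    print(f'{number}. What is the capital of {state}?')
    for letter, option in zip('ABCD', options):
        print(f'    {letter}. {option}')
```

In the real program, each print() call would become a write() call on the quiz file, and the letter of the correct option would go to the answer key file.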

#**Step 1: Store the Quiz Data in a Dictionary**#
The first step is to create a skeleton script and fill it with your quiz data. Create a file named randomQuizGenerator.py, and make it look like the following:

In [14]:
#! python3
# randomQuizGenerator.py - Creates quizzes with questions and answers in
# random order, along with the answer key.
import random

# The quiz data. Keys are states and values are their capitals.

us_state_to_capital = {
    'Alabama': 'Montgomery', 'Alaska': 'Juneau', 'Arizona': 'Phoenix', 'Arkansas': 'Little Rock',
    'California': 'Sacramento', 'Colorado': 'Denver', 'Connecticut': 'Hartford', 'Delaware': 'Dover',
    'Florida': 'Tallahassee', 'Georgia': 'Atlanta', 'Hawaii': 'Honolulu', 'Idaho': 'Boise',
    'Illinois': 'Springfield', 'Indiana': 'Indianapolis', 'Iowa': 'Des Moines', 'Kansas': 'Topeka',
    'Kentucky': 'Frankfort', 'Louisiana': 'Baton Rouge', 'Maine': 'Augusta', 'Maryland': 'Annapolis',
    'Massachusetts': 'Boston', 'Michigan': 'Lansing', 'Minnesota': 'Saint Paul', 'Mississippi': 'Jackson',
    'Missouri': 'Jefferson City', 'Montana': 'Helena', 'Nebraska': 'Lincoln', 'Nevada': 'Carson City',
    'New Hampshire': 'Concord', 'New Jersey': 'Trenton', 'New Mexico': 'Santa Fe', 'New York': 'Albany',
    'North Carolina': 'Raleigh', 'North Dakota': 'Bismarck', 'Ohio': 'Columbus', 'Oklahoma': 'Oklahoma City',
    'Oregon': 'Salem', 'Pennsylvania': 'Harrisburg', 'Rhode Island': 'Providence', 'South Carolina': 'Columbia',
    'South Dakota': 'Pierre', 'Tennessee': 'Nashville', 'Texas': 'Austin', 'Utah': 'Salt Lake City',
    'Vermont': 'Montpelier', 'Virginia': 'Richmond', 'Washington': 'Olympia', 'West Virginia': 'Charleston',
    'Wisconsin': 'Madison', 'Wyoming': 'Cheyenne'}

# Generate 35 quiz files.
for quizNum in range(35):
    # TODO: Create the quiz and answer key files.
    pass  # placeholder so that the empty loop body is valid Python
