# Python Basics: Sections 12, 13, and 15

## Section 12: Working with Files

### Read from a Text File

#### TL;DR

The following shows how to read all lines from the readme.txt file into a list of strings:

In [None]:
with open('readme.txt') as f:
    lines = f.readlines()

#### Steps for Reading a Text File in Python

To read a text file, follow these steps:
* Open a text file for reading by using the *open()* function
* Read text from the text file using the *read()*, *readline()*, or *readlines()* method of the file object
* Close the file using the file *close()* method

##### open() function

In [None]:
# Syntax: open(path_to_file, mode)

For example, if the file *readme.txt* is stored in the *c:/sample* folder rather than in the same folder as the program, you need to specify the path to the file as *c:/sample/readme.txt*.

The *mode* is an optional parameter. It’s a string that specifies the mode in which you want to open the file. The following table shows available modes for opening a text file:

|Mode|Description|
|:----|----:|
|'r'|Open a text file for reading text|
|'w'|Open a text file for writing text|
|'a'|Open a text file for appending text|

In [None]:
# Ex. Opening a file that is stored in the same folder as the program.
f = open('the-zen-of-python.txt','r')

##### Reading text methods

The file object provides you with three methods for reading text from a text file:
* *read(size)* – read some contents of a file based on the optional size and return the contents as a string. If you omit the size, the *read()* method reads from where it left off till the end of the file. If the end of a file has been reached, the *read()* method returns an empty string.
* *readline()* – read a single line from a text file and return the line as a string. If the end of a file has been reached, the *readline()* returns an empty string.
* *readlines()* – read all the lines of the text file into a list of strings. This method is useful if you have a small file and you want to manipulate the whole text of that file.
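
As a rough illustration of how the three methods differ, the sketch below writes a small throwaway file into a temporary directory and reads it back in three ways. The file name `sample.txt` and its contents are made up for this example.

```python
import os
import tempfile

# Hypothetical sample file, created only for this illustration.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.txt')
    with open(path, 'w') as f:
        f.write('first\nsecond\nthird\n')

    with open(path) as f:
        first_chars = f.read(3)      # up to 3 characters from the start
        rest_of_line = f.readline()  # continues from the current position
        remaining = f.readlines()    # every remaining line, newlines kept
        at_end = f.read()            # nothing left to read

print(repr(first_chars))   # 'fir'
print(repr(rest_of_line))  # 'st\n'
print(remaining)           # ['second\n', 'third\n']
print(repr(at_end))        # ''
```

Note that all three methods share one position in the file, so each call picks up where the previous one stopped.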

##### close() method

The file that you open will remain open until you close it using the *close()* method.

It’s important to close the file that is no longer in use for the following reasons:
* First, when you open a file in your script, the operating system may lock it (this is common on Windows), so other programs or scripts cannot use it until you close it.
* Second, the operating system allows each process only a limited number of open file descriptors. Although this limit might be high, it’s possible to open a lot of files and exhaust it.
* Third, leaving many files open may lead to race conditions which occur when multiple processes attempt to modify one file at the same time and can cause all kinds of unexpected behaviors.

In [None]:
f.close()

# To close automatically:
# with open(path_to_file) as f:
#     contents = f.readlines()
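
To check that the *with* statement really closes the file, you can inspect the *closed* attribute of the file object. The sketch below uses a throwaway file in a temporary directory; the file name is an assumption made for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.txt')  # hypothetical file name
    with open(path, 'w') as f:
        f.write('hello\n')

    with open(path) as f:
        open_inside = f.closed   # False: still open inside the block
    closed_after = f.closed      # True: closed when the block exits

    # Reading from a closed file raises ValueError.
    try:
        f.read()
    except ValueError:
        read_failed = True
    else:
        read_failed = False

print(open_inside, closed_after, read_failed)  # False True True
```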

#### Reading a Text File Examples

In [None]:
# read() method
with open('the-zen-of-python.txt') as f:
    contents = f.read()
    print(contents)
# Output:
# Beautiful is better than ugly.
# Explicit is better than implicit.
# Simple is better than complex.
# ...

# readlines() method to read the text file and return a list of lines
with open('the-zen-of-python.txt') as f:
    for line in f.readlines():
        print(line)
# Output:
# Beautiful is better than ugly.
#
# Explicit is better than implicit.
#
# Simple is better than complex.
#
# Complex is better than complicated.
# ...

# strip() method, which removes the trailing newline and so avoids the blank lines
with open('the-zen-of-python.txt') as f:
    for line in f.readlines():
        print(line.strip())

# readline() to read the text file line by line
with open('the-zen-of-python.txt') as f:
    while True:
        line = f.readline()
        if not line:
            break
        print(line.strip())
# Output:
# Beautiful is better than ugly.
# Explicit is better than implicit.
# Simple is better than complex.
# ...

#### A More Concise Way to Read a Text File Line by Line

In [None]:
with open('the-zen-of-python.txt') as f:
    for line in f:
        print(line.strip()) 

#### Read UTF-8 Text Files

The code in the previous examples works fine with ASCII text files. However, if you’re dealing with text in other languages such as Japanese, Chinese, or Korean, the file is not a simple ASCII text file. It’s likely a UTF-8 file that uses more than just the standard ASCII characters.

To open a UTF-8 text file, you need to pass the *encoding='utf-8'* to the *open()* function to instruct it to expect UTF-8 characters from the file.

For the demonstration, you’ll use the following *quotes.txt* file that contains some quotes in Japanese.

In [None]:
with open('quotes.txt', encoding='utf-8') as f:
    for line in f:
        print(line.strip())
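
The round trip below is a small sketch of why the encoding matters: it writes a short Japanese string to a temporary file as UTF-8 and reads it back. The sample string and the file name `greeting.txt` are invented for this example.

```python
import os
import tempfile

text = 'こんにちは'  # sample text chosen for this illustration

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'greeting.txt')  # hypothetical file name

    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)

    with open(path, encoding='utf-8') as f:
        round_trip = f.read()

    size_in_bytes = os.path.getsize(path)

print(round_trip == text)  # True
print(len(text))           # 5 characters
print(size_in_bytes)       # 15 bytes: each character takes 3 bytes in UTF-8
```

Passing the same encoding when writing and when reading is what keeps the text intact, regardless of the platform default.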

### Write to a Text File

#### Steps for Writing to Text Files

To write to a text file in Python, follow these steps:
* Open the text file for writing (or append) using the *open()* function
* Write to the text file using the *write()* or *writelines()* method
* Close the file using the *close()* method

The *open()* function accepts many parameters. We'll focus on the first two parameters:
* The *file* parameter specifies the path to the text file that you want to open for writing
* The *mode* parameter specifies the mode in which you want to open the text file

|Mode|Description|
|:----|----:|
|'w'|Open a text file for writing. If the file exists, the function will truncate all the contents as soon as you open it. If the file doesn't exist, the function creates a new file.|
|'a'|Open a text file for appending text. If the file exists, the function appends contents at the end of the file.|
|'+'|Open a text file for updating (both reading & writing). It is combined with another mode, for example 'r+' or 'w+'.|

The *open()* function returns a file object that has two useful methods for writing text to the file: *write()* and *writelines()*
* The *write()* method writes a string to a text file
* The *writelines()* method writes a list of strings to a file at once

The *writelines()* method accepts an iterable object, not just a list, so you can pass a tuple of strings, a set of strings, etc., to the *writelines()* method.
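
As a minimal sketch of this point, the example below passes a tuple and then a generator expression to *writelines()*. It also shows that *writelines()* does not add line separators on its own. The file lives in a temporary directory so nothing is left behind.

```python
import os
import tempfile

lines = ('Readme', 'How to write text files in Python')  # a tuple works too

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'readme.txt')

    # writelines() does not add line separators by itself.
    with open(path, 'w') as f:
        f.writelines(lines)
    with open(path) as f:
        joined = f.read()

    # A generator expression can append the newline to each item.
    with open(path, 'w') as f:
        f.writelines(line + '\n' for line in lines)
    with open(path) as f:
        separated = f.read()

print(repr(joined))     # 'ReadmeHow to write text files in Python'
print(repr(separated))  # 'Readme\nHow to write text files in Python\n'
```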

To write a new line to a text file, you need to manually add a new line character:

In [None]:
f.write('\n')
f.writelines('\n')

#### Writing Text File Examples

In [None]:
# Use the write() method to write a list of strings to a text file, one per line:
lines = ['Readme', 'How to write text files in Python']
with open('readme.txt', 'w') as f:
    for line in lines:
        f.write(line)
        f.write('\n')

# Write a list of strings with writelines() (no newlines are added between the items)
lines = ['Readme', 'How to write text files in Python']
with open('readme.txt', 'w') as f:
    f.writelines(lines)

# If you treat each element of the list as a line, you need to concatenate it with the newline character:
lines = ['Readme', 'How to write text files in Python']
with open('readme.txt', 'w') as f:
    f.write('\n'.join(lines))

#### Appending Text Files

To append to a text file, you need to open the text file in append mode.

In [None]:
more_lines = ['', 'Append text files', 'The End']

with open('readme.txt', 'a') as f:
    f.write('\n'.join(more_lines))

#### Writing to a UTF-8 Text File

If you write UTF-8 characters to a text file using the code from the previous example, you may get an error on systems whose default encoding is not UTF-8 (for example, Windows):

**UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-44: character maps to <undefined>**

To open a file and write UTF-8 characters to a file, you need to pass the *encoding='utf-8'* parameter to the *open()* function.

In [None]:
quote = '成功を収める人とは人が投げてきたレンガでしっかりした基盤を築くことができる人のことである。'

with open('quotes.txt', 'w', encoding='utf-8') as f:
    f.write(quote)

### Create a New Text File

#### Using the open() Function to Create a New Text File

To create a new text file, use the *open()* function.

In [None]:
# Syntax: f = open(path_to_file, mode)

For creating a new text file, use one of the following modes:
* *'w'* -- open a file for writing. If the file doesn't exist, the *open()* function creates a new file. Otherwise, it'll overwrite the contents of the existing file.
* *'x'* -- open a file for exclusive creation. If the file exists, the *open()* function raises an error (*FileExistsError*). Otherwise, it'll create the text file.

In [None]:
with open('readme.txt', 'w') as f:
    f.write('Create a new text file!')

If you want to create a file in a specific directory, for example *docs/readme.txt*, you need to ensure that the *docs* directory exists before creating the file. Otherwise, you'll get an exception.

In [None]:
with open('docs/readme.txt', 'w') as f:
    f.write('Create a new text file!')

# Outputs Error:
# FileNotFoundError: [Errno 2] No such file or directory: 'docs/readme.txt'

In this example, Python **raises an exception** because the *docs* directory doesn't exist. Therefore, it could not create the *readme.txt* file in that directory. To fix the issue, you need to create the *docs* directory first and then create the *readme.txt* file in that folder.

In [None]:
try:
    with open('docs/readme.txt', 'w') as f:
        f.write('Create a new text file!')
except FileNotFoundError:
    print("The 'docs' directory does not exist")

# Output:
# The 'docs' directory does not exist
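
One possible way to create the missing directory before opening the file is sketched below using *pathlib*. It works inside a temporary directory so it does not touch your project; the `docs/readme.txt` layout mirrors the example above, and the use of *Path.mkdir()* is this sketch's choice rather than the only option.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / 'docs' / 'readme.txt'

    # Create the parent directory (and any missing ancestors) first.
    # exist_ok=True makes the call harmless if it already exists.
    target.parent.mkdir(parents=True, exist_ok=True)

    with open(target, 'w') as f:
        f.write('Create a new text file!')

    created = target.is_file()

print(created)  # True
```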

If you don't want to create a new text file in case it already exists, you can use the *'x'* mode when calling the *open()* function:

In [None]:
with open('readme.txt', 'x') as f:
    f.write('Create a new text file!')
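
The following sketch shows the behavior of the *'x'* mode when the file already exists. It opens the same path twice inside a temporary directory: the first call creates the file and the second one raises *FileExistsError*, leaving the original contents untouched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'readme.txt')

    # First call: the file does not exist yet, so 'x' creates it.
    with open(path, 'x') as f:
        f.write('Create a new text file!')

    # Second call: the file now exists, so 'x' refuses to overwrite it.
    try:
        with open(path, 'x') as f:
            f.write('This text is never written')
        second_attempt = 'created'
    except FileExistsError:
        second_attempt = 'already exists'

    with open(path) as f:
        contents = f.read()

print(second_attempt)  # already exists
print(contents)        # Create a new text file!
```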

### Check if a File Exists

When processing files, you'll often want to check if a file exists before doing something else with it such as **reading from the file** or **writing to it**. To do it, you can use the *exists()* function from the *os.path* module or *is_file()* method from the *Path* class in the *pathlib* module.

#### os.path.exists() function

In [None]:
from os.path import exists

file_exists = exists(path_to_file)

#### Path.is_file() method

In [None]:
from pathlib import Path

path = Path(path_to_file)

path.is_file()

#### Using os.path.exists() function to check if a file exists

To check if a file exists, you pass the file path to the *exists()* function from the *os.path* module:
* First, import the *os.path* module
* Second, call the *exists()* function

If the file exists, the *exists()* function returns *True*. Otherwise, it returns *False*. If the file is in the same folder as the program, the *path_to_file* is simply the file name. If that's not the case, you need to pass the full path of the file.

Even if you run the program on Windows, you should use the forward-slash (*/*) to separate the path. It'll work across Windows, macOS, and Linux.

The following example uses the *exists()* function to check if the *readme.txt* file exists in the same folder as the program:

In [None]:
import os.path

file_exists = os.path.exists('readme.txt')

print(file_exists)

To make the call to the *exists()* function shorter and more obvious, you can import that function and rename it to *file_exists()* function like so:

In [None]:
from os.path import exists as file_exists

file_exists('readme.txt')

#### Using the pathlib module to check if a file exists

The *pathlib* module allows you to manipulate files and folders using the object-oriented approach.
* First, import the *Path* class from the *pathlib* module
* Then instantiate a new instance of the *Path* class and initialize it with the file path that you want to check for existence
* Finally, check if the file exists using the *is_file()* method

If the file doesn't exist, the *is_file()* method returns *False*. Otherwise, it returns *True*. The following example shows how to use the *Path* class from the *pathlib* module to check if the *readme.txt* file exists in the same folder as the program:

In [None]:
from pathlib import Path

path_to_file = 'readme.txt'
path = Path(path_to_file)

if path.is_file():
    print(f'The file {path_to_file} exists')
else:
    print(f'The file {path_to_file} does not exist')

# Output:
# The file readme.txt exists
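
The two approaches are not fully interchangeable: *os.path.exists()* also returns *True* for directories, while *is_file()* returns *True* only for regular files. The sketch below illustrates the difference with a temporary directory; the file names are made up for this example.

```python
import os.path
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = Path(tmp_dir) / 'readme.txt'
    file_path.write_text('hello')

    dir_exists = os.path.exists(tmp_dir)    # True: a directory "exists"
    dir_is_file = Path(tmp_dir).is_file()   # False: it is not a file
    file_is_file = file_path.is_file()      # True
    missing_is_file = (Path(tmp_dir) / 'missing.txt').is_file()  # False

print(dir_exists, dir_is_file, file_is_file, missing_is_file)
# True False True False
```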

### Read CSV Files

#### What is a CSV File

CSV stands for comma-separated values. A CSV file is a delimited text file that uses a comma to separate values. A CSV file consists of one or more lines. Each line is a data record. And each data record consists of one or more values separated by commas. In addition, all the lines of a CSV file have the same number of values. Typically, you use a CSV file to store tabular data in plain text. The CSV file format is quite popular and supported by many software applications such as Microsoft Excel and Google Sheets.

|name|area|country_code2|country_code3|
|:---|:---:|:---:|---:|
|Afghanistan|652090|AF|AFG|
|Albania|28748|AL|ALB|
|Algeria|2381741|DZ|DZA|
|American Samoa|199|AS|ASM|
|Andorra|468|AD|AND|
|Angola|1246700|AO|AGO|
|Anguilla|96|AI|AIA|
|etc.|etc.|etc.|etc.|

#### Reading a csv file in Python

To read a CSV file in Python, you follow these steps:
* First, import the csv module
* Second, open the CSV file using the built-in open() function
* Note: If the CSV contains UTF-8 characters, you need to specify the encoding
* Third, pass the file object (*f*) to the *reader()* function of the *csv* module. The *reader()* function returns a csv reader object

In [None]:
import csv

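# open the file with the default encoding...
f = open('path/to/csv_file')

# ...or, if the file contains UTF-8 characters, specify the encoding instead (use one or the other)
f = open('path/to/csv_file', encoding='UTF8')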

csv_reader = csv.reader(f)

The *csv_reader* is an **iterable** object of lines from the CSV file. Therefore, you can iterate over the lines of the CSV file using a *for* loop:

In [None]:
for line in csv_reader:
    print(line)

Each line is a list of values. To access each value, you use the square bracket notation *[]*. The first value has an index of 0. The second value has an index of 1, and so on. Finally, always close the file once you no longer need it by calling the *close()* method of the file object. It's easier to use the *with* statement so that you don't need to explicitly call the *close()* method.

In [None]:
line[0]

f.close()

The following illustrates all the steps for reading a CSV file:

In [None]:
import csv

with open('path/to/csv_file', 'r') as f:
    csv_reader = csv.reader(f)
    for line in csv_reader:
        # process each line
        print(line)

#### Reading a CSV file examples

We'll use the *country.csv* file that contains country information including name, area, 2-letter country code, and 3-letter country code.

The following shows how to read the *country.csv* file and display each line to the screen:

In [None]:
import csv

with open('country.csv', encoding = "utf8") as f:
    csv_reader = csv.reader(f)
    for line in csv_reader:
        print(line)

# Output:
# ['name', 'area', 'country_code2', 'country_code3']
# ['Afghanistan', '652090.00', 'AF', 'AFG']
# ['Albania', '28748.00', 'AL', 'ALB']
# ['Algeria', '2381741.00', 'DZ', 'DZA']
# ['American Samoa', '199.00', 'AS', 'ASM']
# ...

The *country.csv* has the first line as the header. To separate the header and data, you use the *enumerate()* function to get the index of each line:

In [None]:
import csv

with open('country.csv', encoding="utf8") as f:
    csv_reader = csv.reader(f)
    for line_no, line in enumerate(csv_reader, 1):
        if line_no == 1:
            print('Header:')
            print(line)  # header
            print('Data:')
        else:
            print(line)  # data

In this example, we use the *enumerate()* function and specify the index of the first line as 1.

Inside the loop, if the *line_no* is 1, the line is the header. Otherwise, it’s a data line.

Another way to skip the header is to use the *next()* function. The *next()* function advances the reader to the next line.

In [None]:
import csv

with open('country.csv', encoding="utf8") as f:
    csv_reader = csv.reader(f)

    # skip the first row
    next(csv_reader)

    # show the data
    for line in csv_reader:
        print(line)

The following reads the *country.csv* file and calculates the total area of all countries:

In [None]:
import csv

total_area = 0

# calculate the total area of all countries

with open('country.csv', encoding="utf8") as f:
    csv_reader = csv.reader(f)

    # skip the header
    next(csv_reader)

    # calculate total
    for line in csv_reader:
        total_area += float(line[1])

print(total_area)

# Output:
# 148956306.9

#### Reading a CSV file using the DictReader class

When you use the csv.reader() function, you can access values of the CSV file using the bracket notation such as line[0], line[1], and so on. However, using the csv.reader() function has two main limitations:

* First, the way to access the values from the CSV file is not so obvious. For example, the line[0] implicitly means the country name. It would be more expressive if you can access the country name like line['country_name'].
* Second, when the order of columns from the CSV file is changed or new columns are added, you need to modify the code to get the right data.

This is where the DictReader class comes into play. The DictReader class also comes from the csv module.

The DictReader class allows you to create an object like a regular CSV reader. But it maps the information of each line to a dictionary (dict) whose keys are specified by the values of the first line.

By using the DictReader class, you can access values in the country.csv file like line['name'], line['area'], line['country_code2'], and line['country_code3'].

The following example uses the DictReader class to read the country.csv file:

In [None]:
import csv

with open('country.csv', encoding="utf8") as f:
    csv_reader = csv.DictReader(f)
    # DictReader consumes the header line itself, so don't call next() here
    # show the data
    for line in csv_reader:
        print(f"The area of {line['name']} is {line['area']} km2")

# Output:
# The area of Afghanistan is 652090.00 km2
# The area of Albania is 28748.00 km2
# The area of Algeria is 2381741.00 km2        
# ...

If you want to have different field names other than the ones specified in the first line, you can explicitly specify them by passing a list of field names to the *DictReader()* constructor like this:

In [None]:
import csv

fieldnames = ['country_name', 'area', 'code2', 'code3']

with open('country.csv', encoding="utf8") as f:
    csv_reader = csv.DictReader(f, fieldnames)
    # with explicit fieldnames, the header line is read as data, so skip it
    next(csv_reader)
    for line in csv_reader:
        print(f"The area of {line['country_name']} is {line['area']} km2")
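
One detail worth checking is how *DictReader* treats the first line in each case. The sketch below uses an in-memory CSV (made-up sample data in the same shape as *country.csv*, wrapped in *io.StringIO* instead of a real file) to compare the default behavior with explicit field names.

```python
import csv
import io

# Made-up sample data in the same shape as country.csv.
sample = (
    'name,area,country_code2,country_code3\n'
    'Albania,28748.00,AL,ALB\n'
    'Algeria,2381741.00,DZ,DZA\n'
)

# Default: the first line supplies the keys and is not returned as a row.
default_rows = list(csv.DictReader(io.StringIO(sample)))

# Explicit fieldnames: the first line is returned as an ordinary row.
custom_rows = list(csv.DictReader(
    io.StringIO(sample),
    fieldnames=['country_name', 'area', 'code2', 'code3'],
))

print(len(default_rows))               # 2
print(default_rows[0]['name'])         # Albania
print(len(custom_rows))                # 3
print(custom_rows[0]['country_name'])  # name
```

So a call to *next()* to skip the header is only needed when you pass your own field names.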

### Write CSV Files

#### Steps for Writing a CSV File

To write data into a CSV file, you follow these steps:

* First, open the CSV file for writing (w mode) by using the open() function.
* Second, create a CSV writer object by calling the writer() function of the csv module.
* Third, write data to CSV file by calling the writerow() or writerows() method of the CSV writer object.
* Finally, close the file once you complete writing data to it.

In [None]:
import csv

# open the file in the write mode
f = open('path/to/csv_file', 'w')

# create the csv writer
writer = csv.writer(f)

# write a row to the csv file
writer.writerow(row)

# close the file
f.close()

It’ll be shorter if you use the with statement so that you don’t need to call the *close()* method to explicitly close the file:

In [None]:
import csv

# open the file in the write mode
with open('path/to/csv_file', 'w') as f:
    # create the csv writer
    writer = csv.writer(f)

    # write a row to the csv file
    writer.writerow(row)

If you’re dealing with non-ASCII characters, you need to specify the character encoding in the open() function.

The following illustrates how to write UTF-8 characters to a CSV file:

In [None]:
import csv

# open the file in the write mode
with open('path/to/csv_file', 'w', encoding='UTF8') as f:
    # create the csv writer
    writer = csv.writer(f)

    # write a row to the csv file
    writer.writerow(row)

#### Writing to CSV Files Example

In [None]:
import csv  

header = ['name', 'area', 'country_code2', 'country_code3']
data = ['Afghanistan', 652090, 'AF', 'AFG']

with open('countries.csv', 'w', encoding='UTF8') as f:
    writer = csv.writer(f)

    # write the header
    writer.writerow(header)

    # write the data
    writer.writerow(data)

If you open the *countries.csv* on Windows, you may see one issue: the file contents have an additional blank line between two subsequent rows:

name,area,country_code2,country_code3

Afghanistan,652090,AF,AFG

To remove the blank line, you pass the keyword argument *newline=''* to the *open()* function as follows:

In [None]:
import csv

header = ['name', 'area', 'country_code2', 'country_code3']
data = ['Afghanistan', 652090, 'AF', 'AFG']


with open('countries.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)

    # write the header
    writer.writerow(header)

    # write the data
    writer.writerow(data)

# Contents of countries.csv:
# name,area,country_code2,country_code3
# Afghanistan,652090,AF,AFG

#### Writing multiple rows to CSV files

To write multiple rows to a CSV file at once, you use the *writerows()* method of the CSV writer object.

The following uses the *writerows()* method to write multiple rows into the *countries.csv* file:

In [None]:
import csv

header = ['name', 'area', 'country_code2', 'country_code3']
data = [
    ['Albania', 28748, 'AL', 'ALB'],
    ['Algeria', 2381741, 'DZ', 'DZA'],
    ['American Samoa', 199, 'AS', 'ASM'],
    ['Andorra', 468, 'AD', 'AND'],
    ['Angola', 1246700, 'AO', 'AGO']
]

with open('countries.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)

    # write the header
    writer.writerow(header)

    # write multiple rows
    writer.writerows(data)

#### Writing to CSV files using the DictWriter class

If each row of the CSV file is a dictionary, you can use the *DictWriter* class of the csv module to write the dictionary to the CSV file.

The example illustrates how to use the DictWriter class to write data to a CSV file:

In [None]:
import csv

# csv header
fieldnames = ['name', 'area', 'country_code2', 'country_code3']

# csv data
rows = [
    {'name': 'Albania',
    'area': 28748,
    'country_code2': 'AL',
    'country_code3': 'ALB'},
    {'name': 'Algeria',
    'area': 2381741,
    'country_code2': 'DZ',
    'country_code3': 'DZA'},
    {'name': 'American Samoa',
    'area': 199,
    'country_code2': 'AS',
    'country_code3': 'ASM'}
]

with open('countries.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

How it works.
* First, define variables that hold the field names and data rows of the CSV file.
* Next, open the CSV file for writing by calling the open() function.
* Then, create a new instance of the DictWriter class by passing the file object (f) and fieldnames argument to it.
* After that, write the header for the CSV file by calling the writeheader() method.
* Finally, write data rows to the CSV file using the writerows() method.
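
The steps above can be tried without touching the disk. The sketch below is an illustration under one assumption: an *io.StringIO* buffer stands in for the real file object. It writes two rows with *DictWriter* and reads them back with *DictReader* to confirm the round trip.

```python
import csv
import io

fieldnames = ['name', 'area', 'country_code2', 'country_code3']
rows = [
    {'name': 'Albania', 'area': 28748,
     'country_code2': 'AL', 'country_code3': 'ALB'},
    {'name': 'Algeria', 'area': 2381741,
     'country_code2': 'DZ', 'country_code3': 'DZA'},
]

# io.StringIO stands in for a real file; newline='' mirrors the open() call.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)

# Read the text back to confirm the header and rows round-trip.
buffer.seek(0)
read_back = list(csv.DictReader(buffer))

print(read_back[0]['name'])  # Albania
print(read_back[1]['area'])  # 2381741 (as a string)
```

Note that values come back as strings, so numeric columns such as *area* need to be converted after reading.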

### Rename a File

To rename a file, you use the *os.rename()* function. If the *src* file does not exist, the *os.rename()* function raises a *FileNotFoundError*. Similarly, on Windows, if the *dst* already exists, the *os.rename()* function raises a *FileExistsError*. On Unix-like systems, an existing *dst* file is instead replaced silently if the user has permission.

For example, the following uses the os.rename() function to rename the file readme.txt to notes.txt:

In [None]:
import os

os.rename('readme.txt', 'notes.txt')

To avoid an error if the *readme.txt* doesn’t exist and/or the *notes.txt* file already exists, you can use the *try...except* statement:

In [None]:
import os

try:
    os.rename('readme.txt', 'notes.txt')
except FileNotFoundError as e:
    print(e)
except FileExistsError as e:
    print(e)

# Output shows file does not exist:
# [WinError 2] The system cannot find the file specified: 'readme.txt' -> 'notes.txt'

# Output shows file already exists:
# [WinError 183] Cannot create a file when that file already exists: 'readme.txt' -> 'notes.txt'
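
If overwriting the destination is actually what you want, the standard library also provides *os.replace()*, which replaces an existing destination file on every platform. The sketch below demonstrates it in a temporary directory; the file contents are made up for this example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, 'readme.txt')
    dst = os.path.join(tmp_dir, 'notes.txt')

    with open(src, 'w') as f:
        f.write('new contents')
    with open(dst, 'w') as f:
        f.write('old contents')

    # os.replace() overwrites an existing destination file.
    os.replace(src, dst)

    src_still_exists = os.path.exists(src)
    with open(dst) as f:
        dst_contents = f.read()

print(src_still_exists)  # False
print(dst_contents)      # new contents
```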

### Delete a File

To delete a file, you use the *remove()* function of the os built-in module. For example, the following uses the *os.remove()* function to delete the readme.txt file. If the readme.txt file doesn't exist, the *os.remove()* function will issue an error:

In [None]:
import os

os.remove('readme.txt')

# Output:
# Error: FileNotFoundError: [WinError 2] The system cannot find the file specified: 'readme.txt'

To avoid the error, you can check that the file exists before deleting it or use the *try...except* statement to catch the exception if the file doesn't exist:

In [None]:
# check if file exists
import os

filename = 'readme.txt'
if os.path.exists(filename):
    os.remove(filename)

# try...except statement
import os

try:
    os.remove('readme.txt')
except FileNotFoundError as e:
    print(e)
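
As an alternative sketch, the *pathlib* module offers *Path.unlink()*, which accepts *missing_ok=True* (available from Python 3.8) to ignore a file that is already gone. The example runs in a temporary directory with a made-up file.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / 'readme.txt'
    path.write_text('temporary')

    path.unlink(missing_ok=True)   # deletes the file
    path.unlink(missing_ok=True)   # file is already gone: no exception

    still_there = path.exists()

print(still_there)  # False
```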

## Section 13: Working Directories

### Working with Directories

#### Get the current working directory

The current working directory is the directory in which the Python script is running, and relative paths are resolved against it. To get the current working directory, you use the *os.getcwd()* function. To change the current working directory, you use the *os.chdir()* function.

In [None]:
# get current directory
import os


cwd = os.getcwd()
print(cwd)


# change directory
import os


os.chdir('/script')
cwd = os.getcwd()
print(cwd)

#### Join and split a path

To make a program work across platforms including Windows, Linux, and macOS, you need to use platform-independent file and directory paths. Python provides you with a submodule os.path that contains several useful functions and constants to join and split paths.

The *join()* function joins path components together and returns a path with the corresponding path separator. For example, it uses backslash (\\) on Windows and forward slash (/) on macOS or Linux.

The *split()* function splits a path into a pair: everything before the last path separator, and the final component after it. Here’s an example of using *join()* and *split()* functions:

In [None]:
import os

fp = os.path.join('temp', 'python')
print(fp)  # temp\python (on Windows)

pc = os.path.split(fp)
print(pc)  # ('temp', 'python')
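
To take a path apart further, *os.path.splitext()* separates the extension from the file name. The sketch below combines it with *split()*; the path components are made up for this example, and the comparisons use *os.path.join()* so the results hold on any platform.

```python
import os

fp = os.path.join('temp', 'python', 'readme.txt')

head, tail = os.path.split(fp)      # before / after the last separator
name, ext = os.path.splitext(tail)  # file name without / with the extension

print(head == os.path.join('temp', 'python'))  # True
print(tail)   # readme.txt
print(name)   # readme
print(ext)    # .txt
print(os.path.join(head, tail) == fp)  # True
```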


#### Test if a path is a directory

To check if a path exists and is a directory, you can use the *os.path.exists()* and *os.path.isdir()* functions. For example:

In [None]:
import os

dir_path = os.path.join("C:\\", "temp")
print(dir_path)

# both checks must pass; isdir() alone would also suffice
if os.path.exists(dir_path) and os.path.isdir(dir_path):
    print(f'The {dir_path} is a directory')

#### Create a directory

To create a new directory, you use *os.mkdir()* function. And you should always check if a directory exists first before creating a new directory.

The following example creates a new directory called python under the *c:\temp* directory.

In [None]:
import os

# dir_path avoids shadowing the built-in dir() function
dir_path = os.path.join("C:\\", "temp", "python")
if not os.path.exists(dir_path):
    os.mkdir(dir_path)

#### Rename a directory

To rename the directory, you use the *os.rename()* function:

In [None]:
import os

oldpath = os.path.join("C:\\", "temp", "python")
newpath = os.path.join("C:\\", "temp", "python3")

if os.path.exists(oldpath) and not os.path.exists(newpath):
    os.rename(oldpath, newpath)
    print("'{0}' was renamed to '{1}'".format(oldpath, newpath))


#### Delete a directory

To delete a directory, you use the *os.rmdir()* function. Note that it removes only empty directories and raises an *OSError* otherwise:

In [None]:
import os

# dir_path avoids shadowing the built-in dir() function
dir_path = os.path.join("C:\\", "temp", "python")
if os.path.exists(dir_path):
    os.rmdir(dir_path)
    print(dir_path + ' is removed.')
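
Keep in mind that *os.rmdir()* deletes only an empty directory. The sketch below shows what happens with a non-empty one and, as one possible alternative, uses *shutil.rmtree()* from the standard library to delete a directory together with its contents. Everything happens inside a temporary directory.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, 'python')
    os.mkdir(target)
    with open(os.path.join(target, 'notes.txt'), 'w') as f:
        f.write('not empty')

    # os.rmdir() refuses to delete a directory that still has contents.
    try:
        os.rmdir(target)
        rmdir_result = 'removed'
    except OSError:
        rmdir_result = 'not empty'

    # shutil.rmtree() deletes the directory and everything inside it.
    shutil.rmtree(target)
    exists_after = os.path.exists(target)

print(rmdir_result)  # not empty
print(exists_after)  # False
```

Because *shutil.rmtree()* is irreversible, it is worth double-checking the path before calling it.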

#### Traverse a directory recursively

The *os.walk()* function allows you to traverse a directory recursively. For each directory it visits, the *os.walk()* function yields the directory path, its sub-directories, and its files.

The following example shows how to print the number of files in each directory under the *c:\temp* directory:

In [None]:
import os

path = "c:\\temp"
for root, dirs, files in os.walk(path):
    print("{0} has {1} files".format(root, len(files)))

### List Files in a Directory

Sometimes, you may want to list all files from a directory for processing. For example, you might want to find all images of a directory and resize each of them. To list all files in a directory, you can use the *os.walk()* function.

The *os.walk()* function generates file names in a directory by walking the tree either top-down or bottom-up. The *os.walk()* function yields a tuple with three fields (dirpath, dirnames, and filenames) for each directory in the directory tree.

Note that the *os.walk()* function examines the whole directory tree. Therefore, you can use it to get all files from all directories and their subdirectories of a root directory.
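
To make the top-down versus bottom-up distinction concrete, the sketch below builds a tiny directory tree in a temporary directory (the `web/blog/post.html` layout is made up for this example) and records the order in which *os.walk()* visits the directories with and without *topdown=False*.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Build a tiny tree: <tmp_dir>/web/blog/post.html
    blog = os.path.join(tmp_dir, 'web', 'blog')
    os.makedirs(blog)
    with open(os.path.join(blog, 'post.html'), 'w') as f:
        f.write('<p>hi</p>')

    root = os.path.join(tmp_dir, 'web')

    # Record each visited directory relative to the root.
    top_down = [os.path.relpath(dirpath, root)
                for dirpath, _, _ in os.walk(root)]
    bottom_up = [os.path.relpath(dirpath, root)
                 for dirpath, _, _ in os.walk(root, topdown=False)]

print(top_down)   # ['.', 'blog']
print(bottom_up)  # ['blog', '.']
```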

#### Python list file example

Suppose you have a folder *D:\web* with the following directories and files:

In [None]:
# D:\web
# ├── assets
# |  ├── css
# |  |  └── style.css
# |  └── js
# |     └── app.js
# ├── blog
# |  ├── read-file.html
# |  └── write-file.html
# ├── about.html
# ├── contact.html
# └── index.html

The following example shows how to use the *os.walk()* function to list all HTML files from the *D:\web* directory:

In [None]:
import os


path = 'D:\\web'

html_files = []

for dirpath, dirnames, filenames in os.walk(path):
    for filename in filenames:
        if filename.endswith('.html'):
            html_files.append(os.path.join(dirpath, filename))

for html_file in html_files:
    print(html_file)


# Output:
# D:\web\about.html
# D:\web\contact.html
# D:\web\index.html
# D:\web\blog\read-file.html
# D:\web\blog\write-file.html

How it works:
* First, initialize a list to store the path to HTML files
* Second, call *os.walk()* function to examine directories of the *D:\web* folder
* Note: The dirpath stores the directory and filenames store files in that directory.
* Third, loop over the filenames and add them to the *html_files* list if their extensions are *.html*
* Note: The *os.path.join()* returns the full path of the filename by joining the dirpath with the filename.
* Finally, print the file paths in the *html_files* list

In [None]:
html_files = []

# ... inside the os.walk() loop:
for filename in filenames:
    if filename.endswith('.html'):
        html_files.append(os.path.join(dirpath, filename))

for html_file in html_files:
    print(html_file)

#### Defining a reusable list files function

By using the *os.walk()* function, we can define a reusable *list_files()* function like this:

In [None]:
import os


def list_files(path, extensions=None):
    """ List all files in a directory specified by path
    Args:
        path - the root directory path
        extensions - an iterable of file extensions to include, pass None to get all files.
    Returns:
        A list of files specified by extensions
    """
    # str.endswith() accepts a tuple, so each file is added at most once
    suffixes = None if extensions is None else tuple(extensions)
    filepaths = []
    for root, _, files in os.walk(path):
        for file in files:
            if suffixes is None or file.endswith(suffixes):
                filepaths.append(os.path.join(root, file))

    return filepaths


if __name__ == '__main__':
    filepaths = list_files(r'D:\web', ('.html', '.css'))
    for filepath in filepaths:
        print(filepath)


# Output:
# D:\web\about.html
# D:\web\contact.html
# D:\web\index.html
# D:\web\assets\css\style.css
# D:\web\blog\read-file.html
# D:\web\blog\write-file.html

#### Make list files function more efficient

If the number of files is small, the *list_files()* function works fine. However, when the number of files is large, returning a large list of files is not memory efficient.

To resolve this, you can use a **generator** to yield one file at a time instead of returning a list:

In [None]:
import os


def list_files(path, extensions=None):
    """ List all files in a directory specified by path
    Args:
        path - the root directory path
        extensions - an iterable of file extensions to include, pass None to get all files.
    Yields:
        The path of each file that matches extensions
    """
    # str.endswith() accepts a tuple, so each file is yielded at most once
    suffixes = None if extensions is None else tuple(extensions)
    for root, _, files in os.walk(path):
        for file in files:
            if suffixes is None or file.endswith(suffixes):
                yield os.path.join(root, file)


if __name__ == '__main__':
    filepaths = list_files(r'D:\web', ('.html', '.css'))
    for filepath in filepaths:
        print(filepath)
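
A practical benefit of the generator version is that the caller can stop early without building the full list. The sketch below is a simplified illustration: it redefines a minimal *list_files()* without the extension filter, creates five made-up files in a temporary directory, and uses *itertools.islice()* to take only the first two results.

```python
import os
import tempfile
from itertools import islice


def list_files(path):
    """Yield the path of every file under path (simplified: no filtering)."""
    for root, _, files in os.walk(path):
        for file in files:
            yield os.path.join(root, file)


with tempfile.TemporaryDirectory() as tmp_dir:
    for i in range(5):
        with open(os.path.join(tmp_dir, f'page{i}.html'), 'w') as f:
            f.write('')

    # islice() pulls only the first two results from the generator.
    first_two = list(islice(list_files(tmp_dir), 2))
    total = sum(1 for _ in list_files(tmp_dir))

print(len(first_two))  # 2
print(total)           # 5
```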

## Section 15: Strings

### F-Strings

#### Introduction to the Python F-strings

Python 3.6 introduced f-strings, which allow you to format text strings faster and more elegantly. The f-strings provide a way to embed **variables** and expressions inside a string literal using a clearer syntax than the *format()* method.

In [None]:
name = 'John'
s = f'Hello, {name}!'
print(s)


# Output:
# Hello, John!

How it works.
* First, define a variable with the value 'John'.
* Then, place the name variable inside the curly braces {} in the literal string. Note that you need to prefix the string with the letter f to indicate that it is an f-string. It’s also valid if you use the letter in uppercase (F).
* Third, print out the string s.

It’s important to note that Python evaluates the expressions in f-string at runtime. It replaces the expressions inside an f-string with their values.
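
A short sketch of what runtime evaluation means in practice: the f-string is turned into a plain string at the moment the line runs, so changing the variable afterwards does not affect strings that were already built. The comparison with *format()* is included only to show that both produce the same text.

```python
name = 'John'

with_format = 'Hello, {}!'.format(name)
with_fstring = f'Hello, {name}!'
same_result = with_format == with_fstring

# The f-string above was evaluated when its line ran, so rebinding
# the variable afterwards does not change the string already built.
name = 'Jane'
after_change = f'Hello, {name}!'

print(same_result)   # True
print(with_fstring)  # Hello, John!
print(after_change)  # Hello, Jane!
```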

#### Python f-string examples

In [None]:
# upper() method
name = 'John'
s = F'Hello, {name.upper()}!'
print(s)

# Output: Hello, JOHN!


# multiple curly braces inside an f-string
first_name = 'John'
last_name = 'Doe'
s = F'Hello, {first_name} {last_name}!'
print(s)

# Output: Hello, John Doe!


# join()
first_name = 'John'
last_name = 'Doe'
s = F'Hello, {" ".join((first_name, last_name))}!'

print(s)

# Output: Hello, John Doe!

#### Multiline f-strings

In [None]:
name = 'John'
website = 'PythonTutorial.net'

message = (
    f'Hello {name}. '
    f"You're learning Python at {website}." 
)

print(message)


# the same message, using a backslash to continue the line
name = 'John'
website = 'PythonTutorial.net'

message = f'Hello {name}. ' \
          f"You're learning Python at {website}." 

print(message)

#### Curly braces

In [None]:
s = f'{{1+2}}'
print(s)

# Output: {1+2}


s = f'{{{1+2}}}'
print(s)

# Output: {3} (the innermost braces evaluate 1+2)


s = f'{{{{1+2}}}}'
print(s)

# Output: {{1+2}}

#### The evaluation order of expressions in Python f-strings

In [None]:
def inc(numbers, value):
    numbers[0] += value
    return numbers[0]

numbers = [0]

s = f'{inc(numbers,1)},{inc(numbers,2)}'
print(s)

# Output: 1,3

inc(numbers,1)

#### Format numbers using f-strings

In [None]:
# hex
number = 16
s = f'{number:x}'
print(s)  # 10


# sci notation
number = 0.01
s = f'{number:e}'
print(s)  # 1.000000e-02


# zero padding to a width of 6
number = 200
s = f'{number:06}'
print(s)  # 000200


# two decimal places
number = 9.98567
s = f'{number:.2f}'
print(s)  # 9.99

### Raw Strings

In [None]:
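s = '\n'
# repr() returns the string with its quotes and escape sequences; [1:-1] drops the quotes
raw_string = repr(s)[1:-1]
print(raw_string)  # prints a backslash followed by n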

### Backslash

In [None]:
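colors = ['red','green','blue']
# before Python 3.12, a backslash cannot appear inside the {} of an f-string,
# so build the joined text first and embed the variable
rgb = '\n'.join(colors)
s = f"The RGB colors are:\n{rgb}"
print(s)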