# Using the filesystem in Python

This chapter covers working with the filesystem: referring to files and directories, and creating, moving, and deleting them.

To understand how to read and write files, see [10: Reading and Writing Files](../10_reading-writing-files/10_reading-writing-files.ipynb).

## `os` and `os.path` vs. `pathlib`

The `os` and `os.path` modules constitute the traditional way in which file paths and filesystem operations have been handled in Python.

Since Python 3.4, the `pathlib` module has provided a more modern, object-oriented way of doing the same operations.

## Paths and pathnames

All operating systems refer to files and directories with strings that represent them. These strings are often called *pathnames* or simply *paths*.

A path is both a string and the underlying *path object* that the string represents, which introduces some complications you should be aware of.

Filesystems on almost all operating systems are modeled as tree structures, with a disk as the root and folders, subfolders, and so on as branches and subbranches.

Thus, regardless of the OS, the way to refer to a file or folder is with a pathname that specifies the path to follow from the root of the filesystem tree to the file in question.

However, each OS defines its own conventions regarding the precise syntax of such paths (the separator, the way to identify the root of the filesystem, the way to handle uppercase/lowercase, etc.).
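As a small illustration of these syntax differences, the sketch below uses the `PurePosixPath` and `PureWindowsPath` classes from `pathlib`. They only manipulate path strings and never touch the filesystem, so the example should behave the same on any OS. The directory and file names are made up for the example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same components rendered with each platform's separator.
posix_path = PurePosixPath("bin", "utils", "disktools")
windows_path = PureWindowsPath("bin", "utils", "disktools")

print(posix_path)    # bin/utils/disktools
print(windows_path)  # bin\utils\disktools

# Windows paths compare case-insensitively; POSIX paths do not.
same_on_windows = PureWindowsPath("Docs/README.md") == PureWindowsPath("docs/readme.md")
same_on_posix = PurePosixPath("Docs/README.md") == PurePosixPath("docs/readme.md")

print(same_on_windows)  # True
print(same_on_posix)    # False
```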

### Absolute and relative paths

+ **Absolute paths** specify the exact location of a file in a filesystem without any ambiguity. They do this by specifying the entire path to the file, starting from the root of the filesystem.

+ **Relative paths** specify the position of a file relative to some other point in the filesystem, which isn't specified in the relative pathname itself.
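The distinction can be checked programmatically. The following is a minimal sketch using `PurePosixPath`, which works on path strings only; the paths are hypothetical and do not need to exist.

```python
from pathlib import PurePosixPath

absolute = PurePosixPath("/home/ubuntu/notes.txt")
relative = PurePosixPath("docs/notes.txt")

assert absolute.is_absolute()
assert not relative.is_absolute()

# A relative path only identifies a file once it is anchored to some directory.
anchored = PurePosixPath("/home/ubuntu") / relative
print(anchored)  # /home/ubuntu/docs/notes.txt
```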


### The current working dir

To identify a file from a relative path, you need some other context to anchor it.

Often this context is an implicit reference to the *current working directory* (CWD), which is the directory where a Python program considers itself to be at any point during its execution.

The current working directory may be different from the directory in which the program file resides.
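To see how the current working directory anchors a relative path, here is a hedged sketch that resolves the same relative path before and after changing the CWD. It uses a temporary directory so nothing is left behind; `data.txt` is a hypothetical file name and the file does not need to exist.

```python
import os
import tempfile
from pathlib import Path

relative = Path("data.txt")  # hypothetical file name
original_cwd = Path.cwd()

with tempfile.TemporaryDirectory() as tmp:
    try:
        os.chdir(tmp)
        # The relative path is now interpreted against the temporary directory.
        resolved_in_tmp = relative.resolve()
        tmp_resolved = Path(tmp).resolve()
    finally:
        os.chdir(original_cwd)  # always restore the CWD

# Back in the original CWD, the same relative path points somewhere else.
resolved_in_original = relative.resolve()

print(resolved_in_tmp.parent == tmp_resolved)
print(resolved_in_original.parent == original_cwd.resolve())
```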

#### Accessing directories with `pathlib` (preferred)

The following snippet returns the current working directory as a `Path` object (not a string):

In [7]:
import pathlib

# cwd() is a class method, so no Path instance is needed
pathlib.Path.cwd()

PosixPath('/home/ubuntu/Development/git-repos/side_projects/python-workbench/part_1-python-fundamentals/03_basics-deep-dive/09_filesystem/sample_data')

You can construct paths using the `Path.joinpath()` method:

In [28]:
from pathlib import Path

cur_path = Path()
cur_path.joinpath("bin", "utils", "disktools")

PosixPath('bin/utils/disktools')

The division operator `/` is overloaded with `Path` objects to let you build paths with a cleaner syntax:

In [29]:
from pathlib import Path

cur_path = Path()
cur_path / "bin" / "utils" / "disktools"

PosixPath('bin/utils/disktools')

The `parts` property returns a tuple with all the components of a Path object:

In [30]:
from pathlib import Path

cur_path = Path()
some_path = cur_path / "bin" / "utils" / "disktools"
some_path.parts

('bin', 'utils', 'disktools')

+ The `name` property returns only the basename of the path, that is, the single file or directory at the end of a path.
+ The `parent` property returns the path up to, but not including, the last name.
+ The `suffix` property returns the dotted extension of a file (or an empty string if there is none).

In [31]:
from pathlib import Path

# path whose basename is a file
some_path = Path("path", "to", "img.png")

assert some_path.name == "img.png"
assert some_path.parent == Path("path", "to")
assert some_path.suffix == ".png"

In [32]:
from pathlib import Path

# path whose basename is a directory
some_path = Path("path", "to", "some", "dir")

assert some_path.name == "dir"
assert some_path.parent == Path("path", "to", "some")
assert some_path.suffix == ""

`Path` objects can also refer to the current user's home directory through the `Path.expanduser()` and `Path.home()` methods:

In [90]:
from pathlib import Path

assert Path("~").expanduser() == Path("/home/ubuntu")
assert Path.home() == Path("/home/ubuntu")

#### Accessing directories using `os` (legacy)

The `os.getcwd()` function returns the current working directory as a string.


In [1]:
import os

os.getcwd()

'/home/ubuntu/Development/git-repos/side_projects/python-workbench/part_1-python-fundamentals/03_basics-deep-dive/09_filesystem'

The constant `os.curdir` is a string representing the current directory (i.e., `.` on Linux and Windows):

In [3]:
import os

os.curdir

'.'

Thus, to list the files in the current dir, you can do:

In [4]:
import os

os.listdir(os.curdir)

['.venv', 'README.md', '09_filesystem.ipynb', 'requirements.txt']

You can use `os.chdir` to change the CWD and have a look at a particular folder:

In [5]:
import os

os.chdir('sample_data')
print(os.getcwd())
os.listdir()

/home/ubuntu/Development/git-repos/side_projects/python-workbench/part_1-python-fundamentals/03_basics-deep-dive/09_filesystem/sample_data


['file1.txt', 'file2.txt', 'file3.txt']

To construct a path, you can use the `os.path.join` function, passing a variable number of directory names or filenames, which are joined to form a single string that can be used as a path.

| NOTE: |
| :---- |
| It is not necessary to do `import os.path`. Importing `os` brings `os.path` with it. |

In [8]:
import os

os.path.join("bin", "utils", "disktools")

'bin/utils/disktools'

`os.path.join` also accepts subpaths, which will be then joined to make longer pathnames:

In [9]:
import os

os.path.join("mydir/bin", "utils/disktools/chkdsk")

'mydir/bin/utils/disktools/chkdsk'

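`os.path.join` uses the separator of the OS it runs on and does not normalize separators already present in its arguments. The following cell, run on Linux with Windows-style subpaths, shows the effect: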

In [12]:
import os

win_path = os.path.join("mydir\\bin", "utils\\disktools\\chkdsk")
print(win_path)

mydir\bin/utils\disktools\chkdsk


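Note that the path looks strange. First, the backslash, which is the directory separator on Windows, has to be written as `\\` in a Python string literal. Second, the printed result mixes Linux and Windows path separators, because on Linux a backslash is just an ordinary filename character.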

It is much better to write the previous example in a more portable manner using:

In [13]:
import os

path1 = os.path.join("mydir", "bin")
path2 = os.path.join("utils", "disktools", "chkdsk")
final_path = os.path.join(path1, path2)
print(final_path)

mydir/bin/utils/disktools/chkdsk


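The previous code produces a valid path on both Windows and Linux, because `os.path.join` inserts the right separator for each platform.

`os.path.join` also handles absolute and relative pathnames: if a component is an absolute path, all the components before it are discarded and joining continues from that absolute component.

On Linux it is easy: *absolute* paths begin with `/`, which refers to the topmost directory of the entire system, even if it has multiple disks.

On Windows it is a bit more complex: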
+ A path beginning with a drive letter followed by a colon and a backslash and then a path is an absolute path (`C:\Program Files\Doom`). Note that `C:` is not the top-level directory of `C:`, but `C:\` is.
+ A path beginning with neither a drive letter nor a backslash is a relative path: `side_projects\python\python_games`.
+ A path beginning with `\\` followed by the name of a server is the path to a network resource: `\\win_2012\shared`
+ Anything else can be considered an invalid pathname.
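The Windows rules above can be explored from any OS with `pathlib.PureWindowsPath`, which parses Windows-style paths without accessing the filesystem. This is only a sketch that reuses the example paths from the list; it checks how `pathlib` classifies them, not whether they exist.

```python
from pathlib import PureWindowsPath

# Drive letter + backslash: absolute.
assert PureWindowsPath(r"C:\Program Files\Doom").is_absolute()

# A drive letter alone is not the root of the drive, so it is not absolute.
assert not PureWindowsPath("C:").is_absolute()

# No drive letter and no leading backslash: relative.
assert not PureWindowsPath(r"side_projects\python\python_games").is_absolute()

# Network (UNC) path: the server and share together act as the "drive".
unc = PureWindowsPath(r"\\win_2012\shared")
print(unc.drive)  # \\win_2012\shared
assert unc.is_absolute()
```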

`os.path.join` doesn't perform validity checks on the paths it is constructing. You might end up with a path that cannot be created on a given OS.

If that is a risk for your application, take some time to create a path validity checker function.

The `os.path.split` function returns a 2-tuple splitting the basename (the single file or directory at the end of a path) from the rest of the path:

In [14]:
import os

os.path.split(os.path.join("path", "to", "some", "directory"))

('path/to/some', 'directory')

The `os.path.basename` function returns only the basename of the path, and the `os.path.dirname` function returns the path up to, but not including, the last name:

In [18]:
import os

some_path = os.path.join("path", "to", "some", "directory", "img.png")

assert os.path.basename(some_path) == "img.png"
assert os.path.dirname(some_path) == "path/to/some/directory"

The function `os.path.splitext` returns a tuple consisting of a file name and the dotted extension:

In [21]:
import os

some_path = os.path.join("path", "to", "img.png")
assert os.path.splitext(some_path) == ("path/to/img", ".png")

In [22]:
import os

some_path = os.path.join("path", "to", "some", "dir")
assert os.path.splitext(some_path) == ("path/to/some/dir", "")

The function `os.path.expandvars` can be used to expand environment variables used in paths (both for Windows and Linux):

In [25]:
import os

os.path.expandvars("$HOME/downloads")


'/home/ubuntu/downloads'

Similarly, the `os.path.expanduser` function expands username shortcuts found in paths:

In [None]:
import os

os.path.expanduser("~/downloads")

You can make your code more platform independent if you use a few useful constants such as `os.curdir` and `os.pardir` to refer to the current and parent dir:

In [33]:
import os

print(os.curdir)
print(os.pardir)

.
..


Those can be used as normal path parameters when building paths:

In [34]:
import os

some_path = os.path.join("path", "to", "some", "dir")
parent = os.path.join(os.pardir, some_path)
print(parent)

../path/to/some/dir


The function `os.path.isabs()` can be used to ask whether a path is an absolute path or a relative one:

In [35]:
import os

some_path = os.path.join(os.pardir, "mini-projects")
os.path.isabs(some_path)

False

The constant `os.curdir` is typically used when running commands on the current directory:

In [36]:
import os

os.listdir(os.curdir)

['file1.txt', 'file2.txt', 'file3.txt']

The constant `os.name` returns the name of the Python module used to handle the OS specific operations:

In [37]:
import os

os.name

'posix'

| NOTE: |
| :---- |
| Most versions of Windows will return `nt` for `os.name`.<br>In some cases, you might see code that reads `sys.platform` instead, which gives more detailed information (`linux` for Linux systems, `win32` for Windows systems, and `darwin` for Macs). |

In [38]:
import sys

sys.platform

'linux'

The variable `os.environ` references a dictionary with all the environment variables. On most operating systems, this dictionary includes certain variables related to paths (such as search paths for binaries, etc.)

In [None]:
import os

os.environ
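As an illustration of a path-related environment variable, the sketch below splits `PATH` into its individual directories. It assumes `PATH` is defined (it falls back to an empty string otherwise) and relies on `os.pathsep`, the separator used between entries on the current platform.

```python
import os

# PATH holds the directories searched for executables, separated by os.pathsep
# (":" on Linux and macOS, ";" on Windows).
search_dirs = os.environ.get("PATH", "").split(os.pathsep)

for directory in search_dirs[:3]:  # show only the first few entries
    print(directory)
```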

### Exercise

How would you use the `os` module's function to take a path to a file called `test.log` and create a new file path in the same directory for a file called `test.log.old`? How would you do the same thing using the `pathlib` module? What path would you get if you created a pathlib `Path` object from `os.pardir`?

Assuming that `test.log` sits in the `sample_data/` directory, the new path can be built with the `os` module as follows:

In [47]:
import os

# check that CWD is still pointing to 'sample_data/'
assert os.path.basename(os.getcwd()) == "sample_data"
path_to_test_log = os.path.join(os.curdir, "test.log")
# take the directory part of the original path and append the new file name
path_to_test_log_old = os.path.join(os.path.dirname(path_to_test_log), "test.log.old")

print(path_to_test_log_old)


./test.log.old


Using `pathlib` is clearer:

In [51]:
from pathlib import Path

path_to_test_log = Path() / "test.log"
path_to_test_log.parent / "test.log.old"

PosixPath('test.log.old')

If you have executed all the cells, the current working directory will be the `sample_data/` directory. Therefore, `os.pardir` (`..`) will refer to the directory where the Python notebook resides:

In [52]:
import os
from pathlib import Path

Path(os.pardir)

PosixPath('..')

## Getting information about files

### Using `Path` (preferred)

The most commonly used path-information functions are:

+ `Path.exists()` &mdash; returns `True` if the path corresponds to something that exists in the filesystem.

+ `Path.is_file()` &mdash; returns `True` if the path is a regular file; otherwise, including when the path doesn't exist, it returns `False`.

+ `Path.is_dir()` &mdash; returns `True` if the path is a directory.

In [None]:
from pathlib import Path

# Path.exists()
assert Path("~/Development").expanduser().exists() == True
assert Path("path/to/file").exists() == False

# Path.is_dir()
assert Path("~/Development/").expanduser().is_dir() == True
assert Path("/path/to/file").is_dir() == False # if doesn't exist returns False
assert Path("~/.bashrc").expanduser().is_dir() == False

# Path.is_file()
assert Path("~/Development/").expanduser().is_file() == False
assert Path("/path/to/file/img.png").is_file() == False # if doesn't exist returns False
assert Path("~/.bashrc").expanduser().is_file() == True

+ `Path.is_symlink()` &mdash; returns `True` if the path is a symbolic link.

+ `Path.is_mount()` &mdash; returns `True` if the path is a mount point.

+ `Path.samefile(path)` &mdash; returns `True` if the path it is applied to and the given path point to the same file.

+ `Path.is_absolute()` &mdash; returns `True` if the path represents an absolute path.

+ `Path.stat()` &mdash; returns an object with file properties such as `st_size`, `st_mtime`, `st_atime`, and `st_ctime`, which provide the size, last modification time, last access time, and the time of the last metadata change on Unix (the creation time on Windows).


In [82]:
from pathlib import Path

assert (Path.home() / "wsl_shared").is_symlink() == True
assert Path("/mnt/wsl").is_mount() == True
assert (Path.home() / ".bashrc").samefile(Path("/home/ubuntu/.bashrc"))
assert Path("/home").is_absolute() == True

(Path.home() / ".bashrc").stat()


os.stat_result(st_mode=33188, st_ino=40738, st_dev=2080, st_nlink=1, st_uid=1000, st_gid=1000, st_size=7581, st_atime=1724561483, st_mtime=1705511456, st_ctime=1705511456)
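The raw `stat_result` is easier to use once individual fields are read. The following sketch creates a throwaway file in a temporary directory (the file name is hypothetical), then reads its size and converts its modification timestamp into a `datetime`.

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"            # hypothetical file name
    sample.write_text("hello", encoding="utf-8")  # 5 ASCII characters -> 5 bytes

    info = sample.stat()
    size_in_bytes = info.st_size
    # st_mtime is seconds since the epoch; convert it to local time.
    modified = datetime.fromtimestamp(info.st_mtime)

print(size_in_bytes)  # 5
print(modified.isoformat(timespec="seconds"))
```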

### Using `os.path` (legacy)

The most commonly used path-information functions are:

+ `os.path.exists()` &mdash; returns `True` if its argument is a path corresponding to something that exists in the filesystem.

+ `os.path.isfile()` &mdash; returns `True` if its argument is a regular file; otherwise, including when the path doesn't exist, it returns `False`.

+ `os.path.isdir()` &mdash; returns `True` if its argument is a directory.

In [67]:
import os

# os.path.exists()
assert os.path.exists(os.path.expanduser("~/Development/")) == True
assert os.path.exists("path/to/file") == False

# os.path.isdir()
assert os.path.isdir(os.path.expanduser("~/Development/")) == True
assert os.path.isdir("/path/to/file") == False # if doesn't exist returns False
assert os.path.isdir(os.path.expanduser("~/.bashrc")) == False

# os.path.isfile()
assert os.path.isfile(os.path.expanduser("~/Development/")) == False
assert os.path.isfile("/path/to/file/img.png") == False # if doesn't exist returns False
assert os.path.isfile(os.path.expanduser("~/.bashrc")) == True


+ `os.path.islink()` &mdash; returns `True` if its argument is a symbolic link. Note that it returns `False` for Windows shortcut files.

+ `os.path.ismount()` &mdash; returns `True` if its argument is a mount point.

+ `os.path.samefile(path1, path2)` &mdash; returns `True` if `path1` and `path2` point to the same file.

+ `os.path.isabs(path)` &mdash; returns `True` if path represents an absolute path.

+ `os.path.getsize(path)` &mdash; returns the size of the file identified by the given path.

+ `os.path.getmtime(path)` &mdash; returns the last modified time of the given path.

+ `os.path.getatime(path)` &mdash; returns the last access time of the given path.
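A minimal sketch of the size and time functions follows. It writes a small file into a temporary directory (the file name is an assumption made for the example) so that the expected size is known in advance.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")  # hypothetical file name
    with open(sample, "wb") as f:
        f.write(b"\x00" * 1024)

    size = os.path.getsize(sample)
    mtime = os.path.getmtime(sample)  # seconds since the epoch, as a float

print(size)  # 1024
print(time.ctime(mtime))
```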

### Getting information about files with `os.scandir` (legacy)

The function `os.scandir` returns an iterator of `os.DirEntry` objects. The `DirEntry` object exposes the file attributes of a directory entry, and can be more efficient than using `os.listdir` combined with `os.path` operations.

Additionally, `os.scandir()` supports the context manager syntax, which ensures that resources are properly disposed of when no longer needed.

In [84]:
import os

with os.scandir(".") as my_dir:
    for dir_entry in my_dir:
        print(dir_entry.name, dir_entry.is_file())

05_control-flow False
09_filesystem False
01_python-building-blocks False
02_lists-tuples-sets False
07_modules-and-scoping-rules False
.python-version True
00_tools False
08_python-programs False
03_strings False
06_functions False
04_dictionaries False


## More filesystem operations using `pathlib` module (preferred)

The `Path.iterdir` method returns an iterator of paths that you can use to list the contents of a directory:

In [179]:
from pathlib import Path


expected = [Path("file1.txt"), Path("file2.txt"), Path("file3.txt"), Path("log.out")]
for path in Path().iterdir():
    assert path in expected

# materializing the iterator
list(Path().iterdir())

[PosixPath('file1.txt'),
 PosixPath('file2.txt'),
 PosixPath('file3.txt'),
 PosixPath('log.out')]

`Path` objects expose a `glob` method you can use to obtain an iterator of the path objects that match the given pattern.

The following is a simple cheatsheet of glob patterns:

+ `*` &mdash; matches any sequence of characters
+ `?` &mdash; matches any single character
+ `[hH]` &mdash; matches any one of the given characters (no commas or spaces between them)
+ `[0-9]` &mdash; matches any one character in the given range

In [138]:
from pathlib import Path

list(Path().glob("file?.txt"))

[PosixPath('file1.txt'), PosixPath('file2.txt'), PosixPath('file3.txt')]
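The bracket patterns from the cheatsheet can be tried out in the same way. This sketch builds a few empty files in a temporary directory (all names are invented for the example) and matches them with a character set and a character range.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for name in ["hat.txt", "Hut.txt", "cat.txt", "file1.txt", "fileA.txt"]:
        (base / name).touch()

    # [hH] matches exactly one character: either "h" or "H".
    h_matches = sorted(p.name for p in base.glob("[hH]?t.txt"))
    # [0-9] matches exactly one digit.
    digit_matches = sorted(p.name for p in base.glob("file[0-9].txt"))

print(h_matches)      # ['Hut.txt', 'hat.txt']
print(digit_matches)  # ['file1.txt']
```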

To rename/move a file or directory, use `Path.rename`. This can be used to move files within and across directories.

In [176]:
import os

# make sure we're in the correct path
# os.chdir("sample_data")

assert os.path.basename(os.getcwd()) == "sample_data"


In [178]:
from pathlib import Path

log_file = Path("log.out")
log_file.rename("log.out.old")
assert Path("log.out.old").exists()

# revert the renaming
Path("log.out.old").rename("log.out")


PosixPath('log.out')

To remove or delete a data file, use the `Path.unlink` method. You cannot use `Path.unlink` to delete directories, even if they're empty.

In [142]:
from pathlib import Path

tmp_file = Path("some.tmp")
tmp_file.touch()

assert Path("some.tmp").exists()
tmp_file.unlink()
assert Path("some.tmp").exists() == False

You can create directories with the `Path.mkdir` method. With the `parents=True` parameter any intermediate directories that do not exist will be created:

In [146]:
from pathlib import Path
import shutil

new_dir = Path("path", "to", "some", "dir")
new_dir.mkdir(parents=True)

assert new_dir.is_dir()
shutil.rmtree("path")


To remove a directory, use the `Path.rmdir` method. This method only works on empty directories:

In [148]:
from pathlib import Path

new_dir = Path("path", "to", "some", "dir")
new_dir.mkdir(parents=True)

assert new_dir.is_dir()
Path("path", "to", "some", "dir").rmdir()
Path("path", "to", "some").rmdir()
Path("path", "to").rmdir()
Path("path").rmdir()

## More filesystem operations using the `os` module (legacy)

The `os.listdir(path)` function returns a list with the names of the entries (files and directories) in the directory identified by `path`:

In [95]:
import os

os.listdir(os.path.expandvars("$HOME/Development/git-repos/side_projects"))

['aws-saa-c03',
 'gremlin-visualizer',
 'go-workbench',
 'nodejs-in-action',
 'tempconv',
 'vec2d',
 'functionality-conflict',
 'python-workbench',
 'math-workbench',
 'grokking-graphs',
 'azure-in-action',
 'stats-workbench',
 'go-prjs',
 'docker-book',
 'vec3d',
 'Multitenancy-AuthorizationAuthentication',
 'graph-explorer',
 'currex',
 'aws-workbench',
 'Math-for-Programmers',
 'serverless-webapp',
 'python-in-action',
 'kubernetes-in-action',
 'es-in-action']

Note that the listing does not include `os.curdir` and `os.pardir`.

The `glob` function from the `glob` module expands Linux shell-style wildcard characters and character sequences in a pathname, returning the files in the current working directory that match:

+ `*` &mdash; matches any sequence of characters
+ `?` &mdash; matches any single character
+ `[hH]` &mdash; matches any one of the given characters (no commas or spaces between them)
+ `[0-9]` &mdash; matches any one character in the given range

In [102]:
import os

# adjust as needed to point to `sample_data/`
# os.chdir("09_filesystem")

# check we're where the notebook resides
assert os.path.basename(os.getcwd()) == "09_filesystem"

# change cwd to sample_data
os.chdir("sample_data")


In [181]:
import glob

expected = ["log.out", "file1.txt", "file2.txt", "file3.txt"]
for file in glob.glob("*"):
    assert file in expected


In [108]:
import glob

expected = ["file1.txt", "file2.txt", "file3.txt"]
for file in glob.glob("*txt"):
    assert file in expected

In [109]:
import glob

expected = ["file1.txt", "file2.txt", "file3.txt"]
for file in glob.glob("file?.txt"):
    assert file in expected

In [110]:
import glob

expected = ["file1.txt", "file2.txt", "file3.txt"]
for file in glob.glob("file[1-3].txt"):
    assert file in expected

In [111]:
import glob

expected = ["file1.txt", "file2.txt", "file3.txt"]
for file in glob.glob("file[123].txt"):
    assert file in expected

To rename/move a file you can use `os.rename`:

In [180]:
import os
import glob

os.rename("log.out", "log.out.old")
assert glob.glob("*.old") == ["log.out.old"]

# reset
os.rename("log.out.old", "log.out")


Note that you can use this command to move files across directories as well, and not only within directories.
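As a sketch of moving a file into another directory, the example below works inside a temporary directory with made-up names. Keep in mind that `os.rename` may fail when source and destination are on different filesystems; `shutil.move` is the usual alternative in that case.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "report.txt")      # hypothetical names
    archive_dir = os.path.join(tmp, "archive")
    target = os.path.join(archive_dir, "report.txt")

    open(source, "w").close()   # create an empty file
    os.mkdir(archive_dir)       # the destination directory must already exist

    os.rename(source, target)   # move the file into archive/

    moved = os.path.isfile(target) and not os.path.exists(source)

print(moved)  # True
```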

You can remove/delete a file with `os.remove`:

In [119]:
import os
from pathlib import Path
import glob

# Create a tmp file
Path("some.tmp").touch()
assert glob.glob("*.tmp") == ["some.tmp"]

# Remove it
os.remove("some.tmp")
assert glob.glob("*.tmp") == []

| NOTE: |
| :---- |
| You can't use `os.remove()` to delete directories, even if they're empty. |

In [128]:
import os

# Create an empty directory
os.mkdir("empty_dir")
assert os.path.isdir("empty_dir")

try:
    os.remove("empty_dir")
except OSError as e:
    print("Oops:", e)

# remove directory to make the cell idempotent
os.rmdir("empty_dir")

Oops: [Errno 21] Is a directory: 'empty_dir'


To create a directory, use `os.makedirs()` or `os.mkdir()`. The former creates any necessary intermediate directories.

`os.rmdir()` can be used to remove empty directories. Attempting to remove a non-empty directory with `rmdir` ends in an exception being raised:

In [133]:
import os

os.makedirs("path/to/sample/dir")

assert os.path.isdir("path")
assert os.path.isdir("path/to")
assert os.path.isdir("path/to/sample")
assert os.path.isdir("path/to/sample/dir")

os.rmdir("path/to/sample/dir")
os.rmdir("path/to/sample/")
os.rmdir("path/to")
os.rmdir("path")
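The failure case mentioned above, calling `os.rmdir()` on a non-empty directory, can be sketched as follows. The directory and file names are hypothetical, and everything lives in a temporary directory that is cleaned up automatically.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "not_empty")   # hypothetical directory name
    os.mkdir(target)
    open(os.path.join(target, "keep.txt"), "w").close()

    try:
        os.rmdir(target)
        removed = True
    except OSError as err:
        # rmdir refuses to delete a directory that still has entries in it
        removed = False
        print(f"Could not remove directory: {err}")

print(removed)  # False
```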



To remove non-empty directories, use `shutil.rmtree()`. This function recursively removes all files in a directory tree:

In [134]:
import os
import shutil

os.makedirs("path/to/sample/dir")

assert os.path.isdir("path")
assert os.path.isdir("path/to")
assert os.path.isdir("path/to/sample")
assert os.path.isdir("path/to/sample/dir")

shutil.rmtree("path")

## Processing all files in a directory subtree using `pathlib` (preferred)

The `Path.walk()` method (available since Python 3.12) lets you walk through an entire directory tree, yielding three things for each directory it traverses: the path of that directory, a list of the names of its subdirectories, and a list of the names of its files.

`Path.walk()` can be configured with three optional arguments:
+ `top_down` &mdash; if True (the default), each directory is yielded before its subdirectories. If False, the subdirectories are yielded first.
+ `on_error` &mdash; can be set to a function that handles any error raised while listing a directory. By default, errors are ignored.
+ `follow_symlinks` &mdash; if True, symbolic links are followed. By default, the walk doesn't descend into directories that are symbolic links.

## Processing all files in a directory subtree using `os` (legacy)

The `os.walk()` function lets you walk through an entire directory tree, returning three things for each directory it traverses: the root or path of that directory, a list of its subdirectories, and a list of files.

`os.walk()` is invoked with the path of the starting, or top, directory and three optional arguments:
+ `topdown` &mdash; if True or not present, each directory is yielded before its subdirectories. If False, the subdirectories are yielded first.
+ `onerror` &mdash; can be set to a function that handles any error raised while listing a directory. By default, errors are ignored.
+ `followlinks` &mdash; if True, symbolic links are followed. By default, the walk doesn't descend into directories that are symbolic links.
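The effect of `topdown` is easiest to see on a tiny tree. The sketch below builds `a/b` inside a temporary directory (names chosen for the example only) and records the order in which `os.walk()` yields the directories for both settings; `visit_order` is a helper defined here, not part of the `os` module.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "a", "b"))
    open(os.path.join(tmp, "a", "b", "deep.txt"), "w").close()

    def visit_order(topdown):
        # Collect directory paths relative to tmp, in the order os.walk yields them.
        return [
            os.path.relpath(root, tmp)
            for root, dirs, files in os.walk(tmp, topdown=topdown)
        ]

    top_down_order = visit_order(True)    # parents before children
    bottom_up_order = visit_order(False)  # children before parents

print(top_down_order)
print(bottom_up_order)
```

On Linux this should print `['.', 'a', 'a/b']` followed by `['a/b', 'a', '.']`.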

In [None]:
import os

# Set the CWD to where the notebook resides
# os.chdir(os.pardir)

# Check we're in the correct place to run next cell
assert os.path.basename(os.getcwd()) == "09_filesystem"

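With `pathlib`, this kind of traversal can be written using `Path.walk()`:

In [165]:
from pathlib import Path

notebook_dir = Path()

for root, dirs, files in notebook_dir.walk():
    print(f"{root} has {len(files)} files")
    # dirs holds plain strings (directory names), not Path objects;
    # removing a name stops the walk from descending into it
    if "lib" in dirs:
        dirs.remove("lib")
    if "lib64" in dirs:
        dirs.remove("lib64")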

. has 3 files
.venv has 2 files
.venv/include has 0 files
.venv/include/python3.12 has 0 files
.venv/share has 0 files
.venv/share/jupyter has 0 files
.venv/share/jupyter/kernels has 0 files
.venv/share/jupyter/kernels/python3 has 4 files
.venv/share/man has 0 files
.venv/share/man/man1 has 1 files
.venv/bin has 20 files
sample_data has 4 files


The following snippet iterates over the tree of the directory where the notebook resides.

While iterating, we prevent going down the `lib` and `lib64` subtrees:

In [151]:
import os

# Set the CWD to where the notebook resides
# os.chdir(os.pardir)



In [154]:
import os

# Check we are in the correct place
assert os.path.basename(os.getcwd()) == "09_filesystem"

for root, dirs, files in os.walk(os.curdir):
    print(f"{root} has {len(files)} files")
    if "lib" in dirs:
        dirs.remove("lib")
    if "lib64" in dirs:
        dirs.remove("lib64")

. has 3 files
./.venv has 1 files
./.venv/include has 0 files
./.venv/include/python3.12 has 0 files
./.venv/share has 0 files
./.venv/share/jupyter has 0 files
./.venv/share/jupyter/kernels has 0 files
./.venv/share/jupyter/kernels/python3 has 4 files
./.venv/share/man has 0 files
./.venv/share/man/man1 has 1 files
./.venv/bin has 20 files
./sample_data has 4 files


The `shutil.copytree()` function recursively makes copies of all the files in a directory and all of its subdirectories, preserving their permission modes and stat data.

In [184]:
import os
import shutil

# Check we are in the correct place
# os.chdir(os.pardir)
assert os.path.basename(os.getcwd()) == "09_filesystem"

shutil.copytree("sample_data", "sample_data_copy")

assert os.path.isfile("sample_data_copy/log.out")

# remove the copy
shutil.rmtree("sample_data_copy")

### Lab: File operations

Create a program that calculates the total size of all files ending with `.tst` that aren't symlinks in a directory. The files found have to be moved to a new subdirectory of that directory called `backup`:

In [192]:
from pathlib import Path

# assert we're in the correct place
assert Path().resolve().name == "09_filesystem"

lab_dir = Path("lab")
backup_dir = lab_dir / "backup"
total_size = 0
# glob() already yields Path objects, so no extra Path() wrapping is needed
for file in lab_dir.glob("*.tst"):
    if file.is_symlink():
        print(f"Skipping {file}: it's a symbolic link")
    elif file.is_file():
        total_size += file.stat().st_size
        backup_dir.mkdir(exist_ok=True)
        file.rename(backup_dir / file.name)

print(f"Total size of '.tst' files: {total_size} bytes")


Skipping lab/file5_lnk.tst: it's a symbolic link
Total size of '.tst' files: 19 bytes
