(auto-office)=
# The Automated Office

In this chapter, we'll look at a range of ways to automate processes and tasks you might need to undertake in an office context.

Let's import a few of the packages we'll need first. You may need to install some of these; the chapter on {ref}`code-preliminaries` covers how to install new packages.

In [None]:
import numpy as np
import pandas as pd
from rich import inspect
import matplotlib.pyplot as plt
import matplotlib as mpl
import warnings

# Plot settings
plt.style.use(
    "https://github.com/aeturrell/coding-for-economists/raw/main/plot_style.txt"
)
mpl.rcParams.update({"lines.linewidth": 1.2})

# Pandas: Set max rows displayed for readability
pd.set_option("display.max_rows", 8)

# Set seed for random numbers
seed_for_prng = 78557
prng = np.random.default_rng(seed_for_prng)  # prng = pseudo-random number generator
# Turn off warnings
warnings.filterwarnings('ignore')

## Files

Python is sometimes thought of as a 'glue' language because it can glue together lots of different functionalities (including calling other languages). The ins and outs of your operating system are no different.

The single most important module for manipulating files in Python is `os`, which interacts with your operating system and is built into Python (so there's no need for a separate install). Let's start by getting the current working directory (`getcwd()`) of the kernel; the path returned will be on whichever computer is running the code.

In [None]:
import os

os.getcwd()

`os` can be used to create files and directories; for example, `os.mkdir()` creates a new directory (but raises a `FileExistsError` if it already exists). There are also commands to remove files, such as `os.remove()`, which should of course be used with care!
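
As a minimal sketch of creating and removing things with `os`, the example below works inside a temporary directory (via the built-in `tempfile` module) so that nothing on your machine is left behind. The folder and file names (`reports`, `draft.txt`) are made up for illustration.

```python
import os
import tempfile

# Work inside a throwaway directory that is deleted when the block ends
with tempfile.TemporaryDirectory() as tmp_dir:
    new_dir = os.path.join(tmp_dir, "reports")  # hypothetical folder name
    os.mkdir(new_dir)  # creates the directory

    try:
        os.mkdir(new_dir)  # a second call fails because it already exists
    except FileExistsError:
        print("Directory already exists")

    # makedirs creates nested folders; exist_ok=True tolerates repeat calls
    os.makedirs(os.path.join(new_dir, "2021", "q4"), exist_ok=True)

    # Create an empty file, then delete it again
    file_path = os.path.join(new_dir, "draft.txt")  # hypothetical file name
    open(file_path, "w").close()
    os.remove(file_path)
    file_still_there = os.path.exists(file_path)

    print(os.listdir(new_dir))
```

Note that `os.remove()` deletes files permanently rather than sending them to a recycle bin, which is why trying things out in a temporary directory first is a sensible habit.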

One particularly useful `os` function is `os.stat(path)`, whose `st_size` attribute gives the size, in bytes, of the file at a given path. To get a bit meta, we can use it to query the size of the page you're currently reading.

In [None]:
# st_size is in bytes; divide by 1e3 to get kilobytes
print(f"The current page is {os.stat('auto-office.ipynb').st_size/1e3} kilobytes.")

Another command you should be aware of is `os.chdir(path)`, which changes the current working directory of your code, so that relative paths are resolved against the new location.
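
Here is a small sketch of changing directory and then changing back again. It uses a temporary directory, and the file name `note.txt` is made up for illustration; the `try`/`finally` is one way (an assumption about good practice, not a requirement) to make sure the original working directory is restored even if something goes wrong.

```python
import os
import tempfile

original_dir = os.getcwd()  # remember where we started

with tempfile.TemporaryDirectory() as tmp_dir:
    try:
        os.chdir(tmp_dir)
        print(f"Now working in: {os.getcwd()}")

        # Relative paths are now resolved against tmp_dir
        with open("note.txt", "w") as f:  # hypothetical file name
            f.write("hello")
        created_in_tmp = os.path.exists(os.path.join(tmp_dir, "note.txt"))
    finally:
        # Switch back before the temporary directory is deleted
        os.chdir(original_dir)

print(f"Back in: {os.getcwd()}")
```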

`shutil` is another handy file-manipulation module built into Python. It has `copyfile` and `move` functions, which do exactly what you'd expect.
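
A minimal sketch of both functions is below. The file and folder names (`budget.csv`, `archive`) are invented for the example, and everything happens in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    tmp = Path(tmp_dir)

    # Create a small file to work with (hypothetical name and contents)
    original = tmp / "budget.csv"
    original.write_text("item,cost\npens,3\n")

    # copyfile duplicates the contents; the original stays where it is
    backup = tmp / "budget_backup.csv"
    shutil.copyfile(original, backup)

    # move relocates the file; the original path no longer exists afterwards
    archive_dir = tmp / "archive"
    archive_dir.mkdir()
    shutil.move(str(original), str(archive_dir / "budget.csv"))

    # List every file, relative to the temporary directory
    files_after = sorted(
        p.relative_to(tmp).as_posix() for p in tmp.rglob("*") if p.is_file()
    )
    print(files_after)
```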

[**watchdog**](https://pythonhosted.org/watchdog) is a library that allows you to monitor files on a computer for changes, and to log changes to a text file when they do occur. This can be useful in a production setting, or for monitoring changes in files on a connected network drive.

### Downloading Files

Downloading files programmatically and repeatably is possible using the `urllib` library, which comes built-in with Python. Here's an example of how to use it to download a file and give it a specific name:

```python

import urllib.request

url = "https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1031268/NIC_Annual_Report_and_Accounts_2020_to_2021_Final_4_November.pdf"

urllib.request.urlretrieve(url, "nic_ann_rep.pdf")
```
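
If you re-run a script often, you may not want to download the same file every time. The sketch below wraps `urlretrieve` in a small helper, `download_if_missing`, which is a made-up name for this example rather than part of `urllib`. So that the example runs without an internet connection, a local file (converted to a `file://` URL) stands in for a remote one; in practice you would pass a web address like the one above.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def download_if_missing(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if it was skipped.
    (Hypothetical helper for illustration.)
    """
    if os.path.exists(dest):
        return False
    urllib.request.urlretrieve(url, dest)
    return True


with tempfile.TemporaryDirectory() as tmp_dir:
    # A local file stands in for a remote one; as_uri() gives a file:// URL
    source = Path(tmp_dir) / "source.txt"
    source.write_text("example contents")
    dest = os.path.join(tmp_dir, "copy.txt")

    first_call = download_if_missing(source.as_uri(), dest)
    second_call = download_if_missing(source.as_uri(), dest)
    print(first_call, second_call)
```

The first call fetches the file and the second is skipped because the destination already exists. Note that this simple check does not notice if the remote file has changed since it was first downloaded.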

You can also download and unzip files in one fell swoop.

```python
from io import BytesIO
from urllib.request import urlopen
from zipfile import ZipFile

# URL of the zip file
zipurl = "https://files.stlouisfed.org/files/htdocs/uploads/FRED-QD%20Appendix.zip"

# extract to path
extract_to = "downloads/"

# Download the archive into memory, then extract its contents to disk
with ZipFile(BytesIO(urlopen(zipurl).read())) as zfile:
    zfile.extractall(path=extract_to)
```

If you don't want to actually save the files on your computer, you can still look at their contents:

In [None]:
from io import BytesIO
from urllib.request import urlopen
from zipfile import ZipFile

# URL of the zip file
zipurl = "https://files.stlouisfed.org/files/htdocs/uploads/FRED-QD%20Appendix.zip"

# Take a look at the contents
with urlopen(zipurl) as zipresp:
    with ZipFile(BytesIO(zipresp.read())) as zfile:
        print("\n".join(zfile.namelist()))
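
Beyond listing the names, you can read an individual file straight out of an archive held in memory. The sketch below shows the idea; to keep it self-contained, it builds a tiny zip file in memory as a stand-in for downloaded bytes, and the member names and contents are invented for the example.

```python
from io import BytesIO
from zipfile import ZipFile

# Build a small zip archive in memory to stand in for downloaded bytes
buffer = BytesIO()
with ZipFile(buffer, "w") as zf:
    zf.writestr("readme.txt", "placeholder text")  # made-up member
    zf.writestr("data/series.csv", "date,value\n2021-01-01,1.5\n")  # made-up member
zip_bytes = buffer.getvalue()

# Open the archive from memory and read one member without extracting to disk
with ZipFile(BytesIO(zip_bytes)) as zfile:
    names = zfile.namelist()
    with zfile.open("data/series.csv") as member:
        text = member.read().decode("utf-8")

print(names)
print(text)
```

With a real download, `zip_bytes` would instead come from `urlopen(zipurl).read()`.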