# Task scheduling, monitoring and basic applications with Python

## What is task automation using Python?

From the book [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/) by Al Sweigart:

"If you've ever spent hours renaming files or updating hundreds of spreadsheet cells, you know how tedious tasks like these can be. But what if you could have your computer do them for you?"

## Monitoring file system events

Monitoring file system events is useful whenever you need to track changes, updates, or other activity within the file system, for example to process new files as soon as they arrive.

We will use the [Watchdog](https://pythonhosted.org/watchdog) package.
```
conda install watchdog
```
We will write an event handler for filesystem events, and give it to an observer that will use the event handler to handle events on a specific path, the `img` folder.

Our event handler will be very simple: it will just print the filename and event type for each event.

In [1]:
from watchdog.observers import Observer
import watchdog.events

In [2]:
class MyEventHandler(watchdog.events.FileSystemEventHandler):
    def on_any_event(self, event):
        fname = event.src_path
        print("Something happened to", fname, event.event_type)
        

In [3]:
path = '../data/img'
event_handler = MyEventHandler()
observer = Observer()
observer.schedule(event_handler, path, recursive=False)
observer.start()

When we started the observer, it created a new thread for it to run in.

Here we use Jupyter magic to write to a file in that observed path.

In [4]:
%%file ../data/img/tmp.txt
this is a tmp file

Writing ../data/img/tmp.txt



In [5]:
observer.stop()
observer.join()
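
To make the start/stop/join life cycle above more concrete, here is a minimal standard-library sketch of a watcher thread. It is an analogy only, not how Watchdog is implemented (Watchdog relies on operating-system notification APIs where available, rather than this naive polling). `PollingWatcher`, its parameters, and the temporary directory are hypothetical names invented for this illustration.

```python
import tempfile
import threading
import time
from pathlib import Path


class PollingWatcher(threading.Thread):
    """Hypothetical stand-in for an observer: polls a directory for new files."""

    def __init__(self, path, on_created, interval=0.05):
        super().__init__(daemon=True)
        self.path = Path(path)
        self.on_created = on_created  # callback, plays the role of the event handler
        self.interval = interval
        self._stop_event = threading.Event()
        self._seen = set(self.path.iterdir())  # snapshot taken before the thread starts

    def run(self):
        # Event.wait returns False on timeout and True once stop() was called
        while not self._stop_event.wait(self.interval):
            current = set(self.path.iterdir())
            for new_file in sorted(current - self._seen):
                self.on_created(new_file)
            self._seen = current

    def stop(self):
        self._stop_event.set()


created = []
with tempfile.TemporaryDirectory() as tmp:
    watcher = PollingWatcher(tmp, created.append)
    watcher.start()  # the polling loop now runs in its own thread
    (Path(tmp) / "tmp.txt").write_text("this is a tmp file")

    # Wait (up to 5 seconds) until the watcher has reported the new file
    deadline = time.monotonic() + 5
    while not created and time.monotonic() < deadline:
        time.sleep(0.05)

    watcher.stop()  # ask the thread to finish
    watcher.join()  # wait until it has actually finished

print([p.name for p in created])
```

As with the observer, `stop()` only signals the thread, while `join()` blocks until it has exited, which is why the two calls usually appear together.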

Let's create an observer that converts an image file into a compressed image file when added to a certain directory.

In [6]:
from pathlib import Path  # object-oriented filesystem paths
from PIL import Image

class ImageFileHandler(watchdog.events.FileSystemEventHandler):
    def convert_images_to_png(self, file_path: Path):
        try:
            # Make sure the output folder exists before saving into it
            out_dir = file_path.parent / "compressed"
            out_dir.mkdir(exist_ok=True)
            new_file_path = out_dir / (file_path.stem + ".png")
            with Image.open(file_path) as img:
                img.save(new_file_path, 'PNG')
                print(f"Converted '{file_path.name}' to '{new_file_path.name}'")
            file_path.unlink()  # remove the original once the PNG has been saved
        except Exception as e:
            print(f"Could not convert '{file_path.name}': {e}")

    def on_created(self, event):  # called when a file is added (e.g. pasted) to the watched folder
        if event.is_directory:  # ignore new folders, such as "compressed" itself
            return
        fname = Path(event.src_path).resolve()
        print(fname.name, event.event_type, "in", fname.parent)
        self.convert_images_to_png(fname)

In [7]:
event_handler = ImageFileHandler()
image_observer = Observer()
image_observer.schedule(event_handler, "/home/pupkolab/temp/handler_test", recursive=False)
image_observer.start()

In [8]:
image_observer.stop()
image_observer.join()
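
The handler above tries to open every new file as an image and reports a failure for anything else. One possible refinement, sketched here under assumptions, is to check the file extension first and to compute the output path in a small helper. The names `IMAGE_SUFFIXES`, `is_convertible_image`, and `png_destination`, as well as the list of extensions and the file `photo.JPG`, are hypothetical and not part of the original handler.

```python
from pathlib import Path

# Assumed set of extensions; adjust it to the formats you expect to receive
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".bmp", ".gif", ".tiff"}


def is_convertible_image(file_path: Path) -> bool:
    """Return True if the file extension looks like an image we want to convert."""
    return file_path.suffix.lower() in IMAGE_SUFFIXES


def png_destination(file_path: Path) -> Path:
    """Mirror the handler's rule: <folder>/compressed/<original name>.png"""
    return file_path.parent / "compressed" / (file_path.stem + ".png")


src = Path("/home/pupkolab/temp/handler_test/photo.JPG")
print(is_convertible_image(src))
print(png_destination(src))
print(is_convertible_image(Path("notes.txt")))
```

A handler could call `is_convertible_image` at the top of `on_created` and return early for other files, so that only real conversion problems reach the `except` branch.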

## Scheduling jobs

In Python, you can schedule jobs using various libraries and modules.
One commonly used package for scheduling tasks is `apscheduler`.
It provides a simple interface for scheduling jobs to run once, at fixed intervals, or at specific times of day.
The examples below use it to run a job immediately, every 15 seconds, and every night at midnight.


We will use the [Advanced Python Scheduler](https://apscheduler.readthedocs.org/).
```
conda install apscheduler
```
We create a background scheduler and start it; it runs in its own thread.
We will discuss threads more extensively in the coming lectures.

In [2]:
from apscheduler.schedulers.background import BackgroundScheduler
from datetime import datetime


In [None]:
scheduler = BackgroundScheduler()
scheduler.start()

Now we write a function that performs some specific job, and add it to the scheduler. Because we do not specify a trigger, the job runs once, right away.

In [None]:
def job():
    print('{time}: Hello scheduler!'.format(time=datetime.now().ctime()))
scheduler.add_job(job)

Now we add another job, but this time we ask that it run repeatedly, every 15 seconds, rather than once, right away.

In [None]:
scheduler.add_job(job, trigger='interval', seconds=15)  # runs every 15 seconds until removed or the scheduler shuts down
print(datetime.now().ctime())

Example: every night at 00:00, pack into a zip archive all the images that were converted by the Watchdog handler during the day.

In [27]:
import shutil

def compress_directory(directory):
    dir_path = Path(directory).resolve()
    zip_filename = f"{datetime.now().strftime('%Y-%m-%d')}"
    shutil.make_archive(zip_filename, 'zip', dir_path)

    print(f"Directory compressed to {zip_filename}")

directory_to_compress = "/home/pupkolab/temp/handler_test/compressed"

scheduler = BackgroundScheduler()
scheduler.add_job(compress_directory, 'cron', args=[directory_to_compress], hour=0, minute=0)
scheduler.start()

print("Scheduler started to compress directory", directory_to_compress, "every night at midnight.")


Scheduler started to compress directory /home/pupkolab/temp/handler_test/compressed every night at midnight.


Directory compressed to 2024-05-20
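
Note that `shutil.make_archive` appends the extension itself and returns the path of the archive it created, so a base name of `2024-05-20` produces `2024-05-20.zip`. The sketch below illustrates this in a temporary directory; the file `a.png` and its contents are placeholders invented for the illustration.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the "compressed" folder, with one placeholder file in it
    source_dir = Path(tmp) / "compressed"
    source_dir.mkdir()
    (source_dir / "a.png").write_bytes(b"not a real image")

    # The base name carries no extension; make_archive adds ".zip" for us
    base_name = Path(tmp) / "2024-05-20"
    archive_path = Path(shutil.make_archive(str(base_name), "zip", source_dir))

    archive_name = archive_path.name
    archive_exists = archive_path.exists()

print(archive_name, archive_exists)
```

With a relative base name, as in the scheduled job above, the archive is written relative to the current working directory, so passing an absolute base name is one way to control where the nightly archives end up.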


Finally, we can shut down the scheduler.

In [19]:
scheduler.shutdown()

## Command line interfaces

`argparse` is a Python module that simplifies the process of parsing command-line arguments. It allows you to define the arguments your script expects and automatically generates help messages and usage instructions based on that definition.

Here's a brief explanation of key concepts in `argparse`:

1. **ArgumentParser**: The central object in the `argparse` module. You create an instance of `ArgumentParser` to define the command-line arguments your program can accept.

2. **Arguments**: These are the options and values provided to your script when it's run from the command line. Arguments typically consist of flags (like `--verbose`) and their associated values.

3. **Positional Arguments**: These are arguments that are identified by their position on the command line, so they must be provided in a specific order; by default they are required. For example, a filename might be a positional argument.

4. **Optional Arguments**: These are arguments that are not required and can be provided in any order. They usually start with a dash (`-`) or a double dash (`--`).

5. **Parsing**: Once you've defined your expected arguments with `ArgumentParser`, you call the `parse_args()` method to parse the command-line arguments provided when the script is run. `parse_args()` returns an object containing the values of the parsed arguments.

6. **Help Text**: `argparse` automatically generates help text based on the argument definitions you provide. This help text is displayed when the user runs your script with the `-h` or `--help` flag.

Overall, `argparse` simplifies the process of creating Python scripts that can be run from the command line by handling the parsing of command-line arguments and providing a user-friendly interface for usage and help information.
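
The concepts above can be sketched in a few lines. This is a minimal illustration and not the contents of the course script; the argument names (`directory`, `--format`, `--verbose`) and the helper `build_parser` are hypothetical. Passing an explicit list to `parse_args()` stands in for the command line, which makes the example runnable inside a notebook.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Compress a directory (illustrative example only)."
    )
    # Positional argument: required, identified by its position
    parser.add_argument("directory", help="directory to compress")
    # Optional argument with a value, a default, and a restricted set of choices
    parser.add_argument(
        "-f", "--format",
        default="zip",
        choices=["zip", "tar", "gztar"],
        help="archive format (default: %(default)s)",
    )
    # Optional flag: True when present, False otherwise
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print progress messages")
    return parser


parser = build_parser()
# Equivalent to running: python script.py img --verbose
args = parser.parse_args(["img", "--verbose"])
print(args.directory, args.format, args.verbose)
```

Calling `parser.print_help()` shows the help text that `argparse` generates from these definitions, the same text a user would see with `-h` or `--help`.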

We will now move to the Python script located [here](../scripts/cli_example.py).