![IIIF Manifest Creation](https://raw.githubusercontent.com/arvest-data-in-context/ml-notebooks/refs/heads/main/docs/images/notebooks/iiif-manifest-creation.png)

In this notebook, you'll learn how to create a simple [IIIF](https://iiif.io/) Manifest from an image file in Python using the [iiif-prezi3](https://github.com/iiif-prezi/iiif-prezi3) package. We shall take the image file from [Arvest](https://arvest.app) using the [Arvest API](https://github.com/arvest-data-in-context/arvest-api), and upload the Manifest so that you can see straight away how it looks.

A IIIF Manifest is a small JSON file that allows you to bring together different media, set metadata and add annotations. A IIIF Manifest follows a strict format called the [IIIF Presentation API](https://iiif.io/api/presentation/3.0/).
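
To give a feel for what such a file contains, here is a minimal sketch of a Manifest skeleton written by hand as a Python dict and serialised with the standard `json` module. The `id` URL and the label are hypothetical placeholders, and the `items` list is left empty; the rest of this notebook fills it in using `iiif-prezi3`.

```python
import json

# A hand-written Manifest skeleton. The id URL and label are hypothetical
# placeholders chosen for illustration only.
minimal_manifest = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/my-image.json",
    "type": "Manifest",
    "label": {"en": ["My image"]},
    "items": [],  # Canvases are added here
}

# Serialise the dict to the JSON text that would be stored online.
manifest_json = json.dumps(minimal_manifest, indent=2)
print(manifest_json)
```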

**ℹ️ In this notebook we shall take you through the whole process step by step. However, know that we also provide a utility function in the [`arvesttools`](https://github.com/arvest-data-in-context/arvest-api-tools) package called `media_to_manifest()` which essentially does all of this for you. We will show you this at the end of this tutorial, but if you're interested in learning how everything works under the hood, please continue!**

# 0. Setup

Let's begin by installing and importing all of the different components we will need.

In [None]:
print("Installing and importing packages...")

# Uninstall and reinstall packages for a clean environment
!pip uninstall -q -y arvestapi
!pip uninstall -q -y arvesttools
!pip uninstall -q -y jhutils
!pip uninstall -q -y iiif_prezi3
!pip install -q --disable-pip-version-check git+https://github.com/arvest-data-in-context/arvest-api.git
!pip install -q --disable-pip-version-check git+https://github.com/arvest-data-in-context/arvest-api-tools.git
!pip install -q --disable-pip-version-check git+https://github.com/jdchart/jh-py-utils.git
!pip install -q --disable-pip-version-check git+https://github.com/iiif-prezi/iiif-prezi3.git

# Import packages
import arvestapi
import arvesttools.manifest_creation
from jhutils.local_files import read_json, write_json, get_image_info
import jhutils.online_files
from jhutils.misc import print_progress_bar, slugify
import os
import iiif_prezi3
import shutil
import mimetypes
mimetypes.add_type('image/webp', '.webp')

print("👍 Ready!")

# 1. Prepare your media

In order to work, IIIF needs the URL of an accessible media file on the internet. There are plenty of services that let you store files and make them accessible; we'll be using media stored on [Arvest](https://arvest.app). The first step, therefore, will be to upload your media to Arvest (there are tutorials for doing this via the API in this repo).

First, we need to "connect" to Arvest using the Arvest API package. For this, we need our user email and our password, which we will give to an instance of the `arvestapi.Arvest()` class. For convenience, we've saved ours in a file, which is why we get `LOGIN_DATA` by reading a JSON file.

In [None]:
LOGIN_DATA = read_json(os.path.join(os.getcwd(), "login_private.json"))

ar = arvestapi.Arvest(LOGIN_DATA["email"], LOGIN_DATA["password"])
print(f"👍 Successfully connected to Arvest with \"{ar.profile.name}\"")
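
The cell above expects `login_private.json` to hold an `email` key and a `password` key. As a minimal sketch of that layout, the following writes such a file and reads it back using only the standard library. The credentials are hypothetical, and a temporary folder is used here so that nothing is left on disk; in practice you would create the file once, next to the notebook, and keep it out of version control.

```python
import json
import os
import tempfile

# Hypothetical credentials, for illustration only.
example_login = {"email": "you@example.org", "password": "your-password"}

with tempfile.TemporaryDirectory() as tmp_dir:
    login_path = os.path.join(tmp_dir, "login_private.json")

    # Write the credentials file...
    with open(login_path, "w", encoding="utf-8") as f:
        json.dump(example_login, f)

    # ...and read it back, as the notebook does with read_json().
    with open(login_path, "r", encoding="utf-8") as f:
        loaded_login = json.load(f)

print(sorted(loaded_login.keys()))
```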

Next we shall choose which media items we want by looking through all of our media items (using the `get_medias()` function), and selecting items according to specific metadata. Here for example, we get all of the media items with the `identifier` `"API-TUTORIAL-CONTENT-IMAGE"`.

In [None]:
media_for_manifests = []
media_items = ar.get_medias()

for media_item in media_items:
    media_item_metadata = media_item.get_metadata()
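    # Use .get() so that items without an "identifier" field are skipped rather than raising a KeyError:
    if media_item_metadata.get("identifier") == "API-TUTORIAL-CONTENT-IMAGE":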
        media_for_manifests.append(media_item)

print(f"🔍 Found {len(media_for_manifests)} media files corresponding to search criteria.")
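
The same selection logic can be tried out without an Arvest account. The sketch below uses a hypothetical `FakeMediaItem` class (it is not part of `arvestapi`) that only mimics the `get_metadata()` method used above, and a hypothetical helper, `select_by_metadata()`, that keeps the items whose metadata field matches a value.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMediaItem:
    """Hypothetical stand-in for an Arvest media item (not part of arvestapi)."""
    title: str
    metadata: dict = field(default_factory=dict)

    def get_metadata(self) -> dict:
        return self.metadata


def select_by_metadata(items, key, value):
    """Return the items whose metadata contains `key` with exactly `value`."""
    selected = []
    for item in items:
        metadata = item.get_metadata()
        # Items that lack the key altogether are skipped.
        if key in metadata and metadata[key] == value:
            selected.append(item)
    return selected


fake_items = [
    FakeMediaItem("Image A", {"identifier": "API-TUTORIAL-CONTENT-IMAGE"}),
    FakeMediaItem("Image B", {"identifier": "SOMETHING-ELSE"}),
    FakeMediaItem("Image C", {}),  # no identifier at all
]

selected_items = select_by_metadata(fake_items, "identifier", "API-TUTORIAL-CONTENT-IMAGE")
print([item.title for item in selected_items])
```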

**ℹ️ From this point, we will take you through, step by step, how to create IIIF Manifests from the media items. Know that we also provide a utility function in the `arvesttools` package called `media_to_manifest()` which essentially does all of this for you. We will show you this at the end of this tutorial, but if you're interested in learning how everything works under the hood, please continue!**

# 2. Get media info
In order to create our Manifests, we will need to gather some basic information about our media: notably the **dimensions** of the image. The most reliable way of doing this is to download the media file and then get the information.

In the following cell, we get the url of each media item using the `get_full_url()` function, then download the file into a temporary folder defined with `TEMP_FOLDER`. Then we read the image's dimensions using the `get_image_info()` utility function.

In [None]:
TEMP_FOLDER = os.path.join(os.getcwd(), "_TEMP")
media_info = {}

# Create the temporary folder if it doesn't already exist:
os.makedirs(TEMP_FOLDER, exist_ok = True)

print("Downloading and retrieving info...")

for i, media_item in enumerate(media_for_manifests):
    media_url = media_item.get_full_url()
    
    print_progress_bar(i + 1, len(media_for_manifests), f"Processing \"{media_item.title}\"")

    dl_location = jhutils.online_files.download(media_url, dir = TEMP_FOLDER)

    media_info[media_item.id] = {"info" : get_image_info(dl_location), "url" : media_url}

for item in media_info:
    print(f"Media item id \"{item}\":\n\t{media_info[item]}")

print("👍 Finished!")
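
Why download the file just to get its size? Image dimensions are stored inside the file itself, so they have to be read from its bytes. How `get_image_info()` does this is not shown in this notebook; purely as an illustration, here is a minimal sketch that reads the width and height from the header of a PNG file using the standard library. The function name `png_dimensions()` is hypothetical, and the bytes used in the demonstration are only a header, not a complete, viewable image.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of PNG data."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("Data does not start with a PNG header")
    # Width and height are two big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


# Build just enough of a PNG header for the demonstration:
# signature + IHDR chunk length (13) + chunk type + width + height.
fake_header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 640, 480)

print(png_dimensions(fake_header))
```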

# 3. Create basic Manifests
Now we have all of the information we need to create our basic Manifests that point to our media. We shall be using the [iiif-prezi3](https://github.com/iiif-prezi/iiif-prezi3) package which is specifically made for this purpose.

We'll start by creating an instance of the basic `iiif_prezi3.Manifest()` class. We'll need to give it an `id` and a `label`. Notice that the id is a placeholder. The reason for this is that the id corresponds to the location where the JSON file is stored online. As we haven't uploaded it yet, we can't know what this location is! We make the Manifest with a placeholder location, and this will be automatically replaced when we upload the file using the Arvest API. The only part which will persist is the title we give to the JSON file. We use `slugify()` to make sure that this string is "URL safe".

In [None]:
# We'll put all of our Manifest objects into a list for iteration:
manifests = []

print("Creating Manifests...")

for i, media_item in enumerate(media_for_manifests):
    print_progress_bar(i + 1, len(media_for_manifests), f"Creating a Manifest for \"{media_item.title}\"")

    title_sanitized = slugify(media_item.title)

    # Creating an instance of the iiif_prezi3.Manifest class:
    manifest = iiif_prezi3.Manifest(
        id = f"https://placeholder.com/{title_sanitized}.json",
        label = {"en" : [f"{media_item.title}"]}
    )

    manifests.append(manifest)

print("👍 Finished!")
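
To illustrate what "URL safe" means for the title used in the Manifest id, here is a minimal sketch of a slug function built with the standard library. It is not the implementation of `slugify()` from `jhutils`, whose exact output may differ; the name `slugify_sketch()` is hypothetical.

```python
import re
import unicodedata


def slugify_sketch(text: str) -> str:
    """Turn arbitrary text into a lowercase, hyphen-separated ASCII string."""
    # Split accented characters into a base letter plus a combining mark...
    normalized = unicodedata.normalize("NFKD", text)
    # ...then drop everything that is not plain ASCII (including those marks).
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Replace each run of non-alphanumeric characters with a single hyphen.
    hyphenated = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return hyphenated.strip("-")


print(slugify_sketch("Château d'Eau 01.jpg"))
```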

## Adding a Canvas

Next, we'll need to give each Manifest a Canvas - think of this as a "page" upon which our media will be painted. For this, we'll create an instance of the `iiif_prezi3.Canvas()` class and add it to the Manifest's `items`.

Note that, as well as an `id` and a `label`, we'll also have to give our Canvas a `width` and a `height` which correspond to the width and height of our media.

In [None]:
print("Creating Canvases...")

for i, media_item in enumerate(media_for_manifests):
    print_progress_bar(i + 1, len(media_for_manifests), f"Creating a Canvas for \"{media_item.title}\"")

    # Creating an instance of the iiif_prezi3.Canvas class:
    canvas = iiif_prezi3.Canvas(
        id = "https://placeholder.com/canvas/1",
        label = {"en" : [f"{media_item.get_full_url()}"]},
        width = media_info[media_item.id]["info"]["width"],
        height = media_info[media_item.id]["info"]["height"]
    )

    # Add the Canvas to the Manifest's list of items:
    manifests[i].items.append(canvas)

print("👍 Finished!")

## Painting the media to the Canvas

Now we need to "paint" our media onto the Canvas. Following IIIF's Presentation API, we'll need to add an `AnnotationPage` to the Canvas's `items`, and add an `Annotation` to the AnnotationPage's `items`, the `body` of which contains the reference to our media. This can be a bit complicated to follow, so here is a small diagram showing the hierarchy:

![IIIF Manifest structure](https://raw.githubusercontent.com/arvest-data-in-context/ml-notebooks/refs/heads/main/docs/images/notebooks/manifest-structure.jpg)
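
The same hierarchy can also be written out as nested plain Python dicts, which is close to what ends up in the JSON file. This is only a sketch: the image URL and the dimensions are hypothetical, and the helper `type_chain()` exists solely to print the nesting order.

```python
canvas_id = "https://placeholder.com/canvas/1"
image_url = "https://example.org/image.jpg"  # hypothetical media URL
width, height = 640, 480                     # hypothetical dimensions

hierarchy = {
    "type": "Manifest",
    "items": [{
        "id": canvas_id,
        "type": "Canvas",
        "width": width,
        "height": height,
        "items": [{
            "id": f"{canvas_id}/page/1",
            "type": "AnnotationPage",
            "items": [{
                "id": f"{canvas_id}/page/1/1",
                "type": "Annotation",
                "motivation": "painting",
                # The target is the Canvas, restricted to a region with #xywh:
                "target": f"{canvas_id}#xywh=0,0,{width},{height}",
                "body": {
                    "id": image_url,
                    "type": "Image",
                    "format": "image/jpeg",
                    "width": width,
                    "height": height,
                },
            }],
        }],
    }],
}


def type_chain(node: dict) -> list:
    """Follow the first entry of each `items` list and collect the types."""
    chain = [node["type"]]
    while node.get("items"):
        node = node["items"][0]
        chain.append(node["type"])
    if "body" in node:
        chain.append(node["body"]["type"])
    return chain


print(" > ".join(type_chain(hierarchy)))
```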

In [None]:
print("Adding media to Canvases...")

for i, media_item in enumerate(media_for_manifests):
    print_progress_bar(i + 1, len(media_for_manifests), f"Adding media for \"{media_item.title}\"")

    # First, we need to retrieve the media's info, and the file type:
    width = media_info[media_item.id]["info"]["width"]
    height = media_info[media_item.id]["info"]["height"]
    mime_type, encoding = mimetypes.guess_type(media_item.get_full_url())

    # Then we create an instance of the iiif_prezi3.AnnotationPage class:
    annotation_page = iiif_prezi3.AnnotationPage(id = "https://placeholder.com/canvas/1/page/1")
    
    # Creating an instance of the iiif_prezi3.Annotation class:
    media_annotation_element = iiif_prezi3.Annotation(
        id = "https://placeholder.com/canvas/1/page/1/1",
        motivation = "painting",
        target = f"https://placeholder.com/canvas/1#xywh=0,0,{width},{height}",
        body = {
            "id" : media_item.get_full_url(),
            "type" : mime_type.split("/")[0].capitalize(),
            "format" : mime_type,
            "width" : width,
            "height" : height
        }
    )

    # Then we add the Annotation to the AnnotationPage, and the AnnotationPage to the Canvas:
    annotation_page.items.append(media_annotation_element)
    manifests[i].items[0].items.append(annotation_page)

print("👍 Finished!")
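
The cell above derives the body's `type` and `format` from the file extension in the media URL. The sketch below isolates that step with the standard `mimetypes` module. The helper name `iiif_type_and_format()` and the example URLs are hypothetical; unlike the cell above, it raises an error when no MIME type can be guessed, since `guess_type()` returns `None` in that case.

```python
import mimetypes

# Older Python versions don't know about .webp, so register it explicitly.
mimetypes.add_type("image/webp", ".webp")


def iiif_type_and_format(url: str) -> tuple[str, str]:
    """Return (type, format) for a media URL, e.g. ("Image", "image/jpeg")."""
    mime_type, _encoding = mimetypes.guess_type(url)
    if mime_type is None:
        raise ValueError(f"Could not guess a MIME type for {url!r}")
    # "image/jpeg" -> "Image"
    return mime_type.split("/")[0].capitalize(), mime_type


print(iiif_type_and_format("https://example.org/photo.jpg"))
print(iiif_type_and_format("https://example.org/photo.webp"))
```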

## Add a thumbnail
We've now created our simple Manifests. However, it can be nice to add a thumbnail so that things look a bit prettier in Arvest. We shall use the media item's `thumbnail_url`, but you can use any image URL you want.

In [None]:
print("Creating thumbnails...")

for i, media_item in enumerate(media_for_manifests):
    
    # Get the url of the media item's thumbnail:
    thumb_url = media_item.thumbnail_url
    if thumb_url is not None:
        print_progress_bar(i + 1, len(media_for_manifests), f"Adding thumbnails for \"{media_item.title}\"")
        
        # Get the thumbnail's dimensions and file info:
        dl_location = jhutils.online_files.download(thumb_url, dir = TEMP_FOLDER)
        thumb_info = get_image_info(dl_location)
        mime_type, encoding = mimetypes.guess_type(dl_location)

        # Create a thumbnail object:
        thumb_object = {
            "id" : thumb_url,
            "type" : mime_type.split("/")[0].capitalize(),
            "format" : mime_type,
            "width" : thumb_info["width"],
            "height" : thumb_info["height"]
        }

        # We'll add the thumbnail to the Manifest as well as the Canvas:
        manifests[i].thumbnail = [thumb_object]
        manifests[i].items[0].thumbnail = [thumb_object]

print("👍 Updated Manifest thumbnails!")

## Save to disk
Congrats! We've successfully created our Manifests. Now we just need to save them to disk as JSON files. We'll save them to the `TEMP_FOLDER`, as we won't need them afterwards.

Run the cell and take a look at your Manifests!

In [None]:
print("Saving to disk...")

for i, media_item in enumerate(media_for_manifests):
    print_progress_bar(i + 1, len(media_for_manifests), f"Saving Manifest for \"{media_item.title}\"")

    title_sanitized = slugify(media_item.title)
    write_json(os.path.join(TEMP_FOLDER, f"{title_sanitized}-manifest.json"), manifests[i].dict())

print("👍 Finished!")
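
If you'd rather check the saved files programmatically than by eye, a quick sanity check along the following lines can help. This is a minimal sketch using only the standard library: the helper `find_manifest_problems()` is hypothetical, it looks at only a few fields, and it is not a substitute for validating against the IIIF Presentation API.

```python
import json
import os
import tempfile


def find_manifest_problems(path: str) -> list:
    """Return a list of human-readable problems found in a saved Manifest."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    problems = []
    if data.get("type") != "Manifest":
        problems.append("top-level type is not 'Manifest'")

    canvases = data.get("items") or []
    if not canvases:
        problems.append("Manifest has no Canvas in 'items'")
    for index, canvas in enumerate(canvases):
        if "width" not in canvas or "height" not in canvas:
            problems.append(f"Canvas {index} is missing width or height")
    return problems


# Demonstration with two hypothetical files written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = os.path.join(tmp_dir, "good-manifest.json")
    bad_path = os.path.join(tmp_dir, "bad-manifest.json")

    with open(good_path, "w", encoding="utf-8") as f:
        json.dump({"type": "Manifest", "items": [{"type": "Canvas", "width": 640, "height": 480}]}, f)
    with open(bad_path, "w", encoding="utf-8") as f:
        json.dump({"type": "Manifest", "items": []}, f)

    good_problems = find_manifest_problems(good_path)
    bad_problems = find_manifest_problems(bad_path)

print(good_problems)
print(bad_problems)
```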

# 3. Upload to Arvest
With our first Manifests successfully created, let's upload them to Arvest in order to see how they look. To do this, we'll use the `add_manifest()` function.

`add_manifest()` takes a `path` kwarg: the local path to the Manifest file we have created and would like to upload. We'll also need to set the `update_id` kwarg to `True` so that the placeholder URLs get replaced with the new URL created when we upload the Manifest to Arvest.
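
To make the idea of replacing placeholder URLs more concrete, here is a minimal sketch of how ids could be rewritten in a Manifest dict. It is not Arvest's actual implementation, which is not shown in this notebook; the function name `replace_id_prefix()` and the URL `https://example.org/manifests` are hypothetical.

```python
def replace_id_prefix(node, old_prefix: str, new_prefix: str):
    """Recursively replace `old_prefix` at the start of every string value."""
    if isinstance(node, dict):
        return {key: replace_id_prefix(value, old_prefix, new_prefix) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_id_prefix(value, old_prefix, new_prefix) for value in node]
    if isinstance(node, str) and node.startswith(old_prefix):
        return new_prefix + node[len(old_prefix):]
    # Numbers, booleans, None and non-matching strings are left untouched.
    return node


placeholder_manifest = {
    "id": "https://placeholder.com/my-image.json",
    "type": "Manifest",
    "items": [{"id": "https://placeholder.com/canvas/1", "type": "Canvas"}],
}

updated_manifest = replace_id_prefix(
    placeholder_manifest,
    "https://placeholder.com",
    "https://example.org/manifests",  # hypothetical final location
)

print(updated_manifest["id"])
print(updated_manifest["items"][0]["id"])
```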

Let's also modify the **title**, **description** and **thumbnail** of the Manifest item that will be created in Arvest, as well as its **metadata** (note that this is the Arvest Manifest item's metadata, not the actual metadata of the Manifest itself).

In [None]:
uploaded_manifests = []
count = 0
print("Uploading files...")

for i, media_item in enumerate(media_for_manifests):
    print_progress_bar(i + 1, len(media_for_manifests), f"Uploading Manifest for \"{media_item.title}\"")
    
    # Add the Manifest here:
    title_sanitized = slugify(media_item.title)
    added_manifest = ar.add_manifest(path = os.path.join(TEMP_FOLDER, f"{title_sanitized}-manifest.json"), update_id = True)
    
    # Update the Manifest item's info:
    added_manifest.update_title(f"{media_item.title} Manifest")
    added_manifest.update_description("An API created Manifest!")
    media_thumb_url = media_item.thumbnail_url
    if media_thumb_url is not None:
        added_manifest.update_thumbnail_url(media_thumb_url)

    # Update the Manifest item's metadata:
    manifest_metadata = added_manifest.get_metadata()
    manifest_metadata["creator"] = "Batch manifest upload example script"
    manifest_metadata["identifier"] = "&&BATCH_UPLOAD"
    added_manifest.update_metadata(manifest_metadata)
        
    uploaded_manifests.append(added_manifest)
    count = count + 1

print(f"👏 Added {count} Manifest files to Arvest!")

## View in Arvest
Now that our Manifests have been uploaded, we can open them in Arvest. You can either go to your [workspace](https://workspace.arvest.app) and find them in your Manifest list, or run the following cell to get a direct link to each Manifest using the `get_preview_url()` method.

In [None]:
for i, manifest in enumerate(uploaded_manifests):
    print(f"Manifest {i + 1}:\n\t{manifest.get_preview_url()}")

Follow our other tutorials to see how to create Manifests with other types of media, and more complicated Manifests, such as multi-page Manifests, Manifests with metadata, and Manifests with Annotations.

# 4. The quick version: `arvesttools` `media_to_manifest()`

Now that we know how everything works, we'll finish by showing you a utility function that lets you do all of this in far fewer lines of code, using the [arvesttools](https://github.com/arvest-data-in-context/arvest-api-tools) package and its `media_to_manifest()` function.

This function takes an **Arvest media item** and creates a IIIF Manifest from it, which can then be saved and uploaded straight to Arvest.

In [None]:
uploaded_manifests = []

for i, media_item in enumerate(media_for_manifests):
    print_progress_bar(i + 1, len(media_for_manifests), f"Creating Manifest for \"{media_item.title}\"")

    # Using the media_to_manifest() function (returns the iiif_prezi3 Manifest):
    manifest = arvesttools.manifest_creation.media_to_manifest(media_item)

    # Save to disk:
    path_on_disk = os.path.join(TEMP_FOLDER, f"{slugify(media_item.title)}-automatically-created-manifest.json")
    write_json(path_on_disk, manifest.dict())

    # Upload to Arvest:
    added_manifest = ar.add_manifest(path = path_on_disk, update_id = True)

    added_manifest.update_title(f"{media_item.title} Manifest (automatically created)")
    media_thumb_url = media_item.thumbnail_url
    if media_thumb_url is not None:
        added_manifest.update_thumbnail_url(media_thumb_url)

    manifest_metadata = added_manifest.get_metadata()
    manifest_metadata["creator"] = "Batch manifest upload example script"
    manifest_metadata["identifier"] = "&&BATCH_UPLOAD"
    added_manifest.update_metadata(manifest_metadata)

    uploaded_manifests.append(added_manifest)

print("👏 Finished!")

Like above, get their preview urls here or view them in your [workspace](https://workspace.arvest.app/app/my-projects):

In [None]:
for i, manifest in enumerate(uploaded_manifests):
    print(f"Manifest {i + 1}:\n\t{manifest.get_preview_url()}")

# 5. Cleanup
To finish, let's clean up our mess! First, we can delete the temporary folder where the media was downloaded and our Manifests were created.

In [None]:
shutil.rmtree(TEMP_FOLDER)
print(f"🗑️ {TEMP_FOLDER} removed!")

And finally, we can remove all of our Manifests from Arvest. We can get all of our Manifests by using the `get_manifests()` function, then check each one's metadata. If it's one of the files we want to remove, we can then use the `remove()` function.

**⚠️ Warning: there's no going back after using the remove function, so be careful! To avoid accidental removal, we've added a `REMOVE` variable that needs to be set to `True` for the code to run.**

In [None]:
REMOVE = False

if REMOVE:
    all_manifests = ar.get_manifests()
    count = 0
    print("Removing manifests...")

    for i, media_file in enumerate(all_manifests):
        print_progress_bar(i + 1, len(all_manifests), f"(Processing file {i + 1}/{len(all_manifests)})")
        media_metadata = media_file.get_metadata()
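        # Use .get() so that Manifests without these metadata fields are skipped rather than raising a KeyError:
        if media_metadata.get("creator") == "Batch manifest upload example script" and media_metadata.get("identifier") == "&&BATCH_UPLOAD":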
            media_file.remove()
            count = count + 1

    print(f"🗑️ Removed {count} Manifest files!")
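
Since removal cannot be undone, it can be worth doing a dry run first to see what would match. The sketch below applies the same two metadata criteria as the cell above to a list of plain dicts, without touching Arvest at all. The sample metadata and the helper name `matches_cleanup_criteria()` are hypothetical.

```python
def matches_cleanup_criteria(metadata: dict) -> bool:
    """True if the metadata carries both markers set during the upload step."""
    return (
        metadata.get("creator") == "Batch manifest upload example script"
        and metadata.get("identifier") == "&&BATCH_UPLOAD"
    )


# Hypothetical metadata for three Manifest items.
sample_metadata = [
    {"title": "Tutorial Manifest", "creator": "Batch manifest upload example script", "identifier": "&&BATCH_UPLOAD"},
    {"title": "My real work", "creator": "Me", "identifier": "IMPORTANT"},
    {"title": "No markers at all"},
]

would_be_removed = [m["title"] for m in sample_metadata if matches_cleanup_criteria(m)]
print(f"Dry run: {len(would_be_removed)} Manifest(s) would be removed: {would_be_removed}")
```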