![IIIF Manifest Creation](https://raw.githubusercontent.com/arvest-data-in-context/ml-notebooks/refs/heads/main/docs/images/notebooks/iiif-manifest-creation.png)

In this notebook, you'll learn how to create [IIIF](https://iiif.io/) Manifests in Python using the [iiif-prezi3](https://github.com/iiif-prezi/iiif-prezi3) package, and upload them to [Arvest](https://arvest.app) using the [Arvest API](https://github.com/arvest-data-in-context/arvest-api) so that you can see how they look. A IIIF Manifest is a small JSON file that lets you bring together different media, set metadata and add annotations. Every Manifest follows a strict format defined by the [IIIF Presentation API](https://iiif.io/api/presentation/3.0/).
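
To make the format more concrete, here is a rough sketch of the top-level skeleton of a Presentation API 3.0 Manifest, written by hand as a plain Python dictionary. The `example.org` URL and the label are placeholders invented for illustration; the rest of this notebook builds the same kind of structure with `iiif-prezi3` rather than by hand.

```python
import json

# Bare-bones Manifest skeleton (placeholder id and label, for illustration only).
manifest_skeleton = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/manifest.json",
    "type": "Manifest",
    "label": {"en": ["An example Manifest"]},
    "items": [],  # Canvases are added to this list
}

# A Manifest is ordinary JSON, so it serialises with the standard library.
print(json.dumps(manifest_skeleton, indent=2))
```

Note that labels are language maps: a language code pointing at a list of strings. The same pattern is used for every label in the cells below.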

# 0. Setup

Let's begin by installing and importing all of the different components we will need.

In [None]:
print("Installing and importing packages...")

# Uninstall and reinstall packages for a clean environment
!pip uninstall -q -y arvestapi
!pip uninstall -q -y jhutils
!pip uninstall -q -y iiif_prezi3
!pip install -q --disable-pip-version-check git+https://github.com/arvest-data-in-context/arvest-api.git
!pip install -q --disable-pip-version-check git+https://github.com/jdchart/jh-py-utils.git
!pip install -q --disable-pip-version-check git+https://github.com/iiif-prezi/iiif-prezi3.git

# Import packages
import arvestapi
from jhutils.local_files import write_json, get_audio_info, get_video_info, get_image_info
import jhutils.online_files
from jhutils.misc import print_progress_bar_colab
import os
import iiif_prezi3
import shutil
import mimetypes

print("👍 Ready!")

# 1. Prepare your media
For IIIF to work, each media file must be reachable at a public URL. Plenty of services let you store files and make them accessible; some are free, others are paid. Here are a few:
- [Internet Archive](https://archive.org/): you can store any type of file on the Internet Archive.
- [GitHub](https://github.com/): if you have a GitHub account, you can upload small files to a repo and access them with a direct URL.
- [Nakala](https://www.nakala.fr/): run by Huma-Num, Nakala is a service that allows you to store data for digital humanities projects.
- [File Browser](https://filebrowser.org/): File Browser is an open source file hosting system you can set up on your own server.
- [Arvest](https://arvest.app): Arvest is a digital humanities tool specifically designed for IIIF which also lets you store media.

You will know that a file is directly available if you open the link in a web browser and nothing but the media file's content appears. For the purposes of this notebook, we shall use some content that we know is available at the following addresses (hosted at the [Library of Congress](https://www.loc.gov/collections/fsa-owi-color-photographs/about-this-collection/)); you can always replace these URLs with your own.

In [None]:
media_urls = {
    "images" : [
        "https://tile.loc.gov/storage-services/service/pnp/fsac/1a34000/1a34600/1a34630v.jpg",
        "https://tile.loc.gov/storage-services/service/pnp/fsac/1a34000/1a34200/1a34209v.jpg"
    ],
    "audio" : [
        "https://tile.loc.gov/storage-services/master/afc/afc1940001/afc1940001_a3815a2/afc1940001_a3815a2.wav",
        "https://tile.loc.gov/storage-services/master/afc/afc1999008/afc1999008_crf_mha215007.wav"
    ],
    "video" : [
        "https://tile.loc.gov/storage-services/service/mbrs/ntscrm/01629085/01629085.mp4",
        "https://tile.loc.gov/storage-services/public/music/musihas-200003870/musihas-200003870.0001.mp4"
    ]
}
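
The "open it in a browser" check can also be approximated in code. The sketch below is one possible approach and not part of the Arvest or jhutils packages: both function names are hypothetical. It sends a `HEAD` request and looks at the `Content-Type` header the server reports.

```python
import urllib.request

def is_media_content_type(content_type):
    """Return True if a Content-Type header value names an image, audio or video file."""
    # Drop parameters such as "; charset=utf-8" before inspecting the type.
    main_type = content_type.split(";")[0].strip().lower()
    return main_type.split("/")[0] in ("image", "audio", "video")

def serves_media_directly(url, timeout=10):
    """Send a HEAD request and inspect the Content-Type reported by the server."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_media_content_type(response.headers.get("Content-Type", ""))
```

Calling `serves_media_directly(media_urls["images"][0])` needs network access, and some servers refuse `HEAD` requests or report a generic content type, so the browser check remains the reference.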

In order to create our Manifests, we will need to gather some basic information about our media: **dimensions** and, in the case of audiovisual media, **duration**. Unfortunately, the most reliable way of doing this is to download the media file and then read the information from it. In the following cell, we download each file into a temporary folder and collect the information we need. The downloaded files are deleted along with the temporary folder in the cleanup step at the end of the notebook.

In [None]:
TEMP_FOLDER = os.path.join(os.getcwd(), "_TEMP")
media_info = {}

# Create the temporary folder if it does not exist yet
os.makedirs(TEMP_FOLDER, exist_ok = True)

print("Downloading and retrieving info...")

for media_type in media_urls:
    for i, media_file in enumerate(media_urls[media_type]):
        print_progress_bar_colab(i, len(media_urls[media_type]) - 1, f"(File {i + 1}/{len(media_urls[media_type])} in {media_type})")

        jhutils.online_files.download(media_file, dir = TEMP_FOLDER)
        if media_type == "images":
            media_info[os.path.basename(media_file)] = get_image_info(os.path.join(TEMP_FOLDER, os.path.basename(media_file)))
        if media_type == "audio":
            media_info[os.path.basename(media_file)] = get_audio_info(os.path.join(TEMP_FOLDER, os.path.basename(media_file)))
        if media_type == "video":
            media_info[os.path.basename(media_file)] = get_video_info(os.path.join(TEMP_FOLDER, os.path.basename(media_file)))

for item in media_info:
    print(f"{item}:\n\t{media_info[item]}")

print("👍 Finished!")
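
For a sense of what reading a duration from a file involves, here is a minimal sketch using only Python's standard `wave` module. It is not how the jhutils helpers work internally, and it only handles uncompressed WAV files. The function name `wav_duration_ms` is hypothetical, and it returns milliseconds on the assumption that this matches the `duration` values that the later cells divide by 1000.

```python
import io
import wave

def wav_duration_ms(file_or_path):
    """Return the duration of an uncompressed WAV file in milliseconds."""
    with wave.open(file_or_path, "rb") as wav_file:
        # Duration is the number of audio frames divided by frames per second.
        return wav_file.getnframes() / wav_file.getframerate() * 1000

# Build a one-second silent mono WAV in memory to try the function out.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)      # 16-bit samples
    wav_file.setframerate(8000)   # 8000 frames per second
    wav_file.writeframes(b"\x00\x00" * 8000)
buffer.seek(0)

print(wav_duration_ms(buffer))
```

The same function accepts a path, for example a file downloaded into `TEMP_FOLDER`.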

# 2. Create basic Manifests
Now we have all of the information we need to create our basic Manifests that point to our media. 

In [None]:
manifests = {"images" : [], "audio" : [], "video" : []}

for media_type in media_urls:
    for i, media_file in enumerate(media_urls[media_type]):
        print_progress_bar_colab(i, len(media_urls[media_type]) - 1, f"(File {i + 1}/{len(media_urls[media_type])} in {media_type})")

        media_file_info = media_info[os.path.basename(media_file)]
        
        # The Manifest class is our starting point:
        manifest = iiif_prezi3.Manifest(
            id = f"https://placeholder.com/{os.path.basename(media_file)}.json",
            label = {"en" : [f"{os.path.basename(media_file)} ({media_type} {i + 1} / {len(media_urls[media_type])})"]}
        )

        # Next we create a Canvas; think of this as a page in your Manifest:
        canvas = iiif_prezi3.Canvas(
            id = "https://placeholder.com/canvas/1",
            label = {"en" : [f"{os.path.basename(media_file)}"]}
        )

        # Next we need to add the media file:
        annotation_page = iiif_prezi3.AnnotationPage(id = "https://placeholder.com/canvas/1/page/1")
        media_annotation_element = iiif_prezi3.Annotation(
            id = "https://placeholder.com/canvas/1/page/1/1",
            motivation = "painting",
            target = "https://placeholder.com/canvas/1"
        )
        mime_type, encoding = mimetypes.guess_type(media_file)
        body = {
            "id" : media_file,
            "type" : mime_type.split("/")[0].capitalize(),
            "format" : mime_type
        }

        # Next let's update the corresponding fields with dimension and duration information:
        if media_type == "images" or media_type == "video":
            media_annotation_element.target = media_annotation_element.target + f"#xywh=0,0,{media_file_info['width']},{media_file_info['height']}"
            body["width"] = media_file_info['width']
            body["height"] = media_file_info['height']
            canvas.width = media_file_info['width']
            canvas.height = media_file_info['height']
            if media_type == "video":
                media_annotation_element.target = media_annotation_element.target + f"&t=0,{media_file_info['duration'] / 1000}"
                body["duration"] = media_file_info['duration'] / 1000
                canvas.duration = media_file_info["duration"] / 1000
        if media_type == "audio":
            media_annotation_element.target = media_annotation_element.target + f"#t=0,{media_file_info['duration'] / 1000}"
            body["duration"] = media_file_info['duration'] / 1000
            body["type"] = "Sound"
            canvas.duration = media_file_info["duration"] / 1000

        # We need to stitch everything together:
        media_annotation_element.body = body
        annotation_page.items.append(media_annotation_element)
        canvas.items.append(annotation_page)
        manifest.items.append(canvas)

        # And finally save everything to disk:
        write_json(os.path.join(TEMP_FOLDER, f"{os.path.splitext(os.path.basename(media_file))[0]} Basic Manifest.json"), manifest.dict())
        manifests[media_type].append(manifest)

print("👍 Manifests created!")
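
The nesting built above (Manifest, then Canvas, then AnnotationPage, then Annotation, then body) can be checked by walking the resulting JSON. The sketch below is a hypothetical helper, not part of `iiif-prezi3`. It works on plain dictionaries and assumes each painting annotation has a single dictionary as its body, as in the cell above. The `sample` Manifest is a hand-written stand-in with invented `example.org` URLs.

```python
def list_painted_media(manifest_dict):
    """Return (canvas id, media URL, media type) for every painting annotation."""
    found = []
    for canvas in manifest_dict.get("items", []):
        for page in canvas.get("items", []):
            for annotation in page.get("items", []):
                # Only "painting" annotations place media on the Canvas.
                if annotation.get("motivation") != "painting":
                    continue
                body = annotation.get("body", {})
                found.append((canvas.get("id"), body.get("id"), body.get("type")))
    return found

# Hand-written stand-in for one of the saved Manifest files.
sample = {
    "id": "https://example.org/manifest.json",
    "type": "Manifest",
    "items": [{
        "id": "https://example.org/canvas/1",
        "type": "Canvas",
        "items": [{
            "id": "https://example.org/canvas/1/page/1",
            "type": "AnnotationPage",
            "items": [{
                "id": "https://example.org/canvas/1/page/1/1",
                "type": "Annotation",
                "motivation": "painting",
                "target": "https://example.org/canvas/1",
                "body": {"id": "https://example.org/image.jpg", "type": "Image", "format": "image/jpeg"},
            }],
        }],
    }],
}

print(list_painted_media(sample))
```

To inspect one of the files written to `TEMP_FOLDER`, load it with `json.load()` and pass the result to the function.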

# 3. Upload to Arvest
Congrats! We've successfully created our first Manifests. Let's upload them to Arvest to see how they look. First, we need to "connect" to Arvest using the Arvest API package. For this, we need our user email and our password.

In [None]:
ar = arvestapi.Arvest("my_email@something.com", "myarvestpassword")
print(f"👍 Successfully connected to Arvest with \"{ar.profile.name}\"")

Now we can upload the Manifest files to Arvest using the `add_manifest()` function. Its `path` kwarg is the path to the file we'd like to upload; this can be a local path or an online path, and the API package will take care of things for us. We'll also set the `update_id` kwarg to `True` so that the placeholder URLs get replaced with the new URL created when we upload the Manifest to Arvest. Each time, we'll grab the added Manifest and modify its **title** and **description**.

In [None]:
uploaded_manifests = []
count = 0
print("Uploading files...")

for media_type in media_urls:
    for i, media_file in enumerate(media_urls[media_type]):
        print_progress_bar_colab(i, len(media_urls[media_type]) - 1, f"(Manifest {i + 1}/{len(media_urls[media_type])} in {media_type})")
        manifest_path = os.path.join(TEMP_FOLDER, f"{os.path.splitext(os.path.basename(media_file))[0]} Basic Manifest.json")
        
        added_manifest = ar.add_manifest(path = manifest_path, update_id = True)
        added_manifest.update_title(f"{os.path.splitext(os.path.basename(media_file))[0]} Basic Manifest")
        added_manifest.update_description("API created Manifest")
        
        # Update metadata:
        manifest_metadata = added_manifest.get_metadata()
        manifest_metadata["creator"] = "Batch media upload example script"
        manifest_metadata["identifier"] = "&&BATCH_UPLOAD"
        added_manifest.update_metadata(manifest_metadata)
        count = count + 1

print(f"👏 Added {count} Manifest files to Arvest!")

# 4. More complicated Manifests
Now that we've got basic Manifests up and running, let's make them a bit more complicated! We'll go through a few features here that show you the scope of what a IIIF Manifest can do.

## Manifest Metadata
First, we can give our Manifests some metadata. Note that this isn't the same as the metadata given to the object in Arvest; this is the metadata that is displayed within the workspace, alongside your media. For the sake of these examples, we'll take the first video Manifest created earlier. To add metadata, we simply need to update the list associated with the prezi3 `Manifest`'s `metadata` property.

In [None]:
manifest = manifests["video"][0]

metadata = [
    {
        "label" : {"en" : ["Title"]},
        "value" : {"en" : ["My super video manifest"]},
    },
    {
        "label" : {"en" : ["Creator"]},
        "value" : {"en" : ["Me!"]},
    },
    {
        "label" : {"en" : ["Date"]},
        "value" : {"en" : ["2024-01-01"]},
    }
]

manifest.metadata = metadata

print("👍 Updated Manifest metadata!")
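
Writing each label/value pair by hand gets repetitive. As a possible convenience, the sketch below converts a flat dictionary into the list format used for the `metadata` property. The helper name `to_iiif_metadata` is hypothetical and not part of `iiif-prezi3`, and it assumes every entry uses the same language.

```python
def to_iiif_metadata(fields, language="en"):
    """Turn {"Label": "value"} pairs into IIIF label/value entries."""
    return [
        {"label": {language: [label]}, "value": {language: [str(value)]}}
        for label, value in fields.items()
    ]

print(to_iiif_metadata({"Title": "My super video manifest", "Creator": "Me!"}))
```

The returned list has the same shape as the `metadata` list in the cell above, so it could be assigned to `manifest.metadata` in the same way.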

## Annotations
Next we can give our Manifest's canvas some annotations. To do this, we'll need to create a new prezi3 `AnnotationPage` object and some `Annotation`s which we shall format in a specific way.

In [None]:
annotation_page = iiif_prezi3.AnnotationPage(id = "https://placeholder.com/canvas/1/annotation/1")

# Simple textual annotation
annotation_page.items.append(iiif_prezi3.Annotation(
    id = "https://placeholder.com/canvas/1/annotation/1/1",
    motivation = "commenting",
    target = "https://placeholder.com/canvas/1",
    body = {
        "type" : "TextualBody",
        "format" : "text/html",
        "value" : "<p><strong><em>MY COOL ANNOTATION</em></strong><br>This Annotation is written in html, here's a <a href=\"https://arvest.app/en\">link</a>.</p>"
    }
))

# Tagging annotation
annotation_page.items.append(iiif_prezi3.Annotation(
    id = "https://placeholder.com/canvas/1/annotation/1/2",
    motivation = "tagging",
    target = "https://placeholder.com/canvas/1",
    body = {
        "type" : "TextualBody",
        "format" : "text/html",
        "value" : "Hello world"
    }
))

# Annotation linked to another manifest
linked_manifest_url = "https://iiif.harvardartmuseums.org/manifests/object/299843"
annotation_page.items.append(iiif_prezi3.Annotation(
    id = f"https://placeholder.com/canvas/1/annotation/1/3#{linked_manifest_url}",
    motivation = "commenting",
    target = "https://placeholder.com/canvas/1",
    body = {
        "type" : "TextualBody",
        "format" : "text/html",
        "value" : "<p>Visit this Manifest by clicking the link below.</p>"
    }
))

# Annotation with a spatial region:
region = {"x" : 10, "y" : 10, "w" : 100, "h" : 100}
annotation_page.items.append(iiif_prezi3.Annotation(
    id = f"https://placeholder.com/canvas/1/annotation/1/4",
    motivation = "commenting",
    target = f"https://placeholder.com/canvas/1#xywh={region['x']},{region['y']},{region['w']},{region['h']}",
    body = {
        "type" : "TextualBody",
        "format" : "text/html",
        "value" : "<p>This will target a 100 pixel square at the coordinates 10,10.</p>"
    }
))

# Annotation with a temporal region:
region = {"start" : 1, "end" : 2}
annotation_page.items.append(iiif_prezi3.Annotation(
    id = f"https://placeholder.com/canvas/1/annotation/1/5",
    motivation = "commenting",
    target = f"https://placeholder.com/canvas/1#t={region['start']},{region['end']}",
    body = {
        "type" : "TextualBody",
        "format" : "text/html",
        "value" : "<p>This will target from 1 second to 2 seconds.</p>"
    }
))

# Annotation with everything!
region = {"x" : 110, "y" : 110, "w" : 100, "h" : 100, "start" : 1, "end" : 2}
linked_manifest_url = "https://iiif.harvardartmuseums.org/manifests/object/299843"
annotation_page.items.append(iiif_prezi3.Annotation(
    id = f"https://placeholder.com/canvas/1/annotation/1/6#{linked_manifest_url}",
    motivation = "commenting",
    target = f"https://placeholder.com/canvas/1#xywh={region['x']},{region['y']},{region['w']},{region['h']}&t={region['start']},{region['end']}",
    body = {
        "type" : "TextualBody",
        "format" : "text/html",
        "value" : "<p><strong>A BIT OF EVERYTHING PLEASE!</strong></p>"
    }
))

# Add the annotations to the Canvas:
manifest.items[0].annotations = [annotation_page]

print("👍 Updated Manifest annotations!")
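
The targets above all follow one pattern: the Canvas id, then `#`, then an optional `xywh=` spatial part and an optional `t=` temporal part joined by `&`. A small helper makes that pattern explicit. The name `with_fragment` is hypothetical; it only reproduces the string formatting used in the cell above.

```python
def with_fragment(canvas_id, xywh=None, t=None):
    """Append a spatial (xywh) and/or temporal (t) fragment to a Canvas id."""
    parts = []
    if xywh is not None:
        # x, y, width and height, in Canvas coordinates
        parts.append("xywh=" + ",".join(str(value) for value in xywh))
    if t is not None:
        # start and end time, in seconds
        parts.append("t=" + ",".join(str(value) for value in t))
    return canvas_id + ("#" + "&".join(parts) if parts else "")

canvas_id = "https://placeholder.com/canvas/1"
print(with_fragment(canvas_id, xywh=(10, 10, 100, 100)))
print(with_fragment(canvas_id, t=(1, 2)))
print(with_fragment(canvas_id, xywh=(110, 110, 100, 100), t=(1, 2)))
```

With no region given, the function returns the Canvas id unchanged, which targets the whole Canvas.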

## Attribution
Next, let's provide a few other bits of information, such as `requiredStatement`, `rights` and `provider`.

In [None]:
manifest.rights = "https://creativecommons.org/licenses/by-nc-nd/4.0/"
manifest.requiredStatement = {
    "label" : {"en" : ["Attribution"]},
    "value" : {"en" : ["My institution"]}
}
manifest.provider = [iiif_prezi3.ProviderItem(
    id = "https://www.univ-rennes2.fr/",
    label = {"en" : ["My institution"]},
    homepage = iiif_prezi3.HomepageItem(id = "https://arvest.app/en", label = {"en" : ["Arvest Homepage"]}, type = "Text"),
    logo = {"id" : "https://arvest.app/imgs/logos/arvest_logo_cut.png", "type" : "Image", "format" : "image/png", "width" : 107, "height" : 107}
)]

print("👍 Updated Manifest attribution!")

## Thumbnails
Let's also add some thumbnails. We'll need to set one for the Manifest itself as well as one for the Canvas.

In [None]:
manifest.thumbnail = [{"id" : "https://arvest.app/imgs/logos/arvest_logo_cut.png", "type" : "Image", "format" : "image/png", "width" : 107, "height" : 107}]
manifest.items[0].thumbnail = [{"id" : "https://arvest.app/imgs/logos/arvest_logo_cut.png", "type" : "Image", "format" : "image/png", "width" : 107, "height" : 107}]

print("👍 Updated Manifest thumbnails!")

## Save the Manifest and upload
Finally, let's save the Manifest to disk and upload the result to Arvest.

In [None]:
# Save
out_path = os.path.join(TEMP_FOLDER, "complex_video_manifest.json")
write_json(out_path, manifest.dict())

# Upload
added_manifest = ar.add_manifest(path = out_path, update_id = True)
added_manifest.update_title("Complex video manifest")
added_manifest.update_description("Showing off some more complex possibilities.")

# Update metadata
manifest_metadata = added_manifest.get_metadata()
manifest_metadata["creator"] = "Batch media upload example script"
manifest_metadata["identifier"] = "&&BATCH_UPLOAD"
added_manifest.update_metadata(manifest_metadata)

# 5. Cleanup
To finish, let's clean up our mess! First, we can delete the temporary folder where the media was downloaded and our Manifests were created.

In [None]:
shutil.rmtree(TEMP_FOLDER)
print(f"🗑️ {TEMP_FOLDER} removed!")

And finally, we can remove all of our Manifests from Arvest. We can get all of the Manifests in our account with the `get_manifests()` function, then check each one's metadata. If it's one of the files we want to remove, we can then use the `remove()` function.

**Warning: there's no going back after using the remove function, so be careful!**

In [None]:
all_manifests = ar.get_manifests()
count = 0
print("Removing manifests...")

for i, media_file in enumerate(all_manifests):
    print_progress_bar_colab(i, len(all_manifests) - 1, f"(Processing file {i + 1}/{len(all_manifests)})")
    media_metadata = media_file.get_metadata()
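    # Use .get() so Manifests that lack these metadata keys are skipped instead of raising a KeyError
    if media_metadata.get("creator") == "Batch media upload example script" and media_metadata.get("identifier") == "&&BATCH_UPLOAD":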
        media_file.remove()
        count = count + 1

print(f"🗑️ Removed {count} Manifest files!")