![Arvest batch media upload](https://raw.githubusercontent.com/arvest-data-in-context/ml-notebooks/refs/heads/main/docs/images/notebooks/link-online-manifests.png)

In this notebook, we will learn how to add online IIIF Manifests to your Manifest list on [Arvest](https://arvest.app) using the [Arvest API](https://github.com/arvest-data-in-context/arvest-api). This lets you build up a collection of IIIF Manifests and organize them with metadata so that you can easily access them for processing.

# 0. Setup

Let's begin by installing and importing all of the different components we will need.

In [None]:
print("Installing and importing packages...")

# Uninstall and reinstall packages for a clean environment
!pip uninstall -q -y arvestapi
!pip uninstall -q -y jhutils
!pip install -q --disable-pip-version-check git+https://github.com/arvest-data-in-context/arvest-api.git
!pip install -q --disable-pip-version-check git+https://github.com/jdchart/jh-py-utils.git

# Import packages
import arvestapi
from jhutils.local_files import read_json
from jhutils.misc import print_progress_bar
import os

print("👍 Ready!")

# 1. Prepare your media
First, we need the URLs of your IIIF Manifest files. We will create a list of URLs in the `ONLINE_MANIFEST_FILES` variable, each of which leads directly to a IIIF Manifest file online.
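
To make "leads directly to a IIIF Manifest" concrete, here is a minimal sketch of a shape check on already-parsed JSON. It relies on the IIIF Presentation API convention that a version 3 Manifest declares `"type": "Manifest"` and a version 2 Manifest declares `"@type": "sc:Manifest"`. The function name `looks_like_iiif_manifest` and the sample dicts are made up for illustration and are not part of the Arvest API; fetching the JSON from the URL is deliberately left out.

```python
def looks_like_iiif_manifest(data):
    """Return True if parsed JSON declares itself a IIIF Manifest (Presentation API v2 or v3)."""
    if not isinstance(data, dict):
        return False
    # v3 uses "type": "Manifest", v2 uses "@type": "sc:Manifest".
    return data.get("type") == "Manifest" or data.get("@type") == "sc:Manifest"


# Hypothetical, heavily shortened examples of what a URL might return.
v3_manifest = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/iiif/book1/manifest",
    "type": "Manifest",
}
v2_manifest = {
    "@context": "http://iiif.io/api/presentation/2/context.json",
    "@id": "https://example.org/iiif/book2/manifest",
    "@type": "sc:Manifest",
}
v3_collection = {
    "@context": "http://iiif.io/api/presentation/3/context.json",
    "id": "https://example.org/iiif/collection",
    "type": "Collection",
}

checks = [looks_like_iiif_manifest(d) for d in (v3_manifest, v2_manifest, v3_collection)]
print(checks)
```

A URL that returns a Collection or an HTML viewer page would fail this check, which is a common reason a link that looks right is not usable as a Manifest.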

In [None]:
ONLINE_MANIFEST_FILES = [
    "https://iiif.harvardartmuseums.org/manifests/object/299843",
    "https://iiif.bodleian.ox.ac.uk/iiif/manifest/e32a277e-91e2-4a6d-8ba6-cc4bad230410.json",
    "https://www.e-codices.unifr.ch/metadata/iiif/gau-Fragment/manifest.json"
]
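
Before sending a long list to Arvest, it can help to drop duplicates and entries that are not web URLs. The following is a minimal offline sketch using only the standard library; the helper name `clean_manifest_urls` is hypothetical, and the check is purely syntactic (it does not confirm that the URL actually serves a Manifest).

```python
from urllib.parse import urlparse


def clean_manifest_urls(urls):
    """Split urls into (valid, rejected): unique http(s) URLs in original order, and the rest."""
    valid, rejected, seen = [], [], set()
    for url in urls:
        candidate = url.strip()
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            # Keep only the first occurrence of each URL.
            if candidate not in seen:
                seen.add(candidate)
                valid.append(candidate)
        else:
            rejected.append(url)
    return valid, rejected


valid_urls, rejected_urls = clean_manifest_urls([
    "https://iiif.harvardartmuseums.org/manifests/object/299843",
    "https://iiif.harvardartmuseums.org/manifests/object/299843",  # duplicate entry
    "manifests/local.json",  # a relative path, not an online URL
])
print(valid_urls)
print(rejected_urls)
```

In the notebook, the same helper could be applied to `ONLINE_MANIFEST_FILES` and the loop run over the first returned list.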

# 2. Connect to Arvest
Next, we need to "connect" to Arvest using the Arvest API package. For this, we pass our user email and password to an instance of the `arvestapi.Arvest()` class. For convenience, we've saved ours in a file, which is why we get `LOGIN_DATA` by reading a JSON file.
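
The login file only needs the two keys the notebook reads, `email` and `password`. As a rough sketch of that layout, the snippet below writes a placeholder file to a temporary directory and loads it with the standard `json` module, failing early if a key is missing. The helper name `load_login` and the placeholder values are assumptions for illustration; the notebook itself uses `read_json` from `jhutils`.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ("email", "password")


def load_login(path):
    """Load a login JSON file and check that the required keys are present and non-empty."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    missing = [key for key in REQUIRED_KEYS if not data.get(key)]
    if missing:
        raise ValueError(f"Missing login field(s): {', '.join(missing)}")
    return data


# Demonstration with placeholder credentials in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, "login_private.json")
    with open(example_path, "w", encoding="utf-8") as f:
        json.dump({"email": "you@example.org", "password": "your-password"}, f)
    login = load_login(example_path)

print(sorted(login))
```

Keeping credentials in a separate file like this makes it easier to exclude them from version control.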

In [None]:
LOGIN_DATA = read_json(os.path.join(os.getcwd(), "login_private.json"))

ar = arvestapi.Arvest(LOGIN_DATA["email"], LOGIN_DATA["password"])
print(f"👍 Successfully connected to Arvest with \"{ar.profile.name}\"")

Now we can add the Manifests to Arvest using the `add_manifest()` function. This takes one kwarg, `path`, which here is the URL of the online Manifest we'd like to add.

We first add each Manifest and store the returned object in a variable called `added_manifest`. This then lets us update the **title** and the **description** of the Manifest item in Arvest.

In [None]:
uploaded_manifests = []
count = 0
print("Uploading files...")

for i, manifest_file_path in enumerate(ONLINE_MANIFEST_FILES):
    print_progress_bar(i + 1, len(ONLINE_MANIFEST_FILES), f"(Manifest {i + 1}/{len(ONLINE_MANIFEST_FILES)})")

    # Add manifest using the add_manifest() function:
    added_manifest = ar.add_manifest(path = manifest_file_path)

    # Update the title and description (change this to whatever you want):
    added_manifest.update_title(f"{manifest_file_path} (batch upload file {i + 1}).")
    added_manifest.update_description("Added to demonstrate batch Manifest linking from a Python notebook.")

    # We add the manifests to a list so that we can retrieve them later:
    uploaded_manifests.append(added_manifest)
    count = count + 1

print(f"👏 Added {count} Manifests to Arvest!")

You can now log on to your [workspace](https://workspace.arvest.app/) and see the new Manifest items in your list.
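
With a long list of URLs, one unreachable Manifest can interrupt the whole loop. A minimal sketch of one way around this is to wrap each call and collect failures for later review. The helper `run_batch` and the stand-in `fake_add` are hypothetical; in the notebook the action would be something like `lambda url: ar.add_manifest(path=url)`, and whether `add_manifest()` raises an exception on failure is an assumption here.

```python
def run_batch(items, action):
    """Apply action to each item, returning (results, failures) instead of stopping on errors."""
    results, failures = [], []
    for item in items:
        try:
            results.append(action(item))
        except Exception as error:
            # Record the failing item and keep going with the rest.
            failures.append((item, error))
    return results, failures


def fake_add(url):
    """Stand-in for the real upload call, so the sketch runs without an Arvest account."""
    if not url.startswith("https://"):
        raise ValueError("not an https URL")
    return {"source": url}


added, failed = run_batch(
    ["https://example.org/a/manifest.json", "ftp://example.org/b/manifest.json"],
    fake_add,
)
print(f"{len(added)} added, {len(failed)} failed")
for url, error in failed:
    print(f"  {url}: {error}")
```

The list of failures can then be retried or corrected by hand without re-adding the Manifests that already succeeded.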

# 3. Update metadata
Finally, we can update our Manifests' metadata. Note that this modifies the Manifest item in Arvest, not the content of the Manifest itself. Among other things, this metadata helps us filter our documents and find the files we need when scripting.

We can handle our metadata as a Python `dict`, which we get using the `get_metadata()` function. We can then update this dict and pass it to the `update_metadata()` function to save the changes in Arvest.

After running the cell below, check your [workspace](https://workspace.arvest.app/) again to see how the metadata has been updated.
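
The get, modify, update pattern can be tried on a plain dict first. The sketch below merges a set of updates into a copy of the current metadata, so the original dict stays available for comparison. The helper `merge_metadata` and the sample `current` dict (including its `title` key) are made up for illustration; only the `creator` and `identifier` fields come from this notebook.

```python
def merge_metadata(current, updates):
    """Return a new dict with updates applied on top of current, leaving current untouched."""
    merged = dict(current)
    merged.update(updates)
    return merged


# Hypothetical metadata as it might look before the update.
current = {"title": "Example manifest", "creator": ""}
updates = {
    "creator": "Batch manifest upload example script",
    "identifier": "&&BATCH_UPLOAD",
}

merged = merge_metadata(current, updates)
changed = sorted(key for key in merged if merged[key] != current.get(key))
print(changed)
```

Listing the changed keys before calling `update_metadata()` is a cheap way to confirm that only the intended fields are being overwritten.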

In [None]:
print("Updating metadata...")

for i, added_manifest in enumerate(uploaded_manifests):
    print_progress_bar(i + 1, len(uploaded_manifests), f"(File {i + 1}/{len(uploaded_manifests)})")

    # Get the metadata dict:
    media_metadata = added_manifest.get_metadata()

    # Update fields:
    media_metadata["creator"] = "Batch manifest upload example script"
    media_metadata["identifier"] = "&&BATCH_UPLOAD"

    # Update on Arvest:
    added_manifest.update_metadata(media_metadata)

print("👍 Metadata updated!")

# 4. Batch remove media
If we need to remove Manifest files, we can do so by going through all of our Manifests and checking certain conditions. For example, we can get all of our Manifests using the `get_manifests()` function, then check the metadata of each one. If it's one of the files we want to remove, we can then use the `remove()` function.

**⚠️ Warning: there's no going back after using the remove function, so be careful! To avoid accidental removal, we've added a `REMOVE` variable that needs to be set to `True` for the code to run.**
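
Since removal cannot be undone, it can be worth doing a dry run on the metadata alone before calling `remove()`. The sketch below uses plain dicts in place of real Manifest items; the helper `matches_tag` and the sample data are hypothetical. It uses `dict.get()` so that items missing a `creator` or `identifier` field are simply treated as non-matching.

```python
BATCH_TAG = {
    "creator": "Batch manifest upload example script",
    "identifier": "&&BATCH_UPLOAD",
}


def matches_tag(metadata, tag):
    """Return True only if every key in tag is present in metadata with the same value."""
    return all(metadata.get(key) == value for key, value in tag.items())


# Hypothetical metadata dicts, standing in for what get_metadata() returns for each item.
sample_metadata = [
    {"title": "A", "creator": "Batch manifest upload example script", "identifier": "&&BATCH_UPLOAD"},
    {"title": "B", "creator": "Someone else", "identifier": "&&BATCH_UPLOAD"},
    {"title": "C"},  # no creator or identifier at all
]

to_remove = [m["title"] for m in sample_metadata if matches_tag(m, BATCH_TAG)]
print(f"Would remove {len(to_remove)} item(s): {to_remove}")
```

Printing the matches first and only then setting `REMOVE` to `True` gives one more chance to catch a condition that is too broad.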

In [None]:
REMOVE = True

if REMOVE:
    count = 0
    print("Removing files...")

    # Get all of our Manifest files:
    all_manifests = ar.get_manifests()
    
    for i, manifest_file in enumerate(all_manifests):
        print_progress_bar(i + 1, len(all_manifests), f"(Processing file {i + 1}/{len(all_manifests)})")
        
        # Get the Manifest item's metadata and check if it matches some conditions:
        manifest_metadata = manifest_file.get_metadata()
        # .get() avoids a KeyError on items that lack these metadata fields:
        if manifest_metadata.get("creator") == "Batch manifest upload example script" and manifest_metadata.get("identifier") == "&&BATCH_UPLOAD":
            
            # Remove the item:
            manifest_file.remove()
            count = count + 1

    print(f"🗑️ Removed {count} manifest files!")