yProv is a provenance service that addresses multi-level provenance and reproducibility challenges in climate analytics experiments. It lets scientists manage provenance information compliant with the W3C PROV standard in a structured way, and navigate the provenance space across multiple dimensions to obtain coarse- or fine-grained information according to the level of interest.
yProv is a joint project between the University of Trento and CMCC.
yProvStore is the backend service of yProv, built with FastAPI and designed to handle the storage and retrieval of provenance data. It provides a RESTful API for interacting with provenance information, allowing users to create and read provenance records, manage document metadata, and handle permissions.
This section provides instructions for setting up the yProvStore project for local development. It covers the prerequisites, dependencies installation, database setup, and how to run the application.
For a quick setup, follow the TL;DR: Quick Setup & Installation section below. Otherwise, read through the detailed steps in the sections that follow it.
-
Clone the repository:
git clone https://github.com/HPCI-Lab/yProvStore
cd yProvStore
-
Install uv (optional, recommended):
pip install uv
-
Install Python 3.12 (if not already installed). You can use pyenv or uv to manage Python versions:
uv python install 3.12
-
Install dependencies:
uv sync
-
Run database migrations:
uv run alembic upgrade head
-
Create a .env file in the root directory to set the necessary environment variables. For local testing, you can use:
USE_LOCAL_PID_SERVICE=True # Uses a mocked version of the PID service for local testing
USE_LOCAL_FILE_STORAGE_SERVICE=True # Uses local file storage instead of MinIO for local testing
Or, if you want to connect to the real PID service, provide the path to your private key:
PID_PRIVATE_KEY_PATH=keys/user_private.pem
USE_LOCAL_PID_SERVICE=False
USE_LOCAL_FILE_STORAGE_SERVICE=True # Still using local file storage for testing
# Other optional variables for PID service if you need to override defaults
PID_PREFIX=21.T11961 # default value
PID_SERVER_URL=https://pidhs.disi.unitn.it:8000 # default value
PID_ADMIN_HANDLE_INDEX=301 # default value
If you want to use MinIO for file storage, set the following variables instead of USE_LOCAL_FILE_STORAGE_SERVICE=True:
MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=minioadmin
MINIO_BUCKET_NAME=yprov-documents
MINIO_ENDPOINT=localhost:9000 # Change to your MinIO server (for example, within Docker it would be `yprovstore-minio:9000`, which is also the default value)
MINIO_SECURE=False # !! IMPORTANT: if testing locally you need to disable HTTPS
-
Start the application:
uv run src/run.py
NOTE: If you see errors, check the Troubleshooting and Environment Variables sections for common issues.
-
Access the API docs:
Open http://localhost:8000/docs in your browser.
For CLI usage, refer to the yProvStore CLI section below.
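The local-testing configuration from the quick setup can also be generated by a small script. The following is a minimal sketch, not part of yProvStore: `write_local_env` and `LOCAL_TESTING_ENV` are hypothetical names, and the only values used are the two variables shown in the local-testing example above.

```python
from pathlib import Path

# Values copied from the local-testing example in this README; nothing else is assumed.
LOCAL_TESTING_ENV = {
    "USE_LOCAL_PID_SERVICE": "True",
    "USE_LOCAL_FILE_STORAGE_SERVICE": "True",
}


def write_local_env(path: Path, overwrite: bool = False) -> bool:
    """Write a minimal .env for local testing (hypothetical helper).

    Returns True if the file was written, False if it already existed
    and overwrite was not requested.
    """
    if path.exists() and not overwrite:
        return False  # never clobber an existing configuration silently
    lines = [f"{key}={value}" for key, value in LOCAL_TESTING_ENV.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True
```

Calling `write_local_env(Path(".env"))` from the repository root creates the file only if it does not exist yet, so an existing configuration (for example one pointing at a real PID service) is left untouched.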
Before starting with dependencies installation, make sure you have Python 3.12 installed on your system, as it is a requirement to run the application. You can check your Python version by running:
python --version
If you don't have Python 3.12, you can install it using your system's package manager or download it from the official Python website.
Alternatively, you can also use tools such as pyenv or uv to manage multiple Python versions on your system:
uv python install 3.12
NOTE: uv is a tool that simplifies Python project management, including virtual environments and dependency management. It is designed to be faster and more efficient than traditional tools like pip and virtualenv.
If not already installed, you can install uv using:
pip install uv
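The version requirement can also be checked from within Python, which is handy in setup scripts. This is a small sketch under one assumption: it treats "Python 3.12" as an exact major.minor match, since this README does not say whether newer versions are accepted (pyproject.toml is the authority on that). The function name `is_python_312` is hypothetical.

```python
import sys

REQUIRED = (3, 12)  # major.minor stated in this README


def is_python_312(version_info=sys.version_info) -> bool:
    """Return True if the given version tuple is Python 3.12.x."""
    # Only major and minor are compared; the patch level is ignored.
    return tuple(version_info[:2]) == REQUIRED
```

For example, `is_python_312()` inspects the running interpreter, while `is_python_312((3, 11, 9))` evaluates to `False`.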
After cloning the repository, you can create a virtualenv and install the required dependencies by simply running:
uv sync
This command will set up the environment and install all required packages as specified in the pyproject.toml file.
NOTE: All commands run with uv will automatically activate the virtual environment, so you don't need to manually activate it. This is one of the advantages of using uv for managing your Python projects.
The application uses Alembic to manage database migrations. To set up the database, run the following command in the root directory of the cloned repository:
uv run alembic upgrade head
This command will apply all pending migrations to your database, ensuring that it is up-to-date with the latest schema changes.
If you need to change any environment variables, create a .env file in the root directory of the project. This file can hold environment-specific configuration, such as database connection strings or API keys. Some useful environment variables are:
LOG_LEVEL=INFO # Change to DEBUG for more verbose logging
PID_PRIVATE_KEY_PATH=/path/to/private/key.pem # Path to the private key for PID service (will throw an error if not set and USE_LOCAL_PID_SERVICE is False)
USE_LOCAL_PID_SERVICE=True # Set to True to use the local PID service for testing purposes (default is False)
USE_LOCAL_FILE_STORAGE_SERVICE=True # Set to True to use local file storage instead of MinIO (default is False)
The other environment variables can be seen in the src/application/settings.py file, where they are defined with default values. You can override these defaults by setting them in your .env file.
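To illustrate how KEY=VALUE lines with trailing comments, like the ones above, turn into settings, here is a simplified sketch of a .env parser. It is not the loader yProvStore uses (that lives in src/application/settings.py and is not reproduced here). `parse_env_text` and `as_bool` are hypothetical helpers, and real dotenv parsers additionally handle quoting and escapes.

```python
def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, dropping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        # Drop an inline comment introduced by " #", then trim whitespace.
        line = raw_line.split(" #", 1)[0].strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # blank line, full-line comment, or not an assignment
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as True, true, 1, yes, on."""
    return value.strip().lower() in {"true", "1", "yes", "on"}
```

With this sketch, the line `LOG_LEVEL=INFO # Change to DEBUG for more verbose logging` yields the key `LOG_LEVEL` with value `INFO`, and `as_bool` can then be applied to flags such as `USE_LOCAL_PID_SERVICE`.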
To run the application, go to the root directory of the cloned repository and use the following command:
uv run src/run.py
Thanks to uv, this command will automatically activate the virtual environment and run the FastAPI application.
You can now go to your web browser and navigate to http://localhost:8000/docs to access the interactive API documentation provided by FastAPI. This interface allows you to test the API endpoints and explore the available functionality.
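If you prefer to confirm from a script that the server is up, the sketch below fetches the OpenAPI schema and lists the declared endpoint paths. It assumes the application keeps FastAPI's default schema location, `/openapi.json`, which this README does not confirm. `fetch_openapi` and `list_paths` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request


def fetch_openapi(base_url: str = "http://localhost:8000", timeout: float = 3.0):
    """Return the parsed OpenAPI schema, or None if it cannot be retrieved."""
    # Assumption: the schema is served at FastAPI's default path.
    url = base_url.rstrip("/") + "/openapi.json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return None


def list_paths(schema: dict) -> list[str]:
    """Return the sorted endpoint paths declared in an OpenAPI schema."""
    return sorted(schema.get("paths", {}))
```

A `None` result from `fetch_openapi()` usually means the application is not running or is listening on a different host or port.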
If you encounter this error when running the application:
Exception: Database is empty (no tables), verify your configuration and migrations.
It means that the database has not been initialized yet. To resolve this, ensure you have run the Alembic migrations as described in the Database Setup section above.
If you encounter this error when running the application:
FileNotFoundError: PID private key file not found: keys/admpriv.pem. Please provide it or set USE_LOCAL_PID_SERVICE to True if you only need to test locally.
It means that the application is trying to use the PID service, but the private key file is missing. To resolve this, you have two options:
- Provide the missing private key file at the specified path (keys/admpriv.pem), or set the key path PID_PRIVATE_KEY_PATH in your .env file.
- Set the USE_LOCAL_PID_SERVICE environment variable to True in your .env file if you only need to test locally.
More details on this can be found in the Environment Variables section above.
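The two options above boil down to a simple decision, sketched here as a pre-flight check you could run before starting the application. This is an illustration of the documented behavior, not the application's actual startup code. `pid_config_problem` is a hypothetical name, and the fallback path `keys/admpriv.pem` is inferred from the error message shown above.

```python
import os
from typing import Mapping, Optional

# Assumption: default key path inferred from the error message in this README.
DEFAULT_KEY_PATH = "keys/admpriv.pem"


def pid_config_problem(env: Mapping[str, str]) -> Optional[str]:
    """Return a description of the PID misconfiguration, or None if none is found."""
    use_local = env.get("USE_LOCAL_PID_SERVICE", "False").strip().lower() == "true"
    if use_local:
        return None  # mocked PID service: no private key is needed
    key_path = env.get("PID_PRIVATE_KEY_PATH", DEFAULT_KEY_PATH).strip()
    if not os.path.isfile(key_path):
        return f"PID private key file not found: {key_path}"
    return None
```

Passing `os.environ` (or a dictionary parsed from your .env file) to `pid_config_problem` returns a message only when the real PID service is selected and the key file is missing.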
Uploading document from JSON file: examples/prov_valid.json
Error 503:
{'description': 'Failed to ensure storage bucket.'}
❌ Error : No response.
If you encounter this error when trying to upload a document, either the bucket was not created on MinIO or MinIO is not reachable by the application. First check whether the bucket named by the MINIO_BUCKET_NAME environment variable exists: open the MinIO web interface at http://localhost:9001 and log in with the root user and password defined in your .env file (default is minioadmin with password minioadmin). If the bucket is missing, create it with the name set in MINIO_BUCKET_NAME (default name is yprov-documents). Otherwise, check that the MinIO server is running and accessible from the application.
You may also be missing MINIO_SECURE=False in your .env file. This setting is needed when the connection between the application and the MinIO server is insecure (meaning no TLS certificates have been configured), which is common in local testing.
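To separate "MinIO is not reachable" from "the bucket is missing", a plain TCP check against the value of MINIO_ENDPOINT can help. The sketch below uses only the standard library and does not talk to the MinIO API. `split_endpoint` and `can_connect` are hypothetical helpers, and falling back to port 443 or 80 when no port is given is an assumption of this example.

```python
import socket


def split_endpoint(endpoint: str, secure: bool) -> tuple[str, int]:
    """Split 'host:port' into (host, port), defaulting the port by scheme."""
    host, sep, port = endpoint.rpartition(":")
    if not sep:  # no explicit port in the endpoint string
        return endpoint, 443 if secure else 80
    return host, int(port)


def can_connect(endpoint: str, secure: bool = False, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint can be opened."""
    host, port = split_endpoint(endpoint, secure)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False
```

If `can_connect("localhost:9000")` succeeds but uploads still fail with the 503 above, the remaining suspects are a missing bucket or a MINIO_SECURE value that does not match how the server is configured.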
The application can also be deployed using Docker and Docker Compose. Check the DEPLOYMENT.md file for detailed instructions on how to set up and run the application using Docker.
This command-line interface (CLI) allows you to interact with the yProv API directly from your terminal.
You can install it and see the available commands in its dedicated repository: yProvStore-cli

