# README: Cultural Heritage Data Integration and Query System

## Project Overview
This project integrates cultural heritage data from multiple sources and queries it through both a relational and a graph database. The goal is to support complex queries over object metadata and processing data by combining linked data (RDF, queried with SPARQL) with traditional tabular structures (queried with SQL).

It combines data from **JSON** and **CSV** files and loads them into:
- A **SQLite database** (for relational queries)
- A **Blazegraph triple store** (for SPARQL queries)
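
As a rough illustration of the relational half of this load step, the sketch below reads CSV rows and writes them to a SQLite table. It uses only the standard library so that it runs without extra dependencies; the project itself relies on pandas. The sample CSV content, the `acquisition` table name, and the column names are hypothetical and are not taken from the project's actual schema.

```python
import csv
import io
import sqlite3

# Hypothetical CSV content standing in for a file under processing/.
SAMPLE_CSV = """object_id,acquisition_date,status
obj-1,2023-04-12,completed
obj-2,2023-05-03,in progress
"""


def load_csv_to_sqlite(csv_text, connection, table="acquisition"):
    """Create `table` (a hypothetical name) and insert every CSV row into it."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    connection.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(object_id TEXT PRIMARY KEY, acquisition_date TEXT, status TEXT)"
    )
    # Named placeholders are filled from the dict keys of each CSV row.
    connection.executemany(
        f"INSERT OR REPLACE INTO {table} "
        "VALUES (:object_id, :acquisition_date, :status)",
        rows,
    )
    connection.commit()
    return len(rows)


# An in-memory database keeps the sketch side-effect free;
# the project would connect to relational.db instead.
conn = sqlite3.connect(":memory:")
inserted = load_csv_to_sqlite(SAMPLE_CSV, conn)
```

Storing dates as ISO 8601 text (`YYYY-MM-DD`) keeps them sortable as plain strings, which makes date-range filters in SQL straightforward.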

## Data Sources
- `data/metadata/` – JSON files describing cultural heritage objects (authors, timeframes, etc.)
- `data/processing/` – CSV files with acquisition dates, processing statuses, etc.

## Tools & Technologies
- Python 3
- pandas
- SQLite
- Blazegraph
- rdflib
- Custom Python classes (in `impl/` directory)

## Repository Structure
```
impl/
  ├── MetadataQueryHandler.py
  ├── ProcessDataQueryHandler.py
  └── AdvancedMashup.py

data/
  ├── metadata/     # JSON files
  └── processing/   # CSV files

relational.db       # SQLite database
README.ipynb        # This notebook
```

## Key Features
- Modular query classes for SPARQL and SQL queries
- Combined queries via a unified mashup interface
- Parsing and transformation with pandas
- Flexible integration between graph and relational data
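
To make the idea of a combined query more concrete, here is a minimal sketch of how a mashup might join the two sides: the relational store supplies the identifiers of objects acquired in a date range, and the graph side supplies their authors. This is not the project's `AdvancedMashup` implementation. The table layout, the function names, and the sample data are hypothetical, and a plain dictionary stands in for the SPARQL lookup.

```python
import sqlite3


def object_ids_acquired_between(connection, start, end):
    """Relational side: ids of objects acquired within [start, end] (ISO dates)."""
    cursor = connection.execute(
        "SELECT object_id FROM acquisition "
        "WHERE acquisition_date BETWEEN ? AND ? ORDER BY object_id",
        (start, end),
    )
    return [row[0] for row in cursor]


def authors_of(object_ids, authors_by_object):
    """Graph side stand-in: a dict lookup replaces the SPARQL query."""
    found = set()
    for object_id in object_ids:
        found.update(authors_by_object.get(object_id, []))
    return sorted(found)


# Hypothetical sample data for both sides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acquisition (object_id TEXT, acquisition_date TEXT)")
conn.executemany(
    "INSERT INTO acquisition VALUES (?, ?)",
    [("obj-1", "2023-04-12"), ("obj-2", "2023-05-03"), ("obj-3", "2023-04-30")],
)
authors_by_object = {
    "obj-1": ["Author A"],
    "obj-2": ["Author B"],
    "obj-3": ["Author A", "Author C"],
}

ids = object_ids_acquired_between(conn, "2023-04-01", "2023-04-30")
authors = authors_of(ids, authors_by_object)
print(authors)  # ['Author A', 'Author C']
```

Joining on a shared object identifier keeps the two query handlers independent: each can be developed and tested against its own store, and only the mashup layer needs to know about both.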

## Example Use Case

```python
# Assumes impl/__init__.py re-exports these three classes.
from impl import MetadataQueryHandler, ProcessDataQueryHandler, AdvancedMashup

# Graph side: point the metadata handler at the Blazegraph SPARQL endpoint
metadata_qh = MetadataQueryHandler()
metadata_qh.setDbPathOrUrl("http://127.0.0.1:9999/blazegraph/sparql")

# Relational side: point the process handler at the SQLite file
process_qh = ProcessDataQueryHandler()
process_qh.setDbPathOrUrl("relational.db")

mashup = AdvancedMashup([metadata_qh], [process_qh])

# Get authors of objects acquired in April 2023
authors = mashup.getAuthorsOfObjectsAcquiredInTimeFrame("2023-04-01", "2023-04-30")
print(authors)
```

## Installation & Setup
1. Clone the repo and open the notebook
2. Install dependencies:
```bash
pip install pandas rdflib
```
3. Start Blazegraph locally (on port `9999`)
4. Load the data into both systems using the provided scripts
5. Use `MetadataQueryHandler`, `ProcessDataQueryHandler`, and `AdvancedMashup` to explore the data
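
Once Blazegraph is running, a quick way to check that data was loaded is to send a small SPARQL query to the endpoint. The sketch below follows the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter) and uses only the standard library. The helper names `build_sparql_request` and `run_sparql` are hypothetical and are not part of the project; the endpoint URL is the one used in the example above.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://127.0.0.1:9999/blazegraph/sparql"


def build_sparql_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_sparql(endpoint, query, timeout=10):
    """Send the query and return the decoded JSON results (needs a live server)."""
    with urllib.request.urlopen(build_sparql_request(endpoint, query), timeout=timeout) as response:
        return json.load(response)


# Building the request does not touch the network.
request = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")

# With Blazegraph running, the rows could then be inspected like this:
# results = run_sparql(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
# print(results["results"]["bindings"])
```

An empty `bindings` list would suggest that the triple store is reachable but that the load step has not populated it yet.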

## "Data BARK" Members

- Benjamin Kollmar (benjamin.kollmar@studio.unibo.it)
- Amanda Altamirano (amanda.altamirano@studio.unibo.it)
- Rubens Fernandes (rubens.fernandes@studio.unibo.it)
- Ekaterina Krasnova (ekaterina.krasnova@studio.unibo.it)

Instructor: Silvio Peroni

## License
This project is for educational use only.