# Semantic Ethical Glass Box (SEGB)

## 1. Overview 

The Semantic Ethical Glass Box (SEGB) is a global *log* store that keeps a semantic registry (a graph) of the logs generated by different systems. It consists of two parts:

1. A Flask-based REST API server, which 1) adds new triples to the global graph and 2) retrieves the global graph;

2. A MongoDB-based database, where the global graph is stored in JSON-LD format.

>[!IMPORTANT]
> Since the SEGB is in a testing stage, the MongoDB database is stored in a Docker-managed volume on the machine where the SEGB is deployed. In the future, during the deployment stage, the database will be migrated to a centralized server to store all the records in a safe, consistent manner.


## 2. API Description

### 🔹 `POST /log`
**Description:**  
Receives **Turtle (TTL)** data, converts it to **JSON-LD**, and saves it in the database. The TTL data may contain one or more triples.

#### ✅ Request
- **URL:** `/log`
- **Method:** `POST`
- **Required Headers:** `Content-Type: text/turtle`
- **Request Body:**  
    A document in **Turtle (TTL)** format (`text/turtle`).

#### 📤 Responses
| Status Code | Description |
|-------------|-------------|
| `200 OK` | Data successfully stored. |
| `400 Bad Request` | Error processing data or missing data. |

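As a minimal sketch of the request described above, the snippet below builds a `POST /log` request with Python's standard `urllib` and prints what would be sent. The `build_log_request` helper, the `http://example.org/` namespace and the example triple are illustrative assumptions, not part of the SEGB, and the server URL assumes a default local deployment. The request is only constructed here; the commented lines at the end show how it could be submitted to a running SEGB.

```python
import urllib.request

# Hypothetical example payload: one triple in a placeholder namespace.
EXAMPLE_TTL = """\
@prefix ex: <http://example.org/> .

ex:robot1 ex:performed ex:greeting1 .
"""


def build_log_request(server: str, turtle: str) -> urllib.request.Request:
    """Build (but do not send) a POST /log request carrying Turtle data."""
    return urllib.request.Request(
        url=f"{server.rstrip('/')}/log",
        data=turtle.encode("utf-8"),
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


request = build_log_request("http://127.0.0.1:5000", EXAMPLE_TTL)
print(request.get_method(), request.full_url)  # POST http://127.0.0.1:5000/log
# urllib stores header names capitalized, hence "Content-type" here.
print(request.get_header("Content-type"))  # text/turtle

# To actually send it to a running SEGB:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```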
---

### 🔹 `GET /get_graph`
**Description:**  
Retrieves the stored **JSON-LD** data, processes it, and returns it in **Turtle (TTL)** format.

#### ✅ Request
- **URL:** `/get_graph`
- **Method:** `GET`

#### 📤 Responses
| Status Code | Description |
|-------------|-------------|
| `200 OK` | Returns the data in **Turtle (TTL)** format. |
| `404 Not Found` | No data available in the database. |

## 3. Launching the Semantic Ethical Glass Box (SEGB)


Use the docker-compose file available in this repository. This requires access to the image referenced in that file, which involves the following steps:

1. Get a personal access token to enable console login to ghcr.io (follow these instructions: <https://docs.github.com/es/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens>).

>[!CAUTION]
>A *classic personal access token* is preferred, since a *fine-grained access token* may cause problems.

2. In your console, export your token with:

```shell
export CR_PAT=<YOUR_TOKEN>
```

3. Now, log in to ghcr.io with:

```shell
echo $CR_PAT | docker login ghcr.io -u <YOUR_USER_NAME> --password-stdin
```

4. Finally, run Docker Compose in the directory that contains your docker-compose.yaml file:

```shell
docker compose up -d
```

5. The SEGB is now available at `http://127.0.0.1:5000`.

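Right after `docker compose up -d`, the containers may need a moment before the API accepts connections. A possible readiness check is sketched below, assuming the default address above. The `wait_for_port` helper is hypothetical, and it only verifies that the TCP port accepts connections, not that the API itself is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage against the default local deployment:
# if wait_for_port("127.0.0.1", 5000):
#     print("SEGB port is accepting connections")
```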
## 4. Sending data to and retrieving data from the SEGB within the AMOR context

>[!IMPORTANT]
>We strongly recommend that you do **NOT use blank nodes** in any triples you log to the SEGB. They will not break the SEGB, but they can generate duplicated blank nodes in the global graph if the same data is sent to the SEGB several times, due to external limitations.

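One way to act on this recommendation is to check a Turtle document for blank nodes before logging it. The sketch below is a rough heuristic, not a Turtle parser: the hypothetical `has_blank_nodes` helper strips string literals, IRIs and comments, then looks for `_:` labels, `[ ... ]` property lists and `( ... )` collections. It is deliberately conservative, so it also flags the empty collection `()`, which does not actually create a blank node.

```python
import re

# Spans whose content must be ignored when looking for blank-node syntax.
_IGNORED = re.compile(
    r'""".*?"""'            # long double-quoted literal
    r"|'''.*?'''"           # long single-quoted literal
    r'|"(?:\\.|[^"\\])*"'   # short double-quoted literal
    r"|'(?:\\.|[^'\\])*'"   # short single-quoted literal
    r"|<[^<>\s]*>"          # IRI reference
    r"|#[^\n]*",            # comment up to the end of the line
    re.DOTALL,
)

# "_:" not preceded by a name character, or an opening "[" or "(".
_BLANK = re.compile(r"(?<![\w.\-])_:|[\[(]")


def has_blank_nodes(turtle: str) -> bool:
    """Heuristically report whether a Turtle document uses blank nodes."""
    stripped = _IGNORED.sub(" ", turtle)
    return _BLANK.search(stripped) is not None


# Placeholder data in the http://example.org/ namespace (an assumption).
with_blank = (
    "@prefix ex: <http://example.org/> .\n"
    'ex:robot1 ex:saw [ ex:label "person" ] .\n'
)
named_only = (
    "@prefix ex: <http://example.org/> .\n"
    "ex:robot1 ex:saw ex:person1 .\n"
    'ex:person1 ex:label "person [unknown]" .\n'
)
print(has_blank_nodes(with_blank))  # True
print(has_blank_nodes(named_only))  # False
```

The second document shows the usual alternative: give the node an explicit IRI (`ex:person1`) instead of leaving it anonymous.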

We provide a *Python* script, [segb_tutorial.py](./segb_tutorial.py), which walks through an SEGB use case within the AMOR context.

It first defines two functions, both of which include console *logs* and some error-checking logic and are documented with *docstrings*:

- ***log_ttl***: receives as *input* the server's URL and the TTL file path and makes a POST request to the SEGB.

- ***get_graph***: receives as *input* the server's URL and the output TTL file path and makes a GET request to the SEGB.


The *script* implements the use case as follows:

1. First, the server's URL and the paths of the TTL files containing the ontologies used (stored in the [example-data](./example-data/) directory of this repo) are defined:

```python
server = "http://127.0.0.1:5000"

models = [
    "example-data/amor.ttl",
    "example-data/mft.ttl",
    "example-data/bhv.ttl",
    "example-data/amor-mft.ttl",
    "example-data/amor-bhv.ttl"
]
```

2. Next, the *log_ttl* and *get_graph* functions are defined:

```python
import requests


def log_ttl(server: str, input_file_path: str):
    """Log a TTL file to the SEGB.

    Reads a Turtle (TTL) file from the specified path and sends its content
    to the SEGB's `/log` endpoint via a POST request.

    Args:
        server (str): The base URL of the SEGB server (e.g., "http://127.0.0.1:5000").
        input_file_path (str): The path to the TTL file to be logged.

    Example:
        >>> log_ttl("http://127.0.0.1:5000", "/path/to/file/data.ttl")
    """
    # Read the whole Turtle document into memory.
    with open(input_file_path, mode="r", encoding="utf-8") as file:
        data = file.read()
        print("File successfully read from:", input_file_path)

    # The SEGB expects the body to be declared as Turtle.
    headers = {
        "Content-Type": "text/turtle"
    }

    response = requests.post(f"{server}/log", headers=headers, data=data)

    if response.status_code == 200:
        print("POST request completed successfully")
    else:
        print(f"Error in POST: {response.status_code} - {response.text}")
```
```python
def get_graph(server: str, output_file_path: str):
    """Download the complete graph stored in the SEGB.

    Sends a GET request to the SEGB's `/get_graph` endpoint to retrieve the
    complete graph in Turtle format and saves it to the specified output file.

    Args:
        server (str): The base URL of the SEGB server (e.g., "http://127.0.0.1:5000").
        output_file_path (str): The path where the downloaded graph will be saved.

    Example:
        >>> get_graph("http://127.0.0.1:5000", "/path/to/output/graph.ttl")
    """
    print("Requesting graph...")

    response = requests.get(f"{server}/get_graph")

    if response.status_code == 200:
        # Save the Turtle serialization returned by the server.
        with open(output_file_path, mode="w", encoding="utf-8") as file:
            file.write(response.text)
        print("File successfully downloaded to:", output_file_path)
    else:
        print(f"Error in GET: {response.status_code} - {response.text}")
```

3. The ontology TTL files are added to the SEGB's global graph via the *log_ttl* function:

```python
for model in models:
    log_ttl(server, model)
```

```text
File successfully read from: example-data/amor.ttl
POST request completed successfully
File successfully read from: example-data/mft.ttl
POST request completed successfully
File successfully read from: example-data/bhv.ttl
POST request completed successfully
File successfully read from: example-data/amor-mft.ttl
POST request completed successfully
File successfully read from: example-data/amor-bhv.ttl
POST request completed successfully
```


4. A new ontology can be uploaded whenever needed; here we upload one populated with many individuals:

```python
input_ttl_file = "example-data/amor-examples.ttl"
log_ttl(server, input_ttl_file)
```

```text
File successfully read from: example-data/amor-examples.ttl
POST request completed successfully
```


5. In the same way, we can upload non-ontology TTL files (ideally logs from different systems, e.g., social robots, converted to TTL):

```python
input_ttl_file = "example-data/new-triples.ttl"
log_ttl(server, input_ttl_file)
```

```text
File successfully read from: example-data/new-triples.ttl
POST request completed successfully
```


6. Finally, we retrieve the global graph in TTL format, which includes all the ontologies and logs uploaded previously, using the *get_graph* function:

```python
output_ttl_file = "graph.ttl"
get_graph(server, output_ttl_file)
```

```text
Requesting graph...
File successfully downloaded to: graph.ttl
```

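As a quick sanity check on the downloaded file, the namespaces declared in the Turtle output can be listed. The sketch below is regex-based and only reads prefix declarations; it is not a Turtle parser, and an RDF library would be the better choice for inspecting the triples themselves. The `declared_prefixes` helper and the sample namespaces are hypothetical.

```python
import re
from typing import Dict

# Matches Turtle "@prefix ex: <iri> ." and SPARQL-style "PREFIX ex: <iri>".
_PREFIX = re.compile(
    r"^\s*(?:@prefix|PREFIX)\s+([^\s:]*):\s*<([^>]*)>",
    re.MULTILINE,
)


def declared_prefixes(turtle: str) -> Dict[str, str]:
    """Map each declared prefix label to its namespace IRI."""
    return {label: iri for label, iri in _PREFIX.findall(turtle)}


# Placeholder document; real input would be the content of graph.ttl, e.g.:
# with open("graph.ttl", encoding="utf-8") as file:
#     sample = file.read()
sample = """\
@prefix ex: <http://example.org/> .
@prefix log: <http://example.org/log#> .

ex:robot1 log:performed ex:greeting1 .
"""
print(declared_prefixes(sample))
```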