<div style="background-color:  #663066; padding: 20px;">
   
   
</div>

<div style="display: flex; align-items: center; justify-content: space-between;">
    <div>
        <h1><strong>MIM 2 Data Spaces Graduation Project</strong></h1>
        <h4>by Maxwell Ernst - 18/06/2024</h4>
    </div>
    <div>
        <img src="fontyslogo.png" alt="Fontys Logo" style="height: 80px; margin-left: 20px;">
    </div>
</div>

## **MIM2 Process Tutorial**


This Jupyter Notebook serves as a tutorial for creating a Minimal Interoperability Mechanism, MIM2 (data models and sharing), for data spaces, using mock data that resembles real-world sensor data. The steps outlined here are designed to be generalizable and can be adapted for various data sources and MIM2 development purposes.

# 1) General Steps for creating a MIM

The general process for creating a MIM can be outlined in the following steps:

1) Read the Data Spaces Summary Document to gain a good understanding of Data Spaces.
2) Determine the domain you are working in, for example Mobility, or Smart and Sustainable Cities and Communities.
3) Define the requirements for what needs to be developed in order to identify which MIM to create, using the building blocks outlined in Figure 1.
4) Search for available standards for that MIM from technical and governance standpoints.
5) Develop the necessary MIM(s).

![Building Blocks taxonomy](MIMsOverview.png)

Figure 1: Building Blocks taxonomy recommended by OpenDEI and adapted by the DSBA Technical.

# 2) Steps for creating MIM2

## 2.1) What is MIM2?


MIM2, or "Shared Data Models," ensures that datasets use the same definitions for key terms, which is crucial for accurate data linking. For instance, if one dataset defines "children" as ages 5-15 and another defines them as ages 2-12, merging these datasets would create inaccuracies.

Data models are machine-readable definitions of terms, which allow APIs to understand and handle them properly. Consistent data models enable applications to link relevant contextual data with datasets.
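
The "children" example above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout (`term`, `min_age`, `max_age`) and the helper names are hypothetical and not part of any MIM2 specification.

```python
# Hypothetical machine-readable definitions of "children" used by two datasets.
DATASET_A = {"term": "children", "min_age": 5, "max_age": 15}
DATASET_B = {"term": "children", "min_age": 2, "max_age": 12}


def is_child(age: int, definition: dict) -> bool:
    """Return True if `age` falls inside the definition's inclusive age range."""
    return definition["min_age"] <= age <= definition["max_age"]


def definitions_match(a: dict, b: dict) -> bool:
    """Two datasets can be linked safely only if their definitions are identical."""
    return a == b


# The same people are counted differently depending on the definition applied.
ages = [3, 4, 8, 14]
count_a = sum(is_child(age, DATASET_A) for age in ages)
count_b = sum(is_child(age, DATASET_B) for age in ages)

print(definitions_match(DATASET_A, DATASET_B))
print(count_a, count_b)
```

Because the two definitions differ, the same four ages yield different "children" counts, which is the kind of inaccuracy a shared data model is meant to prevent.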

## 2.2) Why are shared data-models important?

A common set of data models creates a shared language, allowing systems to communicate effectively. Well-defined data models help cities to integrate and open up data across different solutions and support various applications. Harmonized data models can be reused, facilitating data sharing and learning among cities.

## 2.3) EU Policy Context


Sharing data between different agencies within a city or between cities requires a common way of defining entities. For example, consistent definitions for terms like "bus" or "taxi" are essential. Without common data models, each agency would need to create their own, making data sharing difficult and inefficient.

Common data models support benchmarking and shared learning, reducing the effort required to define data sets.

## 2.4) Requirements for Compliance

Entities described by data in the ecosystem should use consistent data models based on:

- Resource Description Framework (RDF)
- Resource Description Framework Schema (RDFS)
- Web Ontology Language (OWL)

For spatial and spatio-temporal data, consider the provisions of MIM-7 (Places) regarding data encoding.

## 2.5) Recommended Specifications

Using NGSI-LD compliant data models is the preferred option for smart city aspects. These data models have been defined by organizations and projects, including OASC, FIWARE, GSMA, and the SynchroniCity project. There is ongoing collaboration between OASC, TM Forum, and FIWARE to specify more models through the Smart Data Models initiative.

Alternatively, existing data models and ontologies can be adapted for use with NGSI-LD by identifying entities, properties, and relationships that can be managed by the NGSI-LD API. Some examples include:

- oneM2M base ontology (compatible with SAREF), which provides semantic descriptions of data through metadata
- SAREF: Smart Appliances REFerence ontology, with SAREF4Cities focused on smart cities
- Core vocabularies of ISA, such as the Core Public Service Vocabulary Application Profile, used for the Single Digital Gateway Regulation
- Digital Twin Definition Language (DTDL) developed by Microsoft, based on JSON-LD, with existing FIWARE data models converted to this format
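
To make the idea of "entities, properties, and relationships" more concrete, the following is a minimal sketch of how a sensor could be expressed as an NGSI-LD style entity in Python. The entity type, the attribute names (`temperature`, `humidity`, `locatedIn`), the URNs, and the helper functions are assumptions made for illustration; they are not taken from a specific Smart Data Model. The `@context` URL is assumed to be the NGSI-LD core context published by ETSI.

```python
import json

# Assumed to be the NGSI-LD core @context published by ETSI.
NGSI_LD_CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def make_property(value, observed_at=None, unit_code=None):
    """Build an NGSI-LD Property; optional metadata is added only when given."""
    prop = {"type": "Property", "value": value}
    if observed_at is not None:
        prop["observedAt"] = observed_at
    if unit_code is not None:
        prop["unitCode"] = unit_code
    return prop


def make_relationship(target_urn):
    """Build an NGSI-LD Relationship pointing at another entity's URN."""
    return {"type": "Relationship", "object": target_urn}


# Hypothetical entity: type, attribute names, and URNs are illustrative only.
entity = {
    "id": "urn:ngsi-ld:Device:sensor_1",
    "type": "Device",
    "temperature": make_property(22.5, "2024-05-30T10:00:00Z", "CEL"),
    "humidity": make_property(55, "2024-05-30T10:00:00Z", "P1"),
    "locatedIn": make_relationship("urn:ngsi-ld:Building:building_1"),
    "@context": [NGSI_LD_CORE_CONTEXT],
}

print(json.dumps(entity, indent=2))
```

Adapting an existing ontology would then largely be a matter of deciding which of its terms become the entity `type`, which become properties, and which become relationships to other entities.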

## 2.6) Relevant European References and Specifications

As part of ongoing work related to MIM2, support for the Smart Data Models Initiative aims to:

- Develop guidelines and a catalogue of minimum common data models in different sectors for interoperability
- Create harmonized representation formats and semantics for applications to consume and publish data
- Develop data models for interoperable and replicable smart solutions across sectors, starting with smart cities and extending to smart agri-food, smart utilities, smart industry, etc.
- Establish a methodology to translate between credible initiatives developing data models
- Provide guidelines on developing consistent data models
- Expand the catalogue of data models agreed upon by OASC cities as common models for use

# 3) MIM2 Example using mock data

## 3.1) Prerequisites

Before following this tutorial, ensure you have the following:

- Python 3.x: Download and install Python from https://www.python.org/downloads/.
- Jupyter Notebook: Install Jupyter Notebook using `pip install jupyter` in your terminal.
- Pandas library: Install Pandas using `pip install pandas` in your terminal.
- Familiarity with MIM concepts: Basic understanding of MIMs and data spaces is recommended.

## 3.2) Data Source

This tutorial utilizes mock data that simulates real-world sensor data. You can replace this with your actual data source during implementation. The mock data will have a similar structure to sensor readings, including:

- timestamp: The time the data was collected.
- sensor_id: Unique identifier for the sensor.
- temperature: The recorded temperature value.
- humidity: The recorded humidity value.

## 3.4) ETL Process

- Extract: In a real scenario, you'd extract data from its source (databases, APIs, etc.). Here, we'll create some mock data to demonstrate the process.

```python
import pandas as pd

# Create mock data
data = {
    "timestamp": pd.to_datetime(["2024-05-30 10:00:00", "2024-05-30 11:00:00", "2024-05-30 12:00:00"]),
    "sensor_id": ["sensor_1", "sensor_2", "sensor_1"],
    "temperature": [22.5, 23.2, 21.8],
    "humidity": [55, 60, 52]
}

# Create a Pandas DataFrame from the dictionary
df = pd.DataFrame(data)

# Display the first few rows of the data
print(df.head())
```

```
            timestamp sensor_id  temperature  humidity
0 2024-05-30 10:00:00  sensor_1         22.5        55
1 2024-05-30 11:00:00  sensor_2         23.2        60
2 2024-05-30 12:00:00  sensor_1         21.8        52
```




- Transform: This stage involves cleaning, formatting, and manipulating the data to conform to the chosen data model structure. In this example, the data is already relatively clean. However, you might need to handle missing values, convert data types, or create new features depending on your specific data source.

- Load: The transformed data is loaded into a suitable format for further processing. In MIM2 creation, this might involve storing the data in a format compatible with your data space platform.
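
The Transform and Load stages described above could look roughly like the sketch below. It assumes a slightly messier variant of the mock data (a missing temperature and numbers stored as text) so that there is something to clean, and it assumes JSON records are an acceptable load format for your platform. The variable names are illustrative.

```python
import json

import pandas as pd

# Mock readings with typical problems: a missing value and numbers stored as text.
raw = pd.DataFrame({
    "timestamp": ["2024-05-30 10:00:00", "2024-05-30 11:00:00", "2024-05-30 12:00:00"],
    "sensor_id": ["sensor_1", "sensor_2", "sensor_1"],
    "temperature": ["22.5", None, "21.8"],
    "humidity": [55, 60, 52],
})

# Transform: parse timestamps, convert text to numbers, drop incomplete rows.
clean = raw.copy()
clean["timestamp"] = pd.to_datetime(clean["timestamp"])
clean["temperature"] = pd.to_numeric(clean["temperature"], errors="coerce")
clean = clean.dropna(subset=["temperature"]).reset_index(drop=True)

# Load: serialize to a JSON string; date_format="iso" writes ISO 8601 timestamps.
json_records = clean.to_json(orient="records", date_format="iso")

# Parse the string back to confirm it is valid JSON with one object per row.
records = json.loads(json_records)
print(len(records))
print(records[0])
```

Dropping incomplete rows is only one option; depending on the data source, filling or flagging missing values may be more appropriate.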

## 3.6) Data Model Selection

- Choose a standardized data model for representing your data within the MIM2. This tutorial uses Smart Data Models by FIWARE as an example. Select the most appropriate model(s) that align with your data content and adhere to MIM2 specifications.

## 3.7) Data Mapping

- Map the transformed data elements to the corresponding entities and attributes defined in the chosen data model. Here's an example mapping for our mock data:

```python
# Map data elements to Smart Data Model attributes
mapped_data = df.rename(columns={
    "timestamp": "observedAt",
    "sensor_id": "deviceID",
    "temperature": "temperatureValue",
    "humidity": "humidityValue"
})

# Display the mapped data
print(mapped_data)
```

```
           observedAt  deviceID  temperatureValue  humidityValue
0 2024-05-30 10:00:00  sensor_1              22.5             55
1 2024-05-30 11:00:00  sensor_2              23.2             60
2 2024-05-30 12:00:00  sensor_1              21.8             52
```


## 3.8) JSON Schema and Input
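
As a rough sketch of what this step might contain, a JSON Schema for a single mapped reading and a matching input record could look like the following. The field names follow the mapping used in this tutorial; the schema title, the constraints (such as the 0 to 100 range for humidity), and the sample values are assumptions for illustration.

```python
import json

# Hypothetical JSON Schema (draft 2020-12 keywords) for one mapped sensor reading.
reading_schema = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "SensorReading",
    "type": "object",
    "properties": {
        "observedAt": {"type": "string", "format": "date-time"},
        "deviceID": {"type": "string"},
        "temperatureValue": {"type": "number"},
        "humidityValue": {"type": "number", "minimum": 0, "maximum": 100},
    },
    "required": ["observedAt", "deviceID", "temperatureValue", "humidityValue"],
    "additionalProperties": False,
}

# Example input record that is intended to conform to the schema above.
reading_input = {
    "observedAt": "2024-05-30T10:00:00Z",
    "deviceID": "sensor_1",
    "temperatureValue": 22.5,
    "humidityValue": 55,
}

print(json.dumps(reading_schema, indent=2))
print(json.dumps(reading_input, indent=2))
```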

## 3.9) JSON Validation Test
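
A validation test could then check records against the schema before they are shared. The sketch below uses only the standard library and a hand-written `validate` helper that covers just the `required`, `type`, `minimum`, and `maximum` keywords; it is a simplified stand-in, not a full JSON Schema implementation. In practice a dedicated validator such as the `jsonschema` package would likely be a better fit. The schema and records are hypothetical.

```python
# Hypothetical, trimmed-down schema for one mapped sensor reading.
SCHEMA = {
    "type": "object",
    "properties": {
        "observedAt": {"type": "string"},
        "deviceID": {"type": "string"},
        "temperatureValue": {"type": "number"},
        "humidityValue": {"type": "number", "minimum": 0, "maximum": 100},
    },
    "required": ["observedAt", "deviceID", "temperatureValue", "humidityValue"],
}

# Python types accepted for each JSON Schema type used above.
TYPE_MAP = {"string": str, "number": (int, float)}


def validate(record: dict, schema: dict) -> list:
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []
    for name in schema["required"]:
        if name not in record:
            errors.append(f"missing required field: {name}")
    for name, rules in schema["properties"].items():
        if name not in record:
            continue
        value = record[name]
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, TYPE_MAP[rules["type"]]):
            errors.append(f"{name}: expected {rules['type']}")
            continue
        if "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{name}: below minimum {rules['minimum']}")
        if "maximum" in rules and value > rules["maximum"]:
            errors.append(f"{name}: above maximum {rules['maximum']}")
    return errors


valid_record = {
    "observedAt": "2024-05-30T10:00:00Z",
    "deviceID": "sensor_1",
    "temperatureValue": 22.5,
    "humidityValue": 55,
}
# Missing temperatureValue and an out-of-range humidityValue.
invalid_record = {
    "observedAt": "2024-05-30T11:00:00Z",
    "deviceID": "sensor_2",
    "humidityValue": 140,
}

print(validate(valid_record, SCHEMA))
print(validate(invalid_record, SCHEMA))
```

Returning a list of messages rather than raising on the first problem makes it easier to report every issue in a record at once.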

### Additional Notes:

- This tutorial provides a foundational framework. Depending on the complexity of your MIM2 and data source, additional steps or considerations might be necessary.
- Explore resources and documentation provided by FIWARE for in-depth guidance on using Smart Data Models and MIM2 development best practices.

### Next Steps:

This tutorial has provided a basic understanding of the MIM2 creation process. Feel free to modify and adapt these steps to your specific data source and MIM2 development goals. Remember to consult relevant FIWARE documentation for further technical details and best practices.

# Bibliography/Sources