# Variables

## Import libraries

In [None]:
import os
import sys
from rdflib import URIRef, Literal

### File and Folder Definitions

This section defines the files and folders used in the process. They are divided into four categories:

1. **Existing files**: These are pre-existing files that are part of the project and are used as input or reference files during the process.
   - `ont_file_name`: The ontology file in Turtle format (`ontology.ttl`).
   - `ruleset_file_name`: The ruleset file in the PIE format (`rules.pie`).

2. **Created files during the process**: These are files generated during the execution of the process, and they are stored in the `tmp_folder`.
   - `export_file_name`: Temporary exported data file in Turtle format (`addresses-temp.ttl`).
   - `out_file_name`: Final output data file in Turtle format (`addresses.ttl`).
   - `local_config_file_name`: Configuration file for the repository in Turtle format (`config_repo.ttl`).
   - `facts_ttl_file_name`: File containing facts data in Turtle format (`facts_data.ttl`).
   - `implicit_to_facts_ttl_file_name`: File containing implicit facts data in Turtle format (`implicit_to_facts.ttl`).

3. **Existing folders**: These are folders that already exist and store files used in the project.
   - `data_folder_name`: The folder containing the data files (`../data`).

4. **Created folder during the process**: This is the folder created during the process to store temporary files.
   - `tmp_folder_name`: Folder for temporary files during the process (`../tmp_files`).

### GraphDB Repository Name
- `facts_repository_name`: The name of the repository in GraphDB where the data is stored (`addresses_from_factoids`).

### Named Graphs Definitions
These are the names of the named graphs in the GraphDB repository.
- `ontology_named_graph_name`: The named graph for the ontology (`ontology`).
- `facts_named_graph_name`: The named graph for the facts data (`facts`).
- `factoids_named_graph_name`: The named graph for the factoids data (`factoids`).
- `permanent_named_graph_name`: The named graph for permanent data (`permanent`).
- `tmp_named_graph_name`: The named graph for temporary data (`temporary`).
- `inter_sources_name_graph_name`: The named graph for inter-sources data (`inter_sources`).

### URIs to Access GraphDB
- `str_graphdb_url`: The URL to access the local GraphDB instance (`http://localhost:7200`).

### Code Folder Path
- `py_code_folder_path`: The folder containing the Python code (`./code`).

These variables are used throughout the process to refer to different files, folders, and named graphs in the GraphDB repository. They allow for a modular and flexible approach to handling the data and configuring the process steps.


In [None]:
# Existing files
ont_file_name = "ontology.ttl"
ruleset_file_name = "rules.pie"

# Created files during process (in `tmp_folder`)
export_file_name = "addresses-temp.ttl"
out_file_name = "addresses.ttl"
local_config_file_name = "config_repo.ttl"
facts_ttl_file_name = "facts_data.ttl"
implicit_to_facts_ttl_file_name = "implicit_to_facts.ttl"

# Existing folders
data_folder_name = "../data"

# Created folder during process
tmp_folder_name = "../tmp_files"

# GraphDB repository name
facts_repository_name = "addresses_from_factoids"

# Names of the named graphs
ontology_named_graph_name = "ontology"
facts_named_graph_name = "facts"
factoids_named_graph_name = "factoids"
permanent_named_graph_name = "permanent"
tmp_named_graph_name = "temporary"
inter_sources_name_graph_name = "inter_sources"

# URL to access GraphDB
str_graphdb_url = "http://localhost:7200"

py_code_folder_path = "./code"

## Processing Global Variables

In this section, we define and process various global variables related to file paths and configurations used throughout the process. This includes:

- **Obtaining absolute file paths**: Converts the relative file paths defined in the previous section into absolute paths. Relative paths are resolved against the current working directory at the time of the call, so the resulting paths stay valid even if the working directory changes later.
    - `tmp_folder`: The absolute path for the temporary folder used to store intermediate files (`tmp_folder_name`).
    - `data_folder`: The absolute path for the folder containing the data files (`data_folder_name`).
    - `python_code_folder`: The absolute path for the folder containing the Python code (`py_code_folder_path`).
    - `local_config_file`: The absolute path for the local configuration file (`local_config_file_name`), located in the temporary folder.
    - `ont_file`: The absolute path for the ontology file (`ont_file_name`).
    - `ruleset_file`: The absolute path for the ruleset file (`ruleset_file_name`).
    - `facts_ttl_file`: The absolute path for the facts data file (`facts_ttl_file_name`), located in the temporary folder.
    - `implicit_to_facts_ttl_file`: The absolute path for the file containing implicit facts data (`implicit_to_facts_ttl_file_name`), located in the temporary folder.

- **Creating a temporary folder**: If the folder specified by `tmp_folder_name` does not already exist, the program will create it to store files that are intended to be deleted after processing.

- **Creating an RDFLib object for `graphdb_url`**: This step converts the string representing the GraphDB URL (`str_graphdb_url`) into an RDFLib `URIRef` object. This object can be used in RDF queries and updates to interact with the GraphDB instance.
    - `graphdb_url`: An RDFLib `URIRef` representing the GraphDB URL (`str_graphdb_url`).

These steps help to set up the working environment by ensuring that the necessary paths and configurations are ready for the process to begin.
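As a minimal sketch of the path handling described above, the hypothetical helper `resolve_in_folder` (it is not part of the project's modules) joins a file name to a folder and returns an absolute path. It only illustrates `os.path.abspath` and `os.path.join`, using the folder and file names defined earlier.

```python
import os


def resolve_in_folder(folder_name, file_name):
    """Return the absolute path of `file_name` inside `folder_name` (hypothetical helper)."""
    # abspath resolves a relative folder against the current working directory
    folder = os.path.abspath(folder_name)
    return os.path.join(folder, file_name)


facts_path = resolve_in_folder("../tmp_files", "facts_data.ttl")
print(os.path.isabs(facts_path))      # True
print(os.path.basename(facts_path))   # facts_data.ttl
```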


In [None]:
tmp_folder = os.path.abspath(tmp_folder_name)
data_folder = os.path.abspath(data_folder_name)

python_code_folder = os.path.abspath(py_code_folder_path)

local_config_file = os.path.join(tmp_folder, local_config_file_name)
ont_file = os.path.abspath(ont_file_name)
ruleset_file = os.path.abspath(ruleset_file_name)
facts_ttl_file = os.path.join(tmp_folder, facts_ttl_file_name)
implicit_to_facts_ttl_file = os.path.join(tmp_folder, implicit_to_facts_ttl_file_name)

graphdb_url = URIRef(str_graphdb_url)

## Importing Python Modules

This section imports various Python modules used throughout the project. The following modules are imported from the `code` folder:

- **file_management**: Provides functions for managing files, such as reading, writing, and manipulating file paths.
- **graphdb**: Handles interactions with GraphDB, including querying, updating, and managing named graphs.
- **graphrdf**: Likely used for handling RDF (Resource Description Framework) data and operations on graphs.
- **attribute_version_comparisons**: Facilitates the comparison of different versions of attributes in the RDF data.
- **multi_sources_processing**: Contains methods for processing data from multiple sources, likely aggregating or reconciling them.
- **factoids_creation**: Includes functions for creating factoids, which are small pieces of information or data extracted or derived from sources.
- **time_processing**: Likely used for managing and processing time-based data or timestamps.
- **resource_transfert**: Handles the transfer of resources, possibly related to moving or copying data between systems or storage locations.
- **evolution_construction**: Likely deals with constructing or evolving data models or entities over time.

These modules provide the necessary functionality for managing files, interacting with a graph database, processing data, and performing other domain-specific tasks related to the project.


In [None]:
# Add the `code` folder, which contains the Python modules, to the import path
sys.path.insert(1, python_code_folder)

import file_management as fm
import graphdb as gd
import graphrdf as gr
import attribute_version_comparisons as avc
import multi_sources_processing as msp
import factoids_creation as fc
import time_processing as tp
import resource_transfert as rt
import evolution_construction as ec

## Creation of folders if they don't exist

In [None]:
fm.create_folder_if_not_exists(tmp_folder)
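The implementation of `fm.create_folder_if_not_exists` is not shown in this notebook. A minimal sketch of what such a helper could look like, assuming it only needs to create the folder and any missing parents, is given here with the standard library:

```python
import os
import tempfile


def create_folder_if_not_exists(folder):
    """Create `folder` (and missing parents) unless it already exists."""
    # exist_ok=True makes the call a no-op when the folder is already there
    os.makedirs(folder, exist_ok=True)


# Demonstration inside a throwaway directory
with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "tmp_files")
    create_folder_if_not_exists(target)
    create_folder_if_not_exists(target)  # second call changes nothing
    created = os.path.isdir(target)

print(created)  # True
```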

### Creating the local repository in GraphDB
For the creation to work, GraphDB must be running, so the URL given by `graphdb_url` must be reachable. If the repository already exists, it is reinitialized according to the options described in this section.

### Options

- **`allow_removal`**: If set to `False`, the repository is not removed during reinitialization; it is simply emptied. This is useful when deleting the repository fails, because the repository is cleared without being deleted and recreated.
- **`disable_same_as`**: When set to `True`, this option disables the use of `sameAs` in the reasoning process.

### Creation Process

The function `gd.reinitialize_repository` is used to reinitialize the repository. When called, it ensures the repository is set up according to the provided configurations (e.g., `local_config_file`, `ruleset_name`). If the repository already exists, it is reinitialized without removing it (if `allow_removal` is set to `False`).

- **`graphdb_url`**: The URL pointing to the running GraphDB instance.
- **`facts_repository_name`**: The name of the repository to be reinitialized.
- **`local_config_file`**: The configuration file used to initialize the repository.
- **`ruleset_name`**: The name of the ruleset to be used for reasoning, such as `"owl2-rl-optimized"`.
- **`allow_removal`**: Controls whether the repository can be deleted and recreated (`False` will just empty it).

For the creation of the repository to work, GraphDB must be running, and the provided `graphdb_url` must be valid. If the repository already exists, it will be reinitialized without further action.
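Since the step above requires a running GraphDB instance, it can help to test reachability first. The sketch below is an assumption, not part of the project's `graphdb` module: `is_graphdb_reachable` is a hypothetical helper, and it assumes the instance exposes the RDF4J-style `/repositories` endpoint under the base URL.

```python
import urllib.error
import urllib.request


def is_graphdb_reachable(base_url, timeout=2.0):
    """Return True if an HTTP request to `<base_url>/repositories` succeeds."""
    # Assumption: the server lists its repositories at this endpoint
    endpoint = base_url.rstrip("/") + "/repositories"
    try:
        with urllib.request.urlopen(endpoint, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or HTTP error status
        return False
```

Calling `is_graphdb_reachable(str_graphdb_url)` before `gd.reinitialize_repository` would give an early, explicit signal when GraphDB has not been started.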


In [None]:
# Deleting a repository may fail, so to avoid the deletion during reinitialization (deletion + (re)creation),
# `allow_removal` must be False; in that case the repository is simply emptied.
allow_removal = False
disable_same_as = False

gd.reinitialize_repository(graphdb_url, facts_repository_name, local_config_file, ruleset_name="owl2-rl-optimized", disable_same_as=disable_same_as, allow_removal=allow_removal)
# gd.reinitialize_repository(graphdb_url, facts_repository_name, local_config_file, ruleset_file=ruleset_file, disable_same_as=disable_same_as, allow_removal=allow_removal)

## Local repository management

## Importing ontologies

In [None]:
gd.load_ontologies(graphdb_url, facts_repository_name, [ont_file], ontology_named_graph_name)

## Definition of variables linked to sources

### Paris thoroughfares via Wikidata

* `wd` for "wikidata"
* `wdp_land` for "wikidata paris landmarks"
* `wdp_loc` for "wikidata paris locations"

In [None]:
# Name of the repository where the factoid triples of Wikidata data are stored and constructed
wd_repository_name = "factoids_wikidata"

# CSV file to store the result of the selection query
wdp_land_csv_file_name = "wd_paris_landmarks.csv"
wdp_land_csv_file = os.path.join(data_folder, wdp_land_csv_file_name)

# CSV file to store the result of the selection query
wdp_loc_csv_file_name = "wd_paris_locations.csv"
wdp_loc_csv_file = os.path.join(data_folder, wdp_loc_csv_file_name)

# TTL file for structuring knowledge of the Paris thoroughfares
wdp_kg_file_name = "wd_paris.ttl"
wdp_kg_file = os.path.join(tmp_folder, wdp_kg_file_name)

# Final TTL files for Wikidata factoids
wdp_factoids_kg_file_name = "wd_paris_factoids.ttl"
wdp_factoids_kg_file = os.path.join(tmp_folder, wdp_factoids_kg_file_name)
wdp_permanent_kg_file_name = "wd_paris_permanent.ttl"
wdp_permanent_kg_file = os.path.join(tmp_folder, wdp_permanent_kg_file_name)

# Time interval of validity of the source (there is no end time)
wdp_valid_time = {
    "start_time" : {"stamp":"2024-08-26T00:00:00Z","precision":"day","calendar":"gregorian"}
    }

### Nomenclature of Paris thoroughfares (Ville de Paris data)

The City of Paris data is made up of two sets:
* [names of current street rights-of-way](https://opendata.paris.fr/explore/dataset/denominations-emprises-voies-actuelles)
* [obsolete street names](https://opendata.paris.fr/explore/dataset/denominations-des-voies-caduques)

Current roads have a geometric right of way, unlike the old thoroughfares.

* `vpt` for "ville paris thoroughfares"
* `vpta` for "ville paris thoroughfares actuelles"
* `vptc` for "ville paris thoroughfares caduques"

In [None]:
# Name of the repository where the factoid triples of Ville de Paris data are stored and constructed
vpt_repository_name = "factoids_ville_de_paris"

# CSV files containing data
vpta_csv_file_name = "denominations-emprises-voies-actuelles.csv"
vpta_csv_file = os.path.join(data_folder, vpta_csv_file_name)
vptc_csv_file_name = "denominations-des-voies-caduques.csv"
vptc_csv_file = os.path.join(data_folder, vptc_csv_file_name)

# TTL file for structuring knowledge of the Paris thoroughfares
vpt_kg_file_name = "voies_paris.ttl"
vpt_kg_file = os.path.join(tmp_folder, vpt_kg_file_name)

# Final TTL files for Ville de Paris factoids
vpt_factoids_kg_file_name = "vpt_factoids.ttl"
vpt_factoids_kg_file = os.path.join(tmp_folder, vpt_factoids_kg_file_name)
vpt_permanent_kg_file_name = "vpt_permanent.ttl"
vpt_permanent_kg_file = os.path.join(tmp_folder, vpt_permanent_kg_file_name)

# Time interval of validity of the source (the end time is the current timestamp)
vpt_valid_time = {
    "start_time" : {"stamp":"2024-02-10T00:00:00Z","precision":"day","calendar":"gregorian"},
    "end_time" : {"stamp":tp.get_current_timestamp(),"precision":"day","calendar":"gregorian"}
    }

### Base Adresse Nationale (BAN)

Data from the [Base Adresse Nationale (BAN)](https://adresse.data.gouv.fr/base-adresse-nationale) (National Address Base), available [here](https://adresse.data.gouv.fr/data/ban/adresses/latest/csv).

* `bpa` for "BAN paris addresses"

In [None]:
# Name of the repository where the factoid triples of BAN data are stored and constructed
bpa_repository_name = "factoids_ban"

# CSV file containing data
bpa_csv_file_name = "ban_adresses.csv"
bpa_csv_file = os.path.join(data_folder, bpa_csv_file_name)

# TTL file for structuring knowledge of Paris addresses
bpa_kg_file_name = "ban_adresses.ttl"
bpa_kg_file = os.path.join(tmp_folder, bpa_kg_file_name)

# Final TTL file for BAN factoids
bpa_factoids_kg_file_name = "ban_factoids.ttl"
bpa_factoids_kg_file = os.path.join(tmp_folder, bpa_factoids_kg_file_name)
bpa_permanent_kg_file_name = "ban_permanent.ttl"
bpa_permanent_kg_file = os.path.join(tmp_folder, bpa_permanent_kg_file_name)

# Time interval of validity of the source (the end time is the current timestamp)
bpa_valid_time = {
    "start_time" : {"stamp":"2024-01-01T00:00:00Z","precision":"day","calendar":"gregorian"},
    "end_time" : {"stamp":tp.get_current_timestamp(),"precision":"day","calendar":"gregorian"}
    }

### OpenStreetMap (OSM)

Extracting data from OpenStreetMap

In [None]:
# Name of the repository where the factoid triples of OSM data are stored and constructed
osm_repository_name = "factoids_osm"

# CSV files containing data
osm_csv_file_name = "osm_adresses.csv"
osm_csv_file = os.path.join(data_folder, osm_csv_file_name)
osm_hn_csv_file_name = "osm_hn_adresses.csv"
osm_hn_csv_file = os.path.join(data_folder, osm_hn_csv_file_name)

# TTL file for structuring knowledge of OSM addresses
osm_kg_file_name = "osm_adresses.ttl"
osm_kg_file = os.path.join(tmp_folder, osm_kg_file_name)

# Final TTL files for OSM factoids
osm_factoids_kg_file_name = "osm_factoids.ttl"
osm_factoids_kg_file = os.path.join(tmp_folder, osm_factoids_kg_file_name)
osm_permanent_kg_file_name = "osm_permanent.ttl"
osm_permanent_kg_file = os.path.join(tmp_folder, osm_permanent_kg_file_name)

# Time interval of validity of the source (the end time is the current timestamp)
osm_valid_time = {
    "start_time" : {"stamp":"2024-01-01T00:00:00Z","precision":"day","calendar":"gregorian"},
    "end_time" : {"stamp":tp.get_current_timestamp(),"precision":"day","calendar":"gregorian"}
    }

### Integration of data from Geojson files

These files are derived from the vectorisation of maps of Paris:
* the revised Napoleonic cadastre of 1847;
* Andriveau's plan of 1849;
* the municipal plot plan of 1871;
* the Municipal Atlas map of 1888.

#### Global variables for importing data from Geojson files

In [None]:
lang = "fr"
landmark_type = "Thoroughfare"
geojson_join_property = "name"
tmp_kg_file_name = "tmp_kg.ttl"
tmp_kg_file = os.path.join(tmp_folder, tmp_kg_file_name)

#### Napoleonic cadastre of 1847

In [None]:
# Name of the repository where data factoid triples are stored and constructed
cn_1847_repository_name = "factoids_1847_cadastre_nap"

# GeoJSON file containing data
cn_1847_geojson_file_name = "1847_cadastre_nap.geojson"
cn_1847_geojson_file = os.path.join(data_folder, cn_1847_geojson_file_name)
cn_1847_kg_file_name = "cn_1847_kg.ttl"
cn_1847_kg_file = os.path.join(tmp_folder, cn_1847_kg_file_name)

# Final TTL files of factoids from the revised 1847 Napoleonic cadastre
cn_1847_factoids_kg_file_name = "cn_1847_factoids.ttl"
cn_1847_factoids_kg_file = os.path.join(tmp_folder, cn_1847_factoids_kg_file_name)
cn_1847_permanent_kg_file_name = "cn_1847_permanent.ttl"
cn_1847_permanent_kg_file = os.path.join(tmp_folder, cn_1847_permanent_kg_file_name)

cn_1847_geojson = fm.read_json_file(cn_1847_geojson_file)

# Description of the source within a dictionary
cn_1847_source_desc = {
    "lang" : "fr", 
    "label" : "Cadastre napoléonien de Gentilly de 1847",
    "publisher" : {
        "label": "Empire français"
        }
}

# Time interval of validity of the source
cn_1847_valid_time = {
    "start_time" : {"stamp":"1845-01-01T00:00:00Z","precision":"year","calendar":"gregorian"},
    "end_time" : {"stamp":"1850-01-01T00:00:00Z","precision":"year","calendar":"gregorian"},
}
cn_1847_geojson["source"] = cn_1847_source_desc
cn_1847_geojson["time"] = cn_1847_valid_time

#### Andriveau atlas

In [None]:
# Name of the repository where data factoid triples are stored and constructed
an_1849_repository_name = "factoids_1849_andriveau"

# GeoJSON file containing data
an_1849_geojson_file_name = "1849_andriveau.geojson"
an_1849_geojson_file = os.path.join(data_folder, an_1849_geojson_file_name)
an_1849_kg_file_name = "an_1849_kg.ttl"
an_1849_kg_file = os.path.join(tmp_folder, an_1849_kg_file_name)

# Final TTL files of factoids from the 1849 Andriveau atlas
an_1849_factoids_kg_file_name = "an_1849_factoids.ttl"
an_1849_factoids_kg_file = os.path.join(tmp_folder, an_1849_factoids_kg_file_name)
an_1849_permanent_kg_file_name = "an_1849_permanent.ttl"
an_1849_permanent_kg_file = os.path.join(tmp_folder, an_1849_permanent_kg_file_name)

an_1849_geojson = fm.read_json_file(an_1849_geojson_file)

# Description of the source within a dictionary
an_1849_source_desc = {
    "lang" : "fr", 
    "label" : "Plan d'Andriveau de 1849",
    "publisher" : {
        "label": "Andriveau"
        }
}

# Time interval of validity of the source
an_1849_valid_time = {
    "start_time" : {"stamp":"1847-01-01T00:00:00Z","precision":"year","calendar":"gregorian"},
    "end_time" : {"stamp":"1851-01-01T00:00:00Z","precision":"year","calendar":"gregorian"},
}

an_1849_geojson["source"] = an_1849_source_desc
an_1849_geojson["time"] = an_1849_valid_time

#### 1871 municipal parcel map of Paris

In [None]:
# Name of the repository where data factoid triples are stored and constructed
pm_1871_repository_name = "factoids_1871_plan_parcellaire_mun"

# GeoJSON file containing data
pm_1871_geojson_file_name = "1871_plan_parcellaire_mun.geojson"
pm_1871_geojson_file = os.path.join(data_folder, pm_1871_geojson_file_name)
pm_1871_kg_file_name = "pm_1871_kg.ttl"
pm_1871_kg_file = os.path.join(tmp_folder, pm_1871_kg_file_name)

# Final TTL file of factoids from the 1871 municipal parcel map
pm_1871_factoids_kg_file_name = "pm_1871_factoids.ttl"
pm_1871_factoids_kg_file = os.path.join(tmp_folder, pm_1871_factoids_kg_file_name)
pm_1871_permanent_kg_file_name = "pm_1871_permanent.ttl"
pm_1871_permanent_kg_file = os.path.join(tmp_folder, pm_1871_permanent_kg_file_name)

pm_1871_geojson = fm.read_json_file(pm_1871_geojson_file)

# Description of the source within a dictionary
pm_1871_source_desc = {
    "lang" : "fr",
    "label" : "Plan parcellaire municipal",
    "publisher" : {
        "label": "IIIe République"
        }
}

# Time interval of validity of the source
pm_1871_valid_time = {
    "start_time" : {"stamp":"1870-01-01T00:00:00Z","precision":"year","calendar":"gregorian"},
    "end_time" : {"stamp":"1872-01-01T00:00:00Z","precision":"year","calendar":"gregorian"},
}
pm_1871_geojson["source"] = pm_1871_source_desc
pm_1871_geojson["time"] = pm_1871_valid_time

#### 1888 Municipal Atlas of Paris

In [None]:
# Name of the repository where data factoid triples are stored and constructed
am_1888_repository_name = "factoids_1888_atlas_municipal"

# GeoJSON file containing data
am_1888_geojson_file_name = "1888_atlas_municipal.geojson"
am_1888_geojson_file = os.path.join(data_folder, am_1888_geojson_file_name)
am_1888_kg_file_name = "am_1888_kg.ttl"
am_1888_kg_file = os.path.join(tmp_folder, am_1888_kg_file_name)

# Final TTL file of factoids from the 1888 Municipal Atlas plan
am_1888_factoids_kg_file_name = "am_1888_factoids.ttl"
am_1888_factoids_kg_file = os.path.join(tmp_folder, am_1888_factoids_kg_file_name)
am_1888_permanent_kg_file_name = "am_1888_permanent.ttl"
am_1888_permanent_kg_file = os.path.join(tmp_folder, am_1888_permanent_kg_file_name)

am_1888_geojson = fm.read_json_file(am_1888_geojson_file)

# Description of the source within a dictionary
am_1888_source_desc = {
    "lang" : "fr", 
    "label" : "Plan de l'atlas municipal de 1888",
    "publisher" : {
        "label": "Ville de Paris"
        }
}

# Time interval of validity of the source
am_1888_valid_time = {
    "start_time" : {"stamp":"1887-01-01T00:00:00Z","precision":"year","calendar":"gregorian"},
    "end_time" : {"stamp":"1889-01-01T00:00:00Z","precision":"year","calendar":"gregorian"},
}
am_1888_geojson["source"] = am_1888_source_desc
am_1888_geojson["time"] = am_1888_valid_time
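The four cells above repeat the same pattern: read a GeoJSON file, then attach a source description and a validity interval under the `source` and `time` keys. A minimal sketch of that pattern as one function is shown below. `load_geojson_with_metadata` is a hypothetical helper, and it assumes that `fm.read_json_file` behaves like a plain `json.load`, which is not confirmed here.

```python
import json
import os
import tempfile


def load_geojson_with_metadata(path, source_desc, valid_time):
    """Read a GeoJSON file and attach `source` and `time` entries to it."""
    with open(path, encoding="utf-8") as f:
        geojson = json.load(f)
    geojson["source"] = source_desc
    geojson["time"] = valid_time
    return geojson


# Demonstration on a tiny throwaway file with made-up content
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.geojson")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"type": "FeatureCollection", "features": []}, f)
    demo = load_geojson_with_metadata(
        demo_path,
        {"lang": "fr", "label": "Demo source"},
        {"start_time": {"stamp": "1887-01-01T00:00:00Z", "precision": "year", "calendar": "gregorian"}},
    )

print(sorted(demo))  # ['features', 'source', 'time', 'type']
```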

### Events

JSON file describing events, and the TTL files built from it.

In [None]:
# Name of the repository where the factoid triples of events data are stored and constructed
events_repository_name = "factoids_events"

# Event file containing data
events_json_file_name = "events.json"
events_json_file = os.path.join(data_folder, events_json_file_name)

# Final TTL file of factoids from events
events_ttl_file_name = "events.ttl"
events_ttl_file = os.path.join(tmp_folder, events_ttl_file_name)

# Final TTL files for events factoids
events_factoids_kg_file_name = "events_factoids.ttl"
events_factoids_kg_file = os.path.join(tmp_folder, events_factoids_kg_file_name)
events_permanent_kg_file_name = "events_permanent.ttl"
events_permanent_kg_file = os.path.join(tmp_folder, events_permanent_kg_file_name)

## Final and iterative process

### Creating factoids in repositories

For each source, factoids are created independently in a separate repository.

#### Ville de Paris


In [None]:
# fc.create_factoids_repository_ville_paris(graphdb_url, vpt_repository_name, tmp_folder,
#                                           ont_file, ontology_named_graph_name,
#                                           factoids_named_graph_name, permanent_named_graph_name,
#                                           vpta_csv_file, vptc_csv_file, vpt_kg_file, vpt_valid_time, lang=lang)

#### BAN


In [None]:
# fc.create_factoids_repository_ban(graphdb_url, bpa_repository_name, tmp_folder,
#                                   ont_file, ontology_named_graph_name,
#                                   factoids_named_graph_name, permanent_named_graph_name,
#                                   bpa_csv_file, bpa_kg_file, bpa_valid_time, lang=lang)

#### Wikidata


In [None]:
# # fc.get_data_from_wikidata(wdp_land_csv_file, wdp_loc_csv_file)
# fc.create_factoids_repository_wikidata_paris(graphdb_url, wd_repository_name, tmp_folder,
#                                              ont_file, ontology_named_graph_name,
#                                              factoids_named_graph_name, permanent_named_graph_name,
#                                              wdp_land_csv_file, wdp_loc_csv_file, wdp_kg_file, wdp_valid_time=wdp_valid_time, lang=lang)

#### OSM

In [None]:
# fc.create_factoids_repository_osm(graphdb_url, osm_repository_name, tmp_folder,
#                                   ont_file, ontology_named_graph_name,
#                                   factoids_named_graph_name, permanent_named_graph_name,
#                                   osm_csv_file, osm_hn_csv_file, osm_kg_file, osm_valid_time=osm_valid_time, lang=lang)

#### Data from Geojson files

* Napoleonic cadastre of Gentilly (1847)
* Andriveau plan (1849)
* municipal parcel map of Paris (1871)
* municipal map of Paris (1888)

In [None]:
# fc.create_factoids_repository_geojson_states(graphdb_url, cn_1847_repository_name, tmp_folder, ont_file, ontology_named_graph_name,
#                                factoids_named_graph_name, permanent_named_graph_name, cn_1847_geojson, geojson_join_property, cn_1847_kg_file, tmp_kg_file, landmark_type, lang)
# fc.create_factoids_repository_geojson_states(graphdb_url, an_1849_repository_name, tmp_folder, ont_file, ontology_named_graph_name,
#                                factoids_named_graph_name, permanent_named_graph_name, an_1849_geojson, geojson_join_property, an_1849_kg_file, tmp_kg_file, landmark_type, lang)
# fc.create_factoids_repository_geojson_states(graphdb_url, pm_1871_repository_name, tmp_folder, ont_file, ontology_named_graph_name,
#                                factoids_named_graph_name, permanent_named_graph_name, pm_1871_geojson, geojson_join_property, pm_1871_kg_file, tmp_kg_file, landmark_type, lang)
# fc.create_factoids_repository_geojson_states(graphdb_url, am_1888_repository_name, tmp_folder, ont_file, ontology_named_graph_name,
#                                factoids_named_graph_name, permanent_named_graph_name, am_1888_geojson, geojson_join_property, am_1888_kg_file, tmp_kg_file, landmark_type, lang)

#### Data from Events files

In [None]:
# fc.create_factoids_repository_events(graphdb_url, events_repository_name, tmp_folder,
#                                      events_json_file, events_ttl_file,
#                                      ont_file, ontology_named_graph_name,
#                                      factoids_named_graph_name, permanent_named_graph_name)

### Insertion of factoids in the fact graph

In [None]:
gd.remove_named_graph(graphdb_url, facts_repository_name, facts_named_graph_name)
gd.remove_named_graph(graphdb_url, facts_repository_name, inter_sources_name_graph_name)

#### Ville de Paris

In [None]:
named_graph_name = "source_ville_de_paris"
rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, vpt_repository_name,
                                           vpt_factoids_kg_file, vpt_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)
msp.import_factoids_in_facts(graphdb_url, facts_repository_name, named_graph_name, facts_named_graph_name, inter_sources_name_graph_name)

#### Wikidata

In [None]:
named_graph_name = "source_wikidata"
rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, wd_repository_name,
                                           wdp_factoids_kg_file, wdp_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)
msp.import_factoids_in_facts(graphdb_url, facts_repository_name, named_graph_name, facts_named_graph_name, inter_sources_name_graph_name)

#### BAN

In [None]:
named_graph_name = "source_ban"
rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, bpa_repository_name,
                                           bpa_factoids_kg_file, bpa_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)
msp.import_factoids_in_facts(graphdb_url, facts_repository_name, named_graph_name, facts_named_graph_name, inter_sources_name_graph_name)

#### OSM

In [None]:
named_graph_name = "source_osm"
rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, osm_repository_name,
                                           osm_factoids_kg_file, osm_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)
msp.import_factoids_in_facts(graphdb_url, facts_repository_name, named_graph_name, facts_named_graph_name, inter_sources_name_graph_name)

#### Data from Geojson files

* Napoleonic cadastre of Gentilly (1847)
* Andriveau plan (1849)
* municipal parcel map of Paris (1871)
* municipal map of Paris (1888)

In [None]:
named_graph_name = "source_geojson"

rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, cn_1847_repository_name,
                                           cn_1847_factoids_kg_file, cn_1847_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)

rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, an_1849_repository_name,
                                           an_1849_factoids_kg_file, an_1849_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)

rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, pm_1871_repository_name,
                                           pm_1871_factoids_kg_file, pm_1871_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)

rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, am_1888_repository_name,
                                           am_1888_factoids_kg_file, am_1888_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)

msp.import_factoids_in_facts(graphdb_url, facts_repository_name, named_graph_name, facts_named_graph_name, inter_sources_name_graph_name)


#### Data from Events files

In [None]:
named_graph_name = "source_events"

rt.transfert_factoids_to_facts_repository(graphdb_url, facts_repository_name, events_repository_name,
                                           events_factoids_kg_file, events_permanent_kg_file,
                                           factoids_named_graph_name, permanent_named_graph_name, named_graph_name, facts_named_graph_name)
msp.import_factoids_in_facts(graphdb_url, facts_repository_name, named_graph_name, facts_named_graph_name, inter_sources_name_graph_name)
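Each source above follows the same two steps: transfer the factoids of one or more repositories into a source named graph, then import that graph into the facts graph. A minimal sketch of a table-driven version is given below. `import_all_sources` is a hypothetical helper, and the two callables are stand-ins that only record calls; in the notebook they would wrap `rt.transfert_factoids_to_facts_repository` and `msp.import_factoids_in_facts` with the remaining arguments.

```python
def import_all_sources(sources, transfer, import_in_facts):
    """Run the transfer step(s), then the import step, for each source graph.

    `sources` maps a source named graph to a list of
    (repository_name, factoids_file, permanent_file) tuples.
    """
    for named_graph_name, repositories in sources.items():
        for repository_name, factoids_file, permanent_file in repositories:
            transfer(repository_name, factoids_file, permanent_file, named_graph_name)
        # Import once per source graph, after all its repositories are transferred
        import_in_facts(named_graph_name)


# Stand-in callables that only record what would be executed
calls = []
import_all_sources(
    {
        "source_osm": [("factoids_osm", "osm_factoids.ttl", "osm_permanent.ttl")],
        "source_geojson": [
            ("factoids_1847_cadastre_nap", "cn_1847_factoids.ttl", "cn_1847_permanent.ttl"),
            ("factoids_1849_andriveau", "an_1849_factoids.ttl", "an_1849_permanent.ttl"),
        ],
    },
    transfer=lambda repo, factoids, permanent, graph: calls.append(("transfer", repo, graph)),
    import_in_facts=lambda graph: calls.append(("import", graph)),
)
print(len(calls))  # 5
```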


### Construction of entity evolution from multi-source data

In [None]:
order_named_graph_name = "temporal_ordering"

facts_named_graph_uri = gd.get_named_graph_uri_from_name(graphdb_url, facts_repository_name, facts_named_graph_name)
inter_sources_name_graph_uri = gd.get_named_graph_uri_from_name(graphdb_url, facts_repository_name, inter_sources_name_graph_name)
tmp_named_graph_uri = gd.get_named_graph_uri_from_name(graphdb_url, facts_repository_name, tmp_named_graph_name)
order_named_graph_uri = gd.get_named_graph_uri_from_name(graphdb_url, facts_repository_name, order_named_graph_name)

#### Comparison of version values

In [None]:
comparison_settings = {
    "geom_similarity_coef": 0.85,
    "geom_buffer_radius": 5,
    "geom_crs_uri": URIRef('http://www.opengis.net/def/crs/EPSG/0/2154'),
}
comp_named_graph_name = "comparisons"
comp_tmp_file_name = "comparisons.ttl"
comp_tmp_file = os.path.join(tmp_folder, comp_tmp_file_name)
avc.compare_attribute_versions(graphdb_url, facts_repository_name, comp_named_graph_name, comp_tmp_file, comparison_settings)

#### Initialize missing landmark appearance and disappearance changes

* After all factoids have been imported, most of the changes describing the appearance and disappearance of landmarks are missing, because they do not exist in the factoids named graphs.
* This step initializes the missing landmark appearance and disappearance changes and their related events, giving each an estimated time. The appearance is considered to have happened before the earliest time at which the landmark is referenced in the sources, and the disappearance after the latest one.
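The estimation rule can be illustrated with a small sketch. It assumes that the appearance is bounded by the earliest reference time and the disappearance by the latest one; `estimate_existence_bounds` is a hypothetical helper, and the way `ec.initialize_missing_changes_and_events_for_landmarks` computes these bounds in the repository is not shown here.

```python
from datetime import datetime


def estimate_existence_bounds(reference_times):
    """Return (appearance_before, disappearance_after) for a landmark.

    The landmark appeared before the first value and, under the assumption
    stated above, disappeared after the second one.
    """
    if not reference_times:
        raise ValueError("at least one reference time is required")
    return min(reference_times), max(reference_times)


# Made-up reference times for one landmark seen in three sources
references = [datetime(1871, 1, 1), datetime(1849, 1, 1), datetime(1888, 1, 1)]
appearance_before, disappearance_after = estimate_existence_bounds(references)
print(appearance_before.year, disappearance_after.year)  # 1849 1888
```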

In [None]:
ec.initialize_missing_changes_and_events_for_landmarks(graphdb_url, facts_repository_name, facts_named_graph_uri, inter_sources_name_graph_uri, tmp_named_graph_uri)

#### Split overlapping versions

In [None]:
gd.remove_named_graph_from_uri(tmp_named_graph_uri)
ec.get_elementary_versions_and_changes(graphdb_url, facts_repository_name, facts_named_graph_uri, tmp_named_graph_uri)
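One way to read "splitting overlapping versions" is to cut the validity intervals at every boundary, so that each resulting piece is covered by a fixed set of versions. The sketch below illustrates that reading on plain year intervals. It is an assumption about the idea, not a description of how `ec.get_elementary_versions_and_changes` is implemented, and `split_into_elementary_intervals` is a hypothetical helper.

```python
def split_into_elementary_intervals(intervals):
    """Split possibly overlapping (start, end) intervals at every boundary.

    Returns a sorted list of (start, end, covering) tuples, where `covering`
    lists the indices of the input intervals that span that piece.
    """
    boundaries = sorted({bound for interval in intervals for bound in interval})
    pieces = []
    for start, end in zip(boundaries, boundaries[1:]):
        covering = [i for i, (s, e) in enumerate(intervals) if s <= start and end <= e]
        if covering:  # skip gaps that no interval covers
            pieces.append((start, end, covering))
    return pieces


versions = [(1845, 1872), (1870, 1889)]
print(split_into_elementary_intervals(versions))
# [(1845, 1870, [0]), (1870, 1872, [0, 1]), (1872, 1889, [1])]
```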

#### Get evolution from elementary elements

Get the attribute version evolution from elementary versions and changes
* remove empty attribute versions
    * versions not related to any trace
    * versions whose changes are not related to any trace
* merge successive attribute versions which are similar
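The merging step can be sketched on plain tuples. In this illustration "similar" is reduced to strict equality of values, and versions are merged only when they are contiguous in time; both are simplifying assumptions. `merge_successive_versions` is a hypothetical helper and the street names are made up.

```python
def merge_successive_versions(versions):
    """Merge consecutive (start, end, value) versions that carry the same value."""
    merged = []
    for start, end, value in sorted(versions):
        if merged and merged[-1][2] == value and merged[-1][1] == start:
            # Same value and contiguous in time: extend the previous version
            previous_start, _, _ = merged[-1]
            merged[-1] = (previous_start, end, value)
        else:
            merged.append((start, end, value))
    return merged


elementary = [
    (1845, 1870, "rue de la Paix"),
    (1870, 1872, "rue de la Paix"),
    (1872, 1889, "rue Nouvelle"),
]
print(merge_successive_versions(elementary))
# [(1845, 1872, 'rue de la Paix'), (1872, 1889, 'rue Nouvelle')]
```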

In [None]:
ec.get_attribute_version_evolution_from_elementary_elements(graphdb_url, facts_repository_name,
                                                            facts_named_graph_uri, inter_sources_name_graph_uri, tmp_named_graph_uri)

# Remove temporary named graph (which is used for construction)
gd.remove_named_graph_from_uri(tmp_named_graph_uri)