# Formal Ontology of Mathematics Notebook

## Purpose
This notebook builds a knowledge graph (KG) representing a formal ontology of mathematics, focusing on Euclid's Elements, Book 1. It extracts definitions, postulates, common notions, and concepts from CSV and TXT input files.

## Functionality
1. **Data Ingestion**: It reads data from input files like "ontology_definitions_concepts.txt", "Euclid.Postulates.Book1.csv", and "Euclid.CommonNotions.Book1.csv".
2. **Knowledge Representation**: It utilizes the RDFLib library to create and populate a knowledge graph, representing entities and their relationships using triples (subject, predicate, object).
3. **Ontology Structure**: The KG follows a defined ontology with classes like Concept, Postulate, Common Notion, Operation Type, and Relation Type.
4. **Relationship Modeling**: It establishes relationships between entities using object properties like refers_to, is_used_in, has_definition, has_subject, has_domain, etc.
5. **Data Processing**: Functions like `add_postulates`, `add_common_notions`, `add_statement`, `add_concept_hierarchy` process and structure the data to populate the knowledge graph.
6. **Ontology Output**: It exports the generated KG in Turtle format into the "output" directory as "euclid_book1.ttl".

## Libraries Used
- **RDFLib**: for creating and manipulating RDF graphs.
- **Pandas**: for data handling and manipulation.
- **OS**: for interacting with the operating system, such as changing the working directory.
- **re**: for regular expressions.
- **Google Colab**: for accessing and mounting Google Drive.


## Workflow
The notebook first initializes an empty RDF graph. It then defines a set of core ontology items (classes and object properties). It loads data from the input files and adds them as triples to the RDF graph. The process includes specific functions for adding postulates, common notions, concepts, and their relationships. Finally, the notebook outputs the populated knowledge graph in Turtle format.

## Assumptions
- Assumes the input files are present in the specified directory ("input").
- Assumes the required libraries are installed.
- Assumes the Google Colab environment is used for execution.
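
The first assumption can be checked before running the pipeline. The helper below is a hypothetical sketch (`missing_inputs` is not part of the notebook); it only reports which of the expected files are absent from a directory.

```python
from pathlib import Path


def missing_inputs(directory, filenames):
    """Return the names in `filenames` that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in filenames if not (base / name).is_file()]


# File names taken from the Functionality section above.
EXPECTED = [
    "ontology_definitions_concepts.txt",
    "Euclid.Postulates.Book1.csv",
    "Euclid.CommonNotions.Book1.csv",
]

missing = missing_inputs("input", EXPECTED)
if missing:
    print("Missing input files:", ", ".join(missing))
```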

In [None]:
# Pappus' proof as the only one involving no construction and no operation (including non-constructive operations). However, cf. Aristotle's proof.

# Maybe:
# Aristotle: no construction and no operation (at least not explicitly because all the objects are assumed to be available already), but use of external objects.
# Euclid: construction, operations, and external objects.
# Pappus: no construction, no operation (except a purely mental one), and no external objects.

In [None]:
%pip install rdflib

from google.colab import drive
import pandas as pd
import os

# prompt: mount drive here to read files from the folder "My Drive > Colab_Notebooks > Formal_Ontology_of_Mathematics"
drive.mount('/content/drive')

os.chdir("/content/drive/My Drive/Colab_Notebooks/Formal_Ontology_of_Mathematics")
!ls

In [None]:
import pandas as pd
import os
import re
import rdflib
import typing

import modules.utils as utils
import modules.tbox as tbox
import modules.concepts as concepts
import modules.postulates_module as postulate_module
import modules.common_notions_module as common_notions_module
import modules.propositions_module as propositions_module
import modules.concepts_module as concepts_module
import modules.datatype_properties_module as datatype_properties_module
import modules.operations_relations_module as operations_relations_module
import modules.proofs_module as proofs_module

# common IRIs
rdf_type = rdflib.RDF.type
rdfs_label = rdflib.RDFS.label
rdfs_subclassof = rdflib.RDFS.subClassOf
rdfs_subpropertyof = rdflib.RDFS.subPropertyOf
rdfs_range = rdflib.RDFS.range
skos_prefLabel = rdflib.SKOS.prefLabel
skos_altLabel = rdflib.SKOS.altLabel
owl_class = rdflib.OWL.Class
owl_individual = rdflib.OWL.NamedIndividual
owl_object_property = rdflib.OWL.ObjectProperty
owl_data_property = rdflib.OWL.DatatypeProperty
owl_annotation_property = rdflib.OWL.AnnotationProperty
xsd_boolean = rdflib.XSD.boolean
xsd_true = rdflib.Literal("true", datatype=xsd_boolean)
xsd_false = rdflib.Literal("false", datatype=xsd_boolean)

# classes IRIs
common_notion_class = utils.create_iri("Common notion", namespace="https://www.foom.com/core")
concept_class = utils.create_iri("Concept", namespace="https://www.foom.com/core")
concept_type_class = utils.create_iri("Concept type", namespace="https://www.foom.com/core")
gist_class = utils.create_iri("Gist", namespace="https://www.foom.com/core")
enumeration_class = utils.create_iri("Enumeration", namespace="https://www.foom.com/core")
implication_class = utils.create_iri("Implication", namespace="https://www.foom.com/core")
magnitude_class = utils.create_iri("Magnitude", namespace="https://www.foom.com/core")
moral_class = utils.create_iri("Moral", namespace="https://www.foom.com/core")
operation_type_class = utils.create_iri("Operation type", namespace="https://www.foom.com/core")
operation_instance_class = utils.create_iri("Operation instance", namespace="https://www.foom.com/core")
proposition_class = utils.create_iri("Proposition", namespace="https://www.foom.com/core")
proposition_type_class = utils.create_iri("Proposition type", namespace="https://www.foom.com/core")
relation_instance_class = utils.create_iri("Relation instance", namespace="https://www.foom.com/core")
relation_type_class = utils.create_iri("Relation type", namespace="https://www.foom.com/core")
set_class = utils.create_iri("Set", namespace="https://www.foom.com/core")
statement_class = utils.create_iri("Statement", namespace="https://www.foom.com/core")

# individual IRIs
elements_book_1 = utils.create_iri("Document: Elements Book 1", namespace="https://www.foom.com/core")
object_individual = utils.create_iri("Concept type: Object", namespace="https://www.foom.com/core")
relation_individual = utils.create_iri("Concept type: Relation", namespace="https://www.foom.com/core")
operation_individual = utils.create_iri("Concept type: Operation", namespace="https://www.foom.com/core")
proposition_type_construction = utils.create_iri("Proposition type: Construction", namespace="https://www.foom.com/core")
proposition_type_theorem = utils.create_iri("Proposition type: Theorem", namespace="https://www.foom.com/core")

# object properties IRIs

refers_to = utils.create_iri("refers to", namespace="https://www.foom.com/core")
definition_refers_to = utils.create_iri("definition refers to", namespace="https://www.foom.com/core")

has_conceptual_component = utils.create_iri("has conceptual component", namespace="https://www.foom.com/core")
is_conceptual_component_of = utils.create_iri("is conceptual component of", namespace="https://www.foom.com/core")

is_sub_concept_of = utils.create_iri("is sub-concept of", namespace="https://www.foom.com/core")
is_super_concept_of = utils.create_iri("is super-concept of", namespace="https://www.foom.com/core")

is_used_in_definition_of = utils.create_iri("is used in definition of", namespace="https://www.foom.com/core")
is_defined_in = utils.create_iri("is defined in", namespace="https://www.foom.com/core")

contains_definition_of = utils.create_iri("contains_definition_of", namespace="https://www.foom.com/core")
is_used_in = utils.create_iri("is used in", namespace="https://www.foom.com/core")

has_definition = utils.create_iri("has definition", namespace="https://www.foom.com/core")
defines = utils.create_iri("defines", namespace="https://www.foom.com/core")

has_subject = utils.create_iri("has subject", namespace="https://www.foom.com/core")
has_predicate = utils.create_iri("has predicate", namespace="https://www.foom.com/core")
has_object = utils.create_iri("has object", namespace="https://www.foom.com/core")

has_domain = utils.create_iri("has domain", namespace="https://www.foom.com/core")
is_domain_of = utils.create_iri("is domain of", namespace="https://www.foom.com/core")
has_range = utils.create_iri("has range", namespace="https://www.foom.com/core")
is_range_of = utils.create_iri("is range of", namespace="https://www.foom.com/core")

has_statement = utils.create_iri("has statement", namespace="https://www.foom.com/core")
is_statement_of = utils.create_iri("is statement of", namespace="https://www.foom.com/core")

has_implication = utils.create_iri("has implication", namespace="https://www.foom.com/core")
is_implication_of = utils.create_iri("is implication of", namespace="https://www.foom.com/core")

is_in = utils.create_iri("is in", namespace="https://www.foom.com/core")
contains = utils.create_iri("contains", namespace="https://www.foom.com/core")

has_given_concept = utils.create_iri("has given concept", namespace="https://www.foom.com/core")
is_given_concept_of = utils.create_iri("is given concept of", namespace="https://www.foom.com/core")

has_gist = utils.create_iri("has gist", namespace="https://www.foom.com/core")
is_gist_of = utils.create_iri("is gist of", namespace="https://www.foom.com/core")

has_moral = utils.create_iri("has moral", namespace="https://www.foom.com/core")
is_moral_of = utils.create_iri("is moral of", namespace="https://www.foom.com/core")

has_concept_type = utils.create_iri("has concept type", namespace="https://www.foom.com/core")
is_concept_type_of = utils.create_iri("is concept type of", namespace="https://www.foom.com/core")

contains_concept = utils.create_iri("contains concept", namespace="https://www.foom.com/core")
is_concept_in = utils.create_iri("is concept in", namespace="https://www.foom.com/core")

HAS_RELATION_TYPE = utils.create_iri("has relation type", namespace="https://www.foom.com/core")
IS_RELATION_TYPE_OF = utils.create_iri("is relation type of", namespace="https://www.foom.com/core")

HAS_OPERATION_TYPE = utils.create_iri("has operation type", namespace="https://www.foom.com/core")
IS_OPERATION_TYPE_OF = utils.create_iri("is operation type of", namespace="https://www.foom.com/core")

HAS_PROPOSITION_TYPE = utils.create_iri("has proposition type", namespace="https://www.foom.com/core")
IS_PROPOSITION_TYPE_OF = utils.create_iri("is proposition type of", namespace="https://www.foom.com/core")

HAS_RELATION_TO_CONCEPT = utils.create_iri("has relation to concept", namespace="https://www.foom.com/core")

HAS_AUTHOR = utils.create_iri("has author", namespace="https://www.foom.com/core")

AUTHOR_CLASS = utils.create_iri("Author", namespace="https://www.foom.com/core")

ARISTOTLE = utils.create_iri("Aristotle", namespace="https://www.foom.com/core")
EUCLID = utils.create_iri("Euclid", namespace="https://www.foom.com/core")
PAPPUS = utils.create_iri("Pappus", namespace="https://www.foom.com/core")

ARISTOTLE_PROOF = rdflib.URIRef("https://www.foom.com/core#proof_aristotle")

#################################################
# initialize graph
kg = rdflib.Graph()

# tbox and foundational triples
ontology_items = {
    # classes
    ("Author", owl_class),
    ("Concept", owl_class),
    ("Concept type", owl_class),
    ("Common notion", owl_class),
    ("Definition", owl_class),
    ("Document", owl_class),
    ("Enumeration", owl_class),
    ("Gist", owl_class),
    ("Implication", owl_class),
    ("Magnitude", owl_class),
    ("Moral", owl_class),
    ("Operation type", owl_class),
    ("Operation instance", owl_class),
    ("Postulate", owl_class),
    ("Proof", owl_class),
    ("Proposition", owl_class),
    ("Proposition type", owl_class),
    ("Relation type", owl_class),
    ("Relation instance", owl_class),
    ("Set", owl_class),
    ("Statement", owl_class),

    # object properties
    ("refers to", owl_object_property),
    ("definition refers to", owl_object_property),

    ("has conceptual component", owl_object_property),
    ("is conceptual component of", owl_object_property),

    ("is sub-concept of", owl_object_property),
    ("is super-concept of", owl_object_property),

    ("is used in definition of", owl_object_property),
    ("is defined in", owl_object_property),

    ("contains_definition_of", owl_object_property),
    ("is used in", owl_object_property),

    ("has definition", owl_object_property),
    ("defines", owl_object_property),

    ("has subject", owl_object_property),
    ("has predicate", owl_object_property),
    ("has object", owl_object_property),

    ("has domain", owl_object_property),
    ("is domain of", owl_object_property),
    ("has range", owl_object_property),
    ("is range of", owl_object_property),

    ("has statement", owl_object_property),
    ("is statement of", owl_object_property),

    ("has implication", owl_object_property),
    ("is implication of", owl_object_property),

    ("is in", owl_object_property),
    ("contains", owl_object_property),

    ("has given concept", owl_object_property),
    ("is given concept of", owl_object_property),

    ("has gist", owl_object_property),
    ("is gist of", owl_object_property),

    ("has moral", owl_object_property),
    ("is moral of", owl_object_property),

    ("has concept type", owl_object_property),
    ("is concept type of", owl_object_property),

    ("has proposition type", owl_object_property),
    ("is proposition type of", owl_object_property),

    ("contains concept", owl_object_property),
    ("is concept in", owl_object_property),

    ("has relation type", owl_object_property),
    ("is relation type of", owl_object_property),

    ("has operation type", owl_object_property),
    ("is operation type of", owl_object_property),

    ("has author", owl_object_property),

    ("has relation to concept", owl_object_property)

}
triples = {
    (elements_book_1, rdf_type, owl_individual),
    (elements_book_1, rdfs_label, rdflib.Literal("Document: Elements Book 1")),
    (elements_book_1, skos_prefLabel, rdflib.Literal("Elements Book 1")),
    (elements_book_1, rdf_type, utils.create_iri("Document", namespace="https://www.foom.com/core")),

    (utils.create_iri("Proposition type", namespace="https://www.foom.com/core"), rdfs_subclassof, enumeration_class),
    (proposition_type_construction, rdf_type, owl_individual),
    (proposition_type_construction, rdf_type, proposition_type_class),
    (proposition_type_construction, rdfs_label, rdflib.Literal("Proposition type: Construction")),
    (proposition_type_construction, skos_prefLabel, rdflib.Literal("Construction")),
    (proposition_type_theorem, rdf_type, owl_individual),
    (proposition_type_theorem, rdf_type, proposition_type_class),
    (proposition_type_theorem, rdfs_label, rdflib.Literal("Proposition type: Theorem")),
    (proposition_type_theorem, skos_prefLabel, rdflib.Literal("Theorem")),

    (concept_type_class, rdfs_subclassof, enumeration_class),

    (ARISTOTLE, rdf_type, owl_individual),
    (ARISTOTLE, rdf_type, AUTHOR_CLASS),
    (ARISTOTLE, rdfs_label, rdflib.Literal("Author: Aristotle")),
    (ARISTOTLE, skos_prefLabel, rdflib.Literal("Aristotle")),

    (EUCLID, rdf_type, owl_individual),
    (EUCLID, rdf_type, AUTHOR_CLASS),
    (EUCLID, rdfs_label, rdflib.Literal("Author: Euclid")),
    (EUCLID, skos_prefLabel, rdflib.Literal("Euclid")),

    (PAPPUS, rdf_type, owl_individual),
    (PAPPUS, rdf_type, AUTHOR_CLASS),
    (PAPPUS, rdfs_label, rdflib.Literal("Author: Pappus")),
    (PAPPUS, skos_prefLabel, rdflib.Literal("Pappus")),

    (elements_book_1, HAS_AUTHOR, EUCLID),
    (ARISTOTLE_PROOF, HAS_AUTHOR, ARISTOTLE),

    (defines, rdfs_subpropertyof, HAS_RELATION_TO_CONCEPT),
    (contains_concept, rdfs_subpropertyof, HAS_RELATION_TO_CONCEPT),
    (has_given_concept, rdfs_subpropertyof, HAS_RELATION_TO_CONCEPT),
    (definition_refers_to, rdfs_subpropertyof, HAS_RELATION_TO_CONCEPT),
    (has_conceptual_component, rdfs_subpropertyof, HAS_RELATION_TO_CONCEPT),
    (is_sub_concept_of, rdfs_subpropertyof, HAS_RELATION_TO_CONCEPT),
    (is_super_concept_of, rdfs_subpropertyof, HAS_RELATION_TO_CONCEPT)
}

kg = tbox.add_tbox(kg, ontology_items, triples)

# abox: concepts
concepts_input_file_path = "input/ontology_definitions_concepts.txt"
kg = concepts.main_add_definition_concepts(concepts_input_file_path, kg)

# abox: postulates
postulates_input_file_path = "input/Euclid.Postulates.Book1.csv"
kg = postulate_module.add_postulates(kg, postulates_input_file_path)

# abox: add common notions
common_notions_input_file_path = "input/Euclid.CommonNotions.Book1.csv"
kg = common_notions_module.add_common_notions(kg, common_notions_input_file_path)

# check differences between the list of concepts in the concepts spreadsheet
# and the list of concepts in the propositions spreadsheet
concepts_analysis_input_file_path = "input/Euclid.ConceptsAnalysis.Book1.csv"
propositions_input_file_path = "input/Euclid.Propositions.Book1.csv"

utils.diff_concepts_propositions_and_concepts_list(propositions_input_file_path, concepts_analysis_input_file_path)

# abox: add concepts with hierarchy and type
concepts_analysis_input_file_path = "input/Euclid.ConceptsAnalysis.Book1.csv"
kg = concepts_module.add_concepts(kg, concepts_analysis_input_file_path)

# abox: add propositions
propositions_input_file_path = "input/Euclid.Propositions.Book1.csv"
kg = propositions_module.add_propositions(kg, propositions_input_file_path)

# abox: add datatype properties
datatype_properties_input_file_path = "input/Euclid.DatatypePropertiesAnalysis.Book1.csv"
kg = datatype_properties_module.add_datatype_properties(kg, datatype_properties_input_file_path)

# abox: add operations and relations
OPERATIONS_INPUT_FILE_PATH = "input/Euclid.OperationsAnalysis.Book1.csv"
RELATIONS_INPUT_FILE_PATH = "input/Euclid.RelationsAnalysis.Book1.csv"

kg = operations_relations_module.add_relations_operations(kg, OPERATIONS_INPUT_FILE_PATH, "operations")
kg = operations_relations_module.add_relations_operations(kg, RELATIONS_INPUT_FILE_PATH, "relations")

# abox: diff of concepts, relations, and operations considering the analysis of proofs
CONCEPTS_INPUT_FILE_PATH = "input/Euclid.ConceptsAnalysis.Book1.csv"
RELATIONS_INPUT_FILE_PATH = "input/Euclid.RelationsAnalysis.Book1.csv"
OPERATIONS_INPUT_FILE_PATH = "input/Euclid.OperationsAnalysis.Book1.csv"
PROOFS_INPUT_FILE_PATH = "input/Euclid.Proofs.Book1.csv"

diff_proofs_concepts = utils.find_diff_concepts_proofs(CONCEPTS_INPUT_FILE_PATH, PROOFS_INPUT_FILE_PATH, verbose=True)
diff_relations = utils.find_diff_proofs(RELATIONS_INPUT_FILE_PATH, PROOFS_INPUT_FILE_PATH, verbose=True)
diff_operations = utils.find_diff_proofs(OPERATIONS_INPUT_FILE_PATH, PROOFS_INPUT_FILE_PATH, "operation_instance", verbose=True)

# abox: import proofs
kg = proofs_module.add_proofs(kg, PROOFS_INPUT_FILE_PATH)

# abox: add Aristotle's proof
ARISTOTLE_CONCEPTS_FILE_PATH = "input/Aristotle - concepts.csv"
ARISTOTLE_PROOF_FILE_PATH = "input/Aristotle - proof.csv"
ARISTOTLE_RELATIONS_FILE_PATH = "input/Aristotle - relations.csv"

kg = concepts_module.add_concepts(kg, ARISTOTLE_CONCEPTS_FILE_PATH)
kg = proofs_module.add_proofs(kg, ARISTOTLE_PROOF_FILE_PATH, add_book_1=False, add_statements=True)
kg = operations_relations_module.add_relations_operations(kg, ARISTOTLE_RELATIONS_FILE_PATH, "relations")

# abox: add Pappus' proof
PAPPUS_CONCEPTS_FILE_PATH = "input/Pappus - concepts.csv"
PAPPUS_PROOF_FILE_PATH = "input/Pappus - proof.csv"
PAPPUS_OPERATIONS_FILE_PATH = "input/Pappus - operations.csv"

kg = concepts_module.add_concepts(kg, PAPPUS_CONCEPTS_FILE_PATH)
kg = proofs_module.add_proofs(kg, PAPPUS_PROOF_FILE_PATH, add_book_1=False, add_statements=True)
kg = operations_relations_module.add_relations_operations(kg, PAPPUS_OPERATIONS_FILE_PATH, "operations")



# output ontology
utils.output_ontology(kg, "output", "euclid_book1.ttl", "turtle")

In [None]:
PROOFS_INPUT_FILE_PATH = "input/Euclid.Proofs.Book1.csv"

ONTOLOGY_NAMESPACE = "https://www.foom.com/core"
PROOF_CLASS = utils.create_iri("Proof", namespace=ONTOLOGY_NAMESPACE)
CONTAINS_CONCEPT = utils.create_iri("contains concept", namespace=ONTOLOGY_NAMESPACE)
IS_CONCEPT_IN = utils.create_iri("is concept in", namespace=ONTOLOGY_NAMESPACE)
REFERS_TO = utils.create_iri("refers to", namespace=ONTOLOGY_NAMESPACE)
IS_USED_IN = utils.create_iri("is used in", namespace=ONTOLOGY_NAMESPACE)
USES_REDUCTIO = utils.create_iri("uses_reductio", namespace=ONTOLOGY_NAMESPACE)

def add_proofs(kg: rdflib.Graph,
               input_file_path: str) -> rdflib.Graph:
    """Add every proof listed in the CSV file at input_file_path to the graph."""

    # read database of proofs; empty cells become "" so they test as falsy
    proofs_df = pd.read_csv(input_file_path).fillna("")
    for _, row in proofs_df.iterrows():
        # add proof
        kg, proof_iri = add_proof(kg, row["proof"])

        # add concepts
        if additional_concepts := row["additional_proof_concepts"]:
            kg = add_concepts(kg, proof_iri, additional_concepts)

        # add relation instance
        if relation_instance := row["relation_instance"]:
            kg = add_relation_instance(kg, relation_instance.strip(), proof_iri)

        # add operation instance
        if operation_instance := row["operation_instance"]:
            kg = add_operation_instance(kg, operation_instance.strip(), proof_iri)

        # add implicit operation instance
        if implicit_operation_instance := row["implicit_operation_instance"]:
            kg = add_operation_instance(kg, implicit_operation_instance.strip(), proof_iri)

        # add uses_reductio (any non-empty cell counts as true)
        if row["reductio"]:
            kg.add((proof_iri, USES_REDUCTIO, rdflib.Literal("true", datatype=rdflib.XSD.boolean)))

    return kg


def add_proof(kg: rdflib.Graph,
              proof_number: str) -> typing.Tuple[rdflib.Graph, rdflib.URIRef]:
    proof_label = f"Proof {proof_number}"
    proof_iri = utils.create_iri(proof_label, namespace=ONTOLOGY_NAMESPACE)
    kg.add((proof_iri, rdf_type, PROOF_CLASS))
    kg.add((proof_iri, rdfs_label, rdflib.Literal(proof_label)))
    kg.add((proof_iri, skos_prefLabel, rdflib.Literal(proof_number)))

    return kg, proof_iri

def add_concepts(kg: rdflib.Graph,
                 proof_iri: rdflib.URIRef,
                 concepts: str) -> rdflib.Graph:
    concepts_list = [concept.strip() for concept in concepts.split(",")]
    for concept in concepts_list:
        concept_iri = utils.create_iri(f"Concept: {concept}", namespace=ONTOLOGY_NAMESPACE)
        kg.add((proof_iri, CONTAINS_CONCEPT, concept_iri))
        kg.add((concept_iri, IS_CONCEPT_IN, proof_iri))

    return kg

def add_relation_instance(kg: rdflib.Graph,
                          relation_instance: str,
                          proof_iri: rdflib.URIRef) -> rdflib.Graph:
    relation_instance_iri = utils.create_iri(f"Relation instance: {relation_instance}", namespace=ONTOLOGY_NAMESPACE)
    kg.add((proof_iri, REFERS_TO, relation_instance_iri))
    kg.add((relation_instance_iri, IS_USED_IN, proof_iri))

    return kg


def add_operation_instance(kg: rdflib.Graph,
                           operation_instance: str,
                           proof_iri: rdflib.URIRef) -> rdflib.Graph:
    operation_instance_iri = utils.create_iri(f"Operation instance: {operation_instance}", namespace=ONTOLOGY_NAMESPACE)
    kg.add((proof_iri, REFERS_TO, operation_instance_iri))
    kg.add((operation_instance_iri, IS_USED_IN, proof_iri))

    return kg

print(len(kg))
kg = add_proofs(kg, PROOFS_INPUT_FILE_PATH)
print(len(kg))


# output ontology
utils.output_ontology(kg, "output", "euclid_book1.ttl", "turtle")

In [None]:
diff = utils.find_diff_proofs(OPERATIONS_INPUT_FILE_PATH, PROOFS_INPUT_FILE_PATH, "operation_instance", verbose=True)

In [None]:
RELATIONS_INPUT_FILE_PATH = "input/Euclid.RelationsAnalysis.Book1.csv"
OPERATIONS_INPUT_FILE_PATH = "input/Euclid.OperationsAnalysis.Book1.csv"

ONTOLOGY_NAMESPACE = "https://www.foom.com/core"
CONCEPT_CLASS = utils.create_iri("Concept", namespace=ONTOLOGY_NAMESPACE)
CONTAINS_CONCEPT = utils.create_iri("contains concept", namespace=ONTOLOGY_NAMESPACE)
IS_CONCEPT_IN = utils.create_iri("is concept in", namespace=ONTOLOGY_NAMESPACE)
OPERATION_INSTANCE_CLASS = utils.create_iri("Operation instance", namespace=ONTOLOGY_NAMESPACE)
OPERATION_TYPE_CLASS = utils.create_iri("Operation type", namespace=ONTOLOGY_NAMESPACE)
RELATION_INSTANCE_CLASS = utils.create_iri("Relation instance", namespace=ONTOLOGY_NAMESPACE)
RELATION_TYPE_CLASS = utils.create_iri("Relation type", namespace=ONTOLOGY_NAMESPACE)

def add_relations_operations(kg: rdflib.Graph,
                             input_file_path: str,
                             item_type: typing.Literal["relations", "operations"]) -> rdflib.Graph:
    items_df = pd.read_csv(input_file_path).fillna("")
    for _, row in items_df.iterrows():

        if item_type == "relations":
            instance_pref_label = row["relation_instance"].strip().capitalize()
            type_pref_label = row["relation_type"].strip().capitalize()
            kg, instance_iri, type_iri = add_relation_instance_type(kg, instance_pref_label, type_pref_label, "Relation instance", "Relation type")

            # find concepts in instance and in type and add them to the graph
            kg = add_concepts(kg, instance_iri, instance_pref_label)
            kg = add_concepts(kg, type_iri, type_pref_label)

        elif item_type == "operations":
            instance_pref_label = row["operation_instance"].strip().capitalize()
            type_pref_label = row["operation_type"].strip().capitalize()
            kg, instance_iri, type_iri = add_operation_instance_type(kg, instance_pref_label, type_pref_label, "Operation instance", "Operation type")

            # find concepts in instance and in type and add them to the graph
            kg = add_concepts(kg, instance_iri, instance_pref_label)
            kg = add_concepts(kg, type_iri, type_pref_label)

        else:
            raise ValueError(f"Invalid item type: {item_type}")

    return kg

def find_concepts(item_pref_label: str) -> set:
    item_pref_label_v1 = item_pref_label.replace("(", " ").replace(")", " ").replace(",", " ")

    return {concept.strip() for concept in item_pref_label_v1.split()}


def add_concepts(kg: rdflib.Graph,
                 item_iri: rdflib.URIRef,
                 item_pref_label: str) -> rdflib.Graph:
    concepts = find_concepts(item_pref_label)

    for concept in concepts:
        concept_iri = utils.create_iri(f"Concept: {concept}", namespace=ONTOLOGY_NAMESPACE)
        kg.add((item_iri, CONTAINS_CONCEPT, concept_iri))
        kg.add((concept_iri, IS_CONCEPT_IN, item_iri))

    return kg

def add_basic_triples(kg: rdflib.Graph,
                      pref_label: str,
                      class_iri: rdflib.URIRef,
                      prefix: str) -> typing.Tuple[rdflib.Graph, rdflib.URIRef]:
    item_label = f"{prefix}: {pref_label}"
    item_iri = utils.create_iri(item_label, namespace=ONTOLOGY_NAMESPACE)
    kg.add((item_iri, rdf_type, class_iri))
    kg.add((item_iri, rdfs_label, rdflib.Literal(item_label)))
    kg.add((item_iri, skos_prefLabel, rdflib.Literal(pref_label)))

    return kg, item_iri

def add_relation_instance_type(kg: rdflib.Graph,
                               relation_instance_pref_label: str,
                               relation_type_pref_label: str,
                               prefix_instance: str,
                               prefix_type: str) -> typing.Tuple[rdflib.Graph, rdflib.URIRef, rdflib.URIRef]:

    kg, instance_iri = add_basic_triples(kg, relation_instance_pref_label, RELATION_INSTANCE_CLASS, prefix_instance)
    kg, type_iri = add_basic_triples(kg, relation_type_pref_label, RELATION_TYPE_CLASS, prefix_type)

    kg.add((instance_iri, HAS_RELATION_TYPE, type_iri))
    kg.add((type_iri, IS_RELATION_TYPE_OF, instance_iri))

    return kg, instance_iri, type_iri

def add_operation_instance_type(kg: rdflib.Graph,
                                operation_instance_pref_label: str,
                                operation_type_pref_label: str,
                                prefix_instance: str,
                                prefix_type: str) -> typing.Tuple[rdflib.Graph, rdflib.URIRef, rdflib.URIRef]:
    kg, instance_iri = add_basic_triples(kg, operation_instance_pref_label, OPERATION_INSTANCE_CLASS, prefix_instance)
    kg, type_iri = add_basic_triples(kg, operation_type_pref_label, OPERATION_TYPE_CLASS, prefix_type)

    # link operation instances to their types with the operation properties
    kg.add((instance_iri, HAS_OPERATION_TYPE, type_iri))
    kg.add((type_iri, IS_OPERATION_TYPE_OF, instance_iri))

    return kg, instance_iri, type_iri

kg = add_relations_operations(kg, RELATIONS_INPUT_FILE_PATH, "relations")
kg = add_relations_operations(kg, OPERATIONS_INPUT_FILE_PATH, "operations")

# output ontology
utils.output_ontology(kg, "output", "euclid_book1.ttl", "turtle")

In [None]:
# prepare list of relations and operations
PROPOSITIONS_INPUT_FILE_PATH = "input/Euclid.Propositions.Book1.csv"
RELATIONS_OUTPUT_FILE_PATH = "input/relations.csv"
OPERATIONS_OUTPUT_FILE_PATH = "input/operations.csv"

def prepare_list_of_relations_operations(input_file_path: str,
                                         relations_output_file_path: str,
                                         operations_output_file_path: str):
    relations = set()
    operations = set()

    # the input is the propositions spreadsheet, so name the frame accordingly
    propositions_df = pd.read_csv(input_file_path).fillna("")

    for i in propositions_df.index:
        relations.add(propositions_df.at[i, "relation_instance"].strip())
        operations.add(propositions_df.at[i, "operation_instance"].strip())

    relations = sorted([relation for relation in relations if relation])
    operations = sorted([operation for operation in operations if operation])

    relations_df = pd.DataFrame(relations, columns=["relation_instance"])
    operations_df = pd.DataFrame(operations, columns=["operation_instance"])

    relations_df.to_csv(relations_output_file_path, index=False)
    operations_df.to_csv(operations_output_file_path, index=False)

    return relations, operations

relations, operations = prepare_list_of_relations_operations(PROPOSITIONS_INPUT_FILE_PATH, RELATIONS_OUTPUT_FILE_PATH, OPERATIONS_OUTPUT_FILE_PATH)
