![logo](https://raw.githubusercontent.com/sciknoworg/OntoAligner/main/images/logo-with-background.png)

[![PyPI version](https://badge.fury.io/py/OntoAligner.svg)](https://badge.fury.io/py/OntoAligner)
[![PyPI Downloads](https://static.pepy.tech/badge/ontoaligner)](https://pepy.tech/projects/ontoaligner)
![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
[![Documentation Status](https://readthedocs.org/projects/ontoaligner/badge/?version=main)](https://ontoaligner.readthedocs.io/)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](MAINTANANCE.md)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.14533133.svg)](https://doi.org/10.5281/zenodo.14533133)

- **Documentation website**: [https://ontoaligner.readthedocs.io/index.html](https://ontoaligner.readthedocs.io/index.html)
- **Resource Paper**: [https://doi.org/10.1007/978-3-031-94578-6_10](https://doi.org/10.1007/978-3-031-94578-6_10)


--------

# Quick Introduction to OntoAligner



--------

Contents of this tutorial:
1. Let's understand what the OntoAligner framework offers!
2. Real-world scenario: e-commerce ontology alignment.
3. Setting up the environment: Installation
4. Parsers: Parsing source and target ontologies.
5. Encoders: Encoding (preparing) the source and target ontologies.
6. Aligners: Applying alignment to the source and target ontologies.

---
# 1️⃣. Let's understand what the OntoAligner framework offers!

![OntoAligner](https://raw.githubusercontent.com/sciknoworg/OntoAligner/refs/heads/dev/docs/source/img/ontoaligner-pip.jpg)

**🧩 Parsers**: The Parser module serves as the entry point of OntoAligner, handling ontology ingestion and alignment data loading. Key components include ``OntologyParser`` and ``AlignmentsParser``.

**🔠 Encoder**: After parsing, the `Encoder` module transforms ontological concepts into structured representations suited to similarity estimation or prompt-based inference, depending on the aligner model.


**🧠 Aligners**: The Aligner module is the core of OntoAligner and is responsible for discovering mappings between entities in two ontologies. It includes a diverse suite of alignment algorithms grouped into different categories: `Lightweight`, `Retrieval`, `LLM`, `RAG`, etc.


**🔄 Post-Processing**: Fully modular and extendable to accommodate custom post-alignment techniques.
- Mapper: Maps generated texts (e.g., LLM outputs) to ontology classes.
- Filtering: Applies heuristic and rule-based strategies to ensure consistency and precision.
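
As a rough illustration of the filtering idea, the sketch below keeps only candidates above a score threshold and allows each source and each target to appear at most once. The function name `filter_matches`, the shortened identifiers, and the scores are hypothetical and chosen for illustration; this is not OntoAligner's own post-processing API.

```python
from typing import Any, Dict, List

Match = Dict[str, Any]


def filter_matches(matches: List[Match], threshold: float) -> List[Match]:
    """Keep matches scoring at least `threshold`, at most one per source and per target."""
    kept: List[Match] = []
    used_sources, used_targets = set(), set()
    # Visit the highest-scoring candidates first so that they win conflicts.
    for match in sorted(matches, key=lambda m: m["score"], reverse=True):
        if match["score"] < threshold:
            break  # every remaining candidate scores lower
        if match["source"] in used_sources or match["target"] in used_targets:
            continue  # this source or target is already mapped
        kept.append(match)
        used_sources.add(match["source"])
        used_targets.add(match["target"])
    return kept


# Hypothetical candidate alignments (identifiers shortened for readability).
candidates = [
    {"source": "amazon#Electronics", "target": "ebay#Computers", "score": 0.42},
    {"source": "amazon#Ultrabook", "target": "ebay#Computers", "score": 0.50},
    {"source": "amazon#GamingLaptop", "target": "ebay#GamingNotebook", "score": 0.62},
]

filtered = filter_matches(candidates, threshold=0.45)
for match in filtered:
    print(match["source"], "->", match["target"], match["score"])
```

Sorting by score before enforcing the one-to-one constraint is a simple greedy choice; other strategies could be used instead.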

**📊 Evaluation and Exporters**:
- 📈 Evaluator: Computes standard ontology alignment (OA) metrics: Precision, Recall, and F1-score. Supports comparison across different aligners and configurations.
- 📤 Exporter: Supports alignment export in common formats: XML and JSON.
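
To make the three metrics concrete, here is a minimal sketch that computes them from sets of (source, target) pairs. The `evaluate` helper and the reference alignment are assumptions made up for this example; the sketch does not use OntoAligner's evaluator.

```python
from typing import Dict, Iterable, Tuple

Pair = Tuple[str, str]


def evaluate(predicted: Iterable[Pair], reference: Iterable[Pair]) -> Dict[str, float]:
    """Compute precision, recall, and F1 over sets of (source, target) pairs."""
    predicted_set, reference_set = set(predicted), set(reference)
    true_positives = len(predicted_set & reference_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical predicted and reference alignments (identifiers shortened).
predicted_pairs = [
    ("amazon#Electronics", "ebay#Computers"),
    ("amazon#Laptop", "ebay#Notebook"),
    ("amazon#GamingLaptop", "ebay#GamingNotebook"),
    ("amazon#Ultrabook", "ebay#Computers"),
]
reference_pairs = [
    ("amazon#Laptop", "ebay#Notebook"),
    ("amazon#GamingLaptop", "ebay#GamingNotebook"),
    ("amazon#Ultrabook", "ebay#BusinessNotebook"),
]

scores = evaluate(predicted_pairs, reference_pairs)
print(scores)
```

Here two of the four predicted pairs appear in the reference, so precision is 2/4 and recall is 2/3.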

---
# 2️⃣. Real-world scenario: e-commerce ontology alignment.



![](https://www.colorclipping.com/storage/posts/best-e-commerce-sites-shipping-worldwide.webp)

**Objective**: In modern e-commerce, different platforms use heterogeneous product taxonomies that hinder unified search, recommendation, and analytics. An ontology alignment pipeline discovers correspondences between source and target classes (e.g., mapping ``GamingLaptop`` from Amazon to ``GamingNotebook`` in eBay) to enable cross-site product integration and comparison.


- Sample Amazon Ontology: https://github.com/sciknoworg/OntoAligner/tree/main/assets/e-commerce/amazon.owl
- Sample eBay Ontology: https://github.com/sciknoworg/OntoAligner/tree/main/assets/e-commerce/ebay.owl




## Visualization

In [None]:
import requests
from IPython.display import HTML

url = "https://raw.githubusercontent.com/sciknoworg/OntoAligner/main/docs/source/_static/amazon-ebay-alignment.html"

html_content = requests.get(url).text
HTML(html_content)

---
# 3️⃣. Setting up the environment: Installation

OntoAligner is available on the Python Package Index at https://pypi.org/project/OntoAligner/ for installation.

You can install OntoAligner from PyPI using pip:

In [None]:
# Quote the version specifier so the shell does not treat ">" as an output redirect
!pip install -q ontoaligner "numpy>=2.0"

Alternatively, you can install OntoAligner directly from source to use the bleeding-edge `main` branch for development.


In [None]:
!pip install git+https://github.com/sciknoworg/OntoAligner.git

---
# 4️⃣. Parsers: Parsing source and target ontologies.

We begin by parsing the RDF/XML representations of the Amazon and eBay ontologies using the ``GenericOntology`` class. This process extracts classes, labels, hierarchical relationships, synonyms, and comments, structuring them into a format suitable for alignment tasks.

**Parse ontology from a URL:**

In [None]:
# Import generic ontology parser
from ontoaligner.ontology import GenericOntology
ontology = GenericOntology()

# Paths (here: URLs) of the source and target ontologies
src_onto_path = "https://raw.githubusercontent.com/sciknoworg/OntoAligner/main/assets/e-commerce/amazon.owl"
tgt_onto_path = "https://raw.githubusercontent.com/sciknoworg/OntoAligner/main/assets/e-commerce/ebay.owl"

src_onto = ontology.parse(src_onto_path)
tgt_onto = ontology.parse(tgt_onto_path)

**Or parse ontologies from a local directory:**

In [None]:
# Download the desired ontologies to the local directory
!wget https://raw.githubusercontent.com/sciknoworg/OntoAligner/main/assets/e-commerce/amazon.owl
!wget https://raw.githubusercontent.com/sciknoworg/OntoAligner/main/assets/e-commerce/ebay.owl

In [None]:
src_onto = ontology.parse("amazon.owl")
tgt_onto = ontology.parse("ebay.owl")

**Let's take a look at the parsed ontologies:**

In [None]:
src_onto[0]

{'name': 'Electronics',
 'iri': 'http://example.org/amazon#Electronics',
 'label': 'Electronics',
 'childrens': [{'iri': 'http://example.org/amazon#Laptop',
   'label': 'Laptop',
   'name': 'Laptop'}],
 'parents': [],
 'synonyms': [],
 'comment': []}

In [None]:
tgt_onto[0]

{'name': 'Computers',
 'iri': 'http://example.org/ebay#Computers',
 'label': 'Computers',
 'childrens': [{'iri': 'http://example.org/ebay#Notebook',
   'label': 'Notebook',
   'name': 'Notebook'}],
 'parents': [],
 'synonyms': [],
 'comment': []}
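
Since each parsed ontology is a plain list of dictionaries, it can be explored with ordinary Python. The sketch below builds an IRI-to-label index and lists parent/child edges. The helper names are hypothetical; the first record is copied from the output above, while the second record (including the shape of its `parents` entry) is an assumption made for illustration.

```python
from typing import Any, Dict, List, Tuple

Concept = Dict[str, Any]

# Record copied from the parser output shown above.
electronics: Concept = {
    "name": "Electronics",
    "iri": "http://example.org/amazon#Electronics",
    "label": "Electronics",
    "childrens": [{"iri": "http://example.org/amazon#Laptop", "label": "Laptop", "name": "Laptop"}],
    "parents": [],
    "synonyms": [],
    "comment": [],
}

# Illustrative second record in the same shape (assumed, not real parser output).
laptop: Concept = {
    "name": "Laptop",
    "iri": "http://example.org/amazon#Laptop",
    "label": "Laptop",
    "childrens": [
        {"iri": "http://example.org/amazon#GamingLaptop", "label": "GamingLaptop", "name": "GamingLaptop"},
        {"iri": "http://example.org/amazon#Ultrabook", "label": "Ultrabook", "name": "Ultrabook"},
    ],
    "parents": [{"iri": "http://example.org/amazon#Electronics", "label": "Electronics", "name": "Electronics"}],
    "synonyms": [],
    "comment": [],
}

concepts: List[Concept] = [electronics, laptop]


def label_index(items: List[Concept]) -> Dict[str, str]:
    """Map each concept IRI to its label."""
    return {concept["iri"]: concept["label"] for concept in items}


def hierarchy_edges(items: List[Concept]) -> List[Tuple[str, str]]:
    """List (parent label, child label) pairs found in the 'childrens' field."""
    return [(concept["label"], child["label"]) for concept in items for child in concept["childrens"]]


def root_concepts(items: List[Concept]) -> List[str]:
    """Return the labels of concepts that have no parents."""
    return [concept["label"] for concept in items if not concept["parents"]]


print(label_index(concepts))
print(hierarchy_edges(concepts))
print(root_concepts(concepts))
```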

---
# 5️⃣. Encoders: Encoding (preparing) the source and target ontologies.

To facilitate efficient matching, we encode each concept by concatenating its label with the labels of its child concepts. This approach captures both the concept's identity and its hierarchical context, providing a richer representation for similarity computations.

In [None]:
from ontoaligner.encoder import ConceptChildrenLightweightEncoder

encoder = ConceptChildrenLightweightEncoder()

encoder_output = encoder(source=src_onto, target=tgt_onto)

In [None]:
encoder_output

[[{'iri': 'http://example.org/amazon#Electronics',
   'text': 'electronics  laptop'},
  {'iri': 'http://example.org/amazon#Laptop',
   'text': 'laptop  gaminglaptop, ultrabook'},
  {'iri': 'http://example.org/amazon#GamingLaptop', 'text': 'gaminglaptop  '},
  {'iri': 'http://example.org/amazon#Ultrabook', 'text': 'ultrabook  '},
  {'iri': 'http://example.org/amazon#Smartphone', 'text': 'smartphone  '}],
 [{'iri': 'http://example.org/ebay#Computers', 'text': 'computers  notebook'},
  {'iri': 'http://example.org/ebay#Notebook',
   'text': 'notebook  gamingnotebook, businessnotebook'},
  {'iri': 'http://example.org/ebay#GamingNotebook',
   'text': 'gamingnotebook  '},
  {'iri': 'http://example.org/ebay#BusinessNotebook',
   'text': 'businessnotebook  '}]]
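
The encoded texts above follow a simple pattern: the lowercased concept label, two spaces, then the lowercased child labels joined by commas. The sketch below reproduces that pattern for illustration only. The `encode_concept` function is hypothetical and is not the library's implementation, which may apply further preprocessing.

```python
from typing import Any, Dict

Concept = Dict[str, Any]


def encode_concept(concept: Concept) -> Dict[str, str]:
    """Build 'label  child1, child2' text (lowercased) for a parsed concept."""
    children = ", ".join(child["label"] for child in concept["childrens"])
    text = f"{concept['label']}  {children}".lower()
    return {"iri": concept["iri"], "text": text}


# Minimal records in the shape of the parser output (only the fields used here).
electronics = {
    "iri": "http://example.org/amazon#Electronics",
    "label": "Electronics",
    "childrens": [{"iri": "http://example.org/amazon#Laptop", "label": "Laptop", "name": "Laptop"}],
}
gaming_laptop = {
    "iri": "http://example.org/amazon#GamingLaptop",
    "label": "GamingLaptop",
    "childrens": [],
}

encoded = [encode_concept(electronics), encode_concept(gaming_laptop)]
for item in encoded:
    print(repr(item["text"]))
```

A leaf concept has no children, which is why its text ends with the two separator spaces.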

---
# 6️⃣. Aligners: Applying alignment to the source and target ontologies.

**Lightweight Aligner**: We apply a fuzzy string matching algorithm to identify potential correspondences based on lexical similarity. This method computes similarity scores between the encoded concept texts and captures straightforward lexical matches.


In [None]:
from ontoaligner.aligner import SimpleFuzzySMLightweight

fuzzy = SimpleFuzzySMLightweight(fuzzy_sm_threshold=0.3)

fuzzy_matches = fuzzy.generate(input_data=encoder_output)

100%|██████████| 5/5 [00:00<00:00, 27094.99it/s]


In [None]:
fuzzy_matches

[{'source': 'http://example.org/amazon#Electronics',
  'target': 'http://example.org/ebay#Computers',
  'score': 0.42105263157894735},
 {'source': 'http://example.org/amazon#Laptop',
  'target': 'http://example.org/ebay#Notebook',
  'score': 0.547945205479452},
 {'source': 'http://example.org/amazon#GamingLaptop',
  'target': 'http://example.org/ebay#GamingNotebook',
  'score': 0.6153846153846154},
 {'source': 'http://example.org/amazon#Ultrabook',
  'target': 'http://example.org/ebay#Computers',
  'score': 0.5}]
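
To show the general idea behind threshold-based fuzzy matching, the sketch below picks the best-scoring target for a source text and discards it if the score falls below a threshold. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure, so its scores will generally differ from the ones printed above. The `best_match` helper and the shortened identifiers are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def best_match(source_text: str, targets: List[Dict[str, str]], threshold: float) -> Optional[Dict[str, object]]:
    """Return the most similar target, or None if no target reaches `threshold`."""
    best_iri, best_score = None, 0.0
    for target in targets:
        # ratio() returns a similarity value between 0.0 and 1.0
        score = SequenceMatcher(None, source_text, target["text"]).ratio()
        if score > best_score:
            best_iri, best_score = target["iri"], score
    if best_iri is None or best_score < threshold:
        return None
    return {"target": best_iri, "score": best_score}


# Hypothetical target texts (identifiers shortened for readability).
targets = [
    {"iri": "ebay#Computers", "text": "computers"},
    {"iri": "ebay#Notebook", "text": "notebook"},
    {"iri": "ebay#GamingNotebook", "text": "gamingnotebook"},
]

result = best_match("gaminglaptop", targets, threshold=0.3)
print(result)
print(best_match("zzz", targets, threshold=0.3))
```

Raising the threshold trades recall for precision: fewer pairs are returned, but those that remain are lexically closer.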

---
# 🚀 A complete workflow

In [None]:
# Import the core ontology wrapper used to load and manage OWL ontologies
from ontoaligner.ontology import GenericOntology

# Import the encoder that represents concepts using their child concepts
# This is a lightweight structural encoder
from ontoaligner.encoder import ConceptChildrenLightweightEncoder

# Import a simple fuzzy string-matching based aligner
from ontoaligner.aligner import SimpleFuzzySMLightweight

# Create an ontology handler instance
ontology = GenericOntology()

# Initialize the encoder
encoder = ConceptChildrenLightweightEncoder()

# Initialize the fuzzy string-matching aligner
# fuzzy_sm_threshold controls how similar two labels must be to be considered a match
fuzzy = SimpleFuzzySMLightweight(fuzzy_sm_threshold=0.3)

# Parse the source ontology (Amazon)
src_onto = ontology.parse("amazon.owl")

# Parse the target ontology (eBay)
tgt_onto = ontology.parse("ebay.owl")

# Encode both ontologies into a representation suitable for alignment
# The encoder prepares comparable concept features from source and target
encoder_output = encoder(source=src_onto, target=tgt_onto)

# Generate alignment candidates using fuzzy string matching
# The output is a set of matched concept pairs with similarity scores
fuzzy_matches = fuzzy.generate(input_data=encoder_output)

fuzzy_matches

5it [00:00, 6873.65it/s]
4it [00:00, 4196.40it/s]
100%|██████████| 5/5 [00:00<00:00, 23121.85it/s]


[{'source': 'http://example.org/amazon#Electronics',
  'target': 'http://example.org/ebay#Computers',
  'score': 0.42105263157894735},
 {'source': 'http://example.org/amazon#Laptop',
  'target': 'http://example.org/ebay#Notebook',
  'score': 0.547945205479452},
 {'source': 'http://example.org/amazon#GamingLaptop',
  'target': 'http://example.org/ebay#GamingNotebook',
  'score': 0.6153846153846154},
 {'source': 'http://example.org/amazon#Ultrabook',
  'target': 'http://example.org/ebay#Computers',
  'score': 0.5}]
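
Because the matches are plain Python dictionaries, they can also be stored with the standard `json` module. The sketch below is a standard-library round trip under that assumption, not OntoAligner's exporter. The helper names and the file name `fuzzy_matches.json` are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def save_matches(matches: List[Dict[str, Any]], path: Path) -> None:
    """Write the list of matches to a JSON file."""
    path.write_text(json.dumps(matches, indent=2), encoding="utf-8")


def load_matches(path: Path) -> List[Dict[str, Any]]:
    """Read a list of matches back from a JSON file."""
    return json.loads(path.read_text(encoding="utf-8"))


# Two of the matches printed above.
matches = [
    {
        "source": "http://example.org/amazon#GamingLaptop",
        "target": "http://example.org/ebay#GamingNotebook",
        "score": 0.6153846153846154,
    },
    {
        "source": "http://example.org/amazon#Laptop",
        "target": "http://example.org/ebay#Notebook",
        "score": 0.547945205479452,
    },
]

# Use a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_file = Path(tmp_dir) / "fuzzy_matches.json"
    save_matches(matches, out_file)
    restored = load_matches(out_file)

print(restored == matches)
```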

---

# ✅ We learned:

1. **How the parser and encoder work together to prepare ontologies for OntoAligner.**

2. **How aligners are defined, and how the generation of matchings is encapsulated at higher abstraction levels.**

For more information, visit the [OntoAligner Documentation](https://ontoaligner.readthedocs.io/)

-----------------------------------------------------------

📃 Acknowledgement

OntoAligner is licensed under [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)


```bibtex
@inproceedings{babaei2025ontoaligner,
  title={OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment},
  author={Babaei Giglou, Hamed and D'Souza, Jennifer and Karras, Oliver and Auer, S{\"o}ren},
  booktitle={European Semantic Web Conference},
  pages={174--191},
  year={2025},
  organization={Springer}
}
```