Skip to content

The OWL to Pure Translator project automates translating OWL files to Pure code using manual insights and a fine-tuned LLM. It aims to streamline ontology translation, ensuring accuracy and scalability while enabling real-time data validation with Pure libraries.

License

Notifications You must be signed in to change notification settings

reyes-michael/OWL-Legend-Translator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

OWL to Pure Translator Project

Project Overview

The OWL to Pure Translator project aims to develop a robust and efficient method for translating OWL (Web Ontology Language) files into Pure code. Given the complexity and size of ontologies like the Comprehensive Disease Network Ontology (CDNO), our approach combines manual translation with automated processes using a Large Language Model (LLM). This strategy ensures accuracy, scalability, and a deep understanding of the underlying patterns in OWL ontologies.

Objectives

  1. Understand OWL Ontologies: Manually translate key components of the CDNO ontology to identify and document patterns and structures.
  2. Automate Translation: Develop and fine-tune an LLM to automate the translation of OWL files to Pure code.
  3. Validate Data: Use Pure libraries to validate data against the generated Pure models, ensuring compliance with the defined ontology structures.

Approach

Phase 1: Manual Translation and Pattern Recognition

  1. Load and Inspect CDNO:

    • Use tools like Protégé to load and explore the CDNO ontology.
    • Identify key classes, properties, and relationships.
  2. Document Key Components:

    • Create a detailed document outlining the structure of the CDNO ontology.
    • List classes, properties, domains, ranges, restrictions, and annotations.
  3. Manual Translation of Key Sections:

    • Manually translate several key sections of the CDNO ontology to Pure code.
    • Document any patterns or recurring structures observed.

Phase 2: Data Preparation for LLM Training

  1. Prepare Annotated Dataset:

    • Manually translate a diverse set of OWL constructs to Pure.
    • Ensure translations are well-documented and annotated to highlight patterns and structures.
  2. Expand Dataset with Automation:

    • Write scripts to automate the extraction of additional OWL samples.
    • Ensure these samples cover a wide range of OWL constructs.

Phase 3: Training the LLM

  1. Choose an LLM Framework:

    • Use a pre-trained LLM framework such as GPT-4 from OpenAI or Hugging Face’s Transformers.
  2. Fine-Tune the Model:

    • Fine-tune the model using the annotated dataset.
    • Validate the model’s output using a separate set of OWL files to ensure it generalizes well.
  3. Develop a User Interface:

    • Create a simple interface to input OWL files and output Pure code.
    • Ensure the interface allows for easy validation and feedback.

Phase 4: Integration and Testing

  1. Integrate with Pure Libraries:

    • Use Pure libraries to validate data against the generated Pure models.
    • Ensure data complies with the structure and constraints defined in the Pure models.
  2. Iterative Testing and Improvement:

    • Rigorously test the model with various OWL ontologies.
    • Incorporate feedback and expand the training dataset with more complex examples.

About

The OWL to Pure Translator project automates translating OWL files to Pure code using manual insights and a fine-tuned LLM. It aims to streamline ontology translation, ensuring accuracy and scalability while enabling real-time data validation with Pure libraries.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages