DDaT Ontology Modeller

Automated parsing, and ontological & machine learning-powered semantic similarity modelling, of the Digital, Data and Technology (DDaT) profession capability framework website.

DDaT Capability Framework · DDaT Ontology Visualisation

Figure: DDaT ontology visualisation in OntoSpark

Table of Contents

1. Introduction
2. Getting Started
    2.1. Prerequisites
    2.2. Clone Source Code
    2.3. Install Python Packages
    2.4. Configuration
    2.5. Usage
3. License
4. Acknowledgements
5. Useful Links
6. Authors

1. Introduction

The DDaT ontology modeller is a Python application that automatically parses the Digital, Data and Technology (DDaT) profession capability framework website, models its contents as an ontology, and applies machine learning-powered semantic similarity modelling to the result. The ontology is intended to enable effective visualisation of the framework, while the semantic similarity modelling is intended to identify potentially duplicate classes, such as skills.
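To illustrate the idea of flagging potentially duplicate classes by semantic similarity, the following is a minimal sketch rather than the application's actual implementation. It assumes that each class label has already been converted into an embedding vector (in practice, by a sentence embedding model), and uses cosine similarity with an arbitrary threshold. The function names, the labels, the three-dimensional toy vectors and the 0.9 threshold are all hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def find_potential_duplicates(embeddings, threshold=0.9):
    """Return (label_a, label_b, score) for each pair scoring at or above threshold.

    `embeddings` maps a class label to its embedding vector. The threshold is
    an illustrative assumption and would need tuning on real data.
    """
    pairs = []
    for (label_a, vec_a), (label_b, vec_b) in combinations(embeddings.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((label_a, label_b, score))
    # Most similar pairs first
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical labels and toy vectors, for illustration only
toy_embeddings = {
    "Data analysis": [0.90, 0.10, 0.00],
    "Analysing data": [0.88, 0.12, 0.01],
    "Stakeholder management": [0.00, 0.20, 0.95],
}

duplicates = find_potential_duplicates(toy_embeddings)
for label_a, label_b, score in duplicates:
    print(f"{label_a} <-> {label_b}: {score:.3f}")
```

Pairs flagged this way are only candidates. Whether two classes are genuinely duplicates remains a judgement for a human reviewer.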

2. Getting Started

2.1. Prerequisites

Please ensure that the following prerequisite software is installed in your environment.

  • Git - open source distributed version control system.
  • Python 3 - general-purpose programming language.
  • ChromeDriver - WebDriver for Chrome. The ChromeDriver version should match your installed version of Google Chrome.
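As a quick sanity check of the prerequisites above, a small script along the following lines may help. It is a sketch under assumptions: the function name is hypothetical, and it assumes the executables are named git and chromedriver and are discoverable on your PATH. ChromeDriver does not have to be on the PATH for the application itself, since its absolute path is set in the application configuration.

```python
import shutil
import sys


def check_prerequisites(executables=("git", "chromedriver")):
    """Map each prerequisite name to True if it appears to be available.

    Executable names are assumptions; adjust them for your platform.
    """
    # The interpreter running this script stands in for "Python 3"
    status = {"python3": sys.version_info.major == 3}
    for name in executables:
        # shutil.which returns None when the executable is not on the PATH
        status[name] = shutil.which(name) is not None
    return status


for name, available in check_prerequisites().items():
    print(f"{name}: {'found' if available else 'not found on PATH'}")
```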

2.2. Clone Source Code

The open source code for this application may be found on GitHub at https://github.com/hyperlearningai/ddat-ontology-modeller. To clone the repository, please run the following Git command via your command line or preferred Git GUI tool. The base location of the cloned repository will hereafter be referred to as $DDAT_ONTOLOGY_MODELLER_BASE.

# Clone the GitHub public repository
$ git clone https://github.com/hyperlearningai/ddat-ontology-modeller

# Navigate into the base project folder
# This location will hereafter be referred to as $DDAT_ONTOLOGY_MODELLER_BASE
$ cd ddat-ontology-modeller

2.3. Install Python Packages

The DDaT ontology modeller application requires the Pandas, PyYAML, Selenium and Sentence Transformers Python packages to be installed in the relevant Python 3 environment. You can install these dependencies either manually or, from $DDAT_ONTOLOGY_MODELLER_BASE, via requirements.txt using pip as follows:

# Install the required Python package dependencies in your active Python environment
$ pip install -r requirements.txt
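After installation, you may wish to confirm that the four packages are importable in the active environment. The following sketch does so without importing them, which can be slow for larger packages. The function name and the mapping are illustrative; the module names are the standard import names for these packages.

```python
from importlib.util import find_spec

# Package name (as listed above) -> top-level module name used when importing
REQUIRED_MODULES = {
    "Pandas": "pandas",
    "PyYAML": "yaml",
    "Selenium": "selenium",
    "Sentence Transformers": "sentence_transformers",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the names of packages whose module cannot be located."""
    # find_spec returns None when a top-level module is not installed
    return [package for package, module in modules.items() if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```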

2.4. Configuration

The DDaT ontology modeller application configuration may be found at $DDAT_ONTOLOGY_MODELLER_BASE/ddat/config/config.yaml. Please review and update the following configuration as required before running the application.

| Property | Description |
| --- | --- |
| app.base_working_dir | Absolute path to a readable and writeable local directory to which the DDaT ontology is written as an OWL RDF/XML file, along with other working files and application log files. |
| app.webdriver_paths.chromedriver | Absolute path to the Google Chrome WebDriver (see Prerequisites). |
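Before running the application, it can be useful to check that both properties point at usable locations. The sketch below validates a configuration that has already been loaded into a Python dictionary. It assumes that the dotted property names correspond to nested keys (app, then base_working_dir and webdriver_paths.chromedriver), which is an inference from the names above rather than something confirmed by the application's source. The function name and the demonstration values are hypothetical.

```python
import os
import tempfile


def validate_config(config):
    """Return a list of problems found in an already-loaded configuration dict."""
    problems = []
    app = config.get("app", {})

    working_dir = app.get("base_working_dir")
    if not working_dir or not os.path.isabs(working_dir):
        problems.append("app.base_working_dir must be an absolute path")
    elif not os.path.isdir(working_dir):
        problems.append("app.base_working_dir is not an existing directory")
    elif not os.access(working_dir, os.R_OK | os.W_OK):
        problems.append("app.base_working_dir must be readable and writeable")

    chromedriver = app.get("webdriver_paths", {}).get("chromedriver")
    if not chromedriver or not os.path.isabs(chromedriver):
        problems.append("app.webdriver_paths.chromedriver must be an absolute path")
    elif not os.path.isfile(chromedriver):
        problems.append("app.webdriver_paths.chromedriver is not an existing file")

    return problems


# Demonstration with a temporary directory and a placeholder driver file
with tempfile.TemporaryDirectory() as tmp_dir:
    placeholder_driver = os.path.join(tmp_dir, "chromedriver")
    open(placeholder_driver, "w").close()
    valid_problems = validate_config({
        "app": {
            "base_working_dir": tmp_dir,
            "webdriver_paths": {"chromedriver": placeholder_driver},
        }
    })

# A relative working directory and a missing driver path are both reported
invalid_problems = validate_config({"app": {"base_working_dir": "relative/dir"}})
print(valid_problems)
print(invalid_problems)
```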

2.5. Usage

To run the DDaT ontology modeller application, simply run $DDAT_ONTOLOGY_MODELLER_BASE/main.py as follows:

# Run the DDaT ontology modeller application
$ python main.py
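Once the application has finished, the ontology is written as an OWL RDF/XML file to the configured working directory. As a rough way of inspecting such a file, the following sketch lists the IRIs of named OWL classes using only the Python standard library. It assumes that classes are serialised as owl:Class elements with an rdf:about attribute; classes expressed as rdf:Description elements with an rdf:type are not handled. The function name and the sample document, including its example.org IRIs, are hypothetical and do not reflect the application's actual output.

```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_owl_classes(rdf_xml):
    """Return the rdf:about IRIs of named owl:Class elements in an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    iris = []
    # ElementTree addresses namespaced tags and attributes as {namespace}name
    for element in root.iter(f"{{{OWL_NS}}}Class"):
        iri = element.get(f"{{{RDF_NS}}}about")
        if iri is not None:  # skip anonymous classes without an IRI
            iris.append(iri)
    return iris


# Hypothetical sample document, for illustration only
SAMPLE_RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/ddat#Skill"/>
  <owl:Class rdf:about="http://example.org/ddat#Role"/>
</rdf:RDF>"""

classes = list_owl_classes(SAMPLE_RDF_XML)
for iri in classes:
    print(iri)
```

To inspect a real output file, read its contents and pass them to the same function.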

3. License

The DDaT ontology modeller application source code is available and distributed under the MIT License. Please refer to LICENSE for further information. The DDaT ontology created by the DDaT ontology modeller application contains public sector information licensed under the Open Government Licence v3.0.

4. Acknowledgements

The DDaT ontology created by the DDaT ontology modeller application contains public sector information sourced from the Digital, Data and Technology (DDaT) profession capability framework, which is maintained by the Central Digital and Data Office (CDDO). The framework is publicly available under the Open Government Licence v3.0.

5. Useful Links

6. Authors

The DDaT ontology modeller application was developed by the following authors:

  • Jillur Quddus
    Chief Data Scientist & Principal Polyglot Software Engineer
