# Geological Interpreter Development

This is a notebook for testing and developing some of the basic code in this package.

## Testing ontology manipulation

The knowledge manipulated in this package is formalised in an ontology,<br>
which is stored in a *.owl* file.

It is named **MOGI**, for **M**inimal **O**ntology for **G**eological **I**nterpretation.

To manipulate this ontology, we use the package **owlready2**, available from https://owlready2.readthedocs.io

In [None]:
import owlready2 as owl

In [None]:
owl.onto_path.append("../ontologies/")
mogi = owl.get_ontology("mogi.owl").load()
mogi

An ontology provides access to its components, e.g.:
* classes
* properties
* individuals
* rules

In [None]:
print(list(mogi.classes()))
print(list(mogi.properties()))
print(list(mogi.individuals()))
print(list(mogi.rules()))

More specific elements can be searched through simple queries:

In [None]:
mogi.search(iri = "*Surface")

In [None]:
context = mogi.Geologic_Context('Data_properties')

In [None]:
context.get_properties()

In [None]:
context.INDIRECT_get_properties()

In [None]:
owl.Thing.get_properties(owl.Thing)

### Reasoner

Ontologies are even more powerful thanks to their ability to use reasoning for inferring types, properties, and relationships that were not explicitly stated.
This is useful for obtaining results implied by the already stated information.

This is achieved by running a *reasoner* on the ontology as follows.

In [None]:
owl.sync_reasoner(infer_property_values=True)

## Geological Knowledge Manager

**GeologicalKnowledgeManager** may know different instances of **GeologicalKnowledgeFramework**,<br>
for example to differentiate scenarios or to allow customisation of knowledge and its formalisation.

**GeologicalKnowledgeFramework** provides access to concept definitions for providing knowledge.

In [None]:
import os

class GeologicalKnowledgeManager(object):
    """GeologicalKnowledgeManager is managing one or several GeologicalKnowledgeFramework.
    
    The GeologicalKnowledgeManager is typically a singleton, so there is always one and only one instance of it.
    
    The GeologicalKnowledgeManager may know different instances of GeologicalKnowledgeFramework,
    for example to allow different interpretation scenarios or for allowing user-specific customisation
    of knowledge and its formalisation.
    
    GeologicalKnowledgeFramework are typically ontologies and extensions defined in this package or elsewhere.
    """
    
    def __new__(cls):
        """Method to access (and create if needed) the only allowed instance of this class.
        
        Returns:
        - an instance of GeologicalKnowledgeManager"""
        if not hasattr(cls, 'instance'):
            cls.instance = super(GeologicalKnowledgeManager, cls).__new__(cls)
            cls.initialised= False
            print("DEBUG::creates new manager")
        return cls.instance
        
    def __init__(self, default= "mogi", default_source_directory= "../ontologies/", default_source_file= "mogi.owl", default_ontology_backend= "owlready2"):
        """Initializes the GeologicalKnowledgeManager with some default values from configuration.
        
        Parameters:
        - default: specifies the name of the default knowledge framework
        - default_source_directory: specifies the default folder containing the knowledge framework definitions
        - default_source_file: file contained in the source_directory defining the knowledge framework (e.g., .owl file)
        - default_ontology_backend: specifies the default ontology backend to be used
        """
        print("DEBUG::__init__")
        if not self.initialised:
            self._initialise(default= default, default_source_directory= default_source_directory, default_source_file= default_source_file, default_ontology_backend= default_ontology_backend)
            
    def _initialise(self, default, default_source_directory, default_source_file, default_ontology_backend):
        """Initializes the GeologicalKnowledgeManager with some default values from configuration.
        
        Parameters:
        - default: specifies the name of the default knowledge framework
        - default_source_directory: specifies the default folder containing the knowledge framework definitions
        - default_source_file: file contained in the source_directory defining the knowledge framework (e.g., .owl file)
        - default_ontology_backend: specifies the default ontology backend to be used
        """
        print("DEBUG::initialize manager")
        self.default= default
        self.default_source_directory= default_source_directory
        self.default_source_file= default_source_file
        self.default_ontology_backend= default_ontology_backend
        
        self.knowledge_framework_dict = {}
        
        self.initialised= True
        
    def reset(self, default= "mogi", default_source_directory= "../ontologies/", default_source_file= "mogi.owl", default_ontology_backend= "owlready2"):
        """Reinitializes the GeologicalKnowledgeManager with some default values from configuration.
        
        Parameters:
        - default: specifies the name of the default knowledge framework
        - default_source_directory: specifies the default folder containing the knowledge framework definitions
        - default_source_file: file contained in the source_directory defining the knowledge framework (e.g., .owl file)
        - default_ontology_backend: specifies the default ontology backend to be used
        """
        print("DEBUG::reset manager")
        self._initialise(default= default, default_source_directory= default_source_directory, default_source_file= default_source_file, default_ontology_backend= default_ontology_backend)
             
    def load_knowledge_framework(self, name=None, source= None, source_directory= None, backend= None):
        """Gets and initialises the ontology from the specified source.
        
        Parameters:
        - name: the name to be given to the knowledge framework. If None (default) the file name will be used.
        - source: filename to the ontology source. If None(default) the default ontology is used.
        - source_directory: where the system should look for ontology definition files. If None, the `GeologicalKnowledgeFramework` will decide.
        - backend: the ontology backend to be used. If None, the `GeologicalKnowledgeFramework` will decide."""
        source = source if source is not None else self.default_source_file
        name = name if name is not None else os.path.basename(source).split(os.path.extsep)[0]
        self.knowledge_framework_dict[name] = GeologicalKnowledgeFramework(name= name, source= source, source_directory= source_directory, backend= backend)
    
    def get_knowledge_framework(self,name= "default"):
        """Accessor to knowledge frameworks."""
        name = self.default if name == "default" else name
        assert len(self.knowledge_framework_dict) > 0, "No ontology has been loaded yet. Please use GeologicalKnowledgeManager().load_knowledge_framework() first"
        assert name in self.knowledge_framework_dict.keys(), "The specified ontology hasn't been loaded: "+name+\
            "\navailable ontology names are: "+"\n".join(self.knowledge_framework_dict.keys())
        return self.knowledge_framework_dict[name]
    
class GeologicalKnowledgeFramework(object):
    """A GeologicalKnowledgeFramework holds the definition of concepts and relationships describing knowledge.
    
    This is typically an overlay around a formal ontology definition, which also brings additional capabilities,
    such as algorithms and factories to achieve specific tasks and create objects."""
    
    def __init__(self, name, source, source_directory= None, backend= None):
        """Initialise a KnowledgeFramework from a given ontology file (source).
        
        Parameters:
        - name: should be the name under which this KnowledgeFramework is known in the manager
        - source: the source file for the ontology definition
        - source_directory: the directory where the source files for the ontology definition are looked for.
        If None (default) the default path provided by the `KnowledgeManager` is used.
        - backend: the ontology backend to be used for this knowledge framework.
        If None (default) the default ontology backend provided by the `KnowledgeManager` is used."""
        self.name= name
        print(source)
        self.__source_directory= None
        self.init_source_directory(source_directory)
        self.initialise_ontology_backend(backend)
        print(source)
        self.load_ontology(source)
    
    def init_source_directory(self, source_directory):
        """Initialises the folder where source files are searched.
        
        Parameters:
        - source_directory: if None, the previous value is used if it wasn't None, else the `GeologicalKnowledgeManager` default is used."""
        if source_directory is not None:
            self.__source_directory= source_directory
        elif self.__source_directory is None:
            self.__source_directory= GeologicalKnowledgeManager().default_source_directory
    
    def initialise_ontology_backend(self, backend_name:str= None):
        """Initializes the ontology package used as a backend to access ontologies.
        
        This will:
        - try to import the backend as onto
        - set the default path for ontologies"""
                
        self.__ontology_backend = None
        backend_name= GeologicalKnowledgeManager().default_ontology_backend if backend_name is None else backend_name
        if backend_name == "owlready2":
            try:
                import owlready2 as owl2 
                self.__ontology_backend = owl2
                if self.__source_directory not in self.__ontology_backend.onto_path:
                    self.__ontology_backend.onto_path.append(self.__source_directory)
            except ImportError:
                raise ImportError("You are trying to use Owlready2 as a backend for ontology management, but it doesn't appear to be installed. "\
                "This is either because Owlready2 is given as default option or because you asked for it. "\
                "Please install the Owlready2 package from https://owlready2.readthedocs.io "\
                "or specify another backend through GeologicalKnowledgeManager().load_knowledge_framework(backend=...)")
                
            # also test if java is correctly installed & accessible, as it is used by owlready2 for reasoning
            # NB: os.system does not raise when the command fails, it returns a non-zero exit status
            if os.system("java -version") != 0:
                raise ImportError("Java doesn't appear to be installed properly as the command `java -version` returned an error. "\
                    "This error occurred while loading owlready2 package as an ontology backend, because java is used for the reasoning engine.")
        else:
            raise Exception("The specified backend for ontology is not supported: "+backend_name)
          
        
    def load_ontology(self, source):
        """Loads the ontology specified by source.
        
        Parameters:
        - source: the source file for the ontology definition. It is looked for in the source directory
        set by `init_source_directory`, which has been added to the backend search path."""
        self.__source= source
        print(source)
        try:
            self.__onto = self.__ontology_backend.get_ontology(self.__source).load()
        except Exception as err:
            # chain the original error so that its cause is not lost
            raise Exception("Unexpected exception received while loading ontology:\n - source: {}\n - onto_path: {}".format(self.__source, self.__ontology_backend.onto_path)) from err
        
    def __call__(self):
        return self.__onto
        
    def get_ontology_backend(self):
        """Gets the ontology backend"""
        assert self.__ontology_backend is not None, "Trying to access the ontology backend without initialising it."
        return self.__ontology_backend
    
    def sync_reasoner(self, **kargs):
        """Synchronise the reasoner.
        
        Parameters:
        - **kargs:
        |-infer_property_values"""
        self.__ontology_backend.sync_reasoner(**kargs)
    

In our approach, geological datasets will be progressively interpreted in terms of structural objects,<br>
based on a formal definition of concepts owned by a **GeologicalKnowledgeManager**.<br>


In [None]:
GeologicalKnowledgeManager()

In [None]:
GeologicalKnowledgeManager()

In [None]:
GeologicalKnowledgeManager().reset()

In [None]:
GeologicalKnowledgeManager()
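
The cells above exercise the singleton behaviour of the manager: every call returns the same instance, and only the first one runs the initialisation. The mechanism can be sketched in isolation as follows. `SingletonSketch` and its `registry` attribute are hypothetical names used for illustration only; this is a minimal sketch of the pattern, not the manager itself.

```python
class SingletonSketch:
    """Hypothetical stand-in for a singleton manager (illustration only)."""

    def __new__(cls):
        # Create the unique instance on first access only.
        if not hasattr(cls, "instance"):
            cls.instance = super().__new__(cls)
            cls.initialised = False
        return cls.instance

    def __init__(self):
        # __init__ runs on every call to SingletonSketch(), so the setup is guarded.
        if not self.initialised:
            self.registry = {}
            self.initialised = True


first = SingletonSketch()
first.registry["mogi"] = "loaded"
second = SingletonSketch()

print(first is second)    # True: both names point to the same object
print(second.registry)    # {'mogi': 'loaded'}: the state survived the second call
```

Because Python calls `__init__` after every `__new__`, the guard on `initialised` is what keeps a second call from wiping the stored state.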

In [None]:
GeologicalKnowledgeManager().load_knowledge_framework()
GeologicalKnowledgeManager().get_knowledge_framework()
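
When no name is given, `load_knowledge_framework` derives it from the source file name. The sketch below reproduces that expression in a standalone function (`framework_name_from_source` is a hypothetical helper name) to show what name results from a few file names.

```python
import os


def framework_name_from_source(source):
    """Hypothetical helper: derive a framework name from an ontology file name."""
    # Keep only the part of the file name located before the first extension separator.
    return os.path.basename(source).split(os.path.extsep)[0]


print(framework_name_from_source("mogi.owl"))
print(framework_name_from_source("../ontologies/mogi.owl"))
print(framework_name_from_source("mogi.v2.owl"))
```

All three calls give `mogi`: the directory is dropped, and so is everything after the first dot. Two files such as `mogi.owl` and `mogi.v2.owl` would therefore be stored under the same key unless an explicit `name` is passed.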

In [None]:
GeologicalKnowledgeFramework("mogi","mogi.owl")

In [None]:
GeologicalKnowledgeManager().knowledge_framework_dict

In [None]:
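# the framework gets its own name, so that `mogi` keeps referring to the ontology loaded above
mogi_framework = GeologicalKnowledgeManager().get_knowledge_framework()
mogi_framework.name

In [None]:
# calling the framework returns the underlying ontology
list(mogi_framework().classes())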

## Creating a dataset

Data are actually described within the ontology, here thanks to the *Data* class.<br>
Adding new data points calls for creating new *Data* individuals (i.e., instances in the ontology).

In [None]:
import numpy as np
import pandas as pd

In [None]:
data_head = np.array(['name', 'x', 'y', 'z', 'dip_dir', 'dip', 'geology'])
data_array = np.array([['D1', 15, 20, 35, 270, 45, 'Trias_Base'],
                       ['D2', 30, 25, 50, 270, 45, 'Trias_Base'],
                       ['D3', 60, 30, 40, 90, 45, 'Trias_Base'],
                       ['D4', 75, 15, 25, 90, 45, 'Trias_Base'],
                       ['D5', 110, 20, 40, 270, 63, 'Trias_Base'],
                       ['D6', 120, 20, 60, 270, 64, 'Trias_Base'],
                       ['D7', 155, 20, 60, 89, 39, 'Trias_Base'],
                       ['D8', 190, 20, 30, 91, 40, 'Trias_Base'],
                       ['D11', 25, 22, 45, np.nan, np.nan, np.nan],
                       ['D22', 50, 22, 50, np.nan, np.nan, np.nan],
                       ['D44', 100, 30, 20, np.nan, np.nan, np.nan],
                       ['D77', 168, 30, 47, np.nan, np.nan, np.nan]]
)
dataset = pd.DataFrame(data = data_array, columns = data_head)
dataset = dataset.astype({'name':str, 'x':float, 'y':float, 'z':float, 'dip_dir':float, 'dip':float, 'geology':str})
dataset.set_index("name", inplace = True)
dataset

In [None]:
dataset.info()

In [None]:
# clearing any data already stored in the ontology
for data_i in mogi.search(type = mogi.Ponctual_Observation):
    owl.destroy_entity(data_i)
mogi.search(type = mogi.Ponctual_Observation)

In [None]:
# setting the dataset in the ontology by creating individuals
for name_i, values_i in dataset.iterrows():
    mogi.Ponctual_Observation(name_i, **{key:[val] for key, val in values_i.items()})
mogi.search(type = mogi.Ponctual_Observation)
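
The keyword arguments built in the cell above wrap each value in a list, since owlready2 exposes the values of non-functional properties as lists. The sketch below shows the same mapping on a plain dictionary, without owlready2. Skipping missing values is an assumption made here for illustration (the cell above passes them as they are), and `property_values` is a hypothetical helper name.

```python
import math


def property_values(row):
    """Hypothetical helper: map column names to single-item lists, skipping NaN."""
    values = {}
    for key, val in row.items():
        # Missing measurements are stored as NaN: leave them out of the mapping.
        if isinstance(val, float) and math.isnan(val):
            continue
        values[key] = [val]
    return values


row = {"x": 25.0, "y": 22.0, "z": 45.0, "dip_dir": float("nan"), "dip": float("nan")}
print(property_values(row))   # {'x': [25.0], 'y': [22.0], 'z': [45.0]}
```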

In [None]:
# for loading dataset from the ontology
dataset = pd.DataFrame(columns=["name","x","y","z","dip_dir","dip",'geology'])
dataset.set_index("name",inplace=True)
for di in mogi.search(type = mogi.Ponctual_Observation):
    for prop in di.get_properties():
        for value in prop[di]:
            dataset.loc[di.name,prop.name] = value
dataset = dataset.astype({'x':float, 'y':float, 'z':float, 'dip_dir':float, 'dip':float, 'geology':str})
dataset.head()

In [None]:
dataset.info()

### Object implementation

In [None]:
class Space(object):
    """A `Space` represents an abstract place where things exist and can be observed or rendered.
    
    It is typically derived into:
    - `PhysicalSpace` for spaces with physical coordinates (typically X, Y, Z)
    - `TemporalSpace` for spaces with a time coordinate"""
    
class PhysicalSpace(Space):
    """A `PhysicalSpace` represents a physical place where things exist and can be observed or rendered."""
    
class TemporalSpace(Space):
    """A `TemporalSpace` represents a time span where things exist and can be observed or rendered."""

class DataSet(object):
    """A `DataSet` gathers several kinds of data / observations / information"""

## Data visualisation

### Testing Data visualisation

In [None]:
import matplotlib.pyplot as plt

In [None]:
def draw_line(center, dip, dir, length= 1, ax= None, color = "black", **kargs):
    ax_plt = plt if ax is None else ax

    center = np.array(center)
    dip_rad = np.deg2rad(dip)
    vec_x =  np.cos(dip_rad)
    if dir == "left": vec_x *= -1
    vec_z = -np.sin(dip_rad)
    vect = 0.5 * length * np.array([vec_x,vec_z])
    start = center - vect
    end = center + vect
    ax_plt.plot([start[0],end[0]],[start[1],end[1]], color = color, **kargs)
    
    return vect
    
def draw_dip_symbol(center, dip, dir, length= 1, polarity= None, ax= None, color = "black", polarity_ratio= 0.4, **kargs):
    ax_plt = plt if ax is None else ax
    
    vect = draw_line(center= center, dip= dip, dir= dir, length= length, ax= ax_plt, color = color, **kargs)
    
    if polarity is not None:
        vect_pol = polarity_ratio * np.array([-vect[1],vect[0]])
        if (dir == "left" and polarity == "up") or (dir == "right" and polarity == "down") : vect_pol *= -1
        ax_plt.arrow(*center,*vect_pol, width=length/100, color = color, **kargs)
        

In [None]:
draw_line([0,0],30, "left")
draw_dip_symbol([0,1],60, "right", polarity= "up", color= "red" )
plt.gca().set_aspect("equal")
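
The geometry behind the dip symbol can be checked without plotting. The sketch below recomputes, with the standard library only, the half-segment vector that `draw_line` returns for a vertical section where the first coordinate is horizontal and the second is elevation. `half_dip_vector` is a hypothetical name used for this illustration.

```python
import math


def half_dip_vector(dip, direction, length=1.0):
    """Half of the segment drawn for a dip symbol, as (horizontal, vertical) components."""
    dip_rad = math.radians(dip)
    vec_x = math.cos(dip_rad)
    if direction == "left":
        vec_x = -vec_x
    # The line goes down when moving towards the dip direction.
    vec_z = -math.sin(dip_rad)
    return (0.5 * length * vec_x, 0.5 * length * vec_z)


vx, vz = half_dip_vector(30, "left")
print(round(vx, 3), round(vz, 3))       # -0.433 -0.25
print(round(math.hypot(vx, vz), 3))     # 0.5, i.e. half of the default length
```

The segment is drawn from `center - vect` to `center + vect`, so its total length equals `length` whatever the dip.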

In [None]:
def draw_dataset( dataset, ax= None, **kargs):
    ax_plt = plt if ax is None else ax
    
    for data_i in dataset.itertuples():
        # NaN never compares equal to anything, so missing values need an explicit test
        if not (np.isnan(data_i.dip) or np.isnan(data_i.dip_dir)):
            dir = "right" if data_i.dip_dir < 180 else "left"
            draw_dip_symbol( center= [data_i.x,data_i.z], dip= data_i.dip, dir= dir, ax= ax_plt, **kargs)

In [None]:
draw_dataset(dataset, length=10, polarity="up")
plt.gca().set_aspect("equal")
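
Some points of the dataset carry no orientation, which is stored as NaN. Selecting the points that can be drawn requires an explicit NaN test, because NaN never compares equal to anything, including itself. A minimal illustration with the standard library:

```python
import math

dip = float("nan")

print(dip != float("nan"))   # True: a `!=` comparison cannot detect missing values
print(dip == dip)            # False: NaN is not even equal to itself
print(math.isnan(dip))       # True: the explicit test is the reliable one

# Keep only the measurements that actually hold a value.
measurements = [45.0, float("nan"), 63.0]
oriented = [d for d in measurements if not math.isnan(d)]
print(oriented)              # [45.0, 63.0]
```

With NumPy or pandas, `np.isnan` and `pd.notna` play the same role as `math.isnan` here.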

In [None]:
next(dataset.itertuples()).dip

### Object implementation

We distinguish two kinds of operations here:
* representation
* visualisation

A representation is a formal description of how something appears in a given representation space, but it doesn't have to be visualised.<br>
A visualisation takes care of the rendering of a representation with a given support (image, screen).

Representation should also be made a bit more abstract.<br>
1. There is a variety of objects that can be rendered in a representation space (typically, different kinds of dataset components)
2. Several kinds of representation spaces could be envisioned (e.g., spatial 1D, 2D, 3D, or temporal, or just an abstract text)
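
The separation between representation and visualisation can be sketched as follows. All names (`PointRepresentation`, `TextVisualisation`, `render`) are hypothetical and only illustrate the idea: the representation states where something sits in a representation space, and the visualisation decides how to render it on a given support, here plain text.

```python
from dataclasses import dataclass


@dataclass
class PointRepresentation:
    """Hypothetical representation: where a data point sits in a 2D section."""
    name: str
    x: float
    z: float


class TextVisualisation:
    """Hypothetical visualisation: renders representations as plain text."""

    def render(self, representations):
        # One line per represented object, listing its coordinates.
        return "\n".join(
            "{}: X={:g}, Z={:g}".format(r.name, r.x, r.z) for r in representations
        )


points = [PointRepresentation("D1", 15, 35), PointRepresentation("D2", 30, 50)]
print(TextVisualisation().render(points))
```

The same list of representations could be handed to another visualisation, for instance one based on matplotlib, without changing the representations themselves.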

In [None]:
class RepresentationSpace(object):
    """A general framework for representing geological objects"""
    
class TemporalRepresentationSpace(RepresentationSpace):
    """A `RepresentationSpace` representing temporal aspects of represented objects."""
    
class PhysicalRepresentationSpace(RepresentationSpace):
    """A type of `RepresentationSpace` representing physical aspects of the represented objects."""
    
    __default_coordinate_labels = ["X","Y","Z"]
    
    def __init__(self, dimension: int=None, coordinate_label: str|list= None ):
        """Initialisation of the representation space.
        
        Parameters:
        - dimension (int): specify the number of dimensions of the representation space, typically 1D, 2D, or 3D (i.e., 1, 2, or 3),
        NB: larger dimension spaces are not supported. At least either the `dimension` parameter or `coordinate_label` parameter should be given.
        - coordinate_label(str|list(str)): gives the label(s) of the coordinates. If given, the number of dimensions is deduced from the size of the list
        and `dimension` is ignored, otherwise, the labels are taken from the `__default_coordinate_labels` based on the number of `dimension`s. 
        At least either the `dimension` parameter or `coordinate_label` parameter should be given.
        """
        assert not (coordinate_label is None and dimension is None), "At least one of the parameters should be specified"
        if coordinate_label is None:
            assert dimension in [1,2,3], "The specified number of dimensions ({}) is not supported, should be 1, 2 or 3.".format(dimension)
            self.dimension= dimension
            self.coordinate_labels= PhysicalRepresentationSpace.__default_coordinate_labels[:self.dimension]
        elif isinstance(coordinate_label,str):
            self.dimension= 1
            self.coordinate_labels=  [coordinate_label]
        elif isinstance(coordinate_label, list):
            self.dimension= len(coordinate_label)
            self.coordinate_labels= coordinate_label
        else:
            raise TypeError("Unsupported initialisation of representation space: dimension({}) and coordinate_label ({}).\n `coordinate_label` should be a str or a list of str.".format(dimension, coordinate_label))
            

In [None]:
PhysicalRepresentationSpace(2)

In [None]:
PhysicalRepresentationSpace()

In [None]:
PhysicalRepresentationSpace(coordinate_label=["X","Y"])

## Dataset

In [None]:
class GeologicalDataset(object):
    """A GeologicalDataset gathers information about geological data to be interpreted"""

## Interpretation Workflow

The interpretation process itself is run in a **GeologicalInterpretationProcess** and follows a very simple and generic algorithm.<br>
This algorithm implements a Deming wheel process of continual improvement:
1. Plan:
    1. Select a situation
    2. Select an action
2. Do: Implement the action (e.g., CreateInterpretationElement)
    1. List features
    2. Identify possible explanations
    3. Rank/choose explanations
    4. Instantiate individuals
    5. Infer and set parameters
3. Check: Evaluate consistency
    1. Evaluate internal consistency
    2. Evaluate relational likelihood
    3. Evaluate feature explanation
4. Act: Generate anomalies and report
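
The cycle above can be sketched as a generic loop. This is only an outline under stated assumptions: `run_interpretation`, `select_action`, and `check` are hypothetical names, situations are plain strings, and feeding anomalies back as new situations is one possible reading of the "Act" step rather than something stated in this notebook.

```python
def run_interpretation(situations, select_action, check, max_cycles=10):
    """Hypothetical driver for a plan/do/check/act loop."""
    pending = list(situations)   # work on a copy, the caller's list is left untouched
    report = []
    for _ in range(max_cycles):
        if not pending:
            break
        situation = pending.pop(0)            # Plan: select a situation
        action = select_action(situation)     # Plan: select an action
        result = action(situation)            # Do: implement the action
        anomalies = check(result)             # Check: evaluate consistency
        report.append((situation, result, anomalies))   # Act: report
        pending.extend(anomalies)             # Act: anomalies become new situations (assumption)
    return report


def select_action(situation):
    # A single dummy action that "explains" the situation with a surface.
    return lambda s: "surface explaining " + s


def check(result):
    # No consistency rule in this sketch, hence no anomaly.
    return []


report = run_interpretation(["D1", "D2"], select_action, check)
print(report)
```

The `max_cycles` bound keeps the loop finite even if each check keeps generating new anomalies.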

In [None]:
class GeologicalInterpretationProcess(object):
    """GeologicalInterpretationProcess implements the core process of a geological interpretation.
    
    It connects all the required elements and resulting artefacts relatively to a given interpretation sequence:
     - a GeologicalKnowledgeFramework"""
     
    def __init__(self, dataset: GeologicalDataset, knowledge_framework= None):
         """Creates a GeologicalInterpretationProcess
         
         ---------------------------
         Parameters:
         - dataset (GeologicalDataset): a dataset to be explained by this interpreter
         - knowledge_framework: a GeologicalKnowledgeFramework that defines the concepts used for this interpretation.
            If None is given, the default knowledge framework is used (`GeologicalKnowledgeManager().get_knowledge_framework()`)
         """
         # keep a reference to the dataset to be explained
         self.dataset= dataset
         self.knowledge_framework= GeologicalKnowledgeManager().get_knowledge_framework() if knowledge_framework is None else knowledge_framework
    