Use of UDFs when fetching data from Oracle for building KG #239

Closed
IshanDindorkar opened this issue May 3, 2024 · 4 comments
Labels: question (Further information is requested), rml-fnml

Comments

@IshanDindorkar

Hello Team,

Thank you for your work.
I recently started using the Morph-KGC library for a use case focused on building a knowledge graph (KG) from a relational database, in our case Oracle. We are exploring whether Python-based user-defined functions (UDFs) can be used to process incoming data from the database before generating the KG. We followed the steps in the official documentation and tried the same mapping shown there. The only difference is that the documentation example uses a CSV file as the data source, while we fetch data from an Oracle database. After spending some time, we found this test for UDF functionality and tried to replicate its mappings and config file. It works fine with CSV as input, but the UDF does not work as expected when the data source is an Oracle table.
We also noticed that there are many examples at this location in the repository where a database is used in the mapping file to fetch data and build a KG, but not a single test shows the use of a UDF. Is that just a coincidence, or is there some other reason for it? Could you please advise?

Thank you very much for your support. Appreciate it.

@IshanDindorkar added the question label May 3, 2024
@arenas-guerrero-julian self-assigned this May 3, 2024
@arenas-guerrero-julian
Member

Hi @IshanDindorkar,

The UDFs should work with any data source. Do you get any specific error?

Regarding your second question, there is no UDF example in that location because it only contains the R2RML test cases, which do not include UDFs.

@IshanDindorkar
Author

Hi @arenas-guerrero-julian,

Thank you very much for your prompt response.
I am not getting an error as such, but the function is not executed on the database table column. For example, I am fetching two columns from table A, col1 and col2, and using a UDF to convert the values of col2 to lower case. When I save the KG, the values of col2 are not converted to lower case. I am using a mapping file like the one shown below:

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix ex: <http://example.com/> .
@prefix grel: <http://users.ugent.be/~bjdmeest/function/grel.ttl#> .

@base <http://example.com/base/> .

<#TM1>
    rml:logicalSource [
        rml:query "SELECT col1, col2 FROM A WHERE ROWNUM < 10" ;
    ] ;

    rr:subjectMap [
        rr:template "http://example.com/{col1}" ;
        rr:class ex:col1 ;
    ] ;

     rr:predicateObjectMap [
        rr:predicate ex:col1;
        rr:objectMap [
            rr:column "col1" ;
        ] ;
    ] ;

    rr:predicateObjectMap [
        rr:predicate ex:col2 ;
        rr:objectMap [
            rr:column "col2" ;
            rr:functionExecution <#Execution> ;
        ] ;
    ] .

<#Execution>
    rml:function ex:toLowerCase ;
    rml:input [
        rml:parameter grel:valueParam ;
        rml:inputValueMap [
            rml:reference "col2" ;
        ]
    ] .

The UDF looks like this:

@udf(
    fun_id='http://example.com/toLowerCase',
    text='http://users.ugent.be/~bjdmeest/function/grel.ttl#valueParam')
def to_lower_case(text):
    return text.lower()
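
One way to rule out the function body itself is to call it outside Morph-KGC. The following is a minimal sketch, not the library's own mechanism: the `udf` decorator defined here is a hypothetical stand-in that only records the metadata (the thread uses `@udf` without showing an import), `REGISTRY` is an illustrative name, and the sample rows are made-up upper-case values.

```python
# Hypothetical stand-in for the `udf` decorator used in the thread.
# It only records the metadata so the function can be called directly.
REGISTRY = {}


def udf(fun_id, **params):
    def wrapper(func):
        REGISTRY[fun_id] = {"function": func, "parameters": params}
        return func
    return wrapper


@udf(
    fun_id='http://example.com/toLowerCase',
    text='http://users.ugent.be/~bjdmeest/function/grel.ttl#valueParam')
def to_lower_case(text):
    return text.lower()


# Made-up rows shaped like (col1, col2), with col2 in upper case.
sample_rows = [("1", "ALICE"), ("2", "BOB")]
lowered = [to_lower_case(col2) for _, col2 in sample_rows]
print(lowered)  # ['alice', 'bob']
```

If this behaves as expected, the function body is probably not the cause, and the mapping is the next place to look.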

Could you please advise what I am missing here and help me in fixing the issue.

Thank you very much for your support. Appreciate it.

@arenas-guerrero-julian
Member

I think that the problem is that you are mixing R2RML and RML. For instance, you seem to be using RML but you employ rr:column, which is R2RML. The correct property is rml:reference. Similarly with RML-FNML, you are using rr:functionExecution but the correct property is rml:functionExecution.

Also, use the latest prefixes in the mapping:

@prefix rml: <http://w3id.org/rml/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix fno: <https://w3id.org/function/ontology#> .
@prefix morph-kgc: <https://github.com/morph-kgc/morph-kgc/function/built-in.ttl#> .
@prefix grel: <http://users.ugent.be/~bjdmeest/function/grel.ttl#> .
@prefix idlab-fn: <http://example.com/idlab/function/> .
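
Since the mapping in the question produced no error, a quick text-level check can help spot the points raised in this reply. The sketch below is a rough illustration only: `check_mapping` and the `SUGGESTED` table are hypothetical names, the check works on plain strings rather than parsed RDF (so it ignores comments and prefix aliases), and it encodes just the two property replacements and the prefix change mentioned above.

```python
import re

# Legacy RML namespace declared in the mapping from the question.
LEGACY_RML_NS = "http://semweb.mmlab.be/ns/rml#"

# Property replacements named in the reply above.
SUGGESTED = {
    "rr:column": "rml:reference",
    "rr:functionExecution": "rml:functionExecution",
}


def check_mapping(turtle_text):
    """Return a list of warnings for a Turtle mapping given as a string."""
    issues = []
    if LEGACY_RML_NS in turtle_text:
        issues.append("legacy rml prefix: use <http://w3id.org/rml/>")
    for line_no, line in enumerate(turtle_text.splitlines(), start=1):
        for found, wanted in SUGGESTED.items():
            # Word boundaries keep rr:column from matching longer names.
            pattern = r"(?<![\w-])" + re.escape(found) + r"\b"
            if re.search(pattern, line):
                issues.append(f"line {line_no}: {found} -> {wanted}")
    return issues


# Excerpt of the mapping from the question.
sample = '''
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .

<#TM1>
    rr:predicateObjectMap [
        rr:predicate ex:col2 ;
        rr:objectMap [
            rr:column "col2" ;
            rr:functionExecution <#Execution> ;
        ] ;
    ] .
'''

found_issues = check_mapping(sample)
for issue in found_issues:
    print(issue)
```

On the excerpt above this reports the legacy prefix plus the two properties; a mapping that already uses `rml:reference` and `rml:functionExecution` with the new prefix yields an empty list.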

Some additional advice:

  • Instead of the Turtle-based syntax of RML, I recommend using the YARRRML syntax, which is supported out-of-the-box in Morph-KGC. It is easier and it also avoids the aforementioned issue with prefixes.
  • Given that you are using Oracle, for simple data transformation functions you may use RML views. For instance, for lowercase you can do SELECT col1, LOWER(col2) AS col2 FROM A WHERE ROWNUM < 10.
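
The SQL-side approach from the second point can be tried without an Oracle instance. This is a minimal sketch under stated assumptions: an in-memory SQLite database stands in for Oracle, the table contents are made up, and `LIMIT 9` replaces the Oracle-specific `ROWNUM < 10` filter, which SQLite does not support.

```python
import sqlite3

# In-memory SQLite stands in for Oracle (assumption, for a runnable demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [("1", "ALICE"), ("2", "BOB")],  # made-up upper-case sample data
)

# Same shape as the query in the reply: the lower-casing happens in SQL,
# and the alias keeps the column name the mapping refers to.
query = "SELECT col1, LOWER(col2) AS col2 FROM A ORDER BY col1 LIMIT 9"
rows = conn.execute(query).fetchall()
print(rows)  # [('1', 'alice'), ('2', 'bob')]
conn.close()
```

Because the result column is still named col2, a mapping that references col2 would not need a function execution for this particular transformation.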

@IshanDindorkar
Author

@arenas-guerrero-julian Thank you so much for your great response and pointing us in the right direction. Really appreciate it :)
We fixed the prefixes and replaced the properties as you suggested, and that resolved the issue. We will keep in mind to use YARRRML as you advised. Regarding the use of a SELECT query for lowercasing, it makes complete sense. We were experimenting with UDFs and decided to start with a very basic operation, transforming text to lowercase, since our data is all in uppercase. Our ultimate goal is to implement much more complex functionality in Python-based UDFs for pre-processing our enterprise data before converting it into a KG :)
Thank you once again and have a great start to the week!
