BUG: JSON path not finding nested object that contain empty lists #149

Open
Manoe-K opened this issue Feb 13, 2023 · 1 comment

Manoe-K commented Feb 13, 2023

What Happens?

When iterating over a list with JSONPath, objects that have an empty list as the value of one of their attributes are not matched.
(I am not sure I am describing the issue correctly, sorry, but I tried to make the example as clear as possible.)

In my example, I iterate over a list of persons, each of whom may have a list of friends. Those friends may have nicknames.
I try to create triples of the form "<person_A> foaf:knows <person_A's_friend>". The triple is not created if that friend has no nicknames. More precisely, it is not created if the friend has an empty list as the value of any of its attributes, even when that attribute is not used in the mapping.

I am running this example with the Python library of Morph-KGC.

To Reproduce

data:

{
    "persons": [
        {
            "name": "Alice",
            "friends": [
                {
                    "name": "Bob",
                    "nicknames": ["Bobby"]
                },
                {
                    "name": "Curtis",
                    "nicknames": []
                }
            ]
        }
    ]
}

mapping:

@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .


<#example> a rr:TriplesMap;
	
	rml:logicalSource [
		rml:source "data.json";
		rml:referenceFormulation ql:JSONPath;
		rml:iterator "$.persons[*]";
	];

	rr:subjectMap [
 		rr:template "http://example.com/{name}";
 	];
    
	rr:predicateObjectMap [
		rr:predicate foaf:knows;
  		rr:objectMap [ rml:reference "friends.name"; ];
  	];
.

Python code:

import morph_kgc

config = """
            [DataSource1]
            mappings=mapping.ttl
         """

g = morph_kgc.materialize_oxigraph(config)
g.dump(output='data.ttl', mime_type='text/turtle')

Actual results:

<http://example.com/Alice> <http://xmlns.com/foaf/0.1/knows> "Bob" .

Expected results:

<http://example.com/Alice> <http://xmlns.com/foaf/0.1/knows> "Bob" .
<http://example.com/Alice> <http://xmlns.com/foaf/0.1/knows> "Curtis" .
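
For reference, the expected output can be reproduced without Morph-KGC by walking the JSON directly. This is only a minimal sketch of the intended semantics, not how Morph-KGC evaluates the mapping. The helper name `expected_knows_triples` is made up for illustration, and `friends.name` is read here as "the name of every friend", which is an assumption about the reference.

```python
import json

# Same data as in the report, inlined so the sketch is self-contained.
DATA = json.loads("""
{
    "persons": [
        {
            "name": "Alice",
            "friends": [
                {"name": "Bob", "nicknames": ["Bobby"]},
                {"name": "Curtis", "nicknames": []}
            ]
        }
    ]
}
""")

FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"


def expected_knows_triples(data):
    """Hypothetical helper: one N-Triples line per (person, friend) pair."""
    triples = []
    for person in data["persons"]:  # mirrors rml:iterator "$.persons[*]"
        subject = f"<http://example.com/{person['name']}>"
        for friend in person.get("friends", []):
            # "nicknames" is never read, so an empty list must not matter.
            triples.append(f'{subject} <{FOAF_KNOWS}> "{friend["name"]}" .')
    return triples


for line in expected_knows_triples(DATA):
    print(line)
```

The inner loop never touches `nicknames`, so Curtis yields a triple just like Bob does.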

Environment:

  • OS: Ventura 13.2
  • Python version: 3.10.8
  • Morph-KGC version: 2.1.1

This issue happens with version 2.1.1: my example returns 1 triple instead of the expected 2.
I first ran into a different problem with version 2.3.1, where the same example returns 0 triples.
But that second problem might just be issue 132.

@Manoe-K Manoe-K added the bug Something isn't working label Feb 13, 2023
arenas-guerrero-julian (Member) commented

Hi @Manoe-K ,

Thanks for your detailed issue.

The following problem occurs when converting JSON to a DataFrame:

JSON:

[{'ID': 10, 'Sport': 100, 'Name': 'Venus Williams'}, {'ID': 20, 'Name': 'Demi Moore'}]

DataFrame:

   ID  Sport            Name
0  10  100.0  Venus Williams
1  20    NaN      Demi Moore

Because of the missing value, pandas inserts a NaN, and the integers in that column are read as doubles. This is related to your issue because Morph-KGC has a workaround for this problem, and that workaround currently prevents fixing your issue.
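
The dtype change described above can be reproduced with pandas alone. The snippet below is a small illustration, not Morph-KGC code; the cast to pandas' nullable `Int64` dtype at the end is just one possible way to keep the integers and is an assumption, not the fix used by the library.

```python
import pandas as pd

records = [
    {"ID": 10, "Sport": 100, "Name": "Venus Williams"},
    {"ID": 20, "Name": "Demi Moore"},  # no "Sport" key
]

df = pd.DataFrame(records)
# "Sport" becomes float64 because NaN has no integer representation.
print(df.dtypes)

# Assumption for illustration: the nullable integer dtype keeps 100 as an
# integer and marks the missing entry as <NA> instead of NaN.
nullable = df.astype({"Sport": "Int64"})
print(nullable)
```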

I am checking how to make everything work.
