Issues in test cases #83

chrdebru · 2024-02-13T20:32:50Z

chrdebru · 2024-02-14T22:48:47Z

Here are some suggestions for simple datatype tests:

does rml:datatype work
does rml:datatypeMap work
does rml:datatypeMap work for non-XSD datatypes (these should be accepted, as data validation is a separate process)

The last one assumes http://example.com/base is given as input for the base IRI.

In other words, the datatype map "behaves" like an IRI-generating term map.
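
As a rough illustration of that behaviour, here is a minimal Python sketch. It is not taken from any RML engine: the function names `is_absolute` and `datatype_iri` are hypothetical, the absoluteness check (presence of a scheme) is a simplification, and the base IRI is the http://example.com/base input assumed above. A generated value that is already absolute is used as-is, whether or not it is an XSD datatype; otherwise the base IRI is prepended.

```python
from urllib.parse import urlsplit

# Assumption: the base IRI passed to the engine as input.
BASE_IRI = "http://example.com/base"


def is_absolute(iri: str) -> bool:
    """Simplified check: treat a string as an absolute IRI if it has a scheme."""
    return bool(urlsplit(iri).scheme)


def datatype_iri(value: str, base_iri: str = BASE_IRI) -> str:
    """Turn the value generated by a datatype map into a datatype IRI.

    Mirrors how an IRI-generating term map is handled: keep absolute values,
    otherwise prepend the base IRI and require the result to be absolute.
    No check is made that the datatype is an XSD one.
    """
    if is_absolute(value):
        return value
    candidate = base_iri + value
    if not is_absolute(candidate):
        raise ValueError(f"cannot build an absolute datatype IRI from {value!r}")
    return candidate
```

For example, `datatype_iri("/Celsius")` yields `http://example.com/base/Celsius`, while an absolute non-XSD datatype such as `http://example.com/ns#Celsius` is returned unchanged.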

RMLTC0021a-CSV.zip

DylanVanAssche · 2024-02-15T13:18:43Z

The ones covered by shapes are listed in tests.py:

RMLTC0004b: Literal as Term Type in Subject Map
RMLTC0007h: Named graph which is not an IRI
RMLTC0012c: missing Subject Map
RMLTC0012d: 2 Subject Maps
RMLTC0015b: invalid language tag --> do you have a proposal for the shapes here to improve it? I am not so sure what you suggest above.

These should already raise an error by the engine if they use the SHACL shapes to validate the mapping.
Currently, we don't have a nice page like https://rml.io/test-cases/ that lists which test cases throw an error.
We should have this in the future, and also add metadata to each test case describing what kind of error is thrown.
Do you have a suggestion for that?

Regarding 19a, that one might be a bit in flux.

Regarding 19b, it contains a data error on purpose. The test cases currently assume 'best effort' in that case.
Maybe we need to be super strict here and throw an error, to be on the same level as other test cases, such as the one requiring that a named graph must be an IRI (0007h).

Regarding 20b, why is that an error? The RFC for URIs (https://www.rfc-editor.org/rfc/rfc3986#section-4.1) says that it can be resolved if needed. It is a valid IRI.
The question is: should we require engines to remove relative path segments from IRIs, as they do for encoding?

Datatype maps are missing, yes. Feel free to put them in a PR, thanks a lot! If you don't have time, I can do it as well, let me know.

chrdebru · 2024-02-15T14:35:59Z

Aha! Okay, I understood that the test cases (for engines) assume valid mappings. In other words, you now assume that each engine uses the shapes (or something else) to cover all cases. Which is OK, but I misinterpreted that.

19a --> Assuming the base IRI of the mapping is the same as the output is a dangerous assumption. You could leave it for the tests for backward compatibility with previous engines, but I would propose to document it as (either http://www.example.com/base is used as input or the base of the mapping file is assumed to be the base). R2RML explicitly states that the base IRI for the output is passed as a parameter. I prefer David's solution of rml:baseIRI per triples map.

19b --> "best effort" contradicts "generating no file", so you want to be strict. You retain partial results of the same triples map.

20b --> Based on R2RML --> the string is tested for being an absolute IRI and does not mention anything about trying to compute the absolute IRI where possible. RML can propose this, but I have yet to find this mentioned in the spec. RDF does allow one to store information about <http://example.com/base/path/../Danny>, but I'm pretty certain that <http://example.com/base/path/../Danny> and <http://example.com/base/Danny> are two different resources. That will open a whole can of worms.

DylanVanAssche · 2024-02-15T15:42:04Z

> Aha! Okay, I understood that the test cases (for engines) assume valid mappings. In other words, you now assume that each engine uses the shapes (or something else) to cover all cases. Which is OK, but I misinterpreted that.

Well, not required, but engines should not crash on invalid mappings. They can make their life easy by using the shapes, or do it manually.

> Assuming the base IRI of the mapping is the same as the output is a dangerous assumption. You could leave it for the tests for backward compatibility with previous engines, but I would propose to document it as (either http://www.example.com/base is used as input or the base of the mapping file is assumed to be the base). R2RML explicitly states that the base IRI for the output is passed as a parameter. I prefer David's solution of rml:baseIRI per triples map.

I agree here. I just need a way forward. Dropping the testcase seems the best way, we don't want engines to support this behavior. Do you agree?

> 19b --> "best effort" contradicts "generating no file", so you want to be strict. You retain partial results of the same triples map.

Agreed! Especially this kind of stuff needs to go, the test cases must always follow the same paradigm. Let's change it to an error.

> 20b --> Based on R2RML --> the string is tested for being an absolute IRI and does not mention anything about trying to compute the absolute IRI where possible. RML can propose this, but I have yet to find this mentioned in the spec. RDF does allow one to store information about http://example.com/base/path/../Danny, but I'm pretty certain that http://example.com/base/path/../Danny and http://example.com/base/Danny are two different resources. That will open a whole can of worms.

Exactly! This is a whole can of worms, so we need to pick a side here. These URIs are valid because, in the end, they are absolute. So we do not resolve them? But if we do not resolve them, we can allow this case, right?

chrdebru · 2024-02-15T15:51:08Z

Agree to drop the case.
Agree to remove the file.

Well, turning http://example.com/base/path/../Danny into http://example.com/base/Danny is called IRI normalization. RDF 1.1 states that non-normalized IRIs must be avoided, but it does not say that they must be normalized before ingestion. This makes sense, as we can say different things about the two IRIs. The RFC about IRIs states that one way to test IRI equality is by string-comparison (character-by-character), but other approaches may include normalization. That's what I appreciate about R2RML. If an IRI is absolute, then use that one. If not, test whether the base IRI + IRI is absolute. In other words, R2RML enforces the use of absolute IRIs.

rml:normalizeIRIs true (false by default) could be a solution (an expensive one, that is).
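
To make the difference concrete, here is a small Python sketch under stated assumptions. The helper names `normalize_dot_segments` and `same_iri` are hypothetical, the `normalize` flag only mimics the proposed (non-existent) rml:normalizeIRIs option, and `posixpath.normpath` is used as an approximation of the dot-segment removal described in RFC 3986, not a full IRI normalizer.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def normalize_dot_segments(iri: str) -> str:
    """Collapse '.' and '..' segments in the path of an IRI.

    Approximation only: posixpath.normpath also drops trailing slashes and
    collapses repeated slashes, which full IRI normalization would not do.
    """
    parts = urlsplit(iri)
    if not parts.path:
        return iri
    return urlunsplit(parts._replace(path=posixpath.normpath(parts.path)))


def same_iri(a: str, b: str, normalize: bool = False) -> bool:
    """Compare two IRIs character by character, optionally after normalization."""
    if normalize:
        a, b = normalize_dot_segments(a), normalize_dot_segments(b)
    return a == b


raw = "http://example.com/base/path/../Danny"
short = "http://example.com/base/Danny"
```

With plain string comparison, `same_iri(raw, short)` treats the two IRIs as different resources; only `same_iri(raw, short, normalize=True)` considers them equal, at the cost of normalizing every generated IRI.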

DylanVanAssche · 2024-03-04T10:36:25Z

I think this is all resolved? @chrdebru

chrdebru · 2024-03-04T12:36:57Z

Yup!

dachafra assigned DylanVanAssche Feb 13, 2024

dachafra added the bug label Feb 13, 2024

DylanVanAssche closed this as completed Mar 4, 2024
