Turtle Parse Error: illegal subject type: literal #306

GeraldGrootRoessink · 2018-12-19T19:43:34Z

I have this set of triples:

cdm:example
        a sh:PropertyShape ;
	sh:name "heeft reden uitschrijving"@nl ;
	sh:nodeKind sh:IRIOrLiteral ;
	sh:path cdm:redenUitschrijvingHR-v01 ;
	sh:class cdm:RedenUitschrijvingHR-v01.1 ;
	sh:datatype xsd:token ;
	sh:minLength 1 ;
	sh:maxLength 70 ;
	sh:maxCount 1 ;
	sh:pattern "[A-Z0-9_]*" ;
.

EasyRdf unexpectedly complains about the sh:class line.
Or am I missing something?


njh · 2020-06-04T21:18:22Z

I have just tested using this more complete Turtle document:

@prefix cdm: <http://publications.europa.eu/ontology/cdm#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .

cdm:example
        a sh:PropertyShape ;
	sh:name "heeft reden uitschrijving"@nl ;
	sh:nodeKind sh:IRIOrLiteral ;
	sh:path cdm:redenUitschrijvingHR-v01 ;
	sh:class cdm:RedenUitschrijvingHR-v01.1 ;
	sh:datatype xsd:token ;
	sh:minLength 1 ;
	sh:maxLength 70 ;
	sh:maxCount 1 ;
	sh:pattern "[A-Z0-9_]*" ;
.

It looks like the parser doesn't like the . in cdm:RedenUitschrijvingHR-v01.1.

I will need to check the Turtle grammar and see why it isn't working:
https://www.w3.org/TR/turtle/#sec-grammar-grammar

Thanks for reporting.

billyk18278 · 2022-06-15T19:54:24Z

Hi, any news on this?
Prefixed names (prefix:local) whose local part contains a . cause a parsing error.
In the example above, using the full IRI instead:
sh:class <http://publications.europa.eu/ontology/cdm#RedenUitschrijvingHR-v01.1> ;
is parsed without issue.

zozlak · 2022-06-17T18:16:16Z

It comes down to https://www.w3.org/TR/turtle/#grammar-production-PN_LOCAL, where we can see that a dot can be neither the first nor the last character of the local part of a prefixed name (the part after the colon), but is allowed in the middle. Turtle::isNameChar() does not honor this: it always treats a dot as the end of the prefixed name.
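To make the dot rule concrete, here is a minimal sketch of a check for it. The function name isPlausibleLocalName() is hypothetical and not part of EasyRdf, and the pattern is a deliberately simplified, ASCII-only approximation of PN_LOCAL: it ignores the Unicode ranges, percent-escapes and backslash escapes that the grammar also allows.

```php
<?php
// Hypothetical helper, not EasyRdf code: approximates only the dot rule of
// PN_LOCAL for ASCII names (no Unicode ranges, no %-escapes, no \-escapes).
function isPlausibleLocalName(string $local): bool
{
    // First char: letter, digit, '_' or ':' (never '.').
    // Middle chars: the same set plus '-' and '.'.
    // Last char: anything allowed in the middle except '.'.
    $pattern = '/^[A-Za-z0-9_:](?:[A-Za-z0-9_:.-]*[A-Za-z0-9_:-])?$/';
    return preg_match($pattern, $local) === 1;
}

var_dump(isPlausibleLocalName('RedenUitschrijvingHR-v01.1')); // bool(true)
var_dump(isPlausibleLocalName('RedenUitschrijvingHR-v01.'));  // bool(false)
var_dump(isPlausibleLocalName('.RedenUitschrijvingHR'));      // bool(false)
```

The local name from the original report passes this check, which is consistent with the document being valid Turtle and the parser being at fault.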

billyk18278 · 2022-06-17T21:08:58Z

I have posted a specific example in #396.
The syntax is valid: Protégé, TopBraid and Jena all work with prefixed classes/properties that have a dot in the middle of the name.

It is not clear how this can be solved in https://github.com/easyrdf/easyrdf/blob/main/lib/Parser/Turtle.php#L1305, which parses character by character. Is it enough to add . (0x2E) to the list of characters accepted in names?
Do you have any proposal on how to tackle this?

zozlak · 2022-06-18T06:47:23Z

I would say: add the dot to the list of accepted characters, and when the end of the pname is reached, check whether its last character is a dot. If so, remove it from the pname and return it to the input character queue.
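A minimal sketch of that idea, working on a plain string rather than on the parser's real input queue. The function readLocalName() and its signature are hypothetical, the accepted character set is ASCII only, and the first-character rule is not checked; it only illustrates "accept the dot, then give back a trailing one".

```php
<?php
// Hypothetical sketch, not EasyRdf code: read a local name starting at $pos,
// accepting '.' inside the name and "unreading" any trailing dots.
// Returns [name, position of the first character after the name].
function readLocalName(string $input, int $pos): array
{
    $len = strlen($input);
    $start = $pos;

    // Greedily consume name characters, including '.'.
    while ($pos < $len && preg_match('/[A-Za-z0-9_.-]/', $input[$pos]) === 1) {
        $pos++;
    }

    // A local name may not end in '.', so step back over trailing dots;
    // they stay in the input for the caller (e.g. as a statement terminator).
    while ($pos > $start && $input[$pos - 1] === '.') {
        $pos--;
    }

    return [substr($input, $start, $pos - $start), $pos];
}

[$name, $next] = readLocalName('RedenUitschrijvingHR-v01.1 ;', 0);
// $name is 'RedenUitschrijvingHR-v01.1'; $input[$next] would be the space.

[$name, $next] = readLocalName('example.', 0);
// $name is 'example'; the '.' at offset $next is left unread.
```

In the real parser the "step back" would presumably be done with whatever unread mechanism the character reader offers, rather than by index arithmetic.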

billyk18278 · 2022-06-18T11:02:41Z

I am not that familiar with the parsing flow and the functions used,
but I was thinking of doing this inside isNameChar():
$next = $this->peek(); // not sure whether this gives the next char
$onext = ord($next);
and then adding a condition that returns true if the current char is a dot and the next one is text. I don't care about prefix:prop.5 cases at all.

|| ($c == '.' && ($onext >= 0x0300 && $onext <= 0x036F ||
                  $onext >= 0x203F && $onext <= 0x2040))

Any help appreciated. Thanks.

billyk18278 · 2022-06-20T06:57:11Z

I got it working by also passing the next character to isNameChar() (which is static), using $this->peek() in each call.
When $c == '.' I check that the next character is a Latin letter; if so, I return true.

Here are my changes. They seem to work, but I am not sure this is a proper fix, since I did not go through the spec.

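    public static function isNameChar($c, $cn)
    {
        $o = ord($c);
        $on = ord($cn);   // byte value of the character that follows $c
        return
            self::isNameStartChar($c) ||
            $o >= 0x30 && $o <= 0x39 ||     # 0-9
            $c == '-' ||
            $o == 0x00B7 ||
            $o >= 0x0300 && $o <= 0x036F ||
            $o >= 0x203F && $o <= 0x2040 ||
            // Workaround: accept '.' only when an ASCII letter follows, so a
            // statement-ending '.' is not swallowed. A dot followed by a
            // digit (as in v01.1) is still not accepted.
            ($c == '.' && ($on >= 0x41 && $on <= 0x5A ||    # A-Z
                           $on >= 0x61 && $on <= 0x7A))     # a-z
            ;
    }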

billyk18278 · 2022-06-20T07:58:09Z

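For RdfNamespace::expand to work, I also had to add . to the character class in the regular expression on line 432 (with the hyphen placed last, so that it is read literally rather than as a range):

} elseif (preg_match('/^(\w+?):([\w.-]+)$/', $shortUri, $matches)) {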

shorten works as it is...
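
A standalone check of a pattern of this shape, for illustration only. It does not call EasyRdf; it just applies the regular expression, with . added and the hyphen written last in the character class, to a prefixed name with a dot in its local part.

```php
<?php
// Standalone check of the pattern only; this does not call EasyRdf.
$pattern = '/^(\w+?):([\w.-]+)$/';

preg_match($pattern, 'cdm:RedenUitschrijvingHR-v01.1', $matches);
// $matches[1] is 'cdm', $matches[2] is 'RedenUitschrijvingHR-v01.1'

// Caveat: the character class also accepts a trailing dot,
// which PN_LOCAL does not allow.
var_dump(preg_match($pattern, 'cdm:example.')); // int(1)
```

So the pattern is more permissive than the Turtle grammar at the end of the name, which may or may not matter for expand().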

k00ni · 2022-07-04T11:14:32Z

If you think that is helpful to others, you should open a pull request.

morindamanik · 2023-03-01T13:29:21Z

@billyk18278 why does your code look like that? Is that on purpose, and should we add it to our work, or is it a bug?

billyk18278 · 2023-03-01T13:38:20Z

> @billyk18278 why your code looking like that? Is that on purpose and we should add that to our work or is that a bug??

I do not remember what I did; it was a workaround (maybe even a wrong one).
The issue is what @njh wrote. Since, as @zozlak pointed out, the spec allows a dot in the middle of the name, not being able to parse such triples is a bug or a missing feature.

zozlak · 2023-03-01T21:14:35Z

@billyk18278 as far as I can tell, EasyRdf has been abandoned by @njh. He is still the only person with the rights to merge code into the main branch and issue new releases. In short, there is no chance of a fix being merged, even if it is as simple as the one you proposed.

This leaves you with two options:

- check out https://github.com/sweetrdf/easyrdf and post this bug report there if it is also affected
- give another library a try, e.g. the https://github.com/sweetrdf/rdfInterface ecosystem

njh added the bug label Jun 4, 2020
