Error with getLocalName() for https://schema.org/3DModel #2376

Closed
davidcockbill opened this issue Mar 28, 2024 · 3 comments

Comments

@davidcockbill

Version

5.0.0-rc1

What happened?

When I parse N-Triples whose object is https://schema.org/3DModel and call getLocalName() on that object, I get DModel rather than 3DModel.

Test code to reproduce:

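    // Imports added so the test compiles; JUnit 5 is assumed from the test style.
    import java.io.ByteArrayInputStream;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.rdf.model.RDFNode;
    import org.apache.jena.rdf.model.Statement;
    import org.junit.jupiter.api.Test;

    // Wrapper class name is illustrative only.
    class LocalNameTest
    {
        @Test
        void testJena()
        {
            final String nTriples = """
                _:3d_model <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/3DModel> .
            """;

            final Model model = ModelFactory.createDefaultModel();
            final InputStream input = new ByteArrayInputStream(nTriples.getBytes(StandardCharsets.UTF_8));
            model.read(input, null, "nTriples");

            // Print the local name of each statement's object.
            for (Statement stmt : model.listStatements().toList())
            {
                final RDFNode object = stmt.getObject();
                System.out.println("object name=" + object.asResource().getLocalName());
            }
        }
    }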

From stepping through the code, the problem appears to be in the use of Util.splitNamespaceXML(), in particular XMLChar.isNCNameStart(ch), which reports that '3' is not a valid first character for an XML name. A quick search online suggests this is correct for XML tags, but is 3DModel really an XML tag (i.e. should XML rules be applied here at all)?

Either way, the getLocalName() method returns the wrong value.
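
To make the reported behaviour concrete, here is a minimal sketch of an XML-style split. It is not Jena's code: the class name and the two character tests are simplified assumptions that only approximate the XML NCName rules (letters or underscore may start a name; digits, hyphens and dots may only follow). The two-pass scan is modelled on what the description above suggests, and it shows how a leading digit ends up on the namespace side.

```java
// Simplified sketch of an XML-style namespace/local-name split.
// Not Jena's implementation: the character classes approximate XML NCName rules.
class XmlStyleSplitSketch {

    // Approximation: a name may start with a letter or underscore, never a digit.
    static boolean isNameStart(char ch) {
        return Character.isLetter(ch) || ch == '_';
    }

    // Approximation: later characters may also be digits, hyphens, or dots.
    static boolean isNameChar(char ch) {
        return isNameStart(ch) || Character.isDigit(ch) || ch == '-' || ch == '.';
    }

    // Index at which the local name begins (uri.length() if there is none).
    static int splitPoint(String uri) {
        int i = uri.length();
        // Walk backwards over the trailing run of name characters.
        while (i > 0 && isNameChar(uri.charAt(i - 1))) {
            i--;
        }
        // Walk forwards past characters that cannot start a name, such as '3'.
        while (i < uri.length() && !isNameStart(uri.charAt(i))) {
            i++;
        }
        return i;
    }

    static String localName(String uri) {
        return uri.substring(splitPoint(uri));
    }

    public static void main(String[] args) {
        System.out.println(localName("http://schema.org/3DModel")); // prints DModel
        System.out.println(localName("http://schema.org/Person"));  // prints Person
    }
}
```

Under these rules the '3' is skipped by the second loop, so the namespace becomes http://schema.org/3 and the local name DModel.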

Relevant output and stacktrace

No response

Are you interested in making a pull request?

None

@rvesse
Member

rvesse commented Mar 28, 2024

Historically, Resource.getLocalName() uses RDF/XML-specific local name rules, which early versions of other syntaxes also followed. As RDF/XML has become less prevalent, other formats have relaxed those rules to allow a wider range of characters in the local name portion of their prefixed name syntaxes. Given Jena's long history, though, early parts of the API such as Model and Resource have not really changed in that regard.

There are other ways to present URIs, e.g. PrefixMapping.shortForm() or PrefixMap.abbreviate(), that should be used if you want more generic URI shortening. These use more modern rules about what is allowed in the local name portion of URIs.

Or you could look at the NodeFormatter API if you plan to generate a lot of shortened URIs in your output.
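
The idea behind prefix-based shortening can be sketched without any Jena classes. The class below is a hypothetical stand-in, not PrefixMapping or PrefixMap: the names PrefixShorteningSketch, setPrefix and shorten are made up for illustration, and the longest-match policy is an assumption. The point it shows is that the local part is whatever follows a registered namespace, so no XML name rule is applied to it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a prefix map; not part of the Jena API.
class PrefixShorteningSketch {
    private final Map<String, String> prefixToNamespace = new LinkedHashMap<>();

    // Register a prefix for a namespace IRI.
    void setPrefix(String prefix, String namespace) {
        prefixToNamespace.put(prefix, namespace);
    }

    // Return prefix:local using the longest registered namespace that the IRI
    // starts with, or the IRI unchanged when no namespace matches.
    String shorten(String iri) {
        String bestPrefix = null;
        String bestNamespace = "";
        for (Map.Entry<String, String> entry : prefixToNamespace.entrySet()) {
            String namespace = entry.getValue();
            if (iri.startsWith(namespace) && namespace.length() > bestNamespace.length()) {
                bestPrefix = entry.getKey();
                bestNamespace = namespace;
            }
        }
        if (bestPrefix == null) {
            return iri;
        }
        return bestPrefix + ":" + iri.substring(bestNamespace.length());
    }

    public static void main(String[] args) {
        PrefixShorteningSketch prefixes = new PrefixShorteningSketch();
        prefixes.setPrefix("schema", "http://schema.org/");
        System.out.println(prefixes.shorten("http://schema.org/3DModel")); // prints schema:3DModel
    }
}
```

With a real model, the equivalent step would go through the prefix mappings already attached to it; check the Jena javadoc for the exact behaviour of shortForm() and abbreviate().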

@afs
Member

afs commented Mar 28, 2024

See SplitIRI for split algorithms -- there are splits for RDF/XML, Turtle and a basic /,# split.
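
For comparison, a basic /,# split can be pictured as follows. This is a standalone sketch rather than SplitIRI itself: the class name is hypothetical, and the rule used (split after the last '#', otherwise after the last '/') is an assumption about what such a basic split does. Because it never inspects the first character of the local part, a leading digit survives.

```java
// Standalone sketch of a basic '/' or '#' split; not Jena's SplitIRI.
class BasicSplitSketch {

    // Index of the first character of the local name: just after the last '#',
    // or just after the last '/' when there is no '#'. Returns 0 when the IRI
    // contains neither separator, so the whole string counts as the local name.
    static int splitPoint(String iri) {
        int hash = iri.lastIndexOf('#');
        int separator = hash >= 0 ? hash : iri.lastIndexOf('/');
        return separator + 1;
    }

    static String namespace(String iri) {
        return iri.substring(0, splitPoint(iri));
    }

    static String localName(String iri) {
        return iri.substring(splitPoint(iri));
    }

    public static void main(String[] args) {
        System.out.println(localName("http://schema.org/3DModel")); // prints 3DModel
        System.out.println(localName("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")); // prints type
    }
}
```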

The javadoc for Resource.getLocalName refers to SplitIRI.

Changing Resource.getLocalName would impact existing applications, and Jena has tended to be conservative in that regard. Keeping existing behaviour stable encourages people to upgrade their use of Jena for general good security health.

New Resource.splitLocalName/splitNamespace operations that call SplitIRI could be added.

Contributions welcome!

afs removed the bug label on Mar 28, 2024
@davidcockbill
Author

Thanks for the quick replies.

That'll teach me for looking at decompiled code in the debugger rather than the actual sources!

As Andy says, the actual source does mention this:

  /** Returns the localname of this resource within its namespace if it is a URI else null.
   * <p>
   * Note: XML requires QNames to start with a letter, not a digit,
   * and this method reflects that restriction.
   * <p>
   * See functions in {@code SplitIRI} for other split algorithms.
   * @return The localname of this property within its namespace.
   */

I'll investigate using either PrefixMapping or SplitIRI.

When I have a free moment I'll consider creating a PR for those new methods, depending on my success!
