Skip to content

fix a few things in Lang and RDFLanguages#1146

Closed
bvosburgh-tq wants to merge 1 commit intoapache:mainfrom
TopQuadrant:lang-stuff
Closed

fix a few things in Lang and RDFLanguages#1146
bvosburgh-tq wants to merge 1 commit intoapache:mainfrom
TopQuadrant:lang-stuff

Conversation

@bvosburgh-tq
Copy link
Copy Markdown
Contributor

  • Lang.equals(Object) did not compare altLabels
  • RDFLanguages.register(Lang) unregistered the wrong lang
  • RDFLanguages.checkRegistration(Lang) unnecessarily checked mapContentTypeToLang and used the wrong lookup keys to check the altNames, altContentTypes, and fileExtensions
  • RDFLanguages.unregister(Lang) did not clear out the altNames

- Lang.equals(Object) did not compare altLabels
- RDFLanguages.register(Lang) unregistered the wrong lang
- RDFLanguages.checkRegistration(Lang) unnecessarily checked mapContentTypeToLang and used the wrong lookup keys to check the altNames, altContentTypes, and fileExtensions
- RDFLanguages.unregister(Lang) did not clear out the altNames

/** Make sure the registration does not overlap or interfere with an existing registration. */
private static void checkRegistration(Lang lang) {
if ( lang == null )
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave this in place. Just because in the current usage, lang is never null does not mean that it was or will be. checkRegistration is a robust building block done this way. A piece of defensive code here is zero cost -- it will probably happen before the instruction reaches the head of the execution pipeline and it's on a critical path in Jena anyway.

error("Language overlap: " + lang + " and " + mapContentTypeToLang.get(contentType) + " on content type " + contentType);
return;
}
// any pre-existing mapContentTypeToLang entry has already been removed - no need to check it here
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an internal consistency check. Let's leave it.

Lang otherLang = (Lang)other ;
return
this.label == otherLang.label &&
this.altLabels.equals(otherLang.altLabels) &&
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure it should even be testing alts. A lang corresponds to a MIME type registration which drives reader dispatch.

Has this been an issue for you?

}
}

private static boolean isMimeTypeRegistered(Lang lang) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please leave this function and keep its usage. If it needs fixes,fix it.

Having named operations for specific steps makes the code clearer. The optimizer will deal with efficiency.

@afs
Copy link
Copy Markdown
Member

afs commented Dec 29, 2021

The project uses JIRA: https://issues.apache.org/jira/projects/JENA/issues/.

As it is our record of changes, and people do search it, please open a JIRA and put the JIRA id in the PR title: JENA-NNNN: text -- text should be suitable for revisiting in a few years time. If it is the PR title, there will automatically be a link to the PR added to the ticket.

Commit messages should correspond to the changes. "fix a few things" will be unclear in a years time!
Having the "JENA-NNNN: ..." for commit messages is also very helpful, both for reviewing the git log and because it causes commit to be added as a comment to the JIRA ticket.

@afs
Copy link
Copy Markdown
Member

afs commented Dec 29, 2021

Could you provide some background and explanation please. The description is too brief.

There can be only one Lang for one content type because that's how reader lookup works. Different flavours of writer are handled by RDFFormat and RDFFormatVariant.

There are no new test cases or test case changes.

@afs
Copy link
Copy Markdown
Member

afs commented Jan 10, 2022

I've cherry-picked some fixes which are now on main.

Closing this PR - no response to comments, nor about the background and whether there is some functionality change being requested here.

One Lang is one media type. For output, there isRDFFormatwhich takes a Lang and a variation name.

@afs afs closed this Jan 10, 2022
@bvosburgh-tq
Copy link
Copy Markdown
Contributor Author

I am sorry I blew you off, Andy. : (

I submitted this PR (and 1147) in the middle of a long holiday and it has taken me a while to get back into things.

Thank you for cherry-picking the good stuff from this PR, though!

The thing that instigated this PR was when I worked on a custom writer and was trying to figure out how to register it. There is no clear documentation of how a new writer should be integrated with "RIOT"; so I am left with the code. : ) So, when I looked at the code in RDFLanguages to understand what it is for and how I should use it, the code had a few obvious bugs. Thus the PR.

But, although my writer seems to integrate successfully with RIOT, it's still not obvious to me how all the parts (e.g. labels, content types, alt names, alt content types, file extensions) fit together and how they are used by Jena.... : (

Also, I will start with a Jira issue next time. Should be coming soon. : )

@afs
Copy link
Copy Markdown
Member

afs commented Jan 12, 2022

@afs
Copy link
Copy Markdown
Member

afs commented Jan 12, 2022

I would also point to SHACL-C in the TQ SHACL engine but you (TQ) have dropped support for that.

Jena registering new reader and writer for SHACL-C : https://github.com/apache/jena/blob/main/jena-shacl/src/main/java/org/apache/jena/shacl/compact/SHACLC.java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants