Support underscores in the prefixes #24

Open
berezovskyi opened this issue Mar 29, 2018 · 9 comments · May be fixed by #33

Comments

@berezovskyi

Underscore is a valid prefix char.

I traced the necessary changes to:

return "[a-z][a-z0-9]{1,9}";

and
$message = 'prefix.cc supports a-z and 0-9 only; max 10 characters.';

Would you be ready to merge a PR for that?

@cygri
Owner

cygri commented Apr 17, 2018

What is the use case for underscores?

The reason it's not allowed is that I don't want separate or different mappings for dcterms, DCterms, dcTerms, dc_terms, DC-terms, dc.terms, or whatever other variations you could think of. I'd rather allow only one variation and have people “fight” over it than allow all of them and have people accidentally end up with the wrong URI because they didn't realise that dcterms and dc_terms were mapped to different URIs.

That's why prefix.cc doesn't allow uppercase characters and punctuation.
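
The effect of that restriction can be checked against the variants listed above. The following is only a sketch: it uses the pattern quoted in the first comment and assumes it is applied to the whole prefix string. The helper name `is_allowed` is made up for illustration, and Python's `re` stands in for the PHP code on the site.

```python
import re

# Pattern quoted in the first comment of this thread. Matching it against
# the whole string is an assumption about how prefix.cc applies it.
CURRENT_PREFIX_PATTERN = re.compile(r"[a-z][a-z0-9]{1,9}")


def is_allowed(prefix: str) -> bool:
    """Return True if the entire prefix matches the current pattern."""
    return CURRENT_PREFIX_PATTERN.fullmatch(prefix) is not None


# Spelling variants named in the comment above.
variants = ["dcterms", "DCterms", "dcTerms", "dc_terms", "DC-terms", "dc.terms"]
accepted = [v for v in variants if is_allowed(v)]
print(accepted)  # ['dcterms']
```

Under that assumption, only the all-lowercase, punctuation-free spelling gets through, which is the "one variation" behaviour described here.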

The site is about the popular/canonical mappings, and not so much about the “long tail” of prefix mappings that are used only by a small group of people.

Better supporting the “long tail” would be a completely reasonable goal, and some kind of punctuation to allow grouping of prefixes would probably be part of that. But the site lacks various other features that would be required to do a decent job on that goal.

@berezovskyi
Author

Thanks @cygri for getting back to me.

The use case is that the OSLC standard (you can think of it as an LDP for the enterprise), developed under OASIS, uses the following prefixes in the spec (as RFC SHOULDs, and most of them have been in use in applications since 2009):

  • oslc_acc
  • oslc_rm
  • oslc_cm
  • oslc_trs

I totally agree with you that the proliferation of dc.terms and the like would be unacceptable. But in our case, the prefix with an underscore is the "canonical" one (as much as a prefix can be).

@cygri
Owner

cygri commented Apr 20, 2018

Good point.

The argument I made above for the limited [a-z0-9] range is pretty strong, in my opinion.

But it's good that vocabulary authors propose canonical prefixes for their vocabularies. And it is desirable to have those proposed canonical prefixes in prefix.cc. And it's inevitable that some authors will propose prefixes outside of the [a-z0-9] range currently allowed by prefix.cc.

I don't currently have a good idea on how to resolve this contradiction.

@berezovskyi
Author

I think the best way to resolve it would be to use the https://github.com/perma-id/w3id.org approach, with a pull-request model for adding prefixes. That would involve a lot of rework of the prefix.cc codebase; I'm not sure I would have the time to do it if you gave it a green light.

A practical way to resolve this may be to remove the restrictions but have some sort of premoderation. I don't think it would be much moderation work, but then again, it involves significant code changes to add an admin panel.

The most practical way would be for me to ask you to add the prefixes via phpMyAdmin and forget about this issue until more people complain :) (Though that would still require minimal code changes to resolve http://prefix.cc/oslc_rm, for example.)

@cygri
Owner

cygri commented Apr 23, 2018

Good analysis. Can you make a separate PR just to make underscores resolve? I'll dig out the phpMyAdmin password…

@hsolbrig

What is the status of this PR? We'd love to use and leverage prefix.cc, but many of the namespaces we're working with end with underscores (example: The Human Phenotype Ontology, "HP", uses "http://purl.obolibrary.org/obo/HP_" as a prefix).

@cygri
Owner

cygri commented Dec 29, 2020

@hsolbrig This PR is about underscores in the prefix. It looks like you want underscores as the last character of the namespace URI.
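
To make the distinction concrete: the prefix is the short name before the colon, and the namespace URI is what it expands to. The sketch below uses a hypothetical in-memory mapping; the lowercase prefix `hp`, the `expand` helper, and the local ID `0000001` are assumptions for illustration, not data taken from prefix.cc.

```python
# Hypothetical prefix table: the key is the prefix, the value is the
# namespace URI it expands to. The lowercase "hp" key is an assumption.
NAMESPACES = {
    "hp": "http://purl.obolibrary.org/obo/HP_",
}


def expand(curie: str) -> str:
    """Expand 'prefix:local' into a full URI using NAMESPACES."""
    prefix, local = curie.split(":", 1)
    return NAMESPACES[prefix] + local


# The underscore lives in the namespace URI, not in the prefix "hp".
print(expand("hp:0000001"))  # http://purl.obolibrary.org/obo/HP_0000001
```

In this example the prefix itself contains no underscore, so the pattern discussed in this issue is not what affects it.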

@hsolbrig

hsolbrig commented Jan 4, 2021

Ah - missed that. Has there been discussion on that topic?

@cygri
Owner

cygri commented Jan 4, 2021

berezovskyi added a commit to berezovskyi/prefix.cc that referenced this issue Jan 26, 2022
Closes cygri#24 

Tested via https://regex101.com/ with PHP7.3+ selected.

An alternative return value would be `(?=.{1,11}$)[a-z][a-z0-9]*(?:_[a-z0-9]+)?` and that would limit the whole prefix to 11 chars.
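
The alternative pattern quoted above can be tried out as follows. This is a rough check, not the code from the linked commit: it uses Python's `re` rather than PHP's PCRE, assumes the pattern is matched against the whole prefix, and the helper name `is_allowed` is made up for illustration.

```python
import re

# Alternative pattern from the commit message: a lookahead caps the total
# length at 11 characters, and at most one underscore-separated suffix
# is allowed after the leading [a-z][a-z0-9]* part.
ALTERNATIVE_PATTERN = re.compile(r"(?=.{1,11}$)[a-z][a-z0-9]*(?:_[a-z0-9]+)?")


def is_allowed(prefix: str) -> bool:
    """Return True if the entire prefix matches the alternative pattern."""
    return ALTERNATIVE_PATTERN.fullmatch(prefix) is not None


candidates = [
    "oslc_rm",       # one underscore group
    "oslc_acc",
    "dcterms",       # no underscore
    "oslc_",         # trailing underscore
    "_rm",           # leading underscore
    "a_b_c",         # two underscores
    "DCterms",       # uppercase
    "abcdefghijkl",  # 12 characters
]
for candidate in candidates:
    print(candidate, is_allowed(candidate))
```

With this pattern the OSLC prefixes from this thread and plain lowercase prefixes match, while leading, trailing, or repeated underscores, uppercase letters, and anything over 11 characters do not. It also accepts a single-letter prefix, which the original `[a-z][a-z0-9]{1,9}` pattern would not.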
@berezovskyi berezovskyi linked a pull request Jan 26, 2022 that will close this issue