Skohub support for OpenRefine #11
Comments
Hello @tombaker, I want to start working on SkoHub Reconcile to facilitate reconciliation with SKOS vocabularies in OpenRefine using the above-mentioned spec. While trying to build a user story from what you've written, I was wondering whether the vocabularies you would use are always the files you linked above, or whether the SKOSMOS API would also be available in this scenario. I found the links to the search on the NALT website, but not to the SKOSMOS API.
@sroertgen We do use Skosmos for NALT. In fact, @osma is part of the NAL project team (though with his Annif hat). Implementing an OpenRefine API for Skosmos is, as far as I know, still an open issue; see NatLibFi/Skosmos#23. Since we plan to use Skosmos for all vocabularies, large and small, having such an API would solve the problem for us.
@tombaker Thanks! I think the SkoHub Reconcile module might be designed generically enough that it could use the already existing Skosmos API. I'm just looking for the "entry point", i.e. the Concept Scheme, to get all relevant data. I'm not too familiar with Skosmos, so bear with me if this is too obvious, but right now I have trouble retrieving the Concept Scheme data. I can retrieve concept data with:
and in the response I find the link to the concept scheme, i.e. but a curl to the concept scheme directs me to the html search page, e.g.
(I tried with If I can get the concept scheme data, it should be possible to use that for reconciliation. CC @osma
@sroertgen What exactly do you mean by concept scheme data? Just the triples where the subject is the skos:ConceptScheme instance of NALT? Or do you mean the whole SKOS file / graph containing all concepts? Skosmos has a REST API with methods to access individual concept data, download the whole vocabulary etc. But the NAL installation you refer to is actually not a pure Skosmos instance. It is a Drupal site (iirc) with Skosmos running in the background, and they expose some, but not nearly all, of the Skosmos functionality through the Drupal facade. I'm not sure if they make the Skosmos REST API available at all.
The first one. A JSON representation of the Concept Scheme with its top concepts etc.
I guess that is the reason why I don't get any results with the above-mentioned curl for the concept scheme: a Drupal site sits on top of it. Thanks!
If you want the triples of the concept scheme as JSON (JSON-LD actually), you can get them from a Skosmos REST API endpoint using a URL like this:
This of course won't necessarily help with the NAL installation, because as far as I understand, it doesn't expose the Skosmos REST API at all. The Skosmos REST API (unlike SkoHub, I think) has a little bit of indirection between the API URLs and the URIs of concepts and concept schemes. Basically, the URLs of the REST API are independent of the concept and concept scheme URIs, so each REST API method generally takes the URI as a parameter. It's maybe not pretty, but it's often necessary because the URI namespace of the vocabulary is often quite different from the URL where Skosmos has been installed.
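To illustrate the indirection described above, here is a minimal sketch of how such a request URL could be assembled, with the concept scheme URI passed as a query parameter rather than being part of the path. The host name, vocabulary id, and scheme URI below are placeholders, not the NAL installation, and the `/rest/v1/{vocab}/data` path with `uri` and `format` parameters reflects my understanding of the Skosmos REST API, so please check it against the API documentation of the instance you use.

```python
from urllib.parse import urlencode


def skosmos_data_url(base_url, vocab_id, uri, fmt="application/ld+json"):
    """Build a Skosmos-style 'data' URL for a concept or concept scheme URI.

    The resource URI is independent of the API URL, so it is percent-encoded
    and passed in the query string.
    """
    query = urlencode({"uri": uri, "format": fmt})
    return f"{base_url.rstrip('/')}/rest/v1/{vocab_id}/data?{query}"


# Hypothetical values, for illustration only.
url = skosmos_data_url(
    "https://skosmos.example.org/",
    "myvocab",
    "http://example.org/vocab/scheme",
)
print(url)
```

The resulting URL can then be fetched with curl or any HTTP client. Whether a given installation actually exposes this endpoint is a separate question, as noted above.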
Hello @tombaker, a quick update on the current status. We prototyped a reconciliation service for SKOS vocabularies that can be used in OpenRefine. I hope that we will be able to deploy this to a public test system within the next two weeks. I will let you know, and if you are interested, you are kindly invited to try it out and provide feedback.
Hello @tombaker, just wanted to let you know I deployed the NALT core dataset to our prototype reconcile service. Using this URL you can add a reconcile service in OpenRefine (or use this in any other tool implementing the reconciliation spec). Matches might look like this: And you can also search: I would love to hear if this is somehow useful for you and whether it shows the behavior you would expect. I had to make three small adjustments to the NALT dataset:
If you have any questions, please ask! Best
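For anyone who wants to call a reconcile service from a script instead of OpenRefine, the following is a rough sketch of what a client could look like. It assumes the batch format of the Reconciliation Service API (a form-encoded `queries` parameter holding a JSON object keyed by query id, and a response with a `result` list per id). The function names and the sample response are made up for illustration and do not come from SkoHub Reconcile itself.

```python
import json
from urllib.parse import urlencode


def build_reconcile_body(terms, limit=3):
    """Encode terms as a form body with one reconciliation query per term."""
    queries = {
        f"q{i}": {"query": term, "limit": limit}
        for i, term in enumerate(terms)
    }
    return urlencode({"queries": json.dumps(queries)})


def best_matches(response, min_score=0.0):
    """Pick the highest-scoring candidate per query id, or None if none qualify."""
    best = {}
    for key, payload in response.items():
        candidates = [
            c for c in payload.get("result", [])
            if c.get("score", 0) >= min_score
        ]
        best[key] = max(candidates, key=lambda c: c.get("score", 0), default=None)
    return best


# Hypothetical response, shaped like a reconciliation result batch.
sample = {
    "q0": {"result": [
        {"id": "c1", "name": "soil erosion", "score": 95.0, "match": True},
        {"id": "c2", "name": "erosion", "score": 60.0, "match": False},
    ]},
    "q1": {"result": []},
}
print(best_matches(sample, min_score=50))
```

The body returned by `build_reconcile_body` would be sent as a POST with content type `application/x-www-form-urlencoded` to the service URL. The score threshold is a design choice for the caller, since score ranges are service-specific.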
I will close this now, since SkoHub Reconcile basically supports OpenRefine (at least again with 68d2ac4).
In a project at NAL with @woody544, "NALT for the Machine Age", we frequently need to reconcile domain vocabularies, documented in CSV files, with NALT in order to associate domain terms with NALT URIs.
We currently do this with reconcile-csv, which performs fuzzy matching between a column in the community vocabulary spreadsheet and a column in a spreadsheet of NALT labels. reconcile-csv works fine for fuzzy matching of labels, but we see no way to integrate the tool into our own workflow scripts or even to configure the process automatically. Running reconcile-csv involves clicking on menu choices in a specific order and manually filling in fields on forms, a fiddly process that we document, internally, with annotated screenshots. It also requires that we generate a CSV of NALT labels solely for the purpose of reconciliation. We note that reconcile-csv has remained at version 0.1.3-SNAPSHOT since 2015.
We wonder if a more sophisticated reconciliation process might take into account more than just labels, for example definitions, scope notes, and related concepts.
Ideally, we'd also like to automate the reconciliation process as much as possible. For example, might a configuration file be used to specify which column in a given spreadsheet is to be reconciled against the concept scheme loaded in Skohub?
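As a rough sketch of what we have in mind, a small configuration could name the column to reconcile, and a script could turn that column into a batch of reconciliation queries. Everything here is hypothetical: the config keys (`column`, `limit`), the sample CSV, and the function names are invented for illustration and do not correspond to any existing SkoHub feature.

```python
import csv
import io
import json

# Hypothetical configuration: which column to reconcile, and how many
# candidates to request per term.
CONFIG_TEXT = '{"column": "term", "limit": 3}'

# Hypothetical domain vocabulary spreadsheet.
CSV_TEXT = (
    "term,notes\n"
    "soil erosion,from survey\n"
    "Soil erosion,duplicate\n"
    "cover crops,\n"
)


def load_terms(csv_file, column):
    """Return the distinct, non-empty values of one column, in file order."""
    seen = set()
    terms = []
    for row in csv.DictReader(csv_file):
        term = (row.get(column) or "").strip()
        key = term.casefold()  # treat case variants as duplicates
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    return terms


def build_queries(terms, limit):
    """Build a query batch keyed q0, q1, ... with one entry per term."""
    return {f"q{i}": {"query": t, "limit": limit} for i, t in enumerate(terms)}


config = json.loads(CONFIG_TEXT)
terms = load_terms(io.StringIO(CSV_TEXT), config["column"])
queries = build_queries(terms, config["limit"])
print(json.dumps(queries, indent=2))
```

In a real workflow the CSV would be read from disk and the batch sent to a reconciliation endpoint; the point is only that the column choice lives in a file rather than in a sequence of menu clicks.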
We have indicated our interest in support for OpenRefine reconciliation in issues for two other SKOS environments: Skosmos and Annif.
We note with interest the development of an improved Reconciliation Service API in a W3C community group.