Set up OpenRefine reconciliation endpoint #65

Closed
acka47 opened this Issue Apr 10, 2018 · 9 comments

@fsteeg fsteeg added the ready label May 14, 2018

@fsteeg fsteeg added working and removed ready labels Jun 25, 2018

fsteeg added a commit that referenced this issue Jun 25, 2018

@dr0i dr0i added review and removed working labels Jun 25, 2018

@fsteeg

Member

fsteeg commented Jun 25, 2018

Deployed to test:

http://test.lobid.org/gnd/reconcile

http://test.lobid.org/gnd/api#openrefine

curl --data 'queries={"q1":{"query":"Twain, Mark"}}' http://test.lobid.org/gnd/reconcile
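The curl call above posts a single form field, `queries`, whose value is a JSON object keyed by query ID. As a rough sketch of the same request body in Python (standard library only, nothing is sent over the network; the helper name `build_reconcile_body` is made up for illustration):

```python
import json
from urllib.parse import urlencode, parse_qs


def build_reconcile_body(queries):
    """Encode a dict of reconciliation queries as a form body (hypothetical helper)."""
    # One form field, `queries`, holds a JSON object that maps
    # arbitrary query IDs (here "q1") to query objects.
    return urlencode({"queries": json.dumps(queries)})


body = build_reconcile_body({"q1": {"query": "Twain, Mark"}})

# Round-trip the body to show what a server would see after form decoding.
decoded = json.loads(parse_qs(body)["queries"][0])
print(decoded)
```

The resulting `body` string could be passed as POST data to the test endpoint with any HTTP client.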

@fsteeg fsteeg assigned acka47 and unassigned fsteeg Jun 25, 2018

@acka47

Contributor Author

acka47 commented Jun 26, 2018

Shouldn't it say AuthorityResource (or an array with all the entity types listed) in the defaultTypes field? E.g. like this:

{
  "defaultTypes":[
    {
      "id":"AuthorityResource",
      "name":"authority resource"
    }
  ]
}
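
To make the two options in this comment concrete (a single `AuthorityResource` entry versus an array of entity types), here is a small sketch that builds the `defaultTypes` fragment. The helper name is hypothetical, and `Person` is only an assumed example of what a longer list might contain; it is not taken from the deployed service.

```python
import json


def service_metadata(default_types):
    """Build the `defaultTypes` part of a service manifest from (id, name) pairs."""
    return {
        "defaultTypes": [
            {"id": type_id, "name": name} for type_id, name in default_types
        ]
    }


# "AuthorityResource" comes from the comment above.
single = service_metadata([("AuthorityResource", "authority resource")])

# "Person" is an illustrative assumption, not confirmed by this thread.
several = service_metadata([
    ("AuthorityResource", "authority resource"),
    ("Person", "person"),
])

print(json.dumps(single, indent=2))
```
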
@fsteeg

Member

fsteeg commented Jun 26, 2018

Shouldn't it say AuthorityResource (or an array with all the entity types listed) [...]

Right, if we actually support multiple types that would make sense. The types allow restricting the reconciliation in the OpenRefine UI. So with a single default type the name makes no difference (in lobid-organisations it's 'lobid-organisation', thus 'lobid-gnd' here). Restricting by top-level type would probably be a useful feature, both here and in lobid-organisations.

I suggest we add the functionality in lobid-gnd in this issue, and open a new issue in lobid-organisations.
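
For illustration only, restricting by type could look roughly like the sketch below: a query object carries a `type` field, and candidates whose `type` list does not contain it are dropped. This is not the lobid-gnd implementation; the function name, the candidate IDs, and the type assignments are all made up.

```python
def restrict_by_type(candidates, requested_type=None):
    """Keep only candidates whose `type` list contains the requested type."""
    if requested_type is None:
        # No restriction requested: return all candidates unchanged.
        return list(candidates)
    return [c for c in candidates if requested_type in c.get("type", [])]


# Hypothetical candidates; IDs and types are invented for this example.
candidates = [
    {"id": "a1", "name": "Twain, Mark", "type": ["AuthorityResource", "Person"]},
    {"id": "a2", "name": "Mark Twain House", "type": ["AuthorityResource", "CorporateBody"]},
]

query = {"query": "Twain, Mark", "type": "Person"}
print(restrict_by_type(candidates, query.get("type")))
```

In a search-backed service the restriction would more likely be applied as a filter in the index query itself; the list filter here only shows the intended effect.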

@acka47

Contributor Author

acka47 commented Jun 26, 2018

Once again, OpenRefine does not automatically pick a match, which is a convenient feature for OpenRefine users. Querying both the Wikidata endpoint and the OER World Map endpoint with a string that exists there verbatim automatically matches the item, but the same does not happen for lobid-gnd. I think this has to do with the score we return: we would have to return a score of 1.0 or 100.0 for an automatic match. Comparing lobid-gnd with the Wikidata API:

$ curl --data 'queries={"q1":{"query" : "tim berners-lee", "limit" : 3} }' http://test.lobid.org/gnd/reconcile
{"q1":{"result":[{"id":"121649091","name":"Berners-Lee, Tim","score":47.57019,"match":false,"type":["lobid-gnd"]},{"id":"137213085","name":"Lee, Tim","score":30.181828,"match":false,"type":["lobid-gnd"]},{"id":"1012381277","name":"Lee, Tim","score":30.093718,"match":false,"type":["lobid-gnd"]}]}}
$ curl --data 'queries={"q1":{"query" : "Tim Berners-Lee", "limit" : 3} }' https://tools.wmflabs.org/openrefine-wikidata/de/api
{"q1": {"result": [{"all_labels": {"weighted": 100.0, "score": 100}, "type": [{"name": "Mensch", "id": "Q5"}], "name": "Tim Berners-Lee", "match": true, "score": 100.0, "id": "Q80"}, {"all_labels": {"weighted": 70.0, "score": 70}, "type": [{"name": "TED-Vortrag", "id": "Q23011722"}], "name": "Tim Berners-Lee \u00fcber das n\u00e4chste Web", "match": false, "score": 70.0, "id": "Q22980417"}, {"all_labels": {"weighted": 54.0, "score": 54}, "type": [{"name": "TED-Vortrag", "id": "Q23011722"}], "name": "Tim Berners-Lee: Eine Magna Carta f\u00fcr das Internet", "match": false, "score": 54.0, "id": "Q22991023"}]}}
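
The two responses differ in both `score` and the `match` flag. A small sketch that lists the candidates flagged `match: true` makes the difference easier to see. The data is abbreviated from the two outputs above (first two candidates each, only `id`, `name`, `score`, `match`). Which of the two fields OpenRefine actually relies on for auto-matching is not established in this thread, so the sketch only reports what the services return.

```python
# Abbreviated from the lobid-gnd response above.
lobid = {"q1": {"result": [
    {"id": "121649091", "name": "Berners-Lee, Tim", "score": 47.57019, "match": False},
    {"id": "137213085", "name": "Lee, Tim", "score": 30.181828, "match": False},
]}}

# Abbreviated from the Wikidata response above.
wikidata = {"q1": {"result": [
    {"id": "Q80", "name": "Tim Berners-Lee", "score": 100.0, "match": True},
    {"id": "Q22980417", "name": "Tim Berners-Lee über das nächste Web", "score": 70.0, "match": False},
]}}


def auto_matches(response):
    """Return (query_id, candidate_id, score) for every candidate flagged match=true."""
    return [
        (query_id, candidate["id"], candidate["score"])
        for query_id, body in response.items()
        for candidate in body["result"]
        if candidate.get("match")
    ]


print(auto_matches(lobid))
print(auto_matches(wikidata))
```
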
@acka47

Contributor Author

acka47 commented Jun 26, 2018

As it seems to be very simple, we should also add a preview API.

@acka47

Contributor Author

acka47 commented Jun 26, 2018

Re. preview API, it might make sense to just deliver the default suggestion string plus a small image if available, similar to Wikidata, see e.g. https://tools.wmflabs.org/openrefine-wikidata/en/preview?id=Q42. Providing this as a small HTML snippet makes this a bit more complicated, so we might skip this for now.
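
As a rough idea of how small such an HTML snippet could be, here is a sketch that renders a label plus an optional image. The function name, the markup, and the image height are assumptions for illustration; this is neither the Wikidata preview nor an existing lobid-gnd implementation.

```python
from html import escape


def render_preview(label, image_url=None):
    """Render a minimal HTML preview: optional small image followed by the label."""
    parts = ['<div class="preview">']
    if image_url:
        # Escape the URL for use inside a double-quoted attribute.
        parts.append('<img src="{}" alt="" height="64">'.format(escape(image_url, quote=True)))
    # Escape the label so that markup in authority data is shown as text.
    parts.append("<span>{}</span>".format(escape(label)))
    parts.append("</div>")
    return "".join(parts)


print(render_preview("Twain, Mark"))
print(render_preview("Twain, Mark", "http://example.org/twain.png"))
```

The example URL is a placeholder; a real service would look up the label and image from the record with the requested ID.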

@acka47

Contributor Author

acka47 commented Jun 26, 2018

Once again, OpenRefine does not automatically pick a match, which is a convenient feature for OpenRefine users.

Strange, testing this again with the following data yielded four automatic matches:

id	label
01	Ford Taurus
02	Adrian Pohl
03	Niederrhein-Gebiet
04	Puerto Rico. Water Resources Authority
05	Twain, Mark
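
To repeat this test outside the OpenRefine UI, the table above could be turned into one batch of reconciliation queries. This is a sketch under a few assumptions: the data is tab-separated as shown, the query IDs (`q01` and so on) are arbitrary names derived from the `id` column, and the `limit` of 3 mirrors the earlier curl example.

```python
import csv
import io
import json

# The test data from the comment above, as tab-separated text.
TSV = (
    "id\tlabel\n"
    "01\tFord Taurus\n"
    "02\tAdrian Pohl\n"
    "03\tNiederrhein-Gebiet\n"
    "04\tPuerto Rico. Water Resources Authority\n"
    "05\tTwain, Mark\n"
)


def queries_from_tsv(text, limit=3):
    """Map each row to one query object, keyed by 'q' plus the row id."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {"q" + row["id"]: {"query": row["label"], "limit": limit} for row in reader}


batch = queries_from_tsv(TSV)
print(json.dumps(batch, indent=2))
```

The JSON printed here is what would go into the `queries` form field of a single POST to the reconcile endpoint.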

fsteeg added a commit that referenced this issue Jun 27, 2018

@fsteeg

Member

fsteeg commented Jun 27, 2018

Deployed restriction by type, see: http://test.lobid.org/gnd/reconcile

About the preview API: it should be quite straightforward to implement, but we're not sure if or where it is used in the OpenRefine UI (I tested with OpenRefine 2.7 and both Wikidata and a local preview implementation for lobid-gnd, @acka47 tested with OpenRefine 3.0 beta and Wikidata reconciliation).

@acka47

Contributor Author

acka47 commented Jun 28, 2018

+1

@acka47 acka47 removed their assignment Jun 28, 2018

@dr0i dr0i added deploy and removed review labels Jun 29, 2018

@fsteeg fsteeg closed this in e36886a Jun 29, 2018

@dr0i dr0i removed the deploy label Jun 29, 2018
