Implement Wikidata reconciliation (was Freebase) [$265] #805

Closed
ghirardinicola opened this Issue Sep 4, 2013 · 20 comments

Comments

8 participants
ghirardinicola commented Sep 4, 2013

tfmorris suggests this could be an upgrade issue and that this service should not be used.
The reconciliation services I see are:

  • Sindice (dbpedia.org)
  • Sindice
  • Freebase Reconciliation Service (the one tested here)
  • Freebase Query-based Reconciliation (is this the only one supported right now?)

ERROR REPORTED
11:36:13.186 [ refine] POST /command/core/guess-types-of-column (9400ms)
11:36:16.806 [ command] Failed to guess cell types for load

There is a $265 open bounty on this issue. Add to the bounty at Bountysource.

@ghost ghost assigned tfmorris Sep 4, 2013

Member

tfmorris commented Sep 4, 2013

The standard style of Freebase reconciliation is still supported, but I'm guessing that we're not correctly upgrading existing installations. This was an upgrade from Google Refine 2.5, correct?

The URL listed:

4.standard-reconcile.dfhuynh.user.dev.freebaseapps.com./reconcile

is the old reconciliation service. One clue is the "Could not fetch URL: http://api.freebase.com/api/service/search?read=15000" error message. That's the previous API endpoint which Google has decommissioned.

The new reconciliation service lives at

http://reconcile.freebaseapps.com/reconcile

As a workaround, you should be able to add it by hand (Add Standard Service button at bottom left of reconciliation dialog).

I'll have a look at what needs to be done to make the upgrade smoother.

ghirardinicola commented Sep 5, 2013

It's not an upgrade actually; I am using the latest version from GitHub.
I was able to configure the new service by hand, thanks!

cldwalker commented Sep 13, 2013

@tfmorris Fwiw, I installed 2.5 on a mac last night and then when I installed 2.6-beta.1 I saw this issue. The suggested workaround works. Thanks!

Member

tfmorris commented Sep 13, 2013

@cldwalker Thanks for the confirmation. We'll have it fixed before the next kit goes out.

ghirardinicola commented Mar 12, 2014

I just downloaded trunk (and cleared the settings) and I still have this problem using Freebase reconciliation.
Creating a new reconciliation service using the new URL works.

This is the error:

    <h1>Error in <span class="script">//standard-reconcile.freebaseapps.com./reconcile</span></h1>

    <p class="msg">JS exception: acre.errors.URLError: urlfetch failed: 410</p>

@ghirardinicola ghirardinicola changed the title from Freebase Reconciliation Service hangs when selected (working...) to Freebase Reconciliation Service hangs when selected (working...) [$68] Apr 19, 2014

Member

thadguidry commented Apr 20, 2014

We just need to add support for the new Freebase /reconcile service on googleapis
https://developers.google.com/freebase/v1/reconciliation-overview

And ensure that an OpenRefine user has a dialog box where they can input their API key.

Member

tfmorris commented Apr 20, 2014

There are a couple of different problems described in this thread. The most recent problem is that the entire freebaseapps.com domain has been retired, so anything that lives on it, including our new Freebase reconciliation service, is gone.

@thadguidry The new reconciliation APIs have been supported since they were introduced. Unfortunately they were proxied through a service which has been shut down by Google. We can either: a) host the service somewhere else or b) special-case Freebase support differently from all the other reconciliation services and self-host it in the Refine server.

Member

thadguidry commented Apr 20, 2014

b) self host in Refine. I already spoke with David H. today about that actually and he also thought it was a good idea.

@ghirardinicola ghirardinicola changed the title from Freebase Reconciliation Service hangs when selected (working...) [$68] to Freebase Reconciliation Service hangs when selected (working...) [$165] Apr 23, 2014

Member

magdmartin commented Apr 23, 2014

If the reconciliation service is hosted on the machine running Refine, what is the impact on local resource consumption (RAM and processor)? Refine is already demanding on local resources for large projects; should we worry about adding another local service?

Member

thadguidry commented Apr 23, 2014

Martin, the design should allow the reconciliation service to be OFF by default and enabled through an OpenRefine preference.

vladan-me commented Apr 26, 2014

Eh, I was planning a whole project based on this feature and it broke... One alternative is to use Fetching URLs From Web Services, but that can be slow and limited (a single query per request), and it creates another column that I would have to rename after deleting the previous one. Do you have a better temporary workaround?

@ghirardinicola ghirardinicola changed the title from Freebase Reconciliation Service hangs when selected (working...) [$165] to Freebase Reconciliation Service hangs when selected (working...) [$265] Jun 6, 2014

ghirardinicola commented Jun 30, 2014

Do you have any advice on how to implement this?
To get started, I'd like to know where the code of the existing wrapper is.
Thanks!

@thadguidry thadguidry changed the title from Freebase Reconciliation Service hangs when selected (working...) [$265] to Implement Wikidata reconciliation (was Freebase) [$265] Jan 2, 2015

Member

thadguidry commented Jan 2, 2015

I have updated the bounty / issue to reflect the new needs for Wikidata Reconciliation. (Given that Freebase is going away and will be absorbed into Wikidata this spring)

The starting point for those interested looks like this:

https://www.wikidata.org/w/api.php?action=wbsearchentities&search=Valve&language=en&type=item

Help for the Wikidata API is here: https://www.wikidata.org/w/api.php
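
As a rough illustration of that starting point, here is a minimal Python sketch that builds the same `wbsearchentities` request and pulls candidate IDs and labels out of the response. The helper names (`build_search_url`, `extract_candidates`, `search`) and the `User-Agent` string are made up for this example, and `format=json` and `limit` are added to the URL above to get a machine-readable, bounded result. This is not how OpenRefine itself talks to Wikidata.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"


def build_search_url(term, language="en", limit=5):
    """Build a wbsearchentities URL like the one quoted in this thread."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "type": "item",
        "format": "json",  # added so the response is JSON rather than HTML help
        "limit": limit,
    }
    return API + "?" + urlencode(params)


def extract_candidates(response):
    """Return (id, label) pairs from a decoded wbsearchentities response."""
    return [(hit["id"], hit.get("label", "")) for hit in response.get("search", [])]


def search(term):
    """Query the live API (needs network access; not exercised here)."""
    # Hypothetical User-Agent value; set one that identifies your own tool.
    request = Request(build_search_url(term), headers={"User-Agent": "recon-sketch/0.1"})
    with urlopen(request) as resp:
        return extract_candidates(json.load(resp))
```

Calling `search("Valve")` should return a short list of candidate items. A real reconciliation service would still need to score them and attach type information.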

magnusmanske commented Jun 25, 2016

I did implement one a while ago, partially. Will that do?
https://tools.wmflabs.org/wikidata-reconcile/

Member

thadguidry commented Jun 25, 2016

@magnusmanske
It's a bit more than just your server side on Wikidata :) Changes also need to be made on our client side in OpenRefine. Essentially, Freebase Standard Reconcile needs to be replaced as the default. This requires working through and cleaning up most of the Java, JS, HTML, and JSON files here: https://github.com/OpenRefine/OpenRefine/search?utf8=%E2%9C%93&q=reconciliation+OR+reconcile+OR+recon&type=Code

The bounty gets awarded when both sides are done. The bounty can be split between developers if they wish, for instance, if you want to have someone do the client side improvements in our code, while you take credit and a partial bounty for the server side. Up to those involved, we don't care as long as it gets done properly and tests complete. Good luck, get others involved, or finish it all yourself :)

Also @magnusmanske, when OpenRefine performs the guess-types-of-column command, there are sometimes failures against the Wikidata reconcile service. I know that Wikidata doesn't really have types, but instead of Q numbers our users want to see names such as "automobile" or "animal", not Q1420, Q729, etc.

    er'>2.1758411128file_get_contents
    ( )../index.php:147

    {"q0":{"result":[{"id":"Q2085381","score":0.5,"match":false,"type":[],"name":"publisher"},{"id":"Q649953","score":0.33333333333333,"match":false,"type":["Q618779"],"name":"Pulitzer Prize for Editorial Cartooning"},{"id":"Q871232","score":0.5,"match":true,"type":["Q4894405"],"name":"editorial"}],"total_search_results":818},"q1":{"result":[{"id":"Q700750","score":0.5,"match":false,"type":["Q215380"],"name":"Blank & Jones"},{"id":"Q6529244","score":0.33333333333333,"match":false,"type":["Q5"],"name":"Les Blank"},{"id":"Q18441355","score":0.25,"match":false,"type":["Q134556","Q7366"],"name":"Blank Space"}],"total_search_results":540}} couldn't be parsed as JSON object
        at com.google.refine.util.ParsingUtilities.evaluateJsonStringToObject(ParsingUtilities.java:131)
        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.guessTypes(GuessTypesOfColumnCommand.java:196)
        at com.google.refine.commands.recon.GuessTypesOfColumnCommand.doPost(GuessTypesOfColumnCommand.java:89)
        at com.google.refine.RefineServlet.service(RefineServlet.java:177)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
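
The dump above suggests the JSON body arrived with stray non-JSON text in front of it (it looks like warning output from the server's `index.php`), which would explain why a strict parser rejects it. The following Python sketch, with a hypothetical helper name `parse_recon_response`, shows one way a client could detect that situation and report the stray prefix. It is only an illustration of the failure mode, not the OpenRefine code, and it assumes the stray text itself contains no `{`.

```python
import json


def parse_recon_response(body):
    """Return (parsed, stray_prefix); stray_prefix is '' for clean JSON.

    Assumption for this sketch: any junk before the JSON object does not
    itself contain '{', so the first '{' marks the start of the payload.
    """
    try:
        return json.loads(body), ""
    except json.JSONDecodeError:
        start = body.find("{")
        if start == -1:
            raise  # nothing that looks like a JSON object at all
        # raw_decode parses one JSON value starting at the given index.
        parsed, _end = json.JSONDecoder().raw_decode(body, start)
        return parsed, body[:start]


# Shortened stand-in for the response quoted in this comment.
sample = (
    "file_get_contents\n( )../index.php:147\n\n"
    '{"q0": {"result": [{"id": "Q871232", "name": "editorial", "type": ["Q4894405"]}]}}'
)
parsed, prefix = parse_recon_response(sample)
```

Logging the returned prefix makes it clear that the fault lies in the server's output rather than in the reconciliation results themselves. Fixing the warning on the server side remains the proper remedy.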


Member

wetneb commented Feb 5, 2017

I'm starting to work on this. I am building a more complete server here: https://github.com/wetneb/openrefine-wikidata

Member

wetneb commented Feb 13, 2017

@thadguidry Can we close this issue?

Member

thadguidry commented Feb 13, 2017

@wetneb Yes, and you can claim the bounty on BountySource.com

@thadguidry thadguidry closed this Feb 13, 2017

Member

wetneb commented Feb 13, 2017

Thanks a lot!!
