Over time the db has accumulated a fair number of subjects and
employers with duplicate names, which causes problems for
views that look records up by a slugified version of the name.
This commit tightens up the lookups to use the Freebase ID
and also includes a new command-line utility to help diagnose
and correct these duplicates.
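
To illustrate why duplicate names break slug-based views, here is a minimal
sketch of how colliding slugs could be detected. The `slugify` helper below is
a simplified stand-in written for this example (not the project's actual
slugifier), and `find_duplicate_slugs` is a hypothetical name, not the
utility added in this commit.

```python
import re
from collections import defaultdict


def slugify(name):
    # Simplified stand-in: lowercase the name and collapse every run of
    # non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def find_duplicate_slugs(names):
    # Group names by slug and keep only the slugs shared by several names.
    by_slug = defaultdict(list)
    for name in names:
        by_slug[slugify(name)].append(name)
    return {slug: group for slug, group in by_slug.items() if len(group) > 1}
```

Any slug that maps to more than one name is ambiguous for a view that looks
a record up by slug alone, which is why the lookups now use the Freebase ID.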
This is step one in moving from Freebase to Wikidata. I added wikidata_id
to the Employer, Location and Subject models. Then I added a migration to
look up the existing entities in Wikidata using Wikidata's SPARQL endpoint.
The matching logic thus far is:
1. Look up entity using the Freebase ID
2. Use the name of the entity to derive the Wikipedia URL and look that up
3. Search for the entity by its label
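
The three steps above can be sketched roughly as follows. This is not the
migration's code: it assumes the steps are tried in order and the first match
wins, that the SPARQL endpoint predefines the `wdt:`, `schema:` and `rdfs:`
prefixes, and that English labels and English Wikipedia are the targets.
`run_query` is a hypothetical callable that sends a SPARQL string to the
endpoint and returns a Wikidata ID or None; the function names are likewise
made up for illustration.

```python
from urllib.parse import quote


def freebase_query(freebase_id):
    # P646 is Wikidata's "Freebase ID" property.
    return 'SELECT ?item WHERE { ?item wdt:P646 "%s" } LIMIT 1' % freebase_id


def wikipedia_query(name):
    # Derive the English Wikipedia article URL from the entity name.
    # The exact percent-encoding Wikidata expects may differ for some titles.
    url = "https://en.wikipedia.org/wiki/" + quote(name.replace(" ", "_"))
    return "SELECT ?item WHERE { <%s> schema:about ?item } LIMIT 1" % url


def label_query(name):
    # Escape backslashes and double quotes so the name is a valid literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return 'SELECT ?item WHERE { ?item rdfs:label "%s"@en } LIMIT 1' % escaped


def find_wikidata_id(name, freebase_id, run_query):
    # Try each strategy in turn and return the first Wikidata ID found.
    queries = []
    if freebase_id:
        queries.append(freebase_query(freebase_id))
    queries.append(wikipedia_query(name))
    queries.append(label_query(name))
    for query in queries:
        wikidata_id = run_query(query)
        if wikidata_id:
            return wikidata_id
    return None
```

Passing `run_query` in as a parameter keeps the matching order testable
without touching the network.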
The next step is to purge entities that don't have Wikidata IDs, and then
to create new suggest functionality that uses Wikidata instead of Freebase.