Nigeria (Senate): Add Shine Your Eye IDs #25487
NB I haven't tried rerunning the reconciliation myself to make sure that I get the same matches. Perhaps @mhl might like to do that?
"file": "morph/shineyoureye.csv",
"create": {
  "from": "morph",
  "scraper": "struan/nigeria-shineyoureye",
We have quite a strong preference not to have new scrapers outside the everypolitician-scrapers account. Are you able to move this across?
"create": {
  "from": "morph",
  "scraper": "struan/nigeria-shineyoureye",
  "query": "SELECT *, NULL as position FROM data WHERE position = 'Senator'"
I think this will need to alias the identifier__pombola_slug column on the way through. At the minute that creates data in the format:
"identifiers": [
  {
    "identifier": "samuel-egwu",
    "scheme": "pombola_slug"
  }
],
But 'pombola_slug' doesn't really describe what the source is. I suspect this should just be source__shineyoureye.
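To make the aliasing point concrete, here is a minimal Python sketch of how a column named identifier__<scheme> could end up as an entry in "identifiers". It assumes the build step simply strips the identifier__ prefix to get the scheme; the helper names (identifiers_from_row, rename_column) and the "name" value are hypothetical and are not taken from the EveryPolitician code.

```python
def identifiers_from_row(row, prefix="identifier__"):
    """Turn `identifier__<scheme>` columns into a list of identifier dicts.

    Assumption: the scheme is whatever follows the prefix in the column name.
    """
    identifiers = []
    for column, value in row.items():
        if column.startswith(prefix) and value:
            identifiers.append({
                "identifier": value,
                "scheme": column[len(prefix):],
            })
    return identifiers


def rename_column(row, old, new):
    """Return a copy of the row with one column renamed (the alias step)."""
    return {(new if key == old else key): value for key, value in row.items()}


# Hypothetical scraper row; only the slug value comes from the thread.
row = {"name": "Example Name", "identifier__pombola_slug": "samuel-egwu"}
aliased = rename_column(row, "identifier__pombola_slug", "identifier__shineyoureye")

print(identifiers_from_row(row))      # scheme is "pombola_slug"
print(identifiers_from_row(aliased))  # scheme is "shineyoureye"
```

In the instructions file the same effect would presumably come from an AS alias in the "query" string rather than from Python; the sketch only shows why the column name determines the scheme.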
I worry that just having shineyoureye may turn out to be confusing if / when we migrate to the static ShineYourEye site. Also, I'd like to include slug in the scheme, since it might well be useful to include the primary key ID as well. So then you could have the two schemes:
source__shineyoureye__pombola_slug
source__shineyoureye__pombola_id
... perhaps?
I don't think I understand deeply enough why this is being included here (or indeed what the two different values might be used for) and what the lifespan of the usefulness is.
Possibly the most visible usage will be that once this is in the data, EveryPolitician.org will start displaying it on each card, and I'd like to be able to turn that into a link to the ShineYourEye site. If you're not maintaining URL-compatibility between the two versions of the site, then we'll need to tweak the identifier → URL mapping at that point, but that's fairly simple to do. If the prior identifier also becomes largely meaningless at that point, then I'm not sure it'll be worth continuing to include it.
I suspect you're also wanting to include it/them, though, so that it makes something else easier at your end?
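As a rough illustration of the identifier → URL mapping mentioned above, the following Python sketch looks up a URL template by scheme. The table and function names (URL_TEMPLATES, identifier_url) and the scheme name "shineyoureye" are assumptions for illustration; only the /person/<slug>/ URL shape is taken from the ShineYourEye URL quoted in this thread.

```python
# Hypothetical scheme -> URL template table. If the site's URLs change,
# only the template for that scheme would need updating.
URL_TEMPLATES = {
    "shineyoureye": "http://www.shineyoureye.org/person/{identifier}/",
}


def identifier_url(scheme, identifier):
    """Return a link for a known scheme, or None if the scheme has no template."""
    template = URL_TEMPLATES.get(scheme)
    if template is None:
        return None
    return template.format(identifier=identifier)


print(identifier_url("shineyoureye", "uche-lilian-ekwunife"))
```

Schemes without a template fall through to None, so a card could display the bare identifier instead of a link.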
The immediate use for us is so that we can make sure that the old URLs for people on the Pombola-based shineyoureye.org work on the new static site without any redirection.
I do care much more about this being merged soon than I do about the name of the identifier scheme, though, so feel free to ignore this.
I'm still struggling to follow this a little. ISTM that the field that Pombola internally calls the slug is really the outward facing identifier, as it's what's used in the URLs, and if you're planning to also keep the URLs the same on the new site, then I think that this is definitely OK as a plain identifier__shineyoureye.
I think that the internal Pombola IDs are a bit of a red herring here, though I'm open to persuasion that there's some value in EP having them. If there is, then I think we should come up with a different name for them to convey how someone would actually consume/use them.
The point where I was concerned about the confusion arising is if in the future we change the URLs for people on the new static ShineYourEye to some different scheme. Then we'd need to switch to getting those identifiers from a different source than the old Pombola site anyway, and changing the meaning of source__shineyoureye would confuse me more than deprecating one called source__shineyoureye_pombola_slug or source__shineyoureye_pombola and adding a new one called source__shineyoureye_static (or whatever) at that point. I can see that you might take the opposite point of view, though: that there'll only be one shineyoureye.org at a time, and so it's less confusing to just have one identifier scheme that always can be used to generate URLs to shineyoureye.org.
The internal Pombola IDs may indeed be a red herring - we don't have an immediate use for them, hence why I said "might well be useful" rather than "will be useful" :)
I've run the reconciliation myself (but using the fuller version of names we can now get from the scraper) and there was one match @struan included that I didn't:
That's: http://www.shineyoureye.org/person/uche-lilian-ekwunife/ in Pombola, and LILIAN EKWUNIFE http://www.nass.gov.ng/images/mps/91.jpg from the National Assembly website, so I think @struan was right and somehow I missed her.
583079d to cc8d2fc
Summary of changes in
People
  Added: No people added
  Removed: No people removed
  Name Changes: No name changes
  Additional Name Changes
  Wikidata Changes: No changes
Organizations
  Added: No organizations added
  Removed: No organizations removed
Memberships
  Added: No memberships added
  Removed: No memberships removed
"contact_details": [
  {
    "type": "email",
    "value": "abs@sarakimail.com bukisaraki@aol.com"
There are about 5 examples of email fields that are doubled up like this. In the scraper you can separate those with a semicolon to have them flow through as separate values here. (Or, I guess, at a pinch you could update the SELECT to replace on space, but it's probably better to do it in the scraper)
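A minimal sketch of the scraper-side fix, written in Python for illustration (the scraper itself may be in another language, and split_emails is a hypothetical helper name). It assumes the doubled-up addresses are separated only by whitespace, as in the example above.

```python
def split_emails(raw):
    """Join whitespace-separated email addresses with semicolons.

    Assumption: a semicolon-separated value is what lets the addresses
    flow through as separate contact details downstream.
    """
    return ";".join(raw.split())


print(split_emails("abs@sarakimail.com bukisaraki@aol.com"))
# prints: abs@sarakimail.com;bukisaraki@aol.com
```

A field that already holds a single address passes through unchanged.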
cc8d2fc to 8ca05e6
8ca05e6 to b13bc73
As discussed in Slack, I've brought this legislature up to date separately with the primary source. There are now a few conflicts to resolve, but that update also included an extra Senator who was previously missing, and whom you might need to reconcile to this new source.
[ORDER BY added by @mhl]
b13bc73 to 89b91e0
I've updated this for master and the updated scraper data, and added an ORDER BY in the instructions for that source. The new senator doesn't appear to be in the Pombola ShineYourEye data. (In the course of doing this I did notice another missing senator, but I'd like to deal with adding him separately afterwards - it's more urgent for us that we have most of this data present.)
89b91e0 to a0dfac9
Woot!
This will allow adding the IDs from the existing ShineYourEye MP profiles.