Parties are messy #10
Comments
While I don't think we need to have a separate mapping file, I would I don't think it'd be easy to re-determine these en masse, as the You'd be welcome to take a shot at it, and I'm happy to take a shot myself On Sun, Jan 6, 2013 at 7:42 PM, Gordon P. Hemsley
Developer | sunlightfoundation.com |
Some of the party info probably came from the bioguide search listing pages: e.g. http://bioguide.congress.gov/biosearch/biosearch1.asp I may have filled it in with other sources. We should correct executive.yaml's parties for Democrats to match the others, since the legislators file is in more use than the executive file (no use). 'Democrat' is correct in so far as it is the noun form (e.g. "I am a Democrat." is correct). In current/recent data, the party for someone who switched is always the most recent party (for that term). I think that's a good rule. We might do that to fix the historical data, and also add a new field like we did for names that links parties to time periods in just the cases where the party changed. But we shouldn't split terms. |
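One way to picture the "new field that links parties to time periods" idea is the sketch below. It is only an illustration: the field name `party_periods`, the placeholder party names, and all dates are hypothetical and do not come from the data files. The term keeps its single `party` value (the most recent party for that term), and the extra list is consulted only when it is present.

```python
from datetime import date

# Hypothetical term record. "party_periods" is an assumed field name, and the
# dates and party names are placeholders, not real data.
term = {
    "start": date(2001, 1, 3),
    "end": date(2003, 1, 3),
    "party": "Party B",  # most recent party for the term
    "party_periods": [
        {"start": date(2001, 1, 3), "end": date(2001, 12, 31), "party": "Party A"},
        {"start": date(2002, 1, 1), "end": date(2003, 1, 3), "party": "Party B"},
    ],
}


def party_on(term, day):
    """Return the party in effect on `day`, falling back to the term's party."""
    for period in term.get("party_periods", []):
        if period["start"] <= day <= period["end"]:
            return period["party"]
    # Terms without a recorded switch carry only the single "party" value.
    return term["party"]
```

Under this layout the term itself is never split, and clients that ignore the extra field still see one party per term.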
Sorry, for the bioguide link, start at the home page http://bioguide.congress.gov/biosearch/biosearch.asp and just choose a state. That's what I meant. |
Yeah, the conversion part should be easy and rather straightforward. But first we should sort out what the appropriate names are. I was under the impression that the party field represents the name of the party, not the noun used to describe a member of that party. (That is, a member of the XYZ Party would be listed as "XYZ", not as an "XYZan".) And if that is not the case, then I am of the opinion that it should be the case. This might be a benefit to using abbreviation codes from a separate file: We can use the abbreviation to point to the full party name (including "Party") or leave it out when appropriate ("unknown", "no party", "independent", etc.). (The other benefit being that we'd have the abbreviation for free if we want to, for example, have a party/state/district tag after the person's name.) With regard to party switchers, I won't argue with using the party at the end of the term, though I'm not clear what you're referring to wrt "a new field like we did for names that links parties to time periods in just the cases where the party changed". What's an existing example of this that I can look at? I think the party data is likely for the most part accurate (modulo the other issues discussed here), and I am happy to spot-check ones that seem inaccurate. |
I agree that the party names should be the names of the parties, and not What Josh was referring to, I think, is the other_names we added for I still don't think we need abbreviations, or a conversion file - we just On Mon, Jan 7, 2013 at 11:51 AM, Gordon P. Hemsley <notifications@github.com
Developer | sunlightfoundation.com |
Alright, I'll let you two decide how to handle the party switching issue. Though I should note: sometimes mid-term party switching changes party control in Congress, as it did in the 107th Congress when Jim Jeffords changed from Republican to Independent, so it could be important to note a little further than just an "oh yeah by the way" field. But I'd like to continue to argue for the party abbreviation mapping. One thing I edited into my last comment (which you probably didn't get via e-mail) was this: So if someone wanted to display "Chuck Schumer [D-NY]", they'd have all that information for free, without having to reverse-engineer any of it. Similarly, if they wanted to display someone with a more obscure party abbreviation, like "Al Franken [DFL-MN]" or "Joe Lieberman [ID-CT]", they wouldn't have to do anything special; every party would be processed the same way. Along those same lines, the abbreviation/mapping would be a way to differentiate between party and identifier: DFL represents the "Democratic-Farmer-Labor Party", while ID represents "Independent Democrat", which is not actually a party. And "no party" is not strictly the same as "independent": George Washington is (almost?) always described as having no party, not as being an independent. Wikipedia is an excellent source to get all this (IMO, important) information that the BioGuide might not make clear. |
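A minimal sketch of how such a mapping might be used, assuming a hypothetical `PARTIES` table keyed by abbreviation. The table layout and the `is_party` flag are illustrative assumptions, not an agreed format; only the three abbreviations mentioned in the comment above are included.

```python
# Hypothetical abbreviation table. The structure and the "is_party" flag are
# assumptions made for illustration only.
PARTIES = {
    "D": {"name": "Democratic", "is_party": True},
    "DFL": {"name": "Democratic-Farmer-Labor", "is_party": True},
    "ID": {"name": "Independent Democrat", "is_party": False},
}


def display_tag(name, party_code, state):
    """Build a label such as 'Al Franken [DFL-MN]' from an abbreviation."""
    if party_code not in PARTIES:
        raise KeyError(f"unknown party abbreviation: {party_code}")
    return f"{name} [{party_code}-{state}]"


def full_party_name(party_code):
    """Append 'Party' only for entries that are actual parties."""
    entry = PARTIES[party_code]
    return f"{entry['name']} Party" if entry["is_party"] else entry["name"]
```

Every abbreviation goes through the same two functions, so a rarer code like DFL or ID needs no special handling on the client side.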
This is a very good point, I agree that people should have the ability to get those abbreviations (I didn't think that was the kind of abbreviation you meant). Not to be a stick in the mud about the separate file, but I think we can still achieve this without that, by having it be a second field. In other words, have both "party" and "party_abbreviation" fields. I only continue to push this because having to link files together is a pain from a client parsing standpoint, and having separate files to interact with in our scripts is also a maintenance burden. I'd rather keep the data slightly denormalized. |
@GPHemsley It's actually even more complex than that. The party that members run under during the election may have no connection to whether they caucus with the Republicans or with the Democrats once elected (especially independents), and it's how they caucus that determines majority/minority control. I'm not opposed to changing Democrat/Democratic, but please not right now. I'm still trying to catch my breath from last week. Gimme a few weeks. Normalizing historical party names so that they're at least consistent sounds good to me. Everything else sounds like you (@GPHemsley) should try it out on a separate branch/fork so we can see what it looks like and what the ramifications are. |
I don't think there's anything urgent here w/r/t Democratic/Democrat. Though, if we add a party abbreviation field, that would then be the thing to hinge one's logic on going forward, so that name corrections could be made to parties whenever, without breaking anyone's stuff. |
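To illustrate that point, here is a small sketch of client logic keyed on the proposed `party_abbreviation` field instead of the party name. The record layout is an assumption for illustration; `party_abbreviation` is the field proposed in this thread, not one that exists in the files today.

```python
# Illustrative term records: the same term before and after a hypothetical
# correction of the party name from "Democrat" to "Democratic".
term_before = {"state": "NY", "party": "Democrat", "party_abbreviation": "D"}
term_after = dict(term_before, party="Democratic")


def is_democrat(term):
    """Key the check on the abbreviation, not the human-readable name."""
    return term.get("party_abbreviation") == "D"
```

Because the check never reads `party`, renaming "Democrat" to "Democratic" leaves the client's result unchanged.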
I was thinking about this some more, and I think that the benefits of having a separate field would outweigh the costs in the long term (which doesn't necessarily have to start right this second). There may be some use cases for having two separate fields and allowing them to vary independently, but I think for 99% of the cases, they would be redundant and would only add to the bulk of the file size. (And they potentially run the risk of becoming out of sync by accident.) On the other hand, having the parties be represented by abbreviations or codes means they can easily be used and referenced in multiple different places. So here's what I'm thinking:
So that's my thinking on what the way forward should be, but I'm not at all averse to letting things settle down before attempting to make any changes. |
Redundancy and adding to file size aren't a big concern for me (I accept that as a price of denormalization). I think you're making good points for why having a separate set of metadata around each party would be beneficial. I'm still not there yet on thinking it's worth having a parties.yaml file that needs to be referenced by anyone who wants to parse the contents of the legislators-current.yaml file (my preference would even be to merge the -current and -historical files). There's a tension between the terms list being precise and being understandable. I think there are (at least) 3 reasonable choices:
No. 2 in that list is by far the easiest on us, the maintainers. No. 1 is by far the hardest on client parsers. No. 3 is by far the hardest to transition to for both maintainers and existing parsers, but the most precise and useful. So we've been going with No. 2, and I've been fine with that. I would also be fine with No. 3, but I think we'd want to tackle it in full (not just making it so party breaks up the terms, but all relevant changes) rather than do a piece of it now but then realize that another field is also useful to hinge on later. I'm with Josh on not making this drastic of a transition in the immediate future, but I do want to frame the issue for later thinking. |
Not realizing you made that comment, I began the argument about when to split terms in #15. |
Just perusing How are these YAML files updated? Are they updated by your scripts, or by hand? |
The committee files use a different format than the legislators/executive files. See further down in the README for their format. |
Ah, thanks! |
Much of the party information is flawed or straight-up inaccurate. In addition, it doesn't seem like there has been any attempt to standardize the names—a few names differ only in capitalization (e.g. "Pro-administration" vs. "Pro-Administration").
Here's the full list of parties used:
[
'AL',
'Adams',
'Adams Democrat',
'American',
'American Labor',
'Anti Jackson',
'Anti Jacksonian',
'Anti Mason',
'Anti Masonic',
'Anti-Administration',
'Anti-Jacksonian',
'Anti-Lecompton Democrat',
'Anti-administration',
'Coalitionist',
'Conservative',
'Conservative Republican',
'Constitutional Unionist',
'Crawford Republican',
'Democrat',
'Democrat Farmer Labor',
'Democrat-Liberal',
'Democrat-turned-Republican',
'Democrat/Independent',
'Democrat/Republican',
'Democratic',
'Democratic - Republican',
'Democratic Republican',
'Democratic and Union Labor',
'Democratic-Republican',
'Farmer-Labor',
'Federalist',
'Free Silver',
'Free Soil',
'Ind. Democrat',
'Ind. Republican',
'Ind. Republican-Democrat',
'Ind. Whig',
'Independent',
'Independent Democrat',
'Independent/Republican',
'Jackson',
'Jackson Republican',
'Jacksonian',
'Jacksonian Republican',
'Law and Order',
'Liberal',
'Liberal Republican',
'Liberty',
'National Greenbacker',
'New Progressive',
'Nonpartisan',
'Nullifier',
'Popular Democrat',
'Populist',
'Pro-Administration',
'Pro-administration',
'Progressive',
'Progressive Republican',
'Prohibitionist',
'Readjuster',
'Readjuster Democrat',
'Republican',
'Republican-Conservative',
'Silver',
'Silver Republican',
'Socialist',
'States Rights',
'Unconditional Unionist',
'Union',
'Union Democrat',
'Union Labor',
'Unionist',
'Unknown',
'Whig',
'no party'
]
(Note: Legislators are said to be in the "Democrat" party, while executives are in the "Democratic" party; the latter is the appropriate one.)
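As a rough sketch of how the near-duplicates in the list above could be surfaced, the snippet below groups names that differ only in capitalization, hyphenation, or spacing. The function names are hypothetical, and the input is a small sample copied from the list, not the whole set. Whether each group really refers to a single party still needs a human decision.

```python
import re
from collections import defaultdict


def normalize_key(name):
    """Lowercase and collapse runs of hyphens/whitespace into one space."""
    return re.sub(r"[\s\-]+", " ", name.strip().lower())


def group_variants(names):
    """Return only the groups in which more than one spelling collides."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_key(name)].append(name)
    return {key: found for key, found in groups.items() if len(found) > 1}


# Sample taken from the list above.
parties = [
    "Anti Jacksonian", "Anti-Jacksonian",
    "Anti-Administration", "Anti-administration",
    "Democratic - Republican", "Democratic Republican", "Democratic-Republican",
    "Pro-Administration", "Pro-administration",
    "Federalist", "Whig",
]
variants = group_variants(parties)
```

This catches only mechanical differences. Pairs such as "Anti Jackson" and "Anti Jacksonian", or "Democrat" and "Democratic", would need an explicit alias table.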
I would recommend consolidating some of these, and perhaps having a separate file that maps names to abbreviations, each of which should be unique.
In addition, I think that changing parties mid-term should be shown with two terms, but I suppose that's debatable. (Another option is to only list the party at the time of election.) As it stands, when a candidate changes parties mid-term, they get a party like "Democrat/Independent" or similar.
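A minimal sketch for flagging such combined values, assuming that "/" and "-turned-" are the only separators that mark a switch and that the parts are listed in chronological order. Neither assumption has been verified against the data, and the function name is hypothetical.

```python
import re


def split_party_switch(value):
    """Split a combined value like 'Democrat/Independent' into its parts.

    Plain hyphens are left alone so that names such as
    'Democratic-Republican' are not split by mistake.
    """
    parts = re.split(r"\s*/\s*|-turned-", value)
    return [part for part in parts if part]
```

A value that yields more than one part is a candidate for manual review. If the ordering assumption holds, the last part would be the party at the end of the term.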
This page has abbreviations for some of the more prominent parties (perhaps just the ones in the Senate), but I don't think it covers all the parties used here:
http://www.senate.gov/artandhistory/history/common/generic/Key_Party_Abbreviations.htm