Added candidate names and likely religion for 2007, 2009, 2012 and 2014 in UP
raphael-susewind committed Feb 12, 2016
1 parent 0346d86 commit 1f05b3f1505c29cdcc03e965ffe2b9c54fe6ec05
Showing with 25,862 additions and 3 deletions.
  1. +6 −2 README.md
  2. +0 −1 ROADMAP.md
  3. +4 −0 combined.sql
  4. +52 −0 upcandidates2007/LICENSE.md
  5. +23 −0 upcandidates2007/README.md
  6. +6,087 −0 upcandidates2007/candidates-2007.csv
  7. +62 −0 upcandidates2007/charmap.py
  8. +422 −0 upcandidates2007/createnamedb.pl
  9. +762 −0 upcandidates2007/guesscommunity.pl
  10. +86 −0 upcandidates2007/soundex.py
  11. +155 −0 upcandidates2007/transform.pl
  12. +404 −0 upcandidates2007/upcandidates2007.csv
  13. +411 −0 upcandidates2007/upcandidates2007.sql
  14. +52 −0 upcandidates2009/LICENSE.md
  15. +23 −0 upcandidates2009/README.md
  16. +1,369 −0 upcandidates2009/candidates-2009.csv
  17. +62 −0 upcandidates2009/charmap.py
  18. +422 −0 upcandidates2009/createnamedb.pl
  19. +762 −0 upcandidates2009/guesscommunity.pl
  20. +86 −0 upcandidates2009/soundex.py
  21. +154 −0 upcandidates2009/transform.pl
  22. +87 −0 upcandidates2009/upcandidates2009.csv
  23. +94 −0 upcandidates2009/upcandidates2009.sql
  24. +52 −0 upcandidates2012/LICENSE.md
  25. +23 −0 upcandidates2012/README.md
  26. +7,032 −0 upcandidates2012/candidates-2012.csv
  27. +62 −0 upcandidates2012/charmap.py
  28. +422 −0 upcandidates2012/createnamedb.pl
  29. +762 −0 upcandidates2012/guesscommunity.pl
  30. +86 −0 upcandidates2012/soundex.py
  31. +154 −0 upcandidates2012/transform.pl
  32. +447 −0 upcandidates2012/upcandidates2012.csv
  33. +454 −0 upcandidates2012/upcandidates2012.sql
  34. +1,369 −0 upcandidates2014/Candidates.csv
  35. +52 −0 upcandidates2014/LICENSE.md
  36. +1,667 −0 upcandidates2014/Political_parties.csv
  37. +25 −0 upcandidates2014/README.md
  38. +62 −0 upcandidates2014/charmap.py
  39. +422 −0 upcandidates2014/createnamedb.pl
  40. +762 −0 upcandidates2014/guesscommunity.pl
  41. +86 −0 upcandidates2014/soundex.py
  42. +169 −0 upcandidates2014/transform.pl
  43. +81 −0 upcandidates2014/upcandidates2014.csv
  44. +88 −0 upcandidates2014/upcandidates2014.sql
@@ -15,6 +15,10 @@ table | description
[uploksabha2009](https://github.com/raphael-susewind/india-religion-politics/tree/master/uploksabha2009) | Booth-level (form 20) results for the 2009 Lok Sabha election from Uttar Pradesh
[upvidhansabha2012](https://github.com/raphael-susewind/india-religion-politics/tree/master/upvidhansabha2012) | Booth-level (form 20) results for the 2012 Vidhan Sabha election in Uttar Pradesh
[uploksabha2014](https://github.com/raphael-susewind/india-religion-politics/tree/master/uploksabha2014) | Booth-level (form 20) results for the 2014 Lok Sabha election from Uttar Pradesh
[upcandidates2007](https://github.com/raphael-susewind/india-religion-politics/tree/master/upcandidates2007) | Candidates and their likely religion for the 2007 Vidhan Sabha election in Uttar Pradesh
[upcandidates2009](https://github.com/raphael-susewind/india-religion-politics/tree/master/upcandidates2009) | Candidates and their likely religion for the 2009 Lok Sabha election from Uttar Pradesh
[upcandidates2012](https://github.com/raphael-susewind/india-religion-politics/tree/master/upcandidates2012) | Candidates and their likely religion for the 2012 Vidhan Sabha election in Uttar Pradesh
[upcandidates2014](https://github.com/raphael-susewind/india-religion-politics/tree/master/upcandidates2014) | Candidates and their likely religion for the 2014 Lok Sabha election from Uttar Pradesh
[uprolls2011](https://github.com/raphael-susewind/india-religion-politics/tree/master/uprolls2011) | Booth-level estimates of religious demography for 2011 across Uttar Pradesh
[uprolls2012](https://github.com/raphael-susewind/india-religion-politics/tree/master/uprolls2012) | Booth-level estimates of religious demography for 2012 across Uttar Pradesh
[uprolls2013](https://github.com/raphael-susewind/india-religion-politics/tree/master/uprolls2013) | Booth-level estimates of religious demography for 2013 across Uttar Pradesh
@@ -36,8 +40,8 @@ One particularly important set of tables are the various "id" ones - they map th
The estimates of **religious demography** use an algorithm which is also on [GitHub](https://github.com/raphael-susewind/name2community/tree/ngram) and described more fully in the following article of mine (upscaling was generously sponsored by the [Oxford Advanced Research Computing unit](http://arc.ox.ac.uk)):
> Susewind, R. (2015). [What's in a name? Probabilistic inference of religious community from South Asian names](http://dx.doi.org/10.1177/1525822X14564275). Field Methods 27(4), 319-332.
Another useful source that complements this data are the **GIS shapefiles** for polling booths, stations, assembly segments and parliamentary constituencies which I published here (and which use the same ID codes):
Another useful source that complements this data is the set of **GIS shapefiles** for assembly segments and parliamentary constituencies included in the following dataset; the ID codes used therein are compatible with the *loksabha2014 tables (note that the polling booth localities as such are also directly embedded in the *gis tables, so you only need the shapefiles to map higher levels of aggregation):
> Susewind, R. (2014). [GIS shapefiles for India's parliamentary and assembly constituencies including polling booth localities](http://dx.doi.org/10.4119/unibi/2674065). Published under a CC-BY-NC-SA 4.0 license.
@@ -6,7 +6,6 @@ This roadmap is a reminder to myself what I aim to achieve over the next couple
For the first official release of this dataset, I aim to import and double-check all Uttar Pradesh data that was formerly hosted on my personal website. Specifically, these tasks are still open:
* Add candidate name, party and likely religion for 2007, 2012 and 2014 in separate candidate tables
* Add actual data from rolls in 2012 and 2013 and double check lastupdate for both
* Add frontpage stuff from rolls in 2014 (which might be slightly different from 2011-13) into upid
* Integrate the upid rows using old code updated to cover 2014 as well, and preferably taking into account booth details from first pages, psname2partname as well as spatial data - well actually, just rewrite the whole integration script if I am honest...
@@ -19,5 +19,9 @@
.read uprolls2013/uprolls2013.sql
.read uprolls2014/uprolls2014-a.sql
.read uprolls2014/uprolls2014-b.sql
.read upcandidates2007/upcandidates2007.sql
.read upcandidates2009/upcandidates2009.sql
.read upcandidates2012/upcandidates2012.sql
.read upcandidates2014/upcandidates2014.sql
.read upgis/upgis.sql
.read upid/upid.sql
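The `.read` lines above are dot-commands of the `sqlite3` command-line shell, so Python's `sqlite3` module cannot execute this file directly. The following is a minimal sketch of one way to replay such a file from Python. It rests on two assumptions that the commit does not confirm: the included files contain plain SQL only (no further dot-commands), and paths are resolved relative to the directory of the script (the shell itself resolves them against the current working directory). The helper name `run_with_reads` and the demo files are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_with_reads(conn, script_path):
    """Execute a SQL script, expanding sqlite3-shell style `.read` lines.

    Assumption: included files hold plain SQL only; other dot-commands
    are not handled by this sketch.
    """
    script_path = Path(script_path)
    pending = []  # ordinary SQL lines collected between .read lines
    for line in script_path.read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        if stripped.startswith(".read "):
            if pending:
                conn.executescript("\n".join(pending))
                pending = []
            # Resolve the included file relative to the script's directory.
            included = script_path.parent / stripped.split(maxsplit=1)[1]
            conn.executescript(included.read_text(encoding="utf-8"))
        else:
            pending.append(line)
    if pending:
        conn.executescript("\n".join(pending))


# Hypothetical demo files standing in for combined.sql and one included table.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "demo").mkdir()
    (root / "demo" / "demo.sql").write_text(
        "CREATE TABLE demo (id INTEGER);\nINSERT INTO demo VALUES (1);\n",
        encoding="utf-8",
    )
    (root / "combined.sql").write_text(".read demo/demo.sql\n", encoding="utf-8")

    conn = sqlite3.connect(":memory:")
    run_with_reads(conn, root / "combined.sql")
    row_count = conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
    conn.close()
```

If any included file turns out to rely on shell-only commands, feeding the script to the `sqlite3` shell itself is the safer route.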
@@ -0,0 +1,52 @@
## ODC Database Contents License
The Licensor and You agree as follows:
### 1.0 Definitions of Capitalised Words
The definitions of the Open Database License (ODbL) 1.0 are incorporated
by reference into the Database Contents License.
### 2.0 Rights granted and Conditions of Use
2.1 Rights granted. The Licensor grants to You a worldwide,
royalty-free, non-exclusive, perpetual, irrevocable copyright license to
do any act that is restricted by copyright over anything within the
Contents, whether in the original medium or any other. These rights
explicitly include commercial use, and do not exclude any field of
endeavour. These rights include, without limitation, the right to
sublicense the work.
2.2 Conditions of Use. You must comply with the ODbL.
2.3 Relationship to Databases and ODbL. This license does not cover any
Database Rights, Database copyright, or contract over the Contents as
part of the Database. Please see the ODbL covering the Database for more
details about Your rights and obligations.
2.4 Non-assertion of copyright over facts. The Licensor takes the
position that factual information is not covered by copyright. The DbCL
grants you permission for any information having copyright contained in
the Contents.
### 3.0 Warranties, disclaimer, and limitation of liability
3.1 The Contents are licensed by the Licensor "as is" and without any
warranty of any kind, either express or implied, whether of title, of
accuracy, of the presence of absence of errors, of fitness for purpose,
or otherwise. Some jurisdictions do not allow the exclusion of implied
warranties, so this exclusion may not apply to You.
3.2 Subject to any liability that may not be excluded or limited by law,
the Licensor is not liable for, and expressly excludes, all liability
for loss or damage however and whenever caused to anyone by any use
under this License, whether by You or by anyone else, and whether caused
by any fault on the part of the Licensor or not. This exclusion of
liability includes, but is not limited to, any special, incidental,
consequential, punitive, or exemplary damages. This exclusion applies
even if the Licensor has been advised of the possibility of such
damages.
3.3 If liability may not be excluded by law, it is limited to actual and
direct financial loss to the extent it is caused by proved negligence on
the part of the Licensor.
@@ -0,0 +1,23 @@
# Data on religion and politics in India
## upcandidates2007
This table contains a list of candidates and their likely religion for the 2007 Vidhan Sabha election in Uttar Pradesh, guessed with the [name2community](https://github.com/raphael-susewind/name2community) algorithm.
## Variables
name | description
--- | ---
id | unique code for each row, in case one ever needs it
ac_id_07 | ID code of the assembly segment assigned by the Election Commission (pre-delimitation)
candidate_*_name_07 | Name of the candidate running for party *
candidate_*_religion_07 | Likely religion of the candidate running for party * (note that this is just a "best bet" based on the social connotations of the candidate's name, not a fact-checked statement!)
candidate_*_religion_certainty_07 | Certainty index of likely religion of the candidate running for party * (a measure to eliminate false matches; see README of the [name2community](https://github.com/raphael-susewind/name2community) algorithm)
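Because the `*` in the variable names stands for a party, the concrete columns can be discovered from the table schema rather than hard-coded. The sketch below groups columns by party using the naming pattern from the table above. The in-memory table and the party code `bsp` are hypothetical placeholders for illustration; the actual party codes and column types in `upcandidates2007.sql` are not shown in this commit view and may differ.

```python
import re
import sqlite3


def party_columns(conn, table="upcandidates2007", suffix="07"):
    """Group candidate_*_{name,religion,religion_certainty}_<suffix> columns by party."""
    pattern = re.compile(
        rf"^candidate_(.+?)_(name|religion|religion_certainty)_{suffix}$"
    )
    parties = {}
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    for row in conn.execute(f"PRAGMA table_info({table})"):
        match = pattern.match(row[1])
        if match:
            party, field = match.groups()
            parties.setdefault(party, {})[field] = row[1]
    return parties


# Hypothetical stand-in schema; "bsp" is an assumed party code.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE upcandidates2007 (
        id INTEGER PRIMARY KEY,
        ac_id_07 INTEGER,
        candidate_bsp_name_07 TEXT,
        candidate_bsp_religion_07 TEXT,
        candidate_bsp_religion_certainty_07 REAL
    )
    """
)
columns_by_party = party_columns(conn)
```

For the other years, the same helper would be called with the matching table name and suffix, assuming those tables follow the same naming pattern.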
## Raw data
Raw data was originally downloaded from http://eci.nic.in/eci_main/StatisticalReports/candidatewise/AE_2007.xls on May 27, 2013 as an Excel file; it was manually converted into candidates-2007.csv, processed using guesscommunity.pl to add likely religion estimates, and then prepared for inclusion into the dataset using transform.pl.
## License
While the database in its entirety is subject to an [ODC Open Database License](http://opendatacommons.org/licenses/odbl/), as explained in the main [README](https://github.com/raphael-susewind/india-religion-politics/blob/master/README.md) and [LICENSE](https://github.com/raphael-susewind/india-religion-politics/blob/master/LICENSE.md) files, the content of this specific table is partly factual data and partly experimental, and as such only subject to a simple [ODC Database Contents License](http://opendatacommons.org/licenses/dbcl/). Code used for compilation is subject to a [CC-BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.