Added roll data for 2011 and 2014 and prepared to add 2012 and 2013

raphael-susewind committed Feb 10, 2016
1 parent 85a18d8 commit 95a06fc93a9c8a00c48bb589a435faf4838b099c
Showing with 1,272,472 additions and 462,430 deletions.
  1. +3 −3 ROADMAP.md
  2. +6 −0 combined.sql
  3. +18 −2 upid/README.md
  4. +729,630 −462,425 upid/upid.csv
  5. +177 −0 uprolls2011/LICENSE.md
  6. +41 −0 uprolls2011/README.md
  7. BIN uprolls2011/booths.sqlite.tgz
  8. +24 −0 uprolls2011/combine.pl
  9. +22 −0 uprolls2011/run-in-osc-add-firstpage-stuff/control.pl
  10. +142 −0 uprolls2011/run-in-osc-add-firstpage-stuff/pdf2list.pl
  11. +24 −0 uprolls2011/run-in-osc-add-firstpage-stuff/run.sh
  12. +68 −0 uprolls2011/run-in-osc-add-firstpage-stuff/subcontrol.pl
  13. +53 −0 uprolls2011/run-in-osc/addngram.pl
  14. +62 −0 uprolls2011/run-in-osc/charmap.py
  15. +7 −0 uprolls2011/run-in-osc/compile.pl
  16. +29 −0 uprolls2011/run-in-osc/control.pl
  17. +422 −0 uprolls2011/run-in-osc/createnamedb.pl
  18. +31 −0 uprolls2011/run-in-osc/createngram.pl
  19. +205 −0 uprolls2011/run-in-osc/csv2stats.pl
  20. +26 −0 uprolls2011/run-in-osc/downloadpdf.pl
  21. +598 −0 uprolls2011/run-in-osc/pdf2list.pl
  22. +25 −0 uprolls2011/run-in-osc/run.sh
  23. +91 −0 uprolls2011/run-in-osc/soundex.py
  24. +31 −0 uprolls2011/run-in-osc/subcontrol.pl
  25. +57 −0 uprolls2011/transform.pl
  26. +128,038 −0 uprolls2011/uprolls2011-a.sql
  27. +128,048 −0 uprolls2011/uprolls2011-b.sql
  28. +177 −0 uprolls2012/LICENSE.md
  29. +44 −0 uprolls2012/README.md
  30. +24 −0 uprolls2012/combine.pl
  31. +51 −0 uprolls2012/run-in-osc/addngram.pl
  32. +62 −0 uprolls2012/run-in-osc/charmap.py
  33. +29 −0 uprolls2012/run-in-osc/control.pl
  34. +422 −0 uprolls2012/run-in-osc/createnamedb.pl
  35. +218 −0 uprolls2012/run-in-osc/csv2stats.pl
  36. +26 −0 uprolls2012/run-in-osc/downloadpdf.pl
  37. +643 −0 uprolls2012/run-in-osc/pdf2list.pl
  38. +25 −0 uprolls2012/run-in-osc/run.sh
  39. +91 −0 uprolls2012/run-in-osc/soundex.py
  40. +34 −0 uprolls2012/run-in-osc/subcontrol.pl
  41. +26 −0 uprolls2012/transform.pl
  42. +177 −0 uprolls2013/LICENSE.md
  43. +44 −0 uprolls2013/README.md
  44. +24 −0 uprolls2013/combine.pl
  45. +51 −0 uprolls2013/run-in-osc/addngram.pl
  46. +62 −0 uprolls2013/run-in-osc/charmap.py
  47. +29 −0 uprolls2013/run-in-osc/control.pl
  48. +422 −0 uprolls2013/run-in-osc/createnamedb.pl
  49. +218 −0 uprolls2013/run-in-osc/csv2stats.pl
  50. +46 −0 uprolls2013/run-in-osc/downloadpdf.pl
  51. +675 −0 uprolls2013/run-in-osc/pdf2list.pl
  52. +25 −0 uprolls2013/run-in-osc/run.sh
  53. +91 −0 uprolls2013/run-in-osc/soundex.py
  54. +34 −0 uprolls2013/run-in-osc/subcontrol.pl
  55. +26 −0 uprolls2013/transform.pl
  56. +177 −0 uprolls2014/LICENSE.md
  57. +48 −0 uprolls2014/README.md
  58. BIN uprolls2014/booths.sqlite.tgz
  59. +24 −0 uprolls2014/combine.pl
  60. +53 −0 uprolls2014/run-in-osc-ngram/addngram.pl
  61. +31 −0 uprolls2014/run-in-osc-ngram/control.pl
  62. +308 −0 uprolls2014/run-in-osc-ngram/csv2stats.pl
  63. +40 −0 uprolls2014/run-in-osc-ngram/downloadpdf.pl
  64. +153 −0 uprolls2014/run-in-osc-ngram/frontpage.pl
  65. +25 −0 uprolls2014/run-in-osc-ngram/run.sh
  66. +20 −0 uprolls2014/run-in-osc-ngram/subcontrol.pl
  67. +53 −0 uprolls2014/run-in-osc/addngram.pl
  68. +62 −0 uprolls2014/run-in-osc/charmap.py
  69. +35 −0 uprolls2014/run-in-osc/control.pl
  70. +308 −0 uprolls2014/run-in-osc/csv2stats.pl
  71. +40 −0 uprolls2014/run-in-osc/downloadpdf.pl
  72. +153 −0 uprolls2014/run-in-osc/frontpage.pl
  73. +667 −0 uprolls2014/run-in-osc/pdf2list.pl
  74. +25 −0 uprolls2014/run-in-osc/run.sh
  75. +91 −0 uprolls2014/run-in-osc/soundex.py
  76. +90 −0 uprolls2014/run-in-osc/subcontrol.pl
  77. +46 −0 uprolls2014/transform.pl
  78. +139,175 −0 uprolls2014/uprolls2014-a.sql
  79. +139,174 −0 uprolls2014/uprolls2014-b.sql
ROADMAP.md
@@ -6,9 +6,8 @@ This roadmap is a reminder to myself what I aim to achieve over the next couple
For the first official release of this dataset, I aim to import and double-check all Uttar Pradesh data that was formerly hosted on my personal website. Specifically, these tasks are still open:
* Add candidate name, party and religion for 2007, 2009, 2012 and 2014 in separate candidate tables
* Add namematching data from electoral rolls for 2011, 2012, 2013, 2014 and - if available by then - 2015 and/or 2016 into separate tables
* Add booth details from first page of electoral rolls (pincode, postoffice, administrative units etc) for 2011-13 and 2014 (if the latter has run properly) into upid
* Add actual data from rolls in 2012 and 2013 and double check lastupdate for both
* Add frontpage stuff from rolls in 2014 (which might be slightly different from 2011-13)
* Add psname2partname mapping for 2014 into upid
* Add polling booth locality data for 2009, 2012 and 2014 into upid
* Add MODIS 500m rural/urban classification for booth localities to upid
@@ -21,6 +20,7 @@ Once UP is dealt with properly, I will expand to all-India level for the 2014 ge
* Add 2014 Lok Sabha booth level results for more states (to the extent that it is halfway easily accessible)
* Add namematching data from electoral rolls for 2014 across more states (about half of them done already)
* Add candidate name, party and likely religion for 2014 in separate candidate tables
* Add polling booth locality data for 2014 across India (from my GIS dataset, practically done)
* Add MODIS 500m rural/urban classification for booth localities
combined.sql
@@ -13,4 +13,10 @@
.read uploksabha2014/uploksabha2014-b.sql
.read uploksabha2014/uploksabha2014-c.sql
.read uploksabha2014/uploksabha2014-d.sql
.read uprolls2011/uprolls2011-a.sql
.read uprolls2011/uprolls2011-b.sql
.read uprolls2012/uprolls2012.sql
.read uprolls2013/uprolls2013.sql
.read uprolls2014/uprolls2014-a.sql
.read uprolls2014/uprolls2014-b.sql
.read upid/upid.sql
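The `.read` lines above are dot-commands of the `sqlite3` command-line shell, not SQL statements, so the file is meant to be fed to that shell. Where the shell is not available, the same load order could be reproduced from Python. The following is only a minimal sketch under assumptions: the function names and the database path passed in are hypothetical, and it assumes the referenced `.sql` files contain plain SQL with no dot-commands of their own.

```python
import sqlite3
from pathlib import Path


def read_targets(combined_text):
    """Return the file paths named by `.read` lines, in file order."""
    targets = []
    for line in combined_text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0] == ".read":
            targets.append(parts[1].strip())
    return targets


def build_database(combined_path, db_path):
    """Execute every script listed in combined.sql against one SQLite file.

    Assumption: paths in combined.sql are relative to its own directory.
    """
    base = Path(combined_path).parent
    conn = sqlite3.connect(db_path)
    try:
        for target in read_targets(Path(combined_path).read_text()):
            # executescript runs a whole multi-statement SQL file at once
            conn.executescript((base / target).read_text())
        conn.commit()
    finally:
        conn.close()
```

Because the scripts run in the order they are listed, `upid/upid.sql` still comes last, after all the tables it integrates have been loaded.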
upid/README.md
@@ -26,11 +26,27 @@ ac_name_12 | Name of that assembly segment, as assigned by the Election Commissi
ac_reserved_12 | Reservation status of that assembly segment, as assigned by the Election Commission in 2012
booth_id_12 | ID code of the polling booth, as assigned by the Election Commission in 2012
station_id_12 | ID code of the polling station, i.e. the physical unit housing this polling booth (note that this is a concept not used by the Election Commission, but introduced by me - basically all polling booths with subsequent ID codes and roughly similar names are considered to fall within one station)
station_name_12 | Name of the polling station, i.e. the physical unit housing this polling booth (cleaned up to be the same across all booths within this station)
ac_name_14 | Name of that assembly segment, as assigned by the Election Commission in 2014
ac_reserved_14 | Reservation status of that assembly segment, as assigned by the Election Commission in 2014
booth_id_14 | ID code of the polling booth, as assigned by the Election Commission in 2014
station_id_14 | ID code of the polling station, i.e. the physical unit housing this polling booth (note that this is a concept not used by the Election Commission, but introduced by me - basically all polling booths with subsequent ID codes and roughly similar names are considered to fall within one station)
station_name_14 | Name of the polling station, i.e. the physical unit housing this polling booth (cleaned up to be the same across all booths within this station)
district_11 | District into which this booth falls as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013
tehsil_11 | Tehsil into which this booth falls as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013
village_11 | Village into which this booth falls as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013 (only rural booths)
town_11 | Town into which this booth falls as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013 (only urban booths)
ward_11 | Ward into which this booth falls as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013 (only urban booths)
thana_11 | Police thana jurisdiction into which this booth falls as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013
circlecourt_11 | Court jurisdiction into which this booth falls as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013
station_name_11 | Station name of this booth as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013 (might be the same or not as the various station_name_* variables generated by comparing names across subsequent booth IDs)
station_address_11 | Address of the station into which this booth falls as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013
areas_11 | Areas (aka 'parts') which this booth covers as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013
pincode_11 | Pincode of this booth as identically listed on the cover sheet of the electoral rolls of 2011, 2012 and 2013
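The station_id_* rows describe a grouping rule: booths with subsequent ID codes and roughly similar names count as one station. One way this rule could work is sketched below. It is not the code used to build the table: the function name, the `difflib` similarity measure and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def assign_stations(booths, threshold=0.8):
    """Map booth IDs to station IDs.

    `booths` is a list of (booth_id, name) tuples for one assembly segment,
    sorted by booth_id. A booth joins the previous booth's station only if
    its ID directly follows and its name is similar enough (hypothetical
    threshold); otherwise it opens a new station.
    """
    stations = {}
    station_id = 0
    prev_id, prev_name = None, None
    for booth_id, name in booths:
        consecutive = prev_id is not None and booth_id == prev_id + 1
        similar = (
            prev_name is not None
            and SequenceMatcher(None, prev_name.lower(), name.lower()).ratio() >= threshold
        )
        if not (consecutive and similar):
            station_id += 1  # start a new station
        stations[booth_id] = station_id
        prev_id, prev_name = booth_id, name
    return stations
```

With made-up booth names, two rooms of the same school with IDs 1 and 2 would share a station, while a differently named booth, or one whose ID does not directly follow, would start a new one.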
## Processing
The original entries for this table stem from the processing scripts of the other tables. They are then compressed using calculate.pl, run on an otherwise complete dataset SQLite file. In other words: whenever a change or addition to the dataset affects ID matching and integration, this script has to be run afterwards and its output, upid.sql, incorporated into the table. If you are just downloading the whole dataset, though, it comes with the current version of upid.sql, which combined.sql runs automatically at the right place. So you should be fine...
## License