Biographical Data of Indian Politicians

Biographical data on candidates in national, state, and some local elections in India, along with scripts for retrieving the data. Data on members of the 15th Lok Sabha and on Rajya Sabha members as of June, 2014 were used to produce this short note: (No) Missing daughters of Indian Politicians. Data on all political candidates in national, state, and some local elections from myNeta were used to analyze spousal income and movable and immovable assets by politician gender. (Analysis.)

Data on Indian MPs from the 'National Portal of India'

Data on Indian MPs serving the Lok Sabha and the Rajya Sabha.

Get the Data

To get the data, download the scripts in the get_data/archive_india_gov folder to your computer. The scripts require Python 3.x and BeautifulSoup 4 to run. The package dependency is listed in get_data/archive_india_gov/requirements.txt. Once you have installed the dependencies, you can run the scripts.

  1. To download web pages containing the information, run


    The HTML files will be saved in ./rajyasabha and ./loksabha

  2. To parse and extract information from the HTML files, run

    python <dir>

    The script outputs a CSV file, saving it as dir-out.csv
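As a rough illustration of the parsing step, the sketch below reads every HTML file in a directory, pulls the text out of table cells, and writes the rows to `<dir>-out.csv`. It is not the repository's script: it assumes the member details sit in HTML table rows, it uses the standard library's `html.parser` in place of BeautifulSoup 4 to stay dependency-free, and the names `TableRowParser` and `parse_directory` are hypothetical.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_directory(html_dir):
    """Parse every *.html file in html_dir and write <html_dir>-out.csv."""
    html_dir = Path(html_dir)
    out_path = html_dir.parent / f"{html_dir.name}-out.csv"
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        for page in sorted(html_dir.glob("*.html")):
            parser = TableRowParser()
            parser.feed(page.read_text(encoding="utf-8", errors="replace"))
            for row in parser.rows:
                # Prefix each row with its source file (an illustrative choice).
                writer.writerow([page.name] + row)
    return out_path
```

Calling `parse_directory("./loksabha")` would write `loksabha-out.csv` next to the directory. The real pages may need more targeted selectors than "every table row".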


The data were scraped in June, 2014 and November, 2015.

Note: In 2015, the list of Rajya Sabha members on the site appears to differ slightly from the list posted on


Data on All Candidates from myNeta

Select biographical and electoral data on candidates in national, state, and some local elections from myNeta. The data were scraped in November, 2015.

Get the Data

There are three scripts. Why three? Information about gender is not provided on candidate pages and is integrated later. The three scripts are:

To begin using the scripts, install the requirements. Then download the scripts into a folder, and run scripts from the command line.

usage: [-h] [-o OUTPUT] [-n MAX_CONN] [-s FROM_STATE]
                    [-y FROM_YEAR] [-c FROM_CONSTITUENCY] [-t TYPE]

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output CSV file name
  -n MAX_CONN, --max-conn MAX_CONN
                        Max concurrent connections
  -s FROM_STATE, --from-state FROM_STATE
                        Start from a specific state
  -y FROM_YEAR, --from-year FROM_YEAR
                        Start from a specific election year
  -c FROM_CONSTITUENCY, --from-constituency FROM_CONSTITUENCY
                        Start from a specific constituency
  -t TYPE, --type TYPE  Type (all|state|nation|local)
  --no-header           Output without header at the first row
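For readers who want to see how such a command-line interface is typically declared, here is a minimal `argparse` sketch that accepts the options listed above. It is an illustration, not the script's actual source: the `--from-constituency` long form, the integer type for `--max-conn`, and the default of `all` for `--type` are assumptions, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser accepting the options shown in the usage text."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-o", "--output", help="Output CSV file name")
    # Assumption: the connection limit is an integer.
    parser.add_argument("-n", "--max-conn", type=int,
                        help="Max concurrent connections")
    parser.add_argument("-s", "--from-state",
                        help="Start from a specific state")
    parser.add_argument("-y", "--from-year",
                        help="Start from a specific election year")
    # Assumption: the long form mirrors --from-state and --from-year.
    parser.add_argument("-c", "--from-constituency",
                        help="Start from a specific constituency")
    # Assumption: "all" is the default type.
    parser.add_argument("-t", "--type", metavar="TYPE", default="all",
                        choices=["all", "state", "nation", "local"],
                        help="Type (all|state|nation|local)")
    parser.add_argument("--no-header", action="store_true",
                        help="Output without header at the first row")
    return parser
```

The resume options (`-s`, `-y`, `-c`) suggest the scraper can be restarted partway through a long crawl rather than starting over.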


python -o india-mps-all.csv

Get all women candidates


URL of all women candidates saved as: output-women.csv

To merge all candidates with gender, run:
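Conceptually, the merge marks a candidate as a woman when the candidate's URL appears in the list of women candidates. The sketch below shows that idea with the standard `csv` module. It is not the repository's merge script: the `url` column name, the `gender` output column, the `F`/`M` labels, and the function name `merge_gender` are all assumptions made for illustration.

```python
import csv


def merge_gender(candidates_csv, women_csv, merged_csv, url_column="url"):
    """Copy candidates_csv to merged_csv, adding a gender column."""
    # Collect the URLs of all women candidates into a set for fast lookup.
    with open(women_csv, newline="", encoding="utf-8") as fh:
        women_urls = {row[url_column].strip() for row in csv.DictReader(fh)}

    with open(candidates_csv, newline="", encoding="utf-8") as src, \
            open(merged_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames) + ["gender"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            # Assumption: anyone absent from the women's list is labelled "M".
            is_woman = row[url_column].strip() in women_urls
            row["gender"] = "F" if is_woman else "M"
            writer.writerow(row)
```

A call such as `merge_gender("india-mps-all.csv", "output-women.csv", "merged.csv")` would then produce one file with gender attached, where `merged.csv` is a placeholder name.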



Meta Data

  • Each row = politician per constituency per election year.
  • Columns
    • Politician Name, Constituency, State, Party, Election Year, Whether They Won or Not, Type: State/National/Local
    • Education, Age, Address, Self Profession, Spouse Profession
    • Income Tax Return: Self Total Income, Spouse Total Income
    • Self Movable Assets, Spouse Movable Assets:
      • cash: for self and spouse
      • jewellery: for self and spouse
      • totals: for self and spouse
    • Immovable Assets: Self Totals, Spouse Totals
    • Liabilities: Self Totals, Spouse Totals


Some data are missing for election years before 2011:

  • No Income Tax Return data, so no Self/Spouse Total Income
  • No column for Spouse in the Liabilities

Note: In a few elections, multiple candidates with the same name contest the same constituency, so name, constituency, and election year do not always identify a unique row. For instance, check here, here, here, here, here, and here.
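Because candidates can share a name within one constituency and election, it can be worth checking for repeated keys before joining or aggregating the data. The following is a minimal sketch of such a check. The function name `find_duplicate_keys` is hypothetical, and the column names passed to it must match the actual CSV header, which is not reproduced here.

```python
import csv
from collections import Counter


def find_duplicate_keys(csv_path, key_columns):
    """Return {key tuple: count} for keys that occur more than once."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        counts = Counter(
            tuple(row[col].strip() for col in key_columns)
            for row in csv.DictReader(fh)
        )
    return {key: n for key, n in counts.items() if n > 1}
```

Any key this returns needs an extra distinguishing field, such as party or the candidate page URL, before rows are treated as unique.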



Scripts, figures, and writing are released under CC BY 2.0.

