Using exploratory data analysis to understand the gender pay gap among data scientists in the US and India

Kaggle 2018 Survey Analysis - Exploring the Gender Pay Gap among Data Scientists in the US and in India


This project uses the Anaconda distribution of Python 3.7. The libraries used are pandas, Matplotlib, NumPy, and Seaborn. PixieDebugger was also used for testing during development.

Project Motivation

This project was completed as part of the Udacity Data Science Nanodegree. The goal is to select a data set, analyze it, and present a writeup of the analysis in a blog post.

Kaggle's second annual survey of platform users created a very rich data set on individual users' demographics and experience in the data science and data analysis space.

The goal of this analysis is to perform exploratory data analysis on the data set. Specifically, the country, age, and pay distributions of the survey takers are analyzed.
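The kind of distribution summary described above can be sketched with pandas. This is a minimal illustration, not the code from the notebook: the column names (`gender`, `country`, `age_group`, `pay_band`), the `share_by` helper, and the handful of rows are made up for the example and do not come from the survey file.

```python
import pandas as pd

# Toy rows invented for illustration only; the real survey file uses its own
# column headers and answer labels, which are not reproduced here.
responses = pd.DataFrame(
    {
        "gender": ["Female", "Male", "Male", "Female", "Male", "Female"],
        "country": ["India", "India", "US", "US", "US", "India"],
        "age_group": ["25-29", "22-24", "30-34", "25-29", "25-29", "22-24"],
        "pay_band": ["low", "low", "high", "mid", "high", "mid"],
    }
)


def share_by(df, group_col, value_col):
    """Return a group x value table where each row sums to 1 (hypothetical helper)."""
    return pd.crosstab(df[group_col], df[value_col], normalize="index")


# Number of respondents per country.
country_counts = responses["country"].value_counts()

# Age distribution within each gender.
age_share_by_gender = share_by(responses, "gender", "age_group")

# Pay distribution within each gender, restricted to one country.
india = responses[responses["country"] == "India"]
pay_share_by_gender = share_by(india, "gender", "pay_band")

print(country_counts)
print(age_share_by_gender)
print(pay_share_by_gender)
```

Normalizing each row to shares rather than raw counts makes groups of different sizes comparable, which matters when one gender is much more heavily represented among respondents than the other.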

File Descriptions

  • data folder: source data for the analysis, downloaded from Kaggle
  • images folder: charts for the blog post are saved in this folder
  • analysis.ipynb: main Jupyter notebook containing the full analysis workflow


A summary writeup of the results is published at


The source data is from Kaggle and can be found at
