## Kaggle is an open platform for many, free from any form of discrimination. For others it is a dream.

So many global problems are discussed every day that I often wonder which issues really deserve attention. The other day, I was sharing thoughts with a friend. Our conversation started with the importance of education and turned to the impact of inequality on personal development. This thought has continued to trouble me, so I decided to make it a major part of my exploratory data analysis and look for meaningful insights. This kernel is an attempt to motivate those who continuously struggle with different forms of discrimination, including inequalities of age, gender, economic status, and education.

# 1. Introduction

In this kernel, I will discuss some important general questions that are hurdles for many and that people usually ask when they start out in any field.
Since the data relates to computer science in general, and to data science and machine learning in particular, I will also mention useful resources for getting started in the field of AI.

The survey received 19,717 responses from 171 countries. A sample of 19,717 is very small for drawing conclusions about a large population. To support my analysis, I will therefore use other references in addition to the Kaggle survey results.

In [None]:
# titl_1: Nothing Can Stop You From Success
# titl_2: Beat by Compete
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
from bokeh.plotting import figure
pd.set_option('display.max_columns', None)  
pd.set_option('display.max_rows', None)  
PATH = '/kaggle/input/kaggle-survey-2019/'
data = pd.read_csv(f'{PATH}multiple_choice_responses.csv', low_memory = False)
data.columns = data.iloc[0]
data = data.drop([0], axis=0)
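# sns.set() applies seaborn's default darkgrid theme. The old "seaborn" style name
# is no longer shipped by newer Matplotlib releases, so plt.style.use("seaborn") is dropped.
sns.set(font_scale=1)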
gender = 'What is your gender? - Selected Choice'
country = 'In which country do you currently reside?'
salary = 'What is your current yearly compensation (approximate $USD)?'
age = 'What is your age (# years)?'
education = 'What is the highest level of formal education that you have attained or plan to attain within the next 2 years?'
job_title = 'Select the title most similar to your current role (or most recent title if retired): - Selected Choice'

order = ['$0-999', '1,000-1,999', '2,000-2,999', '3,000-3,999', '4,000-4,999', '5,000-7,499',
         '7,500-9,999', '10,000-14,999', '15,000-19,999', '20,000-24,999', '25,000-29,999',
         '30,000-39,999', '40,000-49,999', '50,000-59,999', '60,000-69,999', '70,000-79,999',
         '80,000-89,999', '90,000-99,999', '100,000-124,999', '125,000-149,999', '150,000-199,999',
         '200,000-249,999', '250,000-299,999', '300,000-500,000', '> $500,000']

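# Keep only the two largest gender groups for the gender comparisons below.
data_gender = data[data[gender].isin(['Male', 'Female'])]

# Merge the two small categories into a single "Others" group.
# Assigning the result back avoids an in-place edit on a column selection,
# which may not update `data` in newer pandas versions.
data[gender] = data[gender].replace({"Prefer to self-describe": "Others",
                                     "Prefer not to say": "Others"})
# Respondents who hold or plan to attain a Bachelor's, Master's, or Doctoral degree.
data_degree = data[data[education].isin(['Master’s degree', 'Bachelor’s degree', 'Doctoral degree'])]
# Degree holders from the four countries compared in Section 3.
data_1 = data_degree[data_degree[country].isin(['India', 'United States of America', 'China', 'Russia'])]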

data.head()

# 2. Gender inequality is reality, but...

> ## "If you educate a man, you educate an individual. But if you educate a woman, you educate a nation" - African proverb

In [None]:
# Plotting
plt.figure(figsize=(10,10))
ax = sns.countplot(x=gender, data=data,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0])
plt.xlabel('Gender', fontsize = 24);
plt.ylabel('Respondent Count', fontsize = 24);
plt.xticks(fontsize=18);
plt.yticks(fontsize=18);
tick_props = np.arange(0,1,0.10);
tick_names = ['{:0.2f}'.format(v) for v in tick_props];
plt.yticks(tick_props * data.shape[0], tick_names);

> ## Male responses are roughly five times as numerous as female responses
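
The ratio behind this observation can be checked directly from the response counts. The sketch below is a minimal example on a small made-up Series, not the survey data; `gender_ratio` is a hypothetical helper name, and with the real data one would presumably pass `data[gender]`.

```python
import pandas as pd


def gender_ratio(responses: pd.Series,
                 numerator: str = "Male",
                 denominator: str = "Female") -> float:
    """Return how many `numerator` responses there are per `denominator` response."""
    counts = responses.value_counts()
    return counts[numerator] / counts[denominator]


# Toy data for illustration only; these are NOT the survey counts.
toy = pd.Series(["Male"] * 10 + ["Female"] * 2 + ["Others"])
ratio = gender_ratio(toy)
print(f"{ratio:.1f} male responses per female response")
```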

In [None]:
plt.figure(figsize=(20,10))

ax = sns.countplot(x=salary, hue=gender, data=data_gender,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0],
                  order = order)

plt.xlabel('Gender to Salary', fontsize = 24);
plt.ylabel('Respondent Count', fontsize = 24);
plt.xticks(fontsize=18, rotation=90);
plt.yticks(fontsize=18);
plt.legend(fontsize = 22);

> ## The ratio of women to men looks low in every income category. But since fewer women participated in the survey, I will not treat this as a point of analysis. I would rather focus on the fact that women, too, are earning six-figure salaries.
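
One possible way to compare the two groups despite their unequal sizes is to look at shares within each gender rather than raw counts. The sketch below uses made-up data, not survey values, and the column names `gender` and `salary` are shortened placeholders for the long survey questions.

```python
import pandas as pd

# Toy data for illustration only; NOT the survey responses.
toy = pd.DataFrame({
    "gender": ["Male"] * 8 + ["Female"] * 2,
    "salary": ["$0-999"] * 4 + ["100,000-124,999"] * 4
              + ["$0-999", "100,000-124,999"],
})

# normalize="columns" divides each column by its own total, so every gender
# column sums to 1 and the groups can be compared despite unequal sizes.
share = pd.crosstab(toy["salary"], toy["gender"], normalize="columns")
print(share)
```

In this toy example the raw counts differ by a factor of four, yet the same fraction of each group falls in the six-figure band.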

In [None]:
plt.figure(figsize=(30,10))
ax = sns.countplot(x=job_title, hue=gender, data=data_gender,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0])

plt.xlabel('Job Title', fontsize = 24);
plt.ylabel('Respondent Count', fontsize = 24);
plt.xticks(fontsize=18, rotation=90);
plt.yticks(fontsize=18);
plt.legend(fontsize = 22);

> ## It is evident that women are participating in all the professions.

The low participation of women and girls has many causes. According to the United Nations Sustainable Development Goals report (Goal 5), 750 million women and girls were married before the age of 18 [[4]](https://www.un.org/sustainabledevelopment/gender-equality/). In 18 countries, husbands can legally prevent their wives from working [[4]](https://www.un.org/sustainabledevelopment/gender-equality/); in 39 countries, daughters and sons do not have equal inheritance rights; and 49 countries lack laws protecting women from domestic violence [[4]](https://www.un.org/sustainabledevelopment/gender-equality/). The suffering caused by such cultural norms ultimately creates frustration and loss of hope.

**So, gender inequality is a reality of this age, but only where there are gender-biased laws.**

Data from the American Psychological Association reveals that women outnumber men in psychology by 75% [[8]](https://www.apa.org/monitor/2017/07-08/women-psychology). Alexa, Cortana, even the automated announcements on public transport all have one thing in common: a female voice or female avatar. Two popular live Kaggle channels are hosted by a woman, Rachael Tatman [[7]](http://www.rctatman.com/) (I really like her paper reading session every Wednesday, and you should attend too). Platforms such as the Grace Hopper Celebration [[5]](https://ghc.anitab.org/) and Google's Women Techmakers [[6]](https://www.womentechmakers.com/) continue to work to empower women in computing.


##### To eradicate the gender gap, we need people like Rachael and Fei-Fei Li from different parts of the world.

# 3. Does my economy define my fate?

> ## "If You Are Born Poor It’s Not Your Mistake, But If You Die Poor It’s Your Mistake" - Bill Gates

In [None]:
plt.figure(figsize=(30,30))
ax = sns.countplot(y=country, data=data,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0],
                   order = data[country].value_counts().index)
plt.ylabel('Country of Residence', fontsize = 24);
plt.xlabel('Respondent Count', fontsize = 24);
plt.xticks(fontsize=18);
plt.yticks(fontsize=18);

> ## India leads all other countries in the number of survey respondents

In [None]:
plt.figure(figsize=(30,10));
ax = sns.countplot(x=education, hue=country, data=data_1,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0]);
plt.xlabel('Country to Degree', fontsize = 24);
plt.ylabel('Respondent Count', fontsize = 24);
plt.legend(fontsize = 22);
plt.xticks(fontsize=18);
plt.yticks(fontsize=18);

> ## Most of the Bachelor's and Master's degree holders who participated in the survey live in India, whereas most PhD holders live in the USA.
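
The same comparison can be read off a table instead of a chart. The following is a minimal sketch on made-up rows, not the survey data, with shortened placeholder column names; on the real data the inputs would presumably be `data_1[education]` and `data_1[country]`.

```python
import pandas as pd

# Toy data for illustration only; NOT the survey responses.
toy = pd.DataFrame({
    "country": ["India", "India", "India", "India",
                "United States of America", "United States of America",
                "United States of America"],
    "education": ["Bachelor's degree", "Bachelor's degree",
                  "Master's degree", "Master's degree",
                  "Doctoral degree", "Doctoral degree", "Master's degree"],
})

# Rows are degrees, columns are countries, cells are respondent counts.
counts = pd.crosstab(toy["education"], toy["country"])

# idxmax(axis=1) returns, for each degree, the country with the largest count.
leader = counts.idxmax(axis=1)
print(leader)
```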

For the current 2020 fiscal year, the World Bank classifies countries into four classes based on economic condition [[1]](https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups):

- Low-income economies
- Lower-middle-income economies
- Upper-middle-income economies
- High-income economies

India, classified as a lower-middle-income country [[1]](https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups), showed large participation in the survey. In addition, most of the degree holders are from India. Poor economic conditions are indeed a hurdle to a country's development, but they cannot stop residents from learning and exploring opportunities.

##### Owing to the continued involvement of Indians in the IT sector, many big tech companies have opened offices in India. These include Google, Microsoft, and Amazon (Amazon's largest office worldwide is in India).

# 4. Age is just a number

> ## "Age is an issue of mind over matter. If you don't mind, it doesn't matter" - Mark Twain

In [None]:
# Plotting
plt.figure(figsize=(10,10))

ax = sns.countplot(x=age, data=data,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0],
                  order = sorted(data[age].dropna().unique()))
plt.xlabel('Age Range', fontsize = 24);
plt.ylabel('Respondent Count', fontsize = 24);
plt.xticks(fontsize=18, rotation = 90);
plt.yticks(fontsize=18);

> ## A large number of responses came from people aged 18 to 29
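
To put a number on "a large number", the share of respondents in the younger age bands can be computed directly. This is a minimal sketch on made-up values, not the survey data; `share_in_bands` is a hypothetical helper, and the band labels in `YOUNG_BANDS` are an assumption about how the age column is coded.

```python
import pandas as pd

# Age bands assumed to cover ages 18 to 29; adjust to the labels in your data.
YOUNG_BANDS = ["18-21", "22-24", "25-29"]


def share_in_bands(ages: pd.Series, bands: list) -> float:
    """Fraction of non-missing responses whose age band is listed in `bands`."""
    ages = ages.dropna()
    return ages.isin(bands).mean()


# Toy data for illustration only; NOT the survey responses.
toy = pd.Series(["18-21", "22-24", "25-29", "30-34", "40-44", None])
print(f"{share_in_bands(toy, YOUNG_BANDS):.0%} of toy responses are aged 18 to 29")
```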

People between the ages of 18 and 29 are probably more eager to learn new skills. This likely reflects the common expectation that specific life goals should be achieved by a certain age, such as graduating, landing a six-figure job, owning a house, and settling down. Those in their late 40s have probably already completed the checklist.

In [None]:
plt.figure(figsize=(70,20))
ax = sns.countplot(x=salary, hue=age, data=data,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                  order = order,
                  hue_order = sorted(data[age].dropna().unique()))
plt.xlabel('Age to Salary', fontsize = 30);
plt.ylabel('Respondent Count', fontsize = 30);
plt.xticks(fontsize=30, rotation = 90);
plt.yticks(fontsize=30);
plt.legend(fontsize = 30);

> ## For all salary ranges greater than or equal to $50,000, people in the 30-40 age range are the most common.
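
Because the salary answers are text bands rather than numbers, a threshold such as $50,000 has to be applied to a parsed lower bound. The sketch below shows one way to do that on made-up rows, not the survey data; `band_lower_bound` is a hypothetical helper, and missing salaries would need to be dropped before calling it.

```python
import re

import pandas as pd


def band_lower_bound(band: str) -> int:
    """Return the lower bound of a salary band label such as '50,000-59,999'."""
    # The first run of digits and commas is the lower bound:
    # '$0-999' -> '0', '> $500,000' -> '500,000'.
    match = re.search(r"\d[\d,]*", band)
    return int(match.group().replace(",", ""))


# Toy data for illustration only; NOT the survey responses.
toy = pd.DataFrame({
    "salary": ["$0-999", "50,000-59,999", "60,000-69,999",
               "100,000-124,999", "> $500,000"],
    "age": ["18-21", "30-34", "30-34", "35-39", "30-34"],
})

# Keep respondents whose band starts at 50,000 or more, then count age bands.
high_earners = toy[toy["salary"].map(band_lower_bound) >= 50_000]
print(high_earners["age"].value_counts())
```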

If you start exploring opportunities between the ages of 18 and 30, you are more likely to earn a good salary by the time you are 30 to 40. But older people are also earning good money. If you never stop learning, you will be ahead of others with your experience and your knowledge of new technology.

##### Bill Gates is in his early 60s, and the secret of his success is continuous learning [[2]](https://www.youtube.com/watch?v=eTFy8RnUkoU). Mark Zuckerberg started Facebook at the age of 19 [[3]](https://en.wikipedia.org/wiki/History_of_Facebook). Many Kagglers are young and have won competitions. It is never too late or too early to follow your passion; age is just a number.

# 5. You can survive if you hold a Bachelor's degree

In [None]:
plt.figure(figsize=(20,10))

ax = sns.countplot(x=education,data=data,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0])
plt.xlabel('Education', fontsize = 24);
plt.ylabel('Respondent Count', fontsize = 24);
plt.xticks(fontsize=18, rotation = 90);
plt.yticks(fontsize=18);

> ## A significant number of respondents hold or are pursuing a Master's degree.

In [None]:
plt.figure(figsize=(30,10))
ax = sns.countplot(x=job_title, hue=education, data=data,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0])
plt.xlabel('Education to Profession', fontsize = 24);
plt.ylabel('Respondent Count', fontsize = 24);
plt.xticks(fontsize=18, rotation = 90);
plt.yticks(fontsize=18);
plt.legend(fontsize = 18);

> ## In every profession, the most common education levels are a Bachelor's, Master's, or Doctoral degree

In [None]:
plt.figure(figsize=(30,10))
ax = sns.countplot(x=salary, hue=education, data=data_degree,
                   linewidth=3,
                   edgecolor=sns.color_palette("dark", 1),
                   color = sns.color_palette()[0], order=order)

plt.xlabel('Education to Salary', fontsize = 24);
plt.ylabel('Respondent Count', fontsize = 24);
plt.xticks(fontsize=18, rotation = 90);
plt.yticks(fontsize=18);
plt.legend(fontsize = 22);

> ## People with a Master's degree earn about as much as those holding a Doctoral degree. Bachelor's degree holders also reach the same salary ranges as Master's and Doctoral degree holders, but in smaller numbers.
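
A compact way to compare salary levels across degrees is to treat the bands as ordered categories and take the median band per group. This is only a sketch on made-up rows, not the survey data; `toy_order` is a shortened stand-in for the notebook's `order` list, and the education labels are placeholders.

```python
import pandas as pd

# Shortened band order for the toy example; the full `order` list would be used in practice.
toy_order = ["$0-999", "50,000-59,999", "100,000-124,999"]

# Toy data for illustration only; NOT the survey responses.
toy = pd.DataFrame({
    "education": ["Bachelor", "Bachelor", "Bachelor",
                  "Master", "Master", "Master"],
    "salary": ["$0-999", "$0-999", "50,000-59,999",
               "50,000-59,999", "100,000-124,999", "100,000-124,999"],
})

# An ordered categorical maps each band to its position in toy_order (0, 1, 2).
# Values not in toy_order, including missing ones, would get the code -1.
toy["salary_rank"] = pd.Categorical(
    toy["salary"], categories=toy_order, ordered=True
).codes

# Median rank per education level, translated back to a band label.
median_rank = toy.groupby("education")["salary_rank"].median().round().astype(int)
median_band = median_rank.map(lambda i: toy_order[i])
print(median_band)
```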

# 6. References

[1] [World Bank Country Classification based on economic condition](https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups)

[2] [How Bill Gates reads books](https://www.youtube.com/watch?v=eTFy8RnUkoU)

[3] [History of Facebook](https://en.wikipedia.org/wiki/History_of_Facebook)

[4] [Achieve gender equality and empower all women and girls](https://www.un.org/sustainabledevelopment/gender-equality/)

[5] [Grace Hopper Celebration](https://ghc.anitab.org/)

[6] [Google's Women Techmakers](https://www.womentechmakers.com/)

[7] [Rachael Tatman](http://www.rctatman.com/)

[8] [American Psychological Association](https://www.apa.org/monitor/2017/07-08/women-psychology)