
# Hello, [Pandas](https://pandas.pydata.org/docs/reference/index.html)!

## What is pandas?

- (in short) a data analysis library
- (according to the official website) "an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language."

## Download data

We will be using the 2021 results of the [Stack Overflow Annual Developer Survey](https://insights.stackoverflow.com/survey).

In [None]:
!wget https://info.stackoverflowsolutions.com/rs/719-EMH-566/images/stack-overflow-developer-survey-2021.zip -O tmp.zip
!unzip tmp.zip -d ./data
!rm tmp.zip

## Imports and Constants

In [2]:
import pandas as pd
import numpy as np

In [3]:
DATA_PATH = '/content/data/survey_results_public.csv'
SCHEMA_PATH = '/content/data/survey_results_schema.csv'

## Load the data into a DataFrame

A **DataFrame** is a tabular representation of data, i.e. rows and columns.
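To make the "rows and columns" idea concrete before loading the real file, here is a minimal sketch that builds a tiny DataFrame by hand. The name `toy_df` and its values are made up for illustration and are not taken from the survey.

```python
import pandas as pd

# Hypothetical toy data: each dict key becomes a column,
# and each position in the lists becomes a row.
toy_df = pd.DataFrame({
    "Country": ["Slovakia", "Netherlands", "Austria"],
    "YearsCode": [3, 7, 12],
})

print(toy_df)
print(list(toy_df.columns))  # column labels
print(list(toy_df.index))    # default row labels start at 0
```

`pd.read_csv` produces the same kind of object, only with the rows and columns read from a file.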

In [4]:
df = pd.read_csv(DATA_PATH)
df

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,...,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
0,1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,...,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
1,2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,...,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
2,3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,...,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
3,4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,...,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
4,5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17,...,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83434,83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,...,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83435,83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,...,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83436,83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,...,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83437,83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,...,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0


In [5]:
# (num_rows, num_cols)
df.shape

(83439, 48)
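Since `.shape` is a plain tuple, it can be unpacked directly. The sketch below uses a small made-up frame (`toy_df` is a hypothetical name) rather than the survey data.

```python
import pandas as pd

# Hypothetical toy frame with 3 rows and 2 columns.
toy_df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})

# .shape is a (num_rows, num_cols) tuple, so it unpacks like any other tuple.
num_rows, num_cols = toy_df.shape
print(num_rows, num_cols)

# len() on a DataFrame counts rows; len() on .columns counts columns.
print(len(toy_df), len(toy_df.columns))
```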

In [6]:
# `.info()` lists each column's name, dtype, and non-null count, so you can spot columns with missing entries.
# The dtype `object` usually means the column holds strings.
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 83439 entries, 0 to 83438
Data columns (total 48 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   ResponseId                    83439 non-null  int64  
 1   MainBranch                    83439 non-null  object 
 2   Employment                    83323 non-null  object 
 3   Country                       83439 non-null  object 
 4   US_State                      14920 non-null  object 
 5   UK_Country                    4418 non-null   object 
 6   EdLevel                       83126 non-null  object 
 7   Age1stCode                    83243 non-null  object 
 8   LearnCode                     82963 non-null  object 
 9   YearsCode                     81641 non-null  object 
 10  YearsCodePro                  61216 non-null  object 
 11  DevType                       66484 non-null  object 
 12  OrgSize                       60726 non-null  object 
 13  C
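The non-null counts reported by `.info()` can also be obtained as data you can work with. The following is a small sketch on a made-up frame; `toy_df` and its values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with some missing entries.
toy_df = pd.DataFrame({
    "ResponseId": [1, 2, 3, 4],
    "Country": ["Slovakia", "Netherlands", None, "Austria"],
    "ConvertedCompYearly": [62268.0, np.nan, np.nan, np.nan],
})

# dtype of every column, as a Series indexed by column name.
print(toy_df.dtypes)

# Non-null entries per column (the "Non-Null Count" column of .info()).
non_null_per_column = toy_df.count()
print(non_null_per_column)

# Missing entries per column.
missing_per_column = toy_df.isna().sum()
print(missing_per_column)
```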

In [7]:
pd.set_option('display.max_columns', df.shape[1])
# pd.set_option('display.max_rows', df.shape[0]) # In case you want to see all rows
df

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
0,1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
1,2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
2,3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
3,4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
4,5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17,10,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83434,83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83435,83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, full-stack","Just me - I am a freelancer, sole proprietor, ...",XOF\tWest African CFA franc,200000.0,Monthly,,,Firebase;MariaDB;MySQL;PostgreSQL;Redis;SQLite,Firebase;MariaDB;MongoDB;MySQL;PostgreSQL;Redi...,,,Django;jQuery;Laravel;React.js;Ruby on Rails,Django;Express;jQuery;Laravel;React.js;Ruby on...,Flutter;Qt,,Git;Unity 3D;Unreal Engine,Docker;Git;Kubernetes,Android Studio;Eclipse;Emacs;IntelliJ;NetBeans...,Emacs;IntelliJ;PHPStorm;PyCharm;RStudio;Sublim...,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83436,83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83437,83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,3,"Developer, back-end",20 to 99 employees,CAD\tCanadian dollar,90000.0,Monthly,Bash/Shell;JavaScript;Node.js;Python,Go;Rust,Cassandra;Elasticsearch;MongoDB;PostgreSQL;Redis,,Heroku,AWS;DigitalOcean,Django;Express;Flask;React.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,NumPy;Pandas;TensorFlow;Torch/PyTorch,Ansible;Docker;Git;Terraform,Kubernetes;Terraform,PyCharm;Sublime Text,,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per month or weekly,Yes,Less than once per month or monthly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0
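`pd.set_option` changes the display setting for the rest of the session. If you only want the wider view temporarily, `pd.option_context` restores the previous value when the `with` block ends. A minimal sketch on a made-up wide frame (`wide_df` is a hypothetical name):

```python
import pandas as pd

# Hypothetical frame with a single row and 30 columns.
wide_df = pd.DataFrame(
    [list(range(30))],
    columns=[f"col_{i}" for i in range(30)],
)

setting_before = pd.get_option("display.max_columns")

# Inside the block, None means "no limit on displayed columns".
with pd.option_context("display.max_columns", None):
    setting_inside = pd.get_option("display.max_columns")
    full_repr = repr(wide_df)

# Outside the block, the previous value is back in effect.
setting_after = pd.get_option("display.max_columns")

print(full_repr)
print(setting_before, setting_inside, setting_after)
```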


In [8]:
# To see which question each of these columns refers to, we need to load the schema.
schema_df = pd.read_csv(SCHEMA_PATH)
schema_df

Unnamed: 0,qid,qname,question,force_resp,type,selector
0,QID16,S0,"<div><span style=""font-size:19px;""><strong>Hel...",False,DB,TB
1,QID12,MetaInfo,Browser Meta Info,False,Meta,Browser
2,QID1,S1,"<span style=""font-size:22px; font-family: aria...",False,DB,TB
3,QID2,MainBranch,Which of the following options best describes ...,True,MC,SAVR
4,QID24,Employment,Which of the following best describes your cur...,False,MC,MAVR
5,QID6,Country,"Where do you live? <span style=""font-weight: b...",True,MC,DL
6,QID7,US_State,<p>In which state or territory of the USA do y...,False,MC,DL
7,QID9,UK_Country,In which part of the United Kingdom do you liv...,False,MC,DL
8,QID190,S2,"<span style=""font-size:22px; font-family: aria...",False,DB,TB
9,QID25,EdLevel,Which of the following best describes the high...,False,MC,SAVR
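One convenient way to use the schema is to index it by `qname`, so that a question can be looked up by column name. The sketch below uses a made-up two-row schema with placeholder question texts; only the column names `qname` and `question` come from the output above.

```python
import pandas as pd

# Hypothetical miniature schema; the question texts are placeholders.
toy_schema = pd.DataFrame({
    "qname": ["MainBranch", "Country"],
    "question": ["Placeholder text for MainBranch", "Placeholder text for Country"],
})

# A Series mapping each column name to its question text.
questions = toy_schema.set_index("qname")["question"]

print(questions["Country"])
```

With the real data, the same two lines applied to `schema_df` would give a lookup such as `questions['Employment']`.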


In [9]:
# See the first 5 rows
df.head()

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
0,1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
1,2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7.0,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
2,3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
3,4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
4,5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17.0,10.0,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,


In [10]:
# See the first n rows (here n = 10)
df.head(10)

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
0,1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
1,2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7.0,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
2,3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
3,4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
4,5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17.0,10.0,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
5,6,I am a student who is learning to code,"Student, part-time",United States of America,Georgia,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,C;C#;C++;HTML/CSS;Java;JavaScript;Node.js;Powe...,C#;C++;Go;HTML/CSS;Java;JavaScript;Node.js;Obj...,MySQL;PostgreSQL;SQLite,Elasticsearch;Firebase;IBM DB2;MariaDB;Microso...,,,Express;Flask;jQuery;React.js,Express;Flask;jQuery;React.js,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;Qt;React Native;TensorFlow;...,Git,Docker;Git;Unity 3D;Unreal Engine,IPython/Jupyter;Notepad++;PyCharm;Sublime Text...,Android Studio;IPython/Jupyter;Vim;Visual Stud...,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Prefer not to say,No,Straight / Heterosexual,Prefer not to say,None of the above,I have a concentration and/or memory disorder ...,Too long,Neither easy nor difficult,
6,7,I code primarily as a hobby,I prefer not to say,United States of America,New Hampshire,,"Secondary school (e.g. American high school, G...",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",3.0,,,,,,,HTML/CSS;JavaScript,HTML/CSS;JavaScript;PHP,,,,,jQuery,jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,Prefer not to say,Prefer not to say,No,,I don't know,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,
7,8,I am a student who is learning to code,"Student, full-time",Malaysia,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,School;Online Courses or Certification,4.0,,,,,,,HTML/CSS;JavaScript;PHP;Ruby;SQL;TypeScript,Ruby,MySQL;PostgreSQL;SQLite,PostgreSQL,Heroku;Microsoft Azure,Heroku,Angular.js;jQuery;Ruby on Rails,Ruby on Rails,,,,,Sublime Text;Visual Studio Code,Sublime Text;Visual Studio Code,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"No, not at all",Yes,18-24 years old,Woman,No,Straight / Heterosexual,White or of European descent;Multiracial;South...,None of the above,None of the above,Appropriate in length,Easy,
8,9,I am a developer by profession,Employed part-time,India,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,Coding Bootcamp,6.0,4.0,"Developer, front-end","10,000 or more employees",INR\tIndian rupee,,Monthly,HTML/CSS;JavaScript,HTML/CSS;JavaScript,PostgreSQL,PostgreSQL,AWS,,Django;FastAPI,Django;FastAPI,,,Docker;Git,Docker;Git;Kubernetes,PyCharm;Sublime Text,PyCharm;Sublime Text,Windows,Visit Stack Overflow;Google it;Panic,Stack Overflow;Stack Exchange,A few times per week,Yes,Less than once per month or monthly,"Yes, definitely",No,25-34 years old,Man,No,,South Asian,,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,
9,10,I am a developer by profession,Employed full-time,Sweden,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,School,7.0,4.0,Data scientist or machine learning specialist,10 to 19 employees,SEK\tSwedish krona,42000.0,Monthly,C++;Python,Haskell;Python,PostgreSQL,,,,,,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;TensorFlow;Torch/PyTorch,Git,Git,IPython/Jupyter;Vim;Visual Studio Code,Emacs;IPython/Jupyter;Vim;Visual Studio Code,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Daily or almost daily,"Yes, somewhat",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,51552.0


In [11]:
# See the last 5 rows
df.tail()

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
83434,83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83435,83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, full-stack","Just me - I am a freelancer, sole proprietor, ...",XOF\tWest African CFA franc,200000.0,Monthly,,,Firebase;MariaDB;MySQL;PostgreSQL;Redis;SQLite,Firebase;MariaDB;MongoDB;MySQL;PostgreSQL;Redi...,,,Django;jQuery;Laravel;React.js;Ruby on Rails,Django;Express;jQuery;Laravel;React.js;Ruby on...,Flutter;Qt,,Git;Unity 3D;Unreal Engine,Docker;Git;Kubernetes,Android Studio;Eclipse;Emacs;IntelliJ;NetBeans...,Emacs;IntelliJ;PHPStorm;PyCharm;RStudio;Sublim...,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83436,83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83437,83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,3,"Developer, back-end",20 to 99 employees,CAD\tCanadian dollar,90000.0,Monthly,Bash/Shell;JavaScript;Node.js;Python,Go;Rust,Cassandra;Elasticsearch;MongoDB;PostgreSQL;Redis,,Heroku,AWS;DigitalOcean,Django;Express;Flask;React.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,NumPy;Pandas;TensorFlow;Torch/PyTorch,Ansible;Docker;Git;Terraform,Kubernetes;Terraform,PyCharm;Sublime Text,,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per month or weekly,Yes,Less than once per month or monthly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0
83438,83439,I am a developer by profession,Employed full-time,Brazil,,,"Professional degree (JD, MD, etc.)",11 - 17 years,School,14,4,"Developer, front-end;Developer, full-stack;Dev...",I don’t know,BRL\tBrazilian real,7700.0,Monthly,Delphi;Elixir;HTML/CSS;Java;JavaScript,Elixir;HTML/CSS;Java;JavaScript;Node.js;PHP;SQ...,Oracle;PostgreSQL,Elasticsearch;MongoDB;MySQL;Oracle;PostgreSQL;...,Microsoft Azure,AWS,Angular;Spring,Express;Laravel;Spring;Symfony,,,Docker;Git,Docker;Git;Kubernetes,IntelliJ;Visual Studio Code,IntelliJ;PHPStorm;Visual Studio Code,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange;Stack Overflow f...,A few times per week,Yes,A few times per week,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Hispanic or Latino/a/x,None of the above,None of the above,Appropriate in length,Easy,21168.0


In [12]:
# See the last 10 rows
df.tail(10)

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
83429,83430,I code primarily as a hobby,"Not employed, but looking for work",United States of America,Washington,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc)",6,,Other (please specify):;Student,,,,,HTML/CSS;PHP;PowerShell;Python;SQL;VBA,C#;Go;HTML/CSS;Java;JavaScript;PHP;Python;SQL;VBA,MongoDB;MySQL;PostgreSQL,MariaDB;Microsoft SQL Server;MongoDB;MySQL;Pos...,Heroku,AWS;Heroku;IBM Cloud or Watson;Microsoft Azure...,Django;Flask;jQuery,Angular.js;ASP.NET;ASP.NET Core ;Django;FastAP...,NumPy;Pandas,.NET Framework;.NET Core / .NET 5;NumPy;Pandas,Git,Docker;Git;Xamarin,Atom;Notepad++;Sublime Text;Visual Studio Code,Atom;Notepad++;Sublime Text;Visual Studio Code,Windows,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,A few times per month or weekly,Not sure/can't remember,,"Yes, somewhat",No,25-34 years old,"Man;Or, in your own words:",Yes,Queer,White or of European descent,I am unable to / find it difficult to walk or ...,I have an anxiety disorder,Appropriate in length,Neither easy nor difficult,
83430,83431,I am a developer by profession,Employed full-time,United States of America,Illinois,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",23,21,"Developer, front-end;Developer, full-stack;Dev...",10 to 19 employees,USD\tUnited States dollar,125000.0,Yearly,APL;Clojure;LISP;Python;Ruby;SQL;TypeScript,APL;Clojure;Haskell;LISP;R,MongoDB;MySQL;PostgreSQL;Redis,PostgreSQL,AWS;DigitalOcean;Google Cloud Platform;Heroku,AWS;DigitalOcean;Google Cloud Platform,Django;React.js;Ruby on Rails,,,,Docker;Git,Docker;Git;Kubernetes;Unity 3D,Emacs;Vim,Emacs,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not really",No,45-54 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,125000.0
83431,83432,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Pakistan,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",9,4,"Developer, mobile;Developer, desktop or enterp...",2 to 9 employees,PKR\tPakistani rupee,150000.0,Monthly,C#;Dart;HTML/CSS;Java;JavaScript;Kotlin;Node.j...,C#;Dart;HTML/CSS;JavaScript;Kotlin;Node.js;Pyt...,Firebase;MySQL;SQLite,DynamoDB;Firebase;MongoDB;MySQL;SQLite,Google Cloud Platform,AWS;Google Cloud Platform,Flask;jQuery,Angular;Django;Flask;jQuery;Laravel,Flutter,Flutter;Hadoop;NumPy;TensorFlow;Torch/PyTorch,Git,Docker;Git,Android Studio;IntelliJ;IPython/Jupyter;Notepa...,Android Studio;IntelliJ;IPython/Jupyter;PyChar...,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,Southeast Asian,None of the above,None of the above,Appropriate in length,Easy,11676.0
83432,83433,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,School,5,Less than 1 year,"Developer, back-end","10,000 or more employees",CAD\tCanadian dollar,106000.0,Yearly,Ruby,Java;Ruby;TypeScript,MySQL;PostgreSQL,,Google Cloud Platform;Heroku,,Flask;React.js;Ruby on Rails;Vue.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,,Docker;Git;Kubernetes;Yarn,,Atom;IPython/Jupyter;Vim;Visual Studio Code,,MacOS,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,No,,"No, not really",No,18-24 years old,Woman,No,Straight / Heterosexual,East Asian,None of the above,None of the above,Appropriate in length,Easy,80169.0
83433,83434,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Brazil,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,Online Forum;Online Courses or Certification;O...,15,11,"Developer, mobile;Developer, front-end;Develop...","1,000 to 4,999 employees",BRL\tBrazilian real,80000.0,Yearly,Java;JavaScript;Kotlin;Objective-C;TypeScript,Kotlin,Firebase;MongoDB;MySQL;SQLite,,,,React.js,,React Native,Flutter,Docker;Git;Yarn,Docker;Git,Android Studio;Visual Studio Code;Xcode,Android Studio;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,A few times per month or weekly,"Yes, somewhat",No,25-34 years old,Man,No,Straight / Heterosexual,Hispanic or Latino/a/x,None of the above,I have an anxiety disorder,Appropriate in length,Easy,18326.0
83434,83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83435,83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, full-stack","Just me - I am a freelancer, sole proprietor, ...",XOF\tWest African CFA franc,200000.0,Monthly,,,Firebase;MariaDB;MySQL;PostgreSQL;Redis;SQLite,Firebase;MariaDB;MongoDB;MySQL;PostgreSQL;Redi...,,,Django;jQuery;Laravel;React.js;Ruby on Rails,Django;Express;jQuery;Laravel;React.js;Ruby on...,Flutter;Qt,,Git;Unity 3D;Unreal Engine,Docker;Git;Kubernetes,Android Studio;Eclipse;Emacs;IntelliJ;NetBeans...,Emacs;IntelliJ;PHPStorm;PyCharm;RStudio;Sublim...,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83436,83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83437,83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,3,"Developer, back-end",20 to 99 employees,CAD\tCanadian dollar,90000.0,Monthly,Bash/Shell;JavaScript;Node.js;Python,Go;Rust,Cassandra;Elasticsearch;MongoDB;PostgreSQL;Redis,,Heroku,AWS;DigitalOcean,Django;Express;Flask;React.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,NumPy;Pandas;TensorFlow;Torch/PyTorch,Ansible;Docker;Git;Terraform,Kubernetes;Terraform,PyCharm;Sublime Text,,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per month or weekly,Yes,Less than once per month or monthly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0
83438,83439,I am a developer by profession,Employed full-time,Brazil,,,"Professional degree (JD, MD, etc.)",11 - 17 years,School,14,4,"Developer, front-end;Developer, full-stack;Dev...",I don’t know,BRL\tBrazilian real,7700.0,Monthly,Delphi;Elixir;HTML/CSS;Java;JavaScript,Elixir;HTML/CSS;Java;JavaScript;Node.js;PHP;SQ...,Oracle;PostgreSQL,Elasticsearch;MongoDB;MySQL;Oracle;PostgreSQL;...,Microsoft Azure,AWS,Angular;Spring,Express;Laravel;Spring;Symfony,,,Docker;Git,Docker;Git;Kubernetes,IntelliJ;Visual Studio Code,IntelliJ;PHPStorm;Visual Studio Code,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange;Stack Overflow f...,A few times per week,Yes,A few times per week,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Hispanic or Latino/a/x,None of the above,None of the above,Appropriate in length,Easy,21168.0


## Selecting Rows and Columns

If we wanted to store information about a person, we could use the following format.

In [13]:
person = {
    'first': 'SimoFirst',
    'last': 'SimoLast',
    'email': 's.e.hristov99@gmail.com'
}

However, a dictionary like this can describe only one person. To store information about multiple people, we would need this approach.

In [14]:
people = {
    'first': ['SimoFirst', 'Jane', 'John'],
    'last': ['SimoLast', 'Doe', 'Doe'],
    'email': ['s.e.hristov99@gmail.com', 'JaneDoe@gmail.com', 'JohnDoe@gmail.com']
}

We can think of this dictionary as a table. The keys are the column names, and each list holds the values of that column, so the elements at the same position across all lists form one row. Usually, the definition for a dataframe object in pandas is `two-dimensional data structure`. If that sounds confusing, think of it as a table with multiple rows and columns.
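
To make the row and column views concrete, here is a minimal sketch in plain Python. The dictionary `people_demo` and its values are hypothetical sample data that only mirror the shape of `people`.

```python
# Hypothetical sample data, mirroring the shape of the `people` dictionary.
people_demo = {
    'first': ['Jane', 'John'],
    'last': ['Doe', 'Smith'],
}

# Column view: one key gives every value in that column.
first_names = people_demo['first']

# Row view: take the element at the same position from every list.
row_1 = {column: values[1] for column, values in people_demo.items()}
print(row_1)  # {'first': 'John', 'last': 'Smith'}
```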

In [15]:
# If we wanted to see the `email` column, we would access the `email` key in the `people` dictionary.
people['email']

['s.e.hristov99@gmail.com', 'JaneDoe@gmail.com', 'JohnDoe@gmail.com']

In [16]:
# Let's create a dataframe from the `people` dictionary!
# Note: The left-most column is called an index. Think of it as the primary key for a database table. More on that later.
ppl_df = pd.DataFrame(people)
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [17]:
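# Get all the values in a column.
# ppl_df.email also works, but only when the column name is a valid Python
# identifier that does not clash with an existing DataFrame attribute or method.
ppl_df['email']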

0    s.e.hristov99@gmail.com
1          JaneDoe@gmail.com
2          JohnDoe@gmail.com
Name: email, dtype: object

In [18]:
# Note: The return type is a pandas.Series object. A Series is a one-dimensional labeled array: a list of values with an index and extra functionality. You can think of it as the values of a single column.
type(ppl_df['email'])

pandas.core.series.Series

> **Conclusion**: We can then say that a DataFrame is a container for Series objects.
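
One way to see this relationship is to go in the other direction and assemble a DataFrame out of Series objects. The sketch below uses hypothetical sample values and the names `first`, `last`, and `demo_df`, none of which come from the survey data.

```python
import pandas as pd

# Two hypothetical Series; their names become the column names.
first = pd.Series(['Jane', 'John'], name='first')
last = pd.Series(['Doe', 'Doe'], name='last')

# Combine the two Series column-wise into a DataFrame.
demo_df = pd.concat([first, last], axis=1)

# Each column pulled back out is a Series that shares the DataFrame's index.
col = demo_df['first']
print(type(col).__name__)                # Series
print(col.index.equals(demo_df.index))   # True
```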

In [19]:
# Get multiple columns by passing a list of column names.
ppl_df[['last', 'email']]

Unnamed: 0,last,email
0,SimoLast,s.e.hristov99@gmail.com
1,Doe,JaneDoe@gmail.com
2,Doe,JohnDoe@gmail.com


In [20]:
# Note: The return type is not a Series. It's a new DataFrame.
type(ppl_df[['last', 'email']])

pandas.core.frame.DataFrame
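
The return type depends on what goes inside the brackets, not on how many columns come back. As a small sketch with hypothetical sample data (`demo_df` is not part of the notebook), a list with a single name still returns a DataFrame:

```python
import pandas as pd

# Hypothetical sample data.
demo_df = pd.DataFrame({
    'last': ['Doe', 'Smith'],
    'email': ['jane@example.com', 'john@example.com'],
})

as_series = demo_df['email']    # a string key returns a Series
as_frame = demo_df[['email']]   # a list key returns a DataFrame, even with one name

print(as_series.shape)  # (2,)
print(as_frame.shape)   # (2, 1)
```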

In [21]:
# Get the names of all columns.
ppl_df.columns

Index(['first', 'last', 'email'], dtype='object')

To get specific rows, we use one of two indexers:
- **iloc**: access rows by integer position, i.e. like indexing an array in C++.
- **loc**: access rows by their label in the dataframe index.

### iloc

In [22]:
# It returns a Series with all the values for the 0th row.
ppl_df.iloc[0]

first                  SimoFirst
last                    SimoLast
email    s.e.hristov99@gmail.com
Name: 0, dtype: object

In [23]:
# Get several rows by passing a list of positions (-1 is the last row).
ppl_df.iloc[[0, -1]]

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [24]:
# We can also select a column by position: here, the last column of the first and last rows.
ppl_df.iloc[[0, -1], -1]

0    s.e.hristov99@gmail.com
2          JohnDoe@gmail.com
Name: email, dtype: object

### loc

In [25]:
# While `iloc` indexes the rows by integer position, from 0 to len - 1,
# `loc` indexes the rows by their label in the left-most column (the index).
# Notice how there's no name above it.
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [26]:
ppl_df.loc[0]

first                  SimoFirst
last                    SimoLast
email    s.e.hristov99@gmail.com
Name: 0, dtype: object

In [27]:
ppl_df.loc[[0, 2]]

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [28]:
ppl_df.loc[[0, 2], 'email']

0    s.e.hristov99@gmail.com
2          JohnDoe@gmail.com
Name: email, dtype: object

In [29]:
ppl_df.loc[[0, 2], ['email', 'first']]

Unnamed: 0,email,first
0,s.e.hristov99@gmail.com,SimoFirst
2,JohnDoe@gmail.com,John
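
With the default index, labels and positions happen to be the same numbers, so `loc` and `iloc` return the same rows. The difference shows up once the labels are in a different order. A minimal sketch, assuming a hypothetical `demo_df` whose index labels are deliberately shuffled:

```python
import pandas as pd

# Hypothetical data: the index labels are 2, 0, 1 (not in positional order).
demo_df = pd.DataFrame({'name': ['Ann', 'Bob', 'Cid']}, index=[2, 0, 1])

by_label = demo_df.loc[0, 'name']    # the row whose label is 0
by_position = demo_df.iloc[0, 0]     # the row at position 0

print(by_label)     # Bob
print(by_position)  # Ann
```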


### **Question**: How many people are there in each employment type?

In [30]:
df.columns

Index(['ResponseId', 'MainBranch', 'Employment', 'Country', 'US_State',
       'UK_Country', 'EdLevel', 'Age1stCode', 'LearnCode', 'YearsCode',
       'YearsCodePro', 'DevType', 'OrgSize', 'Currency', 'CompTotal',
       'CompFreq', 'LanguageHaveWorkedWith', 'LanguageWantToWorkWith',
       'DatabaseHaveWorkedWith', 'DatabaseWantToWorkWith',
       'PlatformHaveWorkedWith', 'PlatformWantToWorkWith',
       'WebframeHaveWorkedWith', 'WebframeWantToWorkWith',
       'MiscTechHaveWorkedWith', 'MiscTechWantToWorkWith',
       'ToolsTechHaveWorkedWith', 'ToolsTechWantToWorkWith',
       'NEWCollabToolsHaveWorkedWith', 'NEWCollabToolsWantToWorkWith', 'OpSys',
       'NEWStuck', 'NEWSOSites', 'SOVisitFreq', 'SOAccount', 'SOPartFreq',
       'SOComm', 'NEWOtherComms', 'Age', 'Gender', 'Trans', 'Sexuality',
       'Ethnicity', 'Accessibility', 'MentalHealth', 'SurveyLength',
       'SurveyEase', 'ConvertedCompYearly'],
      dtype='object')

In [31]:
df['Employment']

0        Independent contractor, freelancer, or self-em...
1                                       Student, full-time
2                                       Student, full-time
3                                       Employed full-time
4        Independent contractor, freelancer, or self-em...
                               ...                        
83434                                   Employed full-time
83435    Independent contractor, freelancer, or self-em...
83436                                   Employed full-time
83437                                   Employed full-time
83438                                   Employed full-time
Name: Employment, Length: 83439, dtype: object

In [32]:
df['Employment'].value_counts()

Employed full-time                                      53584
Student, full-time                                      11781
Independent contractor, freelancer, or self-employed     8041
Not employed, but looking for work                       2961
Employed part-time                                       2461
Student, part-time                                       2051
Not employed, and not looking for work                   1228
I prefer not to say                                       890
Retired                                                   326
Name: Employment, dtype: int64
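
Two optional parameters of `value_counts` are often useful here: `normalize=True` returns proportions instead of counts, and `dropna=False` also counts missing answers. The sketch below uses a small hypothetical Series named `employment` rather than the survey column.

```python
import pandas as pd

# Hypothetical answers, including one missing value.
employment = pd.Series([
    'Employed full-time',
    'Student, full-time',
    'Employed full-time',
    None,
])

counts = employment.value_counts()                    # missing values are skipped
shares = employment.value_counts(normalize=True)      # fractions of the non-missing answers
with_missing = employment.value_counts(dropna=False)  # missing values get their own row

print(counts['Employed full-time'])  # 2
print(with_missing.sum())            # 4
```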

In [33]:
# Get the response of the first person for the `Employment` question.
df.loc[0, 'Employment']

'Independent contractor, freelancer, or self-employed'

In [34]:
# Get the response of the first three people for the `Employment` question.
# via a list
df.loc[[0, 1, 2], 'Employment']

0    Independent contractor, freelancer, or self-em...
1                                   Student, full-time
2                                   Student, full-time
Name: Employment, dtype: object

In [35]:
# via slicing
# Note: The last element when using `loc` is inclusive.
df.loc[0:2, 'Employment']

0    Independent contractor, freelancer, or self-em...
1                                   Student, full-time
2                                   Student, full-time
Name: Employment, dtype: object

In [36]:
# Get the values of the first three respondents for the columns from `Employment` to `EdLevel` (both inclusive).
df.loc[0:2, 'Employment':'EdLevel']

Unnamed: 0,Employment,Country,US_State,UK_Country,EdLevel
0,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G..."
1,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)"
2,"Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)"
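
Label slices with `loc` include both endpoints, for rows and for columns, whereas position slices with `iloc` exclude the stop position, as in regular Python slicing. The following sketch compares the two on a hypothetical `demo_df`; `Index.get_loc` is used to translate a column label into its position.

```python
import pandas as pd

# Hypothetical data with a default 0..2 index and columns a..d.
demo_df = pd.DataFrame({
    'a': [1, 2, 3],
    'b': [4, 5, 6],
    'c': [7, 8, 9],
    'd': [10, 11, 12],
})

# loc: rows 0 through 1 and columns 'b' through 'c', endpoints included.
by_label = demo_df.loc[0:1, 'b':'c']

# iloc: the stop position is excluded, so add 1 to cover the same range.
start = demo_df.columns.get_loc('b')
stop = demo_df.columns.get_loc('c')
by_position = demo_df.iloc[0:2, start:stop + 1]

print(by_label.equals(by_position))  # True
```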


## Indexes - Set, Reset, and Use

In [37]:
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [38]:
ppl_df.index

RangeIndex(start=0, stop=3, step=1)

In [39]:
# Set the email addresses as an index.
ppl_df.set_index('email')

email,first,last
s.e.hristov99@gmail.com,SimoFirst,SimoLast
JaneDoe@gmail.com,Jane,Doe
JohnDoe@gmail.com,John,Doe


In [40]:
# Note: Our dataframe didn't actually change because `set_index` returns a new dataframe.
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [41]:
ppl_df = ppl_df.set_index('email') # Same as ppl_df.set_index('email', inplace=True)
ppl_df

email,first,last
s.e.hristov99@gmail.com,SimoFirst,SimoLast
JaneDoe@gmail.com,Jane,Doe
JohnDoe@gmail.com,John,Doe


In [42]:
ppl_df.index

Index(['s.e.hristov99@gmail.com', 'JaneDoe@gmail.com', 'JohnDoe@gmail.com'], dtype='object', name='email')
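
The primary-key analogy holds only while the index labels are unique, and pandas does not check this by default. As a hedged sketch on hypothetical data (`demo_df` with a repeated email), `Series.is_unique` reports duplicates, and the `verify_integrity=True` argument of `set_index` raises an error when there are any:

```python
import pandas as pd

# Hypothetical data in which one email address appears twice.
demo_df = pd.DataFrame({
    'email': ['a@example.com', 'b@example.com', 'a@example.com'],
    'first': ['Ann', 'Bob', 'Amy'],
})

print(demo_df['email'].is_unique)  # False

# Without the check, duplicates are accepted and one label matches several rows.
indexed = demo_df.set_index('email')
rows = indexed.loc['a@example.com']
print(len(rows))  # 2

# With the check, pandas refuses to build the index.
duplicate_error = None
try:
    demo_df.set_index('email', verify_integrity=True)
except ValueError as err:
    duplicate_error = err
```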

In [43]:
# Get the last name of Jane using `loc`
ppl_df.loc['JaneDoe@gmail.com', 'last']

'Doe'

In [44]:
# Get the last name of Jane using `iloc`
ppl_df.iloc[1, 1]
# or: ppl_df.iloc[1, -1]

'Doe'

In [45]:
# ppl_df.loc[0] # This would now raise a KeyError, since the index labels are email strings, not integers.
ppl_df.iloc[0]

first    SimoFirst
last      SimoLast
Name: s.e.hristov99@gmail.com, dtype: object
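
The failure mentioned in the comment can be reproduced safely on a small frame. This is a sketch with hypothetical data (`demo_df` and its email labels are made up): looking up the label `0` raises a `KeyError`, while position `0` still works.

```python
import pandas as pd

# Hypothetical data indexed by email strings.
demo_df = pd.DataFrame(
    {'first': ['Jane', 'John']},
    index=['jane@example.com', 'john@example.com'],
)

# loc looks for the label 0, which does not exist in a string index.
try:
    demo_df.loc[0]
    raised = False
except KeyError:
    raised = True

# iloc ignores labels and returns the row at position 0.
first_row = demo_df.iloc[0]

print(raised)              # True
print(first_row['first'])  # Jane
```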

In [46]:
# Reset the index. The old index moves back into a regular column.
ppl_df = ppl_df.reset_index()
ppl_df

Unnamed: 0,email,first,last
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe
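
By default `reset_index` keeps the old index as a regular column, which is why `email` reappears as the first column above. If the old labels are not needed, `drop=True` discards them instead. A minimal sketch on hypothetical data (`demo_df` is not part of the notebook):

```python
import pandas as pd

# Hypothetical data with a named string index.
demo_df = pd.DataFrame(
    {'first': ['Jane', 'John']},
    index=pd.Index(['jane@example.com', 'john@example.com'], name='email'),
)

restored = demo_df.reset_index()          # index becomes the first column
dropped = demo_df.reset_index(drop=True)  # index is thrown away

print(list(restored.columns))  # ['email', 'first']
print(list(dropped.columns))   # ['first']
```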


In [47]:
df

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
0,1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
1,2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
2,3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
3,4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
4,5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17,10,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83434,83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83435,83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, full-stack","Just me - I am a freelancer, sole proprietor, ...",XOF\tWest African CFA franc,200000.0,Monthly,,,Firebase;MariaDB;MySQL;PostgreSQL;Redis;SQLite,Firebase;MariaDB;MongoDB;MySQL;PostgreSQL;Redi...,,,Django;jQuery;Laravel;React.js;Ruby on Rails,Django;Express;jQuery;Laravel;React.js;Ruby on...,Flutter;Qt,,Git;Unity 3D;Unreal Engine,Docker;Git;Kubernetes,Android Studio;Eclipse;Emacs;IntelliJ;NetBeans...,Emacs;IntelliJ;PHPStorm;PyCharm;RStudio;Sublim...,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83436,83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83437,83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,3,"Developer, back-end",20 to 99 employees,CAD\tCanadian dollar,90000.0,Monthly,Bash/Shell;JavaScript;Node.js;Python,Go;Rust,Cassandra;Elasticsearch;MongoDB;PostgreSQL;Redis,,Heroku,AWS;DigitalOcean,Django;Express;Flask;React.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,NumPy;Pandas;TensorFlow;Torch/PyTorch,Ansible;Docker;Git;Terraform,Kubernetes;Terraform,PyCharm;Sublime Text,,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per month or weekly,Yes,Less than once per month or monthly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0


In [48]:
# Notice how the `ResponseId` column is a unique identifier.
# Set it as an index
df.set_index('ResponseId')

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17,10,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, full-stack","Just me - I am a freelancer, sole proprietor, ...",XOF\tWest African CFA franc,200000.0,Monthly,,,Firebase;MariaDB;MySQL;PostgreSQL;Redis;SQLite,Firebase;MariaDB;MongoDB;MySQL;PostgreSQL;Redi...,,,Django;jQuery;Laravel;React.js;Ruby on Rails,Django;Express;jQuery;Laravel;React.js;Ruby on...,Flutter;Qt,,Git;Unity 3D;Unreal Engine,Docker;Git;Kubernetes,Android Studio;Eclipse;Emacs;IntelliJ;NetBeans...,Emacs;IntelliJ;PHPStorm;PyCharm;RStudio;Sublim...,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,3,"Developer, back-end",20 to 99 employees,CAD\tCanadian dollar,90000.0,Monthly,Bash/Shell;JavaScript;Node.js;Python,Go;Rust,Cassandra;Elasticsearch;MongoDB;PostgreSQL;Redis,,Heroku,AWS;DigitalOcean,Django;Express;Flask;React.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,NumPy;Pandas;TensorFlow;Torch/PyTorch,Ansible;Docker;Git;Terraform,Kubernetes;Terraform,PyCharm;Sublime Text,,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per month or weekly,Yes,Less than once per month or monthly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0


In [49]:
# Set it as an index while loading in the data!
df = pd.read_csv(DATA_PATH, index_col='ResponseId')
df

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17,10,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, full-stack","Just me - I am a freelancer, sole proprietor, ...",XOF\tWest African CFA franc,200000.0,Monthly,,,Firebase;MariaDB;MySQL;PostgreSQL;Redis;SQLite,Firebase;MariaDB;MongoDB;MySQL;PostgreSQL;Redi...,,,Django;jQuery;Laravel;React.js;Ruby on Rails,Django;Express;jQuery;Laravel;React.js;Ruby on...,Flutter;Qt,,Git;Unity 3D;Unreal Engine,Docker;Git;Kubernetes,Android Studio;Eclipse;Emacs;IntelliJ;NetBeans...,Emacs;IntelliJ;PHPStorm;PyCharm;RStudio;Sublim...,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,3,"Developer, back-end",20 to 99 employees,CAD\tCanadian dollar,90000.0,Monthly,Bash/Shell;JavaScript;Node.js;Python,Go;Rust,Cassandra;Elasticsearch;MongoDB;PostgreSQL;Redis,,Heroku,AWS;DigitalOcean,Django;Express;Flask;React.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,NumPy;Pandas;TensorFlow;Torch/PyTorch,Ansible;Docker;Git;Terraform,Kubernetes;Terraform,PyCharm;Sublime Text,,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per month or weekly,Yes,Less than once per month or monthly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0


In [50]:
# The first respondent has an index of 1 now.
df.loc[1]

MainBranch                                         I am a developer by profession
Employment                      Independent contractor, freelancer, or self-em...
Country                                                                  Slovakia
US_State                                                                      NaN
UK_Country                                                                    NaN
EdLevel                         Secondary school (e.g. American high school, G...
Age1stCode                                                          18 - 24 years
LearnCode                       Coding Bootcamp;Other online resources (ex: vi...
YearsCode                                                                     NaN
YearsCodePro                                                                  NaN
DevType                                                         Developer, mobile
OrgSize                                                        20 to 99 employees
Currency        

### **Task**: Get the question a column refers to.

In [51]:
# First, get the name of the column that will be used as an index.
schema_df

Unnamed: 0,qid,qname,question,force_resp,type,selector
0,QID16,S0,"<div><span style=""font-size:19px;""><strong>Hel...",False,DB,TB
1,QID12,MetaInfo,Browser Meta Info,False,Meta,Browser
2,QID1,S1,"<span style=""font-size:22px; font-family: aria...",False,DB,TB
3,QID2,MainBranch,Which of the following options best describes ...,True,MC,SAVR
4,QID24,Employment,Which of the following best describes your cur...,False,MC,MAVR
5,QID6,Country,"Where do you live? <span style=""font-weight: b...",True,MC,DL
6,QID7,US_State,<p>In which state or territory of the USA do y...,False,MC,DL
7,QID9,UK_Country,In which part of the United Kingdom do you liv...,False,MC,DL
8,QID190,S2,"<span style=""font-size:22px; font-family: aria...",False,DB,TB
9,QID25,EdLevel,Which of the following best describes the high...,False,MC,SAVR


In [52]:
# In this case it's `qname`
schema_df = pd.read_csv(SCHEMA_PATH, index_col='qname')
schema_df

qname,qid,question,force_resp,type,selector
S0,QID16,"<div><span style=""font-size:19px;""><strong>Hel...",False,DB,TB
MetaInfo,QID12,Browser Meta Info,False,Meta,Browser
S1,QID1,"<span style=""font-size:22px; font-family: aria...",False,DB,TB
MainBranch,QID2,Which of the following options best describes ...,True,MC,SAVR
Employment,QID24,Which of the following best describes your cur...,False,MC,MAVR
Country,QID6,"Where do you live? <span style=""font-weight: b...",True,MC,DL
US_State,QID7,<p>In which state or territory of the USA do y...,False,MC,DL
UK_Country,QID9,In which part of the United Kingdom do you liv...,False,MC,DL
S2,QID190,"<span style=""font-size:22px; font-family: aria...",False,DB,TB
EdLevel,QID25,Which of the following best describes the high...,False,MC,SAVR


In [53]:
# Then, locate the row with index `Employment`
schema_df.loc['Employment']

qid                                                       QID24
question      Which of the following best describes your cur...
force_resp                                                False
type                                                         MC
selector                                                   MAVR
Name: Employment, dtype: object

In [54]:
schema_df.loc['Employment', 'question']

'Which of the following best describes your current <b>employment status</b>?'
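The two `.loc` forms used above can be sketched on a tiny hand-made table. The `toy_schema` frame and its rows are made up for illustration and are not part of the survey files; the sketch only assumes that `.loc` is given index labels, as in the cells above.

```python
import pandas as pd

# Hypothetical miniature schema; the real one is loaded from SCHEMA_PATH.
toy_schema = pd.DataFrame(
    {
        "qname": ["Country", "Age"],
        "question": ["Where do you live?", "What is your age?"],
        "type": ["MC", "MC"],
    }
).set_index("qname")

# One label: the whole row comes back as a Series.
row = toy_schema.loc["Age"]

# Row label plus column label: a single value comes back.
cell = toy_schema.loc["Age", "question"]

print(type(row).__name__)
print(cell)
```

Setting the index with `set_index` after loading has the same effect here as passing `index_col` to `pd.read_csv`.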

In [55]:
# Now we can check any question
df.columns

Index(['MainBranch', 'Employment', 'Country', 'US_State', 'UK_Country',
       'EdLevel', 'Age1stCode', 'LearnCode', 'YearsCode', 'YearsCodePro',
       'DevType', 'OrgSize', 'Currency', 'CompTotal', 'CompFreq',
       'LanguageHaveWorkedWith', 'LanguageWantToWorkWith',
       'DatabaseHaveWorkedWith', 'DatabaseWantToWorkWith',
       'PlatformHaveWorkedWith', 'PlatformWantToWorkWith',
       'WebframeHaveWorkedWith', 'WebframeWantToWorkWith',
       'MiscTechHaveWorkedWith', 'MiscTechWantToWorkWith',
       'ToolsTechHaveWorkedWith', 'ToolsTechWantToWorkWith',
       'NEWCollabToolsHaveWorkedWith', 'NEWCollabToolsWantToWorkWith', 'OpSys',
       'NEWStuck', 'NEWSOSites', 'SOVisitFreq', 'SOAccount', 'SOPartFreq',
       'SOComm', 'NEWOtherComms', 'Age', 'Gender', 'Trans', 'Sexuality',
       'Ethnicity', 'Accessibility', 'MentalHealth', 'SurveyLength',
       'SurveyEase', 'ConvertedCompYearly'],
      dtype='object')

In [56]:
schema_df.loc['NEWOtherComms', 'question']

'Are you a member of any other online developer communities?'

In [57]:
# Sort the indices lexicographically.
schema_df.sort_index()

qname,qid,question,force_resp,type,selector
Accessibility,QID124,"Which of the following describe you, if any? P...",False,MC,MAVR
Age,QID127,What is your age?,False,MC,MAVR
Age1stCode,QID149,At what age did you write your first line of c...,False,MC,MAVR
CompFreq,QID52,"Is that compensation weekly, monthly, or yearly?",False,MC,MAVR
CompTotal,QID51,What is your current total compensation (salar...,False,TE,SL
Country,QID6,"Where do you live? <span style=""font-weight: b...",True,MC,DL
Currency,QID50,Which currency do you use day-to-day? If your ...,True,MC,SB
Database,QID262,Which <b>database environments </b>have you do...,False,Matrix,Likert
DevType,QID31,Which of the following describes your current ...,False,MC,MAVR
EdLevel,QID25,Which of the following best describes the high...,False,MC,SAVR


In [58]:
# Note that `sort_index` returns a new dataframe; `schema_df` itself is not changed.
schema_df.sort_index(ascending=False)

qname,qid,question,force_resp,type,selector
YearsCodePro,QID34,"NOT including education, how many years have y...",False,MC,DL
YearsCode,QID32,"Including any education, how many years have y...",False,MC,DL
Webframe,QID264,Which <strong>web frameworks </strong><span st...,False,Matrix,Likert
US_State,QID7,<p>In which state or territory of the USA do y...,False,MC,DL
UK_Country,QID9,In which part of the United Kingdom do you liv...,False,MC,DL
Trans,QID153,Do you identify as transgender?,False,MC,MAVR
ToolsTech,QID275,Which <strong>tools</strong> have you done ext...,False,Matrix,Likert
SurveyLength,QID132,How do you feel about the length of the survey...,False,MC,MAVR
SurveyEase,QID133,How easy or difficult was this survey to compl...,False,MC,MAVR
Sexuality,QID136,"Which of the following describe you, if any? P...",False,MC,MAVR
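As a small sketch of why the original dataframe is left untouched: `sort_index` returns a new object, so the result has to be assigned to a name if it should be kept. The `toy` frame below is a made-up example, not survey data.

```python
import pandas as pd

# Hypothetical frame with an unsorted index.
toy = pd.DataFrame({"value": [1, 2, 3]}, index=["b", "c", "a"])

# sort_index returns a sorted copy ...
sorted_toy = toy.sort_index()

# ... while the original keeps its order.
original_order = list(toy.index)

# Reassign the name to keep the sorted version.
toy = toy.sort_index()
```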


## Filtering - Using Conditionals to Filter Rows and Columns

In [59]:
ppl_df

Unnamed: 0,email,first,last
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe


In [60]:
# Get all people with last name `Doe`.
ppl_df['last'] == 'Doe'

0    False
1     True
2     True
Name: last, dtype: bool

In [61]:
ppl_df[ppl_df['last'] == 'Doe']

Unnamed: 0,email,first,last
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe


In [62]:
ppl_df[ppl_df['last'] == 'Doe']['email']

1    JaneDoe@gmail.com
2    JohnDoe@gmail.com
Name: email, dtype: object

In [63]:
ppl_df.loc[ppl_df['last'] == 'Doe']

Unnamed: 0,email,first,last
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe


In [64]:
ppl_df.loc[ppl_df['last'] == 'Doe', 'email']

1    JaneDoe@gmail.com
2    JohnDoe@gmail.com
Name: email, dtype: object
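A minimal sketch comparing the two ways of selecting one column from the filtered rows. The `toy_people` frame and its values are hypothetical. Both forms return the same data when reading; the single `.loc[rows, column]` call is generally the one to use when assigning, because it operates on the original dataframe in one step.

```python
import pandas as pd

# Hypothetical stand-in for a small people table.
toy_people = pd.DataFrame(
    {
        "first": ["Ann", "Bob", "Cy"],
        "last": ["Lee", "Doe", "Doe"],
        "email": ["Ann@x.org", "Bob@x.org", "Cy@x.org"],
    }
)

mask = toy_people["last"] == "Doe"

chained = toy_people[mask]["email"]     # filter rows, then pick the column
single = toy_people.loc[mask, "email"]  # rows and column in one call
same_result = chained.equals(single)

# Assignment through .loc updates toy_people itself.
toy_people.loc[mask, "email"] = toy_people.loc[mask, "email"].str.lower()
```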

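> **Note**: Combine multiple conditions with **&**, **|**, **~**, not with the Python keywords `and`, `or`, `not`.

> **THE BRACKETS ARE IMPORTANT!** `&` and `|` bind more tightly than comparison operators such as `==`, so each condition must be wrapped in its own parentheses.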
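A short sketch of what happens when the parentheses are left out. The `toy` frame is made up for illustration. Without parentheses, Python evaluates `1 & toy["b"]` first, and the remaining chained comparison asks for the truth value of a whole Series, which pandas refuses with a `ValueError`.

```python
import pandas as pd

# Hypothetical numeric frame.
toy = pd.DataFrame({"a": [1, 1, 2], "b": [2, 3, 2]})

# Correct: each comparison is wrapped in its own parentheses.
both = toy[(toy["a"] == 1) & (toy["b"] == 2)]

# Incorrect: parsed as toy["a"] == (1 & toy["b"]) == 2.
try:
    toy[toy["a"] == 1 & toy["b"] == 2]
    raised = False
except ValueError:
    raised = True
```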

In [65]:
# Get the rows in which the last name is `Doe` and the first name is `John`.
ppl_df[(ppl_df['last'] == 'Doe') & (ppl_df['first'] == 'John')]

Unnamed: 0,email,first,last
2,JohnDoe@gmail.com,John,Doe


In [66]:
ppl_df[(ppl_df['last'] == 'Doe') & (ppl_df['first'] == 'John')]['email']

2    JohnDoe@gmail.com
Name: email, dtype: object

In [67]:
# Get the rows in which the last name is `SimoLast` or the first name is `John`.
ppl_df[(ppl_df['last'] == 'SimoLast') | (ppl_df['first'] == 'John')]

Unnamed: 0,email,first,last
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast
2,JohnDoe@gmail.com,John,Doe


In [68]:
ppl_df[(ppl_df['last'] == 'SimoLast') | (ppl_df['first'] == 'John')]['email']

0    s.e.hristov99@gmail.com
2          JohnDoe@gmail.com
Name: email, dtype: object

In [69]:
# Get the opposite of the above condition.
ppl_df[~((ppl_df['last'] == 'SimoLast') | (ppl_df['first'] == 'John'))]['email']

1    JaneDoe@gmail.com
Name: email, dtype: object
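As an aside, the negated condition can also be written without the outer parentheses by negating each part and switching `|` to `&`. The sketch below checks this on a made-up `toy_people` frame; the names and values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for a small people table.
toy_people = pd.DataFrame(
    {"first": ["Ann", "Bob", "Cy"], "last": ["Lee", "Doe", "Doe"]}
)

is_lee = toy_people["last"] == "Lee"
is_cy = toy_people["first"] == "Cy"

negated_or = ~(is_lee | is_cy)       # "not (A or B)"
and_of_negations = ~is_lee & ~is_cy  # "(not A) and (not B)"

equivalent = negated_or.equals(and_of_negations)
```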

In [70]:
df.columns

Index(['MainBranch', 'Employment', 'Country', 'US_State', 'UK_Country',
       'EdLevel', 'Age1stCode', 'LearnCode', 'YearsCode', 'YearsCodePro',
       'DevType', 'OrgSize', 'Currency', 'CompTotal', 'CompFreq',
       'LanguageHaveWorkedWith', 'LanguageWantToWorkWith',
       'DatabaseHaveWorkedWith', 'DatabaseWantToWorkWith',
       'PlatformHaveWorkedWith', 'PlatformWantToWorkWith',
       'WebframeHaveWorkedWith', 'WebframeWantToWorkWith',
       'MiscTechHaveWorkedWith', 'MiscTechWantToWorkWith',
       'ToolsTechHaveWorkedWith', 'ToolsTechWantToWorkWith',
       'NEWCollabToolsHaveWorkedWith', 'NEWCollabToolsWantToWorkWith', 'OpSys',
       'NEWStuck', 'NEWSOSites', 'SOVisitFreq', 'SOAccount', 'SOPartFreq',
       'SOComm', 'NEWOtherComms', 'Age', 'Gender', 'Trans', 'Sexuality',
       'Ethnicity', 'Accessibility', 'MentalHealth', 'SurveyLength',
       'SurveyEase', 'ConvertedCompYearly'],
      dtype='object')

### **Task**: Get all responses from people that earn a salary > 70 000.

In [71]:
# We'll use the last column `ConvertedCompYearly` to do that.
filt = (df['ConvertedCompYearly'] > 70000)
df[filt]

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
13,I am a developer by profession,Employed full-time,Germany,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,Other (please specify):,15,6,"Developer, desktop or enterprise applications;...","1,000 to 4,999 employees",EUR European Euro,71500.0,Yearly,C;C++;Java;Perl;Ruby,Rust,,,,,Ruby on Rails,,Qt,NumPy;TensorFlow,Git,Docker;Kubernetes,Vim,Vim;Visual Studio Code,Linux-based,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow,Multiple times per day,Yes,Daily or almost daily,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,Prefer not to say,Appropriate in length,Easy,77290.0
19,"I am not primarily a developer, but I write co...",I prefer not to say,Singapore,,,"Other doctoral degree (Ph.D., Ed.D., etc.)",11 - 17 years,Other (please specify):,40,30,,"5,000 to 9,999 employees",SGD\tSingapore dollar,18700.0,Monthly,C++;Python,C++;Python,,,,,,,NumPy;Pandas;Torch/PyTorch,NumPy;Pandas;Torch/PyTorch,,,IPython/Jupyter;Vim,IPython/Jupyter;Vim,Linux-based,Google it,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, somewhat",No,45-54 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,160932.0
25,I am a developer by profession,Employed full-time,Germany,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,School;Friend or family member,26,18,"Developer, front-end;Developer, desktop or ent...","1,000 to 4,999 employees",EUR European Euro,72000.0,Yearly,C++;HTML/CSS;Java;JavaScript;Kotlin;Node.js;Ty...,HTML/CSS;Java;JavaScript;Kotlin;Node.js;TypeSc...,DynamoDB;PostgreSQL,PostgreSQL,AWS;Heroku,AWS;Heroku,Angular;Express;Spring,Angular;Express;Spring,,,Docker;Git;Kubernetes,Docker;Git;Kubernetes,Android Studio;IntelliJ;Visual Studio Code,Android Studio;IntelliJ;Visual Studio Code,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per week,"Yes, somewhat",No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,77831.0
27,"I am not primarily a developer, but I write co...",Employed full-time,Switzerland,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,Other (please specify):,14,5,Academic researcher;Scientist;Student,"5,000 to 9,999 employees",CHF\tSwiss franc,80000.0,Yearly,C++;Python,C++;Go;Java;Python,,,,,,,NumPy;Pandas,,Docker;Git,Docker;Git,IntelliJ;IPython/Jupyter,IntelliJ;IPython/Jupyter,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,Neutral,No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,81319.0
32,I am a developer by profession,Employed full-time,Israel,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",25 - 34 years,School;Online Courses or Certification,4,2,"Engineer, data;Developer, back-end","5,000 to 9,999 employees",ILS\tIsraeli new shekel,35000.0,Monthly,Bash/Shell;Go;Java;Node.js;Python;Scala;SQL,Bash/Shell;Java;Python;Scala;SQL,DynamoDB;MongoDB;MySQL;PostgreSQL,Cassandra;MongoDB;MySQL;PostgreSQL;Redis,AWS;Google Cloud Platform;Microsoft Azure,AWS;Google Cloud Platform,,,Apache Spark;NumPy;Pandas,Apache Spark,Docker;Git;Kubernetes;Terraform,Docker;Git;Kubernetes,Atom;IntelliJ;IPython/Jupyter;PyCharm;Sublime ...,IntelliJ;Visual Studio Code,MacOS,Call a coworker or friend;Google it;Do other w...,Stack Overflow;Stack Exchange,A few times per week,Yes,A few times per month or weekly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent;Middle Eastern,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,122580.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83431,I am a developer by profession,Employed full-time,United States of America,Illinois,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",23,21,"Developer, front-end;Developer, full-stack;Dev...",10 to 19 employees,USD\tUnited States dollar,125000.0,Yearly,APL;Clojure;LISP;Python;Ruby;SQL;TypeScript,APL;Clojure;Haskell;LISP;R,MongoDB;MySQL;PostgreSQL;Redis,PostgreSQL,AWS;DigitalOcean;Google Cloud Platform;Heroku,AWS;DigitalOcean;Google Cloud Platform,Django;React.js;Ruby on Rails,,,,Docker;Git,Docker;Git;Kubernetes;Unity 3D,Emacs;Vim,Emacs,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not really",No,45-54 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,125000.0
83433,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,School,5,Less than 1 year,"Developer, back-end","10,000 or more employees",CAD\tCanadian dollar,106000.0,Yearly,Ruby,Java;Ruby;TypeScript,MySQL;PostgreSQL,,Google Cloud Platform;Heroku,,Flask;React.js;Ruby on Rails;Vue.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,,Docker;Git;Kubernetes;Yarn,,Atom;IPython/Jupyter;Vim;Visual Studio Code,,MacOS,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,No,,"No, not really",No,18-24 years old,Woman,No,Straight / Heterosexual,East Asian,None of the above,None of the above,Appropriate in length,Easy,80169.0
83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0


In [72]:
df[filt][['Country', 'LanguageHaveWorkedWith', 'ConvertedCompYearly']]

Unnamed: 0_level_0,Country,LanguageHaveWorkedWith,ConvertedCompYearly
ResponseId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
13,Germany,C;C++;Java;Perl;Ruby,77290.0
19,Singapore,C++;Python,160932.0
25,Germany,C++;HTML/CSS;Java;JavaScript;Kotlin;Node.js;Ty...,77831.0
27,Switzerland,C++;Python,81319.0
32,Israel,Bash/Shell;Go;Java;Node.js;Python;Scala;SQL,122580.0
...,...,...,...
83431,United States of America,APL;Clojure;LISP;Python;Ruby;SQL;TypeScript,125000.0
83433,Canada,Ruby,80169.0
83435,United States of America,Clojure;Kotlin;SQL,160500.0
83437,United States of America,Groovy;Java;Python,90000.0


In [73]:
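# Task: Get only responses from specific countries.
# Note: isin() matches whole values exactly, so 'United States' does not match the
# dataset's 'United States of America' (the same applies to 'United Kingdom').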
countries = ['United States', 'Germany', 'Bulgaria', 'United Kingdom', 'Canada', 'India']
filt = (df['Country'].isin(countries))

In [74]:
df.loc[filt, 'Country']

ResponseId
9          India
13       Germany
18        Canada
21       Germany
23         India
          ...   
83416    Germany
83418      India
83425    Germany
83433     Canada
83438     Canada
Name: Country, Length: 19550, dtype: object
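
Because `isin` compares whole values, a shortened country name silently matches nothing. The sketch below illustrates this on a small toy frame (the rows are made up for illustration, not read from the survey file) and shows one possible workaround using a substring test.

```python
import pandas as pd

# Toy data; the rows are illustrative and not taken from the survey file.
toy = pd.DataFrame({
    'Country': ['Germany', 'United States of America', 'India', 'Canada'],
})

wanted = ['United States', 'Germany']

# isin() compares whole values, so 'United States' does not match
# 'United States of America'.
exact = toy['Country'].isin(wanted)

# A plain substring test is one way to catch longer official names.
partial = toy['Country'].str.contains('United States', na=False, regex=False)

matched_exact = toy.loc[exact, 'Country'].tolist()
matched_either = toy.loc[exact | partial, 'Country'].tolist()

print(matched_exact)
print(matched_either)
```

Checking `df['Country'].unique()` before building the list is usually the simplest way to get the exact spellings.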

In [75]:
# Task: Get people who have worked with Python (possibly alongside other languages).
df['LanguageHaveWorkedWith']

ResponseId
1        C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift
2                                    JavaScript;Python
3                             Assembly;C;Python;R;Rust
4                                JavaScript;TypeScript
5                       Bash/Shell;HTML/CSS;Python;SQL
                             ...                      
83435                               Clojure;Kotlin;SQL
83436                                              NaN
83437                               Groovy;Java;Python
83438             Bash/Shell;JavaScript;Node.js;Python
83439           Delphi;Elixir;HTML/CSS;Java;JavaScript
Name: LanguageHaveWorkedWith, Length: 83439, dtype: object

In [76]:
filt = df['LanguageHaveWorkedWith'].str.contains('Python', na=False)
df.loc[filt]

Unnamed: 0_level_0,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
ResponseId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1
2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17,10,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
6,I am a student who is learning to code,"Student, part-time",United States of America,Georgia,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,C;C#;C++;HTML/CSS;Java;JavaScript;Node.js;Powe...,C#;C++;Go;HTML/CSS;Java;JavaScript;Node.js;Obj...,MySQL;PostgreSQL;SQLite,Elasticsearch;Firebase;IBM DB2;MariaDB;Microso...,,,Express;Flask;jQuery;React.js,Express;Flask;jQuery;React.js,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;Qt;React Native;TensorFlow;...,Git,Docker;Git;Unity 3D;Unreal Engine,IPython/Jupyter;Notepad++;PyCharm;Sublime Text...,Android Studio;IPython/Jupyter;Vim;Visual Stud...,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Prefer not to say,No,Straight / Heterosexual,Prefer not to say,None of the above,I have a concentration and/or memory disorder ...,Too long,Neither easy nor difficult,
10,I am a developer by profession,Employed full-time,Sweden,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,School,7,4,Data scientist or machine learning specialist,10 to 19 employees,SEK\tSwedish krona,42000.0,Monthly,C++;Python,Haskell;Python,PostgreSQL,,,,,,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;TensorFlow;Torch/PyTorch,Git,Git,IPython/Jupyter;Vim;Visual Studio Code,Emacs;IPython/Jupyter;Vim;Visual Studio Code,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Daily or almost daily,"Yes, somewhat",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,51552.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83430,I code primarily as a hobby,"Not employed, but looking for work",United States of America,Washington,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc)",6,,Other (please specify):;Student,,,,,HTML/CSS;PHP;PowerShell;Python;SQL;VBA,C#;Go;HTML/CSS;Java;JavaScript;PHP;Python;SQL;VBA,MongoDB;MySQL;PostgreSQL,MariaDB;Microsoft SQL Server;MongoDB;MySQL;Pos...,Heroku,AWS;Heroku;IBM Cloud or Watson;Microsoft Azure...,Django;Flask;jQuery,Angular.js;ASP.NET;ASP.NET Core ;Django;FastAP...,NumPy;Pandas,.NET Framework;.NET Core / .NET 5;NumPy;Pandas,Git,Docker;Git;Xamarin,Atom;Notepad++;Sublime Text;Visual Studio Code,Atom;Notepad++;Sublime Text;Visual Studio Code,Windows,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,A few times per month or weekly,Not sure/can't remember,,"Yes, somewhat",No,25-34 years old,"Man;Or, in your own words:",Yes,Queer,White or of European descent,I am unable to / find it difficult to walk or ...,I have an anxiety disorder,Appropriate in length,Neither easy nor difficult,
83431,I am a developer by profession,Employed full-time,United States of America,Illinois,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",23,21,"Developer, front-end;Developer, full-stack;Dev...",10 to 19 employees,USD\tUnited States dollar,125000.0,Yearly,APL;Clojure;LISP;Python;Ruby;SQL;TypeScript,APL;Clojure;Haskell;LISP;R,MongoDB;MySQL;PostgreSQL;Redis,PostgreSQL,AWS;DigitalOcean;Google Cloud Platform;Heroku,AWS;DigitalOcean;Google Cloud Platform,Django;React.js;Ruby on Rails,,,,Docker;Git,Docker;Git;Kubernetes;Unity 3D,Emacs;Vim,Emacs,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not really",No,45-54 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,125000.0
83432,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Pakistan,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",9,4,"Developer, mobile;Developer, desktop or enterp...",2 to 9 employees,PKR\tPakistani rupee,150000.0,Monthly,C#;Dart;HTML/CSS;Java;JavaScript;Kotlin;Node.j...,C#;Dart;HTML/CSS;JavaScript;Kotlin;Node.js;Pyt...,Firebase;MySQL;SQLite,DynamoDB;Firebase;MongoDB;MySQL;SQLite,Google Cloud Platform,AWS;Google Cloud Platform,Flask;jQuery,Angular;Django;Flask;jQuery;Laravel,Flutter,Flutter;Hadoop;NumPy;TensorFlow;Torch/PyTorch,Git,Docker;Git,Android Studio;IntelliJ;IPython/Jupyter;Notepa...,Android Studio;IntelliJ;IPython/Jupyter;PyChar...,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,Southeast Asian,None of the above,None of the above,Appropriate in length,Easy,11676.0
83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
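
`str.contains('Python')` keeps every respondent whose answer includes Python. If the goal is respondents who listed Python and nothing else, an equality test is one option. The following is a minimal sketch on a toy series that assumes the same `;`-separated format as `LanguageHaveWorkedWith`.

```python
import pandas as pd

# Toy series mimicking the ';'-separated language column (values are made up).
langs = pd.Series(
    ['JavaScript;Python', 'Python', None, 'C;Rust'],
    name='LanguageHaveWorkedWith',
)

# True wherever 'Python' appears anywhere in the answer; missing answers count as False.
includes_python = langs.str.contains('Python', na=False, regex=False)

# True only where the whole answer is exactly 'Python'.
only_python = langs == 'Python'

print(includes_python.tolist())
print(only_python.tolist())
```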


## Updating Rows and Columns - Modifying Data Within DataFrames

In [77]:
ppl_df = pd.DataFrame(people)
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


**Column names**

In [78]:
# Rename all columns
ppl_df.columns = ['first_name', 'last_name', 'email']
ppl_df

Unnamed: 0,first_name,last_name,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [79]:
# Uppercase all of the column names
ppl_df.columns = ppl_df.columns.str.upper()
ppl_df

Unnamed: 0,FIRST_NAME,LAST_NAME,EMAIL
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [80]:
# Replace underscores with spaces
ppl_df.columns = ppl_df.columns.str.replace('_', ' ')
ppl_df

Unnamed: 0,FIRST NAME,LAST NAME,EMAIL
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [81]:
# Rename only specific columns
ppl_df = ppl_df.rename(columns={
    'FIRST NAME': 'first',
    'LAST NAME': 'last',
})
ppl_df

Unnamed: 0,first,last,EMAIL
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


**Row values**

In [82]:
ppl_df.columns = ['first', 'last', 'email']
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [83]:
# Task: Change John's last name to Smith
ppl_df.loc[2]

first                 John
last                   Doe
email    JohnDoe@gmail.com
Name: 2, dtype: object

In [84]:
# Solution 1: Replace the whole row
ppl_df.loc[2] = ['John', 'Smith', 'JohnSmith@email.com']
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Smith,JohnSmith@email.com


In [85]:
# Solution 2: Change only the value in a specific column
ppl_df.loc[2, 'last'] = 'Smith'
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Smith,JohnSmith@email.com


In [86]:
# Change the values of several columns at once
ppl_df.loc[2, ['last', 'email']] = ['Doe', 'JohnDoe@email.com']
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@email.com


In [87]:
# Common mistake: Changing a value through chained indexing (filter first, then column)
filt = (ppl_df['email'] == 'JohnDoe@email.com')
ppl_df[filt]

Unnamed: 0,first,last,email
2,John,Doe,JohnDoe@email.com


In [88]:
ppl_df[filt]['last'] = 'Smith'
ppl_df

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@email.com


In [89]:
# Solution: Select the rows and the column in a single .loc call
ppl_df.loc[filt, 'last'] = 'Smith'
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Smith,JohnDoe@email.com
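
The reason the first attempt had no effect is that `ppl_df[filt]` produces a separate object, so the assignment lands on a temporary copy. The sketch below shows the same idea on a toy frame (names are illustrative): editing an explicit copy leaves the original alone, while a single `.loc` call writes into the original.

```python
import pandas as pd

toy = pd.DataFrame({
    'first': ['Jane', 'John'],
    'last': ['Doe', 'Doe'],
})
mask = toy['first'] == 'John'

# Boolean filtering returns a new object; changing it does not touch `toy`.
subset = toy[mask].copy()
subset['last'] = 'Changed'
after_copy_edit = toy['last'].tolist()

# A single .loc call addresses rows and column together and updates `toy` itself.
toy.loc[mask, 'last'] = 'Smith'
after_loc_edit = toy['last'].tolist()

print(after_copy_edit)
print(after_loc_edit)
```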


In [90]:
# Change all emails to lowercase.

# First way: just assign the column to its lowercase version.
ppl_df['email'] = ppl_df['email'].str.lower()
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,janedoe@gmail.com
2,John,Smith,johndoe@email.com


However, we could also use one of the following methods:
- **map**: transforms every value in a series, using either a function or a dictionary. *When a dictionary is passed, any value that is not one of its keys becomes NaN*. Used here on series.
- **replace**: substitutes values according to a dictionary (no function is called). *Values that are not among its keys are left as they were*. Works on both series and dataframes.
- **apply**: calls a function for every value in a series, or for every series (each column by default, each row with `axis='columns'`) in a dataframe.
- **applymap**: calls a function for every value in a dataframe. ONLY WORKS ON DATAFRAMES. In pandas 2.1 and later it is deprecated in favour of `DataFrame.map`.
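
The practical difference between `map` and `replace` shows up when a dictionary does not cover every value. A minimal sketch on a toy series (the names are illustrative):

```python
import pandas as pd

names = pd.Series(['Jane', 'John', 'Simo'])
mapping = {'Jane': 'Jessy'}

# map: values without a matching key turn into NaN.
mapped = names.map(mapping)

# replace: values without a matching key are kept as they are.
replaced = names.replace(mapping)

# One way to get replace-like behaviour out of map: fall back to the original values.
mapped_keep = names.map(mapping).fillna(names)

print(mapped.tolist())
print(replaced.tolist())
print(mapped_keep.tolist())
```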

In [91]:
ppl_df['email'].apply(len)

0    23
1    17
2    17
Name: email, dtype: int64

In [92]:
ppl_df['email'].apply(lambda email: email.upper())

0    S.E.HRISTOV99@GMAIL.COM
1          JANEDOE@GMAIL.COM
2          JOHNDOE@EMAIL.COM
Name: email, dtype: object

In [93]:
ppl_df['email'] = ppl_df['email'].apply(lambda email: email.upper())
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,S.E.HRISTOV99@GMAIL.COM
1,Jane,Doe,JANEDOE@GMAIL.COM
2,John,Smith,JOHNDOE@EMAIL.COM


In [94]:
# The examples above are unsurprising, but note what apply does when called on the whole dataframe:
# by default (axis=0) it passes each column (a Series object) to the function.
ppl_df.apply(len)  # Each entry is the number of rows, i.e. the same as len(ppl_df['email'])

first    3
last     3
email    3
dtype: int64

In [95]:
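# With axis='columns' the function receives each row instead, so len returns the number of columns.
ppl_df.apply(len, axis='columns')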

0    3
1    3
2    3
dtype: int64
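
In the notebook's frame both results happen to be 3, which hides the difference between the two axes. The sketch below uses a toy frame with three rows and two columns (made-up names) so the two calls give different numbers.

```python
import pandas as pd

toy = pd.DataFrame({
    'first': ['Jane', 'John', 'Simo'],
    'last': ['Doe', 'Smith', 'Last'],
})

# Default axis: the function sees one column at a time, so len() is the row count.
per_column = toy.apply(len)

# axis='columns': the function sees one row at a time, so len() is the column count.
per_row = toy.apply(len, axis='columns')

# A row-wise function can combine values from several columns.
full_names = toy.apply(lambda row: row['first'] + ' ' + row['last'], axis='columns')

print(per_column.tolist())
print(per_row.tolist())
print(full_names.tolist())
```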

In [96]:
# Return the lexicographically smallest element for every column
ppl_df.apply(pd.Series.min)

first                 Jane
last                   Doe
email    JANEDOE@GMAIL.COM
dtype: object

In [97]:
# Get the length of each string in the dataframe.
ppl_df.applymap(len)

Unnamed: 0,first,last,email
0,9,8,23
1,4,3,17
2,4,5,17


In [98]:
# Lowercase every string in the dataframe.
ppl_df.applymap(str.lower)

Unnamed: 0,first,last,email
0,simofirst,simolast,s.e.hristov99@gmail.com
1,jane,doe,janedoe@gmail.com
2,john,smith,johndoe@email.com


In [99]:
# Get the length of every first name.
ppl_df['first'].map(len)

0    9
1    4
2    4
Name: first, dtype: int64

> **Note**: These methods return new objects. If you don't assign the result back with the assignment operator (`=`), the dataframe stays unchanged.

In [100]:
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,S.E.HRISTOV99@GMAIL.COM
1,Jane,Doe,JANEDOE@GMAIL.COM
2,John,Smith,JOHNDOE@EMAIL.COM


In [101]:
# Substitute certain values. With map, values missing from the dictionary (here 'John') become NaN.
ppl_df['first'] = ppl_df['first'].map({
    'SimoFirst': 'firstsimo',
    'Jane': 'Jessy',
})
ppl_df

Unnamed: 0,first,last,email
0,firstsimo,SimoLast,S.E.HRISTOV99@GMAIL.COM
1,Jessy,Doe,JANEDOE@GMAIL.COM
2,,Smith,JOHNDOE@EMAIL.COM


In [102]:
# With replace, values missing from the dictionary (here 'Doe') are left as they are.
ppl_df['last'] = ppl_df['last'].replace({
    'SimoLast': 'LastSimo',
    'Smith': 'Bobson',
})
ppl_df

Unnamed: 0,first,last,email
0,firstsimo,LastSimo,S.E.HRISTOV99@GMAIL.COM
1,Jessy,Doe,JANEDOE@GMAIL.COM
2,,Bobson,JOHNDOE@EMAIL.COM


In [103]:
# Rename the column that holds the yearly converted salary to `SalaryUSD`.
df.head()

Unnamed: 0_level_0,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
ResponseId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1
1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7.0,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17.0,10.0,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,


In [104]:
df = df.rename(columns={
    'ConvertedCompYearly': 'SalaryUSD'
})
df.head()

Unnamed: 0_level_0,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,SalaryUSD
ResponseId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1
1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7.0,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17.0,10.0,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,


In [105]:
df['SalaryUSD']

ResponseId
1         62268.0
2             NaN
3             NaN
4             NaN
5             NaN
           ...   
83435    160500.0
83436      3960.0
83437     90000.0
83438    816816.0
83439     21168.0
Name: SalaryUSD, Length: 83439, dtype: float64

In [106]:
# Convert every 'Yes' to True and every 'No' value to False.
# Any other answer (e.g. "Not sure/can't remember") is not in the dictionary and becomes NaN.
df['SOAccount'] = df['SOAccount'].map({'Yes': True, 'No': False})

In [107]:
df.head()

Unnamed: 0_level_0,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,SalaryUSD
ResponseId,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1
1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,True,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7.0,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,True,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,True,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,True,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17.0,10.0,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,True,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
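
One caveat about `Series.map` with a dictionary, as used for `SOAccount` above: any value that is not a key of the dictionary becomes `NaN` instead of being kept. A minimal sketch on a made-up series (the answers below are illustrative and are not read from the survey file):

```python
import pandas as pd

# Hypothetical answers, including one the mapping does not cover and one missing value.
answers = pd.Series(['Yes', 'No', 'Not sure', None])

mapped = answers.map({'Yes': True, 'No': False})
print(mapped)

# Both the uncovered answer and the missing value end up as NaN.
n_missing = mapped.isna().sum()
print(n_missing)
```

If other answers need to be preserved, they have to be added to the dictionary explicitly.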


## Add/Remove Rows and Columns From DataFrames

In [108]:
ppl_df = pd.DataFrame(people)
ppl_df

Unnamed: 0,first,last,email
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com
1,Jane,Doe,JaneDoe@gmail.com
2,John,Doe,JohnDoe@gmail.com


In [109]:
# Add a new column called `name` which is the concatenation of the `first` and `last` columns.
ppl_df['name'] = ppl_df['first'] + ' ' + ppl_df['last']  # `ppl_df.name = ...` will not work: attribute assignment does not create a new column
ppl_df

Unnamed: 0,first,last,email,name
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,SimoFirst SimoLast
1,Jane,Doe,JaneDoe@gmail.com,Jane Doe
2,John,Doe,JohnDoe@gmail.com,John Doe
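
A short sketch of why bracket notation is needed when creating a column. It uses a made-up frame and a hypothetical column name, `full_name`; pandas normally emits a warning on the attribute assignment, which is silenced here only to keep the output tidy:

```python
import warnings

import pandas as pd

# Made-up frame used only for this illustration.
demo = pd.DataFrame({'first': ['Jane', 'John'], 'last': ['Doe', 'Doe']})

with warnings.catch_warnings():
    # pandas warns when a list-like value is assigned to a new attribute name.
    warnings.simplefilter('ignore')
    demo.full_name = demo['first'] + ' ' + demo['last']

# The attribute was set on the object, but no column was added.
before = 'full_name' in demo.columns
print(before)

# Bracket assignment is what actually adds the column.
demo['full_name'] = demo['first'] + ' ' + demo['last']
after = 'full_name' in demo.columns
print(after)
```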


In [110]:
# Remove the `first` and `last` columns.
ppl_df = ppl_df.drop(['first', 'last'], axis=1)
ppl_df

Unnamed: 0,email,name
0,s.e.hristov99@gmail.com,SimoFirst SimoLast
1,JaneDoe@gmail.com,Jane Doe
2,JohnDoe@gmail.com,John Doe


In [111]:
# Split the `name` column into two new columns.
ppl_df['name'].str.split()

0    [SimoFirst, SimoLast]
1              [Jane, Doe]
2              [John, Doe]
Name: name, dtype: object

In [112]:
ppl_df['name'].str.split(expand=True)

Unnamed: 0,0,1
0,SimoFirst,SimoLast
1,Jane,Doe
2,John,Doe


In [113]:
ppl_df[['first', 'last']] = ppl_df['name'].str.split(expand=True)
ppl_df

Unnamed: 0,email,name,first,last
0,s.e.hristov99@gmail.com,SimoFirst SimoLast,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane Doe,Jane,Doe
2,JohnDoe@gmail.com,John Doe,John,Doe


In [114]:
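# Add a row of data. `DataFrame.append` was removed in pandas 2.0, so `pd.concat` is used instead.
pd.concat([ppl_df, pd.DataFrame([{'first': 'Elon'}])], ignore_index=True)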

Unnamed: 0,email,name,first,last
0,s.e.hristov99@gmail.com,SimoFirst SimoLast,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane Doe,Jane,Doe
2,JohnDoe@gmail.com,John Doe,John,Doe
3,,,Elon,


In [115]:
# Note again that this actually returns a new dataframe and doesn't change the current one.
ppl_df

Unnamed: 0,email,name,first,last
0,s.e.hristov99@gmail.com,SimoFirst SimoLast,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane Doe,Jane,Doe
2,JohnDoe@gmail.com,John Doe,John,Doe


In [116]:
# Concatenate dataframes.
people = {
  'first': ['Bob', 'John'],
  'last': ['Bobson', 'Johnson'],
  'email': ['bob@email.com', 'john@email.com']
}
ppl_df2 = pd.DataFrame(people)
# Note that the columns of `ppl_df2` differ from those of `ppl_df` in order and in number. The index labels also overlap.
ppl_df2

Unnamed: 0,first,last,email
0,Bob,Bobson,bob@email.com
1,John,Johnson,john@email.com


In [117]:
ppl_df = pd.concat([ppl_df, ppl_df2], ignore_index=True)  # `DataFrame.append` was removed in pandas 2.0
ppl_df

Unnamed: 0,email,name,first,last
0,s.e.hristov99@gmail.com,SimoFirst SimoLast,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane Doe,Jane,Doe
2,JohnDoe@gmail.com,John Doe,John,Doe
3,bob@email.com,,Bob,Bobson
4,john@email.com,,John,Johnson


In [118]:
# Remove rows by index.
ppl_df.drop(index=3)

Unnamed: 0,email,name,first,last
0,s.e.hristov99@gmail.com,SimoFirst SimoLast,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane Doe,Jane,Doe
2,JohnDoe@gmail.com,John Doe,John,Doe
4,john@email.com,,John,Johnson


In [119]:
# Remove rows by a predicate.

# Remove all rows that have last name `Doe`.
pred = (ppl_df['last'] == 'Doe')
ppl_df.drop(index=ppl_df[pred].index)

Unnamed: 0,email,name,first,last
0,s.e.hristov99@gmail.com,SimoFirst SimoLast,SimoFirst,SimoLast
3,bob@email.com,,Bob,Bobson
4,john@email.com,,John,Johnson
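
The same rows can also be removed by keeping everything that does not match the predicate. A minimal sketch on a made-up frame, comparing the `drop` approach with boolean-mask filtering:

```python
import pandas as pd

# Made-up frame used only for this illustration.
demo = pd.DataFrame({
    'first': ['Jane', 'John', 'Bob'],
    'last': ['Doe', 'Doe', 'Bobson'],
})

pred = (demo['last'] == 'Doe')

# Approach from the cell above: look up the matching index labels and drop them.
dropped = demo.drop(index=demo[pred].index)

# Alternative: negate the mask with ~ and keep the remaining rows.
kept = demo[~pred]

print(dropped.equals(kept))
```

The two agree here because the index labels are unique. Since `drop` works on labels, it would remove every row sharing a dropped label if the index contained duplicates, whereas the mask works row by row.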


## Sorting Data

In [120]:
# Add a new entry `Adam`.
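# `DataFrame.append` was removed in pandas 2.0, so wrap the new row in a DataFrame and concatenate.
ppl_df = pd.concat([ppl_df, pd.DataFrame([{
    'first': 'Adam',
    'last': 'Doe',
    'email': 'a@email.com',
}])], ignore_index=True)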
ppl_df

Unnamed: 0,email,name,first,last
0,s.e.hristov99@gmail.com,SimoFirst SimoLast,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane Doe,Jane,Doe
2,JohnDoe@gmail.com,John Doe,John,Doe
3,bob@email.com,,Bob,Bobson
4,john@email.com,,John,Johnson
5,a@email.com,,Adam,Doe


In [121]:
# Drop the name column.
ppl_df = ppl_df.drop(['name'], axis=1)
ppl_df

Unnamed: 0,email,first,last
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe
3,bob@email.com,Bob,Bobson
4,john@email.com,John,Johnson
5,a@email.com,Adam,Doe


In [122]:
# Sort values by last name in ascending order.
ppl_df.sort_values(by='last')

Unnamed: 0,email,first,last
3,bob@email.com,Bob,Bobson
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe
5,a@email.com,Adam,Doe
4,john@email.com,John,Johnson
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast


In [123]:
# Sort values by last name in descending order.
ppl_df.sort_values(by='last', ascending=False)

Unnamed: 0,email,first,last
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast
4,john@email.com,John,Johnson
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe
5,a@email.com,Adam,Doe
3,bob@email.com,Bob,Bobson


In [124]:
# Sort by multiple columns.
ppl_df.sort_values(by=['last', 'first'], ascending=False)

Unnamed: 0,email,first,last
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast
4,john@email.com,John,Johnson
2,JohnDoe@gmail.com,John,Doe
1,JaneDoe@gmail.com,Jane,Doe
5,a@email.com,Adam,Doe
3,bob@email.com,Bob,Bobson


In [125]:
# Sort the last name in descending order but sort the first name in ascending order.
ppl_df.sort_values(by=['last', 'first'], ascending=[False, True])

Unnamed: 0,email,first,last
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast
4,john@email.com,John,Johnson
5,a@email.com,Adam,Doe
1,JaneDoe@gmail.com,Jane,Doe
2,JohnDoe@gmail.com,John,Doe
3,bob@email.com,Bob,Bobson


In [126]:
# Notice how the index labels are now out of order. Pass `ignore_index=True` to renumber the rows from 0 after sorting.
ppl_df = ppl_df.sort_values(by=['last', 'first'], ascending=[False, True], ignore_index=True)
ppl_df

Unnamed: 0,email,first,last
0,s.e.hristov99@gmail.com,SimoFirst,SimoLast
1,john@email.com,John,Johnson
2,a@email.com,Adam,Doe
3,JaneDoe@gmail.com,Jane,Doe
4,JohnDoe@gmail.com,John,Doe
5,bob@email.com,Bob,Bobson
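
String columns are compared character by character, so uppercase letters sort before lowercase ones by default. A minimal sketch, on a made-up frame, of a case-insensitive sort using the `key` parameter of `sort_values` (available in pandas 1.1 and later):

```python
import pandas as pd

# Made-up emails with mixed capitalisation.
demo = pd.DataFrame({'email': ['bob@email.com', 'JaneDoe@gmail.com', 'a@email.com']})

# Default ordering: 'J' sorts before any lowercase letter.
default_order = demo.sort_values(by='email')['email'].tolist()
print(default_order)

# `key` receives the whole column and must return a column of the same length.
case_insensitive = demo.sort_values(
    by='email', key=lambda col: col.str.lower()
)['email'].tolist()
print(case_insensitive)
```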


In [127]:
# We can also sort series objects.
ppl_df['last'].sort_values()

5      Bobson
2         Doe
3         Doe
4         Doe
1     Johnson
0    SimoLast
Name: last, dtype: object

In [128]:
# Sort the survey results by country name.
df = df.sort_values(by='Country')
df

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,SalaryUSD
37432,I am a student who is learning to code,"Student, part-time",Afghanistan,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",Younger than 5 years,Books / Physical media,,,,,,,,C++;Python,,Oracle,,Microsoft Azure,,Django,,Qt;TensorFlow,,Unity 3D,,Notepad++;PyCharm;Visual Studio Code,,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange;Stack Overflow f...,A few times per month or weekly,True,A few times per month or weekly,"No, not at all",No,Under 18 years old,"Or, in your own words:","Or, in your own words:",Straight / Heterosexual;Bisexual;Prefer to sel...,White or of European descent;I don't know;Mult...,"Or, in your own words:","Or, in your own words:",Too long,Difficult,
11462,I am a developer by profession,"Not employed, but looking for work",Afghanistan,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,,"Developer, mobile;Developer, front-end;Develop...",,,,,C#;HTML/CSS;SQL;TypeScript,C#;HTML/CSS;Java;JavaScript;SQL;TypeScript,Firebase;Microsoft SQL Server;MongoDB;MySQL;SQ...,Firebase;Microsoft SQL Server;MySQL,Microsoft Azure,Microsoft Azure,Angular;ASP.NET Core,Angular;ASP.NET Core ;Spring,.NET Framework;.NET Core / .NET 5,.NET Framework;.NET Core / .NET 5,Git,Git,Eclipse;IntelliJ;Notepad++;Visual Studio;Visua...,Eclipse;IntelliJ;Notepad++;Visual Studio;Visua...,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Daily or almost daily,True,A few times per month or weekly,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Middle Eastern,None of the above,None of the above,Appropriate in length,Easy,
67131,I am a developer by profession,,Afghanistan,,,,,,,,,,AFN\tAfghan afghani,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
62971,"I am not primarily a developer, but I write co...","Not employed, but looking for work",Afghanistan,,,Some college/university study without earning ...,18 - 24 years,Books / Physical media,6,,"Developer, desktop or enterprise applications",,,,,C;C#;C++;Delphi;SQL,,Cassandra;Elasticsearch;SQLite,,AWS,,Angular;Angular.js;ASP.NET;ASP.NET Core ;Djang...,,.NET Framework;.NET Core / .NET 5;Apache Spark...,,Ansible;Chef;Deno;Docker,,Android Studio;Atom;Eclipse;IntelliJ;NetBeans,,Windows,Visit Stack Overflow,Stack Overflow,A few times per month or weekly,,,Not sure,No,25-34 years old,Man,No,Straight / Heterosexual,Multiracial;Indigenous (such as Native America...,None of the above,None of the above,Too long,Difficult,
14883,I am a developer by profession,Employed full-time,Afghanistan,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,8,4,"Developer, front-end;Developer, full-stack;Dev...",20 to 99 employees,AFN\tAfghan afghani,6500.0,Weekly,HTML/CSS;JavaScript;PHP;SQL,HTML/CSS;JavaScript;Node.js;PHP;SQL,Elasticsearch;Firebase;MongoDB;MySQL;Redis;SQLite,Elasticsearch;Firebase;MariaDB;MySQL,AWS;DigitalOcean;Heroku;IBM Cloud or Watson,AWS;DigitalOcean;Heroku,Angular;Angular.js;Drupal;jQuery;Laravel;React.js,Angular;Angular.js;Drupal;jQuery;Laravel;React.js,.NET Framework,,Docker;Git,Docker;Git,Atom;Eclipse;NetBeans;Notepad++;PHPStorm;Subli...,Notepad++;Sublime Text;Visual Studio Code,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Daily or almost daily,True,A few times per month or weekly,"Yes, somewhat",Yes,25-34 years old,Man,No,,South Asian,None of the above,None of the above,Appropriate in length,Easy,4200.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13195,I am a developer by profession,Employed full-time,Zimbabwe,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc...",18,14,"Developer, back-end;Database administrator;Sys...",2 to 9 employees,USD\tUnited States dollar,,,Rust,Rust,,,,,,,,,Git,Git,Vim,Vim,Linux-based,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,A few times per week,True,I have never participated in Q&A on Stack Over...,"Yes, somewhat",Yes,35-44 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Too long,Easy,
80997,I code primarily as a hobby,"Not employed, and not looking for work",Zimbabwe,,,Primary/elementary school,11 - 17 years,"Other online resources (ex: videos, blogs, etc...",Less than 1 year,,,,,,,HTML/CSS;JavaScript;Node.js;Python,Assembly;C#;C++;Go;Java;Kotlin;Node.js;Python,,,,,React.js,Angular.js;Django;Express;jQuery,,,,,Notepad++;PyCharm;Visual Studio Code,Android Studio;PyCharm;Visual Studio Code,Windows,Watch help / tutorial videos;Do other work and...,Stack Overflow,A few times per month or weekly,True,I have never participated in Q&A on Stack Over...,"No, not really",No,Under 18 years old,Man,No,Straight / Heterosexual,Black or of African descent,I am blind / have difficulty seeing,None of the above,Appropriate in length,Easy,
73742,I am a developer by profession,"Student, part-time",Zimbabwe,,,Some college/university study without earning ...,11 - 17 years,"Other online resources (ex: videos, blogs, etc...",5,,,,,,,Bash/Shell;C#;Go;HTML/CSS;JavaScript;PowerShel...,,PostgreSQL;SQLite,,,,Angular;Django;jQuery;React.js,,NumPy;Pandas,,Git,,Android Studio;IPython/Jupyter;Sublime Text;Vi...,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per week,True,I have never participated in Q&A on Stack Over...,"Yes, definitely",Yes,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,
73193,I am a developer by profession,Employed full-time,Zimbabwe,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc...",5,1,"Developer, full-stack",10 to 19 employees,USD\tUnited States dollar,200.0,Monthly,C#,C#,,,,,ASP.NET;jQuery,ASP.NET;jQuery,.NET Framework,.NET Framework,Git,Git,Visual Studio;Visual Studio Code,Visual Studio;Visual Studio Code,Windows,Visit Stack Overflow;Google it,Stack Overflow,A few times per month or weekly,True,Less than once per month or monthly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,2400.0


In [129]:
# Preview the two columns we are about to sort: countries in ascending order and salaries in descending order.
df[['Country', 'SalaryUSD']]

ResponseId,Country,SalaryUSD
37432,Afghanistan,
11462,Afghanistan,
67131,Afghanistan,
62971,Afghanistan,
14883,Afghanistan,4200.0
...,...,...
13195,Zimbabwe,
80997,Zimbabwe,
73742,Zimbabwe,
73193,Zimbabwe,2400.0


In [130]:
df = df.sort_values(by=['Country', 'SalaryUSD'], ascending=[True, False])
df[['Country', 'SalaryUSD']]

ResponseId,Country,SalaryUSD
65400,Afghanistan,30468516.0
22667,Afghanistan,155496.0
27199,Afghanistan,51804.0
31087,Afghanistan,23964.0
44641,Afghanistan,15132.0
...,...,...
56303,Zimbabwe,
13195,Zimbabwe,
80997,Zimbabwe,
73742,Zimbabwe,
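
In the output above the missing salaries appear at the end of each country group even though the salaries are sorted in descending order. That is the default placement of `NaN` values in `sort_values`; it can be changed with the `na_position` parameter. A minimal sketch on made-up data (the country names and salaries are illustrative only):

```python
import numpy as np
import pandas as pd

# Made-up data with one missing salary.
demo = pd.DataFrame({
    'Country': ['B', 'A', 'A', 'A'],
    'SalaryUSD': [50.0, 100.0, np.nan, 300.0],
})

# Default: missing values are placed last, regardless of the sort direction.
nan_last = demo.sort_values(by=['Country', 'SalaryUSD'], ascending=[True, False])
print(nan_last)

# na_position='first' moves them to the front instead.
nan_first = demo.sort_values(
    by=['Country', 'SalaryUSD'], ascending=[True, False], na_position='first'
)
print(nan_first)
```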


In [131]:
# Get the 10 highest salaries from the survey.
df['SalaryUSD'].nlargest(10)

ResponseId
66911    45241312.0
65400    30468516.0
40587    21822250.0
28792    20000000.0
12701    19200000.0
9609     17500000.0
5306     15000000.0
12904    14411628.0
66489    12750000.0
47564    12500000.0
Name: SalaryUSD, dtype: float64

In [132]:
# Get the entries with the 10 highest salaries.
df.nlargest(10, 'SalaryUSD')

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,SalaryUSD
66911,I am a developer by profession,I prefer not to say,Belgium,,,Primary/elementary school,55 - 64 years,Books / Physical media,1,21,,"Just me - I am a freelancer, sole proprietor, ...",ALL\tAlbanian lek,5123468000.0,Yearly,APL,,Couchbase,,AWS,,,,,,,,IntelliJ;PyCharm,,Other (please specify):,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow for Teams (private knowledge sh...,Less than once per month or monthly,True,I have never participated in Q&A on Stack Over...,"No, not at all",Yes,55-64 years old,,Yes,,Black or of African descent,I am blind / have difficulty seeing,I have a concentration and/or memory disorder ...,Too short,,45241312.0
65400,I am a developer by profession,Employed full-time,Afghanistan,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",25 - 34 years,Online Courses or Certification,7,3,"Developer, desktop or enterprise applications;...",500 to 999 employees,ANG Netherlands Antillean guilder,4544242.0,Monthly,Bash/Shell;C++;Elixir;LISP;Node.js;Ruby;Rust;S...,Clojure;Crystal;Dart;Matlab;Node.js;Python;R,Couchbase;Elasticsearch;MariaDB,MySQL;Oracle,AWS;IBM Cloud or Watson,Oracle Cloud Infrastructure,ASP.NET;Drupal;Laravel;Svelte,Angular;ASP.NET;Django;Express;Flask;jQuery;Ru...,.NET Framework;Apache Spark;Hadoop;Pandas,Flutter;Pandas;Qt,Docker;Flow;Terraform,Chef;Docker;Git,NetBeans;PHPStorm;RubyMine,Eclipse;NetBeans,Linux-based,Watch help / tutorial videos;Play games,Stack Overflow;Stack Overflow for Teams (priva...,A few times per month or weekly,,,"No, not at all",Yes,35-44 years old,"Man;Non-binary, genderqueer, or gender non-con...",Yes,Bisexual;Prefer to self-describe:;Gay or Lesbi...,South Asian;Hispanic or Latino/a/x;Black or of...,I am unable to / find it difficult to walk or ...,I have a mood or emotional disorder (e.g. depr...,Too short,Difficult,30468516.0
40587,I am a developer by profession,Employed full-time,United States of America,California,,Some college/university study without earning ...,Younger than 5 years,"Other online resources (ex: videos, blogs, etc...",37,13,Other (please specify):,"10,000 or more employees",USD\tUnited States dollar,436445.0,Weekly,Bash/Shell;HTML/CSS;Java;JavaScript;Node.js;Py...,Bash/Shell;Haskell;HTML/CSS;Java;JavaScript;No...,MariaDB;MySQL;SQLite,MariaDB;MySQL;SQLite,Microsoft Azure,Microsoft Azure,Flask,Express;Flask;React.js;Svelte,,,Git,Docker;Git,IntelliJ;Visual Studio Code,IntelliJ;Visual Studio;Visual Studio Code,MacOS,Visit Stack Overflow;Google it;Do other work a...,Stack Overflow;Stack Exchange;Stack Overflow f...,Multiple times per day,True,Daily or almost daily,"Yes, somewhat",Yes,35-44 years old,Man,No,Bisexual,White or of European descent,I am blind / have difficulty seeing,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,21822250.0
28792,I am a developer by profession,Employed full-time,India,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",35 - 44 years,Coding Bootcamp;Other online resources (ex: vi...,11,9,"Developer, desktop or enterprise applications;...",100 to 499 employees,USD\tUnited States dollar,20000000.0,Yearly,C#;C++;Groovy;HTML/CSS;Java;JavaScript;Node.js...,Ruby;Swift,Microsoft SQL Server;MySQL;Oracle;PostgreSQL;S...,MongoDB,AWS;DigitalOcean;Heroku,IBM Cloud or Watson;Microsoft Azure,Angular.js;Drupal;Laravel;Ruby on Rails;Symfony,Angular;ASP.NET,Apache Spark;Cordova;Hadoop,.NET Framework,Git,Chef;Docker;Kubernetes;Puppet,Android Studio;Atom;Eclipse;IPython/Jupyter;Ne...,Visual Studio Code,MacOS,Visit Stack Overflow;Google it;Visit another d...,Stack Overflow;Stack Exchange;Stack Overflow f...,A few times per month or weekly,True,A few times per month or weekly,Neutral,Yes,35-44 years old,Man,No,Prefer not to say,Prefer not to say,Prefer not to say,None of the above,Appropriate in length,Easy,20000000.0
12701,I am a developer by profession,Employed full-time,India,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",25 - 34 years,Online Forum,5,5,"Developer, full-stack",100 to 499 employees,USD\tUnited States dollar,1600000.0,Monthly,JavaScript,JavaScript,MongoDB;MySQL;PostgreSQL,MongoDB;MySQL;PostgreSQL,AWS;Heroku,AWS;Heroku,React.js,React.js,,,Git,Git,Eclipse;Visual Studio Code,Eclipse;Visual Studio Code,Windows,Visit Stack Overflow;Google it;Visit another d...,Stack Exchange,A few times per week,True,A few times per week,Neutral,Yes,35-44 years old,Man,No,Straight / Heterosexual,South Asian,None of the above,None of the above,Appropriate in length,Easy,19200000.0
9609,"I am not primarily a developer, but I write co...",Employed full-time,United States of America,New York,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,School,25,21,"Senior Executive (C-Suite, VP, etc.)",20 to 99 employees,USD\tUnited States dollar,350000.0,Weekly,Bash/Shell;Go;HTML/CSS;Java;JavaScript;Kotlin;...,,Elasticsearch;IBM DB2;MariaDB;MongoDB;MySQL;Or...,,AWS;DigitalOcean,,Angular;jQuery;Spring,,,,Git;Kubernetes,,Android Studio;IntelliJ;Notepad++;PHPStorm;Vim...,,Windows,Call a coworker or friend;Google it,Stack Overflow,Less than once per month or monthly,False,,Neutral,Yes,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,17500000.0
5306,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United States of America,New Jersey,,"Professional degree (JD, MD, etc.)",11 - 17 years,Friend or family member;Books / Physical media,25,5,"Developer, full-stack;DevOps specialist","Just me - I am a freelancer, sole proprietor, ...",USD\tUnited States dollar,300000.0,Weekly,Python,Python;Rust;SQL,,,AWS;Google Cloud Platform,,Flask,Django,,,Ansible;Docker;Git;Terraform,Docker;Git,Vim,Vim,Linux-based,Other (please specify):,Stack Overflow;Stack Exchange,A few times per week,True,Less than once per month or monthly,"Yes, somewhat",Yes,35-44 years old,Man,No,,White or of European descent,None of the above,None of the above,Too long,Easy,15000000.0
12904,I am a developer by profession,Employed full-time,Serbia,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,1,"Developer, front-end;Developer, full-stack;Dev...",100 to 499 employees,EUR European Euro,1111000.0,Monthly,C#;JavaScript;Node.js;Python,C#;Python,MySQL;PostgreSQL,MySQL;PostgreSQL,,,Angular;ASP.NET;ASP.NET Core ;Django;jQuery;Re...,Angular;ASP.NET Core ;Django;React.js,.NET Framework;.NET Core / .NET 5,.NET Core / .NET 5,Docker;Git,Docker;Git,PyCharm;Visual Studio;Visual Studio Code,PyCharm;Visual Studio;Visual Studio Code,Windows,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Daily or almost daily,False,,Neutral,No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Too long,Neither easy nor difficult,14411628.0
66489,I am a developer by profession,Employed full-time,United States of America,Washington,,Some college/university study without earning ...,18 - 24 years,School;Online Courses or Certification;Books /...,5,3,"Developer, desktop or enterprise applications;...",20 to 99 employees,USD\tUnited States dollar,255000.0,Weekly,Bash/Shell;Go;HTML/CSS;Java;JavaScript;Kotlin;...,Node.js;SQL;TypeScript,DynamoDB;Elasticsearch;Firebase;MySQL;PostgreS...,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,Express;React.js;Spring,Express;React.js,NumPy;Pandas,NumPy;Pandas,Docker;Git;Yarn,Git,Atom;IntelliJ;PyCharm;Sublime Text;Visual Stud...,Visual Studio Code,MacOS,Call a coworker or friend;Google it;Meditate,Stack Overflow;Stack Exchange,Daily or almost daily,True,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,Prefer not to say,Prefer not to say,Prefer not to say,Appropriate in length,Easy,12750000.0
47564,I am a developer by profession,Employed full-time,United States of America,Colorado,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Books / Physical media,24,13,"Developer, front-end","10,000 or more employees",USD\tUnited States dollar,250000.0,Weekly,Go;HTML/CSS;JavaScript;Python;TypeScript,HTML/CSS;JavaScript;Python;Rust;TypeScript,PostgreSQL,PostgreSQL,AWS,,React.js;Vue.js,React.js;Vue.js,,,Docker;Git;Kubernetes;Yarn,Git;Yarn,Visual Studio Code,Visual Studio Code,MacOS,Go for a walk or other physical activity;Googl...,Stack Overflow,A few times per month or weekly,True,I have never participated in Q&A on Stack Over...,Neutral,No,35-44 years old,Prefer not to say,Prefer not to say,Prefer not to say,Prefer not to say,None of the above,I have an anxiety disorder,Appropriate in length,Neither easy nor difficult,12500000.0
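
For intuition, `nlargest` returns the same rows as sorting in descending order and taking the first `n`, at least when there are no ties in the column. A minimal sketch on made-up data (the index labels and salaries are illustrative only):

```python
import numpy as np
import pandas as pd

# Made-up salaries with one missing value.
demo = pd.DataFrame(
    {'SalaryUSD': [10.0, np.nan, 30.0, 20.0]},
    index=[101, 102, 103, 104],
)

top2 = demo.nlargest(2, 'SalaryUSD')
top2_sorted = demo.sort_values(by='SalaryUSD', ascending=False).head(2)

print(top2)
print(top2.equals(top2_sorted))
```

`nlargest` states the intent more directly and does not need to sort the whole frame, which may matter on a table the size of the survey.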


In [133]:
# Get the entries with the 10 lowest salaries.
df.nsmallest(10, 'SalaryUSD')

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,SalaryUSD
1925,I am a developer by profession,Employed full-time,United States of America,Pennsylvania,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,School;Other (please specify):,30,25,"Developer, mobile;Developer, front-end;Develop...","1,000 to 4,999 employees",USD\tUnited States dollar,1.0,Yearly,C#;HTML/CSS;JavaScript;PowerShell;SQL,C#;HTML/CSS;JavaScript;PowerShell;SQL,Microsoft SQL Server,Microsoft SQL Server,AWS;Microsoft Azure,,Angular;Angular.js;ASP.NET;ASP.NET Core ;jQuery,Angular;Angular.js;ASP.NET;ASP.NET Core ;jQuery,.NET Framework;.NET Core / .NET 5,.NET Framework;.NET Core / .NET 5,Docker;Git;Kubernetes,Docker;Git;Kubernetes,Notepad++;Visual Studio;Visual Studio Code,Notepad++;Visual Studio;Visual Studio Code,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,True,A few times per week,"Yes, definitely",No,45-54 years old,Man,No,,White or of European descent,None of the above,None of the above,Too long,Neither easy nor difficult,1.0
46465,"I am not primarily a developer, but I write co...",Employed full-time,United States of America,Washington,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Coding Bootcamp;Other online resources (ex: vi...,27,12,"Engineer, data;Data scientist or machine learn...",10 to 19 employees,USD\tUnited States dollar,1.0,Yearly,Perl;Python,JavaScript;Node.js;Python,SQLite,,Google Cloud Platform,,Flask,Flask,Flutter,,Git,Git,Android Studio;IntelliJ;PyCharm;Sublime Text,PyCharm;Sublime Text,MacOS,Go for a walk or other physical activity;Do ot...,Stack Overflow,Less than once per month or monthly,False,,"No, not at all",No,35-44 years old,Man,Prefer not to say,Prefer not to say,Prefer not to say,Prefer not to say,Prefer not to say,Too long,Easy,1.0
24286,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United States of America,California,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Books / Physical media,16,9,Other (please specify):;Data scientist or mach...,2 to 9 employees,USD\tUnited States dollar,1.0,Yearly,Python;SQL,Python,PostgreSQL;Redis,PostgreSQL;Redis,AWS,AWS,Django;Flask,Django;Flask,NumPy;Pandas;TensorFlow,NumPy;Pandas;TensorFlow,Docker;Git;Terraform,Docker;Git;Terraform,IPython/Jupyter;PyCharm,IPython/Jupyter;PyCharm,MacOS,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,A few times per week,True,A few times per week,"Yes, somewhat",Yes,35-44 years old,Man,No,Prefer to self-describe:,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,1.0
15164,I am a developer by profession,Employed full-time,China,,,Primary/elementary school,Younger than 5 years,Books / Physical media,5,1,Academic researcher,2 to 9 employees,CNY\tChinese Yuan Renminbi,11.0,Yearly,Objective-C,,MySQL,,Microsoft Azure,,Vue.js,,,,,,,,MacOS,Google it,Stack Overflow for Teams (private knowledge sh...,Less than once per month or monthly,True,I have never participated in Q&A on Stack Over...,"No, not at all",Yes,35-44 years old,Woman,Yes,Bisexual,Black or of African descent,I am blind / have difficulty seeing,I have a concentration and/or memory disorder ...,Too short,Easy,2.0
56791,I am a developer by profession,Employed full-time,Taiwan,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,School,12,2,"Developer, full-stack;Database administrator;S...",100 to 499 employees,TWD\tNew Taiwan dollar,50.0,Yearly,HTML/CSS;JavaScript;Node.js;Python;SQL,,MySQL;PostgreSQL;SQLite,,,,Django;jQuery,,,,Docker;Git,,PyCharm,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,True,Less than once per month or monthly,"No, not really",No,25-34 years old,Woman,No,Straight / Heterosexual,East Asian,None of the above,None of the above,Appropriate in length,Easy,2.0
22033,I am a developer by profession,Employed full-time,Republic of Korea,,,"Associate degree (A.A., A.S., etc.)",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,5,Less than 1 year,"Developer, front-end",20 to 99 employees,KRW\tSouth Korean won,3800.0,Yearly,JavaScript;TypeScript,JavaScript;TypeScript,MySQL;PostgreSQL;Redis,MySQL;PostgreSQL;Redis,AWS;Google Cloud Platform,AWS;Google Cloud Platform,React.js,React.js;Svelte;Vue.js,Hadoop;React Native,Flutter;Hadoop;React Native,Docker;Git;Kubernetes;Yarn,Deno;Docker;Git;Kubernetes;Yarn,Visual Studio Code;Webstorm,Visual Studio Code;Webstorm,MacOS,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow,Daily or almost daily,True,Less than once per month or monthly,"Yes, somewhat",No,25-34 years old,Man,No,,East Asian,None of the above,I have an anxiety disorder,Appropriate in length,Easy,3.0
59044,I am a developer by profession,Employed full-time,Taiwan,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc...",20,,"Developer, front-end",100 to 499 employees,TWD\tNew Taiwan dollar,90.0,Yearly,HTML/CSS;JavaScript;Node.js;PowerShell;TypeScript,HTML/CSS;JavaScript;Node.js;TypeScript,Elasticsearch;Firebase;MongoDB;MySQL;Redis,Firebase,DigitalOcean;Microsoft Azure,,Angular;Express,Angular,,,Ansible;Docker;Git,Docker;Git,IntelliJ;Notepad++;Vim;Visual Studio Code;Webs...,IntelliJ;Notepad++;Vim;Visual Studio Code;Webs...,MacOS,Go for a walk or other physical activity;Googl...,Stack Overflow,Multiple times per day,True,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,25-34 years old,Man,No,Prefer not to say,Southeast Asian,None of the above,None of the above,Appropriate in length,Easy,3.0
22454,I am a developer by profession,Employed full-time,Japan,,,"Secondary school (e.g. American high school, G...",5 - 10 years,"Other online resources (ex: videos, blogs, etc)",Less than 1 year,6,"Developer, full-stack",20 to 99 employees,JPY\tJapanese yen,450.0,Yearly,C#;HTML/CSS;SQL;TypeScript,C#;HTML/CSS;SQL;TypeScript,Oracle,Oracle,AWS;Microsoft Azure,AWS;Microsoft Azure,Angular;ASP.NET Core,Angular;ASP.NET Core,.NET Core / .NET 5,.NET Core / .NET 5,,,Visual Studio;Visual Studio Code,Visual Studio;Visual Studio Code,Windows,Call a coworker or friend;Meditate,Stack Overflow,A few times per month or weekly,,,"No, not at all",No,35-44 years old,Man,No,,East Asian,None of the above,,Appropriate in length,Easy,4.0
45142,"I am not primarily a developer, but I write co...",Employed full-time,China,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",12,1,Scientist,"1,000 to 4,999 employees",CNY\tChinese Yuan Renminbi,35.0,Yearly,C++;Java;Python;R;Scala;SQL,C++;Python,,,,,,,Hadoop;NumPy;Pandas;Torch/PyTorch,NumPy;Pandas;Torch/PyTorch,Git,Git,IntelliJ;PyCharm;RStudio;Visual Studio;Visual ...,Visual Studio Code,Windows,Visit Stack Overflow;Google it;Do other work a...,Stack Overflow;Stack Exchange,Daily or almost daily,True,A few times per month or weekly,"Yes, somewhat",Yes,18-24 years old,Man,No,Straight / Heterosexual,East Asian,None of the above,I have a mood or emotional disorder (e.g. depr...,Too long,Easy,5.0
53307,I am a developer by profession,Employed full-time,Japan,,,Some college/university study without earning ...,11 - 17 years,"Other online resources (ex: videos, blogs, etc...",25,17,"Developer, mobile;Data scientist or machine le...","10,000 or more employees",JPY\tJapanese yen,600.0,Yearly,Bash/Shell;HTML/CSS;Java;JavaScript;Kotlin;Nod...,Bash/Shell;C#;HTML/CSS;Java;Kotlin;Python;SQL;...,Firebase;MySQL;PostgreSQL;SQLite,MariaDB;PostgreSQL;SQLite,AWS,AWS,Angular;Express;Flask;Laravel;Spring;Vue.js,Angular;ASP.NET Core ;FastAPI;Laravel,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,.NET Core / .NET 5;NumPy;Pandas;Torch/PyTorch,Docker;Git,Git;Unity 3D;Unreal Engine,Android Studio;IntelliJ;IPython/Jupyter;PHPSto...,Android Studio;IntelliJ;PHPStorm;PyCharm;Vim;V...,Linux-based,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Daily or almost daily,False,,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,East Asian,None of the above,None of the above,Appropriate in length,Easy,5.0


## Grouping and Aggregating - Analyzing and Exploring Your Data

In [134]:
df = pd.read_csv(DATA_PATH)
df

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
0,1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
1,2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
2,3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
3,4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
4,5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17,10,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83434,83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83435,83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, full-stack","Just me - I am a freelancer, sole proprietor, ...",XOF\tWest African CFA franc,200000.0,Monthly,,,Firebase;MariaDB;MySQL;PostgreSQL;Redis;SQLite,Firebase;MariaDB;MongoDB;MySQL;PostgreSQL;Redi...,,,Django;jQuery;Laravel;React.js;Ruby on Rails,Django;Express;jQuery;Laravel;React.js;Ruby on...,Flutter;Qt,,Git;Unity 3D;Unreal Engine,Docker;Git;Kubernetes,Android Studio;Eclipse;Emacs;IntelliJ;NetBeans...,Emacs;IntelliJ;PHPStorm;PyCharm;RStudio;Sublim...,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83436,83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83437,83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,3,"Developer, back-end",20 to 99 employees,CAD\tCanadian dollar,90000.0,Monthly,Bash/Shell;JavaScript;Node.js;Python,Go;Rust,Cassandra;Elasticsearch;MongoDB;PostgreSQL;Redis,,Heroku,AWS;DigitalOcean,Django;Express;Flask;React.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,NumPy;Pandas;TensorFlow;Torch/PyTorch,Ansible;Docker;Git;Terraform,Kubernetes;Terraform,PyCharm;Sublime Text,,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per month or weekly,Yes,Less than once per month or monthly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0


In [135]:
df = df.rename(columns={
    'ConvertedCompYearly': 'SalaryUSD'
})

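> **Mean**: the sum of the values in a data set divided by how many there are (the average);

> **Mode**: the most frequently occurring value in a data set;

> **Median**: the middle value once the data set is sorted (for an even count, the average of the two middle values);

> **Aggregation**: taking multiple values and returning a single result.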
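
As a minimal sketch of these definitions, the cell below computes all three statistics on a small, made-up `salaries` Series (the values are hypothetical and not taken from the survey). It also suggests why the median is often preferred for salaries: a single extreme value pulls the mean up but leaves the median unchanged.

```python
import pandas as pd

# Hypothetical toy salaries (not survey data), including one extreme value.
salaries = pd.Series([30_000, 40_000, 40_000, 50_000, 1_000_000])

mean_salary = salaries.mean()          # sum of the values divided by their count
median_salary = salaries.median()      # middle value after sorting
mode_salary = salaries.mode().iloc[0]  # mode() returns a Series, since ties are possible

print(mean_salary, median_salary, mode_salary)
```

Each of these calls is an aggregation: many values go in and a single number comes out.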

In [136]:
# What is the median (middle) salary?
df['SalaryUSD'].median()

56211.0

In [137]:
df.median(numeric_only=True)

ResponseId    41720.0
CompTotal     67000.0
SalaryUSD     56211.0
dtype: float64

In [138]:
df.describe()

Unnamed: 0,ResponseId,CompTotal,SalaryUSD
count,83439.0,47183.0,46844.0
mean,41720.0,2.119407e+69,118426.2
std,24086.908893,4.603702e+71,527294.4
min,1.0,0.0,1.0
25%,20860.5,16000.0,27025.0
50%,41720.0,67000.0,56211.0
75%,62579.5,140000.0,100000.0
max,83439.0,1e+74,45241310.0


In [139]:
# How many people gave information about their salary?
df['SalaryUSD'].count()

46844

In [140]:
df['SalaryUSD'].count() / df.shape[0]

0.5614161243543188
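
The same share can be obtained without dividing by hand. The sketch below uses a hypothetical four-row frame named `toy` (not the survey data): `notna()` marks the non-missing values, and the mean of those booleans is the fraction of rows that have an answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: two respondents reported a salary, two did not.
toy = pd.DataFrame({'SalaryUSD': [50_000.0, np.nan, 70_000.0, np.nan]})

# Mean of a boolean Series = fraction of True values.
answered_share = toy['SalaryUSD'].notna().mean()

# Equivalent to the count-over-rows ratio used above.
ratio = toy['SalaryUSD'].count() / toy.shape[0]

print(answered_share, ratio)
```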

In [141]:
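# What percentage of the respondents have a Stack Overflow account?
# normalize=True gives shares among those who answered (missing values are excluded),
# unlike df['SOAccount'].value_counts() / df.shape[0], which divides by all rows.
df['SOAccount'].value_counts(normalize=True)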

Yes                        0.821727
No                         0.127355
Not sure/can't remember    0.050918
Name: SOAccount, dtype: float64
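
By default `value_counts` leaves out missing values, so the shares above are relative to respondents who answered the question. A minimal sketch on a hypothetical four-element Series (not the survey data) shows the difference when `dropna=False` is passed:

```python
import numpy as np
import pandas as pd

# Hypothetical answers: two 'Yes', one 'No', one missing.
so_account = pd.Series(['Yes', 'Yes', 'No', np.nan], name='SOAccount')

# Shares among those who answered (NaN excluded from the denominator).
answered_only = so_account.value_counts(normalize=True)

# Shares among all rows (NaN gets its own entry).
all_rows = so_account.value_counts(normalize=True, dropna=False)

print(answered_only)
print(all_rows)
```

Which denominator is appropriate depends on the question being asked, so it is worth stating it explicitly when reporting percentages.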

In [142]:
# What is the most popular OS for work?
schema_df.loc['OpSys', 'question']

'What is the primary operating system in which you work? *'

In [143]:
df['OpSys'].value_counts()

Windows                              37758
Linux-based                          21088
MacOS                                20984
Windows Subsystem for Linux (WSL)     2743
Other (please specify):                575
BSD                                    146
Name: OpSys, dtype: int64

In [144]:
# What do people do when they get stuck on a problem?
schema_df.loc['NEWStuck', 'question']

'What do you do when you get stuck on a problem? Select all that apply.'

In [145]:
df['NEWStuck'].value_counts().head(15)

Visit Stack Overflow;Google it                                                                                                                                      5094
Visit Stack Overflow;Google it;Watch help / tutorial videos                                                                                                         4483
Google it                                                                                                                                                           3404
Call a coworker or friend;Visit Stack Overflow;Google it                                                                                                            2905
Visit Stack Overflow;Google it;Do other work and come back later                                                                                                    2703
Call a coworker or friend;Visit Stack Overflow;Google it;Watch help / tutorial videos                                                                      
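
Because `NEWStuck` is a "select all that apply" question, `value_counts` above counts whole combinations of answers rather than individual options. One possible way to count each option separately is to split on `;` and explode the resulting lists. The sketch below runs on a hypothetical four-element Series named `stuck`, not on the survey data:

```python
import pandas as pd

# Hypothetical multi-select answers in the same ';'-separated format.
stuck = pd.Series([
    'Visit Stack Overflow;Google it',
    'Google it',
    'Call a coworker or friend;Visit Stack Overflow;Google it',
    None,
], name='NEWStuck')

option_counts = (
    stuck.dropna()        # ignore respondents who skipped the question
         .str.split(';')  # each answer becomes a list of options
         .explode()       # one row per selected option
         .value_counts()  # count each option on its own
)

print(option_counts)
```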

### **Grouping**: splitting into groups, applying a function, combining results.
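
To make the three steps concrete, here is a minimal sketch on a hypothetical five-row frame named `toy` (not the survey data). It performs split, apply and combine by hand, then does the same with `groupby`:

```python
import pandas as pd

# Hypothetical toy data: two countries with a few salaries each.
toy = pd.DataFrame({
    'Country': ['A', 'A', 'B', 'B', 'B'],
    'SalaryUSD': [10.0, 30.0, 5.0, 15.0, 100.0],
})

# By hand: split the rows per country, apply median, combine into one Series.
manual = {}
for country in toy['Country'].unique():
    group = toy[toy['Country'] == country]         # split
    manual[country] = group['SalaryUSD'].median()  # apply
manual = pd.Series(manual)                         # combine

# The same three steps expressed with groupby.
grouped = toy.groupby('Country')['SalaryUSD'].median()

print(manual)
print(grouped)
```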

In [146]:
# Let's get a breakdown for the most popular OS by country.

# See which countries have the most respondents.
df['Country'].value_counts()

United States of America                                15288
India                                                   10511
Germany                                                  5625
United Kingdom of Great Britain and Northern Ireland     4475
Canada                                                   3012
                                                        ...  
Saint Kitts and Nevis                                       1
Dominica                                                    1
Saint Vincent and the Grenadines                            1
Tuvalu                                                      1
Papua New Guinea                                            1
Name: Country, Length: 181, dtype: int64

In [147]:
# Group by a single column name (a plain string), so that get_group() accepts a plain string key.
country_group = df.groupby('Country')
country_group # Splitting into groups.

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7f74cdcc3510>

In [148]:
country_group.get_group('United States of America')

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,SalaryUSD
5,6,I am a student who is learning to code,"Student, part-time",United States of America,Georgia,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,C;C#;C++;HTML/CSS;Java;JavaScript;Node.js;Powe...,C#;C++;Go;HTML/CSS;Java;JavaScript;Node.js;Obj...,MySQL;PostgreSQL;SQLite,Elasticsearch;Firebase;IBM DB2;MariaDB;Microso...,,,Express;Flask;jQuery;React.js,Express;Flask;jQuery;React.js,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;Qt;React Native;TensorFlow;...,Git,Docker;Git;Unity 3D;Unreal Engine,IPython/Jupyter;Notepad++;PyCharm;Sublime Text...,Android Studio;IPython/Jupyter;Vim;Visual Stud...,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Prefer not to say,No,Straight / Heterosexual,Prefer not to say,None of the above,I have a concentration and/or memory disorder ...,Too long,Neither easy nor difficult,
6,7,I code primarily as a hobby,I prefer not to say,United States of America,New Hampshire,,"Secondary school (e.g. American high school, G...",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",3,,,,,,,HTML/CSS;JavaScript,HTML/CSS;JavaScript;PHP,,,,,jQuery,jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,Prefer not to say,Prefer not to say,No,,I don't know,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,
15,16,I am a student who is learning to code,"Student, full-time",United States of America,Missouri,,"Secondary school (e.g. American high school, G...",5 - 10 years,Other (please specify):,7,,,,,,,Bash/Shell;Python,Bash/Shell;Python,,,,,,,,,Git,Git,Emacs;Neovim;Vim,Neovim;Vim,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Less than once per month or monthly,Neutral,No,Under 18 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Too long,Easy,
36,37,I am a developer by profession,Employed full-time,United States of America,District of Columbia,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",8,Less than 1 year,"Developer, embedded applications or devices",20 to 99 employees,USD\tUnited States dollar,103000.0,Yearly,Assembly;C;Java;Kotlin;Rust,Assembly;Kotlin;Rust;SQL,,,,,,,,,Docker;Git,Git,IntelliJ;PyCharm;Webstorm,Android Studio;IntelliJ;PyCharm;Sublime Text;W...,Linux-based,Visit Stack Overflow;Google it;Do other work a...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",Yes,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,Prefer not to say,Too short,Easy,103000.0
37,38,I am a developer by profession,Employed full-time,United States of America,Massachusetts,,Some college/university study without earning ...,18 - 24 years,"Other online resources (ex: videos, blogs, etc)",20,15,"Developer, back-end","1,000 to 4,999 employees",USD\tUnited States dollar,300000.0,Yearly,Go,Go;Rust,,,AWS;Google Cloud Platform;Microsoft Azure,,,,,,Docker;Git;Terraform,Git;Terraform,Vim;Visual Studio Code,Vim;Visual Studio Code,MacOS,Google it;Do other work and come back later;Me...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,300000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83427,83428,I am a developer by profession,Employed full-time,United States of America,Pennsylvania,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",13,2,"Developer, front-end;Developer, desktop or ent...","1,000 to 4,999 employees",USD\tUnited States dollar,86000.0,Weekly,Java;Python,C#;Java;Python,,,,,Angular;Flask,Angular,,,Ansible;Git;Terraform;Unity 3D,Ansible;Git;Terraform;Unity 3D,Eclipse;IntelliJ;NetBeans;Sublime Text;Visual ...,IntelliJ;Sublime Text;Visual Studio Code,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,A few times per month or weekly,"Yes, somewhat",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,"I have an anxiety disorder;Or, in your own words:",Too short,Easy,4300000.0
83429,83430,I code primarily as a hobby,"Not employed, but looking for work",United States of America,Washington,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc)",6,,Other (please specify):;Student,,,,,HTML/CSS;PHP;PowerShell;Python;SQL;VBA,C#;Go;HTML/CSS;Java;JavaScript;PHP;Python;SQL;VBA,MongoDB;MySQL;PostgreSQL,MariaDB;Microsoft SQL Server;MongoDB;MySQL;Pos...,Heroku,AWS;Heroku;IBM Cloud or Watson;Microsoft Azure...,Django;Flask;jQuery,Angular.js;ASP.NET;ASP.NET Core ;Django;FastAP...,NumPy;Pandas,.NET Framework;.NET Core / .NET 5;NumPy;Pandas,Git,Docker;Git;Xamarin,Atom;Notepad++;Sublime Text;Visual Studio Code,Atom;Notepad++;Sublime Text;Visual Studio Code,Windows,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,A few times per month or weekly,Not sure/can't remember,,"Yes, somewhat",No,25-34 years old,"Man;Or, in your own words:",Yes,Queer,White or of European descent,I am unable to / find it difficult to walk or ...,I have an anxiety disorder,Appropriate in length,Neither easy nor difficult,
83430,83431,I am a developer by profession,Employed full-time,United States of America,Illinois,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",23,21,"Developer, front-end;Developer, full-stack;Dev...",10 to 19 employees,USD\tUnited States dollar,125000.0,Yearly,APL;Clojure;LISP;Python;Ruby;SQL;TypeScript,APL;Clojure;Haskell;LISP;R,MongoDB;MySQL;PostgreSQL;Redis,PostgreSQL,AWS;DigitalOcean;Google Cloud Platform;Heroku,AWS;DigitalOcean;Google Cloud Platform,Django;React.js;Ruby on Rails,,,,Docker;Git,Docker;Git;Kubernetes;Unity 3D,Emacs;Vim,Emacs,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not really",No,45-54 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,125000.0
83434,83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0


In [149]:
# A boolean mask reproduces a single group. The difference is that groupby
# performs this split for every country at once.
cond = (df['Country'] == 'United States of America')
df[cond]

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,SalaryUSD
5,6,I am a student who is learning to code,"Student, part-time",United States of America,Georgia,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,C;C#;C++;HTML/CSS;Java;JavaScript;Node.js;Powe...,C#;C++;Go;HTML/CSS;Java;JavaScript;Node.js;Obj...,MySQL;PostgreSQL;SQLite,Elasticsearch;Firebase;IBM DB2;MariaDB;Microso...,,,Express;Flask;jQuery;React.js,Express;Flask;jQuery;React.js,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;Qt;React Native;TensorFlow;...,Git,Docker;Git;Unity 3D;Unreal Engine,IPython/Jupyter;Notepad++;PyCharm;Sublime Text...,Android Studio;IPython/Jupyter;Vim;Visual Stud...,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Prefer not to say,No,Straight / Heterosexual,Prefer not to say,None of the above,I have a concentration and/or memory disorder ...,Too long,Neither easy nor difficult,
6,7,I code primarily as a hobby,I prefer not to say,United States of America,New Hampshire,,"Secondary school (e.g. American high school, G...",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",3,,,,,,,HTML/CSS;JavaScript,HTML/CSS;JavaScript;PHP,,,,,jQuery,jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,Prefer not to say,Prefer not to say,No,,I don't know,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,
15,16,I am a student who is learning to code,"Student, full-time",United States of America,Missouri,,"Secondary school (e.g. American high school, G...",5 - 10 years,Other (please specify):,7,,,,,,,Bash/Shell;Python,Bash/Shell;Python,,,,,,,,,Git,Git,Emacs;Neovim;Vim,Neovim;Vim,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Less than once per month or monthly,Neutral,No,Under 18 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Too long,Easy,
36,37,I am a developer by profession,Employed full-time,United States of America,District of Columbia,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",8,Less than 1 year,"Developer, embedded applications or devices",20 to 99 employees,USD\tUnited States dollar,103000.0,Yearly,Assembly;C;Java;Kotlin;Rust,Assembly;Kotlin;Rust;SQL,,,,,,,,,Docker;Git,Git,IntelliJ;PyCharm;Webstorm,Android Studio;IntelliJ;PyCharm;Sublime Text;W...,Linux-based,Visit Stack Overflow;Google it;Do other work a...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",Yes,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,Prefer not to say,Too short,Easy,103000.0
37,38,I am a developer by profession,Employed full-time,United States of America,Massachusetts,,Some college/university study without earning ...,18 - 24 years,"Other online resources (ex: videos, blogs, etc)",20,15,"Developer, back-end","1,000 to 4,999 employees",USD\tUnited States dollar,300000.0,Yearly,Go,Go;Rust,,,AWS;Google Cloud Platform;Microsoft Azure,,,,,,Docker;Git;Terraform,Git;Terraform,Vim;Visual Studio Code,Vim;Visual Studio Code,MacOS,Google it;Do other work and come back later;Me...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,300000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83427,83428,I am a developer by profession,Employed full-time,United States of America,Pennsylvania,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",13,2,"Developer, front-end;Developer, desktop or ent...","1,000 to 4,999 employees",USD\tUnited States dollar,86000.0,Weekly,Java;Python,C#;Java;Python,,,,,Angular;Flask,Angular,,,Ansible;Git;Terraform;Unity 3D,Ansible;Git;Terraform;Unity 3D,Eclipse;IntelliJ;NetBeans;Sublime Text;Visual ...,IntelliJ;Sublime Text;Visual Studio Code,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,A few times per month or weekly,"Yes, somewhat",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,"I have an anxiety disorder;Or, in your own words:",Too short,Easy,4300000.0
83429,83430,I code primarily as a hobby,"Not employed, but looking for work",United States of America,Washington,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc)",6,,Other (please specify):;Student,,,,,HTML/CSS;PHP;PowerShell;Python;SQL;VBA,C#;Go;HTML/CSS;Java;JavaScript;PHP;Python;SQL;VBA,MongoDB;MySQL;PostgreSQL,MariaDB;Microsoft SQL Server;MongoDB;MySQL;Pos...,Heroku,AWS;Heroku;IBM Cloud or Watson;Microsoft Azure...,Django;Flask;jQuery,Angular.js;ASP.NET;ASP.NET Core ;Django;FastAP...,NumPy;Pandas,.NET Framework;.NET Core / .NET 5;NumPy;Pandas,Git,Docker;Git;Xamarin,Atom;Notepad++;Sublime Text;Visual Studio Code,Atom;Notepad++;Sublime Text;Visual Studio Code,Windows,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,A few times per month or weekly,Not sure/can't remember,,"Yes, somewhat",No,25-34 years old,"Man;Or, in your own words:",Yes,Queer,White or of European descent,I am unable to / find it difficult to walk or ...,I have an anxiety disorder,Appropriate in length,Neither easy nor difficult,
83430,83431,I am a developer by profession,Employed full-time,United States of America,Illinois,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",23,21,"Developer, front-end;Developer, full-stack;Dev...",10 to 19 employees,USD\tUnited States dollar,125000.0,Yearly,APL;Clojure;LISP;Python;Ruby;SQL;TypeScript,APL;Clojure;Haskell;LISP;R,MongoDB;MySQL;PostgreSQL;Redis,PostgreSQL,AWS;DigitalOcean;Google Cloud Platform;Heroku,AWS;DigitalOcean;Google Cloud Platform,Django;React.js;Ruby on Rails,,,,Docker;Git,Docker;Git;Kubernetes;Unity 3D,Emacs;Vim,Emacs,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not really",No,45-54 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,125000.0
83434,83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0


In [150]:
df[cond]['OpSys'].value_counts(normalize=True)

Windows                              0.382403
MacOS                                0.356525
Linux-based                          0.221764
Windows Subsystem for Linux (WSL)    0.027712
Other (please specify):              0.009762
BSD                                  0.001834
Name: OpSys, dtype: float64

In [151]:
country_group['OpSys'].value_counts(normalize=True).head(50)  # Applying a function to each group.

Country      OpSys                            
Afghanistan  Windows                              0.459016
             Linux-based                          0.295082
             MacOS                                0.098361
             Other (please specify):              0.098361
             BSD                                  0.032787
             Windows Subsystem for Linux (WSL)    0.016393
Albania      Windows                              0.614286
             Linux-based                          0.228571
             MacOS                                0.100000
             Windows Subsystem for Linux (WSL)    0.028571
             BSD                                  0.014286
             Other (please specify):              0.014286
Algeria      Windows                              0.565217
             Linux-based                          0.282609
             MacOS                                0.086957
             BSD                                  0.021739
         

In [152]:
country_group['OpSys'].value_counts(normalize=True).loc['Bulgaria']

OpSys
Windows                              0.534826
Linux-based                          0.243781
MacOS                                0.184080
Windows Subsystem for Linux (WSL)    0.019900
Other (please specify):              0.012438
BSD                                  0.004975
Name: OpSys, dtype: float64

In [153]:
country_group['OpSys'].value_counts(normalize=True).loc['China']

OpSys
Windows                              0.465700
MacOS                                0.257005
Linux-based                          0.208696
Windows Subsystem for Linux (WSL)    0.058937
BSD                                  0.006763
Other (please specify):              0.002899
Name: OpSys, dtype: float64

In [154]:
country_group['OpSys'].value_counts(normalize=True).loc['Russian Federation']

OpSys
Windows                              0.417799
Linux-based                          0.302989
MacOS                                0.240489
Windows Subsystem for Linux (WSL)    0.027853
Other (please specify):              0.008152
BSD                                  0.002717
Name: OpSys, dtype: float64
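
Looking up one country at a time works, but the grouped result can also be reshaped so that the countries sit side by side. The following is a minimal sketch on a made-up toy frame (the `toy` data and the name `wide` are assumptions for illustration, not survey results); it uses `unstack` to move the inner `OpSys` index level into columns.

```python
import pandas as pd

# Toy data standing in for the survey; the rows are made up.
toy = pd.DataFrame({
    'Country': ['Bulgaria', 'Bulgaria', 'Bulgaria', 'China', 'China'],
    'OpSys': ['Windows', 'Windows', 'MacOS', 'Windows', 'Linux-based'],
})

# Same pattern as above: per-country share of each operating system.
shares = toy.groupby('Country')['OpSys'].value_counts(normalize=True)

# Move the inner index level (OpSys) into columns: one row per country.
# Combinations that never occur are filled with 0 instead of NaN.
wide = shares.unstack(fill_value=0)
print(wide)
```

With the real data, the same `unstack` call on `country_group['OpSys'].value_counts(normalize=True)` should give one row per country, which is easier to compare than separate `.loc` lookups.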

In [155]:
# Get the median salaries
country_group['SalaryUSD'].median()

Country
Afghanistan                              9792.0
Albania                                 15900.0
Algeria                                  9875.0
Andorra                                 94045.5
Angola                                   9750.0
                                         ...   
Venezuela, Bolivarian Republic of...    12000.0
Viet Nam                                12678.0
Yemen                                    3954.0
Zambia                                   9816.0
Zimbabwe                                 7200.0
Name: SalaryUSD, Length: 181, dtype: float64

In [156]:
country_group['SalaryUSD'].median().loc['Germany']

64859.0

In [157]:
country_group['SalaryUSD'].median().loc['Bulgaria']

33486.0

In [158]:
country_group['SalaryUSD'].median().loc['United States of America']

125000.0
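
The three lookups above can be collapsed into one by passing a list of labels to `.loc`. A small sketch on toy numbers (the salaries below are invented for illustration):

```python
import pandas as pd

# Toy salaries; the numbers are invented.
toy = pd.DataFrame({
    'Country': ['Germany', 'Germany', 'Bulgaria', 'Bulgaria', 'Bulgaria'],
    'SalaryUSD': [60000.0, 70000.0, 30000.0, 34000.0, 40000.0],
})

medians = toy.groupby('Country')['SalaryUSD'].median()

# A list of labels returns a Series with just those countries,
# in the order they were requested.
selected = medians.loc[['Germany', 'Bulgaria']]
print(selected)
```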

In [159]:
# See the median and mean salaries
country_group['SalaryUSD'].agg(['median', 'mean'])

Country,median,mean
Afghanistan,9792.0,2.794748e+06
Albania,15900.0,4.499814e+04
Algeria,9875.0,1.446114e+04
Andorra,94045.5,8.928200e+04
Angola,9750.0,2.155680e+04
...,...,...
"Venezuela, Bolivarian Republic of...",12000.0,2.246505e+04
Viet Nam,12678.0,1.995289e+04
Yemen,3954.0,5.628667e+03
Zambia,9816.0,1.991491e+04
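
Note how far apart the two statistics can be: for Afghanistan the mean is in the millions while the median is under 10,000. A gap like that usually points to a few extreme values. The sketch below reproduces the effect on invented numbers and also adds `count` to the aggregation, so the group size is visible next to the statistics (the toy data and the country name `A` are assumptions).

```python
import pandas as pd

# Four ordinary salaries and one extreme outlier (all invented).
toy = pd.DataFrame({
    'Country': ['A'] * 5,
    'SalaryUSD': [9000.0, 9500.0, 10000.0, 10500.0, 5_000_000.0],
})

stats = toy.groupby('Country')['SalaryUSD'].agg(['median', 'mean', 'count'])
print(stats)
# The median stays at 10000.0, while the single outlier
# pulls the mean up to 1007800.0.
```

This is one reason the median is often preferred when summarising salaries.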


In [160]:
# What percentage of people from each country know Python?

# (note: using the following approaches we'll ignore NaN values)

# using the filtering approach. Works, but we'll have to manually go through each country.
cond = (df['Country'] == 'India')
df[cond]['LanguageHaveWorkedWith'].str.contains('Python').value_counts(normalize=True).loc[True]

0.5088981814645531

In [161]:
# using the groupby approach
num_ppl_using_py_by_country = country_group['LanguageHaveWorkedWith'].apply(lambda x: x.str.contains('Python').sum() / x.count())
num_ppl_using_py_by_country

Country
Afghanistan                             0.327869
Albania                                 0.422535
Algeria                                 0.422222
Andorra                                 0.400000
Angola                                  0.200000
                                          ...   
Venezuela, Bolivarian Republic of...    0.413462
Viet Nam                                0.454068
Yemen                                   0.315789
Zambia                                  0.545455
Zimbabwe                                0.500000
Name: LanguageHaveWorkedWith, Length: 181, dtype: float64

In [162]:
num_ppl_using_py_by_country.loc['India']

0.5088981814645531

In [163]:
num_ppl_using_py_by_country.loc['Bulgaria']

0.2947103274559194

In [164]:
pd.DataFrame(num_ppl_using_py_by_country.sort_values(ascending=False))

Country,LanguageHaveWorkedWith
Tuvalu,1.0
Liechtenstein,1.0
Brunei Darussalam,1.0
Sierra Leone,1.0
Saint Lucia,1.0
...,...
Mali,0.0
Monaco,0.0
Papua New Guinea,0.0
Saint Kitts and Nevis,0.0
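
Shares of exactly 1.0 or 0.0 most likely come from countries with only a handful of respondents. One way to check is to keep the number of answers next to the share and filter on it. This is a sketch on toy data; the `MIN_RESPONDENTS` cut-off and the column names `python_share` and `respondents` are hypothetical choices, not part of the survey.

```python
import pandas as pd

# Toy answers; one country has a single respondent.
toy = pd.DataFrame({
    'Country': ['Tuvalu', 'India', 'India', 'India', 'India'],
    'LanguageHaveWorkedWith': ['Python', 'Python;SQL', 'Java', 'C;Python', None],
})

group = toy.groupby('Country')['LanguageHaveWorkedWith']
summary = pd.DataFrame({
    # Same formula as in the groupby approach: matches / non-missing answers.
    'python_share': group.apply(lambda x: x.str.contains('Python').sum() / x.count()),
    # count() ignores missing values, so this is the denominator used above.
    'respondents': group.count(),
})

MIN_RESPONDENTS = 3  # hypothetical cut-off, pick what suits the analysis
reliable = summary[summary['respondents'] >= MIN_RESPONDENTS]
print(reliable.sort_values('python_share', ascending=False))
```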


## Cleaning Data - Casting Datatypes and Handling Missing Values

In [165]:
people = {
    'first': ['SimoFirst', 'Jane', 'John', 'Chris', np.nan, None, 'NA'],
    'last': ['SimoLast', 'Doe', 'Doe', 'Sirhc', np.nan, np.nan, 'Missing'],
    'email': ['s.e.hristov99@gmail.com', 'JaneDoe@gmail.com', 'JohnDoe@gmail.com', None, np.nan, 'Anon@email.com', 'NA'],
    'age': ['33', '55', '63', '36', None, None, 'Missing'],
}

> **Approach 1**: just remove the rows that have at least 1 missing value.

In [166]:
ppl_df = pd.DataFrame(people)
ppl_df

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
3,Chris,Sirhc,,36
4,,,,
5,,,Anon@email.com,
6,,Missing,,Missing


In [167]:
ppl_df.dropna()
# by default this is equivalent to ppl_df.dropna(axis='index', how='any')
# index -> drops rows
# any -> drop the row if any column has a missing value. Note: sometimes a
# missing value is acceptable, e.g. if the person does not have an `age`.

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
6,,Missing,,Missing


In [168]:
# remove rows if all columns have missing values
ppl_df.dropna(how='all')

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
3,Chris,Sirhc,,36
5,,,Anon@email.com,
6,,Missing,,Missing


In [169]:
# remove columns if all rows have missing values
ppl_df.dropna(axis=1, how='all')  # Same as ppl_df.dropna(axis='columns', how='all')

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
3,Chris,Sirhc,,36
4,,,,
5,,,Anon@email.com,
6,,Missing,,Missing


In [170]:
# remove columns if any row has a missing value
ppl_df.dropna(axis=1, how='any')

0
1
2
3
4
5
6


In [171]:
# remove entries with no emails
ppl_df.dropna(axis=0, subset=['email'])

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
5,,,Anon@email.com,
6,,Missing,,Missing


In [172]:
# remove entries only if both the email and the last name are missing
ppl_df.dropna(axis=0, how='all', subset=['email', 'last'])

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
3,Chris,Sirhc,,36
5,,,Anon@email.com,
6,,Missing,,Missing
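
Between `how='any'` and `how='all'` there is a middle ground: `dropna` also accepts `thresh`, the minimum number of non-missing values a row needs in order to be kept. A minimal sketch on a toy frame (the data is made up and smaller than `ppl_df`):

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'first': ['Jane', 'Chris', np.nan, np.nan],
    'last': ['Doe', 'Sirhc', np.nan, np.nan],
    'email': ['JaneDoe@gmail.com', np.nan, 'Anon@email.com', np.nan],
})

# Keep only the rows that have at least 2 non-missing values.
# Rows 0 and 1 qualify; rows 2 and 3 have one and zero values respectively.
kept = toy.dropna(thresh=2)
print(kept)
```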


> **Custom missing values**: substitute them with `np.nan` by using the `replace` method.

In [173]:
ppl_df = ppl_df.replace({
    'Missing': np.nan,
    'NA': np.nan,
})

In [174]:
ppl_df.dropna()

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
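
When the data comes from a file, custom markers can also be declared at load time through the `na_values` parameter of `pd.read_csv`, so no separate `replace` step is needed. A sketch with an in-memory CSV (the `csv_text` content is made up). Note that `read_csv` already treats strings such as `NA` as missing by default, so only `Missing` has to be listed here.

```python
import io
import pandas as pd

# A tiny in-memory CSV; 'Missing' is the custom marker for absent values.
csv_text = """first,last,age
Jane,Doe,55
Chris,Missing,Missing
"""

loaded = pd.read_csv(io.StringIO(csv_text), na_values=['Missing'])
print(loaded)

# Because the marker was parsed as NaN, `age` is numeric right away.
print(loaded['age'].dtype)
```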


> **Approach 2**: fill them in (impute them) using statistics about the column.

In [175]:
ppl_df = pd.DataFrame(people).replace({
    'Missing': np.nan,
    'NA': np.nan,
})
ppl_df

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33.0
1,Jane,Doe,JaneDoe@gmail.com,55.0
2,John,Doe,JohnDoe@gmail.com,63.0
3,Chris,Sirhc,,36.0
4,,,,
5,,,Anon@email.com,
6,,,,


In [176]:
# for numerical values: fill in with the mean / median / mode, or the mean of the k nearest neighbours (KNN)
# for categorical values: create a new category, e.g. 'MISSING'
ppl_df.fillna('MISSING')

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
3,Chris,Sirhc,MISSING,36
4,MISSING,MISSING,MISSING,MISSING
5,MISSING,MISSING,Anon@email.com,MISSING
6,MISSING,MISSING,MISSING,MISSING
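
Calling `fillna` with a single value fills every column the same way, including `age`. Passing a dictionary lets each column get its own replacement. A small sketch, assuming a toy frame with one text column and one numeric column:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'email': ['JaneDoe@gmail.com', np.nan],
    'age': [55.0, np.nan],
})

# Keys are column names, values are the fill value for that column.
filled = toy.fillna({
    'email': 'MISSING',           # new category for the text column
    'age': toy['age'].median(),   # a statistic for the numeric column
})
print(filled)
```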


**Casting**

In [177]:
ppl_df.fillna(0)

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33
1,Jane,Doe,JaneDoe@gmail.com,55
2,John,Doe,JohnDoe@gmail.com,63
3,Chris,Sirhc,0,36
4,0,0,0,0
5,0,0,Anon@email.com,0
6,0,0,0,0


In [178]:
ppl_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   first   4 non-null      object
 1   last    4 non-null      object
 2   email   4 non-null      object
 3   age     4 non-null      object
dtypes: object(4)
memory usage: 352.0+ bytes


In [179]:
# because the `age` column is of type `object` we cannot get its mean
# ppl_df['age'].mean()

In [180]:
# the NaN value is actually a float 'under the hood'
type(np.nan)

float

In [181]:
# so we cannot cast to integers directly (i.e. without substitution or removal)
# ppl_df['age'] = ppl_df['age'].astype(np.int32)
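
If integers are really needed while some values are still missing, pandas offers the nullable `'Int64'` dtype (capital `I`), which stores missing entries as `<NA>` instead of `NaN`. A sketch, assuming the ages arrive as strings as they do in `people`:

```python
import pandas as pd

ages = pd.Series(['33', '55', None, '36'])

# First parse the strings into numbers (the missing entry becomes NaN),
# then cast to the nullable integer dtype.
as_int = pd.to_numeric(ages).astype('Int64')
print(as_int)
print(as_int.sum())  # missing values are skipped
```

The notebook continues with the float-based route, which is the simpler option when fractional values such as a mean will be written back into the column.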

In [182]:
# because it is a numerical variable, let's cast it to float so we can substitute the missing values with the mean
ppl_df['age'] = ppl_df['age'].astype(np.float32)
ppl_df['age']

0    33.0
1    55.0
2    63.0
3    36.0
4     NaN
5     NaN
6     NaN
Name: age, dtype: float32

In [183]:
ppl_df['age'] = ppl_df['age'].fillna(ppl_df['age'].mean())
ppl_df

Unnamed: 0,first,last,email,age
0,SimoFirst,SimoLast,s.e.hristov99@gmail.com,33.0
1,Jane,Doe,JaneDoe@gmail.com,55.0
2,John,Doe,JohnDoe@gmail.com,63.0
3,Chris,Sirhc,,36.0
4,,,,46.75
5,,,Anon@email.com,46.75
6,,,,46.75


In [184]:
ppl_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   first   4 non-null      object 
 1   last    4 non-null      object 
 2   email   4 non-null      object 
 3   age     7 non-null      float32
dtypes: float32(1), object(3)
memory usage: 324.0+ bytes


In [185]:
ppl_df['age'].mean()

46.75

In [186]:
# What is the average number of years of coding experience?
df['YearsCode'].head(10)

0    NaN
1      7
2    NaN
3    NaN
4     17
5    NaN
6      3
7      4
8      6
9      7
Name: YearsCode, dtype: object

In [187]:
# Note that the column has a dtype `object`!
# That means that we can neither take the mean:
# df['YearsCode'].mean()

In [188]:
# nor can we cast to float
# (`np.float` was removed in newer NumPy releases; the builtin `float` works the same here)
# df['YearsCode'].astype(float)

In [189]:
# however, the good thing about this error is that it shows us
# what the problem actually is - we have a string as an entry

# in order to deal with this problem, we'll replace
# all the strings with a number
df['YearsCode'].unique()

array([nan, '7', '17', '3', '4', '6', '16', '12', '15', '10', '40', '9',
       '26', '14', '39', '20', '8', '19', '5', 'Less than 1 year', '22',
       '2', '1', '34', '21', '13', '25', '24', '30', '31', '18', '38',
       'More than 50 years', '27', '41', '42', '35', '23', '28', '11',
       '37', '44', '43', '36', '33', '45', '29', '50', '46', '32', '47',
       '49', '48'], dtype=object)

In [190]:
df['YearsCode'] = df['YearsCode'].replace({
    'Less than 1 year': 0,
    'More than 50 years': 51,
})
df['YearsCode'].unique()

array([nan, '7', '17', '3', '4', '6', '16', '12', '15', '10', '40', '9',
       '26', '14', '39', '20', '8', '19', '5', 0, '22', '2', '1', '34',
       '21', '13', '25', '24', '30', '31', '18', '38', 51, '27', '41',
       '42', '35', '23', '28', '11', '37', '44', '43', '36', '33', '45',
       '29', '50', '46', '32', '47', '49', '48'], dtype=object)

In [191]:
# now we can convert to float
df['YearsCode'] = df['YearsCode'].astype(np.float32)
df['YearsCode']

0         NaN
1         7.0
2         NaN
3         NaN
4        17.0
         ... 
83434     6.0
83435     4.0
83436    10.0
83437     5.0
83438    14.0
Name: YearsCode, Length: 83439, dtype: float32
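
An alternative to `astype` is `pd.to_numeric`. With `errors='coerce'` any entry that still cannot be parsed becomes `NaN` instead of raising an error, which helps when not every odd string is known in advance. A sketch on a toy series (the values mimic the `YearsCode` column but are made up):

```python
import numpy as np
import pandas as pd

years = pd.Series(['7', 'Less than 1 year', np.nan, 'More than 50 years', '17'])

# Map the two known text answers to numbers, as in the cell above.
cleaned = years.replace({
    'Less than 1 year': '0',
    'More than 50 years': '51',
})

# Anything that is still not a number would silently turn into NaN.
numeric = pd.to_numeric(cleaned, errors='coerce')
print(numeric)
print(numeric.mean())
```

The trade-off is that `errors='coerce'` can hide unexpected values, so it is worth checking `unique()` first, as done above.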

In [192]:
df['YearsCode'].mean()

12.338200569152832

In [193]:
df['YearsCode'].mode()[0]

5.0

In [194]:
df['YearsCode'].median()

10.0

## Reading/Writing Data to Different Sources - Excel, JSON, SQL, Etc

We'll pretend that we only need the results for Bulgaria. Let's save them as a `csv` to show how it's done.

In [195]:
# We saw that the table already has a column that can act as an index.
# Let's use it.

df = pd.read_csv(DATA_PATH, index_col='ResponseId')
df

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,,"Developer, mobile",20 to 99 employees,EUR European Euro,4800.0,Monthly,C++;HTML/CSS;JavaScript;Objective-C;PHP;Swift,Swift,PostgreSQL;SQLite,SQLite,,,Laravel;Symfony,,,,,,PHPStorm;Xcode,Atom;Xcode,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,,,,,,,JavaScript;Python,,PostgreSQL,,,,Angular;Flask;Vue.js,,Cordova,,Docker;Git;Yarn,Git,Android Studio;IntelliJ;Notepad++;PyCharm,,Windows,Visit Stack Overflow;Google it,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,"Yes, definitely",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,,,,,,,Assembly;C;Python;R;Rust,Julia;Python;Rust,SQLite,SQLite,Heroku,,Flask,Flask,NumPy;Pandas;TensorFlow;Torch/PyTorch,Keras;NumPy;Pandas;TensorFlow;Torch/PyTorch,,,IPython/Jupyter;PyCharm;RStudio;Sublime Text;V...,IPython/Jupyter;RStudio;Sublime Text;Visual St...,MacOS,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",Yes,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,,"Developer, front-end",100 to 499 employees,EUR European Euro,,Monthly,JavaScript;TypeScript,JavaScript;TypeScript,,,,,Angular;jQuery,Angular;jQuery,,,,,,,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,Daily or almost daily,Yes,Daily or almost daily,Neutral,No,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17,10,"Developer, desktop or enterprise applications;...","Just me - I am a freelancer, sole proprietor, ...",GBP\tPound sterling,,,Bash/Shell;HTML/CSS;Python;SQL,Bash/Shell;HTML/CSS;Python;SQL,Elasticsearch;PostgreSQL;Redis,Cassandra;Elasticsearch;PostgreSQL;Redis,,,Flask,Flask,Apache Spark;Hadoop;NumPy;Pandas,Hadoop;NumPy;Pandas,Docker;Git;Kubernetes;Yarn,Docker;Git;Kubernetes;Yarn,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim,Atom;IPython/Jupyter;Notepad++;PyCharm;Vim;Vis...,Linux-based,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
83435,I am a developer by profession,Employed full-time,United States of America,Texas,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",6,5,"Developer, back-end",20 to 99 employees,USD\tUnited States dollar,160500.0,Yearly,Clojure;Kotlin;SQL,Clojure,Oracle;SQLite,SQLite,AWS,AWS,,,,,Docker;Git,Git;Kubernetes,IntelliJ;Sublime Text;Vim;Visual Studio Code,Sublime Text;Vim,MacOS,Call a coworker or friend;Google it,Stack Overflow;Stack Exchange,A few times per week,No,,"No, not at all",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a concentration and/or memory disorder ...,Appropriate in length,Easy,160500.0
83436,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Benin,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, full-stack","Just me - I am a freelancer, sole proprietor, ...",XOF\tWest African CFA franc,200000.0,Monthly,,,Firebase;MariaDB;MySQL;PostgreSQL;Redis;SQLite,Firebase;MariaDB;MongoDB;MySQL;PostgreSQL;Redi...,,,Django;jQuery;Laravel;React.js;Ruby on Rails,Django;Express;jQuery;Laravel;React.js;Ruby on...,Flutter;Qt,,Git;Unity 3D;Unreal Engine,Docker;Git;Kubernetes,Android Studio;Eclipse;Emacs;IntelliJ;NetBeans...,Emacs;IntelliJ;PHPStorm;PyCharm;RStudio;Sublim...,Linux-based,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,I have never participated in Q&A on Stack Over...,"Yes, somewhat",No,18-24 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Easy,3960.0
83437,I am a developer by profession,Employed full-time,United States of America,New Jersey,,"Secondary school (e.g. American high school, G...",11 - 17 years,School,10,4,Data scientist or machine learning specialist;...,"10,000 or more employees",USD\tUnited States dollar,1800.0,Weekly,Groovy;Java;Python,Java;Python,DynamoDB;Elasticsearch;MongoDB;PostgreSQL;Redis,DynamoDB;Redis,AWS;Google Cloud Platform,AWS,FastAPI;Flask,FastAPI;Flask,Hadoop;Keras;NumPy;Pandas,Apache Spark;Hadoop;Keras;NumPy;Pandas;TensorFlow,Ansible;Docker;Git;Terraform,Docker;Git;Kubernetes;Terraform,Android Studio;Eclipse;IntelliJ;IPython/Jupyte...,IntelliJ;IPython/Jupyter;Notepad++;Vim,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,A few times per week,Yes,I have never participated in Q&A on Stack Over...,"No, not really",No,25-34 years old,Man,No,,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,90000.0
83438,I am a developer by profession,Employed full-time,Canada,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,Online Courses or Certification;Books / Physic...,5,3,"Developer, back-end",20 to 99 employees,CAD\tCanadian dollar,90000.0,Monthly,Bash/Shell;JavaScript;Node.js;Python,Go;Rust,Cassandra;Elasticsearch;MongoDB;PostgreSQL;Redis,,Heroku,AWS;DigitalOcean,Django;Express;Flask;React.js,,NumPy;Pandas;TensorFlow;Torch/PyTorch,NumPy;Pandas;TensorFlow;Torch/PyTorch,Ansible;Docker;Git;Terraform,Kubernetes;Terraform,PyCharm;Sublime Text,,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow,A few times per month or weekly,Yes,Less than once per month or monthly,"No, not really",No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Neither easy nor difficult,816816.0
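
With `ResponseId` as the index, rows can be looked up by their response number through `.loc`, while `.iloc` still addresses them by position. A minimal sketch using an in-memory CSV (the three rows are a made-up, shortened stand-in for the survey file):

```python
import io
import pandas as pd

# Note the gap in the ids: labels and positions are not the same thing.
csv_text = """ResponseId,Country,YearsCode
1,Slovakia,
2,Netherlands,7
5,United Kingdom,17
"""

toy = pd.read_csv(io.StringIO(csv_text), index_col='ResponseId')

row = toy.loc[5]      # label-based: the response whose ResponseId is 5
first = toy.iloc[0]   # position-based: the first row, whatever its id
print(row['Country'], first['Country'])
```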


In [196]:
# Get the results from Bulgarian developers.
cond = (df['Country'] == 'Bulgaria')
df[cond]

ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,YearsCodePro,DevType,OrgSize,Currency,CompTotal,CompFreq,LanguageHaveWorkedWith,LanguageWantToWorkWith,DatabaseHaveWorkedWith,DatabaseWantToWorkWith,PlatformHaveWorkedWith,PlatformWantToWorkWith,WebframeHaveWorkedWith,WebframeWantToWorkWith,MiscTechHaveWorkedWith,MiscTechWantToWorkWith,ToolsTechHaveWorkedWith,ToolsTechWantToWorkWith,NEWCollabToolsHaveWorkedWith,NEWCollabToolsWantToWorkWith,OpSys,NEWStuck,NEWSOSites,SOVisitFreq,SOAccount,SOPartFreq,SOComm,NEWOtherComms,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
49,I am a developer by profession,Employed full-time,Bulgaria,,,Some college/university study without earning ...,11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,1,"Developer, desktop or enterprise applications","1,000 to 4,999 employees",BGN\tBulgarian lev,3100.0,Monthly,C++;SQL,Go;Java;Python,MongoDB;MySQL,,,,,,,,Git,,Visual Studio;Visual Studio Code,,Linux-based,Visit Stack Overflow;Google it;Do other work a...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,Less than once per month or monthly,"Yes, somewhat",Yes,18-24 years old,Man,No,Prefer to self-describe:,White or of European descent,None of the above,I have an anxiety disorder,Too long,Easy,20556.0
261,I am a developer by profession,Employed full-time,Bulgaria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",13,8,"Developer, front-end;Developer, full-stack","5,000 to 9,999 employees",BGN\tBulgarian lev,4200.0,Monthly,C#;HTML/CSS;Java;JavaScript;Node.js;SQL;TypeSc...,C#;Elixir;Go;Haskell;HTML/CSS;JavaScript;Kotli...,Microsoft SQL Server;Redis,Elasticsearch;Redis;SQLite,,,jQuery;Vue.js,React.js;Vue.js,.NET Core / .NET 5,Flutter;React Native,Git,Docker;Git;Kubernetes;Puppet,IntelliJ;Notepad++;Visual Studio;Visual Studio...,IntelliJ;Notepad++;Rider;Vim;Visual Studio Cod...,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Multiple times per day,"Yes, definitely",No,25-34 years old,Man,No,Straight / Heterosexual,,None of the above,Prefer not to say,Appropriate in length,Easy,27852.0
329,I am a developer by profession,Employed full-time,Bulgaria,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,School,25,15,"Developer, desktop or enterprise applications;...",100 to 499 employees,EUR European Euro,,Monthly,C#;C++;JavaScript;SQL,C#;Rust,Microsoft SQL Server;SQLite,Microsoft SQL Server;SQLite,,,ASP.NET;jQuery,ASP.NET Core,.NET Framework;TensorFlow,.NET Framework;.NET Core / .NET 5,Git,Git,PyCharm;Visual Studio;Visual Studio Code,Visual Studio;Visual Studio Code,Windows,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow,Multiple times per day,Yes,Less than once per month or monthly,"No, not really",No,35-44 years old,Man,"Or, in your own words:",,White or of European descent,,,,,
1208,I am a developer by profession,Employed full-time,Bulgaria,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,School,14,8,"Developer, back-end",500 to 999 employees,EUR European Euro,,Monthly,Java,Java;Kotlin;Python;TypeScript,,,,,Spring,Angular;Spring,,,Git,Docker;Git;Kubernetes,IntelliJ,IntelliJ;Webstorm,MacOS,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Daily or almost daily,Yes,A few times per week,"Yes, somewhat",No,25-34 years old,Man,No,Straight / Heterosexual,Biracial,None of the above,None of the above,Appropriate in length,Easy,
1335,I am a developer by profession,Employed full-time,Bulgaria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,Coding Bootcamp;Other online resources (ex: vi...,28,21,"Developer, desktop or enterprise applications;...",100 to 499 employees,BGN\tBulgarian lev,,,C#;PowerShell;SQL,C#;PowerShell;SQL,Microsoft SQL Server,Microsoft SQL Server,Microsoft Azure,Microsoft Azure,,,.NET Framework,.NET Framework,Git,Git,Notepad++;Visual Studio,Notepad++;Visual Studio,Windows,Call a coworker or friend;Visit Stack Overflow...,Stack Overflow;Stack Exchange,Multiple times per day,Yes,Daily or almost daily,"Yes, definitely",Yes,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
82247,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Bulgaria,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc...",20,15,"Developer, full-stack;Developer, back-end;Acad...",2 to 9 employees,BGN\tBulgarian lev,60000.0,Yearly,C#;HTML/CSS;JavaScript;PHP;Python,,MariaDB;Microsoft SQL Server;MySQL;SQLite,,DigitalOcean;Google Cloud Platform,,jQuery;Symfony,,.NET Framework;Qt;TensorFlow,,Git;Unity 3D,,Notepad++;Sublime Text;Visual Studio;Visual St...,,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow,Multiple times per day,Yes,A few times per month or weekly,"Yes, somewhat",Yes,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Too short,Easy,33153.0
82354,I am a developer by profession,Employed full-time,Bulgaria,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",5 - 10 years,Coding Bootcamp;Other online resources (ex: vi...,13,6,"Developer, front-end;Developer, desktop or ent...",100 to 499 employees,BGN\tBulgarian lev,7500.0,Monthly,Bash/Shell;C++;HTML/CSS;JavaScript;Node.js;Pow...,C++;Go;HTML/CSS;JavaScript;Node.js;Rust;TypeSc...,,,,,Express;React.js,ASP.NET Core ;Express;React.js,,,Docker;Git;Yarn,Docker;Git;Unity 3D;Unreal Engine;Yarn,Notepad++;Visual Studio;Visual Studio Code,Notepad++;Visual Studio;Visual Studio Code,Windows,Call a coworker or friend;Go for a walk or oth...,Stack Overflow;Stack Exchange,A few times per week,Yes,Less than once per month or monthly,Neutral,No,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,49728.0
82711,I am a student who is learning to code,"Student, full-time",Bulgaria,,,"Secondary school (e.g. American high school, G...",18 - 24 years,"Other online resources (ex: videos, blogs, etc...",2,,,,,,,C#;C++;Java;JavaScript;Kotlin;PHP;SQL,C#;Kotlin;SQL,Microsoft SQL Server;MySQL,Microsoft SQL Server;MySQL,,,ASP.NET;ASP.NET Core,ASP.NET Core,.NET Framework;.NET Core / .NET 5,.NET Core / .NET 5,Git;Unreal Engine,Git,Android Studio;Atom;Eclipse;Visual Studio;Visu...,Android Studio;Visual Studio;Visual Studio Code,Windows,Visit Stack Overflow;Google it;Watch help / tu...,Stack Overflow,Daily or almost daily,Yes,I have never participated in Q&A on Stack Over...,Neutral,No,18-24 years old,Woman,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
82847,I am a developer by profession,Employed full-time,Bulgaria,,,Some college/university study without earning ...,11 - 17 years,"Other online resources (ex: videos, blogs, etc...",4,2,"Developer, front-end;Developer, full-stack;Dev...",20 to 99 employees,BGN\tBulgarian lev,1800.0,Monthly,C#;HTML/CSS;JavaScript;SQL,C#;C++;JavaScript;PowerShell;Python;SQL;TypeSc...,Microsoft SQL Server,Microsoft SQL Server,Microsoft Azure,AWS;Microsoft Azure,ASP.NET;ASP.NET Core ;jQuery,ASP.NET;ASP.NET Core ;jQuery;React.js,.NET Framework;.NET Core / .NET 5,.NET Framework;.NET Core / .NET 5;React Native,Git,Docker;Git;Xamarin,Notepad++;Visual Studio;Visual Studio Code,Android Studio;Notepad++;Visual Studio;Visual ...,Windows,Visit Stack Overflow;Go for a walk or other ph...,Stack Overflow,A few times per month or weekly,No,,"No, not really",No,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,11940.0


In [197]:
# Export this dataframe to a new csv file.
df.to_csv('results_for_bulgaria.csv')
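By default, `to_csv` also writes the row index as the first column, which comes back as `Unnamed: 0` when the file is read again. The sketch below shows the difference between the default call and `index=False`. The `sample` frame and the file names are made up for illustration and stand in for the filtered survey rows.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the filtered survey rows (not the real data).
sample = pd.DataFrame(
    {"ResponseId": [329, 1208], "Country": ["Bulgaria", "Bulgaria"]},
    index=[328, 1207],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    with_index = os.path.join(tmp_dir, "with_index.csv")
    without_index = os.path.join(tmp_dir, "without_index.csv")

    # Default behaviour: the index is written as an unnamed first column.
    sample.to_csv(with_index)
    # index=False: only the actual columns are written.
    sample.to_csv(without_index, index=False)

    reloaded_with_index = pd.read_csv(with_index)
    reloaded_without_index = pd.read_csv(without_index)

print(reloaded_with_index.columns.tolist())
print(reloaded_without_index.columns.tolist())
```

Whether to keep the index depends on the use case: if the original row labels matter for tracing rows back to the full survey, keep them and pass `index_col=0` to `read_csv` when loading the file again.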

# Homework

Choose a dataset from a previous year and analyse it using Pandas!