## Introduction
In this project, the Stack Overflow Developer Survey 2020 dataset is used to study the behaviour of developers across the world.

Here we will try to answer the following question:


### 2) Which programming languages are developers currently working with, and which do they want to work with?



In [4]:
# Import the necessary modules
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from collections import defaultdict
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

In [5]:
# gathering data 
df=pd.read_csv('survey_results_public.csv')
df.head()

Unnamed: 0,Respondent,MainBranch,Hobbyist,Age,Age1stCode,CompFreq,CompTotal,ConvertedComp,Country,CurrencyDesc,...,SurveyEase,SurveyLength,Trans,UndergradMajor,WebframeDesireNextYear,WebframeWorkedWith,WelcomeChange,WorkWeekHrs,YearsCode,YearsCodePro
0,1,I am a developer by profession,Yes,,13,Monthly,,,Germany,European Euro,...,Neither easy nor difficult,Appropriate in length,No,"Computer science, computer engineering, or sof...",ASP.NET Core,ASP.NET;ASP.NET Core,Just as welcome now as I felt last year,50.0,36,27.0
1,2,I am a developer by profession,No,,19,,,,United Kingdom,Pound sterling,...,,,,"Computer science, computer engineering, or sof...",,,Somewhat more welcome now than last year,,7,4.0
2,3,I code primarily as a hobby,Yes,,15,,,,Russian Federation,,...,Neither easy nor difficult,Appropriate in length,,,,,Somewhat more welcome now than last year,,4,
3,4,I am a developer by profession,Yes,25.0,18,,,,Albania,Albanian lek,...,,,No,"Computer science, computer engineering, or sof...",,,Somewhat less welcome now than last year,40.0,7,4.0
4,5,"I used to be a developer by profession, but no...",Yes,31.0,16,,,,United States,,...,Easy,Too short,No,"Computer science, computer engineering, or sof...",Django;Ruby on Rails,Ruby on Rails,Just as welcome now as I felt last year,,15,8.0


In [21]:
# Data cleanup and preparation: keep only the respondents who answered each language question

df_languages_current=df[df.LanguageWorkedWith.notna()]
df_languages_desired=df[df.LanguageDesireNextYear.notna()]


In [22]:
# Function to split each developer's ';'-separated language string and count the responses per language

def getAllLanguagesWithTotalResponseCount(lang_strings):
    '''
    INPUT:
    lang_strings - a pandas Series of ';'-separated programming language responses, one string per developer
    
    OUTPUT:
    a pandas Series indexed by individual programming language, with the total response count as values, sorted in descending order
    
    '''
    languages_dict=defaultdict(int)
    for lang in lang_strings:
        tmp=lang.split(';')
        for j in tmp:
            languages_dict[j]=languages_dict[j]+1
    
    return pd.Series(languages_dict).sort_values(ascending=False)
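
The same counting can also be expressed with pandas string methods instead of an explicit loop. The block below is a minimal sketch, not part of the original notebook: the function name `count_languages` and the toy responses are made up for illustration, and it assumes the responses are `;`-separated strings as in the survey columns.

```python
import pandas as pd

def count_languages(lang_strings: pd.Series) -> pd.Series:
    """Count how often each language appears in ';'-separated responses.

    Hypothetical helper for illustration; missing responses are skipped.
    """
    return (
        lang_strings.dropna()      # ignore respondents who gave no answer
        .str.split(';')            # 'Python;SQL' -> ['Python', 'SQL']
        .explode()                 # one row per (respondent, language) pair
        .value_counts()            # counts, sorted in descending order
    )

# Toy responses (made up, not survey data)
sample = pd.Series(['Python;SQL', 'Python', None, 'SQL;Go;Python'])
sample_counts = count_languages(sample)
print(sample_counts)
```

Because `dropna()` is applied inside the helper, this version would not need the separate `notna()` filtering step, though keeping that step does no harm.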


In [23]:
# Count the responses for each programming language that developers currently work with
languages_current=getAllLanguagesWithTotalResponseCount(df_languages_current.LanguageWorkedWith)
languages_current

JavaScript               38822
HTML/CSS                 36181
SQL                      31413
Python                   25287
Java                     23074
Bash/Shell/PowerShell    18980
C#                       18041
PHP                      15007
TypeScript               14578
C++                      13707
C                        12487
Go                        5038
Kotlin                    4468
Ruby                      4046
Assembly                  3553
VBA                       3499
Swift                     3397
R                         3288
Rust                      2929
Objective-C               2340
Dart                      2280
Scala                     2052
Perl                      1796
Haskell                   1222
Julia                      519
dtype: int64

In [24]:
# Count the responses for each programming language that developers want to work with next year
languages_desired=getAllLanguagesWithTotalResponseCount(df_languages_desired.LanguageDesireNextYear)
languages_desired

Python                   26682
JavaScript               26188
HTML/CSS                 20771
SQL                      19970
TypeScript               17150
C#                       13674
Java                     13264
Go                       12605
Bash/Shell/PowerShell    11728
Rust                     10563
C++                       9756
Kotlin                    9575
PHP                       7106
C                         6091
Swift                     5643
Dart                      4742
R                         4271
Ruby                      4184
Scala                     3465
Haskell                   2996
Assembly                  2469
Julia                     1661
Objective-C               1525
Perl                      1150
VBA                       1055
dtype: int64
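
Raw counts like these can be normalised in more than one way. Dividing by the sum of all counts gives each language's share of all language mentions, while dividing by the number of respondents who answered gives the percentage of developers who named the language. The block below is a small sketch of the second option; the function name `percent_of_respondents` and the toy responses are assumptions for illustration, not part of the notebook or the survey data.

```python
import pandas as pd

def percent_of_respondents(lang_strings: pd.Series) -> pd.Series:
    """Percentage of answering respondents who named each language.

    Hypothetical helper for illustration; assumes ';'-separated strings.
    """
    answered = lang_strings.dropna()                        # respondents who answered
    counts = answered.str.split(';').explode().value_counts()
    return counts * 100 / len(answered)                     # normalise by respondents

# Toy responses (made up, not survey data): three of four respondents answered
sample_responses = pd.Series(['Python;SQL', 'Python', 'SQL;Go;Python', None])
sample_pct = percent_of_respondents(sample_responses)
print(sample_pct.round(1))
```

With this normalisation the percentages can add up to more than 100, because one respondent may name several languages.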

In [25]:
# Tabulate each language's share of all language mentions (in %) for worked-with and desired responses, with the difference shown as in-cell bars
languages_current_perct=pd.DataFrame(languages_current*100/languages_current.sum()).reset_index()
languages_current_perct.columns=['Programming Languages','developers_worked/working_with_%']
languages_current_perct.set_index('Programming Languages', inplace=True)
languages_current_perct

languages_desired_perct=pd.DataFrame(languages_desired*100/languages_desired.sum()).reset_index()
languages_desired_perct.columns=['Programming Languages','developers_desired_to_work_%']
languages_desired_perct.set_index('Programming Languages', inplace=True)
languages_desired_perct

comp_df = pd.merge(languages_current_perct, languages_desired_perct, left_index=True, right_index=True) 
comp_df.sort_values(by='developers_desired_to_work_%',ascending=False,inplace=True)
comp_df['difference_%'] = comp_df['developers_desired_to_work_%'] - comp_df['developers_worked/working_with_%']
comp_df.style.bar(subset=['difference_%'], align='mid', color=['#d65f5f', '#5fba7d'])

Programming Languages,developers_worked/working_with_%,developers_desired_to_work_%,difference_%
Python,8.78009,11.1976,2.41748
JavaScript,13.4797,10.9902,-2.48943
HTML/CSS,12.5627,8.71691,-3.84576
SQL,10.9071,8.38076,-2.52638
TypeScript,5.06174,7.19729,2.13556
C#,6.26415,5.73853,-0.525619
Java,8.01169,5.56647,-2.44523
Go,1.74928,5.28991,3.54062
Bash/Shell/PowerShell,6.59019,4.92186,-1.66833
Rust,1.017,4.43295,3.41595
