# Analysing the Stack Overflow Annual Developer Survey

## Introduction 
For a developer, it is important to stay up to date with new technologies and the most widely used languages, and to contribute to the open source community. In this project, we analyze data from the Stack Overflow Annual Developer Survey to find the most popular programming languages, databases, platforms, and other tools. We also look at the most used languages and tools in different countries.

## Dataset description

Our dataset covers the programming languages, databases and tools that developers reported using during 2022. It comes from Stack Overflow, one of the best-known platforms for developers, which runs an annual survey to gather insights into developers and their preferences. The survey is conducted in January and the results are published in May. It covers 170 countries and 56 languages, has 65 questions, and collects answers from 64,000 developers.

We'll also compare it with the data from the 2020 and 2021 surveys to see how preferences have changed.

Example of the survey:

In [1]:
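class PDF:
  """Embed a PDF file in the notebook output."""

  def __init__(self, pdf, size=(200, 200)):
    self.pdf = pdf    # path or URL of the PDF file
    self.size = size  # (width, height) of the iframe in pixels

  def _repr_html_(self):
    # Used by the notebook front end: show the PDF in an iframe.
    # The attributes are quoted so that paths containing spaces still work.
    return '<iframe src="{0}" width="{1[0]}" height="{1[1]}"></iframe>'.format(self.pdf, self.size)

  def _repr_latex_(self):
    # Used when the notebook is exported to LaTeX.
    return r'\includegraphics[width=1.0\textwidth]{{{0}}}'.format(self.pdf)

PDF('so_survey_2022.pdf', size=(800, 600))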

In [3]:
import pandas as pd

df = pd.read_csv('survey_results_public.csv')
df.head(5)

Unnamed: 0,ResponseId,MainBranch,Employment,RemoteWork,CodingActivities,EdLevel,LearnCode,LearnCodeOnline,LearnCodeCoursesCert,YearsCode,...,TimeSearching,TimeAnswering,Onboarding,ProfessionalTech,TrueFalse_1,TrueFalse_2,TrueFalse_3,SurveyLength,SurveyEase,ConvertedCompYearly
0,1,None of these,,,,,,,,,...,,,,,,,,,,
1,2,I am a developer by profession,"Employed, full-time",Fully remote,Hobby;Contribute to open-source projects,,,,,,...,,,,,,,,Too long,Difficult,
2,3,"I am not primarily a developer, but I write co...","Employed, full-time","Hybrid (some remote, some in-person)",Hobby,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",Books / Physical media;Friend or family member...,Technical documentation;Blogs;Programming Game...,,14.0,...,,,,,,,,Appropriate in length,Neither easy nor difficult,40205.0
3,4,I am a developer by profession,"Employed, full-time",Fully remote,I don’t code outside of work,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)","Books / Physical media;School (i.e., Universit...",,,20.0,...,,,,,,,,Appropriate in length,Easy,215232.0
4,5,I am a developer by profession,"Employed, full-time","Hybrid (some remote, some in-person)",Hobby,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)","Other online resources (e.g., videos, blogs, f...",Technical documentation;Blogs;Stack Overflow;O...,,8.0,...,,,,,,,,Too long,Easy,


In [4]:
df.columns

Index(['ResponseId', 'MainBranch', 'Employment', 'RemoteWork',
       'CodingActivities', 'EdLevel', 'LearnCode', 'LearnCodeOnline',
       'LearnCodeCoursesCert', 'YearsCode', 'YearsCodePro', 'DevType',
       'OrgSize', 'PurchaseInfluence', 'BuyNewTool', 'Country', 'Currency',
       'CompTotal', 'CompFreq', 'LanguageHaveWorkedWith',
       'LanguageWantToWorkWith', 'DatabaseHaveWorkedWith',
       'DatabaseWantToWorkWith', 'PlatformHaveWorkedWith',
       'PlatformWantToWorkWith', 'WebframeHaveWorkedWith',
       'WebframeWantToWorkWith', 'MiscTechHaveWorkedWith',
       'MiscTechWantToWorkWith', 'ToolsTechHaveWorkedWith',
       'ToolsTechWantToWorkWith', 'NEWCollabToolsHaveWorkedWith',
       'NEWCollabToolsWantToWorkWith', 'OpSysProfessional use',
       'OpSysPersonal use', 'VersionControlSystem', 'VCInteraction',
       'VCHostingPersonal use', 'VCHostingProfessional use',
       'OfficeStackAsyncHaveWorkedWith', 'OfficeStackAsyncWantToWorkWith',
       'OfficeStackSyncHaveWork

As seen above, the dataset has many columns, each representing a question asked in the survey. The questions cover the programming languages, databases, platforms, and tools that developers use, as well as some information about the developers themselves. Respondents come from all over the world and are asked to provide their level of education, gender, and work status.

Using this dataset, we could analyse the different features and their characteristics and the correlations between them, and try to predict which languages will be used more in the future and which will be used less. We could also try to predict a developer's salary, country, or age from the languages and tools they use, among many other variables.
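Before any prediction, a useful first step is to count how often each language is mentioned. The sketch below assumes that `LanguageHaveWorkedWith` stores multiple answers in one cell separated by semicolons, as `CodingActivities` does in the preview above. The helper name `count_choices` and the sample rows are made up for illustration and are not part of the survey data.

```python
import pandas as pd


def count_choices(frame: pd.DataFrame, column: str) -> pd.Series:
    """Count how often each option appears in a semicolon-separated column."""
    return (
        frame[column]
        .dropna()            # respondents who skipped the question
        .str.split(";")      # "Python;SQL" -> ["Python", "SQL"]
        .explode()           # one row per selected option
        .str.strip()
        .value_counts()
    )


# Tiny made-up sample standing in for survey_results_public.csv
sample = pd.DataFrame({
    "LanguageHaveWorkedWith": [
        "Python;SQL",
        "JavaScript;Python",
        None,
        "SQL;JavaScript;Python",
    ]
})

language_counts = count_choices(sample, "LanguageHaveWorkedWith")
print(language_counts)
```

On the real data, the same helper could be called as `count_choices(df, 'LanguageHaveWorkedWith')`, or with any of the other `...HaveWorkedWith` columns.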

In [10]:
df[df['Country'] == 'Morocco']

Unnamed: 0,ResponseId,MainBranch,Employment,RemoteWork,CodingActivities,EdLevel,LearnCode,LearnCodeOnline,LearnCodeCoursesCert,YearsCode,...,TimeSearching,TimeAnswering,Onboarding,ProfessionalTech,TrueFalse_1,TrueFalse_2,TrueFalse_3,SurveyLength,SurveyEase,ConvertedCompYearly
1579,1580,I am a developer by profession,"Employed, full-time","Hybrid (some remote, some in-person)",Hobby;Bootstrapping a business,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)","Books / Physical media;School (i.e., Universit...",,,12,...,30-60 minutes a day,15-30 minutes a day,Somewhat long,Microservices,Yes,No,Yes,Appropriate in length,Neither easy nor difficult,26484.0
2120,2121,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Fully remote,Hobby,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",Books / Physical media;Other online resources ...,Technical documentation;Blogs;Written Tutorial...,Coursera;Udemy;Pluralsight;Udacity,8,...,60-120 minutes a day,30-60 minutes a day,Very long,Microservices;Continuous integration (CI) and ...,Yes,Yes,Yes,Appropriate in length,Easy,90264.0
2651,2652,I am learning to code,"Student, full-time",,,"Secondary school (e.g. American high school, G...",Books / Physical media;Other online resources ...,Technical documentation;Blogs;Stack Overflow;V...,,2,...,,,,,,,,Appropriate in length,Easy,
2666,2667,I am a developer by profession,"Employed, full-time","Hybrid (some remote, some in-person)",Hobby;Bootstrapping a business;School or acade...,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)","Other online resources (e.g., videos, blogs, f...",Technical documentation;Blogs;Stack Overflow;O...,Coursera;Udemy;Codecademy;Skillsoft,6,...,Less than 15 minutes a day,Less than 15 minutes a day,Somewhat short,DevOps function;Continuous integration (CI) an...,Yes,Yes,Yes,Too long,Easy,14448.0
2745,2746,I am learning to code,"Not employed, but looking for work",,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",Books / Physical media;Other online resources ...,Technical documentation;Programming Games;Onli...,Codecademy,Less than 1 year,...,,,,,,,,Appropriate in length,Easy,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
71739,71740,I am a developer by profession,"Employed, full-time",Full in-person,Hobby;Bootstrapping a business;Freelance/contr...,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)","Books / Physical media;School (i.e., Universit...",,Other,6,...,15-30 minutes a day,15-30 minutes a day,Very long,None of these,Yes,No,Yes,Appropriate in length,Easy,20460.0
71925,71926,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Fully remote,Contribute to open-source projects;Freelance/c...,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",Books / Physical media;Other online resources ...,Blogs;Programming Games;Stack Overflow;Video-b...,Udemy,4,...,,,,,,,,Appropriate in length,Easy,
72047,72048,I code primarily as a hobby,"Not employed, but looking for work",,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",Books / Physical media;Online Courses or Certi...,,Udemy,3,...,,,,,,,,Appropriate in length,Neither easy nor difficult,
72330,72331,I am a developer by profession,"Student, full-time;Not employed, but looking f...",,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)","Other online resources (e.g., videos, blogs, f...",Blogs;Written Tutorials;Stack Overflow;Online ...,,4,...,,,,,,,,Appropriate in length,Neither easy nor difficult,
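The filter above returns whole rows for one country. To see which languages or tools are most used there, the filtered rows can be reduced to a share per option. This is a minimal sketch: the function name `top_choices_for_country` and the sample rows are hypothetical, and it assumes multi-select answers are separated by semicolons.

```python
import pandas as pd


def top_choices_for_country(frame, country, column, n=3):
    """Share of respondents from `country` who selected each option in `column`."""
    answers = frame.loc[frame["Country"] == country, column].dropna()
    counts = answers.str.split(";").explode().str.strip().value_counts()
    # Divide by the number of respondents who answered, not by the number of mentions.
    return (counts / len(answers)).head(n)


# Made-up rows for illustration only
sample = pd.DataFrame({
    "Country": ["Morocco", "Morocco", "Morocco", "Germany"],
    "LanguageHaveWorkedWith": ["Python;SQL", "Python", None, "Java"],
})

morocco_top = top_choices_for_country(sample, "Morocco", "LanguageHaveWorkedWith")
print(morocco_top)
```

Dividing by the number of respondents who answered keeps countries with very different sample sizes comparable.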


In [12]:
df_2021 = pd.read_csv('survey_results_public_2021.csv')
df_2021.head(5)

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,...,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
0,1,I am a developer by profession,"Independent contractor, freelancer, or self-em...",Slovakia,,,"Secondary school (e.g. American high school, G...",18 - 24 years,Coding Bootcamp;Other online resources (ex: vi...,,...,25-34 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,62268.0
1,2,I am a student who is learning to code,"Student, full-time",Netherlands,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7.0,...,18-24 years old,Man,No,Straight / Heterosexual,White or of European descent,None of the above,None of the above,Appropriate in length,Easy,
2,3,"I am not primarily a developer, but I write co...","Student, full-time",Russian Federation,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",,...,18-24 years old,Man,No,Prefer not to say,Prefer not to say,None of the above,None of the above,Appropriate in length,Easy,
3,4,I am a developer by profession,Employed full-time,Austria,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,,,...,35-44 years old,Man,No,Straight / Heterosexual,White or of European descent,I am deaf / hard of hearing,,Appropriate in length,Neither easy nor difficult,
4,5,I am a developer by profession,"Independent contractor, freelancer, or self-em...",United Kingdom of Great Britain and Northern I...,,England,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",5 - 10 years,Friend or family member,17.0,...,25-34 years old,Man,No,,White or of European descent,None of the above,,Appropriate in length,Easy,


In [14]:
df_2021[df_2021['Country'] == 'Morocco']

Unnamed: 0,ResponseId,MainBranch,Employment,Country,US_State,UK_Country,EdLevel,Age1stCode,LearnCode,YearsCode,...,Age,Gender,Trans,Sexuality,Ethnicity,Accessibility,MentalHealth,SurveyLength,SurveyEase,ConvertedCompYearly
221,222,I am a student who is learning to code,"Not employed, but looking for work",Morocco,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",25 - 34 years,"Other online resources (ex: videos, blogs, etc)",,...,25-34 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,Prefer not to say,Appropriate in length,Neither easy nor difficult,
905,906,I am a developer by profession,"Student, part-time",Morocco,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc...",5,...,25-34 years old,Man,No,Straight / Heterosexual,I don't know,None of the above,None of the above,Appropriate in length,Easy,
1058,1059,I am a developer by profession,Employed full-time,Morocco,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",18 - 24 years,School;Online Courses or Certification;Colleague,6,...,25-34 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,
1232,1233,I code primarily as a hobby,"Not employed, but looking for work",Morocco,,,Some college/university study without earning ...,18 - 24 years,Online Courses or Certification,2,...,18-24 years old,Man,No,,,,,Too long,Easy,
1681,1682,I am a developer by profession,Employed full-time,Morocco,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc...",11,...,25-34 years old,Man,No,Straight / Heterosexual,Black or of African descent,None of the above,I have a concentration and/or memory disorder ...,,Neither easy nor difficult,12324.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
79917,79918,I am a developer by profession,"Not employed, but looking for work",Morocco,,,"Master’s degree (M.A., M.S., M.Eng., MBA, etc.)",11 - 17 years,"Other online resources (ex: videos, blogs, etc...",7,...,25-34 years old,Man,No,Straight / Heterosexual,"Or, in your own words:",None of the above,I have a mood or emotional disorder (e.g. depr...,Appropriate in length,Easy,
80615,80616,I code primarily as a hobby,"Not employed, but looking for work",Morocco,,,"Bachelor’s degree (B.A., B.S., B.Eng., etc.)",5 - 10 years,Coding Bootcamp;Other online resources (ex: vi...,10,...,18-24 years old,Man,No,Prefer to self-describe:,White or of European descent,None of the above,None of the above,Appropriate in length,Neither easy nor difficult,
82306,82307,"I used to be a developer by profession, but no...","Independent contractor, freelancer, or self-em...",Morocco,,,"Associate degree (A.A., A.S., etc.)",18 - 24 years,"Other online resources (ex: videos, blogs, etc...",2,...,18-24 years old,Man,No,Straight / Heterosexual,"Or, in your own words:",None of the above,None of the above,Appropriate in length,Neither easy nor difficult,
82565,82566,I code primarily as a hobby,"Independent contractor, freelancer, or self-em...",Morocco,,,Some college/university study without earning ...,Younger than 5 years,"Other online resources (ex: videos, blogs, etc...",3,...,18-24 years old,Man,No,Straight / Heterosexual,Middle Eastern,None of the above,I have an anxiety disorder,Too long,Neither easy nor difficult,


In [15]:
df_2020 = pd.read_csv('survey_results_public_2020.csv')
df_2020.head(5)

Unnamed: 0,Respondent,MainBranch,Hobbyist,Age,Age1stCode,CompFreq,CompTotal,ConvertedComp,Country,CurrencyDesc,...,SurveyEase,SurveyLength,Trans,UndergradMajor,WebframeDesireNextYear,WebframeWorkedWith,WelcomeChange,WorkWeekHrs,YearsCode,YearsCodePro
0,1,I am a developer by profession,Yes,,13,Monthly,,,Germany,European Euro,...,Neither easy nor difficult,Appropriate in length,No,"Computer science, computer engineering, or sof...",ASP.NET Core,ASP.NET;ASP.NET Core,Just as welcome now as I felt last year,50.0,36,27.0
1,2,I am a developer by profession,No,,19,,,,United Kingdom,Pound sterling,...,,,,"Computer science, computer engineering, or sof...",,,Somewhat more welcome now than last year,,7,4.0
2,3,I code primarily as a hobby,Yes,,15,,,,Russian Federation,,...,Neither easy nor difficult,Appropriate in length,,,,,Somewhat more welcome now than last year,,4,
3,4,I am a developer by profession,Yes,25.0,18,,,,Albania,Albanian lek,...,,,No,"Computer science, computer engineering, or sof...",,,Somewhat less welcome now than last year,40.0,7,4.0
4,5,"I used to be a developer by profession, but no...",Yes,31.0,16,,,,United States,,...,Easy,Too short,No,"Computer science, computer engineering, or sof...",Django;Ruby on Rails,Ruby on Rails,Just as welcome now as I felt last year,,15,8.0


In [16]:
df_2020[df_2020['Country'] == 'Morocco']

Unnamed: 0,Respondent,MainBranch,Hobbyist,Age,Age1stCode,CompFreq,CompTotal,ConvertedComp,Country,CurrencyDesc,...,SurveyEase,SurveyLength,Trans,UndergradMajor,WebframeDesireNextYear,WebframeWorkedWith,WelcomeChange,WorkWeekHrs,YearsCode,YearsCodePro
571,573,I am a developer by profession,Yes,26.0,15,Monthly,6700.0,8256.0,Morocco,Moroccan dirham,...,Easy,Appropriate in length,No,"Computer science, computer engineering, or sof...",Angular;Express;Laravel;React.js;Symfony;Vue.js,jQuery;Laravel,Just as welcome now as I felt last year,44.0,8,5
839,841,I am a developer by profession,Yes,28.0,18,Monthly,17300.0,21312.0,Morocco,Moroccan dirham,...,Easy,Appropriate in length,No,"Computer science, computer engineering, or sof...",Angular;ASP.NET;Laravel,Angular;ASP.NET;Laravel;Spring,Somewhat less welcome now than last year,40.0,11,6
1712,1721,I am a developer by profession,Yes,,18,,,,Morocco,Moroccan dirham,...,Neither easy nor difficult,Appropriate in length,No,"Computer science, computer engineering, or sof...",Django;Flask;Vue.js,Flask;jQuery;Vue.js,Just as welcome now as I felt last year,38.0,10,6
3430,3444,I am a student who is learning to code,Yes,21.0,18,,,,Morocco,,...,Easy,Appropriate in length,,"Computer science, computer engineering, or sof...",Express;Gatsby;React.js;Ruby on Rails,Express;React.js,Just as welcome now as I felt last year,,3,
3577,3591,I code primarily as a hobby,Yes,20.0,10,,,,Morocco,,...,Neither easy nor difficult,Appropriate in length,No,Web development or web design,jQuery,jQuery,Not applicable - I did not use Stack Overflow ...,,10,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
64022,65486,I am a student who is learning to code,Yes,,,,,,Morocco,,...,,,,,,,,,,
64315,39520,,Yes,,,,,,Morocco,,...,,,,,,,Not applicable - I did not use Stack Overflow ...,,,
64438,62391,,Yes,,Younger than 5 years,,,,Morocco,,...,Neither easy nor difficult,Too short,,,Angular.js;Express;React.js;Ruby on Rails,Angular.js;Express;React.js;Ruby on Rails,,,Less than 1 year,Less than 1 year
64454,64480,,Yes,,21,,,,Morocco,,...,,,,,,,Just as welcome now as I felt last year,,3,
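The three previews show that column names are not the same every year: the 2020 file uses `WebframeWorkedWith`, while the 2022 file uses `WebframeHaveWorkedWith`. A year-over-year comparison therefore needs a small mapping from year to column name. The sketch below is one possible way to do it. The 2021 column name is an assumption (it is not visible in the truncated preview), and the helper names and sample rows are made up for illustration.

```python
import pandas as pd

# Column holding the "worked with" web frameworks in each year's file.
# The 2021 entry is an assumption; check df_2021.columns before relying on it.
WEBFRAME_COLUMN = {
    2020: "WebframeWorkedWith",
    2021: "WebframeHaveWorkedWith",
    2022: "WebframeHaveWorkedWith",
}


def usage_share(frame, column):
    """Share of answering respondents who selected each option."""
    answers = frame[column].dropna()
    counts = answers.str.split(";").explode().str.strip().value_counts()
    return counts / len(answers)


def compare_years(frames):
    """Build a table with one row per option and one column per survey year."""
    shares = {
        year: usage_share(frame, WEBFRAME_COLUMN[year])
        for year, frame in frames.items()
    }
    # Options missing in a given year get a share of 0.
    return pd.DataFrame(shares).fillna(0.0)


# Made-up frames standing in for df_2020, df_2021 and df
frames = {
    2020: pd.DataFrame({"WebframeWorkedWith": ["jQuery;Laravel", "Angular;Laravel", None]}),
    2021: pd.DataFrame({"WebframeHaveWorkedWith": ["Laravel", "React.js"]}),
    2022: pd.DataFrame({"WebframeHaveWorkedWith": ["React.js;Laravel", "React.js"]}),
}

comparison = compare_years(frames)
print(comparison)
```

With the real data, the call would be `compare_years({2020: df_2020, 2021: df_2021, 2022: df})`. Option labels may also change between years, so the labels should be checked before trends are read from the table.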
