In this notebook, we dive into the question: Do those who prefer noise while working want their coworkers to work quietly? 

First, let's read in the data from the StackOverflow 2017 survey. 

In [4]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

df = pd.read_csv('./2017_SOF.csv')

Now, we clean up the data by removing any rows where the ClickyKeys column is blank or NaN.

In [5]:
df = df.dropna(subset = ['ClickyKeys'], axis = 0, how = 'all')
df

Unnamed: 0,Respondent,Professional,ProgramHobby,Country,University,EmploymentStatus,FormalEducation,MajorUndergrad,HomeRemote,CompanySize,...,StackOverflowMakeMoney,Gender,HighestEducationParents,Race,SurveyLong,QuestionsInteresting,QuestionsConfusing,InterestedAnswers,Salary,ExpectedSalary
0,1,Student,"Yes, both",United States,No,"Not employed, and not looking for work",Secondary school,,,,...,Strongly disagree,Male,High school,White or of European descent,Strongly disagree,Strongly agree,Disagree,Strongly agree,,
1,2,Student,"Yes, both",United Kingdom,"Yes, full-time",Employed part-time,Some college/university study without earning ...,Computer science or software engineering,"More than half, but not all, the time",20 to 99 employees,...,Strongly disagree,Male,A master's degree,White or of European descent,Somewhat agree,Somewhat agree,Disagree,Strongly agree,,37500.0
2,3,Professional developer,"Yes, both",United Kingdom,No,Employed full-time,Bachelor's degree,Computer science or software engineering,"Less than half the time, but at least one day ...","10,000 or more employees",...,Disagree,Male,A professional degree,White or of European descent,Somewhat agree,Agree,Disagree,Agree,113750.0,
3,4,Professional non-developer who sometimes write...,"Yes, both",United States,No,Employed full-time,Doctoral degree,A non-computer-focused engineering discipline,"Less than half the time, but at least one day ...","10,000 or more employees",...,Disagree,Male,A doctoral degree,White or of European descent,Agree,Agree,Somewhat agree,Strongly agree,,
5,6,Student,"Yes, both",New Zealand,"Yes, full-time","Not employed, and not looking for work",Secondary school,,,,...,Disagree,,A bachelor's degree,White or of European descent,Disagree,Agree,Disagree,Agree,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
51386,51387,Professional developer,"Yes, both",Romania,No,Employed full-time,Some college/university study without earning ...,Something else,It's complicated,100 to 499 employees,...,Somewhat agree,Male,High school,White or of European descent,Agree,Agree,Disagree,Somewhat agree,,
51387,51388,Professional developer,"Yes, I program as a hobby",United States,No,Employed full-time,Bachelor's degree,A social science,A few days each month,100 to 499 employees,...,Disagree,Male,A doctoral degree,East Asian; White or of European descent,Disagree,Agree,Strongly disagree,Strongly agree,58000.0,
51388,51389,Student,No,Venezuela,"Yes, full-time",Employed full-time,Master's degree,Computer programming or Web development,Never,100 to 499 employees,...,,Male,A master's degree,Black or of African descent; Hispanic or Latin...,Somewhat agree,Agree,Disagree,Agree,,
51390,51391,Professional developer,"Yes, I program as a hobby",United States,No,Employed full-time,Bachelor's degree,Computer science or software engineering,Never,Fewer than 10 employees,...,Disagree,Male,A bachelor's degree,White or of European descent,Disagree,Agree,Disagree,Strongly agree,40000.0,
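
As a minimal sketch of what the cleaning step does, the toy dataframe below (made-up values, not the survey data) shows that dropna with subset only removes rows that are missing a value in the named column and keeps the original index labels. That is consistent with the gap at index 4 in the output above.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the survey data; the values are made up for illustration.
toy = pd.DataFrame({
    "Respondent": [1, 2, 3, 4],
    "ClickyKeys": ["Yes", np.nan, "No", "Yes"],
    "Salary": [np.nan, 50000.0, np.nan, 40000.0],
})

# Only a missing ClickyKeys value causes a row to be dropped;
# missing Salary values are left alone.
cleaned = toy.dropna(subset=["ClickyKeys"])
rows_dropped = len(toy) - len(cleaned)

print(rows_dropped)
print(cleaned.index.tolist())
```

Comparing len(df) before and after the real dropna call is a quick way to see how many respondents skipped the question.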


Next, I define a function that encodes the categorical variable ClickyKeys: 1 for yes and 0 for no. ClickyKeys contains responses to the question: If two developers are in an office, is it OK for one of them to get a mechanical keyboard with "clicky" keys?

In [6]:
def clicky_keys(clicky_keys_str):
    '''
    input 
        clicky_keys_str - a string of one of the values from the ClickyKeys column
    output
        return 1 if yes
        return 0 if no
    '''
    # Exact match: any value other than "Yes" is coded as 0
    if clicky_keys_str == "Yes":
        return 1
    else:
        return 0
    
df["ClickyKeys"].apply(clicky_keys)[:66]

0     1
1     0
2     1
3     1
5     1
     ..
73    1
75    0
76    1
77    1
80    1
Name: ClickyKeys, Length: 66, dtype: int64
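
The same encoding can also be written without a Python-level function. The sketch below uses a small made-up Series and assumes the column only contains the strings "Yes" and "No" once missing values are dropped; the names answers, encoded_eq, and encoded_map are illustrative only.

```python
import pandas as pd

# Made-up answers standing in for df["ClickyKeys"].
answers = pd.Series(["Yes", "No", "Yes", "Yes"], name="ClickyKeys")

# Option 1: boolean comparison, then cast True/False to 1/0.
encoded_eq = (answers == "Yes").astype(int)

# Option 2: explicit mapping; any value not in the dict becomes NaN,
# which makes unexpected answers visible instead of coding them as 0.
encoded_map = answers.map({"Yes": 1, "No": 0})

print(encoded_eq.tolist())
print(encoded_map.tolist())
```

Both options give the same result here. The mapping version is a little stricter, since a typo or an unexpected category would show up as a missing value.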

I apply this function to create a new column named clicky, then split the data into two separate dataframes based on its value.

In [7]:
df['clicky'] = df["ClickyKeys"].apply(clicky_keys)

Now the categorical variable is encoded. The next cell splits the rows into ck_1 (okay with clicky keys) and ck_0 (not okay with clicky keys).

In [8]:
ck_1 = df[df["clicky"] == 1]
ck_0 = df[df["clicky"] == 0]
# print(ck_1)  # Check that it looks as expected
# print(ck_0)  # Check that it looks as expected

Within each dataset, I utilize the AuditoryEnvironment variable, which contains responses to the question: Suppose you are about to start a few hours of coding and have complete control over your auditory environment (music, background noise, etc.) What would you do?

For AuditoryEnvironment, I calculate how often each response occurs as a proportion of all responses, separately for each dataset (ck_0 and ck_1). I then combine the two results into a single table, indexed by response, that shows the ideal workspace noise of those who are okay with clicky keys next to that of those who are not.

In [9]:
df_1 = ck_1['AuditoryEnvironment'].value_counts(normalize=True)
df_0 = ck_0['AuditoryEnvironment'].value_counts(normalize=True)

comp_df = pd.merge(df_1, df_0,left_index = True, right_index = True)
comp_df.columns = ['okay with clicking','not okay with clicking']
comp_df['Difference'] = comp_df['okay with clicking'] - comp_df['not okay with clicking']
comp_df.style.bar(subset=['Difference'], align='mid', color=['#d65f5f', '#5fba7d'])

Unnamed: 0,okay with clicking,not okay with clicking,Difference
Turn on some music,0.638436,0.546579,0.091858
Keep the room absolutely quiet,0.201625,0.290018,-0.088393
"Put on some ambient sounds (e.g. whale songs, forest sounds)",0.065194,0.078732,-0.013538
Put on a movie or TV show,0.035237,0.029032,0.006205
Something else,0.034679,0.035154,-0.000475
Turn on the news or talk radio,0.024829,0.020486,0.004343
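
A comparable table can also be built in one step with pd.crosstab. This is a sketch on made-up data rather than the survey file, and the column labels are copied from the table above. With normalize="columns", each column sums to 1, so the cells are shares within the "okay" and "not okay" groups.

```python
import pandas as pd

# Made-up responses: four people per group, for illustration only.
toy = pd.DataFrame({
    "clicky": [1, 1, 1, 1, 0, 0, 0, 0],
    "AuditoryEnvironment": [
        "Turn on some music", "Turn on some music", "Turn on some music",
        "Keep the room absolutely quiet",
        "Turn on some music", "Turn on some music",
        "Keep the room absolutely quiet", "Keep the room absolutely quiet",
    ],
})

# Share of each response within each clicky group (columns sum to 1).
shares = pd.crosstab(toy["AuditoryEnvironment"], toy["clicky"], normalize="columns")
shares = shares.rename(columns={1: "okay with clicking", 0: "not okay with clicking"})
shares["Difference"] = shares["okay with clicking"] - shares["not okay with clicking"]

print(shares)
```

One difference from the merge approach: pd.merge defaults to an inner join, so a response that appears in only one group would be dropped, while crosstab keeps it with a share of 0 in the other group.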


Based on the table above, respondents who are okay with their coworker having clicky keys are more likely to pick an audible environment: music (63.8% vs. 54.7%), a movie or TV show (3.5% vs. 2.9%), and news or talk radio (2.5% vs. 2.0%). Respondents who are not okay with clicky keys are more likely to prefer an absolutely quiet room (29.0% vs. 20.2%) and, to a smaller extent, ambient sounds (7.9% vs. 6.5%). The two largest differences, each about 9 percentage points, are for music and for absolute quiet. Those who prefer music while working are more likely to be okay with their coworker having clicky keys, and those who prefer silence are less likely to be. In conclusion, people are not hypocrites.