# Introduction

This kernel builds on the data cleaning and preparation done by HASAN BASRI AKÇAY in the following notebook:

https://www.kaggle.com/hasanbasriakcay/what-has-changed-in-data-science/comments?kernelSessionId=84205371


# Purpose 

The objective of this notebook is to explore the changes in data science with Plotly Express charts. Along the way, we learn about the interfaces Plotly Express offers to render the data in a more legible, friendly and usable manner, and find some interesting hacks that make chart rendering more fun and rewarding.

1. Introduction
2. Data Preparation
3. Data Cleaning
4. Data rendering with Plotly

***Steps 1 to 3 were done by Hasan.***


# Conclusion

I believe that one has to see the result before exploring how it was achieved. You will see the plots first, then the conclusion. The rest of the code is hidden, but it can be expanded, viewed and learned from.

In [None]:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots


warnings.simplefilter("ignore")
sns.set()

df21 = pd.read_csv('../input/kaggle-survey-2021/kaggle_survey_2021_responses.csv')
df20 = pd.read_csv('../input/kaggle-survey-2020/kaggle_survey_2020_responses.csv')

# Data Preparation

In this part, we created 3 functions that are used to simplify the datasets.

In the datasets, some questions span more than one column, and the function **group_cols** groups the columns of each question. For example, Q24 is one group, and Q12_Part_1, Q12_Part_2, Q12_Part_3, Q12_OTHER together form one group.
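
As a rough illustration of this grouping rule, here is a minimal sketch that uses only the standard library. The helper name `group_by_question` and the short column list are hypothetical. Unlike **group_cols**, which goes through a set, this sketch keeps the original column order.

```python
from collections import defaultdict

def group_by_question(columns):
    """Group column names by the text before the first underscore."""
    singles = [c for c in columns if "_" not in c]
    parts = defaultdict(list)
    for c in columns:
        if "_" in c:
            parts[c.split("_")[0]].append(c)
    # Keep only prefixes that really span several columns, as group_cols does.
    multi = [cols for cols in parts.values() if len(cols) > 1]
    return singles + multi

# Hypothetical column names, for illustration only.
columns = ["Q1", "Q24", "Q12_Part_1", "Q12_Part_2", "Q12_OTHER"]
print(group_by_question(columns))
# ['Q1', 'Q24', ['Q12_Part_1', 'Q12_Part_2', 'Q12_OTHER']]
```

Note that a column with an underscore whose prefix appears only once is dropped by this rule, both here and in **group_cols**.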

The function **part_cols_convert** converts the questions that have more than one column into a single column. For instance, it converts Q12_Part_1, Q12_Part_2, Q12_Part_3, Q12_OTHER into one Q12 column.
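
The idea of collapsing part columns can be sketched on a toy frame. This is not the notebook's implementation: **part_cols_convert** rebuilds the answers from the choice label in the question row and its count, while the hypothetical helper `collapse_parts` below stacks the actual answers with `melt`. The toy data and question text are made up.

```python
import pandas as pd

# Hypothetical miniature of the survey layout: row 0 holds the question text,
# later rows hold the chosen label, or None when the choice was not selected.
toy = pd.DataFrame({
    "Q7_Part_1": ["Which languages? - Selected Choice - Python", "Python", "Python", None],
    "Q7_Part_2": ["Which languages? - Selected Choice - R", None, "R", None],
})

def collapse_parts(df, cols, new_name):
    """Stack the selected answers of several part columns into one Series."""
    answers = df.loc[1:, cols]          # drop the question-text row
    stacked = answers.melt()["value"]   # one long column of answers
    return stacked.dropna().str.strip().rename(new_name).reset_index(drop=True)

q7 = collapse_parts(toy, ["Q7_Part_1", "Q7_Part_2"], "Q7")
print(q7.value_counts().to_dict())
# {'Python': 2, 'R': 1}
```

Either way, the result is one long column per question in which every selected choice appears once per respondent who picked it.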

The last function is **dict_preparation**, which matches the same question across 2020 and 2021. Of course, some questions in the datasets have the same meaning but different wording. We solved that kind of problem with manual correction. For example, Q12 is "Which types of specialized hardware do you use on a regular basis?  (Select all that apply) - Selected Choice - GPUs" in 2020 and "Which types of specialized hardware do you use on a regular basis?  (Select all that apply) - Selected Choice -  NVIDIA GPUs" in 2021.
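
The matching step can be pictured with plain dictionaries. This is a simplified sketch, not **dict_preparation** itself: the helper `match_questions`, the shortened question texts and the Q9/Q10 pair are all hypothetical. Only the Q12 wording difference comes from the text above.

```python
def match_questions(questions_a, questions_b, manual=None):
    """Map question ids in year A to ids in year B that have identical text."""
    text_to_id_b = {text: qid for qid, text in questions_b.items()}
    mapping = {qid: text_to_id_b[text]
               for qid, text in questions_a.items() if text in text_to_id_b}
    mapping.update(manual or {})  # hand-written corrections win
    return mapping

# Hypothetical, shortened question texts.
q2020 = {"Q6": "How long have you been programming?",
         "Q12": "Which hardware do you use? - GPUs",
         "Q9": "Example question whose number shifted"}
q2021 = {"Q6": "How long have you been programming?",
         "Q12": "Which hardware do you use? - NVIDIA GPUs",
         "Q10": "Example question whose number shifted"}

print(match_questions(q2020, q2021, manual={"Q12": "Q12"}))
# {'Q6': 'Q6', 'Q9': 'Q10', 'Q12': 'Q12'}
```

Exact text comparison finds Q6 and the renumbered question automatically, while Q12 needs the manual entry because its wording changed.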

After all the preparation, we combined the 2020 and 2021 surveys with the function **prepare_data**.

In [None]:
def group_cols(df):
    cols = df.columns # listing the columns
    
    col_part = [] # creating empty array
    for col in cols: 
        if '_' in col: #This checks if the column is part of the questions
            col_part.append(col) #Appending the column names to the array
    
    cols_1 = list(set(cols) - set(col_part)) #Removes the elements that are having underscores.
    
    temp_df = pd.DataFrame(col_part) #creating dataframe with the columns that are questions part
    temp_df['question'] = temp_df[0].str.split('_').str[0]
    temp_group = temp_df.groupby('question')[0] #This gives a groupby object, which is iterated over below
    
    cols_2 = []
    for name, group in temp_group:
        if len(group) > 1: # This checks if there are multiple values in the group. 
            cols_2.append(list(group.values)) #Then it builds a list of lists, one inner list per question
    
    return list(cols_1+cols_2)

In [None]:
def part_cols_convert(df):
    cols = df.columns
    
    col_part = []
    for col in cols:
        if '_' in col:
            col_part.append(col) #till here same as group cols
    
    temp_df = pd.DataFrame(col_part)
    temp_df['question'] = temp_df[0].str.split('_').str[0]
    temp_group = temp_df.groupby('question')[0] #gives out a groupby object
    
    cols_2 = []
    for name, group in temp_group:
        if len(group) > 1:
            cols_2.append(list(group.values)) # creates a column list
    
    part_df_list = []
    for cols in cols_2:
        part_df = pd.DataFrame()
        new_col = cols[0].split('_')[0]#This will be the question number
        
        values_list = []
        for col in cols:
            str_value = df.loc[0, col].split('-')[-1]# choice label: the text after the last '-' in the question row
            count_num = df[col].value_counts().iloc[0] #count of the most frequent value (normally the selected choice)
            values = [str_value for i in range(count_num)]
            values_list.extend(values) # each list element is appended as a separate element 
        
        part_df[new_col] = values_list #list is populated under the new_col column name
        part_df_list.append(part_df) #part_df is appended to the main part_df_list
    
    df_parts = pd.concat(part_df_list, axis=1) #axis must be passed by keyword in recent pandas versions
    return df_parts

In [None]:
def dict_preparation(question_2020, question_2021, df20, df21):
    same_questions_dict = {} #Creating empty dictionaries
    question_mean_dict = {} #Maps each 2020 question number to its question text

    for c_20 in question_2020: #iterates on the question series that is created from group cols
        if type(c_20) is list:
            c_20 = c_20[0]
            question_mean_dict[c_20.split('_')[0]] = df20.loc[0, c_20]
            #print('c_20:', c_20 , df20.loc[0, c_20])
        else:
            question_mean_dict[c_20] = df20.loc[0, c_20]
        q_20 = df20.loc[0, c_20]

        for c_21 in question_2021:
            if type(c_21) is list:
                c_21 = c_21[0]
            q_21 = df21.loc[0, c_21]
            #print('c_21:', c_21, q_21)
            if q_21 == q_20:
                if '_' in c_20:
                    if '_' in c_21:
                        same_questions_dict[c_20.split('_')[0]] = c_21.split('_')[0]
                    else:
                        same_questions_dict[c_20.split('_')[0]] = c_21
                else:
                    if '_' in c_21:
                        same_questions_dict[c_20] = c_21.split('_')[0]
                    else:
                        same_questions_dict[c_20] = c_21
                break
    return same_questions_dict, question_mean_dict

In [None]:
df20_parts = part_cols_convert(df20)
df21_parts = part_cols_convert(df21)

question_2020 = group_cols(df20) #List of question headers: single columns as strings, multi-part questions as lists of columns
question_2021 = group_cols(df21)

same_questions_dict, question_mean_dict = dict_preparation(question_2020, question_2021, df20, df21)

In [None]:
from termcolor import colored

diff_questions_20_list = ['Q12_Part_1', 'Q27_A_Part_1', 'Q27_B_Part_1', 'Q28_A_Part_1', 'Q28_B_Part_1', 'Q36_Part_1']
diff_questions_21_list = ['Q12_Part_1', 'Q27_A_Part_1', 'Q27_B_Part_1', 'Q28', 'Q36_A_Part_1', 'Q36_B_Part_1']
more_questions_list = ['Q40_Part_1', 'Q41', 'Q42_Part_1']

print(colored('1) ', 'green'), df20.loc[0, 'Q12_Part_1'], ' - ', df21.loc[0, 'Q12_Part_1'])
print(colored('2) ', 'green'), df20.loc[0, 'Q27_A_Part_1'], ' - ', df21.loc[0, 'Q27_A_Part_1'])
print(colored('3) ', 'green'), df20.loc[0, 'Q27_B_Part_1'], ' - ', df21.loc[0, 'Q27_B_Part_1'])

print(colored('3.5) ', 'green'), df20.loc[0, 'Q26_A_Part_1'], ' - ', df21.loc[0, 'Q27_A_Part_1'])

print(colored('4) ', 'red'), df20.loc[0, 'Q28_A_Part_1'], ' - ', df21.loc[0, 'Q28'])
print(colored('5) ', 'red'), df20.loc[0, 'Q28_B_Part_1'], ' - ', df21.loc[0, 'Q28'])
print(colored('6) ', 'red'), df20.loc[0, 'Q36_Part_1'], ' - ', df21.loc[0, 'Q36_A_Part_1'])
print(colored('7) ', 'red'), df20.loc[0, 'Q36_Part_1'], ' - ', df21.loc[0, 'Q36_B_Part_1'])

print(colored('8) ', 'blue'), df21.loc[0, 'Q40_Part_1'])
print(colored('9) ', 'blue'), df21.loc[0, 'Q41'])
print(colored('10) ', 'blue'), df21.loc[0, 'Q42_Part_1'])

same_questions_dict['Q12'] = 'Q12'

In [None]:
not_exist_2020 = ['Q27', 'Q28', 'Q36']
not_exist_2021 = ['Q20', 'Q28', 'Q29', 'Q30', 'Q31', 'Q39']

print(colored('1) 2020 --- ', 'blue'), df20.loc[0, 'Q27_A_Part_1'])
print(colored('2) 2020 --- ', 'blue'), df20.loc[0, 'Q28_A_Part_1'])
print(colored('3) 2020 --- ', 'blue'), df20.loc[0, 'Q36_Part_1'])
print()
print(colored('1) 2021 --- ', 'red'), df21.loc[0, 'Q20'])
print(colored('2) 2021 --- ', 'red'), df21.loc[0, 'Q28'])
print(colored('3) 2021 --- ', 'red'), df21.loc[0, 'Q29_A_Part_1'])
print(colored('4) 2021 --- ', 'red'), df21.loc[0, 'Q30_A_Part_1'])
print(colored('5) 2021 --- ', 'red'), df21.loc[0, 'Q31_A_Part_1'])
print(colored('6) 2021 --- ', 'red'), df21.loc[0, 'Q39_Part_1'])

same_questions_dict['Q27'] = 'Q29'
same_questions_dict['Q28'] = 'Q31'
same_questions_dict['Q36'] = 'Q39'

In [None]:
def prepare_data(same_questions_dict, df20, df21, df20_parts, df21_parts):
    cols_20, cols_21 = [], []
    part_cols_20, part_cols_21 = [], []
    for key in same_questions_dict.keys():
        if key in df20_parts.columns:
            part_cols_20.append(key)
            part_cols_21.append(same_questions_dict[key])
        else:
            cols_20.append(key)
            cols_21.append(same_questions_dict[key])
    
    df20['years'] = 2020
    df21['years'] = 2021
    df20_parts['years'] = 2020
    df21_parts['years'] = 2021
    
    cols_20.append('years')
    cols_21.append('years')
    part_cols_20.append('years')
    part_cols_21.append('years')
    
    temp_df21 = df21[cols_21]
    temp_df21.columns = cols_20
    temp_df21_parts = df21_parts[part_cols_21]
    temp_df21_parts.columns = part_cols_20
    
    df_20_21 = pd.concat([df20[cols_20].loc[1:, :], temp_df21.loc[1:, :]], join='outer')
    df_part_20_21 = pd.concat([df20_parts[part_cols_20].loc[1:, :], temp_df21_parts.loc[1:, :]], join='outer')
    
    return df_20_21, df_part_20_21

In [None]:
df_20_21, df_part_20_21 = prepare_data(same_questions_dict, df20, df21, df20_parts, df21_parts)

print(df_20_21.shape)
print(df_part_20_21.shape)

# Data Cleaning

Data cleaning is one of the most important parts of data science. As with most datasets, this dataset needs cleaning. In my view, some answers were split in 2021, for example 'Product/Project Manager' into 'Program/Project Manager' and 'Product Manager', and some answers were fixed, for example PostgresSQL to PostgreSQL. Below, we try to match the same answers across both years.
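
Several of the labels cleaned below contain characters such as '$', '(' and ')' that have a special meaning in regular expressions, and the default of the `regex` argument of `Series.str.replace` changed from True to False in pandas 2.0. A minimal sketch, on hypothetical labels, of how passing `regex=False` keeps the replacement literal in every version:

```python
import pandas as pd

# Hypothetical answer labels containing regex metacharacters.
s = pd.Series(["$0-999", "> $500,000", "Jupyter (JupyterLab, etc)"])

# regex=False treats the pattern as plain text in every pandas version.
no_dollar = s.str.replace("$", "", regex=False)
no_parens = (s.str.replace("(", "", regex=False)
              .str.replace(")", "", regex=False))

print(no_dollar.tolist())
# ['0-999', '> 500,000', 'Jupyter (JupyterLab, etc)']
print(no_parens.tolist())
# ['$0-999', '> $500,000', 'Jupyter JupyterLab, etc']
```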

In [None]:
df_20_21_clean = df_20_21.copy()
df_part_20_21_clean = df_part_20_21.copy()

df_20_21_clean['Q6'] = df_20_21_clean['Q6'].str.replace('1-3 years', '1-2 years')
df_20_21_clean['Q30'] = df_20_21_clean['Q30'].str.replace('PostgresSQL', 'PostgreSQL')
df_20_21_clean['Q4'] = df_20_21_clean['Q4'].str.replace('Professional degree', 'Professional doctorate')
df_20_21_clean['Q5'] = df_20_21_clean['Q5'].str.replace('Program/Project Manager', 'Product/Project Manager')
df_20_21_clean['Q5'] = df_20_21_clean['Q5'].str.replace('Product Manager', 'Product/Project Manager')
df_20_21_clean['Q24'] = df_20_21_clean['Q24'].str.replace('$', '', regex=False) #'$' is a regex anchor, so match it literally
df_20_21_clean['Q24'] = df_20_21_clean['Q24'].str.replace('300,000-499,999', '300,000-500,000')
df_20_21_clean['Q24'] = df_20_21_clean['Q24'].str.replace('500,000-999,999', '> 500,000')
df_20_21_clean['Q24'] = df_20_21_clean['Q24'].str.replace('>1,000,000', '> 500,000')
df_20_21_clean['Q11'] = df_20_21_clean['Q11'].str.replace('A personal computer / desktop', 'A personal computer or laptop')
df_20_21_clean['Q11'] = df_20_21_clean['Q11'].str.replace('A laptop', 'A personal computer or laptop')
df_20_21_clean['Q11'] = df_20_21_clean['Q11'].str.replace('A cloud computing platform (AWS, Azure, GCP, hosted notebooks, etc)', 'Cloud computing platform')

df_part_20_21_clean['Q10'] = df_part_20_21_clean['Q10'].str.replace('  Amazon Sagemaker Studio Notebooks ', '  Amazon Sagemaker Studio ')
df_part_20_21_clean['Q10'] = df_part_20_21_clean['Q10'].str.replace('\n', '')
df_part_20_21_clean['Q10'] = df_part_20_21_clean['Q10'].str.replace(' Google Cloud Datalab Notebooks', ' Google Cloud Datalab')
df_part_20_21_clean['Q10'] = df_part_20_21_clean['Q10'].str.replace(' Google Cloud AI Platform Notebooks ', ' Google Cloud Notebooks (AI Platform / Vertex AI) ')
df_part_20_21_clean['Q29'] = df_part_20_21_clean['Q29'].str.replace('PostgresSQL', 'PostgreSQL')
df_part_20_21_clean['Q29'] = df_part_20_21_clean['Q29'].str.replace('\n', '')
df_part_20_21_clean['Q29'] = df_part_20_21_clean['Q29'].str.replace(' Microsoft Azure SQL Database ', ' Microsoft Azure Data Lake Storage ')
df_part_20_21_clean['Q29'] = df_part_20_21_clean['Q29'].str.replace(' Microsoft Azure Cosmos DB ', ' Microsoft Azure Data Lake Storage ')
df_part_20_21_clean['Q33'] = df_part_20_21_clean['Q33'].str.replace('\n', '')
df_part_20_21_clean['Q33'] = df_part_20_21_clean['Q33'].str.replace('(', '')
df_part_20_21_clean['Q33'] = df_part_20_21_clean['Q33'].str.replace(')', '')
df_part_20_21_clean['Q33'] = df_part_20_21_clean['Q33'].str.replace(' Automation of full ML pipelines e.g. Google AutoML, H2O Driverless AI', 
                                                        ' Automation of full ML pipelines AutoML')
df_part_20_21_clean['Q33'] = df_part_20_21_clean['Q33'].str.replace(' Automation of full ML pipelines e.g. Google Cloud AutoML, H2O Driverless AI', 
                                                        ' Automation of full ML pipelines Cloud AutoML')
df_part_20_21_clean['Q33'] = df_part_20_21_clean['Q33'].str.replace(' Automation of full ML pipelines e.g. Google AutoML, H20 Driverless AI', 
                                                        ' Automation of full ML pipelines AutoML')
df_part_20_21_clean['Q33'] = df_part_20_21_clean['Q33'].str.replace(' Automation of full ML pipelines e.g. Google Cloud AutoML, H20 Driverless AI', 
                                                        ' Automation of full ML pipelines Cloud AutoML')
df_part_20_21_clean['Q34'] = df_part_20_21_clean['Q34'].str.replace('  H20 Driverless AI  ', '  H2O Driverless AI  ')
df_part_20_21_clean['Q9'] = df_part_20_21_clean['Q9'].str.replace('  Visual Studio / Visual Studio Code ', '  VisualStudio ')
df_part_20_21_clean['Q9'] = df_part_20_21_clean['Q9'].str.replace('(', '')
df_part_20_21_clean['Q9'] = df_part_20_21_clean['Q9'].str.replace(')', '')
df_part_20_21_clean['Q9'] = df_part_20_21_clean['Q9'].str.replace('  Visual Studio Code VSCode ', '  VisualStudio ')
df_part_20_21_clean['Q9'] = df_part_20_21_clean['Q9'].str.replace('  Visual Studio ', '  VisualStudio ')
df_part_20_21_clean['Q9'] = df_part_20_21_clean['Q9'].str.replace('  VisualStudio ', '  Visual Studio / Visual Studio Code ')
df_part_20_21_clean['Q9'] = df_part_20_21_clean['Q9'].str.replace(' Jupyter (JupyterLab, Jupyter Notebooks, etc) ', '  Jupyter Notebook')
df_part_20_21_clean['Q9'] = df_part_20_21_clean['Q9'].str.replace('  Jupyter Notebook', ' Jupyter (JupyterLab, Jupyter Notebooks, etc) ')
#df_part_20_21_clean['Q12'] = df_part_20_21_clean['Q12'].str.replace('  Google Cloud TPUs ', ' TPUs')
#df_part_20_21_clean['Q12'] = df_part_20_21_clean['Q12'].str.replace('  NVIDIA GPUs ', ' GPUs')
#df_part_20_21_clean['Q27'] = df_part_20_21_clean['Q27'].str.replace('  Amazon Elastic Container Service ', '  Amazon Elastic Compute Cloud (EC2) ')
#df_part_20_21_clean['Q27'] = df_part_20_21_clean['Q27'].str.replace('  Microsoft Azure Container Instances ', '  Microsoft Azure Virtual Machines ')


#print(df_20_21_clean['Q11'].unique())

# Data Analysis

In this part, we plot all questions to get a visual overview, using the helper functions below.

Function **long_sentences_seperate** is used for visual editing. For instance, if a question or an answer text is too long for plotting, this function wraps it with textwrap and joins the pieces with '<br>', which Plotly renders as a line break.
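
A minimal, standard-library sketch of this wrapping step. The helper name `wrap_for_plotly` is hypothetical, and the label is a shortened version of the Q12 question text.

```python
import textwrap

def wrap_for_plotly(text, width=30):
    """Wrap text to at most `width` characters per line, joined with HTML breaks."""
    return "<br>".join(textwrap.wrap(text, width))

label = "Which types of specialized hardware do you use on a regular basis?"
wrapped = wrap_for_plotly(label, 30)
print(wrapped)

# Every wrapped line respects the requested width.
print(max(len(line) for line in wrapped.split("<br>")))
```

A smaller width gives narrower tick labels and facet titles at the cost of more lines, which is why the notebook uses 40 for questions and 25 for answer choices.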

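The **barplot_all_cols** function of the original notebook is replaced here by two functions: **prepared_df**, which counts the answers per question and year, and **plot_helper**, which plots them. For color, we selected the 'years' column.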

In [None]:
import textwrap
def long_sentences_seperate(sentence, width=30):
    try:
        splittext = textwrap.wrap(sentence,width)
        text = '<br>'.join(splittext)#wrapped lines are joined with <br>, which Plotly renders as a line break
        return text
    except Exception: #non-string input (e.g. NaN) is returned unchanged
        return sentence

In [None]:
def prepared_df(df, question_mean_dict, df_cols, figsize=(24, 96)):
    response_num_2020 = df20.shape[0] #gets the number of rows
    response_num_2021 = df21.shape[0] 
    
    ncols = 2 
    nrows = -(-len(df_cols) // ncols) #ceiling division, so an odd number of columns keeps its last one
    counted_df = pd.DataFrame() #Initialising the empty dataframe to store the prepared data
    index = 0
    for row in range(nrows):
        for col in range(ncols):
            try:
                col_name = df_cols[index]
                question = question_mean_dict[col_name]
                question = long_sentences_seperate(question,40)
            except (IndexError, KeyError):
                index += 1 #no column left in this slot, or no question text for it: skip
                continue
            
            if col_name == 'Q3': #This Q3 is the country, all countries are not necessary
                selected_countries = df[col_name].value_counts(normalize=True).index[:10]
                temp_df = df[df[col_name].isin(selected_countries)] #Only selected top 10 countries 
                
                temp_df = temp_df.groupby([col_name, 'years']).agg({col_name:'count'})
                temp_df.columns = ['counts']
                temp_df.reset_index(inplace=True) 
            else:
                temp_df = df.groupby([col_name, 'years']).agg({col_name:'count'}) #Only take the count change
                #Occurrences of each answer choice are counted per year. Whether one respondent
                #selected several choices is not tracked.
                temp_df.columns = ['counts']
                temp_df.reset_index(inplace=True)
            
            temp_df.loc[temp_df['years'] == 2020, 'counts_norm'] = temp_df.loc[temp_df['years'] == 2020, 'counts'] / response_num_2020
            temp_df.loc[temp_df['years'] == 2021, 'counts_norm'] = temp_df.loc[temp_df['years'] == 2021, 'counts'] / response_num_2021
            temp_df['choices'] = temp_df[col_name].apply(lambda x: long_sentences_seperate(x, 25))
            temp_df['quest_no'] = col_name #The question number is added for later use with Plotly
            temp_df['question_asked'] = question
            ### Find The Order That Biggest Change to Lowest Change
            count_df = temp_df[col_name].value_counts() #Gets the value counts of the said column
            selected_values = list(count_df[count_df > 1].index) #Keep only the choices that appear in both years
            clean_temp_df = temp_df[temp_df[col_name].isin(selected_values)] #df is created with the selected values
            #Below we calculate the %change, and prioritize the max change
            changes_list = ((clean_temp_df.loc[clean_temp_df['years'] == 2021, 'counts'].values - clean_temp_df.loc[clean_temp_df['years'] == 2020, 'counts'].values) / 
                            clean_temp_df.loc[clean_temp_df['years'] == 2020, 'counts'].values)
            change_twice_list = [] #each choice has two rows (2020 and 2021), so its change is stored twice
            for value in changes_list:
                change_twice_list.append(value)
                change_twice_list.append(value) #twice the values is appended to the list
            clean_temp_df['change'] = change_twice_list
            clean_temp_df.sort_values('change', inplace=True, ascending=False)
            order_list = list(clean_temp_df[col_name].unique()) #the column names are ordered
            temp_df_unique = temp_df[col_name].unique()
            diff_order = list(set(temp_df_unique) - set(order_list))
            if len(diff_order) > 0:
                order_list.extend(diff_order)
            ###
            #print(temp_df.shape)
            counted_df = pd.concat([counted_df,temp_df],join='outer')
            #temp_df (not clean_temp_df) is concatenated to counted_df
            index += 1
            
    return counted_df

In [None]:
# Initiating the Plotly express charting helper function 
def plot_helper(data_trial):
    data_trial.years = data_trial.years.astype('category')

    re_chart = px.bar(data_frame=data_trial,y='choices',x='counts',color='years',
                      facet_col='question_asked',width= 900,facet_col_wrap=2,facet_col_spacing=0.25)

    re_chart.update_yaxes(matches=None,automargin=False, #automargin=False stops Plotly from resizing the margins to fit the tick labels
                          tickfont=dict(family='Rockwell', color='black', size=10),
                         title=None)
    
    re_chart.update_yaxes(showticklabels=True, col=2) #Show the y-axis tick labels on the second facet column too

    re_chart.update_xaxes(matches=None) #matches=None lets each facet have its own x-axis range

    re_chart.update_layout(height = 1500,barmode='group',autosize=False,
                           margin=dict(l=150, r=150, t=100, b=5),paper_bgcolor="LightSteelBlue")
    re_chart.show()

In [None]:
DS_col_1 = ['Q11', 'Q32', 'Q3', 'Q4', 'Q1', 'Q38']
DS_col_2 = ['Q13', 'Q30', 'Q6', 'Q25', 'Q5', 'Q8']

trial_1 = prepared_df(df_20_21_clean, question_mean_dict, DS_col_1)
trial_2 = prepared_df(df_20_21_clean, question_mean_dict, DS_col_2)

In [None]:
#Reducing some of the very long ticks
trial_1.loc[trial_1.Q11 == 'A cloud computing platform (AWS, Azure, GCP, hosted notebooks, etc)','choices'] = 'A cloud computing platform'
trial_1.loc[trial_1.Q11 == 'A deep learning workstation (NVIDIA GTX, LambdaLabs, etc)','choices'] = 'A deep learning platform'
trial_1.loc[trial_1.choices.str.contains('Great'),'choices'] = 'UK' #Q3 (country) is in DS_col_1, so it lives in trial_1

In [None]:
#getting the data prepared for the rest of the questions, 2 PARTS
trial_part_1 = prepared_df(df_part_20_21_clean, question_mean_dict, df_part_20_21_clean.columns[:-9])
trial_part_2 = prepared_df(df_part_20_21_clean, question_mean_dict, df_part_20_21_clean.columns[9:-1])

In [None]:
plot_helper(trial_1)
plot_helper(trial_2)
plot_helper(trial_part_1)
plot_helper(trial_part_2)


### So is Plotly any use??? 

When I began my journey into the world of data visualisation, I started with Seaborn (and I still struggle with it a lot). Around that time, I was thinking about venturing into dashboards and started searching for Python libraries that support them.

#### In came Plotly and Dash

It was instant love. The pleasure of working starts with the picture we have in our mind. In the hands of Plotly, raw numbers transform easily into something very close to that picture.

#### What will we learn?

There are just two helper functions built on top of the prepared data. Some additional columns were added to make the Plotly rendering easier.

The textwrap library was suggested by the Plotly community here:
https://community.plotly.com/t/wrap-long-text-in-title-in-dash/11419/2?u=plotkamal
It solves the problem of very long chart titles when the questions are used directly as titles.
@Hasan used a split function that adds a new line ('\n') for Seaborn. The same does not work for Plotly, which needs '<br>' instead.

The plot_helper() function invokes the Plotly bar chart. A couple of options in the bar chart made the plots presentable. I am listing them here.
    
The following calls give each facet in a row its own y-axis ticks:

    re_chart.update_yaxes(matches=None)
    re_chart.update_yaxes(showticklabels=True, col=2)

The following code takes margin control away from Plotly and lets you set the margins yourself. It works together with the automargin=False option of the update_yaxes() function:

    re_chart.update_layout(height = 1500,barmode='group',autosize=False,
                           margin=dict(l=150, r=150, t=100, b=5),paper_bgcolor="LightSteelBlue")

As the saying goes, beauty is in the "eye of the beholder", and so is data visualisation. Plotly lets us achieve it, with a lot of help from data preparation and from external helper libraries, if you search for them.

Haappppyyy Visualising...