## Plotting Notebook for Survey Responses

As described in the thesis, a questionnaire-based user study was created with Google Forms. The questionnaire is available in three languages, so the results have to be pre-processed before they can be compared.
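To illustrate what making answers comparable across languages can look like, here is a minimal sketch that maps translated answer labels onto one canonical English label. It is not the method used for the thesis (that pre-processing was done with Power Query); the mapping `ANSWER_TRANSLATIONS`, its Dutch entries, and the helper `normalize_answer` are hypothetical.

```python
# Hypothetical mapping from translated answer labels to canonical English labels.
# The real questionnaire labels may differ; extend this dict per language.
ANSWER_TRANSLATIONS = {
    "ja": "Yes",
    "misschien": "Maybe",
    "nee": "No",
}


def normalize_answer(raw: str) -> str:
    """Return the canonical English label, or the stripped input if unknown."""
    key = raw.strip().lower()
    return ANSWER_TRANSLATIONS.get(key, raw.strip())


print([normalize_answer(a) for a in [" Ja ", "Maybe", "nee"]])
```

Unknown labels pass through unchanged, so answers that are already in English are left as they are.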

This notebook uses the Excel file 'reactions_long.xlsx' to create Plotly visualizations in line with the other plots in the thesis.
To update with new results:
  - Download the xlsx files for all languages and put them in this folder.
  - Open 'reactions_long.xlsx' in Excel and hit Refresh All. Power Query was used to do some of the pre-processing.
  - Run this notebook to update all charts.

In [1]:
import pandas as pd
pd.options.plotting.backend = "plotly" #interactive plots will be useful in this context
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
import pickle
import textwrap
import dtale

In [2]:
xl = pd.ExcelFile('reactions_long.xlsx')
df_likert = pd.read_excel(xl, 'likert')
df_time = pd.read_excel(xl, 'time')
df_use = pd.read_excel(xl, 'would_use')
df_pers = pd.read_excel(xl, 'personal')
df_open = pd.read_excel(xl, 'open')
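Since the cell above depends on five named sheets, a quick check before loading can give a clearer error than a failed `read_excel` call. The sketch below is an optional addition, not part of the original notebook; `EXPECTED_SHEETS` and `missing_sheets` are hypothetical names.

```python
# Sheet names the notebook reads from 'reactions_long.xlsx'.
EXPECTED_SHEETS = ("likert", "time", "would_use", "personal", "open")


def missing_sheets(available, expected=EXPECTED_SHEETS):
    """Return the expected sheet names that are absent from `available`."""
    present = set(available)
    return [name for name in expected if name not in present]


# Example with a workbook that only contains two of the five sheets.
print(missing_sheets(["likert", "time"]))
```

In the notebook this could be called as `missing_sheets(xl.sheet_names)`, where `xl` is the `pd.ExcelFile` opened above.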

In [3]:
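# Number of rows in the 'would_use' sheet; used later as the denominator
# when reporting the share of open answers per question.
length = len(df_use)

# Colors and bar patterns for each perspective, in line with the schematic figures in the thesis.
color_discrete_map = {'Job Seeker': 'rgb(230,230,230)', 'Employer': 'rgb(217,223,201)', 'Education': 'rgb(193,211,225)'}
pattern_shape_map={"Job Seeker": "", "Employer": "/", "Education": "."}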

## Plotting Time-Based Questions
All time-based questions are plotted in one overview figure, with one facet per question.
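The histograms in this notebook use `histnorm='percent'`. As far as the Plotly documentation describes it, this normalization is applied per trace, so with one trace per perspective and facet, the bars of each perspective should sum to 100% within a question. The sketch below reproduces that calculation with pandas on made-up data; the `toy` frame and its answer labels are hypothetical and not taken from the survey.

```python
import pandas as pd

# Made-up long-format data with the same column names as the 'time' sheet.
toy = pd.DataFrame({
    "Question": ["Q1"] * 5,
    "Perspective": ["Job Seeker", "Job Seeker", "Job Seeker", "Employer", "Employer"],
    "Answer": ["<1h", "<1h", "1-2h", "1-2h", "1-2h"],
})

# Share of each answer within every (Question, Perspective) group, in percent.
shares = (
    toy.groupby(["Question", "Perspective"])["Answer"]
       .value_counts(normalize=True)
       .mul(100)
       .round(1)
)
print(shares)
```

Comparing such a table with the plotted bars is a simple way to confirm that the percentages are relative to each perspective rather than to all respondents.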

In [28]:
margin=go.layout.Margin(l=0, r=0,b=20, t=50 )
fig = px.histogram(df_time,
                    x='Answer',
                    histnorm='percent',
                    barmode='group',
                    color='Perspective',
                    facet_row= 'Question',
                    pattern_shape='Perspective',
                    height = 1200,
                    width = 900,
                    color_discrete_map = color_discrete_map,
                    pattern_shape_map = pattern_shape_map,
                    title='<b>Time</b> Spent on Tasks Skill Scanner Could Accelerate',
                    category_orders={'Perspective': ["Job Seeker", "Employer", "Education"]},)
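# Turn each facet label ("Question=...") into a horizontal heading at the left of its row,
# keeping only the question text before any "(" or " / ".
for annotation in fig['layout']['annotations']:
    annotation['textangle']= 0
    annotation['x']=0
    annotation['y']=annotation['y']-0.0555
    annotation['text']=annotation['text'].split('=')[1].split('(')[0].split(' / ')[0]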
fig.update_layout(margin=margin,
                  plot_bgcolor = 'rgb(240,245,255)',
                  legend_title_text='',
                  bargap = 0.1,
                  xaxis_title="",
                  font=dict(size=14),
                  legend=dict(xanchor="right",x=0.95, yanchor='bottom',y=0.93),
                  #yaxis_title="Relative Frequency [%]",
                  )
fig.update_xaxes(tickprefix='<br>')
fig.update_yaxes(ticksuffix='%')
fig.for_each_yaxis(lambda y: y.update(title = ''))
#fig.add_annotation(x=-0.05,y=0.5,
#                   text="Relative Frequency [%]", textangle=-90,
#                    xref="paper", yref="paper")
fig.show()
fig.write_image('time.pdf')
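The chained `split` calls that shorten the facet labels in the cell above can be hard to read. The following sketch wraps the same idea in a small function and shows it on an invented label; `clean_facet_label` is a hypothetical helper, and unlike the original expression it also strips surrounding whitespace and tolerates labels without an `=`.

```python
def clean_facet_label(text: str) -> str:
    """Keep only the question text of a facet label such as 'Question=...'."""
    # Drop the 'Question=' prefix that Plotly Express adds to facet labels.
    label = text.split("=", 1)[1] if "=" in text else text
    # Cut off a parenthesized remark and anything after ' / '.
    label = label.split("(")[0].split(" / ")[0]
    return label.strip()


# Invented example label, not taken from the survey.
print(clean_facet_label("Question=Time to write a vacancy (in hours) / extra"))
```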

## Likert Scale Question Plots
These plots use a large font so that two figures stay legible when placed next to each other on one page.

In [30]:
color_discrete_map = {'Job Seeker': 'rgb(230,230,230)', 'Employer': 'rgb(217,223,201)', 'Education': 'rgb(193,211,225)'}
count = 0
for subject in df_likert.Subject.unique():
    title=subject
    if count ==0:
        title = "Potential of AI for User Group's <b>Tasks</b>"
        print(title)
    elif count ==1:
        title = "Potential of AI to <b>Find Study Programs</b>"
    count+=1
    margin=go.layout.Margin(l=0, r=0,b=100, t=50 )
    if len(title)>40:
        title =  '<br>'.join(textwrap.wrap(title, 45))
        margin = go.layout.Margin(l=0, r=0,b=100, t=100 )
    df_=df_likert[df_likert['Subject']==subject]
    fig = px.histogram(df_,
                       x='Answer',
                       histnorm='percent',
                       barmode='group',
                       pattern_shape_map = pattern_shape_map,
                       color_discrete_map = color_discrete_map,
                       color=df_['Perspective'],
                       pattern_shape=df_['Perspective'],
                       title=title,
                       nbins=5,
                       range_x=[0.5,5.5])
    
    fig.update_layout(margin=margin,
                     plot_bgcolor = 'rgb(240,245,255)',
                     legend_title_text='',
                     xaxis_title="",
                     yaxis_title="",
                     bargap = 0.1,
                     font=dict(size=22),
                     #marker_size=10,
                     legend=dict(xanchor="left",x=0.01, yanchor='top',y=0.99,)#font=dict(size=20,)),
                     #yaxis_tickformat = '%'
                     )
    if subject == "Interface":
        fig.add_annotation(x=1,
            text="Hate it",
            showarrow=False,
            yref='paper', y=-0.15)
        fig.add_annotation(x=5,
            text="Love it",
            showarrow=False,
            yref='paper', y=-0.15)
    else:
        fig.add_annotation(x=1,
                text="Strongly<br>disagree",
                showarrow=False,
                yref='paper', y=-0.25)
        fig.add_annotation(x=5,
                text="Strongly<br>agree",
                showarrow=False,
                yref='paper', y=-0.25)
    
    #fig.update_traces(marker=dict(colorbar=dict(thickness=10)))
    
    fig.update_yaxes(ticksuffix='%')
    
    filename = subject
    filename=filename.replace('<b>','')
    filename=filename.replace('</b>','')

    filename = ''.join(e for e in filename if e.isalnum())
    filename = filename+".pdf"
    fig.show()
    fig.write_image(filename)
    

Potential of AI for User Group's <b>Tasks</b>
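The title wrapping and filename clean-up in the cell above are repeated in the next cell. One possible way to factor them out is sketched here; `wrap_title` and `safe_filename` are hypothetical helper names, and the default widths mirror the values used in the cell above.

```python
import textwrap


def wrap_title(title: str, width: int = 45, threshold: int = 40) -> str:
    """Insert '<br>' line breaks into titles longer than `threshold` characters."""
    if len(title) > threshold:
        return "<br>".join(textwrap.wrap(title, width))
    return title


def safe_filename(subject: str, suffix: str = ".pdf") -> str:
    """Drop bold tags and every non-alphanumeric character, then add a suffix."""
    cleaned = subject.replace("<b>", "").replace("</b>", "")
    return "".join(ch for ch in cleaned if ch.isalnum()) + suffix


print(safe_filename("Potential of AI for <b>Tasks</b>"))
print(wrap_title("Skill Scanner could help me find suitable study programs faster"))
```

The overview plots could then call `safe_filename(subject, "overview.pdf")` to get their file name.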


In [31]:
#Average result summary to plot below main plot
color_discrete_map = {'Job Seeker': 'rgb(230,230,230)', 'Employer': 'rgb(217,223,201)', 'Education': 'rgb(193,211,225)'}
for subject in df_likert.Subject.unique():
    
    title=subject
    margin=go.layout.Margin(l=0, r=0,b=20, t=50 )
    
    if len(title)>40:
        title =  '<br>'.join(textwrap.wrap(subject, 40))
        margin = go.layout.Margin(l=0, r=0,b=20, t=100 )
  
    df_=df_likert[df_likert['Subject']==subject]
    # Average only the numeric 'Answer' column; text columns cannot be averaged.
    df_=df_.groupby('Perspective', as_index=False)['Answer'].mean()
    # Bold label showing the mean rounded to one decimal.
    df_['answer_str']='<b>'+df_['Answer'].round(1).astype(str)+'</b>'

    fig = px.bar(
        data_frame=df_, 
        y="Perspective",
        x="Answer",
        text = "answer_str",
        color='Perspective',
        pattern_shape='Perspective',
        pattern_shape_map = pattern_shape_map,
        color_discrete_map = color_discrete_map,
        orientation='h',
        height = 200,
        )
    
    fig.update_layout(margin=go.layout.Margin(l=1, r=0,b=0, t=0 ),
                     plot_bgcolor = 'rgb(240,245,255)',
                     xaxis_title="",
                     yaxis_title="",
                     showlegend = False,
                     font=dict(size =40, color='black', family='Arial, sans-serif')            
                     )

    fig.update_xaxes(showticklabels=False, range=[0,5])
    fig.update_yaxes(showticklabels=False)
    
    fig.update_traces(textposition='outside')
    
    fig.add_annotation(x=-0.5,
            font=dict(size=40,),
            text="Average",
            showarrow=False,
            yref='paper', y=0.5)
    
    filename = subject
    filename=filename.replace('<b>','')
    filename=filename.replace('</b>','')

    filename = ''.join(e for e in filename if e.isalnum())
    filename = filename+"overview.pdf"

    fig.show()
    fig.write_image(filename)
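The per-perspective averages shown in the overview bars can be checked on a small example. This is a minimal sketch on invented scores, assuming `Answer` holds the numeric Likert score as in the `likert` sheet; the `toy` frame is hypothetical.

```python
import pandas as pd

# Invented Likert scores (1-5) for one subject.
toy = pd.DataFrame({
    "Subject": ["Interface"] * 4,
    "Perspective": ["Job Seeker", "Job Seeker", "Employer", "Employer"],
    "Answer": [4, 5, 2, 3],
})

# Mean score per perspective; selecting 'Answer' keeps text columns out of the mean.
means = toy.groupby("Perspective", as_index=False)["Answer"].mean()

# Bold text label as used on the bars.
means["answer_str"] = "<b>" + means["Answer"].round(1).astype(str) + "</b>"
print(means)
```

Note that `groupby` sorts the perspectives alphabetically, so the row order differs from the order used in the histograms.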

In [32]:
margin=go.layout.Margin(l=0, r=0,b=20, t=50 )
fig = px.histogram(df_use,
                    x='Answer',
                    histnorm='percent',
                    barmode='group',
                    color='Perspective',
                    pattern_shape_map = pattern_shape_map,
                    pattern_shape = 'Perspective',
                    #facet_row= 'Question',
                    #height = 1200,
                    #width = 900,
                    color_discrete_map = color_discrete_map,
                    title='Would You <b>Use</b> Skill Scanner in Tasks from Your Perspective?',
                    category_orders={'Perspective': ["Job Seeker", "Employer", "Education"],'Answer':["Yes","Maybe","No"]})

fig.update_layout(margin=margin,
                  plot_bgcolor = 'rgb(240,245,255)',
                  legend_title_text='Perspective',
                  xaxis_title="",
                  font=dict(size=14),
                  bargap = 0.1,
                  legend=dict(xanchor="right",x=0.95, yanchor='top',y=0.9),
                  #yaxis_title="Relative Frequency [%]",
                  )
fig.update_xaxes(tickprefix='<br>')
fig.update_yaxes(ticksuffix='%')

fig.for_each_yaxis(lambda y: y.update(title = ''))
#fig.add_annotation(x=-0.05,y=0.5,
#                   text="Relative Frequency [%]", textangle=-90,
#                    xref="paper", yref="paper")
#fig.write_image('use.pdf')
fig.show()

## Personal Questions
Most answers to the personal questions are best reported without figures. For the age and role distributions of the respondents, a figure is worthwhile.

In [8]:
title='Age Distribution of Respondents'
margin=go.layout.Margin(l=0, r=0,b=20, t=50 )
    
fig = px.histogram(df_pers,
                    x='Respondant\'s AGE?',
                    histnorm='percent',
                    barmode='group',
                    #color=df_pers['Perspective'],
                    color_discrete_map = color_discrete_map,
                    #pattern_shape_map = pattern_shape_map,
                    #pattern_shape = 'Perspective',
                    title=title,
                    #nbins=5,
                    #range_x=[0.5,5.5])
                    )
fig.update_layout(margin=margin,
                     plot_bgcolor = 'rgb(240,245,255)',
                     legend_title_text='Perspective',
                     xaxis_title="",
                     yaxis_title="",
                     font=dict(size=22),
                     legend=dict(xanchor="left",x=0.01, yanchor='top',y=0.99),
                     #yaxis_tickformat = '%'
                     )
fig.update_yaxes(ticksuffix='%')
    #filename = ''.join(e for e in subject if e.isalnum())
    #filename = filename+".pdf"
fig.show()
fig.write_image('age.pdf')

In [9]:
title='Role Description Distribution of Participants'
margin=go.layout.Margin(l=0, r=0,b=20, t=50 )
    
fig = px.histogram(df_pers,
                    y='Respondant\'s Role',
                    histnorm='percent',
                    barmode='group',
                    #color=df_pers['Perspective'],
                    color_discrete_map = color_discrete_map,
                    title=title,
                    #nbins=5,
                    #range_x=[0.5,5.5])
                    )
fig.update_layout(margin=margin,
                     plot_bgcolor = 'rgb(240,245,255)',
                     xaxis_title="",
                     yaxis_title="",
                     font=dict(size=20),
                     legend=dict(xanchor="left",x=0.01, yanchor='top',y=0.99),
                     #yaxis_tickformat = '%'
                     )

    #filename = ''.join(e for e in subject if e.isalnum())
    #filename = filename+".pdf"
fig.update_yaxes(categoryorder="total ascending")
fig.update_xaxes(ticksuffix='%')
fig.show()
fig.write_image('role.pdf')

In [10]:
title='Perspective Distribution of Participants'
margin=go.layout.Margin(l=0, r=0,b=20, t=50 )
    
fig = px.histogram(df_pers,
                    y='Respondant\'s Perspective',
                    histnorm='percent',
                    barmode='group',
                    #color=df_pers['Perspective'],
                    color_discrete_map = color_discrete_map,
                    title=title,
                    #nbins=5,
                    #range_x=[0.5,5.5])
                    )
fig.update_layout(margin=margin,
                     plot_bgcolor = 'rgb(240,245,255)',
                     xaxis_title="",
                     yaxis_title="",
                     font=dict(size=20),
                     legend=dict(xanchor="left",x=0.01, yanchor='top',y=0.99),
                     #yaxis_tickformat = '%'
                     )

    #filename = ''.join(e for e in subject if e.isalnum())
    #filename = filename+".pdf"
fig.update_yaxes(categoryorder="total ascending")
fig.update_xaxes(ticksuffix='%')
fig.show()

In [11]:
print(len(df_pers))
title='Country Distribution of Participants'
margin=go.layout.Margin(l=0, r=0,b=20, t=50 )
    
fig = px.histogram(df_pers,
                    y='Respondant\'s Country',
                    histnorm='percent',
                    barmode='group',
                    #color=df_pers['Perspective'],
                    color_discrete_map = color_discrete_map,
                    title=title,
                    #nbins=5,
                    #range_x=[0.5,5.5])
                    )
fig.update_layout(margin=margin,
                     plot_bgcolor = 'rgb(240,245,255)',
                     xaxis_title="",
                     yaxis_title="",
                     font=dict(size=20),
                     legend=dict(xanchor="left",x=0.01, yanchor='top',y=0.99),
                     #yaxis_tickformat = '%'
                     )

    #filename = ''.join(e for e in subject if e.isalnum())
    #filename = filename+".pdf"
fig.update_yaxes(categoryorder="total ascending")
fig.update_xaxes(ticksuffix='%')
fig.show()

108


In [12]:
df_open.head()

| Unnamed: 0 | Timestamp | Role | Age | Country | Gender | Perspective | Question | Answer |
|---|---|---|---|---|---|---|---|---|
| 0 | 2021-11-15 08:14:14.021 | Learning and Development professional | 25-34 | Netherlands | Female | Job Seeker | What's the main reason for your score? | Niet bekend met skill scanner |
| 1 | 2021-11-15 08:14:14.021 | Learning and Development professional | 25-34 | Netherlands | Female | Job Seeker | What would you CHANGE in Skill Scanner's inter... | nvt |
| 2 | 2021-11-15 08:14:14.021 | Learning and Development professional | 25-34 | Netherlands | Female | Job Seeker | What would you ADD to Skill Scanner's interface? | nvt |
| 3 | 2021-11-15 10:03:58.557 | Learning and Development professional | 25-34 | Netherlands | Female | Employer | Do you have any REMARKS regarding using Skill ... | Hoe kan ik een programma beoordelen op werking... |
| 4 | 2021-11-15 10:03:58.557 | Learning and Development professional | 25-34 | Netherlands | Female | Employer | What's the main reason for your score? | simple and clear |


In [13]:
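# Print every open answer per question, followed by the number of answers
# divided by `length` (the row count of df_use, set in an earlier cell).
for question in df_open.Question.unique():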
    count=0
    print('-----------------')
    
    print(question)
    df_=df_open[df_open['Question']==question]
    for index, row in df_.iterrows():
        print(row['Answer'])
        count+=1
    print(count/length)

-----------------
What's the main reason for your score?
Niet bekend met skill scanner
simple and clear
Looks relatively conventional.
Font size is not consistent, Download report button is not in the expected location and seems somewhat done a bit ad hoc. It needs polish.
Haven't used it
Drop is ok. Job should be muuuuuuch more detailed (to a particular ad?). Result (and warnign about time delivery) should just be an empty zone after the upload has happened.
ik heb hier nog te weinig inzicht in.
Niet aantrekkelijk vormgegeven (lettertype, kleurgebruik..)
Opzich gebruiksvriendelijk, maar de look en feel is niet heel bijzonder. 
ziet er gelikt uit, info direct beschikbaar en digitaal dus goed voor milieu
niet duidelijk genoeg
heel basic, weinig sexy daarentegen wel gebruiksvriendelijk.
Goed overzicht
Nederlandse vertaling graag erbij. En misschien wat fellere kleuren.
Overzichtelijk, sober maar doeltreffend
Het digitaliseren van zoekfuncties kan veel versnellen en vereenvoudigen 
Ziet e