### Brief summary

**Significant**
- Confidence in vehicle's capability (control, capability, decision-making)
- Preference for a digital assistant in the future (IMPORTANT: *Nobody in the Lily group wanted to keep the same assistant!*)

**Not significant**
- User Experience
- Characteristics of AV ride experience
- Trustworthy, Friendly, In Control
- Usefulness of provided information

In [96]:
import pandas as pd
import numpy as np
import math
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS
from scipy.stats import mannwhitneyu
%matplotlib inline

Load the end-ride data. Respondents who did not believe that the vehicle was autonomous were manually removed from the spreadsheet, as in the pre-ride and post-ride analyses.

In [65]:
df = pd.read_csv("endride_data_2.csv")

22 responses, same as the preride data

In [66]:
df.shape

(22, 53)

Remove timestamp and name

In [67]:
df = df[df.columns[2:]]

In [68]:
df.shape

(22, 51)

In [69]:
group_by_system = df.groupby(['System'])
group_by_system.size()

System
Julie    11
Lily     11
dtype: int64

In [70]:
pd.DataFrame(list(df.columns.values))

| | 0 |
|---|---|
| 0 | System |
| 1 | On a scale from 1-5, how would you rate your o... |
| 2 | On a scale from 1-5, how would you rate your o... |
| 3 | On a scale from 1-5, how would you rate your o... |
| 4 | On a scale from 1-5, how would you rate your o... |
| 5 | On a scale from 1-5, how would you rate your o... |
| 6 | On a scale from 1-5, how would you rate your c... |
| 7 | On a scale from 1-5, how would you rate your c... |
| 8 | On a scale from 1-5, how would you rate your c... |
| 9 | Please rate each aspect of your autonomous veh... |


### User experience - not significant

In [71]:
user_exp = df[df.columns[:6]]

In [72]:
user_exp.groupby(["System"]).mean()

Columns are the bracketed items of "On a scale from 1-5, how would you rate your overall user experience in terms of:".

| System | Comfort & relaxation during the ride | Perceived personal safety | Perceived safety of others | Degree of vehicle trustworthiness | Willingness to ride again |
|---|---|---|---|---|---|
| Julie | 4.090909 | 4.181818 | 4.181818 | 3.818182 | 4.545455 |
| Lily | 3.818182 | 3.909091 | 3.636364 | 3.454545 | 4.545455 |


In [73]:
user_exp.groupby(["System"]).var()

Columns are the bracketed items of "On a scale from 1-5, how would you rate your overall user experience in terms of:".

| System | Comfort & relaxation during the ride | Perceived personal safety | Perceived safety of others | Degree of vehicle trustworthiness | Willingness to ride again |
|---|---|---|---|---|---|
| Julie | 0.690909 | 0.563636 | 0.763636 | 0.763636 | 0.472727 |
| Lily | 0.963636 | 0.490909 | 0.854545 | 0.672727 | 0.472727 |


Julie scored higher on average on every item except "Willingness to ride again", where both groups had the same mean score (4.55).

In [86]:
user_exp.columns[1:6]

Index(['On a scale from 1-5, how would you rate your overall user experience in terms of: [Comfort & relaxation during the ride]',
       'On a scale from 1-5, how would you rate your overall user experience in terms of: [Perceived personal safety]',
       'On a scale from 1-5, how would you rate your overall user experience in terms of: [Perceived safety of others]',
       'On a scale from 1-5, how would you rate your overall user experience in terms of: [Degree of vehicle trustworthiness]',
       'On a scale from 1-5, how would you rate your overall user experience in terms of: [Willingness to ride again]'],
      dtype='object')

In [87]:
df_user_exp = df.assign(UX = user_exp[user_exp.columns[1:6]].sum(axis=1))

In [88]:
print(list(df_user_exp)[-1])  # the newly added UX column is the last one
analysis_column = list(df_user_exp)[-1]
x = df_user_exp.loc[df_user_exp['System']=='Julie'][analysis_column]
y = df_user_exp.loc[df_user_exp['System']=='Lily'][analysis_column]

UX


In [89]:
pd.pivot_table(df_user_exp, values=(analysis_column), index=['System'], aggfunc=np.mean)

| System | UX |
|---|---|
| Julie | 20.818182 |
| Lily | 19.363636 |


In [77]:
mannwhitneyu(x,y, alternative = 'two-sided')

MannwhitneyuResult(statistic=74.5, pvalue=0.37237387607788974)
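
A p-value alone does not say how large the difference is. As a rough sketch, the U statistic reported above can be turned into a rank-biserial correlation. The helper `rank_biserial` is a hypothetical name, not part of SciPy, and the sign of the result assumes that the reported statistic is the U of the first sample (Julie), as in current SciPy releases.

```python
def rank_biserial(u_stat, n_x, n_y):
    """Rank-biserial correlation from a Mann-Whitney U statistic.

    Assumes u_stat is the U of the first sample (x). The value ranges
    from -1 (every x below every y) to +1 (every x above every y).
    """
    return 2.0 * u_stat / (n_x * n_y) - 1.0


# U = 74.5 is taken from the output above; each group has 11 respondents.
r_ux = rank_biserial(74.5, 11, 11)
print(round(r_ux, 3))  # 0.231
```

The same function can be applied to any of the other Mann-Whitney results in this notebook by passing the corresponding statistic and group sizes.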

### Confidence in vehicle's capability (control, capability, decision-making) - statistically significant

In [90]:
df.columns[6:9]

Index(['On a scale from 1-5, how would you rate your confidence in the vehicle's capability, in terms of: [AV control ]',
       'On a scale from 1-5, how would you rate your confidence in the vehicle's capability, in terms of: [Driving capability]',
       'On a scale from 1-5, how would you rate your confidence in the vehicle's capability, in terms of: [Decision-making ability]'],
      dtype='object')

In [91]:
capability = df[df.columns[6:9]]

In [92]:
df_capability = df.assign(Capability = capability[capability.columns[:3]].sum(axis=1))

In [93]:
print(list(df_capability)[-1])
analysis_column = list(df_capability)[-1]
x = df_capability.loc[df_capability['System']=='Julie'][analysis_column]
y = df_capability.loc[df_capability['System']=='Lily'][analysis_column]

Capability


In [None]:
mannwhitneyu(x,y, alternative = 'two-sided')

In [95]:
analysis_column = list(df_capability)[-1]
pd.pivot_table(df_capability, values=(analysis_column), index=['System'], aggfunc=np.mean)

| System | Capability |
|---|---|
| Julie | 12.818182 |
| Lily | 11.0 |
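
The sections in this notebook repeat the same three steps: sum a block of Likert items per respondent, split the totals by `System`, and run a two-sided Mann-Whitney U test. A minimal sketch of a helper that bundles those steps is shown below. The name `compare_systems` and the toy data are assumptions made for illustration; the toy values are invented and are not survey responses.

```python
import pandas as pd
from scipy.stats import mannwhitneyu


def compare_systems(data, item_columns, group_column="System",
                    groups=("Julie", "Lily")):
    """Sum the given item columns per respondent and compare two groups.

    Returns the two group means of the summed score and the
    two-sided Mann-Whitney U test result.
    """
    total = data[list(item_columns)].sum(axis=1)
    x = total[data[group_column] == groups[0]]
    y = total[data[group_column] == groups[1]]
    result = mannwhitneyu(x, y, alternative="two-sided")
    return x.mean(), y.mean(), result


# Toy data for illustration only (not the survey responses).
toy = pd.DataFrame({
    "System": ["Julie"] * 4 + ["Lily"] * 4,
    "q1": [5, 4, 5, 4, 3, 2, 3, 2],
    "q2": [4, 5, 5, 4, 2, 3, 3, 1],
})
mean_julie, mean_lily, res = compare_systems(toy, ["q1", "q2"])
print(mean_julie, mean_lily, res.pvalue)
```

On the real data, a call such as `compare_systems(df, df.columns[6:9])` should reproduce the capability comparison above, provided the column positions are unchanged.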


### Characteristics of AV ride experience - not significant

In [None]:
df.columns[9:12]

In [None]:
ride_exp = df[df.columns[9:12]]

In [None]:
df_ride_exp = df.assign(Ride_Exp = ride_exp[ride_exp.columns[:3]].sum(axis=1))

In [None]:
print(list(df_ride_exp)[-1])
analysis_column = list(df_ride_exp)[-1]
x = df_ride_exp.loc[df_ride_exp['System']=='Julie'][analysis_column]
y = df_ride_exp.loc[df_ride_exp['System']=='Lily'][analysis_column]

In [None]:
mannwhitneyu(x,y, alternative = 'two-sided')

### Trustworthy, Friendly, In Control - not significant

In [None]:
df.columns[12:15]

In [None]:
trust = df[df.columns[12:15]]

In [None]:
df_trust = df.assign(Trust = df[df.columns[12:15]].sum(axis=1))

In [None]:
print(list(df_trust)[-1])
analysis_column = list(df_trust)[-1]
x = df_trust.loc[df_trust['System']=='Julie'][analysis_column]
y = df_trust.loc[df_trust['System']=='Lily'][analysis_column]

In [None]:
mannwhitneyu(x,y, alternative = 'two-sided')

### Usefulness of provided information - not significant

In [None]:
df.columns[17]

In [None]:
print(list(df)[17])
x = df.loc[df['System']=='Julie'][list(df)[17]]
y = df.loc[df['System']=='Lily'][list(df)[17]]

In [None]:
mannwhitneyu(x,y, alternative = 'two-sided')

### Preference for a digital assistant - significant

In [99]:
df.columns[34]

'For a future ride, I would prefer to:'

In [102]:
df[df.columns[34]].replace({'Try another assistant':-1,'No preference':0,'Keep the same assistant': 1})

0     1
1     0
2    -1
3     1
4    -1
5    -1
6    -1
7     0
8    -1
9     0
10   -1
11   -1
12    0
13   -1
14   -1
15   -1
16   -1
17    1
18    1
19   -1
20    1
21    0
Name: For a future ride, I would prefer to:, dtype: int64

In [None]:
pref_replaced = df[df.columns[34]].replace({'Try another assistant':-1,'No preference':0,'Keep the same assistant': 1})

In [104]:
print(list(df)[34])
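# Assign the result back: inplace=True on a selected column may act on a copy and leave df unchanged
df[df.columns[34]] = df[df.columns[34]].replace({'Try another assistant':-1,'No preference':0,'Keep the same assistant': 1})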
x = df.loc[df['System']=='Julie'][list(df)[34]]
y = df.loc[df['System']=='Lily'][list(df)[34]]

For a future ride, I would prefer to:


In [105]:
df[df.columns[34]]

0     1
1     0
2    -1
3     1
4    -1
5    -1
6    -1
7     0
8    -1
9     0
10   -1
11   -1
12    0
13   -1
14   -1
15   -1
16   -1
17    1
18    1
19   -1
20    1
21    0
Name: For a future ride, I would prefer to:, dtype: int64

In [106]:
mannwhitneyu(x,y, alternative = 'two-sided')

MannwhitneyuResult(statistic=90.0, pvalue=0.03501555442686633)

In [117]:
df.groupby(['System'])[df.columns[34]].value_counts()

System  For a future ride, I would prefer to:
Julie    1                                       5
        -1                                       4
         0                                       2
Lily    -1                                       8
         0                                       3
Name: For a future ride, I would prefer to:, dtype: int64

*Nobody in the Lily group wanted to keep the same assistant!*
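
The counts above are easier to read as a two-way table with the original answer labels. The sketch below rebuilds that table from the printed counts (coded -1, 0, 1) and adds row shares; the variable names `counts` and `shares` are illustrative. On the real data, `pd.crosstab(df['System'], df[df.columns[34]])` should give the same counts.

```python
import pandas as pd

# Counts copied from the value_counts() output above,
# with the numeric codes mapped back to their answer labels.
counts = pd.DataFrame(
    {
        "Try another assistant": [4, 8],
        "No preference": [2, 3],
        "Keep the same assistant": [5, 0],
    },
    index=pd.Index(["Julie", "Lily"], name="System"),
)

# Share of each answer within a group (each row sums to 1 before rounding).
shares = counts.div(counts.sum(axis=1), axis=0).round(2)

print(counts)
print(shares)
```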

### Adjectives describing the AV

Here we extract the adjectives users specified to describe the AV.

**Please check all adjectives that would best describe the self-driving vehicle (select all that apply)**: Julie

In [None]:
all_adj_resp = df[["System", 'Please check all adjectives that would best describe the self-driving vehicle (select all that apply).']]

In [None]:
julie_words1 = all_adj_resp[all_adj_resp.System.str.contains('Julie')]

**Please check all adjectives that would best describe the self-driving vehicle (select all that apply)**: Lily

In [None]:
lily_words1 = all_adj_resp[all_adj_resp.System.str.contains('Lily')]

In [None]:
len(lily_words1)

### Adjectives describing the assistant

Here we extract the adjectives users specified to describe their assigned assistant.

**Please specify any other adjectives that would best describe the assistant not included in the list above.**: Julie

In [None]:
all_more_adj_resp = df[["System", 'Please specify any other adjectives that would best describe the assistant not included in the list above.']]

In [None]:
julie_words2 = all_more_adj_resp[all_more_adj_resp.System.str.contains('Julie')]

In [None]:
len(julie_words2)

In [None]:
lily_words2 = all_more_adj_resp[all_more_adj_resp.System.str.contains('Lily')]

In [None]:
len(lily_words2)

In [None]:
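def getWords(allWords):
    """Return all words from the second column as one lowercased, space-joined string."""
    print(allWords)
    words = []
    for index, row in allWords.iterrows():
        response = row.iloc[1]  # positional access to the adjectives column
        if isinstance(response, str):  # skips NaN (unanswered) rows
            # Comma-separated answers are split on commas, free text on spaces
            separator = ',' if ',' in response else ' '
            words.extend(response.split(separator))
    words = [word.strip().lower() for word in words]
    return ' '.join(words)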
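
The word cloud scales words by frequency, so it can help to look at the raw counts as well. The sketch below applies the same splitting rule as `getWords` (commas if present, otherwise spaces) and counts the adjectives with `collections.Counter`. The function name `count_adjectives` and the example responses are made up for illustration.

```python
from collections import Counter


def count_adjectives(responses):
    """Count lowercased adjectives across free-text responses."""
    counts = Counter()
    for response in responses:
        if not isinstance(response, str):
            continue  # skip NaN / unanswered entries
        separator = "," if "," in response else " "
        for word in response.split(separator):
            word = word.strip().lower()
            if word:  # ignore empty strings left by repeated separators
                counts[word] += 1
    return counts


# Invented responses, not taken from the survey.
example = ["Calm, Safe", "safe predictable", float("nan"), "Calm,  Smooth"]
freq = count_adjectives(example)
print(freq.most_common(2))
```

With the notebook's data, the second column of `julie_words1` or `lily_words1` could be passed in, for example `count_adjectives(julie_words1.iloc[:, 1])`.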

In [None]:
def random_color_func(word=None, font_size=None, position=None, orientation=None, font_path=None, random_state=None):
    h = 30
    s = int(100.0 * 255.0 / 255.0)
    l = int(100.0 * float(random_state.randint(60, 120)) / 255.0)
    return "hsl({}, {}%, {}%)".format(h, s, l)

def makeWordcloud(allWords, title):
    words = getWords(allWords)
    wordcloud = WordCloud(font_path="AmaticSC-Bold.ttf",collocations=False,
                          max_words=25,
                          background_color = 'white',
                          width = 960,
                          height = 960,
                          max_font_size=300,
                          random_state = 42,
                          color_func = random_color_func).generate(words)
    print(wordcloud.words_)
    print("Number of words =", len(wordcloud.words_))
    print(list(wordcloud.words_.keys()))
    plt.figure( figsize=(20,10))
    plt.imshow(wordcloud)
    plt.title(label=title, fontdict={'fontsize':20})
    plt.axis('off')

In [None]:
makeWordcloud(julie_words1, "Julie")

In [None]:
makeWordcloud(lily_words1, "Lily")

In [None]:
makeWordcloud(julie_words2, "Julie")

In [None]:
makeWordcloud(lily_words2, "Lily")

# Old analysis, kept only for reference

In [None]:
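user_exp = user_exp.copy()  # work on a copy rather than a slice of df
user_exp['Total'] = user_exp.sum(axis=1, numeric_only=True)  # skip the text 'System' column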

In [None]:
user_exp_group_by_system = user_exp.groupby(['System'])
user_exp_group_by_system.mean()

In [None]:
user_exp_group_by_system.var()

In [None]:
columns = ['System','Comfort','Safety','Others Safety', 'Trustworthiness', 'Riding again','Total']
user_exp.columns = columns
temp = user_exp.groupby(['System']).mean()
temp = temp[temp.columns[:-1]]

fig, ax = plt.subplots()
ind = np.arange(5)
width = 0.25
julie = temp.iloc[0]
lily = temp.iloc[1]
julie_bars = ax.bar(ind, julie, width, color='r')
lily_bars = ax.bar(ind + width, lily, width, color='g')
ax.set_xticks(ind + width/2.0)
ax.set_xticklabels(('Comfort','Safety','Others Safety', 'Trustworthiness', 'Riding again'))
plt.xticks(rotation=45)
plt.ylim(0,5)
plt.yticks(np.arange(0, 5.5, .5))  # include the top of the 1-5 scale
plt.ylabel('Mean response')
plt.legend(['Julie','Lily'],loc=4)
plt.show()

### Confidence in vehicle capability

In [None]:
vehicle_cap = df[df.columns[:12]]
vehicle_cap = vehicle_cap.drop(vehicle_cap.columns[1:6],axis=1)
vehicle_cap_group_by_system = vehicle_cap.groupby(['System'])
vehicle_cap_group_by_system.sum()

In [None]:
vehicle_cap['Total'] = vehicle_cap.sum(axis=1, numeric_only=True)  # skip the text 'System' column

In [None]:
vehicle_cap_group_by_system = vehicle_cap.groupby(['System'])
vehicle_cap_group_by_system.sum()

In [None]:
vehicle_cap_group_by_system.mean()

### Perceived AV anthropomorphic characteristics

In [None]:
chrs = df[df.columns[:15]]
chrs = chrs.drop(chrs.columns[1:-3],axis=1)

In [None]:
chrs['Total'] = chrs.sum(axis=1, numeric_only=True)  # skip the text 'System' column

In [None]:
chrs_group_by_system = chrs.groupby(['System'])
chrs_group_by_system.sum()

In [None]:
chrs_group_by_system.mean()

### Value/usefulness of information provided

In [None]:
infovalue2 = df2[df2.columns[:-21]]
infovalue2 = infovalue2.drop(infovalue2.columns[1:-11],axis=1)

In [None]:
infovalue2 = infovalue2.replace({'N/A':0})
infovalue2['Total'] = infovalue2.sum(axis=1)

In [None]:
infovalue_group_by_system2 = infovalue2.groupby(['System'])
infovalue_group_by_system2.sum()

In [None]:
infovalue_group_by_system2.mean()

### Distraction from the awareness of being in an AV

In [None]:
distraction2 = df2[df2.columns[:-2]]
distraction2 = distraction2.drop(distraction2.columns[1:-5],axis=1)

In [None]:
distraction2 = distraction2.replace({'Yes':1,'No':0})
distraction2['Total'] = distraction2.sum(axis=1)

In [None]:
distraction_group_by_system2 = distraction2.groupby(['System'])
distraction_group_by_system2.sum()

In [None]:
distraction_group_by_system2.mean()

In [None]:
# Plot comparing sums for Julie vs Lily for each question category
fig, ax = plt.subplots()
julie = (21.2, 24.5, 12.2, 43.8, 2.5)
lily = (19.4, 21.8, 10.7, 39.2, 2.2)
width = 0.25
ind = np.arange(5)
julie_bars = ax.bar(ind, julie, width, color='r')
lily_bars = ax.bar(ind + width, lily, width,color='g')
ax.set_title('End Ride Survey Data')
ax.set_xticks(ind + width/2.0)
ax.set_xticklabels(('UX','Vehicle Capabilities', 'Characteristics','Usefulness of Info', 'Distractions'))
plt.xticks(rotation=60)
plt.ylabel('sum of responses for question group')
plt.xlabel('question grouping')
plt.legend(['Julie','Lily'],loc=2)
plt.show()