# Pre-Ride Analysis

### -------------------------------------

# Findings Summary
The pre-ride survey responses tested here show no statistically significant difference between the Julie and Lily groups.

Neither the combined score capturing users' views of ride-sharing apps like Lyft and Uber nor the positive/negative score of the adjectives used to describe a self-driving vehicle differed significantly between groups. This gives no evidence of a pre-existing bias between the user groups for the two digital assistants.

### -------------------------------------

## EDA

In [7]:
import numpy as np
import pandas as pd

In [18]:
data = pd.read_csv('preridecsv.csv')
data.shape

(27, 40)

In [19]:
data.head()

Unnamed: 0,Timestamp,System,Use?,"Do you get motion sickness when looking at visual media (e.g., emails, videos, etc.) in cars/vehicles?","If you need to go somewhere by car, which would you prefer?","Do you use a GPS/geography-based app? If so, which do you use?",Please choose the response that best represents your view on taking a Lyft or Uber. [I am usually comfortable being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually feel safe being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust the driver.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust other drivers on the road.],...,Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Mindless],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Nervous],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Not in control],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Relaxed],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Uncertain],Name,Age,Gender,Occupation,Highest level of education you have completed or currently pursuing
0,7/17/18,Julie,YES,No,Drive myself,Google Maps,Agree,Agree,Agree,Neutral,...,Disagree,Disagree,Neutral,Agree,Disagree,babak mortezai,54,Male,manager,Bachelor's degree
1,7/19/18,Lily,Maybe; Semi-autonomous,Sometimes,Drive myself,Google Maps,Agree,Agree,Neutral,Agree,...,Disagree,Disagree,Strongly disagree,Strongly agree,Disagree,Christian Angerer,29,Male,PhD-Student,Master's degree
2,7/23/18,Julie,"Maybe--Wondered if she was being ""punked""; Sem...",Sometimes,Drive myself,Google Maps,Strongly agree,Agree,Agree,Neutral,...,Strongly disagree,Agree,,Neutral,,Kristin Muench,28,Female,Graduate Student,Doctoral degree
3,7/27/18,Lily,YES,No,Ask friend or family member to drive,Google Maps,Strongly agree,Agree,Agree,Disagree,...,Disagree,Agree,Strongly agree,Neutral,Neutral,daniel landes,54,Male,writer,Bachelor's degree
4,7/30/18,Julie,NO; knew this was not an AV and filled out the...,Yes,Drive myself,Google Maps,Neutral,Neutral,Neutral,Neutral,...,Disagree,Disagree,Strongly disagree,Neutral,Disagree,Aaron Kau,40,Male,IT,Master's degree


Columns 11, 13, and 25-34 contain some missing values (NaN).

### Remove participants who were aware of the point of the study

In [20]:
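# Keep only rows whose "Use?" flag starts with "Y"; na=False drops rows with a missing flag.
data = data.loc[data["Use?"].str.startswith("Y", na=False)].copy()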
data.shape

(22, 40)
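
As a minimal sketch of how this filter behaves, the example below applies the same `startswith("Y")` test to a small made-up frame. The rows are hypothetical and only mimic the style of the `Use?` column; `na=False` is an assumption about how a missing flag should be treated (as "exclude").

```python
import pandas as pd

# Hypothetical stand-in for the survey; values only mimic the "Use?" column.
toy = pd.DataFrame({
    "System": ["Julie", "Lily", "Julie", "Lily"],
    "Use?": ["YES", "Maybe; Semi-autonomous", "NO; knew this was not an AV", None],
})

# na=False turns a missing flag into False instead of leaving NaN in the mask.
keep = toy["Use?"].str.startswith("Y", na=False)

# .copy() makes the filtered frame independent of the original.
kept = toy.loc[keep].copy()
print(kept["System"].value_counts())
```

Only the row flagged `YES` survives; the "Maybe", "NO", and missing rows are all excluded.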

In [21]:
data["System"].value_counts()

Lily     11
Julie    11
Name: System, dtype: int64

In [37]:
df = data
df.head()

Unnamed: 0,Timestamp,System,Use?,"Do you get motion sickness when looking at visual media (e.g., emails, videos, etc.) in cars/vehicles?","If you need to go somewhere by car, which would you prefer?","Do you use a GPS/geography-based app? If so, which do you use?",Please choose the response that best represents your view on taking a Lyft or Uber. [I am usually comfortable being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually feel safe being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust the driver.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust other drivers on the road.],...,Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Mindless],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Nervous],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Not in control],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Relaxed],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Uncertain],Name,Age,Gender,Occupation,Highest level of education you have completed or currently pursuing
0,7/17/18,Julie,YES,No,Drive myself,Google Maps,Agree,Agree,Agree,Neutral,...,Disagree,Disagree,Neutral,Agree,Disagree,babak mortezai,54,Male,manager,Bachelor's degree
3,7/27/18,Lily,YES,No,Ask friend or family member to drive,Google Maps,Strongly agree,Agree,Agree,Disagree,...,Disagree,Agree,Strongly agree,Neutral,Neutral,daniel landes,54,Male,writer,Bachelor's degree
5,7/31/18,Julie,YES,No,Drive myself,Google Maps,Agree,Agree,Agree,Disagree,...,Agree,Disagree,Agree,Agree,Disagree,Kevin Goncalves,26,Male,Graduate Student,Doctoral degree
6,8/3/18,Julie,YES,Sometimes,Drive myself,Apple Maps,Strongly agree,Agree,Agree,Agree,...,Agree,Neutral,Agree,Neutral,Agree,Cortney Miller,29,Female,PhD student,Doctoral degree
7,8/3/18,Lily,YES,No,Drive myself,Google Maps,Agree,Agree,Agree,Neutral,...,Disagree,Agree,Neutral,Disagree,Agree,Michelle Blum Atkinson,33,Female,Software,Bachelor's degree


## Summary Tables

In [38]:
data.groupby("System")["Do you get motion sickness when looking at visual media (e.g., emails, videos, etc.) in cars/vehicles?"].value_counts(sort = False)

System  Do you get motion sickness when looking at visual media (e.g., emails, videos, etc.) in cars/vehicles?
Julie   No                                                                                                        7
        Sometimes                                                                                                 3
        rarely                                                                                                    1
Lily    No                                                                                                        4
        Sometimes                                                                                                 6
        Very rarely                                                                                               1
Name: Do you get motion sickness when looking at visual media (e.g., emails, videos, etc.) in cars/vehicles?, dtype: int64

#### Ride Sharing App Viewpoints

In [34]:
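# Keep the first ten columns, then drop the timestamp and the four non-Likert
# columns so that only System and the four Lyft/Uber items remain.
lyft_uber_views = df[df.columns[:10]]
lyft_uber_views = lyft_uber_views.drop(lyft_uber_views.columns[0], axis = 1)
lyft_uber_views = lyft_uber_views.drop(lyft_uber_views.columns[1:5],axis = 1)
# NOTE: 'Disagree' is coded as 1 here (the same as 'Agree'), which looks like a typo for -1.
# The outputs below were produced with this mapping.
lyft_uber_views = lyft_uber_views.replace({'Strongly disagree':-2,'Disagree':1,'Neutral':0,'Agree':1,'Strongly agree':2})
lyft_uber_views_group_by_system = lyft_uber_views.groupby("System")
lyft_uber_views_group_by_system.mean()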

Unnamed: 0_level_0,Please choose the response that best represents your view on taking a Lyft or Uber. [I am usually comfortable being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually feel safe being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust the driver.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust other drivers on the road.]
System,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Julie,1.363636,1.181818,1.181818,0.636364
Lily,1.181818,1.090909,0.818182,0.636364
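
The mapping in the cell above codes `'Disagree'` as 1, so agreement and disagreement are indistinguishable in the means and in any summed score. The sketch below shows what a symmetric coding might look like. The `LIKERT` dictionary, the shortened column names, and the toy responses are all assumptions for illustration, not the notebook's data.

```python
import pandas as pd

# Symmetric five-point coding (an assumption about the intended scale).
LIKERT = {
    "Strongly disagree": -2,
    "Disagree": -1,
    "Neutral": 0,
    "Agree": 1,
    "Strongly agree": 2,
}

# Hypothetical responses with shortened, made-up column names.
toy = pd.DataFrame({
    "System": ["Julie", "Lily", "Julie"],
    "comfortable": ["Strongly agree", "Agree", "Agree"],
    "trust_others": ["Disagree", "Neutral", "Strongly disagree"],
})

items = ["comfortable", "trust_others"]

# Series.map leaves unrecognised labels as NaN, so unexpected answers surface
# instead of silently passing through as strings.
coded = toy[items].apply(lambda col: col.map(LIKERT))

# Row-wise sum of the coded items gives one composite score per respondent.
toy["score"] = coded.sum(axis=1)
group_means = toy.groupby("System")["score"].mean()
print(group_means)
```

Re-running the notebook with a coding like this would change the group means and the combined trust score, so the downstream test results would need to be regenerated.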


In [35]:
lyft_uber_views_group_by_system.min()

Unnamed: 0_level_0,Please choose the response that best represents your view on taking a Lyft or Uber. [I am usually comfortable being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually feel safe being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust the driver.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust other drivers on the road.]
System,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Julie,1,0,0,0
Lily,0,0,0,0


In [36]:
lyft_uber_views_group_by_system.max()

Unnamed: 0_level_0,Please choose the response that best represents your view on taking a Lyft or Uber. [I am usually comfortable being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually feel safe being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust the driver.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust other drivers on the road.]
System,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Julie,2,2,2,1
Lily,2,2,2,1


## A combined Lyft/Uber Trust score

In [65]:
df[df.columns[5:9]]

Unnamed: 0,Please choose the response that best represents your view on taking a Lyft or Uber. [I am usually comfortable being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually feel safe being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust the driver.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust other drivers on the road.]
0,Agree,Agree,Agree,Neutral
1,Agree,Agree,Neutral,Agree
2,Strongly agree,Agree,Agree,Neutral
3,Strongly agree,Agree,Agree,Disagree
5,Agree,Agree,Agree,Disagree
6,Strongly agree,Agree,Agree,Agree
7,Agree,Agree,Agree,Neutral
9,Agree,Neutral,Agree,Agree
10,Agree,Agree,Agree,Agree
11,Agree,Agree,Neutral,Neutral


In [39]:
# Sum the four coded Lyft/Uber items per respondent into a single trust score.
lu_trust = lyft_uber_views[lyft_uber_views.columns[1:5]].sum(axis = 1)
df = df.assign(LU_Trust = lu_trust)

In [40]:
df.head()

Unnamed: 0,Timestamp,System,Use?,"Do you get motion sickness when looking at visual media (e.g., emails, videos, etc.) in cars/vehicles?","If you need to go somewhere by car, which would you prefer?","Do you use a GPS/geography-based app? If so, which do you use?",Please choose the response that best represents your view on taking a Lyft or Uber. [I am usually comfortable being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually feel safe being driven.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust the driver.],Please choose the response that best represents your view on taking a Lyft or Uber. [I usually trust other drivers on the road.],...,Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Nervous],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Not in control],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Relaxed],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Uncertain],Name,Age,Gender,Occupation,Highest level of education you have completed or currently pursuing,LU_Trust
0,7/17/18,Julie,YES,No,Drive myself,Google Maps,Agree,Agree,Agree,Neutral,...,Disagree,Neutral,Agree,Disagree,babak mortezai,54,Male,manager,Bachelor's degree,3
3,7/27/18,Lily,YES,No,Ask friend or family member to drive,Google Maps,Strongly agree,Agree,Agree,Disagree,...,Agree,Strongly agree,Neutral,Neutral,daniel landes,54,Male,writer,Bachelor's degree,5
5,7/31/18,Julie,YES,No,Drive myself,Google Maps,Agree,Agree,Agree,Disagree,...,Disagree,Agree,Agree,Disagree,Kevin Goncalves,26,Male,Graduate Student,Doctoral degree,4
6,8/3/18,Julie,YES,Sometimes,Drive myself,Apple Maps,Strongly agree,Agree,Agree,Agree,...,Neutral,Agree,Neutral,Agree,Cortney Miller,29,Female,PhD student,Doctoral degree,5
7,8/3/18,Lily,YES,No,Drive myself,Google Maps,Agree,Agree,Agree,Neutral,...,Agree,Neutral,Disagree,Agree,Michelle Blum Atkinson,33,Female,Software,Bachelor's degree,3


## Text Analysis from Kali

In [41]:
# Load the positive and negative word lists, skipping the first 35 lines of each file.
with open('poswords.txt') as f:
    posArr = [word.rstrip() for word in f.readlines()[35:]]
with open('negwords.txt') as f:
    negArr = [word.rstrip() for word in f.readlines()[35:]]
print("num of positive words %d" % len(posArr))
print("num of negative words %d" % len(negArr))

num of positive words 2006
num of negative words 4783


In [159]:
## sentiment analysis stuff
temp = df[df.columns[:-6]]
temp = temp.drop(temp.columns[0], axis = 1)
subj_words = temp.drop(temp.columns[1:-21],axis=1)
adjs = subj_words[subj_words.columns[:2]]
# remove NaN
#print(adjs)
#subj_words.columns

In [160]:
subj_words.head()

Unnamed: 0,System,Provide 5 words (adjectives) to characterize/describe a self-driving vehicle.,Please choose the response that best represents how you perceive the self-driving vehicle. (The following words are in alphabetical order.) [Capable],Please choose the response that best represents how you perceive the self-driving vehicle. (The following words are in alphabetical order.) [Emotional],Please choose the response that best represents how you perceive the self-driving vehicle. (The following words are in alphabetical order.) [Energetic],Please choose the response that best represents how you perceive the self-driving vehicle. (The following words are in alphabetical order.) [Indifferent],Please choose the response that best represents how you perceive the self-driving vehicle. (The following words are in alphabetical order.) [Methodical],Please choose the response that best represents how you perceive the self-driving vehicle. (The following words are in alphabetical order.) [Predictable],Please choose the response that best represents how you perceive the self-driving vehicle. (The following words are in alphabetical order.) [Reliable],Please choose the response that best represents how you perceive the self-driving vehicle. (The following words are in alphabetical order.) [Safe],...,Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Bored],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Calm],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Defensive],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Excited],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Lethargic],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Mindless],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Nervous],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Not in control],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Relaxed],Please choose the response that best represents how you feel as a passenger in a self-driving vehicle. (The following words are in alphabetical order.) [Uncertain]
0,Julie,fun smart safe good enjoy,Agree,Disagree,Agree,Disagree,Strongly agree,Agree,Agree,Agree,...,Disagree,Agree,Disagree,Strongly agree,Disagree,Disagree,Disagree,Neutral,Agree,Disagree
3,Lily,"effortless, pleasant, robotic, time-saving, zen",Neutral,Strongly disagree,Strongly disagree,Strongly agree,Strongly agree,Agree,Neutral,Neutral,...,Disagree,Neutral,Neutral,Agree,Strongly disagree,Disagree,Agree,Strongly agree,Neutral,Neutral
5,Julie,0,Agree,Disagree,Disagree,Agree,Agree,Agree,Agree,Agree,...,Agree,Agree,Disagree,Disagree,Agree,Agree,Disagree,Agree,Agree,Disagree
6,Julie,"smooth, safe, intuitive, automated, easy",Agree,Disagree,Neutral,Agree,Neutral,Agree,Agree,Agree,...,Neutral,Neutral,Neutral,Neutral,Neutral,Agree,Neutral,Agree,Neutral,Agree
7,Lily,"safe, freedom, easy, self-sufficient, slow",Agree,Neutral,Neutral,Neutral,Agree,Agree,Agree,Agree,...,Disagree,Neutral,Agree,Agree,Disagree,Disagree,Agree,Neutral,Disagree,Agree


In [161]:
adjs.head()

Unnamed: 0,System,Provide 5 words (adjectives) to characterize/describe a self-driving vehicle.
0,Julie,fun smart safe good enjoy
3,Lily,"effortless, pleasant, robotic, time-saving, zen"
5,Julie,0
6,Julie,"smooth, safe, intuitive, automated, easy"
7,Lily,"safe, freedom, easy, self-sufficient, slow"


In [162]:
adjs2 = adjs.dropna()
adjs2 = adjs2[adjs2.iloc[:,1]!="0"]
print(len(adjs)-len(adjs2))

6


In [163]:
adjs2

Unnamed: 0,System,Provide 5 words (adjectives) to characterize/describe a self-driving vehicle.
0,Julie,fun smart safe good enjoy
3,Lily,"effortless, pleasant, robotic, time-saving, zen"
6,Julie,"smooth, safe, intuitive, automated, easy"
7,Lily,"safe, freedom, easy, self-sufficient, slow"
10,Lily,"advanced, futuristic, expensive, efficient, in..."
11,Julie,"convenient, smart, effective, time saving, new"
12,Lily,exciting hitech cool fun unknown
13,Julie,"interesting, high-techy, scary, fascinating, s..."
15,Julie,"Trusting, high-tech, safer, blind-spot, futuri..."
16,Lily,simple beta future software assist


In [165]:
import re
#adjs2 = adjs.dropna()
for index, row in adjs2.iterrows():
    if ',' in row[1]:
        row[1] = re.split(', ',row[1])
    else:
        row[1] = re.split(' ', row[1])
adjs2

Unnamed: 0,System,Provide 5 words (adjectives) to characterize/describe a self-driving vehicle.
0,Julie,"[fun, smart, safe, good, enjoy]"
3,Lily,"[effortless, pleasant, robotic, time-saving, zen]"
6,Julie,"[smooth, safe, intuitive, automated, easy]"
7,Lily,"[safe, freedom, easy, self-sufficient, slow ]"
10,Lily,"[advanced, futuristic, expensive, efficient, i..."
11,Julie,"[convenient, smart, effective, time saving, new]"
12,Lily,"[exciting, hitech, cool, fun, unknown]"
13,Julie,"[interesting, high-techy, scary, fascinating, ..."
15,Julie,"[Trusting, high-tech, safer, blind-spot, futur..."
16,Lily,"[simple, beta, future, software, assist]"
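
Splitting on the literal `', '` keeps stray whitespace, as in the `slow ` token in row 7 above, and a token with a trailing space cannot match a lexicon entry. Below is a minimal sketch of a more forgiving tokenizer; `split_adjectives` is a hypothetical helper, not part of the notebook.

```python
import re

def split_adjectives(text):
    """Split a free-text answer on commas if present, otherwise on whitespace."""
    parts = re.split(r"\s*,\s*", text) if "," in text else text.split()
    # Strip leftover whitespace, lowercase, and drop empty tokens.
    return [p.strip().lower() for p in parts if p.strip()]

print(split_adjectives("safe, freedom, easy, self-sufficient, slow "))
print(split_adjectives("fun smart safe good enjoy"))
```

Multi-word entries in a comma-separated answer, such as "time saving", remain a single token under this rule, just as they do in the original cell.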


In [166]:
adjs_group_by_system = adjs2.groupby(['System'])
adjs_group_by_system.count()

Unnamed: 0_level_0,Provide 5 words (adjectives) to characterize/describe a self-driving vehicle.
System,Unnamed: 1_level_1
Julie,8
Lily,8


In [167]:
adjs2Temp = adjs2
for index, row in adjs2Temp.iterrows():
    posNum = 0
    negNum = 0
    for word in row[1]:
        word = word.lower()
        if word in posArr:
            posNum += 1
        if word in negArr:
            negNum -= 1
    total = posNum + negNum
    print(posNum, negNum)
    adjs2.loc[index,'Total'] = total

5 0
2 0
4 0
4 0
4 -1
3 0
3 -1
3 -1
2 0
0 0
3 -1
2 0
3 0
5 0
5 0
2 0
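
The scoring loop above counts positive hits, subtracts negative hits, and stores the net value as `Total`. The same idea can be sketched as a small function. `net_sentiment` and the two tiny lexicons are hypothetical, and sets are used here only because membership tests on a set are faster than on a list.

```python
def net_sentiment(words, positive, negative):
    """Return (number of positive hits) minus (number of negative hits)."""
    pos_hits = sum(w.lower() in positive for w in words)
    neg_hits = sum(w.lower() in negative for w in words)
    return pos_hits - neg_hits

# Tiny made-up lexicons; the notebook loads its lists from poswords.txt and negwords.txt.
positive = {"safe", "smart", "fun"}
negative = {"scary", "slow"}

mixed = net_sentiment(["Safe", "smart", "scary"], positive, negative)
all_negative = net_sentiment(["slow", "scary"], positive, negative)
print(mixed, all_negative)
```

An answer made up entirely of negative words gets a negative score, so for n words the net score ranges from -n to n.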


In [168]:
adjs2
# Total = positive hits minus negative hits, so -5 <= Total <= 5 for a five-word answer

Unnamed: 0,System,Provide 5 words (adjectives) to characterize/describe a self-driving vehicle.,Total
0,Julie,"[fun, smart, safe, good, enjoy]",5.0
3,Lily,"[effortless, pleasant, robotic, time-saving, zen]",2.0
6,Julie,"[smooth, safe, intuitive, automated, easy]",4.0
7,Lily,"[safe, freedom, easy, self-sufficient, slow ]",4.0
10,Lily,"[advanced, futuristic, expensive, efficient, i...",3.0
11,Julie,"[convenient, smart, effective, time saving, new]",3.0
12,Lily,"[exciting, hitech, cool, fun, unknown]",2.0
13,Julie,"[interesting, high-techy, scary, fascinating, ...",2.0
15,Julie,"[Trusting, high-tech, safer, blind-spot, futur...",2.0
16,Lily,"[simple, beta, future, software, assist]",0.0


# Mann-Whitney U test

We'll use the Mann-Whitney U test to determine whether responses from Julie users and Lily users differ significantly. It is a rank-based test, so it suits ordinal survey scores and small samples without assuming normality.
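
As a rough illustration of the call used in the cells that follow, the sketch below runs `scipy.stats.mannwhitneyu` on two short made-up samples. The numbers are hypothetical and are not drawn from the survey.

```python
from scipy import stats

# Hypothetical per-respondent scores for each group.
julie = [3, 4, 5, 5, 4]
lily = [5, 3, 4, 2, 3]

# alternative="two-sided" tests for a difference in either direction.
result = stats.mannwhitneyu(julie, lily, alternative="two-sided")
print(result.statistic, result.pvalue)
```

A p-value above the chosen threshold (commonly 0.05) means the data do not show a significant difference. It does not show that the groups are identical, which matters with samples as small as eleven per group.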

In [169]:
import scipy
from scipy import stats

### Trust in Ridesharing Apps like Uber and Lyft

In [170]:
col = 'LU_Trust'  # column 40, the combined score added above
print(col)
x = df.loc[df['System']=='Julie', col]
y = df.loc[df['System']=='Lily', col]
scipy.stats.mannwhitneyu(x, y, alternative = 'two-sided')

LU_Trust


MannwhitneyuResult(statistic=74.5, pvalue=0.368094907772863)

There is no statistically significant difference between the two groups' trust in riding Lyft or Uber (U = 74.5, p ≈ 0.37). That is a good sign: the groups start from a comparable baseline.

### Positive/Negative Word Choice Score

In [171]:
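# on pos/negative words
print("Count of positive vs negative words")
# NOTE: x and y still hold the LU_Trust series at this point, so the two averages
# printed here are LU_Trust means. Move these prints below the reassignment to
# report word-score averages instead.
print( "Average for Julie: ", x.mean() )
print( "Average for Lily:  ", y.mean() )
x = adjs2.loc[adjs2['System']=="Julie"][adjs2.columns[2]]
y = adjs2.loc[adjs2['System']=="Lily"][adjs2.columns[2]]
scipy.stats.mannwhitneyu(x, y, alternative = 'two-sided')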

Count of positive vs negative words
Average for Julie:  4.363636363636363
Average for Lily:   3.727272727272727


MannwhitneyuResult(statistic=36.5, pvalue=0.6586874174078845)

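The Mann-Whitney U test shows no statistically significant difference in the positive/negative word score between Julie and Lily users (U = 36.5, p ≈ 0.66). The averages printed above (4.36 vs. 3.73) were computed before `x` and `y` were reassigned, so they are the LU_Trust group means rather than word-score means.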
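
One way to keep the reported averages tied to the tested column is to compute them with a `groupby` on that column. The sketch below uses made-up scores; in the notebook the equivalent column would be `adjs2['Total']`.

```python
import pandas as pd

# Hypothetical word scores standing in for adjs2["Total"].
scores = pd.DataFrame({
    "System": ["Julie", "Lily", "Julie", "Lily", "Julie", "Lily"],
    "Total": [5, 2, 4, 4, 3, 0],
})

# Group means come from the same column that would be passed to the test.
word_means = scores.groupby("System")["Total"].mean()
print(word_means)
```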