Notes:<Br>
Year: 2912<Br>
We've received a transmission from four lightyears away<Br>
The Spaceship Titanic was an interstellar passenger liner launched a month ago. With almost 13,000 passengers on board, the vessel set out on its maiden voyage transporting emigrants from our solar system to three newly habitable exoplanets orbiting nearby stars.
You are challenged to predict which passengers were transported by the anomaly using records recovered from the spaceship’s damaged computer system.<Br><Br>

Evaluation Metric:<Br>
Submissions are evaluated based on their classification accuracy, the percentage of predicted labels that are correct.<Br><Br>

Submission Format:<Br>
The submission format for the competition is a csv file with the following format:<Br><Br>

PassengerId,Transported<Br>
0013_01,False<Br>
0018_01,False<Br>
0019_01,False<Br>
0021_01,False<Br>
etc.
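As a rough illustration of the metric and the file layout described above, the sketch below computes classification accuracy and builds a submission in the expected two-column format. The `accuracy` helper, the all-`False` predictions, and the in-memory buffer are assumptions made for the example, not part of the competition tooling.

```python
import io
import pandas as pd

def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    y_true = pd.Series(list(y_true))
    y_pred = pd.Series(list(y_pred))
    return float((y_true == y_pred).mean())

# Hypothetical predictions for the four example ids shown above.
submission = pd.DataFrame({
    "PassengerId": ["0013_01", "0018_01", "0019_01", "0021_01"],
    "Transported": [False, False, False, False],
})

# Write to an in-memory buffer here; pass a file path instead to save to disk.
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Three of four made-up labels match, so the accuracy is 0.75.
acc = accuracy([True, False, False, True], [True, False, True, True])
```

`index=False` matters here: without it pandas would write the row index as an extra unnamed first column.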

In [343]:
# Personal records for about two-thirds (~8700) of the passengers, to be used as training data.
# PassengerId - A unique Id for each passenger. Each Id takes the form gggg_pp where gggg indicates a group the passenger is travelling with and pp is their number within the group. People in a group are often family members, but not always.
# HomePlanet - The planet the passenger departed from, typically their planet of permanent residence.
# CryoSleep - Indicates whether the passenger elected to be put into suspended animation for the duration of the voyage. Passengers in cryosleep are confined to their cabins.
# Cabin - The cabin number where the passenger is staying. Takes the form deck/num/side, where side can be either P for Port or S for Starboard.
# Destination - The planet the passenger will be debarking to.
# Age - The age of the passenger.
# VIP - Whether the passenger has paid for special VIP service during the voyage.
# RoomService, FoodCourt, ShoppingMall, Spa, VRDeck - Amount the passenger has billed at each of the Spaceship Titanic's many luxury amenities.
# Name - The first and last names of the passenger.
# Transported - Whether the passenger was transported to another dimension. This is the target, the column you are trying to predict.

In [344]:
import pandas as pd

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

In [345]:
train = pd.read_csv(r'../data/train.csv')
train.head()

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True


## Feature Engineering

##### Checking whether all members of a group share the same home planet, destination and VIP status

In [346]:
# Passenger Group:
# Check whether all members of a group are from the same planet, have the same destination, and share the same VIP status
train['PassengerGroup'] = train['PassengerId'].str[:4]
x = train[['PassengerGroup','HomePlanet']].groupby('PassengerGroup').nunique().reset_index()
x.sort_values(['HomePlanet'],ascending=False)
# x.groupby('HomePlanet').nunique()
x['HomePlanet'].value_counts(normalize=True)*100
# So presumably, everyone in a group is usually from the same HomePlanet. For NaNs, we can impute the HomePlanet value from other passengers in the same group
del(x)

In [347]:
x = train[['PassengerGroup','Destination']].groupby('PassengerGroup').nunique().reset_index()
x.sort_values(['Destination'],ascending=False)
# x.groupby('Destination').nunique()
x['Destination'].value_counts(normalize=True)*100
del(x)

In [348]:
x = train[['PassengerGroup','VIP']].groupby('PassengerGroup').nunique().reset_index()
x.sort_values(['VIP'],ascending=False)
# x.groupby('VIP').nunique()
x['VIP'].value_counts(normalize=True)*100
del(x)

# So everyone from one group is usually from the same HomePlanet, but may have a different destination & different VIP status
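The three checks above repeat the same pattern, so they could be folded into one small helper. This is only a sketch on made-up rows; `group_consistency` and the `toy` frame are hypothetical names introduced for the example.

```python
import pandas as pd

def group_consistency(df, group_col, value_col):
    """Percentage of groups by number of distinct non-null values in value_col."""
    distinct = df.groupby(group_col)[value_col].nunique()
    return distinct.value_counts(normalize=True).sort_index() * 100

# Made-up rows: group 0002 has two destinations, group 0004 has no HomePlanet.
toy = pd.DataFrame({
    "PassengerGroup": ["0001", "0002", "0002", "0003", "0003", "0004"],
    "HomePlanet": ["Europa", "Earth", "Earth", "Mars", "Mars", None],
    "Destination": ["TRAPPIST-1e", "TRAPPIST-1e", "55 Cancri e",
                    "TRAPPIST-1e", "TRAPPIST-1e", "TRAPPIST-1e"],
})

home = group_consistency(toy, "PassengerGroup", "HomePlanet")
dest = group_consistency(toy, "PassengerGroup", "Destination")
```

Because `nunique()` ignores nulls, an index value of 0 in the result means that no member of the group has a recorded value.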

In [349]:
# Divide Cabin into 3 variables
train['Cabin1'] = train['Cabin'].str.split('/',expand=True)[0]
train['Cabin2'] = train['Cabin'].str.split('/',expand=True)[1]
train['Cabin3'] = train['Cabin'].str.split('/',expand=True)[2]
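An alternative sketch that splits the column only once and converts the cabin number to a numeric type. The `toy` frame is made up for illustration; the column names follow the ones used above.

```python
import pandas as pd

toy = pd.DataFrame({"Cabin": ["B/0/P", "F/12/S", None]})

# Split once into three columns; rows with a missing Cabin stay missing.
parts = toy["Cabin"].str.split("/", expand=True)
parts.columns = ["Cabin1", "Cabin2", "Cabin3"]

# Cabin2 is the cabin number, so convert it to a numeric dtype (NaN when missing).
parts["Cabin2"] = pd.to_numeric(parts["Cabin2"])

toy = toy.join(parts)
```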

In [350]:
# Total Amount Spent:
train['TotalSpent'] = train['RoomService'] + train['FoodCourt'] + train['ShoppingMall'] + train['Spa'] + train['VRDeck']
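Note that adding the columns with `+` yields NaN whenever any single amenity is missing. The sketch below, on made-up rows, contrasts that behaviour with a sum that skips missing values; which one is preferable depends on how the missing amounts are treated later.

```python
import numpy as np
import pandas as pd

spend_cols = ["RoomService", "FoodCourt", "ShoppingMall", "Spa", "VRDeck"]
toy = pd.DataFrame(
    [[0.0, 0.0, 0.0, 0.0, np.nan],
     [109.0, 9.0, 25.0, 549.0, 44.0]],
    columns=spend_cols,
)

# Same behaviour as chained "+": any missing amenity makes the total NaN.
strict_total = toy[spend_cols].sum(axis=1, skipna=False)

# Alternative: ignore missing amenities and sum what is known.
known_total = toy[spend_cols].sum(axis=1)
```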

In [351]:
# Group Size:
x = train[['PassengerId','PassengerGroup']].groupby('PassengerGroup').nunique().reset_index()
x.columns = ['PassengerGroup','GroupSize']
train = pd.merge(left=train, right=x, on='PassengerGroup',how='left')
del(x)

In [352]:
# Family Size:
train['Last Name'] = train['Name'].str.split(' ',expand=True)[1]
x = train[['PassengerId','Last Name']].groupby('Last Name').nunique().reset_index()
x.columns = ['Last Name','FamilySize']
train = pd.merge(left=train, right=x, on='Last Name',how='left')
del(x)
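The group and family sizes can also be computed without a temporary frame and a merge by using `groupby(...).transform`. A minimal sketch on made-up rows (the `toy` frame is hypothetical):

```python
import pandas as pd

toy = pd.DataFrame({
    "PassengerId": ["0003_01", "0003_02", "0004_01", "0005_01"],
    "Name": ["Altark Susent", "Solam Susent", "Willy Santantines", "Sandie Hinetthews"],
})
toy["PassengerGroup"] = toy["PassengerId"].str[:4]
toy["Last Name"] = toy["Name"].str.split(" ", expand=True)[1]

# transform() returns one value per original row, so no merge is needed.
toy["GroupSize"] = toy.groupby("PassengerGroup")["PassengerId"].transform("nunique")
toy["FamilySize"] = toy.groupby("Last Name")["PassengerId"].transform("nunique")
```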

## EDA & Missing Value Treatment:

In [353]:
train.columns

Index(['PassengerId', 'HomePlanet', 'CryoSleep', 'Cabin', 'Destination', 'Age',
       'VIP', 'RoomService', 'FoodCourt', 'ShoppingMall', 'Spa', 'VRDeck',
       'Name', 'Transported', 'PassengerGroup', 'Cabin1', 'Cabin2', 'Cabin3',
       'TotalSpent', 'GroupSize', 'Last Name', 'FamilySize'],
      dtype='object')

In [354]:
train.head(10)

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,PassengerGroup,Cabin1,Cabin2,Cabin3,TotalSpent,GroupSize,Last Name,FamilySize
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False,1,B,0,P,0.0,1,Ofracculy,1.0
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True,2,F,0,S,736.0,1,Vines,4.0
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False,3,A,0,S,10383.0,2,Susent,6.0
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False,3,A,0,S,5176.0,2,Susent,6.0
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True,4,F,1,S,1091.0,1,Santantines,6.0
5,0005_01,Earth,False,F/0/P,PSO J318.5-22,44.0,False,0.0,483.0,0.0,291.0,0.0,Sandie Hinetthews,True,5,F,0,P,774.0,1,Hinetthews,7.0
6,0006_01,Earth,False,F/2/S,TRAPPIST-1e,26.0,False,42.0,1539.0,3.0,0.0,0.0,Billex Jacostaffey,True,6,F,2,S,1584.0,2,Jacostaffey,7.0
7,0006_02,Earth,True,G/0/S,TRAPPIST-1e,28.0,False,0.0,0.0,0.0,0.0,,Candra Jacostaffey,True,6,G,0,S,,2,Jacostaffey,7.0
8,0007_01,Earth,False,F/3/S,TRAPPIST-1e,35.0,False,0.0,785.0,17.0,216.0,0.0,Andona Beston,True,7,F,3,S,1018.0,1,Beston,5.0
9,0008_01,Europa,True,B/1/P,55 Cancri e,14.0,False,0.0,0.0,0.0,0.0,0.0,Erraiam Flatic,True,8,B,1,P,0.0,3,Flatic,3.0


##### Destination:

In [355]:
# Check if passengers from a group travel to the same destination
x = train[['PassengerGroup','Destination']].groupby('PassengerGroup').nunique().reset_index()
# Need to keep only records with a Destination for this
x.sort_values(['Destination'],ascending=False)
print(pd.merge(left=x.groupby('Destination').nunique(),right=x['Destination'].value_counts(normalize=True)*100,left_index=True,right_index=True))
del(x)
# In most groups (~87%) every member travels to the same destination, while ~1.7% of groups have no recorded destination at all.
# We can use this to impute missing destinations from other members of the same group

             PassengerGroup  proportion
Destination                            
0                       103    1.656748
1                      5397   86.810359
2                       668   10.744732
3                        49    0.788161


In [356]:
# Get list of passenger groups for passengerids with no Destination:
train.loc[train.Destination.isnull(),'PassengerGroup'].unique()
# Checking if any of these have a Destination in the dataset:
x = train.loc[(~train.Destination.isnull()) & (train.PassengerGroup.isin(train.loc[train.Destination.isnull(),'PassengerGroup'].unique())),['PassengerGroup','Destination']]
# Some of these PassengerGroups may have multiple destinations. If so, we take the most frequent one; ties are broken by row order, not at random
x = pd.DataFrame(x.groupby(['PassengerGroup','Destination'],as_index=False).size())
# Assign the sorted frame back, otherwise first() below would not pick the most frequent destination
x = x.sort_values(['PassengerGroup','size'],ascending=False)
x = x.groupby('PassengerGroup').first().reset_index()[['PassengerGroup','Destination']]
x.columns = ['PassengerGroup','Destination2']
train = pd.merge(left=train, right=x, on='PassengerGroup',how='left')
train['Destination'] = train['Destination'].fillna(train['Destination2'])
# train.drop(['Destination2'], axis=1, inplace=True)
# train.head()
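The group-based imputation idea can be written as a reusable function. The sketch below is one possible formulation, not the exact code used in this notebook: `fill_from_group` and the `toy` frame are hypothetical, and ties are broken alphabetically rather than at random.

```python
import pandas as pd

def fill_from_group(df, group_col, value_col):
    """Fill missing value_col with the most frequent value in the same group."""
    known = df.dropna(subset=[value_col])
    counts = known.groupby([group_col, value_col], as_index=False).size()
    # Highest count first within each group; ties fall back to alphabetical order.
    counts = counts.sort_values([group_col, "size", value_col],
                                ascending=[True, False, True])
    best = counts.drop_duplicates(group_col).set_index(group_col)[value_col]
    out = df.copy()
    out[value_col] = out[value_col].fillna(out[group_col].map(best))
    return out

toy = pd.DataFrame({
    "PassengerGroup": ["0010", "0010", "0010", "0010", "0011", "0012", "0012"],
    "Destination": ["TRAPPIST-1e", "TRAPPIST-1e", "55 Cancri e", None,
                    None, "PSO J318.5-22", None],
})
filled = fill_from_group(toy, "PassengerGroup", "Destination")
```

Passengers travelling alone (group `0011` in the toy data) have no group-mate to borrow from, so their value stays missing and needs a different strategy.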

In [357]:
train[train['Destination'].isnull()]

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,PassengerGroup,Cabin1,Cabin2,Cabin3,TotalSpent,GroupSize,Last Name,FamilySize,Destination2
139,0152_01,Earth,False,F/32/P,,41.0,False,0.0,0.0,0.0,0.0,607.0,Andan Estron,False,152,F,32.0,P,607.0,1,Estron,5.0,
347,0382_01,,False,G/64/P,,23.0,False,348.0,0.0,0.0,4.0,368.0,Blanie Floydendley,False,382,G,64.0,P,720.0,1,Floydendley,5.0,
430,0462_01,Earth,True,G/67/S,,50.0,False,0.0,0.0,0.0,0.0,0.0,Ronia Sosanturney,False,462,G,67.0,S,0.0,1,Sosanturney,3.0,
547,0576_01,Earth,False,F/107/S,,21.0,False,0.0,,625.0,110.0,0.0,Melice Herry,False,576,F,107.0,S,,1,Herry,2.0,
620,0645_01,Earth,False,G/98/P,,20.0,False,1724.0,0.0,0.0,1.0,0.0,Troyra Grahangory,False,645,G,98.0,P,1725.0,1,Grahangory,5.0,
719,0761_01,Europa,False,C/26/P,,33.0,False,0.0,3879.0,0.0,48.0,67.0,Izarki Fliblerolt,True,761,C,26.0,P,3994.0,1,Fliblerolt,3.0,
742,0779_01,Earth,False,F/162/P,,21.0,False,562.0,2.0,0.0,0.0,11.0,Holey Rodger,False,779,F,162.0,P,575.0,1,Rodger,7.0,
877,0939_01,Earth,True,G/135/P,,15.0,False,0.0,0.0,0.0,0.0,0.0,Eulah Peter,True,939,G,135.0,P,0.0,1,Peter,6.0,
906,0979_01,Europa,False,B/40/S,,44.0,False,0.0,0.0,0.0,0.0,0.0,Gimph Fushausive,True,979,B,40.0,S,0.0,1,Fushausive,4.0,
937,1000_01,Mars,False,D/39/P,,18.0,False,885.0,0.0,32.0,0.0,0.0,Alus Harte,False,1000,D,39.0,P,917.0,1,Harte,2.0,


##### HomePlanet

In [358]:
len(train[train['HomePlanet'].isnull()])

201

In [359]:
x = train[['PassengerGroup','HomePlanet']].groupby('PassengerGroup').nunique().reset_index()
x.sort_values(['HomePlanet'],ascending=False,inplace=True)
print(pd.merge(left=x.groupby('HomePlanet').nunique(),right=x['HomePlanet'].value_counts(normalize=True)*100,left_index=True,right_index=True))
del(x)

            PassengerGroup  proportion
HomePlanet                            
0                      110    1.769342
1                     6107   98.230658


In [360]:
# Get list of passenger groups for passengerids with no HomePlanet:
train.loc[train.HomePlanet.isnull(),'PassengerGroup'].unique()
# Checking if any of these have a HomePlanet in the dataset:
x = train.loc[(~train.HomePlanet.isnull()) & (train.PassengerGroup.isin(train.loc[train.HomePlanet.isnull(),'PassengerGroup'].unique())),['PassengerGroup','HomePlanet']]
x.columns = ['PassengerGroup','HomePlanet2']
x.drop_duplicates(inplace=True)
train = pd.merge(left=train, right=x, on='PassengerGroup',how='left')
train['HomePlanet'] = train['HomePlanet'].fillna(train['HomePlanet2'])
train.drop(['HomePlanet2'], axis=1, inplace=True)
train.head()

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,PassengerGroup,Cabin1,Cabin2,Cabin3,TotalSpent,GroupSize,Last Name,FamilySize,Destination2
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False,1,B,0,P,0.0,1,Ofracculy,1.0,
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True,2,F,0,S,736.0,1,Vines,4.0,
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False,3,A,0,S,10383.0,2,Susent,6.0,
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False,3,A,0,S,5176.0,2,Susent,6.0,
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True,4,F,1,S,1091.0,1,Santantines,6.0,


In [361]:
len(train[train['HomePlanet'].isnull()])

111
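The remaining nulls are presumably passengers whose whole group has no recorded HomePlanet, so the group-based fill has nothing to copy from. A small diagnostic sketch on made-up rows (the `toy` frame is hypothetical) separates fillable from unfillable rows:

```python
import pandas as pd

toy = pd.DataFrame({
    "PassengerGroup": ["0020", "0020", "0021", "0022", "0022"],
    "HomePlanet": ["Earth", None, None, None, None],
})

# Number of non-null HomePlanet values in each row's group.
known_per_group = toy.groupby("PassengerGroup")["HomePlanet"].transform("count")

missing = toy["HomePlanet"].isna()
fillable = missing & (known_per_group > 0)
unfillable = missing & (known_per_group == 0)
```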

##### Cryosleep:

In [362]:
train[train['CryoSleep'].isnull()]

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,PassengerGroup,Cabin1,Cabin2,Cabin3,TotalSpent,GroupSize,Last Name,FamilySize,Destination2
92,0099_02,Earth,,G/12/P,TRAPPIST-1e,2.0,False,0.0,0.0,0.0,0.0,0.0,Thewis Connelson,True,99,G,12.0,P,0.0,2,Connelson,4.0,
98,0105_01,Earth,,F/21/P,TRAPPIST-1e,27.0,False,0.0,0.0,570.0,2.0,131.0,Carry Cleachrand,False,105,F,21.0,P,703.0,1,Cleachrand,7.0,
104,0110_02,Europa,,B/5/P,TRAPPIST-1e,40.0,False,0.0,331.0,0.0,0.0,1687.0,Aldeba Bootious,False,110,B,5.0,P,2018.0,4,Bootious,2.0,
111,0115_01,Mars,,F/24/P,TRAPPIST-1e,26.0,False,0.0,0.0,0.0,0.0,,Rohs Pead,True,115,F,24.0,P,,1,Pead,5.0,
152,0173_01,Earth,,E/11/S,TRAPPIST-1e,58.0,False,0.0,985.0,0.0,5.0,0.0,Hilip Grifford,True,173,E,11.0,S,990.0,1,Grifford,4.0,
175,0198_01,Earth,,G/30/P,PSO J318.5-22,52.0,False,0.0,0.0,0.0,0.0,0.0,Jeroy Cookson,True,198,G,30.0,P,0.0,1,Cookson,7.0,
224,0241_01,Europa,,E/11/P,55 Cancri e,33.0,False,0.0,1249.0,0.0,4812.0,1116.0,Alas Dischod,False,241,E,11.0,P,7177.0,1,Dischod,4.0,
266,0290_03,Europa,,B/7/S,TRAPPIST-1e,43.0,False,0.0,0.0,0.0,0.0,0.0,Dhenar Excialing,True,290,B,7.0,S,0.0,4,Excialing,7.0,
314,0348_02,Mars,,,TRAPPIST-1e,36.0,False,520.0,0.0,1865.0,0.0,0.0,Weet Mane,True,348,,,,2385.0,2,Mane,4.0,
392,0433_01,Europa,,B/20/P,55 Cancri e,27.0,False,0.0,0.0,0.0,0.0,0.0,Hekark Mormonized,True,433,B,20.0,P,0.0,2,Mormonized,5.0,


In [363]:
# Checking relation between Cryosleep & Transported
print(pd.concat([train.groupby(['Cabin1','CryoSleep'])[['Transported']].value_counts(dropna=False),train.groupby(['Cabin1','CryoSleep'])[['Transported']].value_counts(normalize=True, dropna=False)],axis=1))

                              count  proportion
Cabin1 CryoSleep Transported                   
A      False     False          123    0.675824
                 True            59    0.324176
       True      True            64    0.941176
                 False            4    0.058824
B      False     False          197    0.577713
                 True           144    0.422287
       True      True           416    0.992840
                 False            3    0.007160
C      False     False          231    0.537209
                 True           199    0.462791
       True      True           292    0.993197
                 False            2    0.006803
D      False     False          263    0.722527
                 True           101    0.277473
       True      True           103    0.990385
                 False            1    0.009615
E      False     False          491    0.713663
                 True           197    0.286337
       True      True           109    0

While passengers in CryoSleep are much more likely to be transported overall, the effect also seems to depend on the Cabin1 (deck) variable: being on deck G or E appears to make it less likely that you are transported.
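A more compact way to compare decks is to pivot the transport rate so that each deck is one row. The sketch below uses made-up rows and hypothetical labels (`State`, `awake`, `cryosleep`), so the numbers are not those of the real data.

```python
import pandas as pd

toy = pd.DataFrame({
    "Cabin1": ["A", "A", "A", "A", "G", "G", "G", "G"],
    "CryoSleep": [True, True, False, False, True, True, False, False],
    "Transported": [True, True, False, True, True, False, False, False],
})

# Readable labels for the two CryoSleep states.
state = toy["CryoSleep"].map({True: "cryosleep", False: "awake"}).rename("State")

# Mean of a boolean column is the share of True, i.e. the transport rate.
rates = toy.groupby(["Cabin1", state])["Transported"].mean().unstack("State")

# How much higher the transport rate is for cryosleep passengers on each deck.
gap = rates["cryosleep"] - rates["awake"]
```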

In [364]:
# I am inclined to impute False when CryoSleep is NA, but let's check what share of each passenger's group & family is in cryosleep:
print("For passengerid 2822_02:")
print(train.loc[train.PassengerGroup=='2822',['PassengerId','CryoSleep']].groupby('CryoSleep', dropna=False).nunique())
print(train.loc[train['Last Name']=='Harverez',['PassengerId','CryoSleep']].groupby('CryoSleep', dropna=False).nunique())

print("For passengerid 5090_01:")
print(train.loc[train.PassengerGroup=='5090',['PassengerId','CryoSleep']].groupby('CryoSleep', dropna=False).nunique())

print("For passengerid 6405_02:")
print(train.loc[train.PassengerGroup=='6405',['PassengerId','CryoSleep']].groupby('CryoSleep', dropna=False).nunique())
print(train.loc[train['Last Name']=='Toddleton',['PassengerId','CryoSleep']].groupby('CryoSleep', dropna=False).nunique())

print("For passengerid 7584_01:")
print(train.loc[train.PassengerGroup=='7584',['PassengerId','CryoSleep']].groupby('CryoSleep', dropna=False).nunique())
print(train.loc[train['Last Name']=='Swingse',['PassengerId','CryoSleep']].groupby('CryoSleep', dropna=False).nunique())

print("Overall:")
print(train[['PassengerId','CryoSleep']].groupby('CryoSleep', dropna=False).nunique())

For passengerid 2822_02:
           PassengerId
CryoSleep             
True                 4
NaN                  1
           PassengerId
CryoSleep             
False                2
True                 4
NaN                  1
For passengerid 5090_01:
           PassengerId
CryoSleep             
NaN                  1
False                3
True                 2
For passengerid 6405_02:
           PassengerId
CryoSleep             
False                2
NaN                  1
True                 1
           PassengerId
CryoSleep             
NaN                  2
False                5
True                 1
For passengerid 7584_01:
           PassengerId
CryoSleep             
NaN                  1
False                1
True                 1
           PassengerId
CryoSleep             
True                 4
NaN                  1
False                1
Overall:
           PassengerId
CryoSleep             
False             5439
True              3037
NaN              

In [365]:
# There doesn't seem to be any consistent pattern within the groups

In [366]:
# Looking for any specific patterns for people in Cryosleep vs not in Cryosleep:
# First checking if it has something to do with age. Maybe very young or very old people are more likely to do this?
train.groupby('CryoSleep',dropna=False).agg({'Age':['mean','median'],'TotalSpent':['mean','median'],'GroupSize':['mean','median'],'FamilySize':['mean','median']})

CryoSleep,Age mean,Age median,TotalSpent mean,TotalSpent median,GroupSize mean,GroupSize median,FamilySize mean,FamilySize median
False,29.651319,27.0,2304.194614,1048.5,1.933444,1.0,5.44549,5.0
True,27.405415,26.0,0.0,0.0,2.208429,2.0,5.389244,5.0
NaN,27.921296,25.0,1359.901554,716.0,2.175115,2.0,5.541063,5.0


In [367]:
# So people in CryoSleep have no spend. This can be used for missing value treatment. Checking what share of people with zero spend are in cryosleep:
train[train['TotalSpent']==0][['CryoSleep']].value_counts(normalize=True)

CryoSleep
True         0.851266
False        0.148734
Name: proportion, dtype: float64
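These two observations suggest a simple rule for missing CryoSleep values: any positive spend implies the passenger was awake, and zero spend makes cryosleep likely (about 85% in the output above). The sketch below applies that rule to made-up rows. Treating zero spend as `True` is a heuristic assumption, not a certainty, and rows with an unknown total are left missing.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "CryoSleep": [True, False, None, None, None],
    "TotalSpent": [0.0, 736.0, 703.0, 0.0, np.nan],
})

missing = toy["CryoSleep"].isna()
imputed = toy["CryoSleep"].copy()

# Positive spend: passengers in cryosleep are confined to their cabins, so mark as awake.
imputed.loc[missing & (toy["TotalSpent"] > 0)] = False

# Zero spend: heuristic guess that the passenger was in cryosleep.
imputed.loc[missing & (toy["TotalSpent"] == 0)] = True
```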

In [368]:
len(train[train['VIP']==True]), len(train), len(train[train.VIP.isnull()])

(199, 8693, 203)

In [369]:
# Check how many in Cryosleep are VIP
# Also check if VIP have lower spends
print(pd.concat([train.groupby(['VIP','CryoSleep'])[['Transported']].value_counts(dropna=False),train.groupby(['VIP','CryoSleep'])[['Transported']].value_counts(normalize=True, dropna=False)],axis=1))

                             count  proportion
VIP   CryoSleep Transported                   
False False     False         3455    0.671787
                True          1688    0.328213
      True      True          2406    0.818089
                False          535    0.181911
True  False     False          121    0.691429
                True            54    0.308571
      True      True            21    1.000000


In [370]:
train.head()

Unnamed: 0,PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported,PassengerGroup,Cabin1,Cabin2,Cabin3,TotalSpent,GroupSize,Last Name,FamilySize,Destination2
0,0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False,1,B,0,P,0.0,1,Ofracculy,1.0,
1,0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True,2,F,0,S,736.0,1,Vines,4.0,
2,0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False,3,A,0,S,10383.0,2,Susent,6.0,
3,0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False,3,A,0,S,5176.0,2,Susent,6.0,
4,0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True,4,F,1,S,1091.0,1,Santantines,6.0,
