# Challenge - Spaceship Titanic

### Objective:
#### Predict which passengers will be transported to an alternate dimension.

---

## Data description:

| Variable | Definition |
|----------|------------|
| PassengerId | A unique ID for each passenger. |
| HomePlanet | The planet the passenger departed from, typically their planet of permanent residence. |
| CryoSleep | Indicates whether the passenger elected to be put into suspended animation during the voyage. |
| Cabin | The cabin number where the passenger is staying. Takes the form deck/num/side, where side can be P for Port or S for Starboard. |
| Destination | The planet where the passenger will disembark. |
| Age | The passenger's age. |
| VIP | Whether the passenger paid for special VIP service during the voyage. |
| RoomService, FoodCourt, ShoppingMall, Spa, VRDeck | Amount the passenger was billed at each of the Spaceship Titanic's many luxury amenities. |
| Name | The passenger's first and last names. |
| Transported | Whether the passenger was transported to another dimension. This is the target, the column you are trying to predict. |
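
The deck/num/side layout described for `Cabin` suggests that the column could be split into three separate features. The following is only a minimal sketch of that idea: the sample rows and the new column names `deck`, `num` and `side` are assumptions made for illustration and are not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample rows in the deck/num/side format; None stands for a missing cabin.
sample = pd.DataFrame({"Cabin": ["B/0/P", "F/1/S", None]})

# Split on "/" into three columns; rows with a missing cabin stay missing.
parts = sample["Cabin"].str.split("/", expand=True)
parts.columns = ["deck", "num", "side"]  # assumed names, chosen for this sketch

sample = sample.join(parts)

# The cabin number arrives as text, so convert it to a numeric type.
sample["num"] = pd.to_numeric(sample["num"])

print(sample)
```

Whether these three parts are useful for predicting `Transported` would still have to be checked against the data.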

---

#### Data analysis:

In [163]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

train = pd.read_csv("./CSV's_Titanic_Space/train.csv", index_col=0)
test  = pd.read_csv("./CSV's_Titanic_Space/test.csv", index_col=0)

train.head()

| PassengerId | HomePlanet | CryoSleep | Cabin | Destination | Age | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Name | Transported |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0001_01 | Europa | False | B/0/P | TRAPPIST-1e | 39.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Maham Ofracculy | False |
| 0002_01 | Earth | False | F/0/S | TRAPPIST-1e | 24.0 | False | 109.0 | 9.0 | 25.0 | 549.0 | 44.0 | Juanna Vines | True |
| 0003_01 | Europa | False | A/0/S | TRAPPIST-1e | 58.0 | True | 43.0 | 3576.0 | 0.0 | 6715.0 | 49.0 | Altark Susent | False |
| 0003_02 | Europa | False | A/0/S | TRAPPIST-1e | 33.0 | False | 0.0 | 1283.0 | 371.0 | 3329.0 | 193.0 | Solam Susent | False |
| 0004_01 | Earth | False | F/1/S | TRAPPIST-1e | 16.0 | False | 303.0 | 70.0 | 151.0 | 565.0 | 2.0 | Willy Santantines | True |


### Checking the values in each column

In [164]:
train['HomePlanet'].unique()

array(['Europa', 'Earth', 'Mars', nan], dtype=object)

In [165]:
train['Destination'].unique()

array(['TRAPPIST-1e', 'PSO J318.5-22', '55 Cancri e', nan], dtype=object)

In [166]:
train['CryoSleep'].unique()

array([False, True, nan], dtype=object)

In [167]:
train['CryoSleep'].isnull().sum()

217

In [168]:
train['VIP'].unique()

array([False, True, nan], dtype=object)
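
`unique()` shows that a column contains `nan` but not how often each value occurs. As a small sketch, `value_counts(dropna=False)` reports the frequency of every value, including the missing ones. The series below is made-up data using the `HomePlanet` labels seen earlier, not the real training column.

```python
import pandas as pd

# Hypothetical values; None stands for a missing entry.
demo = pd.Series(["Europa", "Earth", "Earth", None], name="HomePlanet")

# dropna=False keeps the missing entries as their own row in the result.
counts = demo.value_counts(dropna=False)

print(counts)
```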

In [169]:
train.isnull().sum()

HomePlanet      201
CryoSleep       217
Cabin           199
Destination     182
Age             179
VIP             203
RoomService     181
FoodCourt       183
ShoppingMall    208
Spa             183
VRDeck          188
Name            200
Transported       0
dtype: int64

In [170]:
test.isnull().sum()

HomePlanet       87
CryoSleep        93
Cabin           100
Destination      92
Age              91
VIP              93
RoomService      82
FoodCourt       106
ShoppingMall     98
Spa             101
VRDeck           80
Name             94
dtype: int64
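
Raw counts are hard to compare between `train` and `test` because the two files have different numbers of rows. One possible way to put them on the same scale is to report the percentage of missing values per column. The helper name `missing_report` and the demo frame below are assumptions introduced for this sketch.

```python
import pandas as pd


def missing_report(df):
    """Return the count and percentage of missing values for each column."""
    counts = df.isnull().sum()
    percent = (counts / len(df) * 100).round(2)
    return pd.DataFrame({"missing": counts, "percent": percent})


# Hypothetical four-row frame used only to demonstrate the helper.
demo = pd.DataFrame(
    {
        "Age": [39.0, None, 58.0, 33.0],
        "VIP": [False, True, None, None],
    }
)

report = missing_report(demo)
print(report)
```

The same call could be applied as `missing_report(train)` and `missing_report(test)`.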

## Conversion

In [171]:
# Convert HomePlanet to numeric codes
train['HomePlanet'] = train['HomePlanet'].map({'Europa': 0, 'Earth': 1, 'Mars': 2})
test['HomePlanet'] = test['HomePlanet'].map({'Europa': 0, 'Earth': 1, 'Mars': 2})
# Convert Destination to numeric codes
train['Destination'] = train['Destination'].map({'TRAPPIST-1e': 0, 'PSO J318.5-22': 1, '55 Cancri e': 2})
test['Destination'] = test['Destination'].map({'TRAPPIST-1e': 0, 'PSO J318.5-22': 1, '55 Cancri e': 2})
# Combine RoomService, FoodCourt, ShoppingMall, Spa and VRDeck into a single total
train['spending'] = train['RoomService'] + train['FoodCourt'] + train['ShoppingMall'] + train['Spa'] + train['VRDeck']
test['spending'] = test['RoomService'] + test['FoodCourt'] + test['ShoppingMall'] + test['Spa'] + test['VRDeck']

train.head()

| PassengerId | HomePlanet | CryoSleep | Cabin | Destination | Age | VIP | RoomService | FoodCourt | ShoppingMall | Spa | VRDeck | Name | Transported | spending |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0001_01 | 0.0 | False | B/0/P | 0.0 | 39.0 | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | Maham Ofracculy | False | 0.0 |
| 0002_01 | 1.0 | False | F/0/S | 0.0 | 24.0 | False | 109.0 | 9.0 | 25.0 | 549.0 | 44.0 | Juanna Vines | True | 736.0 |
| 0003_01 | 0.0 | False | A/0/S | 0.0 | 58.0 | True | 43.0 | 3576.0 | 0.0 | 6715.0 | 49.0 | Altark Susent | False | 10383.0 |
| 0003_02 | 0.0 | False | A/0/S | 0.0 | 33.0 | False | 0.0 | 1283.0 | 371.0 | 3329.0 | 193.0 | Solam Susent | False | 5176.0 |
| 0004_01 | 1.0 | False | F/1/S | 0.0 | 16.0 | False | 303.0 | 70.0 | 151.0 | 565.0 | 2.0 | Willy Santantines | True | 1091.0 |
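
Adding the five amenity columns with `+` propagates missing values: a passenger with `NaN` in any one of them ends up with a `NaN` total. As a possible alternative (not what the cell above does), `DataFrame.sum(axis=1)` skips missing values by default. In the sketch below, the first demo row reuses the amounts of passenger 0002_01 from the output above, while the second row and the names `spend_cols`, `plus_total` and `sum_total` are made up for illustration.

```python
import pandas as pd

spend_cols = ["RoomService", "FoodCourt", "ShoppingMall", "Spa", "VRDeck"]

# Row 0 copies passenger 0002_01; row 1 is hypothetical and has one missing amount.
demo = pd.DataFrame(
    {
        "RoomService": [109.0, None],
        "FoodCourt": [9.0, 10.0],
        "ShoppingMall": [25.0, 0.0],
        "Spa": [549.0, 5.0],
        "VRDeck": [44.0, 1.0],
    }
)

# Column-by-column addition: any NaN makes the whole total NaN.
plus_total = (
    demo["RoomService"] + demo["FoodCourt"] + demo["ShoppingMall"]
    + demo["Spa"] + demo["VRDeck"]
)

# Row-wise sum: NaN entries are skipped (skipna=True is the default).
sum_total = demo[spend_cols].sum(axis=1)

print(plus_total.tolist())
print(sum_total.tolist())
```

Treating a missing amount as zero is itself a modelling assumption, so which variant is appropriate depends on how the missing values are handled elsewhere.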


## Cleaning the DataFrame

In [172]:
train = train.drop(['RoomService', 'FoodCourt', 'ShoppingMall', 'Spa', 'VRDeck', 'Name'], axis=1)

train.head()

| PassengerId | HomePlanet | CryoSleep | Cabin | Destination | Age | VIP | Transported | spending |
|---|---|---|---|---|---|---|---|---|
| 0001_01 | 0.0 | False | B/0/P | 0.0 | 39.0 | False | False | 0.0 |
| 0002_01 | 1.0 | False | F/0/S | 0.0 | 24.0 | False | True | 736.0 |
| 0003_01 | 0.0 | False | A/0/S | 0.0 | 58.0 | True | False | 10383.0 |
| 0003_02 | 0.0 | False | A/0/S | 0.0 | 33.0 | False | False | 5176.0 |
| 0004_01 | 1.0 | False | F/1/S | 0.0 | 16.0 | False | True | 1091.0 |
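
The cell above removes the raw columns from `train` only. If `test` is meant to receive the same treatment, one way to keep both frames consistent is a small helper applied to each. This is a sketch under that assumption: the function name `drop_raw_columns` and the one-row demo frame are hypothetical.

```python
import pandas as pd

drop_cols = ["RoomService", "FoodCourt", "ShoppingMall", "Spa", "VRDeck", "Name"]


def drop_raw_columns(df):
    """Return a copy of df without the raw amenity columns and Name."""
    # errors="ignore" skips columns that are already absent, so re-running is harmless.
    return df.drop(columns=drop_cols, errors="ignore")


# Hypothetical one-row frame standing in for the test set.
demo_test = pd.DataFrame(
    {
        "Age": [27.0],
        "Spa": [0.0],
        "Name": ["Hypothetical Passenger"],
        "spending": [0.0],
    }
)

cleaned = drop_raw_columns(demo_test)
print(cleaned.columns.tolist())
```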
