# Kaggle Submission
## Project: Titanic Survival Exploration

In 1912, the RMS Titanic struck an iceberg on its maiden voyage and sank, resulting in the deaths of most of its passengers and crew. In this introductory project, we explore a subset of the RMS Titanic passenger manifest to determine which features best predict whether a passenger survived. To complete this project, you will need to implement several conditional predictions and answer the questions below. Your project submission will be evaluated based on the completion of the code and your responses to the questions.
> **Tip:** Quoted sections like this provide helpful instructions on how to navigate and use an IPython notebook.

In [1]:
# Import libraries necessary for this project
import numpy as np
import pandas as pd
from IPython.display import display # Allows the use of display() for DataFrames

# Import supplementary visualizations code visuals.py
import visuals as vs

# Pretty display for notebooks
%matplotlib inline

# Load the dataset
in_file = 'kaggle_test.csv'
full_data = pd.read_csv(in_file)

# Print the first few entries of the RMS Titanic data
display(full_data.head())

| Unnamed: 0 | PassengerId | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 892 | 3 | Kelly, Mr. James | male | 34.5 | 0 | 0 | 330911 | 7.8292 | NaN | Q |
| 1 | 893 | 3 | Wilkes, Mrs. James (Ellen Needs) | female | 47.0 | 1 | 0 | 363272 | 7.0 | NaN | S |
| 2 | 894 | 2 | Myles, Mr. Thomas Francis | male | 62.0 | 0 | 0 | 240276 | 9.6875 | NaN | Q |
| 3 | 895 | 3 | Wirz, Mr. Albert | male | 27.0 | 0 | 0 | 315154 | 8.6625 | NaN | S |
| 4 | 896 | 3 | Hirvonen, Mrs. Alexander (Helga E Lindqvist) | female | 22.0 | 1 | 1 | 3101298 | 12.2875 | NaN | S |


From a sample of the RMS Titanic data, we can see the various features present for each passenger on the ship:
- **Pclass**: Socio-economic class (1 = Upper class; 2 = Middle class; 3 = Lower class)
- **Name**: Name of passenger
- **Sex**: Sex of the passenger
- **Age**: Age of the passenger (Some entries contain `NaN`)
- **SibSp**: Number of siblings and spouses of the passenger aboard
- **Parch**: Number of parents and children of the passenger aboard
- **Ticket**: Ticket number of the passenger
- **Fare**: Fare paid by the passenger
- **Cabin**: Cabin number of the passenger (Some entries contain `NaN`)
- **Embarked**: Port of embarkation of the passenger (C = Cherbourg; Q = Queenstown; S = Southampton)
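
The `NaN` notes in the list above matter for a rule-based model, because any comparison against a missing value evaluates to `False`. As a minimal sketch, assuming a small hand-built frame (`sample` is hypothetical and its values are illustrative, not taken from the file), missing entries per column can be counted like this:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few passenger rows; values are illustrative only.
sample = pd.DataFrame({
    'Age': [34.5, np.nan, 62.0],
    'Cabin': [np.nan, np.nan, 'B45'],
    'Embarked': ['Q', 'S', 'Q'],
})

# Count missing entries in each column.
missing_counts = sample.isna().sum()
print(missing_counts)

# Comparisons against NaN are always False, so a passenger with a
# missing age matches none of the age-range conditions.
age = sample.loc[1, 'Age']
print(age < 10, age > 20)
```

On the real data, the same check would be `full_data.isna().sum()`.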

The file loaded here is the Kaggle test set, which has no **Survived** column, so there are no known outcomes to separate from the features. Survival is instead the value we need to predict for each passenger.  
The code cell below defines a rule-based model, `predictions`, and applies it to every row of `full_data`.

In [4]:
def predictions(data):
    """ Model with multiple features. Makes a prediction with an accuracy of at least 80%. """
    
    predictions = []
    for _, passenger in data.iterrows():
        
        if passenger['Sex'] == 'female':
            if passenger['SibSp'] > 3:
                predictions.append(0)
            elif passenger['Age'] > 20 and passenger['Age'] < 30:
                if passenger['Parch'] < 4:
                    predictions.append(1)
                else:
                    predictions.append(0)
            else:
                predictions.append(1)
        else:
            if passenger['Age'] < 10:
                if passenger['SibSp'] < 3:
                    predictions.append(1)
                else:
                    predictions.append(0)
            elif passenger['Age'] > 20 and passenger['Age'] < 30:
                if passenger['Pclass'] < 2 and passenger['Parch'] < 1:
                    predictions.append(1)
                elif passenger['Pclass'] < 2 and passenger['SibSp'] > 0:
                    predictions.append(1)
                else:
                    predictions.append(0)
            elif passenger['Age'] > 30 and passenger['Age'] < 40:
                if passenger['Pclass'] < 2:
                    predictions.append(1)
                else:
                    predictions.append(0)
            elif passenger['Age'] > 40 and passenger['Age'] < 50:
                if passenger['Pclass'] < 2 and passenger["SibSp"] == 1:
                    predictions.append(1)
                else:
                    predictions.append(0)
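            else:
                # All remaining males: ages 10 to 20, exactly 30 or 40,
                # 50 and over, and passengers with a missing (NaN) age
                predictions.append(0)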
    
    # Return our predictions, indexed like the input so they align row by row
    return pd.Series(predictions, index=data.index)

# Make the predictions
result = predictions(full_data)
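
Before writing a submission, it can help to confirm that the model produced exactly one valid label per passenger. The sketch below is one possible check, not part of the original notebook: `check_predictions` is a hypothetical helper, and the toy inputs are illustrative rather than real data.

```python
import pandas as pd

def check_predictions(preds, data):
    """Hypothetical helper: basic sanity checks on a Series of 0/1 predictions."""
    # One prediction per passenger.
    assert len(preds) == len(data), "prediction count does not match row count"
    # Only the two valid labels may appear.
    assert set(preds.unique()) <= {0, 1}, "predictions must be 0 or 1"
    # Share of passengers predicted to survive.
    return preds.mean()

# Illustrative toy input, not the real data.
toy_data = pd.DataFrame({'Sex': ['male', 'female', 'female', 'male']})
toy_preds = pd.Series([0, 1, 1, 0])
survival_share = check_predictions(toy_preds, toy_data)
print(survival_share)
```

In the notebook, this would be called as `check_predictions(result, full_data)`.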

In [18]:
full_data['Survived'] = result

In [19]:
writable = full_data[['PassengerId', 'Survived']].copy()

In [20]:
writable.to_csv('kaggle_submission.csv', sep=',', index=False, encoding='utf-8')
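
The `index=False` argument matters here: without it, pandas writes the row index as an extra unnamed column, which `read_csv` later loads as `Unnamed: 0`. As a rough sketch of how the written file could be verified, the example below round-trips a small illustrative frame (`writable_demo` and the file name `submission_demo.csv` are assumptions for demonstration) through a temporary directory:

```python
import os
import tempfile
import pandas as pd

# Illustrative frame with the same two columns as the submission.
writable_demo = pd.DataFrame({'PassengerId': [892, 893, 894], 'Survived': [0, 1, 0]})

# Write to a temporary directory so the real submission file is left untouched.
path = os.path.join(tempfile.mkdtemp(), 'submission_demo.csv')
writable_demo.to_csv(path, index=False)

# Read the file back and confirm its columns and contents.
check = pd.read_csv(path)
print(list(check.columns))
print(check['Survived'].isin([0, 1]).all())
print(check['PassengerId'].is_unique)
```

The same read-back could be pointed at `kaggle_submission.csv` to confirm that it contains only `PassengerId` and `Survived`.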