## 🕶 Awesome Visualization with Spaceship Titanic Dataset📊

##### Credit : https://www.kaggle.com/subinium/awesome-visualization-with-titanic-dataset 

![](https://storage.googleapis.com/kaggle-competitions/kaggle/34377/logos/header.png?t=2022-02-11-21-53-06)
![](https://storage.googleapis.com/kaggle-media/competitions/Spaceship%20Titanic/joel-filipe-QwoNAhbmLLo-unsplash.jpg)

I am one of the Kagglers like @subinium who **love** visualization.

**Have fun and if you liked it, please upvote!**

---

### Table of Contents

- **Timeline visualization** : Matplotlib Techniques
- **Ridgeplot** : Effective Multi Distribution
- **Barplot** : How to Customize Bar?
- **Stripplot** : Effective distribution plot
- **Heatmap** : How to Customize a Heatmap?
- **Dimension Reduction + Scatter**

In [None]:
!pip install seaborn==0.11.0

In [None]:
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.gridspec as grid_spec
from matplotlib.ticker import FuncFormatter

import seaborn as sns

import numpy as np
import pandas as pd

In [None]:
from cycler import cycler

mpl.rcParams['figure.dpi'] = 120
mpl.rcParams['axes.spines.top'] = False
mpl.rcParams['axes.spines.right'] = False
# mpl.rcParams['font.family'] = 'serif'

raw_light_palette = [
    (0, 122, 255), # Blue
    (255, 149, 0), # Orange
    (52, 199, 89), # Green
    (255, 59, 48), # Red
    (175, 82, 222),# Purple
    (255, 45, 85), # Pink
    (88, 86, 214), # Indigo
    (90, 200, 250),# Teal
    (255, 204, 0)  # Yellow
]

light_palette = np.array(raw_light_palette)/255


mpl.rcParams['axes.prop_cycle'] = cycler('color',light_palette)

transported_palette = ['#dddddd', mpl.colors.to_hex(light_palette[2])]
planet_palette = [light_palette[0], light_palette[3]]

In [None]:
#data = pd.read_csv('/kaggle/input/titanic/train.csv')
data = pd.read_csv('../input/spaceship-titanic/train.csv')
test_data = pd.read_csv('../input/spaceship-titanic/test.csv')
data.shape

## Awesome Timeline Visualization

- A **Timeline** is a graphical way of displaying a list of events in chronological order. 
- line + scatter + **stem plot**

---

### Simple Explanation

- The ingredients are:
    - a line
    - three sets of points
        - two sets for the date
            - one black, one white
        - one set for the time
    - vertical lines that connect each time point to the line (stem plot)

- The downside is that text positions have to be adjusted by trial and error.
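
The ingredients above can be put together roughly as follows. This is only a minimal sketch: the event labels, positions, and text offsets are invented for illustration, and the stems are drawn with `ax.vlines` rather than a dedicated stem call.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical events: labels and positions are invented for illustration.
dates = ["Day 1", "Day 2", "Day 3", "Day 4"]
times = ["09:00", "13:30", "18:45", "23:10"]
x = np.arange(len(dates), dtype=float)      # event positions along the timeline
levels = np.array([0.6, -0.6, 0.6, -0.6])   # stem tips, alternating above and below

fig, ax = plt.subplots(figsize=(10, 3))

# The line
ax.axhline(0, color="#4a4a4a", linewidth=1, zorder=1)

# Two point sets for the dates: a large dark marker with a small light one on top
ax.scatter(x, np.zeros_like(x), s=120, color="#4a4a4a", zorder=3)
ax.scatter(x, np.zeros_like(x), s=30, color="#fafafa", zorder=4)

# Stems, plus one point set for the times at the stem tips
ax.vlines(x, 0, levels, colors="#e3120b", linewidth=1, zorder=2)
ax.scatter(x, levels, s=40, color="#e3120b", zorder=3)

# Text offsets are hand-tuned, which is the heuristic part
for xi, level, date, time in zip(x, levels, dates, times):
    side = 1 if level > 0 else -1
    ax.text(xi, -0.2 * side, date, ha="center", va="center", fontweight="bold")
    ax.text(xi, level + 0.2 * side, time, ha="center", va="center", color="#4a4a4a")

ax.set_ylim(-1.2, 1.2)
ax.set_xticks([])
ax.set_yticks([])
for s in ["top", "right", "left", "bottom"]:
    ax.spines[s].set_visible(False)
```

The three colors are the ones shown in the palette cell of this section; everything else would need tuning for real event data.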

In [None]:
sns.palplot(['#fafafa', '#4a4a4a', '#e3120b'])

## Awesome Distribution Visualization (Ridgeplot)
 
- A **Ridgeline** plot (sometimes called a Joyplot) shows the distribution of a numeric value for several groups.
- **library** : matplotlib, seaborn
- **colortheme** from movie "Snowpiercer" (because of *class*)

---

### Simple Explanation

- **Step1** : Use Gridspec
- **Step2** : Create density plot using seaborn's kdeplot
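    - `bw_method` : bandwidth (a smaller value follows the data more closely)
    - `edgecolor` : to separate each density plot
    - `alpha` : set to 1 to remove transparency
    - `cut` : how far the curve extends past the extreme data points (`cut=0` stops at the data limits)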
- **Step3** : Remove ticks & labels except last xticks
- **Step4** : Remove Spine
- **Step5** : Make plots closer & Make the background transparent
- **Step6** : add subtext (figure title, axes title)
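
To make the roles of the bandwidth and `cut` in Step2 more concrete, here is a small NumPy-only sketch of a Gaussian kernel density estimate. It is an approximation written for illustration, not seaborn's actual implementation: the function name `gaussian_kde_1d` and its parameters are hypothetical, the ages are synthetic, and it assumes a scalar bandwidth factor is multiplied by the sample standard deviation.

```python
import numpy as np

def gaussian_kde_1d(values, bw_factor=0.25, cut=0, gridsize=200):
    """Evaluate a Gaussian KDE on a grid (illustrative helper, not a seaborn API)."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]            # ignore missing values
    bandwidth = bw_factor * values.std(ddof=1)    # assumed: factor times sample std
    # The grid extends cut * bandwidth past the data; cut=0 stops at min and max
    grid = np.linspace(values.min() - cut * bandwidth,
                       values.max() + cut * bandwidth, gridsize)
    z = (grid[:, None] - values[None, :]) / bandwidth
    norm = len(values) * bandwidth * np.sqrt(2 * np.pi)
    density = np.exp(-0.5 * z ** 2).sum(axis=1) / norm
    return grid, density

rng = np.random.default_rng(0)
ages = rng.normal(30, 12, size=500).clip(0, 80)   # synthetic ages, not the real data
grid, density = gaussian_kde_1d(ages, bw_factor=0.25, cut=0)
```

With `cut=0` the curve is truncated at the smallest and largest observation, which keeps each ridge inside the shared x-range; a larger `bw_factor` gives a smoother, flatter curve.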

In [None]:
sns.palplot(["#022133", "#5c693b", "#51371c", "lightgray"])

In [None]:
data.head()

In [None]:
data.columns

In [None]:
data.fillna(0)

In [None]:
fig = plt.figure(figsize=(12, 8))
gs = fig.add_gridspec(3,1)
gs.update(hspace= -0.55)

axes = list()
#colors = ["#022133", "#5c693b", "#51371c"]

colors = light_palette[0:]

for idx, cls, c in zip(range(3), sorted(data['FoodCourt'].unique()), colors):
    axes.append(fig.add_subplot(gs[idx, 0]))
    
    # you can also draw density plot with matplotlib + scipy.
    sns.kdeplot(x='Age', data=data[data['FoodCourt']==cls], 
                fill=True, ax=axes[idx], cut=0, bw_method=0.25, 
                lw=1.4, edgecolor='lightgray',color=c, alpha=1) 
    
    axes[idx].set_ylim(0, 0.04)
    axes[idx].set_xlim(0, 85)
    
    axes[idx].set_yticks([])
    if idx != 2 : axes[idx].set_xticks([])
    axes[idx].set_ylabel('')
    axes[idx].set_xlabel('')
    
    spines = ["top","right","left","bottom"]
    for s in spines:
        axes[idx].spines[s].set_visible(False)
        
    axes[idx].patch.set_alpha(0)
    axes[idx].text(-0.2,0,f'FoodCourt {cls}',fontweight="light", fontfamily='serif', fontsize=11,ha="right")

fig.text(0.13,0.81,"Age distribution by FoodCourt in Spaceship Titanic", fontweight="bold", fontfamily='serif', fontsize=16)
plt.show()    

In [None]:
fig = plt.figure(figsize=(12, 8))
gs = fig.add_gridspec(3,1)
gs.update(hspace= -0.55)

axes = list()
#colors = ["#022133", "#5c693b", "#51371c"]

colors = light_palette[-6:]
unique_vals = data['RoomService'].nunique()
print(unique_vals)

for idx, cls, c in zip(range(3), sorted(data['RoomService'].unique()), colors):
    axes.append(fig.add_subplot(gs[idx, 0]))
    
    # you can also draw density plot with matplotlib + scipy.
    sns.kdeplot(x='Age', data=data[data['RoomService']==cls], 
                fill=True, ax=axes[idx], cut=0, bw_method=0.25, 
                lw=1.4, edgecolor='lightgray',color=c, alpha=1) 
    
    axes[idx].set_ylim(0, 0.04)
    axes[idx].set_xlim(0, 85)
    
    axes[idx].set_yticks([])
    if idx != 2 : axes[idx].set_xticks([])
    axes[idx].set_ylabel('')
    axes[idx].set_xlabel('')
    
    spines = ["top","right","left","bottom"]
    for s in spines:
        axes[idx].spines[s].set_visible(False)
        
    axes[idx].patch.set_alpha(0)
    axes[idx].text(-0.2,0,f'RoomService {cls}',fontweight="light", fontfamily='serif', fontsize=11,ha="right")

fig.text(0.13,0.81,"Age distribution by RoomService in Spaceship Titanic", fontweight="bold", fontfamily='serif', fontsize=16)
plt.show()    

In [None]:
fig = plt.figure(figsize=(12, 8))
gs = fig.add_gridspec(3,1)
gs.update(hspace= -0.55)

axes = list()
#colors = ["#022133", "#5c693b", "#51371c"]

colors = light_palette[-3:]

for idx, cls, c in zip(range(3), sorted(data['VRDeck'].unique()), colors):
    axes.append(fig.add_subplot(gs[idx, 0]))
    
    # you can also draw density plot with matplotlib + scipy.
    sns.kdeplot(x='Age', data=data[data['VRDeck']==cls], 
                fill=True, ax=axes[idx], cut=0, bw_method=0.25, 
                lw=1.4, edgecolor='lightgray',color=c, alpha=1) 
    
    axes[idx].set_ylim(0, 0.04)
    axes[idx].set_xlim(0, 85)
    
    axes[idx].set_yticks([])
    if idx != 2 : axes[idx].set_xticks([])
    axes[idx].set_ylabel('')
    axes[idx].set_xlabel('')
    
    spines = ["top","right","left","bottom"]
    for s in spines:
        axes[idx].spines[s].set_visible(False)
        
    axes[idx].patch.set_alpha(0)
    axes[idx].text(-0.2,0,f'VRDeck {cls}',fontweight="light", fontfamily='serif', fontsize=11,ha="right")

fig.text(0.13,0.81,"Age distribution by VRDeck in Spaceship Titanic", fontweight="bold", fontfamily='serif', fontsize=16)
plt.show()    

The next plot repeats the graph above, recolored with a **glacier**-like palette and split by whether the passenger was transported.

In [None]:
sns.color_palette("PuBu", 2)

In [None]:
fig = plt.figure(figsize=(12, 8))
gs = fig.add_gridspec(3,1)
gs.update(hspace= -0.55)

axes = list()
colors = ["#022133", "#5c693b", "#51371c"]

for idx, cls, c in zip(range(3), sorted(data['VRDeck'].unique()), colors):
    axes.append(fig.add_subplot(gs[idx, 0]))
    
    # you can also draw density plot with matplotlib + scipy.
    sns.kdeplot(x='Age', data=data[data['VRDeck']==cls], 
                fill=True, ax=axes[idx], cut=0, bw_method=0.25, 
                lw=1.4, edgecolor='lightgray', hue='Transported', 
                multiple="stack", palette='PuBu', alpha=0.7
               ) 
    
    axes[idx].set_ylim(0, 0.04)
    axes[idx].set_xlim(0, 85)
    
    axes[idx].set_yticks([])
    if idx != 2 : axes[idx].set_xticks([])
    axes[idx].set_ylabel('')
    axes[idx].set_xlabel('')
    
    spines = ["top","right","left","bottom"]
    for s in spines:
        axes[idx].spines[s].set_visible(False)
        
    axes[idx].patch.set_alpha(0)
    axes[idx].text(-0.2,0,f'VRDeck {cls}',fontweight="light", fontfamily='serif', fontsize=11,ha="right")
    if idx != 1 : axes[idx].get_legend().remove()
        
fig.text(0.13,0.81,"Age distribution by VRDeck in Spaceship Titanic", fontweight="bold", fontfamily='serif', fontsize=16)

plt.show()    

## Awesome Barplot Visualization

- **library** : matplotlib, seaborn
- **colortheme** from [The Economist Colors](https://pattern-library.economist.com/color.html)

---

### Simple Explanation

1. Grid
2. Color difference in the bar you want to emphasize
3. Average line and text annotations for it
4. Minimize the y-axis information and add it as an annotation to each bar

In [None]:
sns.palplot(['#d4dddd', '#244747', '#efe8d1', '#4a4a4a'])

In [None]:
def age_band(num):
    for i in range(1, 100):
        if num < 10*i :  return f'{(i-1) * 10} ~ {i*10}'

data['age_band'] = data['Age'].apply(age_band)
titanic_age = data[['age_band', 'Transported']].groupby('age_band')['Transported'].value_counts().sort_index().unstack().fillna(0)
# Transported is boolean, so the unstacked columns are False and True
titanic_age['Transported rate'] = titanic_age[True] / (titanic_age[False] + titanic_age[True]) * 100

fig, ax = plt.subplots(1, 1, figsize=(10, 7))

color_map = ['#d4dddd' for _ in range(len(titanic_age))]
color_map[0] = color_map[-1] = '#244747' # highlight the youngest and oldest age bands

ax.bar(titanic_age['Transported rate'].index, titanic_age['Transported rate'], 
       color=color_map, width=0.55, 
       edgecolor='black', 
       linewidth=0.7)



for s in ["top","right","left"]:
    ax.spines[s].set_visible(False)


# Annotation Part
for i in titanic_age['Transported rate'].index:
    ax.annotate(f"{titanic_age['Transported rate'][i]:.02f}%", 
                   xy=(i, titanic_age['Transported rate'][i] + 2.3),
                   va = 'center', ha='center',fontweight='light', 
                   color='#4a4a4a')


# mean line + annotation
mean = data['Transported'].mean() *100
ax.axhline(mean ,color='black', linewidth=0.4, linestyle='dashdot')
ax.annotate(f"mean : {mean :.4}%", 
            xy=('70 ~ 80', mean + 4),
            va = 'center', ha='center',
            color='#4a4a4a',
            bbox=dict(boxstyle='round', pad=0.4, facecolor='#efe8d1', linewidth=0))
    

# Title & Subtitle    
fig.text(0.06, 1, 'Age Band & Transported Rate', fontsize=15, fontweight='bold', fontfamily='serif')
fig.text(0.06, 0.96, 'It can be seen that the Transported rate of young children and the elderly is high.', fontsize=12, fontweight='light', fontfamily='serif')

grid_y_ticks = np.arange(0, 101, 20)
ax.set_yticks(grid_y_ticks)
ax.grid(axis='y', linestyle='-', alpha=0.4)

plt.tight_layout()
plt.show()

## Awesome Bar+Scatter Plot (Stripplot)

- A **strip plot** is a scatter plot where one of the variables is categorical. 
- **color theme** : Pantone 1805, Pantone 540

---

### Simple Explanation

- Calculate the mean (transported rate) of each group first
- Generate uniformly distributed y-positions
    - Transported = 1 below the rate, Transported = 0 above it
- Make a difference using color or luminance or transparency.
- (tips) Add legend
- (tips) Add explanation under the title.
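
The first two steps can also be wrapped in a helper and run in a loop over the groups. The sketch below assumes a boolean flag per passenger; `strip_positions` and the toy groups are hypothetical names and data used only for illustration.

```python
import numpy as np

def strip_positions(flags, center, rng, half_width=0.3):
    """Return jittered (x, y, rate) for one group; True means transported."""
    flags = np.asarray(flags, dtype=bool)
    rate = flags.mean()                       # share of True values in the group
    n = len(flags)
    x = center + rng.uniform(-half_width, half_width, n)
    # True rows fill the band [0, rate), False rows fill the band [rate, 1)
    y = np.where(flags, rng.uniform(0, rate, n), rng.uniform(rate, 1, n))
    return x, y, rate

rng = np.random.default_rng(42)
# Toy groups, not the real dataset
groups = {
    "Earth": [True, False, False, True, False],
    "Europa": [True, True, False, True],
}
positions = {}
for center, (name, flags) in enumerate(groups.items()):
    positions[name] = strip_positions(flags, center, rng)
```

Each group then needs two `ax.scatter` calls, one for the `True` rows and one for the `False` rows drawn with a lower `alpha`, so the height of the opaque band reads as the rate.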

In [None]:
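# Select the column before averaging so non-numeric columns are not aggregated
survival_rate = data.groupby('HomePlanet')[['Transported']].mean()
# Pull each rate out as a plain float so it can be used as a bound for the random draws
earth_rate = survival_rate.loc['Earth', 'Transported']
europa_rate = survival_rate.loc['Europa', 'Transported']
mars_rate = survival_rate.loc['Mars', 'Transported']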
display(survival_rate)

In [None]:
earth_pos = np.random.uniform(0, earth_rate, len(data[(data['HomePlanet']=='Earth') & (data['Transported']==True)]))
earth_neg = np.random.uniform(earth_rate, 1, len(data[(data['HomePlanet']=='Earth') & (data['Transported']==False)]))
europa_pos = np.random.uniform(0, europa_rate, len(data[(data['HomePlanet']=='Europa') & (data['Transported']==True)]))
europa_neg = np.random.uniform(europa_rate, 1, len(data[(data['HomePlanet']=='Europa') & (data['Transported']==False)]))
mars_pos = np.random.uniform(0, mars_rate, len(data[(data['HomePlanet']=='Mars') & (data['Transported']==True)]))
mars_neg = np.random.uniform(mars_rate, 1, len(data[(data['HomePlanet']=='Mars') & (data['Transported']==False)]))


In [None]:
sns.palplot(['#004c70', '#990000', 'lightgray'])

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(14, 8))

np.random.seed(42)

# Earth Stripplot
ax.scatter(np.random.uniform(-0.3, 0.3, len(earth_pos)), earth_pos, color='#004c70', 
           edgecolor='lightgray', label='Earth(Transported=1)')
ax.scatter(np.random.uniform(-0.3, 0.3, len(earth_neg)), earth_neg, color='#004c70', 
           edgecolor='lightgray', alpha=0.2, label='Earth(Transported=0)')

# Europa Stripplot
ax.scatter(0.75 + np.random.uniform(-0.3, 0.3, len(europa_pos)), europa_pos, color='#990000', 
           edgecolor='lightgray', label='Europa(Transported=1)')
ax.scatter(0.75 + np.random.uniform(-0.3, 0.3, len(europa_neg)), europa_neg, color='#990000', 
           edgecolor='lightgray', alpha=0.2, label='Europa(Transported=0)')

# Mars Stripplot
ax.scatter(1.5 + np.random.uniform(-0.3, 0.3, len(mars_pos)), mars_pos, color='#ff4c70', 
           edgecolor='lightgray', label='Mars(Transported=1)')
ax.scatter(1.5 + np.random.uniform(-0.3, 0.3, len(mars_neg)), mars_neg, color='#ff4c70', 
           edgecolor='lightgray', alpha=0.2, label='Mars(Transported=0)')

# Set Figure & Axes
ax.set_xlim(-0.5, 2.0)
ax.set_ylim(-0.03, 1.1)

# Ticks
ax.set_xticks([0, 0.75, 1.5])  # one tick per planet, matching the scatter offsets above
ax.set_xticklabels(['Earth', 'Europa', 'Mars'], fontweight='bold', fontfamily='serif', fontsize=13)
ax.set_yticks([], minor=False)
ax.set_ylabel('')

# Spines
for s in ["top","right","left", 'bottom']:
    ax.spines[s].set_visible(False)


# Title & Explanation
fig.text(0.1, 1, 'Distribution of Transport by Planet', fontweight='bold', fontfamily='serif', fontsize=15)    
fig.text(0.1, 0.96, 'As is known, the transport rate for Europa is high followed by Mars and then Earth', fontweight='light', fontfamily='serif', fontsize=12)    

ax.legend( edgecolor='None')
plt.tight_layout()
plt.show()

The same method, drawn with the palette from the ridgeplot section and wider spacing between the planets, looks as follows.

In [None]:

# You can also make this meta data using for-loop
survival_rate = data.groupby('HomePlanet')[['Transported']].mean()
print(survival_rate.head())
p1_rate = survival_rate.iloc[0]
p2_rate = survival_rate.iloc[1]
p3_rate = survival_rate.iloc[2]

print(p1_rate,p2_rate,p3_rate)
p1_pos = np.random.uniform(0, p1_rate, len(data[(data['HomePlanet']=='Earth') & (data['Transported']==1)]))
p1_neg = np.random.uniform(p1_rate, 1, len(data[(data['HomePlanet']=='Earth') & (data['Transported']==0)]))
p2_pos = np.random.uniform(0, p2_rate, len(data[(data['HomePlanet']=='Europa') & (data['Transported']==1)]))
p2_neg = np.random.uniform(p2_rate, 1, len(data[(data['HomePlanet']=='Europa') & (data['Transported']==0)]))
p3_pos = np.random.uniform(0, p3_rate, len(data[(data['HomePlanet']=='Mars') & (data['Transported']==1)]))
p3_neg = np.random.uniform(p3_rate, 1, len(data[(data['HomePlanet']=='Mars') & (data['Transported']==0)]))




fig, ax = plt.subplots(1, 1, figsize=(12, 7))

np.random.seed(42)

ax.scatter(np.random.uniform(-0.3, 0.3, len(p1_pos)), p1_pos, color='#022133', 
           edgecolor='lightgray', label='Earth(Transported=1)')
ax.scatter(np.random.uniform(-0.3, 0.3, len(p1_neg)), p1_neg, color='#022133', 
           edgecolor='lightgray', alpha=0.2, label='Earth(Transported=0)')
ax.scatter(np.random.uniform(1-0.3, 1+0.3, len(p2_pos)), p2_pos, color='#5c693b', 
           edgecolor='lightgray', label='Europa(Transported=1)')
ax.scatter(np.random.uniform(1-0.3, 1+0.3, len(p2_neg)), p2_neg, color='#5c693b', 
           edgecolor='lightgray', alpha=0.2, label='Europa(Transported=0)')
ax.scatter(np.random.uniform(2-0.3, 2+0.3, len(p3_pos)), p3_pos, color='#51371c', 
           edgecolor='lightgray', label='Mars(Transported=1)')
ax.scatter(np.random.uniform(2-0.3, 2+0.3, len(p3_neg)), p3_neg, color='#51371c', 
           edgecolor='lightgray', alpha=0.2, label='Mars(Transported=0)')



# # Set Figure & Axes
ax.set_xlim(-0.5, 4.0)
ax.set_ylim(-0.03, 1.1)

# # Ticks
ax.set_xticks([0, 1, 2])
ax.set_xticklabels(['Earth', 'Europa', 'Mars'], fontweight='bold', fontfamily='serif', fontsize=13)
ax.set_yticks([], minor=False)
ax.set_ylabel('')

# Spines
for s in ["top","right","left", 'bottom']:
    ax.spines[s].set_visible(False)


# Title & Explanation
fig.text(0.06, 0.95, 'Distribution of Transportation by Planet', fontweight='bold', fontfamily='serif', fontsize=15, ha='left')    


ax.legend(loc=(0.67, 0.5), edgecolor='None')
plt.tight_layout()
plt.show()

## Awesome Heatmap

- A **heat map (or heatmap)** is a data visualization technique that shows magnitude of a phenomenon as color in two dimensions.

---

### Simple Explanation

- (tip) `mask` (remove the symmetric half)
- (tip) `square` (to make x-y scale same)
- (tip) `cmap` (diverging colormap)
- (tip) text as watermark
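
As a small illustration of the `mask` tip, the sketch below builds the upper-triangle mask for a toy correlation matrix. The data frame is synthetic and the column names are placeholders.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic data with placeholder column names, not the real dataset
toy = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("ABCD"))
corr = toy.corr()

# True marks the cells to hide: the diagonal and everything above it
mask = np.triu(np.ones(corr.shape, dtype=bool))

# Preview of what stays visible: masked cells become NaN
visible = corr.mask(mask)
```

Passing the same `mask` to `sns.heatmap` hides the redundant half of the symmetric matrix, so each pair of columns appears only once.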


In [None]:
print(data.tail())
print(data.columns)

In [None]:
vips = data.groupby('VIP')[['Transported']].mean()
print(vips.head())
dest = data.groupby('Destination')[['Transported']].mean()
print(dest.head())

In [None]:
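# VIP holds True/False in train.csv; map it to 0/1 (missing values stay NaN)
data['VIP'] = data['VIP'].map({False: 0, True: 1})
# Fill with a key that exists in the mapping below, otherwise the filled rows turn into NaN
data['Destination'] = data['Destination'].fillna('TRAPPIST-1e')
data['Destination'] = data['Destination'].map({'55 Cancri e':'Cancri', 'PSO J318.5-22':'PSO', 'TRAPPIST-1e':'TRAPPIST'})
#data['Family'] = data['SibSp'] + data['Parch']
data = data[[col for col in data.columns if col !='Transported']+ ['Transported']]  
# Correlate only numeric and boolean columns; string columns cannot be correlated
corr = data.select_dtypes(include=['number', 'bool']).corr()
corr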

In [None]:
import random
def rand_col():
    # Build one random hex color string such as '#A1B2C3'
    hexadecimal = "#" + ''.join(random.choice('0123456789ABCDEF') for _ in range(6))
    #print("A Random color is :",hexadecimal)
    return hexadecimal

In [None]:
def plotkde(axs,row,colno, data, test_data,col):
    #axrow[0].plot(x, color='red')
    #axrow[1].plot(y, color='green')
    # `fill` replaces the deprecated `shade` argument
    sns.kdeplot(data=data, x=col,  ax=axs[row], color=rand_col(), fill=True,  bw_adjust=2)
    sns.kdeplot(data=test_data, ax=axs[row], x=col, color=rand_col(), fill=True, bw_adjust=2)
    
    
def plotbar(axs,row,colno, train_data, test_data,col):  
    ax = axs[row]  # draw on the requested subplot instead of relying on a global `ax`
    data = train_data[col].value_counts()
    data2 = test_data[col].value_counts()
    x = np.arange(len(data))
    ax.bar(x, height=data, zorder=3, color=rand_col(), width=0.7, alpha=0.8)
    #ax.bar(x, height=data2,  zorder=3, color=rand_col(), width=0.3, alpha=0.5)
    ax.set_ylabel(" ")
    ax.set_xticks(x)
    ax.set_xticklabels(labels = data.index.tolist())
    #ax.yaxis.set_major_locator(mtick.MultipleLocator(1000))
   # sns.countplot(x=col, data=train_data, kind='count', order=sorted(train_data[col].value_counts()), ax=axs)
   # sns.countplot(x=col, data=test_data, kind='count', order=sorted(test_data[col].value_counts()), ax=axs)
        

In [None]:
numcols = ('Age','RoomService','FoodCourt','ShoppingMall','Spa','VRDeck')
nonumcols = ('HomePlanet','Destination','CryoSleep','VIP','Deck','Side')

nrows = 6
ncols = 1
fig, axes = plt.subplots(nrows, ncols, figsize=(10, 30))

plt.subplots_adjust(hspace=0.5)
fig.suptitle("KDEPlot for numeric values", fontsize=18, y=0.95)
# One row per numeric column; train and test densities share each axis
for i, col in enumerate(numcols):
    plotkde(axes, i, 0, data, test_data, col)
plt.show()





In [None]:
test_data.describe().T.style.bar(subset=['mean'], color='#205ff2')\
                            .background_gradient(subset=['std'], cmap='Reds')\
                            .background_gradient(subset=['50%'], cmap='coolwarm')

In [None]:
data.describe().T.style.bar(subset=['mean'], color='#205ff2')\
                            .background_gradient(subset=['std'], cmap='Reds')\
                            .background_gradient(subset=['50%'], cmap='coolwarm')

In [None]:
def diff_color(x):
    color = 'red' if x<0 else ('green' if x > 0 else 'black')
    return f'color: {color}'


'''
plt.subplots_adjust(hspace=0.5)
fig.suptitle("Count Plot for non numeric values", fontsize=18, y=0.95)
i = 0
j = 0
for x in range(nrows):
#    print(row)
    cols = nonumcols[i]
    plotbar(axes,i,j, data, test_data, nonumcols[i])
    i = i + 1
plt.show()
'''

In [None]:
sns.color_palette(sns.diverging_palette(230, 20))

In [None]:
#Thanks to https://www.kaggle.com/abhinavkum/futuristic-titanic-eda 
from termcolor import colored
import plotly.express as px
import plotly.graph_objects as go
df = pd.concat([data, test_data]).groupby(['HomePlanet', 'Destination']).size()
df = pd.concat([df, df.groupby(level=0).apply(lambda x: (x/x.sum()*100).round(1))], axis=1).reset_index()
df = df.rename(columns ={0:'count', 1:'percentage'} )


fig = px.sunburst(df, path=['HomePlanet', 'Destination'],
                  values='count', color='HomePlanet',color_discrete_sequence=['mediumblue', 'mediumorchid', 'mediumpurple'],
                  title='Spaceship "Titanic" final Destination', width=1000, height=500,
                  custom_data=['percentage']
                  
                 ) 
fig.update_traces(hovertemplate="<br>".join([
    "Count:%{value}",
]),
                  textinfo="label+percent parent"
)

fig.update_layout(hoverlabel=dict(
    bgcolor='aliceblue',
    font_size=16,
    font_color='burlywood',
    font_family="roboto"
),
                  paper_bgcolor='mediumturquoise',

                 
)

fig.show()

In [None]:
#Thanks to https://www.kaggle.com/abhinavkum/futuristic-titanic-eda 

df = pd.concat([data, test_data]).groupby(['HomePlanet', 'Transported']).size()
df = pd.concat([df, df.groupby(level=0).apply(lambda x: (x/x.sum()*100).round(1))], axis=1).reset_index()
df = df.rename(columns ={0:'count', 1:'percentage'} )


fig = px.sunburst(df, path=['HomePlanet', 'Transported'],
                  values='count', color='HomePlanet',color_discrete_sequence=['mediumblue', 'mediumorchid', 'mediumpurple'],
                  title='Spaceship "Titanic" final Transported', width=1000, height=500,
                  custom_data=['percentage']
                  
                 ) 
fig.update_traces(hovertemplate="<br>".join([
    "Count:%{value}",
]),
                  textinfo="label+percent parent"
)

fig.update_layout(hoverlabel=dict(
    bgcolor='aliceblue',
    font_size=16,
    font_color='burlywood',
    font_family="roboto"
),
                  paper_bgcolor='mediumturquoise',

                 
)

fig.show()

In [None]:
#Thanks to https://www.kaggle.com/abhinavkum/futuristic-titanic-eda 
df = pd.concat([data, test_data]).groupby(['Destination', 'Transported']).size()
df = pd.concat([df, df.groupby(level=0).apply(lambda x: (x/x.sum()*100).round(1))], axis=1).reset_index()
df = df.rename(columns ={0:'count', 1:'percentage'} )


fig = px.sunburst(df, path=['Destination', 'Transported'],
                  values='count', color='Destination',color_discrete_sequence=['mediumblue', 'mediumorchid', 'mediumpurple'],
                  title='Spaceship "Titanic" final Transported to Destination', width=1000, height=500,
                  custom_data=['percentage']
                  
                 ) 
fig.update_traces(hovertemplate="<br>".join([
    "Count:%{value}",
]),
                  textinfo="label+percent parent"
)

fig.update_layout(hoverlabel=dict(
    bgcolor='aliceblue',
    font_size=16,
    font_color='burlywood',
    font_family="roboto"
),
                  paper_bgcolor='mediumturquoise',

                 
)

fig.show()

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(7, 7))

mask = np.zeros_like(corr, dtype=bool)  # np.bool was removed from NumPy; use the builtin bool
mask[np.triu_indices_from(mask)] = True

cmap = sns.diverging_palette(230, 20, as_cmap=True)

sns.heatmap(corr, 
            square=True, 
            mask=mask,
            linewidth=2.5, 
            vmax=0.4, vmin=-0.4, 
            cmap=cmap, 
            cbar=False, 
            ax=ax)

ax.set_yticklabels(ax.get_yticklabels(), fontfamily='serif', rotation = 0, fontsize=11)
ax.set_xticklabels(ax.get_xticklabels(), fontfamily='serif', rotation=90, fontsize=11)

ax.spines['top'].set_visible(True)

fig.text(0.97, 1, 'Correlation Heatmap Visualization', fontweight='bold', fontfamily='serif', fontsize=15, ha='right')    
fig.text(0.97, 0.92, 'Dataset : Spaceship Titanic\nAuthor : Kalilur Rahman', fontweight='light', fontfamily='serif', fontsize=12, ha='right')    

plt.tight_layout()
plt.show()

## Awesome Dimension Reduction + Scatter Plot

- UMAP(Dimension Reduction) + Scatterplot

### Simple Explanation

**Dimensionality reduction** has long been used to explore multidimensional data.

However, the resulting embedding is seldom examined further once it has been computed.

This visualization fits the embedding with the Transported label and then draws separate scatter plots for each HomePlanet and each Destination, so that the distribution of passengers can be compared across those groups.
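
The per-group panels all rely on the same step: selecting the rows of the 2D embedding that belong to one category. Below is a minimal NumPy sketch of that step; `split_embedding` is a hypothetical helper, and the embedding and labels are random toy values rather than UMAP output.

```python
import numpy as np

def split_embedding(embedding, labels):
    """Group the rows of an (n, 2) embedding by label; missing labels are skipped."""
    embedding = np.asarray(embedding)
    labels = np.asarray(labels, dtype=object)
    groups = {}
    for value in sorted({v for v in labels if isinstance(v, str)}):
        groups[value] = embedding[labels == value]   # boolean row selection
    return groups

rng = np.random.default_rng(0)
embedding = rng.normal(size=(6, 2))                  # toy stand-in for a real embedding
planets = ["Earth", "Mars", np.nan, "Earth", "Europa", "Mars"]
groups = split_embedding(embedding, planets)
```

Each entry of `groups` can then go to its own subplot with `ax.scatter(points[:, 0], points[:, 1], alpha=0.1, label=name)`, which avoids repeating the boolean indexing for every category.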

In [None]:
from umap import UMAP

# Dimension Reduction

transported = data['Transported'] 
# PassengerId is an identifier rather than a feature, so it is dropped as well
data_sub = data.drop(['PassengerId', 'Transported', 'HomePlanet', 'Name','CryoSleep', 'Cabin', 'Destination', 'age_band'], axis=1).fillna(0)

umap = UMAP(random_state=0)
titanic_umap = umap.fit_transform(data_sub, transported)



In [None]:
fig, axes = plt.subplots(3, 3, figsize=(12, 12))

# Transported
axes[0][0].scatter(titanic_umap[data['Transported']==1][:,0], 
                   titanic_umap[data['Transported']==1][:,1],
                   c='#8abbd0', alpha=0.25, label='Transported:1')
axes[0][0].scatter(titanic_umap[data['Transported']==0][:,0], 
                   titanic_umap[data['Transported']==0][:,1], 
                   c='#4a4a4a', alpha=0.25, label='Transported:0')

# Home Planet
axes[1][0].scatter(titanic_umap[data['HomePlanet']=='Earth'][:,0], 
                   titanic_umap[data['HomePlanet']=='Earth'][:,1], 
                   c='#004c70', alpha=0.1, label='Earth')
axes[1][1].scatter(titanic_umap[data['HomePlanet']=='Europa'][:,0], 
                   titanic_umap[data['HomePlanet']=='Europa'][:,1], 
                   c='#990000', alpha=0.1, label='Europa')
axes[1][2].scatter(titanic_umap[data['HomePlanet']=='Mars'][:,0], 
                   titanic_umap[data['HomePlanet']=='Mars'][:,1], 
                   c='#990000', alpha=0.1, label='Mars')

# Destination
axes[2][0].scatter(titanic_umap[data['Destination']=='Cancri'][:,0], 
                   titanic_umap[data['Destination']=='Cancri'][:,1], 
                   c="#022133", alpha=0.1, label='Destination : Cancri')
axes[2][1].scatter(titanic_umap[data['Destination']=='PSO'][:,0], 
                   titanic_umap[data['Destination']=='PSO'][:,1], 
                   c='#5c693b', alpha=0.1, label='Destination : PSO')
axes[2][2].scatter(titanic_umap[data['Destination']=='TRAPPIST'][:,0], 
                   titanic_umap[data['Destination']=='TRAPPIST'][:,1], 
                   c='#51371c', alpha=0.1, label='Destination : TRAPPIST')

for i in range(3):
    for j in range(3):
        axes[i][j].set_xticks([])
        axes[i][j].set_yticks([])
        for s in ["top","right","left", 'bottom']:
            axes[i][j].spines[s].set_visible(False)
        if i > 0 or j == 0 : axes[i][j].legend()  # only panels that contain a labeled scatter

            

# Text Part
fig.text(0.97, 1, 'Explore Embedding Space', fontweight='bold', fontfamily='serif', fontsize=20, ha='right')   
fig.text(0.97, 0.975, 'Author : Kalilur Rahman', fontweight='light', fontfamily='serif', fontsize=12, ha='right')

fig.text(0.97, 0.94, '''
Analysis in Progress''', 
         fontweight='light', fontfamily='serif', fontsize=12, va='top', ha='right')   


plt.tight_layout()
plt.show()


### Bad Dimension Reduction Case Example

- using tsne

In [None]:
from sklearn.manifold import TSNE
tsne = TSNE(random_state=0)
titanic_tsne = tsne.fit_transform(data_sub, transported)

In [None]:
fig, axes = plt.subplots(3, 3, figsize=(12, 12))

# Transported
axes[0][0].scatter(titanic_tsne[data['Transported']==1][:,0], 
                   titanic_tsne[data['Transported']==1][:,1],
                   c='#8abbd0', alpha=0.25, label='Transported:1')
axes[0][0].scatter(titanic_tsne[data['Transported']==0][:,0], 
                   titanic_tsne[data['Transported']==0][:,1], 
                   c='#4a4a4a', alpha=0.25, label='Transported:0')

# Home Planet
axes[1][0].scatter(titanic_tsne[data['HomePlanet']=='Earth'][:,0], 
                   titanic_tsne[data['HomePlanet']=='Earth'][:,1], 
                   c='#004c70', alpha=0.1, label='Earth')
axes[1][1].scatter(titanic_tsne[data['HomePlanet']=='Europa'][:,0], 
                   titanic_tsne[data['HomePlanet']=='Europa'][:,1], 
                   c='#990000', alpha=0.1, label='Europa')
axes[1][2].scatter(titanic_tsne[data['HomePlanet']=='Mars'][:,0], 
                   titanic_tsne[data['HomePlanet']=='Mars'][:,1], 
                   c='#990000', alpha=0.1, label='Mars')

# Destination
axes[2][0].scatter(titanic_tsne[data['Destination']=='Cancri'][:,0], 
                   titanic_tsne[data['Destination']=='Cancri'][:,1], 
                   c="#022133", alpha=0.1, label='Destination : Cancri')
axes[2][1].scatter(titanic_tsne[data['Destination']=='PSO'][:,0], 
                   titanic_tsne[data['Destination']=='PSO'][:,1], 
                   c='#5c693b', alpha=0.1, label='Destination : PSO')
axes[2][2].scatter(titanic_tsne[data['Destination']=='TRAPPIST'][:,0], 
                   titanic_tsne[data['Destination']=='TRAPPIST'][:,1], 
                   c='#51371c', alpha=0.1, label='Destination : TRAPPIST')

for i in range(3):
    for j in range(3):
        axes[i][j].set_xticks([])
        axes[i][j].set_yticks([])
        for s in ["top","right","left", 'bottom']:
            axes[i][j].spines[s].set_visible(False)
        if i > 0 or j == 0 : axes[i][j].legend()  # only panels that contain a labeled scatter

            

# Text Part
fig.text(0.97, 1, 'Spaceship Titanic Explore Embedding Space', fontweight='bold', fontfamily='serif', fontsize=20, ha='right')   
fig.text(0.97, 0.975, 'Author : Kalilur Rahman', fontweight='light', fontfamily='serif', fontsize=12, ha='right')

fig.text(0.97, 0.94, '''
Analysis in Progress''', 
         fontweight='light', fontfamily='serif', fontsize=12, va='top', ha='right')   


plt.tight_layout()
plt.show()


In [None]:
data.head()

## Awesome Correlation Plots

In [None]:
#Thanks to Subinium - https://www.kaggle.com/subinium/tps-apr-highlighting-the-data
fig, axes = plt.subplots(1, 2, figsize=(10, 15))
for idx, feature in enumerate(['HomePlanet', 'Destination']):
    sns.heatmap(data.groupby(['age_band', feature])['Transported'].aggregate('mean').unstack()*100, ax=axes[idx],
                square=True, annot=True, fmt='.2f', center=mean, linewidth=2,
                cbar_kws={"orientation": "horizontal"}, cmap=sns.diverging_palette(240, 10, as_cmap=True)
               ) 

axes[0].set_title('Age Band & HomePlanet Transported Ratio', loc='left', fontweight='bold')    
axes[1].set_title('Age Band & Destination Transported Ratio', loc='left', fontweight='bold')    
plt.show()

In [None]:
fig, axes = plt.subplots(2, 1, figsize=(12 , 9), sharex=True)

for idx, feature in enumerate(['HomePlanet', 'Destination']):
    sns.heatmap(data.groupby([feature, 'age_band'])['Transported'].aggregate('mean').unstack()*100, ax=axes[idx],
                square=True, annot=True, fmt='.2f', center=mean, linewidth=2,
                cbar=False, cmap=sns.diverging_palette(240, 10, as_cmap=True)
               ) 

axes[0].set_title('HomePlanet & Age Band Transported Ratio', loc='left', fontweight='bold')    
axes[1].set_title('Destination & Age Band Transported Ratio', loc='left', fontweight='bold')       
plt.show()

In [None]:
fig, ax = plt.subplots(1, 3, figsize=(17 , 5))

feature_lst = ['Age', 'RoomService', 'FoodCourt','ShoppingMall','Spa', 'VRDeck']

corr = data[feature_lst].corr()

mask = np.zeros_like(corr, dtype=bool)  # np.bool was removed from NumPy; use the builtin bool
mask[np.triu_indices_from(mask)] = True


for idx, method in enumerate(['pearson', 'kendall', 'spearman']):
    sns.heatmap(data[feature_lst].corr(method=method), ax=ax[idx],
            square=True, annot=True, fmt='.2f', center=0, linewidth=2,
            cbar=False, cmap=sns.diverging_palette(240, 10, as_cmap=True),
            mask=mask
           ) 
    ax[idx].set_title(f'{method.capitalize()} Correlation', loc='left', fontweight='bold')     

plt.show()
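
A correlation matrix is symmetric, so the mask hides the diagonal and the upper triangle. As a small sketch on synthetic data (the frame and column names below are assumptions made up for the example), the same mask can be built in one step with `np.triu`:

```python
import numpy as np
import pandas as pd

# Synthetic data, used only to get a small correlation matrix
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.normal(size=(50, 3)), columns=['a', 'b', 'c'])
corr = toy.corr()

# True marks the cells the heatmap should hide:
# the diagonal and everything above it
mask = np.triu(np.ones_like(corr, dtype=bool))
print(mask)
```

Passing this array as `mask=` to `sns.heatmap` leaves only the lower triangle visible.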

In [None]:
for cols in data.columns:
    print(cols + " ==> ")
    print(data[cols].unique())

## Awesome GridSpec plots

In [None]:
fig = plt.figure(figsize=(17, 20))
gs = fig.add_gridspec(5, 3)

# One row per spending feature; each plot spans the first two grid columns
spend_features = [('RoomService', 'Room Service'), ('FoodCourt', 'Food Court'),
                  ('ShoppingMall', 'Shopping Mall'), ('Spa', 'Spa'), ('VRDeck', 'VRDeck')]

for row, (feature, label) in enumerate(spend_features):
    ax = fig.add_subplot(gs[row, :2])
    sns.scatterplot(y='age_band', x=feature, hue='Transported',
                    data=data, ax=ax)
    ax.set_title(f'Age & {label}', loc='left', fontweight='bold')

plt.show()

In [None]:
nan_data = (data.isna().sum().sort_values(ascending=False) / len(data) * 100)[:6]
fig, ax = plt.subplots(1,1,figsize=(7, 5))

ax.bar(nan_data.index, 100, color='#dadada', width=0.6)

bar = ax.bar(nan_data.index,nan_data, color=rand_col(), width=0.6)
#ax.bar_label(bar, fmt='%.01f %%')
#ax.spines.left.set_visible(False)
ax.set_yticks([])
ax.set_title('Null Data Ratio', fontweight='bold')

plt.show()
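
The null ratio above divides the per-column NaN count by the row count. A short sketch on a made-up frame suggests that `isna().mean()` gives the same percentages, since the mean of a boolean column is its share of `True` values:

```python
import numpy as np
import pandas as pd

# Made-up frame with a few missing values
toy = pd.DataFrame({
    'a': [1.0, np.nan, 3.0, np.nan],
    'b': [1.0, 2.0, np.nan, 4.0],
    'c': [1.0, 2.0, 3.0, 4.0],
})

ratio_sum = toy.isna().sum() / len(toy) * 100   # form used in the cell above
ratio_mean = toy.isna().mean() * 100            # equivalent shorthand

print(ratio_mean.sort_values(ascending=False))
```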

In [None]:
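def age_band(num):
    # Map an age to its decade label, e.g. 27 -> '20 ~ 30'.
    # NaN fails every comparison, so missing ages return None.
    for i in range(1, 100):
        if num < 10*i:
            return f'{(i-1) * 10} ~ {i*10}'

#data['Age band'] = data['Age'].apply(age_band)
# Count transported / not transported passengers per age band
data_age = data[['age_band', 'Transported']].groupby('age_band')['Transported'].value_counts().sort_index().unstack().fillna(0)
data_age['Transport rate'] = data_age[1] / (data_age[0] + data_age[1]) * 100
# Note: this rebinds the name age_band from the helper above to the per-band passenger counts
age_band = data['age_band'].value_counts().sort_index()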
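
As an alternative to the loop-based helper, the same decade labels can be produced with `pd.cut`. This is only a sketch on made-up ages, and it assumes every age is below 100:

```python
import numpy as np
import pandas as pd

ages = pd.Series([0, 9, 10, 27, 79, np.nan], name='Age')  # made-up ages

edges = np.arange(0, 101, 10)                         # 0, 10, ..., 100
labels = [f'{lo} ~ {lo + 10}' for lo in edges[:-1]]   # '0 ~ 10', '10 ~ 20', ...

# right=False makes each bin [lo, hi), matching the `num < 10*i` test in the helper
bands = pd.cut(ages, bins=edges, labels=labels, right=False)
print(bands.tolist())
```

The result is an ordered categorical, so the bands keep their natural order in group-bys and plots, and missing ages stay missing.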

## Awesome Bar Charts with annotations

In [None]:
from mpl_toolkits.axes_grid1.axes_divider import make_axes_locatable

fig = plt.figure(figsize=(15, 10))
gs = fig.add_gridspec(3, 4)
ax = fig.add_subplot(gs[:-1,:])

color_map = ['#d4dddd' for _ in range(9)]
color_map[2] = light_palette[3]
color_map[8] = light_palette[2]


bars = ax.bar(data_age['Transport rate'].index, data_age['Transport rate'], 
       color=color_map, width=0.55, 
       edgecolor='black', 
       linewidth=0.7)

#ax.spines[["top","right","left"]].set_visible(False)
#ax.bar_label(bars, fmt='%.2f%%')


# mean line + annotation
mean = data['Transported'].mean() *100
ax.axhline(mean ,color='black', linewidth=0.4, linestyle='dashdot')
ax.annotate(f"mean : {mean :.4}%", 
            xy=('20 ~ 30', mean + 4),
            va = 'center', ha='center',
            color='#4a4a4a',
            bbox=dict(boxstyle='round', pad=0.4, facecolor='#efe8d1', linewidth=0))
    


ax.set_yticks(np.arange(0, 81, 20))
ax.grid(axis='y', linestyle='-', alpha=0.4)
ax.set_ylim(0, 85)


ax_bottom = fig.add_subplot(gs[-1,:])
bars = ax_bottom.bar(age_band.index, age_band, width=0.55, 
       edgecolor='black', 
       linewidth=0.7)

#ax_bottom.spines[["top","right","left"]].set_visible(False)
#ax_bottom.bar_label(bars, fmt='%d', label_type='center', color='white')
ax_bottom.grid(axis='y', linestyle='-', alpha=0.4)

# Title & Subtitle    
fig.text(0.1, 1, 'Age Band & Transport Rate', fontsize=15, fontweight='bold', fontfamily='serif', ha='left')
fig.text(0.1, 0.96, 'The transport rate of infants and toddlers is very high. It is below average for the largest group, ages 20-30.', fontsize=12, fontweight='light', fontfamily='serif', ha='left')

plt.show()
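
`Axes.bar_label` is only available from Matplotlib 3.4 onward. Where it is missing, a manual loop along these lines is one option (the rates below are toy values, not results from the dataset):

```python
import matplotlib.pyplot as plt

rates = {'0 ~ 10': 70.0, '10 ~ 20': 55.0, '20 ~ 30': 48.0}  # made-up values

fig, ax = plt.subplots(figsize=(6, 3))
bars = ax.bar(list(rates.keys()), list(rates.values()),
              width=0.55, edgecolor='black', linewidth=0.7)

# Place one text label just above the top centre of each bar
for bar in bars:
    height = bar.get_height()
    ax.annotate(f'{height:.1f}%',
                xy=(bar.get_x() + bar.get_width() / 2, height),
                xytext=(0, 3), textcoords='offset points',
                ha='center', va='bottom')
```

The 3-point offset keeps the text clear of the bar edge regardless of the y-axis scale.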

## Awesome Pivot Table with Style and Color coding

In [None]:
cols_to_plot = ['RoomService','FoodCourt','ShoppingMall','Spa','VRDeck']



pd.pivot_table(data,values='RoomService',index=['HomePlanet'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')


        

In [None]:

pd.pivot_table(data,values='RoomService',index=['Destination'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')



In [None]:
pd.pivot_table(data,values='FoodCourt',index=['HomePlanet'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')

In [None]:
pd.pivot_table(data,values='FoodCourt',index=['Destination'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')

In [None]:

pd.pivot_table(data,values='ShoppingMall',index=['HomePlanet'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')

In [None]:

pd.pivot_table(data,values='ShoppingMall',index=['Destination'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')

In [None]:

pd.pivot_table(data,values='Spa',index=['HomePlanet'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')

In [None]:

pd.pivot_table(data,values='Spa',index=['Destination'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')

In [None]:

pd.pivot_table(data,values='VRDeck',index=['HomePlanet'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')

In [None]:

pd.pivot_table(data,values='VRDeck',index=['Destination'],columns=['Transported'],aggfunc=[np.mean, np.std]).style.bar(subset=['mean'], color='#205ff2').background_gradient(subset=['std'], cmap='Greens')
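
The ten pivot cells above differ only in the spending column and the index feature. A rough sketch of how they could be generated in a loop follows; `spend_pivot` is a hypothetical helper name and the toy rows are invented for the example:

```python
import pandas as pd

# Toy rows, invented for illustration only
toy = pd.DataFrame({
    'HomePlanet':  ['Earth', 'Earth', 'Mars', 'Mars', 'Earth', 'Mars', 'Earth', 'Mars'],
    'Destination': ['A', 'B', 'A', 'B', 'A', 'B', 'B', 'A'],
    'Transported': [True, True, False, False, False, True, False, True],
    'RoomService': [0, 10, 200, 300, 50, 0, 150, 20],
    'Spa':         [5, 0, 100, 400, 80, 10, 60, 0],
})

def spend_pivot(df, value, index):
    """Mean and std of one spending column, split by `index` and Transported."""
    return pd.pivot_table(df, values=value, index=[index],
                          columns=['Transported'], aggfunc=['mean', 'std'])

# One table per (spending column, index feature) pair
tables = {(value, index): spend_pivot(toy, value, index)
          for value in ['RoomService', 'Spa']
          for index in ['HomePlanet', 'Destination']}

print(tables[('RoomService', 'HomePlanet')])
```

Each table has the same `mean` / `std` column groups as the cells above, so the `.style.bar(subset=['mean'], ...)` chain can be applied to it unchanged.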

In [None]:
try:
    import stylecloud
except ImportError:
    # Install on demand if the package is missing from the environment
    !pip install stylecloud
    import stylecloud
    
from IPython.display import Image

## Awesome StyleCloud word chart using Names

In [None]:
#Code by Kapa Kudaibergenov https://www.kaggle.com/kapakudaibergenov/stylecloud/notebook
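# fillna returns a new Series, so assign it back for the fill to take effect
data['Name'] = data['Name'].fillna('NoName')

# Join all names into one string for the word cloud
concat_features = ' '.join(data['Name'].astype(str))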
print(concat_features[:1000])
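
`Series.fillna` returns a new Series and leaves the original untouched unless the result is assigned back. A tiny sketch with made-up names illustrates the difference:

```python
import numpy as np
import pandas as pd

names = pd.Series(['Alpha One', np.nan, 'Beta Two'], name='Name')  # made-up names

names.fillna('NoName')            # result discarded: `names` still holds the NaN
filled = names.fillna('NoName')   # result kept: the NaN is replaced

# Without the fill, astype(str) would turn the missing value into the word 'nan'
text = ' '.join(filled.astype(str))
print(text)
```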

In [None]:
stylecloud.gen_stylecloud(text=concat_features,
                          icon_name='fas fa-space-shuttle',
                          palette='colorbrewer.diverging.Spectral_11',
                          background_color='#87CEFA',
                          gradient='horizontal',
                          size=1024)


In [None]:
Image(filename="./stylecloud.png", width=1024, height=1024)

In [None]:
#sns.pairplot(data, hue='Destination')

## Awesome Seaborn JointPlots with Hues on HomePlanet and Destination

In [None]:
sns.jointplot(x="Spa",y="Transported",data=data,hue="Destination")
sns.jointplot(x="Spa",y="Transported",data=data,hue="HomePlanet")
sns.jointplot(x="ShoppingMall",y="Transported",data=data,hue="Destination")
sns.jointplot(x="ShoppingMall",y="Transported",data=data,hue="HomePlanet")
sns.jointplot(x="RoomService",y="Transported",data=data,hue="Destination")
sns.jointplot(x="RoomService",y="Transported",data=data,hue="HomePlanet")
sns.jointplot(x="FoodCourt",y="Transported",data=data,hue="Destination")
sns.jointplot(x="FoodCourt",y="Transported",data=data,hue="HomePlanet")
sns.jointplot(x="VRDeck",y="Transported",data=data,hue="Destination")
sns.jointplot(x="VRDeck",y="Transported",data=data,hue="HomePlanet")


## Awesome Seaborn JointPlots with Hues on Destination

In [None]:
sns.jointplot(x="Spa",y="Age",data=data,hue='Destination',height=8,s=30,alpha=0.7)
sns.jointplot(x="ShoppingMall",y="Age",data=data,hue='Destination',height=8,s=30,alpha=0.7)
sns.jointplot(x="RoomService",y="Age",data=data,hue='Destination',height=8,s=30,alpha=0.7)
sns.jointplot(x="FoodCourt",y="Age",data=data,hue='Destination',height=8,s=30,alpha=0.7)
sns.jointplot(x="VRDeck",y="Age",data=data,hue='Destination',height=8,s=30,alpha=0.7)

## Awesome Seaborn JointPlots with Hues on HomePlanet

In [None]:
sns.jointplot(x="Spa",y="Age",data=data,hue='HomePlanet',height=8,s=30,alpha=0.7)
sns.jointplot(x="ShoppingMall",y="Age",data=data,hue='HomePlanet',height=8,s=30,alpha=0.7)
sns.jointplot(x="RoomService",y="Age",data=data,hue='HomePlanet',height=8,s=30,alpha=0.7)
sns.jointplot(x="FoodCourt",y="Age",data=data,hue='HomePlanet',height=8,s=30,alpha=0.7)
sns.jointplot(x="VRDeck",y="Age",data=data,hue='HomePlanet',height=8,s=30,alpha=0.7)

## Awesome Box Plots

In [None]:
#sns.jointplot(x="Spa",y="Age",data=data,hue='HomePlanet',height=8,s=30,alpha=0.7)

sns.boxplot(y='HomePlanet',x='Age',data=data)

In [None]:
sns.boxplot(y='Destination',x='Age',data=data)

In [None]:
sns.boxplot(x='Transported',y='Age',data=data)

In [None]:
sns.boxplot(y="Spa",x="age_band",data=data,hue='HomePlanet')

In [None]:

sns.boxplot(y="ShoppingMall",x="age_band",data=data,hue='HomePlanet')

In [None]:

sns.boxplot(y="RoomService",x="age_band",data=data,hue='HomePlanet')

In [None]:

sns.boxplot(y="FoodCourt",x="age_band",data=data[(data['FoodCourt']<5000)],hue='HomePlanet')

In [None]:

sns.boxplot(y="VRDeck",x="age_band",data=data[(data['VRDeck']<5000)],hue='HomePlanet')

In [None]:
sns.boxplot(y="Spa",x="age_band",data=data,hue='Destination')

In [None]:

sns.boxplot(y="ShoppingMall",x="age_band",data=data,hue='Destination')

In [None]:

sns.boxplot(y="RoomService",x="age_band",data=data,hue='Destination')

In [None]:

sns.boxplot(y="FoodCourt",x="age_band",data=data,hue='Destination')

In [None]:

sns.boxplot(y="VRDeck",x="age_band",data=data,hue='Destination')

## Awesome Violin plots

In [None]:
sns.violinplot(x="Spa",y="age_band",data=data,hue='HomePlanet')


In [None]:
sns.violinplot(x="ShoppingMall",y="age_band",data=data,hue='HomePlanet')


In [None]:
sns.violinplot(x="RoomService",y="age_band",data=data,hue='HomePlanet')


In [None]:
sns.violinplot(x="FoodCourt",y="age_band",data=data,hue='HomePlanet')

In [None]:

sns.violinplot(x="VRDeck",y="age_band",data=data,hue='HomePlanet')

In [None]:
sns.violinplot(y="VRDeck",x="Transported",data=data[(data['VRDeck']<100)],hue='HomePlanet')

In [None]:
numcols = ['Age','RoomService','FoodCourt','ShoppingMall','Spa','VRDeck']
nonumcols = ('HomePlanet','Destination','CryoSleep','VIP','Deck','Side')

# One pairplot of the numeric columns per hue feature;
# diag_kind mirrors the original calls ('auto' is seaborn's default)
for hue, diag_kind in [('HomePlanet', 'hist'), ('Destination', 'auto'), ('CryoSleep', 'auto')]:
    sns.pairplot(data[numcols + [hue]], hue=hue, diag_kind=diag_kind)


## Work in Progress - More to come


### Please be sure to leave a reference to the sources

##### I referred to @subinium, @abhinavkum, and @mpwolke for this notebook. Thanks to them for some reusable and relevant code.

