# Titanic Dataset

This code demonstrates how to load the Titanic dataset.

In [1]:
import numpy as np
import pandas as pd

train_data=pd.read_csv("train.csv")



Display first and last rows: head(1) and tail(1) return the first and last row of the dataset. Viewing them is a quick check that the file loaded properly.

In [2]:
train_data.head(1)  

Unnamed: 0,PassengerId,Survived,Pclass,Name,Gender,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S


In [3]:
train_data.tail(1)

Unnamed: 0,PassengerId,Survived,Pclass,Name,Gender,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
890,891,0,3,"Dooley, Mr. Patrick",male,32.0,0,0,370376,7.75,,Q


Keeping only numeric columns: The select_dtypes() function is used to keep only the numeric columns of the dataset and drop the rest (such as Name, Ticket, and Cabin).

In [4]:
# choosing only numeric columns
data=train_data.select_dtypes(include=[np.number])
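
To make the effect of select_dtypes() concrete, here is a minimal sketch on a hypothetical three-row frame (the `toy` frame and its values are made up for illustration and are not taken from train.csv). It assumes only pandas and NumPy.

```python
import numpy as np
import pandas as pd

# Hypothetical frame that mimics a few Titanic columns (illustrative values).
toy = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Age": [22.0, 38.0, np.nan],
    "Pclass": [3, 1, 3],
    "Embarked": ["S", "C", "Q"],
})

# include=[np.number] keeps integer and float columns and drops string columns.
numeric_only = toy.select_dtypes(include=[np.number])
numeric_columns = list(numeric_only.columns)
print(numeric_columns)  # ['Age', 'Pclass']
```

The result is a new DataFrame; the original frame keeps all of its columns.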

Checking for missing values: To ensure data quality, the code checks for any missing values in the dataset 
using isnull().sum(), which counts the number of null values in each column.

In [5]:
data.isnull().sum()

PassengerId      0
Survived         0
Pclass           0
Age            177
SibSp            0
Parch            0
Fare             0
dtype: int64
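
The counting step can be sketched on a small hypothetical frame (the `toy` values are illustrative only). isnull() marks each cell as True or False, and sum() adds up the True values per column; mean() on the same mask gives the missing share instead of the count.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two missing ages (illustrative values).
toy = pd.DataFrame({
    "Age": [22.0, np.nan, np.nan],
    "Fare": [7.25, 71.2833, 7.925],
})

# Number of missing cells in each column.
missing_per_column = toy.isnull().sum()

# Fraction of missing cells in each column (True counts as 1, False as 0).
missing_share = toy.isnull().mean()

print(missing_per_column)
print(missing_share)
```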

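Handling missing values: Missing values can be filled using the fillna() function, 
which replaces null cells with a specified value (e.g., 0). Note that the cell below fills train_data, while data was created earlier by select_dtypes() as a separate DataFrame, so the missing Age values in data are not replaced (a nan is still visible in the x_tensor printed later).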

In [11]:
train_data = train_data.fillna(0)
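
As a hedged sketch of how the fill could be applied to the numeric subset itself, the example below uses a hypothetical frame (`raw`, `numeric`, and their values are made up). It shows that filling the original frame leaves an earlier select_dtypes() result untouched, and then fills the subset directly, once with 0 and once with the column median as one possible alternative.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the loaded data (illustrative values).
raw = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Age": [22.0, np.nan, 26.0],
    "Survived": [0, 1, 1],
})
numeric = raw.select_dtypes(include=[np.number])

# Filling the original frame does not change the earlier numeric subset.
raw = raw.fillna(0)
still_missing = int(numeric["Age"].isnull().sum())
print(still_missing)  # 1

# Option 1: fill the numeric subset with a constant.
numeric_zero = numeric.fillna(0)

# Option 2 (an assumption, not the notebook's choice): fill Age with its median.
numeric_median = numeric.fillna({"Age": numeric["Age"].median()})
print(numeric_median["Age"].tolist())  # [22.0, 24.0, 26.0]
```

Filling with 0 treats unknown ages as newborns, so a median or mean fill is often a gentler default; which one is appropriate depends on the model.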

# Separating data into x and y

Preparing data for model input (features and labels): The Survived column is separated as the label (y_data), 
and the remaining numeric features become the input (x_data).

In [12]:
x_data=data.drop("Survived",axis=1).values
y_data=data["Survived"].values
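
A quick way to sanity-check the split is to look at the resulting array shapes. The sketch below uses a hypothetical three-row numeric frame (values are illustrative); the features should form a 2-D array with one row per passenger, and the labels a 1-D array of the same length.

```python
import pandas as pd

# Hypothetical numeric frame (illustrative values).
numeric = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Survived": [0, 1, 1],
    "Pclass": [3, 1, 3],
    "Fare": [7.25, 71.2833, 7.925],
})

# Everything except the label column becomes the feature matrix.
x = numeric.drop("Survived", axis=1).values
# The label column alone becomes the target vector.
y = numeric["Survived"].values

print(x.shape)  # (3, 3)
print(y.shape)  # (3,)
```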


# PyTorch and tensors

Converting data to PyTorch tensors: Both x_data (features) and y_data (labels) are converted into PyTorch tensors, which can then be used as input to a neural network.

What is PyTorch?
PyTorch is a popular open-source machine learning library that provides tensor computation with strong GPU acceleration, making it especially useful for deep learning tasks. PyTorch is known for its dynamic computation graphs, which make it easier to build and debug neural networks in a flexible way.

What are Tensors?
Tensors are multi-dimensional arrays that are a generalization of matrices. In deep learning, tensors are used to store data for computations, such as inputs to a neural network. They can have any number of dimensions, which makes them ideal for representing data like images, text, or other complex datasets.
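
The idea of "any number of dimensions" can be illustrated with a few small tensors. This is a minimal sketch with made-up values; the variable names are hypothetical.

```python
import torch

scalar = torch.tensor(7.25)                         # 0 dimensions: a single number
vector = torch.tensor([22.0, 38.0, 26.0])           # 1 dimension: a list of values
matrix = torch.tensor([[3.0, 22.0], [1.0, 38.0]])   # 2 dimensions: rows and columns
batch = torch.zeros(2, 3, 4)                        # 3 dimensions, e.g. a stack of matrices

dims = [scalar.ndim, vector.ndim, matrix.ndim, batch.ndim]
print(dims)         # [0, 1, 2, 3]
print(batch.shape)  # torch.Size([2, 3, 4])
```

A table of passengers and features, like the one in this notebook, maps naturally to a 2-dimensional tensor.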

Why Convert Data to Tensors?
Neural networks in PyTorch operate on tensors because tensors allow for efficient computation on both CPUs and GPUs. Converting features and labels to tensors enables PyTorch to perform the matrix operations that make up the forward pass and to track them so that gradients can be computed in the backward pass (backpropagation). Those gradients are what gradient-based optimizers use to update the model's weights.
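
The forward and backward passes mentioned above can be sketched with a single matrix multiplication. This is an illustrative toy, not a model from the notebook: `x`, `w`, and the sum used as a "loss" are assumptions chosen so the numbers are easy to follow.

```python
import torch

# Two samples with two features each (illustrative values).
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# One weight per feature; requires_grad=True asks PyTorch to track gradients.
w = torch.tensor([[0.5], [-1.0]], requires_grad=True)

# Forward pass: a matrix product gives one prediction per sample.
pred = x @ w
loss = pred.sum()
print(loss.item())  # -4.0

# Backward pass: fills w.grad with d(loss)/d(w).
loss.backward()
print(w.grad.tolist())  # [[4.0], [6.0]]
```

Because the loss here is just the sum of the predictions, each gradient entry equals the column sum of `x`, which makes the result easy to verify by hand.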

In [13]:
import torch as t
x_tensor = t.tensor(x_data, dtype=t.float32)
y_tensor = t.tensor(y_data, dtype=t.float32)
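
After converting, it can help to check the dtype, the shape, and whether any NaN values survived the conversion. The sketch below uses hypothetical NumPy arrays in place of x_data and y_data; torch.nan_to_num is shown as one possible way to replace NaNs at the tensor level, which is an assumption rather than something the notebook does.

```python
import numpy as np
import torch

# Hypothetical stand-ins for x_data and y_data (illustrative values).
x_np = np.array([[1.0, 3.0, 22.0], [2.0, 1.0, np.nan]])
y_np = np.array([0, 1])

x_t = torch.tensor(x_np, dtype=torch.float32)
y_t = torch.tensor(y_np, dtype=torch.float32)

print(x_t.dtype, tuple(x_t.shape))  # torch.float32 (2, 3)

# Count NaN entries that were carried over from the NumPy array.
nan_count = int(torch.isnan(x_t).sum())
print(nan_count)  # 1

# One option: replace NaNs with 0.0 directly on the tensor.
x_clean = torch.nan_to_num(x_t, nan=0.0)
```

A NaN in the input propagates through every arithmetic operation, so a single missing value can turn the loss into NaN during training.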


Printing tensors: Finally, the code prints out the tensor values for x_tensor and y_tensor to verify the conversion.

In [14]:
print(x_tensor)

tensor([[  1.0000,   3.0000,  22.0000,   1.0000,   0.0000,   7.2500],
        [  2.0000,   1.0000,  38.0000,   1.0000,   0.0000,  71.2833],
        [  3.0000,   3.0000,  26.0000,   0.0000,   0.0000,   7.9250],
        ...,
        [889.0000,   3.0000,      nan,   1.0000,   2.0000,  23.4500],
        [890.0000,   1.0000,  26.0000,   0.0000,   0.0000,  30.0000],
        [891.0000,   3.0000,  32.0000,   0.0000,   0.0000,   7.7500]])


In [16]:
print(y_tensor)

tensor([0., 1., 1., 1., 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 1., 0., 1.,
        0., 1., 0., 1., 1., 1., 0., 1., 0., 0., 1., 0., 0., 1., 1., 0., 0., 0.,
        1., 0., 0., 1., 0., 0., 0., 1., 1., 0., 0., 1., 0., 0., 0., 0., 1., 1.,
        0., 1., 1., 0., 1., 0., 0., 1., 0., 0., 0., 1., 1., 0., 1., 0., 0., 0.,
        0., 0., 1., 0., 0., 0., 1., 1., 0., 1., 1., 0., 1., 1., 0., 0., 1., 0.,
        0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 0., 0., 1., 1.,
        0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1.,
        0., 1., 1., 0., 0., 0., 0., 1., 0., 0., 1., 0., 0., 0., 0., 1., 1., 0.,
        0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 1.,
        0., 0., 0., 1., 1., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 1., 1., 0., 1., 1., 0., 0., 1., 0., 1., 1., 1., 1., 0., 0.,
        1., 0., 0., 0., 0., 0., 1., 0., 0., 1., 1., 1., 0., 1., 0., 0., 0., 1.,
        1., 0., 1., 0., 1., 0., 0., 0., 