Feature engineering is the process of transforming raw data into meaningful features that a machine learning model can use. Typical steps include:

1. Handling missing values by filling them with the median or with placeholder strings/numbers.

2. Encoding categorical values as numbers.

3. Extracting useful information from raw columns.

4. Dropping unimportant columns.



In [346]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.svm import SVC
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import RandomizedSearchCV

In [347]:
test=pd.read_csv("test.csv")

In [348]:
train=pd.read_csv("train.csv")
train.head()

Unnamed: 0,PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,0,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,1,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,0,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


In [349]:
train.shape

(891, 12)

In [350]:
train.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   SibSp        891 non-null    int64  
 7   Parch        891 non-null    int64  
 8   Ticket       891 non-null    object 
 9   Fare         891 non-null    float64
 10  Cabin        204 non-null    object 
 11  Embarked     889 non-null    object 
dtypes: float64(2), int64(5), object(5)
memory usage: 83.7+ KB


In [351]:
train.isnull().sum()

PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age            177
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          687
Embarked         2
dtype: int64

In [352]:
test.head()

Unnamed: 0,PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,892,3,"Kelly, Mr. James",male,34.5,0,0,330911,7.8292,,Q
1,893,3,"Wilkes, Mrs. James (Ellen Needs)",female,47.0,1,0,363272,7.0,,S
2,894,2,"Myles, Mr. Thomas Francis",male,62.0,0,0,240276,9.6875,,Q
3,895,3,"Wirz, Mr. Albert",male,27.0,0,0,315154,8.6625,,S
4,896,3,"Hirvonen, Mrs. Alexander (Helga E Lindqvist)",female,22.0,1,1,3101298,12.2875,,S


In [353]:
test.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 418 entries, 0 to 417
Data columns (total 11 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  418 non-null    int64  
 1   Pclass       418 non-null    int64  
 2   Name         418 non-null    object 
 3   Sex          418 non-null    object 
 4   Age          332 non-null    float64
 5   SibSp        418 non-null    int64  
 6   Parch        418 non-null    int64  
 7   Ticket       418 non-null    object 
 8   Fare         417 non-null    float64
 9   Cabin        91 non-null     object 
 10  Embarked     418 non-null    object 
dtypes: float64(2), int64(4), object(5)
memory usage: 36.0+ KB


In [354]:
test.isnull().sum()

PassengerId      0
Pclass           0
Name             0
Sex              0
Age             86
SibSp            0
Parch            0
Ticket           0
Fare             1
Cabin          327
Embarked         0
dtype: int64

The data needs cleaning before modelling.

Data Preprocessing:

Handle missing values (e.g., fill Age using the median or predictive imputation)

Encode categorical features (e.g., one-hot or label encoding for Sex, Embarked, and the features derived from Name, Ticket and Cabin)

Feature engineering (e.g., extract the title from Name and the letter prefix from Ticket)

In [355]:
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer  # fill missing values with a constant or a statistic such as the median
from sklearn.preprocessing import OneHotEncoder  # convert each category into a binary vector
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder

Label Encoding: Converts each unique category (string) into a unique integer, ex:

['red', 'green', 'blue'] → [2, 1, 0]

One-Hot Encoding: Converts each category into a binary vector (0s and 1s)

['red', 'green', 'blue'] →
red:   [1, 0, 0]
green: [0, 1, 0]
blue:  [0, 0, 1]
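As a rough illustration of the two encodings, the sketch below runs scikit-learn's LabelEncoder and OneHotEncoder on the same three colours. Both encoders sort categories alphabetically, so the one-hot columns come out in the order blue, green, red rather than in the order written above. The variable names are made up for this example.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

colours = ['red', 'green', 'blue']

# LabelEncoder assigns integers in alphabetical order: blue=0, green=1, red=2
label_codes = LabelEncoder().fit_transform(colours)
print(label_codes)

# OneHotEncoder expects a 2-D input (rows = samples, columns = features)
encoder = OneHotEncoder()
one_hot = encoder.fit_transform(np.array(colours).reshape(-1, 1)).toarray()
print(encoder.categories_[0])  # column order of the one-hot matrix
print(one_hot)
```

Label encoding implies an order (blue < green < red) that a model may take literally, which is one reason one-hot encoding is often preferred for unordered categories.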


Before using OneHotEncoder or LabelEncoder, replace NaN (missing values) with something else, such as the string "missing", so that missing entries become an explicit category of their own.

In [356]:
train['Cabin'] = train['Cabin'].fillna('Missing')
train['Cabin_letter'] = train['Cabin'].apply(lambda x: x[0])  # keep only the first letter to avoid high dimensionality after encoding
# Each value x in Cabin is now either a cabin string like 'C85' or 'Missing'; x[0] takes its first character ('C' or 'M').
test['Cabin'] = test['Cabin'].fillna('Missing')
test['Cabin_letter'] = test['Cabin'].apply(lambda x: x[0])

Name, Sex and Ticket are object columns, and raw text like that doesn't help machine learning models directly. Here's how to handle them using feature engineering.

For x = "Braund, Mr. Owen Harris", x.split(',')[1].split('.')[0].strip() works as follows:

1. x.split(',') gives ['Braund', ' Mr. Owen Harris']

2. [1] selects ' Mr. Owen Harris'

3. .split('.')[0] gives ' Mr'

4. .strip() removes the surrounding spaces, leaving 'Mr'
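The same chain can be unrolled one call at a time. This is only a sketch of what the lambda does for a single name; the intermediate variable names are invented here.

```python
name = "Braund, Mr. Owen Harris"

after_comma = name.split(',')[1]        # everything after the surname
before_dot = after_comma.split('.')[0]  # everything before the first '.'
title = before_dot.strip()              # drop the leading space

print(repr(after_comma), repr(before_dot), repr(title))
```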

In [357]:
# Name: extract the title to avoid high dimensionality after one-hot encoding
# Sex: map to 1 and 0 instead of 'male' and 'female'
# Ticket: keep only the letter prefix, for the same reason
train['Title']=train['Name'].apply(lambda x:x.split(',')[1].split('.')[0].strip())
test['Title']=test['Name'].apply(lambda x:x.split(',')[1].split('.')[0].strip())


"AB./5 21171" → matches "AB./" and stops because next char is '5' (digit)

"113803" → no match because it starts with digits, so the prefix is set to 'missing'

match.group(0) returns the exact matched string, e.g. "AB./"

.replace('.', '').replace('/', '').strip() removes '.', '/' and surrounding spaces, leaving "AB"
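The rules above can be checked on a few real ticket strings from the training data. The helper name ticket_prefix is hypothetical; it uses the same pattern as the notebook but calls re.match only once per ticket.

```python
import re

def ticket_prefix(ticket):
    """Return the leading letters of a ticket, or 'missing' if it starts with a digit."""
    match = re.match(r'[A-Za-z./]+', ticket)
    if match is None:
        return 'missing'
    # Strip the punctuation that the pattern allowed inside the prefix
    return match.group(0).replace('.', '').replace('/', '').strip()

for ticket in ["A/5 21171", "PC 17599", "STON/O2. 3101282", "113803"]:
    print(ticket, '->', ticket_prefix(ticket))
```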


In [358]:
import re  # regular expressions: find or extract patterns in strings
train['Ticket_prefix'] = train['Ticket'].apply(
    lambda x: re.match(r'([A-Za-z./]+)', x).group(0).replace('.', '').replace('/', '').strip() 
    if re.match(r'([A-Za-z./]+)', x) else 'missing'
)
test['Ticket_prefix'] = test['Ticket'].apply(
    lambda x: re.match(r'([A-Za-z./]+)', x).group(0).replace('.', '').replace('/', '').strip() 
    if re.match(r'([A-Za-z./]+)', x) else 'missing'
)


In [359]:
train.drop(columns=['Name','Cabin','PassengerId','Ticket'],inplace=True) 
test.drop(columns=['Name','Cabin','PassengerId','Ticket'],inplace=True) 

In [360]:
train['Sex'] = train['Sex'].map({'male': 1, 'female': 0})
test['Sex'] = test['Sex'].map({'male': 1, 'female': 0})


In [361]:
train.head()

Unnamed: 0,Survived,Pclass,Sex,Age,SibSp,Parch,Fare,Embarked,Cabin_letter,Title,Ticket_prefix
0,0,3,1,22.0,1,0,7.25,S,M,Mr,A
1,1,1,0,38.0,1,0,71.2833,C,C,Mrs,PC
2,1,3,0,26.0,0,0,7.925,S,M,Miss,STONO
3,1,1,0,35.0,1,0,53.1,S,C,Mrs,missing
4,0,3,1,35.0,0,0,8.05,S,M,Mr,missing


In [362]:
train.shape

(891, 11)

In [363]:
train.isnull().sum()

Survived           0
Pclass             0
Sex                0
Age              177
SibSp              0
Parch              0
Fare               0
Embarked           2
Cabin_letter       0
Title              0
Ticket_prefix      0
dtype: int64

In [364]:
test.head()

Unnamed: 0,Pclass,Sex,Age,SibSp,Parch,Fare,Embarked,Cabin_letter,Title,Ticket_prefix
0,3,1,34.5,0,0,7.8292,Q,M,Mr,missing
1,3,0,47.0,1,0,7.0,S,M,Mrs,missing
2,2,1,62.0,0,0,9.6875,Q,M,Mr,missing
3,3,1,27.0,0,0,8.6625,S,M,Mr,missing
4,3,0,22.0,1,1,12.2875,S,M,Mrs,missing


In [365]:
test.isnull().sum()

Pclass            0
Sex               0
Age              86
SibSp             0
Parch             0
Fare              1
Embarked          0
Cabin_letter      0
Title             0
Ticket_prefix     0
dtype: int64

In [366]:
test.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 418 entries, 0 to 417
Data columns (total 10 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Pclass         418 non-null    int64  
 1   Sex            418 non-null    int64  
 2   Age            332 non-null    float64
 3   SibSp          418 non-null    int64  
 4   Parch          418 non-null    int64  
 5   Fare           417 non-null    float64
 6   Embarked       418 non-null    object 
 7   Cabin_letter   418 non-null    object 
 8   Title          418 non-null    object 
 9   Ticket_prefix  418 non-null    object 
dtypes: float64(2), int64(4), object(4)
memory usage: 32.8+ KB


In [367]:

age=['Age','Fare']
agetr=Pipeline(
    steps=[
       ("imputer",SimpleImputer(strategy='median')) 
    ]
)
# Embarked still has NaNs, so it is imputed here; Cabin's NaNs were already filled before extracting the first letter
embarked = ['Embarked']
embarktr = Pipeline([
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('onhot', OneHotEncoder(handle_unknown='ignore', sparse_output=False))
])

# Title, Cabin_letter and Ticket_prefix have no missing values, so they only need one-hot encoding
title_ticket_cabin= ['Title', 'Ticket_prefix','Cabin_letter']
titletr = Pipeline([
    ('onhot', OneHotEncoder(handle_unknown='ignore', sparse_output=False))
])

# Pipeline takes a list of (name, transformer) steps; SimpleImputer's strategy decides what replaces the missing values

ColumnTransformer applies a different transformer to each group of columns and concatenates the results, so one fit_transform() call handles all of them.

Training (fit_transform): You study a textbook, take notes, and summarize what you learned.

Testing (transform only): You use what you learned to solve problems, without opening the textbook again.
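A minimal sketch of that difference, using a median imputer on made-up numbers (toy_train and toy_test are not from the Titanic data): fit_transform learns the median from the training values, and transform reuses that same median on new data.

```python
import numpy as np
from sklearn.impute import SimpleImputer

toy_train = np.array([[1.0], [3.0], [np.nan]])
toy_test = np.array([[np.nan], [10.0]])

imputer = SimpleImputer(strategy='median')

# fit_transform: learn the median of the training column (2.0) and fill with it
train_filled = imputer.fit_transform(toy_train)

# transform: reuse the stored training median; nothing is re-learned from toy_test
test_filled = imputer.transform(toy_test)

print(imputer.statistics_)
print(train_filled.ravel(), test_filled.ravel())
```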

In [368]:

colm=ColumnTransformer(
    transformers=[
        ('age',agetr,age),
        ('embarked',embarktr,embarked),
        ('rest',titletr,title_ticket_cabin)
        
    ]
)

In [369]:
colm

Try all algorithms to find best one

In [370]:
models={
    'SVC':SVC(),
    'LR':LogisticRegression(),
    'RandomForestClassifier':RandomForestClassifier(),
    'LSVC':LinearSVC(),
    'kNN':KNeighborsClassifier()
}

In [371]:
ytrain=train['Survived']
ytrain

0      0
1      1
2      1
3      1
4      0
      ..
886    0
887    1
888    0
889    1
890    0
Name: Survived, Length: 891, dtype: int64

In [372]:
xtrain=train.drop('Survived',axis=1)
xtrain

Unnamed: 0,Pclass,Sex,Age,SibSp,Parch,Fare,Embarked,Cabin_letter,Title,Ticket_prefix
0,3,1,22.0,1,0,7.2500,S,M,Mr,A
1,1,0,38.0,1,0,71.2833,C,C,Mrs,PC
2,3,0,26.0,0,0,7.9250,S,M,Miss,STONO
3,1,0,35.0,1,0,53.1000,S,C,Mrs,missing
4,3,1,35.0,0,0,8.0500,S,M,Mr,missing
...,...,...,...,...,...,...,...,...,...,...
886,2,1,27.0,0,0,13.0000,S,M,Rev,missing
887,1,0,19.0,0,0,30.0000,S,B,Miss,missing
888,3,0,,1,2,23.4500,S,M,Miss,WC
889,1,1,26.0,0,0,30.0000,C,C,Mr,missing


In [373]:
cleandata=colm.fit_transform(xtrain)
cleandata

array([[22.    ,  7.25  ,  0.    , ...,  0.    ,  1.    ,  0.    ],
       [38.    , 71.2833,  1.    , ...,  0.    ,  0.    ,  0.    ],
       [26.    ,  7.925 ,  0.    , ...,  0.    ,  1.    ,  0.    ],
       ...,
       [28.    , 23.45  ,  0.    , ...,  0.    ,  1.    ,  0.    ],
       [26.    , 30.    ,  1.    , ...,  0.    ,  0.    ,  0.    ],
       [32.    ,  7.75  ,  0.    , ...,  0.    ,  1.    ,  0.    ]],
      shape=(891, 61))

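22. is the Age and 7.25 is the Fare (the first two columns).

The remaining 0. and 1. values are the one-hot encoded Embarked, Title, Ticket_prefix and Cabin_letter columns.

xtrain has 10 columns, but only the 6 listed in the ColumnTransformer are used; Pclass, Sex, SibSp and Parch are dropped because the default is remainder='drop'. After one-hot encoding, those 6 columns become 61 columns because of the many categories.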
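ColumnTransformer only outputs the columns named in its transformers list; with the default remainder='drop', every other column is discarded. The sketch below shows the effect on a made-up three-row frame and how remainder='passthrough' would keep the unlisted column. The frame and the make_transformer helper are illustrative, not part of the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

toy = pd.DataFrame({
    'Pclass': [3, 1, 2],            # not listed in any transformer below
    'Age': [22.0, np.nan, 30.0],
    'Embarked': ['S', 'C', 'Q'],
})

def make_transformer(remainder):
    # Same kind of setup as the notebook, reduced to two column groups
    return ColumnTransformer(
        transformers=[
            ('num', SimpleImputer(strategy='median'), ['Age']),
            ('cat', OneHotEncoder(handle_unknown='ignore'), ['Embarked']),
        ],
        remainder=remainder,
    )

dropped = make_transformer('drop').fit_transform(toy)
kept = make_transformer('passthrough').fit_transform(toy)

print(dropped.shape)  # Age + 3 Embarked columns
print(kept.shape)     # the same, plus Pclass passed through unchanged
```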

🔹 What Is .score()?
.score(X, y) in scikit-learn:

Does not train the model; the model must already be fitted with .fit().

Predicts labels for X and evaluates them against y (for classifiers this is accuracy: the fraction of correct predictions).

🔹 What Is Cross-Validation?
Cross-validation (especially k-fold cross-validation) is a more reliable performance evaluation method.

Example: 5-Fold Cross-Validation
The data is split into 5 equal parts.

The model is trained on 4 parts and tested on the 5th.

This is repeated 5 times, with a different fold as the test set each time.

Then the average accuracy is calculated.

Imagine you have 100 data points and want to check how good your model is.

Instead of just training your model on 80 data points and testing on 20 once (which might be lucky or unlucky),

Cross-validation does this:

Split the 100 data points into 5 groups of 20 (these are called “folds”).

Round 1:

Train the model on groups 1, 2, 3, and 4 (total 80 points).

Test the model on group 5 (20 points).

Save the accuracy.

Round 2:

Train on groups 1, 2, 3, and 5.

Test on group 4.

Save the accuracy.

Round 3:

Train on groups 1, 2, 4, and 5.

Test on group 3.

Save the accuracy.

Round 4:

Train on groups 1, 3, 4, and 5.

Test on group 2.

Save the accuracy.

Round 5:

Train on groups 2, 3, 4, and 5.

Test on group 1.

Save the accuracy.
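The five rounds can be reproduced with scikit-learn's KFold on 100 dummy points. This sketch only shows how the indices are split; no model is trained. KFold happens to hold out the first block first, so the order of the rounds differs from the walkthrough above, but every block is still tested exactly once.

```python
import numpy as np
from sklearn.model_selection import KFold

points = np.arange(100)
kfold = KFold(n_splits=5)  # no shuffling: folds are consecutive blocks of 20

fold_sizes = []
tested = []
for round_number, (train_idx, test_idx) in enumerate(kfold.split(points), start=1):
    fold_sizes.append((len(train_idx), len(test_idx)))
    tested.extend(test_idx.tolist())
    print(f"Round {round_number}: train on {len(train_idx)} points, "
          f"test on points {test_idx[0]}..{test_idx[-1]}")
```

For classifiers, cross_val_score with an integer cv uses a stratified variant (StratifiedKFold), which additionally keeps the class proportions similar in every fold.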



In [374]:
np.random.seed(42)
results={}
for modelname,model in models.items():
    model.fit(cleandata,ytrain)
    results[modelname]=model.score(cleandata,ytrain)
results
# .score(xtest, ytest) would show how well the model generalizes to new, unseen data
# .score(xtrain, ytrain), used here, shows how well the model fits the data it was trained on



STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


{'SVC': 0.6835016835016835,
 'LR': 0.8047138047138047,
 'RandomForestClassifier': 0.9876543209876543,
 'LSVC': 0.8114478114478114,
 'kNN': 0.8058361391694725}

.score() method for classification models returns the accuracy of the model’s predictions on the dataset cleandata compared to the true labels ytrain.

Uses the model to predict labels for cleandata

Compares predictions with true ytrain

Calculates the accuracy (percentage of correct predictions)
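Those three steps can be written out by hand and compared with .score(). The sketch uses a tiny made-up dataset (X_toy, y_toy) rather than the Titanic features.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X_toy = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# .score() does not train anything, so the model has to be fitted first
model = LogisticRegression().fit(X_toy, y_toy)

predictions = model.predict(X_toy)               # step 1: predict labels
manual_accuracy = np.mean(predictions == y_toy)  # steps 2-3: compare and average

score_value = model.score(X_toy, y_toy)
print(manual_accuracy, score_value, accuracy_score(y_toy, predictions))
```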

Looking at these training accuracies:

SVC: 68.4%. Underfitting, maybe needs tuning or a different kernel.

Logistic Regression: 80.5%. Decent baseline linear model.

Random Forest: 98.8%. Very high, most likely overfitting.

LinearSVC: 81.1%. Similar to Logistic Regression, a solid linear model.

kNN: 80.6%. Good performance, but remember it can be slow on big data.

Cross-validation (CV) is a statistical technique used to evaluate the performance of a machine learning model on data it was not trained on.

In [375]:
from sklearn.model_selection import cross_val_score
np.random.seed(42)
results={}
for modelname,model in models.items():
    scores = cross_val_score(model, cleandata, ytrain, cv=5)#5 folds
    results[modelname]=scores.mean()
results
#the model sees both the training features (X_train) and their corresponding labels (y_train) — so it can learn from them.

#Then the model is tested on the held-out fold — where it only sees the features (X_test) and predicts labels.

#These predictions are compared to the true labels (y_test) to calculate accuracy (or other metrics).






STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

{'SVC': np.float64(0.6723808925993346),
 'LR': np.float64(0.7789090452576737),
 'RandomForestClassifier': np.float64(0.7934781244115248),
 'LSVC': np.float64(0.7856568953612454),
 'kNN': np.float64(0.6891783315548301)}
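One caveat, offered as a suggestion rather than as what the notebook does: cleandata was produced by fitting the ColumnTransformer on all 891 rows before cross-validation, so every held-out fold has already influenced the imputation medians and the one-hot category lists. Putting the preprocessing and the model into a single Pipeline lets cross_val_score refit both on each training fold. The sketch below uses made-up toy data, and the names toy, preprocess and clf are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for the Titanic frame: one numeric and one categorical column
rng = np.random.default_rng(0)
n_rows = 40
toy = pd.DataFrame({
    'Age': rng.uniform(1, 80, size=n_rows),
    'Embarked': rng.choice(['S', 'C', 'Q'], size=n_rows),
})
toy.loc[::7, 'Age'] = np.nan            # inject some missing ages
y_toy = np.array([0, 1] * (n_rows // 2))

preprocess = ColumnTransformer([
    ('num', SimpleImputer(strategy='median'), ['Age']),
    ('cat', OneHotEncoder(handle_unknown='ignore'), ['Embarked']),
])

# Preprocessing is now part of the estimator, so it is re-fitted inside every fold
clf = Pipeline([
    ('prep', preprocess),
    ('model', LogisticRegression(max_iter=1000)),
])

scores = cross_val_score(clf, toy, y_toy, cv=5)
print(scores.mean())
```

On data of this size the difference is usually small, but the pipeline version is the safer habit.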

In [377]:
cleantest=colm.transform(test)

In [378]:
cleantest

array([[34.5   ,  7.8292,  0.    , ...,  0.    ,  1.    ,  0.    ],
       [47.    ,  7.    ,  0.    , ...,  0.    ,  1.    ,  0.    ],
       [62.    ,  9.6875,  0.    , ...,  0.    ,  1.    ,  0.    ],
       ...,
       [38.5   ,  7.25  ,  0.    , ...,  0.    ,  1.    ,  0.    ],
       [28.    ,  8.05  ,  0.    , ...,  0.    ,  1.    ,  0.    ],
       [28.    , 22.3583,  1.    , ...,  0.    ,  1.    ,  0.    ]],
      shape=(418, 61))

Since RandomForestClassifier has the highest mean cross-validation accuracy (~79.3%), it is the best-performing model among those tried.
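The comparison can also be done programmatically instead of by eye. This sketch copies the cross-validation means printed above into a plain dictionary (named cv_results here to avoid clashing with the notebook's results) and ranks them.

```python
# Mean 5-fold accuracies copied from the cross-validation output above
cv_results = {
    'SVC': 0.6723808925993346,
    'LR': 0.7789090452576737,
    'RandomForestClassifier': 0.7934781244115248,
    'LSVC': 0.7856568953612454,
    'kNN': 0.6891783315548301,
}

# Sort model names from best to worst mean accuracy
ranking = sorted(cv_results, key=cv_results.get, reverse=True)
best_name = ranking[0]

for name in ranking:
    print(f"{name}: {cv_results[name]:.3f}")
```

The gap between the top three models is under 1.5 percentage points, so the ranking could change with a different random seed or fold split.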

In [388]:
np.random.seed(42)
RF=RandomForestClassifier()
RF.fit(cleandata,ytrain)

In [389]:
np.random.seed(42)
RF.score(cleandata,ytrain)

0.9876543209876543

In [390]:
np.random.seed(42)
ypred=RF.predict(cleantest)

In [391]:
ypred

array([0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1,
       1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1,
       1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,
       1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0,
       1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1,
       1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1,
       0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
       0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
       1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0,
       0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
       0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0,
       0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0,

We don't have y_test to compare with ypred, so we compute the metrics on the training data instead (these are optimistic, since the model has already seen this data).

In [392]:
from sklearn.metrics import confusion_matrix,f1_score,precision_score,recall_score,classification_report

In [393]:
ypredtr=RF.predict(cleandata)

In [None]:
c=confusion_matrix(ytrain,ypredtr)  # scikit-learn expects (y_true, y_pred): rows are actual classes, columns are predicted classes
c
#546 TN: actually negative (did not survive) and predicted negative
#3 FP: actually negative (did not survive) but predicted positive (survived)
#8 FN: actually positive (survived) but predicted negative
#334 TP: actually positive (survived) and predicted positive (survived)

array([[546,   3],
       [  8, 334]])

In [396]:
print(classification_report(ytrain,ypredtr))  # argument order is (y_true, y_pred)


              precision    recall  f1-score   support

           0       0.99      0.99      0.99       549
           1       0.99      0.98      0.98       342

    accuracy                           0.99       891
   macro avg       0.99      0.99      0.99       891
weighted avg       0.99      0.99      0.99       891



Precision: (tells how many of the passengers predicted as a class actually belong to it)

1. Class 0 (did not survive): 0.99 → Of all passengers predicted as not surviving, 99% truly did not survive.

2. Class 1 (survived): 0.99 → Of all passengers predicted as survived, 99% truly survived.

Recall: (tells how many of the passengers that actually belong to a class were predicted as that class)

1. Class 0 (did not survive): 0.99 → Of all passengers who actually did not survive, 99% were predicted as not surviving.

2. Class 1 (survived): 0.98 → Of all passengers who actually survived, 98% were predicted as survived.

✅ precision_score(y_true, y_pred): by default computed for the positive class only (survived), as TP / (TP + FP)

✅ recall_score(y_true, y_pred): by default computed for the positive class only (survived), as TP / (TP + FN)

✅ classification_report shows these metrics for both the positive and the negative class.
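To make the two formulas concrete, the sketch below computes them by hand from a confusion matrix and compares them with scikit-learn's functions. The labels are a small made-up example, not the Titanic predictions.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

# For binary labels, ravel() flattens the 2x2 matrix in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

manual_precision = tp / (tp + fp)  # of the predicted positives, how many are right
manual_recall = tp / (tp + fn)     # of the actual positives, how many were found

print(manual_precision, precision_score(y_true, y_pred))
print(manual_recall, recall_score(y_true, y_pred))
```

Both functions take the true labels first and the predictions second; swapping the arguments swaps precision and recall.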

In [397]:
# Hyperparameter tuning: search for the best settings (hyperparameters) that must be fixed before training, to improve performance and minimize error.
from sklearn.model_selection import GridSearchCV

In [398]:
param_grid = {
    "n_estimators": [100, 200, 300],           # Number of trees
    "max_depth": [None, 10, 20, 30],           # Maximum depth of a tree
    "min_samples_split": [2, 5, 10],           # Min samples to split a node
    "min_samples_leaf": [1, 2, 4],             # Min samples at a leaf node
    "bootstrap": [True, False]                 # Whether bootstrap samples are used
}
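Before launching a grid search it helps to know how many models it will train. As a quick sketch, scikit-learn's ParameterGrid enumerates the same combinations that GridSearchCV would try, so its length gives the number of candidates; multiplying by the number of folds gives the number of fits.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [None, 10, 20, 30],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "bootstrap": [True, False],
}

n_candidates = len(ParameterGrid(param_grid))  # 3 * 4 * 3 * 3 * 2 combinations
n_fits = n_candidates * 5                      # each candidate is fitted once per fold (cv=5)

print(n_candidates, n_fits)
```

With the default refit=True, GridSearchCV also performs one extra fit of the best combination on the full training data.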

In [399]:
tunning=GridSearchCV(estimator=RandomForestClassifier(),param_grid=param_grid,cv=5)
tunning.fit(cleandata,ytrain)

In [400]:
best=tunning.best_params_

In [401]:
best

{'bootstrap': False,
 'max_depth': 30,
 'min_samples_leaf': 1,
 'min_samples_split': 10,
 'n_estimators': 100}

In [402]:
RF=RandomForestClassifier()
RF.set_params(**best)
RF.fit(cleandata,ytrain)

In [403]:
RF.score(cleandata,ytrain)

0.9337822671156004

In [404]:
score=cross_val_score(RF,cleandata,ytrain,cv=5)

In [406]:
score.mean()

np.float64(0.8080785889146946)

In [407]:
RF.predict(cleantest)

array([0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1,
       1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1,
       1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1,
       1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1,
       1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1,
       1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1,
       0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1,
       0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0,
       1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0,
       1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1,
       0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0,
       0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0,