## Project Overview
A machine learning system that distinguishes rocks from naval mines using sonar signal patterns. The project uses a **supervised machine learning** algorithm (Logistic Regression), trained on labeled sonar readings, to predict the object type, with the aim of supporting underwater object detection for naval operations and maritime safety.

Industry: Navy/Maritime

## IMPORT THE DEPENDENCIES

In [1]:
# import the required libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

### DATA COLLECTION AND DATA PROCESSING

In [2]:
# Load the dataset into a pandas DataFrame (header=None: the file has no header row)
rock_mine_prediction = pd.read_csv('sonar data.csv', header=None)

In [3]:
# Check the first 5 rows of the dataset
rock_mine_prediction.head()

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.02 | 0.0371 | 0.0428 | 0.0207 | 0.0954 | 0.0986 | 0.1539 | 0.1601 | 0.3109 | 0.2111 | ... | 0.0027 | 0.0065 | 0.0159 | 0.0072 | 0.0167 | 0.018 | 0.0084 | 0.009 | 0.0032 | R |
| 1 | 0.0453 | 0.0523 | 0.0843 | 0.0689 | 0.1183 | 0.2583 | 0.2156 | 0.3481 | 0.3337 | 0.2872 | ... | 0.0084 | 0.0089 | 0.0048 | 0.0094 | 0.0191 | 0.014 | 0.0049 | 0.0052 | 0.0044 | R |
| 2 | 0.0262 | 0.0582 | 0.1099 | 0.1083 | 0.0974 | 0.228 | 0.2431 | 0.3771 | 0.5598 | 0.6194 | ... | 0.0232 | 0.0166 | 0.0095 | 0.018 | 0.0244 | 0.0316 | 0.0164 | 0.0095 | 0.0078 | R |
| 3 | 0.01 | 0.0171 | 0.0623 | 0.0205 | 0.0205 | 0.0368 | 0.1098 | 0.1276 | 0.0598 | 0.1264 | ... | 0.0121 | 0.0036 | 0.015 | 0.0085 | 0.0073 | 0.005 | 0.0044 | 0.004 | 0.0117 | R |
| 4 | 0.0762 | 0.0666 | 0.0481 | 0.0394 | 0.059 | 0.0649 | 0.1209 | 0.2467 | 0.3564 | 0.4459 | ... | 0.0031 | 0.0054 | 0.0105 | 0.011 | 0.0015 | 0.0072 | 0.0048 | 0.0107 | 0.0094 | R |


In [4]:
# Check the shape of the dataset
rock_mine_prediction.shape

(208, 61)

In [5]:
# Check the statistical measures of the dataset
rock_mine_prediction.describe()

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | ... | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 | 208.0 |
| mean | 0.029164 | 0.038437 | 0.043832 | 0.053892 | 0.075202 | 0.10457 | 0.121747 | 0.134799 | 0.178003 | 0.208259 | ... | 0.016069 | 0.01342 | 0.010709 | 0.010941 | 0.00929 | 0.008222 | 0.00782 | 0.007949 | 0.007941 | 0.006507 |
| std | 0.022991 | 0.03296 | 0.038428 | 0.046528 | 0.055552 | 0.059105 | 0.061788 | 0.085152 | 0.118387 | 0.134416 | ... | 0.012008 | 0.009634 | 0.00706 | 0.007301 | 0.007088 | 0.005736 | 0.005785 | 0.00647 | 0.006181 | 0.005031 |
| min | 0.0015 | 0.0006 | 0.0015 | 0.0058 | 0.0067 | 0.0102 | 0.0033 | 0.0055 | 0.0075 | 0.0113 | ... | 0.0 | 0.0008 | 0.0005 | 0.001 | 0.0006 | 0.0004 | 0.0003 | 0.0003 | 0.0001 | 0.0006 |
| 25% | 0.01335 | 0.01645 | 0.01895 | 0.024375 | 0.03805 | 0.067025 | 0.0809 | 0.080425 | 0.097025 | 0.111275 | ... | 0.008425 | 0.007275 | 0.005075 | 0.005375 | 0.00415 | 0.0044 | 0.0037 | 0.0036 | 0.003675 | 0.0031 |
| 50% | 0.0228 | 0.0308 | 0.0343 | 0.04405 | 0.0625 | 0.09215 | 0.10695 | 0.1121 | 0.15225 | 0.1824 | ... | 0.0139 | 0.0114 | 0.00955 | 0.0093 | 0.0075 | 0.00685 | 0.00595 | 0.0058 | 0.0064 | 0.0053 |
| 75% | 0.03555 | 0.04795 | 0.05795 | 0.0645 | 0.100275 | 0.134125 | 0.154 | 0.1696 | 0.233425 | 0.2687 | ... | 0.020825 | 0.016725 | 0.0149 | 0.0145 | 0.0121 | 0.010575 | 0.010425 | 0.01035 | 0.010325 | 0.008525 |
| max | 0.1371 | 0.2339 | 0.3059 | 0.4264 | 0.401 | 0.3823 | 0.3729 | 0.459 | 0.6828 | 0.7106 | ... | 0.1004 | 0.0709 | 0.039 | 0.0352 | 0.0447 | 0.0394 | 0.0355 | 0.044 | 0.0364 | 0.0439 |


In [6]:
# Check the value count of the label column
rock_mine_prediction[60].value_counts()

60
M    111
R     97
Name: count, dtype: int64

M ---> Mine

R ---> Rock
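
These counts also give a reference point for the accuracy scores reported later: a classifier that always answers 'M' would be right 111 times out of 208. A minimal sketch of that arithmetic, using only the counts printed above (the variable names are illustrative and not part of the notebook):

```python
# Class counts copied from the value_counts() output above
class_counts = {"M": 111, "R": 97}

total = sum(class_counts.values())
majority_label = max(class_counts, key=class_counts.get)

# Accuracy of a trivial classifier that always predicts the majority class
baseline_accuracy = class_counts[majority_label] / total

print(f"Total samples: {total}")
print(f"Majority class: {majority_label}")
print(f"Majority-class baseline accuracy: {baseline_accuracy:.3f}")
```

Any trained model should beat this baseline before its accuracy is considered meaningful.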

In [7]:
# Check the mean value of each feature, grouped by label (M or R)
rock_mine_prediction.groupby(60).mean()

| 60 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M | 0.034989 | 0.045544 | 0.05072 | 0.064768 | 0.086715 | 0.111864 | 0.128359 | 0.149832 | 0.213492 | 0.251022 | ... | 0.019352 | 0.016014 | 0.011643 | 0.012185 | 0.009923 | 0.008914 | 0.007825 | 0.00906 | 0.008695 | 0.00693 |
| R | 0.022498 | 0.030303 | 0.035951 | 0.041447 | 0.062028 | 0.096224 | 0.11418 | 0.117596 | 0.137392 | 0.159325 | ... | 0.012311 | 0.010453 | 0.00964 | 0.009518 | 0.008567 | 0.00743 | 0.007814 | 0.006677 | 0.007078 | 0.006024 |


### Separate The Data (Features) and The Label (Target)

In [8]:
# Separate the features (columns 0-59) from the label (column 60)
x = rock_mine_prediction.drop(columns=60)
y = rock_mine_prediction[60]

### Splitting The Data Into 'Training' And 'Test' Data

In [9]:
# Split the dataset into training and test sets: 10% is held out for testing,
# and stratify=y keeps the M/R proportions similar in both sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.1, stratify=y, random_state=1)

In [10]:
# Check the shapes of the full, training, and test feature sets
print(x.shape, x_train.shape, x_test.shape)

(208, 60) (187, 60) (21, 60)
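
The split sizes above follow from `test_size=0.1` applied to 208 samples. The sketch below reproduces the arithmetic; it assumes, as I understand scikit-learn's behaviour for a fractional `test_size`, that the test count is rounded up and the remainder goes to training.

```python
import math

n_samples = 208
test_size = 0.1

# Assumption: for a float test_size, the test count is rounded up (ceil)
n_test = math.ceil(test_size * n_samples)
# The remaining samples form the training set
n_train = n_samples - n_test

print(n_train, n_test)
```

Under that assumption the result matches the printed shapes: 187 training rows and 21 test rows.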


### Training the Model: Logistic Regression

In [12]:
# Create a Logistic Regression classifier with default settings
model = LogisticRegression()

In [13]:
# Fit the Logistic Regression model on the training features and training labels
model.fit(x_train, y_train)

#### **OBSERVATION**: The model has now been fitted on the training features (x_train) and the training labels (y_train).

### MODEL EVALUATION USING ACCURACY SCORE
* The accuracy score is the fraction of predictions that match the true labels. It is used here to evaluate how well the model is performing and how many correct predictions it makes.
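
To make the definition concrete, accuracy can be computed by hand as the number of matching labels divided by the number of samples. This is a minimal sketch: `manual_accuracy` is an illustrative helper rather than part of scikit-learn, and the labels are made up for the example, not taken from the sonar data.

```python
def manual_accuracy(true_labels, predicted_labels):
    """Return the fraction of positions where the prediction equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("Both label sequences must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels, made up for illustration
example_true = ["R", "M", "M", "R", "M"]
example_pred = ["R", "M", "R", "R", "M"]

# Four of the five predictions match the true labels
example_accuracy = manual_accuracy(example_true, example_pred)
print(example_accuracy)
```

Because the count of matches does not depend on which sequence is treated as the truth, swapping the two arguments gives the same accuracy.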

In [14]:
# Check the accuracy score on the training data
x_train_prediction = model.predict(x_train)
# accuracy_score takes the true labels first, then the predictions
training_data_accuracy = accuracy_score(y_train, x_train_prediction)

In [15]:
print('Accuracy Score on Training Data : ', training_data_accuracy)

Accuracy Score on Training Data :  0.8342245989304813


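#### The Accuracy Score for the Training data is about 83%.

#### This is a good score: it is well above the roughly 53% that always predicting 'Mine' (the majority class, 111 of 208 samples) would achieve.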

In [16]:
# Check the accuracy score on the test data
x_test_prediction = model.predict(x_test)
# accuracy_score takes the true labels first, then the predictions
test_data_accuracy = accuracy_score(y_test, x_test_prediction)

In [17]:
print('Accuracy Score on Test Data : ', test_data_accuracy)

Accuracy Score on Test Data :  0.7619047619047619


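#### The Accuracy Score for the Test data is about 76%.

#### This is a reasonably good score, though it is lower than the training score and is measured on only 21 test samples, so it is a rough estimate.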
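
Because both splits are small, it can help to turn the accuracy fractions back into counts of correct predictions. The sketch below uses only the split sizes and accuracy values printed above; the variable names are illustrative.

```python
# Split sizes and accuracy values copied from the outputs above
n_train, n_test = 187, 21
train_accuracy = 0.8342245989304813
test_accuracy = 0.7619047619047619

# Accuracy is correct / total, so multiplying back recovers the count
train_correct = round(train_accuracy * n_train)
test_correct = round(test_accuracy * n_test)

print(f"Training: {train_correct} of {n_train} correct")
print(f"Test: {test_correct} of {n_test} correct")
print(f"One test sample shifts test accuracy by {1 / n_test:.3f}")
```

With 21 test samples, a single prediction moves the test accuracy by almost five percentage points, which is worth keeping in mind when comparing it with the training score.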

### Building a Predictive System That Predicts Whether an Object Is a 'Rock' or a 'Mine'

In [18]:
input_data = (0.0253,0.0808,0.0507,0.0244,0.1724,0.3823,0.3729,0.3583,0.3429,0.2197,0.2653,0.3223,0.5582,0.6916,0.7943,0.7152,0.3512,0.2008,0.2676,0.4299,0.5280,0.3489,0.1430,0.5453,0.6338,0.7712,0.6838,0.8015,0.8073,0.8310,0.7792,0.5049,0.1413,0.2767,0.5084,0.4787,0.1356,0.2299,0.2789,0.3833,0.2933,0.1155,0.1705,0.1294,0.0909,0.0800,0.0567,0.0198,0.0114,0.0151,0.0085,0.0178,0.0073,0.0079,0.0038,0.0116,0.0033,0.0039,0.0081,0.0053)

# Convert the input data to a NumPy array
input_data_as_numpy_array = np.asarray(input_data)

# Reshape to a single row (1 sample, 60 features), since we are predicting for one instance
input_data_reshaped = input_data_as_numpy_array.reshape(1, -1)

prediction = model.predict(input_data_reshaped)
print(prediction)

# Map the predicted label to a readable message
if prediction[0] == 'R':
    print('The Object Is A Rock')
else:
    print('The Object Is A Mine')

['R']
The Object Is A Rock
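
The prediction steps above could be wrapped in a small helper that also checks the reading has 60 values before calling the model. This is only a sketch: `classify_reading`, `EXPECTED_FEATURES`, `LABEL_NAMES`, and `AlwaysRockStub` are hypothetical names introduced here, and the stub stands in for the trained model purely so the example runs on its own.

```python
EXPECTED_FEATURES = 60
LABEL_NAMES = {"R": "Rock", "M": "Mine"}


def classify_reading(model, reading):
    """Validate one sonar reading and return 'Rock' or 'Mine'."""
    values = [float(v) for v in reading]
    if len(values) != EXPECTED_FEATURES:
        raise ValueError(
            f"Expected {EXPECTED_FEATURES} values, got {len(values)}"
        )
    # A nested list gives the single-row, 2-D input that predict() expects
    label = model.predict([values])[0]
    return LABEL_NAMES[label]


class AlwaysRockStub:
    """Hypothetical stand-in for the trained model; always predicts 'R'."""

    def predict(self, rows):
        return ["R" for _ in rows]


demo_result = classify_reading(AlwaysRockStub(), [0.01] * 60)
print(demo_result)
```

In the notebook, the call would presumably be `classify_reading(model, input_data)` with the fitted `model` in place of the stub.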
