<a href="https://colab.research.google.com/github/jalalrahmanov/Projects/blob/master/SONAR_Rock_vs_Mine_Prediction_Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Welcome to the SONAR Rock vs Mine Prediction Project!
### About the project

Sonar (sound navigation and ranging) is a technique based on the reflection of ultrasonic sound waves. These waves propagate through water and are reflected when they hit the ocean bed or any object in their path.

Sonar has been widely used in submarine navigation, communication with or detection of objects on or under the water surface (like other vessels), hazard identification, etc.

Two types of sonar technology are in use: passive (listening to the sound emitted by vessels in the ocean) and active (emitting pulses and listening for their echoes).

Research has linked the use of active sonar to mass strandings of marine animals.

For more detail, see the Medium article I used as a reference for this project:
https://medium.com/ai-techsystems/sonar-data-mines-vs-rocks-on-cainvas-c0a08dde895b

In [72]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [73]:
import pandas as pd                                      # reading the CSV file
import numpy as np                                       # array operations
from sklearn.model_selection import train_test_split     # splitting the data into train and test sets
from sklearn.linear_model import LogisticRegression      # the classifier; an instance of it will be our model
from sklearn.metrics import accuracy_score               # evaluating the model


# Dataset path: /content/drive/MyDrive/Kaggle/SONAR Rock_vs_Mine_Prediction_Dataset.csv

In [74]:
# The file has no header row, so pass header=None; the columns are then numbered 0 to 60
sonar_data = pd.read_csv('/content/drive/MyDrive/Kaggle/SONAR Rock_vs_Mine_Prediction_Dataset.csv', header=None)
sonar_data

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,51,52,53,54,55,56,57,58,59,60
0,0.0200,0.0371,0.0428,0.0207,0.0954,0.0986,0.1539,0.1601,0.3109,0.2111,...,0.0027,0.0065,0.0159,0.0072,0.0167,0.0180,0.0084,0.0090,0.0032,R
1,0.0453,0.0523,0.0843,0.0689,0.1183,0.2583,0.2156,0.3481,0.3337,0.2872,...,0.0084,0.0089,0.0048,0.0094,0.0191,0.0140,0.0049,0.0052,0.0044,R
2,0.0262,0.0582,0.1099,0.1083,0.0974,0.2280,0.2431,0.3771,0.5598,0.6194,...,0.0232,0.0166,0.0095,0.0180,0.0244,0.0316,0.0164,0.0095,0.0078,R
3,0.0100,0.0171,0.0623,0.0205,0.0205,0.0368,0.1098,0.1276,0.0598,0.1264,...,0.0121,0.0036,0.0150,0.0085,0.0073,0.0050,0.0044,0.0040,0.0117,R
4,0.0762,0.0666,0.0481,0.0394,0.0590,0.0649,0.1209,0.2467,0.3564,0.4459,...,0.0031,0.0054,0.0105,0.0110,0.0015,0.0072,0.0048,0.0107,0.0094,R
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
203,0.0187,0.0346,0.0168,0.0177,0.0393,0.1630,0.2028,0.1694,0.2328,0.2684,...,0.0116,0.0098,0.0199,0.0033,0.0101,0.0065,0.0115,0.0193,0.0157,M
204,0.0323,0.0101,0.0298,0.0564,0.0760,0.0958,0.0990,0.1018,0.1030,0.2154,...,0.0061,0.0093,0.0135,0.0063,0.0063,0.0034,0.0032,0.0062,0.0067,M
205,0.0522,0.0437,0.0180,0.0292,0.0351,0.1171,0.1257,0.1178,0.1258,0.2529,...,0.0160,0.0029,0.0051,0.0062,0.0089,0.0140,0.0138,0.0077,0.0031,M
206,0.0303,0.0353,0.0490,0.0608,0.0167,0.1354,0.1465,0.1123,0.1945,0.2354,...,0.0086,0.0046,0.0126,0.0036,0.0035,0.0034,0.0079,0.0036,0.0048,M
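
To see what `header=None` changes, here is a minimal sketch on a made-up two-row CSV (the values and the `csv_text` name are only for illustration, not taken from the dataset). Without `header=None`, pandas treats the first data row as column names, so one sample would silently disappear.

```python
import io
import pandas as pd

# Two made-up rows in the same layout as the SONAR file: numeric features, then a label.
csv_text = "0.02,0.03,R\n0.04,0.05,M\n"

# Default behaviour: the first data row is consumed as column names.
with_header = pd.read_csv(io.StringIO(csv_text))

# header=None: every row is kept as data and columns are numbered 0, 1, 2, ...
without_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(with_header.shape)             # one row lost to the header
print(without_header.shape)          # both rows kept
print(list(without_header.columns))  # integer column labels
```

This is also why the target column is addressed as `sonar_data[60]` in the rest of the notebook: with no header, the column labels are plain integers.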


## Data Understanding

In [75]:
sonar_data.head() 

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,51,52,53,54,55,56,57,58,59,60
0,0.02,0.0371,0.0428,0.0207,0.0954,0.0986,0.1539,0.1601,0.3109,0.2111,...,0.0027,0.0065,0.0159,0.0072,0.0167,0.018,0.0084,0.009,0.0032,R
1,0.0453,0.0523,0.0843,0.0689,0.1183,0.2583,0.2156,0.3481,0.3337,0.2872,...,0.0084,0.0089,0.0048,0.0094,0.0191,0.014,0.0049,0.0052,0.0044,R
2,0.0262,0.0582,0.1099,0.1083,0.0974,0.228,0.2431,0.3771,0.5598,0.6194,...,0.0232,0.0166,0.0095,0.018,0.0244,0.0316,0.0164,0.0095,0.0078,R
3,0.01,0.0171,0.0623,0.0205,0.0205,0.0368,0.1098,0.1276,0.0598,0.1264,...,0.0121,0.0036,0.015,0.0085,0.0073,0.005,0.0044,0.004,0.0117,R
4,0.0762,0.0666,0.0481,0.0394,0.059,0.0649,0.1209,0.2467,0.3564,0.4459,...,0.0031,0.0054,0.0105,0.011,0.0015,0.0072,0.0048,0.0107,0.0094,R


In [76]:
sonar_data.shape    # 208 rows; 60 feature columns plus 1 target column

(208, 61)

In [77]:
sonar_data.describe().T             # optional: summary statistics for each feature column

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
0,208.0,0.029164,0.022991,0.0015,0.01335,0.0228,0.03555,0.1371
1,208.0,0.038437,0.03296,0.0006,0.01645,0.0308,0.04795,0.2339
2,208.0,0.043832,0.038428,0.0015,0.01895,0.0343,0.05795,0.3059
3,208.0,0.053892,0.046528,0.0058,0.024375,0.04405,0.0645,0.4264
4,208.0,0.075202,0.055552,0.0067,0.03805,0.0625,0.100275,0.401
5,208.0,0.10457,0.059105,0.0102,0.067025,0.09215,0.134125,0.3823
6,208.0,0.121747,0.061788,0.0033,0.0809,0.10695,0.154,0.3729
7,208.0,0.134799,0.085152,0.0055,0.080425,0.1121,0.1696,0.459
8,208.0,0.178003,0.118387,0.0075,0.097025,0.15225,0.233425,0.6828
9,208.0,0.208259,0.134416,0.0113,0.111275,0.1824,0.2687,0.7106


In [79]:
sonar_data[60].value_counts()          # check the target column for class imbalance: 111 vs 97, so the classes are roughly balanced

M    111
R     97
Name: 60, dtype: int64
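
The same check can be read as proportions instead of raw counts. A small sketch, assuming a stand-in label column (`labels`, a name I made up) rebuilt from the counts printed above rather than the real file:

```python
import pandas as pd

# Rebuild a label column with the same class counts as the output above.
labels = pd.Series(["M"] * 111 + ["R"] * 97)

# normalize=True returns fractions of the total instead of raw counts.
proportions = labels.value_counts(normalize=True)
print(proportions.round(3))
```

On the real data the equivalent call would be `sonar_data[60].value_counts(normalize=True)`; a split of roughly 53% to 47% is close enough to balanced that plain accuracy is a reasonable first metric.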

In [80]:
sonar_data.groupby(60).mean()         # optional: per-class feature means; they differ between M and R, so the features carry some signal

60,0,1,2,3,4,5,6,7,8,9,...,50,51,52,53,54,55,56,57,58,59
M,0.034989,0.045544,0.05072,0.064768,0.086715,0.111864,0.128359,0.149832,0.213492,0.251022,...,0.019352,0.016014,0.011643,0.012185,0.009923,0.008914,0.007825,0.00906,0.008695,0.00693
R,0.022498,0.030303,0.035951,0.041447,0.062028,0.096224,0.11418,0.117596,0.137392,0.159325,...,0.012311,0.010453,0.00964,0.009518,0.008567,0.00743,0.007814,0.006677,0.007078,0.006024


In [81]:
X = sonar_data.drop(columns=[60]).values   # features: columns 0 to 59
y = sonar_data[60].values                  # target: 'R' (rock) or 'M' (mine)

In [82]:
X

array([[0.02  , 0.0371, 0.0428, ..., 0.0084, 0.009 , 0.0032],
       [0.0453, 0.0523, 0.0843, ..., 0.0049, 0.0052, 0.0044],
       [0.0262, 0.0582, 0.1099, ..., 0.0164, 0.0095, 0.0078],
       ...,
       [0.0522, 0.0437, 0.018 , ..., 0.0138, 0.0077, 0.0031],
       [0.0303, 0.0353, 0.049 , ..., 0.0079, 0.0036, 0.0048],
       [0.026 , 0.0363, 0.0136, ..., 0.0036, 0.0061, 0.0115]])

In [83]:
y

array(['R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
       'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
       'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
       'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
       'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
       'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
       'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
       'R', 'R', 'R', 'R', 'R', 'R', 'M', 'M', 'M', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M

# Train and test splitting

In [84]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, stratify=y, random_state=1)

In [85]:
print(X_train)
print(y_train)

[[0.0414 0.0436 0.0447 ... 0.0077 0.0246 0.0198]
 [0.0123 0.0022 0.0196 ... 0.0058 0.0047 0.0071]
 [0.0152 0.0102 0.0113 ... 0.0011 0.0034 0.0033]
 ...
 [0.0117 0.0069 0.0279 ... 0.0062 0.0026 0.0052]
 [0.115  0.1163 0.0866 ... 0.0141 0.0068 0.0086]
 [0.0187 0.0346 0.0168 ... 0.0115 0.0193 0.0157]]
['M' 'R' 'R' 'M' 'R' 'R' 'R' 'M' 'R' 'R' 'M' 'M' 'R' 'R' 'M' 'R' 'R' 'M'
 'R' 'R' 'M' 'R' 'R' 'R' 'M' 'M' 'R' 'R' 'M' 'M' 'M' 'M' 'R' 'R' 'M' 'M'
 'R' 'M' 'M' 'M' 'R' 'M' 'M' 'R' 'M' 'M' 'M' 'M' 'R' 'R' 'M' 'R' 'M' 'R'
 'M' 'M' 'R' 'M' 'R' 'M' 'R' 'R' 'M' 'R' 'M' 'M' 'R' 'R' 'M' 'M' 'M' 'M'
 'M' 'M' 'M' 'R' 'R' 'R' 'R' 'M' 'R' 'R' 'R' 'M' 'R' 'M' 'M' 'M' 'M' 'M'
 'M' 'M' 'M' 'M' 'M' 'R' 'R' 'M' 'M' 'M' 'R' 'R' 'R' 'M' 'M' 'R' 'R' 'R'
 'M' 'M' 'M' 'M' 'R' 'M' 'M' 'M' 'R' 'M' 'R' 'R' 'M' 'M' 'R' 'R' 'R' 'M'
 'R' 'M' 'R' 'M' 'M' 'M' 'R' 'R' 'M' 'M' 'R' 'R' 'R' 'M' 'R' 'M' 'R' 'M'
 'R' 'M' 'R' 'R' 'M' 'M' 'M' 'R' 'R' 'R' 'M' 'R' 'M' 'M' 'R' 'M' 'R' 'R'
 'M' 'R' 'R' 'M' 'R' 'M' 'R' 'M' 'R' 'M' 'R

In [86]:
print(X.shape, X_train.shape, X_test.shape)
print(X_train.shape[0] + X_test.shape[0])         # train rows + test rows should equal the row count of the original dataset

(208, 60) (187, 60) (21, 60)
208
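
The `stratify=y` argument is what keeps the rock/mine ratio similar in both parts of the split. Here is a minimal sketch on random stand-in data with the same shape and class counts as the SONAR set (`X_demo`, `y_demo` and `mine_share` are hypothetical names used only here):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data with the same shape and class counts as the SONAR set.
rng = np.random.default_rng(0)
X_demo = rng.random((208, 60))
y_demo = np.array(["M"] * 111 + ["R"] * 97)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.1, stratify=y_demo, random_state=1
)

def mine_share(labels):
    """Fraction of samples labelled 'M'."""
    return float(np.mean(labels == "M"))

print(len(y_tr), len(y_te))
print(round(mine_share(y_demo), 3), round(mine_share(y_tr), 3), round(mine_share(y_te), 3))
```

With a test set this small, an unstratified split could easily end up with a noticeably different class mix, which would make the test accuracy harder to interpret.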


# Model Training: LogisticRegression, since this is a binary classification task

In [87]:
model = LogisticRegression()               # create the model object

In [88]:
model.fit(X_train, y_train)                # fit the model on the training data

LogisticRegression()
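
For a feel of what fitting produces, the sketch below fits the same estimator on random stand-in data (not the SONAR file; `X_demo`, `y_demo` and `demo_model` are made-up names) and inspects the learned attributes. For a two-class problem scikit-learn stores one weight per feature plus a single intercept.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Random stand-in data: 40 samples, 60 features, two string labels.
rng = np.random.default_rng(0)
X_demo = rng.random((40, 60))
y_demo = np.array(["M", "R"] * 20)

demo_model = LogisticRegression().fit(X_demo, y_demo)

print(demo_model.classes_)          # class labels in sorted order
print(demo_model.coef_.shape)       # one weight per feature
print(demo_model.intercept_.shape)  # a single bias term

# predict_proba returns one column per class, in the order of classes_.
print(demo_model.predict_proba(X_demo[:1]).shape)
```

The same attributes are available on the notebook's `model` after `model.fit(X_train, y_train)`.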

## Prediction and Model Evaluation
Accuracy on training data

In [89]:
X_train_prediction = model.predict(X_train)
X_train_prediction                                 #our predictions are ready!

array(['M', 'R', 'M', 'M', 'R', 'M', 'R', 'R', 'R', 'R', 'M', 'M', 'R',
       'R', 'M', 'R', 'M', 'M', 'R', 'R', 'M', 'R', 'R', 'R', 'M', 'M',
       'R', 'R', 'R', 'M', 'M', 'R', 'M', 'R', 'M', 'R', 'R', 'M', 'M',
       'M', 'R', 'M', 'M', 'R', 'M', 'M', 'M', 'M', 'M', 'R', 'M', 'M',
       'M', 'R', 'M', 'M', 'R', 'M', 'R', 'M', 'R', 'R', 'M', 'R', 'M',
       'M', 'R', 'M', 'R', 'R', 'R', 'M', 'M', 'M', 'M', 'R', 'R', 'R',
       'R', 'M', 'M', 'R', 'R', 'M', 'R', 'M', 'M', 'M', 'M', 'M', 'M',
       'M', 'M', 'M', 'M', 'R', 'R', 'R', 'M', 'M', 'R', 'M', 'R', 'M',
       'M', 'R', 'R', 'R', 'M', 'M', 'R', 'M', 'R', 'R', 'M', 'M', 'R',
       'M', 'R', 'M', 'M', 'M', 'R', 'R', 'R', 'M', 'R', 'R', 'R', 'M',
       'M', 'M', 'M', 'M', 'M', 'M', 'R', 'R', 'M', 'M', 'M', 'M', 'R',
       'M', 'R', 'M', 'R', 'R', 'M', 'M', 'M', 'M', 'M', 'R', 'M', 'R',
       'R', 'M', 'R', 'M', 'M', 'R', 'M', 'R', 'R', 'M', 'R', 'M', 'R',
       'M', 'R', 'R', 'R', 'M', 'R', 'R', 'R', 'M', 'M', 'R', 'M

In [90]:
training_data_accuracy = accuracy_score(y_train, X_train_prediction)   # documented argument order is (y_true, y_pred)
training_data_accuracy                    # the model classifies about 83% of the training samples correctly

0.8342245989304813

Accuracy on test data

In [91]:
X_test_prediction = model.predict(X_test)

In [92]:
test_data_accuracy = accuracy_score(y_test, X_test_prediction)   # documented argument order is (y_true, y_pred)
test_data_accuracy                             # about 76% accuracy on the 21 held-out test samples

0.7619047619047619
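
Accuracy is simply the fraction of predictions that match the true labels, so the value above is consistent with 16 correct out of 21 test samples (16 / 21 ≈ 0.7619). The sketch below shows the same calculation on a few hand-made labels (illustrative only, not the notebook's predictions) and adds a confusion matrix, which shows which class the mistakes fall on.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hand-made labels for illustration only; these are not the notebook's predictions.
y_true = np.array(["M", "M", "M", "R", "R", "R", "M", "R"])
y_pred = np.array(["M", "R", "M", "R", "M", "R", "M", "R"])

acc = accuracy_score(y_true, y_pred)        # documented order: (y_true, y_pred)
manual = float(np.mean(y_true == y_pred))   # the same fraction computed by hand

# Rows are true classes, columns are predicted classes, in the order given by labels.
cm = confusion_matrix(y_true, y_pred, labels=["M", "R"])

print(acc, manual)
print(cm)
```

For a mine detector the two kinds of error are not equally costly (a mine predicted as a rock is worse than the reverse), so the per-class breakdown may be more informative than a single accuracy number.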

# Making a prediction system

In [93]:
# paste any row of feature values from the dataset (tab-separated) to check the model by hand

string = '0.0286	0.0453	0.0277	0.0174	0.0384	0.0990	0.1201	0.1833	0.2105	0.3039	0.2988	0.4250	0.6343	0.8198	1.0000	0.9988	0.9508	0.9025	0.7234	0.5122	0.2074	0.3985	0.5890	0.2872	0.2043	0.5782	0.5389	0.3750	0.3411	0.5067	0.5580	0.4778	0.3299	0.2198	0.1407	0.2856	0.3807	0.4158	0.4054	0.3296	0.2707	0.2650	0.0723	0.1238	0.1192	0.1089	0.0623	0.0494	0.0264	0.0081	0.0104	0.0045	0.0014	0.0038	0.0013	0.0089	0.0057	0.0027	0.0051	0.0062'

number_list = [float(item) for item in string.split()]   # split() with no argument handles tabs and spaces alike
number_list

[0.0286,
 0.0453,
 0.0277,
 0.0174,
 0.0384,
 0.099,
 0.1201,
 0.1833,
 0.2105,
 0.3039,
 0.2988,
 0.425,
 0.6343,
 0.8198,
 1.0,
 0.9988,
 0.9508,
 0.9025,
 0.7234,
 0.5122,
 0.2074,
 0.3985,
 0.589,
 0.2872,
 0.2043,
 0.5782,
 0.5389,
 0.375,
 0.3411,
 0.5067,
 0.558,
 0.4778,
 0.3299,
 0.2198,
 0.1407,
 0.2856,
 0.3807,
 0.4158,
 0.4054,
 0.3296,
 0.2707,
 0.265,
 0.0723,
 0.1238,
 0.1192,
 0.1089,
 0.0623,
 0.0494,
 0.0264,
 0.0081,
 0.0104,
 0.0045,
 0.0014,
 0.0038,
 0.0013,
 0.0089,
 0.0057,
 0.0027,
 0.0051,
 0.0062]

In [94]:
input_data = number_list

# convert the input to a NumPy array
input_data_as_numpy_array = np.asarray(input_data)

# reshape to (1, 60): the model expects a 2D array with one row per sample
input_data_reshaped = input_data_as_numpy_array.reshape(1, -1)

# predict the class of the manually selected row
prediction = model.predict(input_data_reshaped)
print('Prediction result is', prediction[0])                         # predicted R, matching the label of the selected row

if prediction[0] == 'R':
  print('The object is a Rock')
else:
  print('The object is a Mine')

Prediction result is R
The object is a Rock
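
The steps above could be wrapped into a small reusable function. This is only a sketch: `predict_object` is a hypothetical helper name, and the model here is trained on synthetic stand-in data (low values labelled 'R', high values labelled 'M') so that the example runs on its own.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def predict_object(model, values, n_features=60):
    """Hypothetical helper: classify one sonar reading as 'Rock' or 'Mine'."""
    arr = np.asarray(values, dtype=float)
    if arr.shape != (n_features,):
        raise ValueError(f"expected {n_features} values, got shape {arr.shape}")
    label = model.predict(arr.reshape(1, -1))[0]   # one row, n_features columns
    return "Rock" if label == "R" else "Mine"

# Synthetic stand-in model so the sketch is self-contained:
# 'R' samples lie in [0, 1), 'M' samples in [1, 2).
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.random((20, 60)), rng.random((20, 60)) + 1.0])
y_demo = np.array(["R"] * 20 + ["M"] * 20)
demo_model = LogisticRegression().fit(X_demo, y_demo)

print(predict_object(demo_model, [0.5] * 60))
print(predict_object(demo_model, [1.5] * 60))
```

In the notebook the call would be `predict_object(model, number_list)`. The length check catches a row that was pasted with a missing or extra value before it reaches the model.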


## Thanks!