#Overview

The NASA data set comprises NACA 0012 airfoils of different sizes, tested at various wind tunnel speeds and angles of attack. The airfoil span and the observer position were the same in all of the experiments.

The goal is to predict the scaled sound pressure level, in decibels.

---


**This problem has the following inputs:**

1. Frequency, in hertz.
2. Angle of attack, in degrees.
3. Chord length, in meters.
4. Free-stream velocity, in meters per second.
5. Suction side displacement thickness, in meters.

#Importing Libraries

In [0]:
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
import pickle
from sklearn.model_selection import train_test_split

#Reading Dataframe

In [0]:
df = pd.read_csv('NASA_airfoil_self_noise.csv')

In [14]:
df.head(5)

,Frequency,AngleAttack,ChordLength,FreeStreamVelocity,SuctionSide,Sound
0,800,0.0,0.3048,71.3,0.002663,126.201
1,1000,0.0,0.3048,71.3,0.002663,125.201
2,1250,0.0,0.3048,71.3,0.002663,125.951
3,1600,0.0,0.3048,71.3,0.002663,127.591
4,2000,0.0,0.3048,71.3,0.002663,127.461


#Basic Data Information

In [15]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1503 entries, 0 to 1502
Data columns (total 6 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Frequency           1503 non-null   int64  
 1   AngleAttack         1503 non-null   float64
 2   ChordLength         1503 non-null   float64
 3   FreeStreamVelocity  1503 non-null   float64
 4   SuctionSide         1503 non-null   float64
 5   Sound               1503 non-null   float64
dtypes: float64(5), int64(1)
memory usage: 70.6 KB


In [17]:
df.describe().T

,count,mean,std,min,25%,50%,75%,max
Frequency,1503.0,2886.380572,3152.573137,200.0,800.0,1600.0,4000.0,20000.0
AngleAttack,1503.0,6.782302,5.918128,0.0,2.0,5.4,9.9,22.2
ChordLength,1503.0,0.136548,0.093541,0.0254,0.0508,0.1016,0.2286,0.3048
FreeStreamVelocity,1503.0,50.860745,15.572784,31.7,39.6,39.6,71.3,71.3
SuctionSide,1503.0,0.01114,0.01315,0.000401,0.002535,0.004957,0.015576,0.058411
Sound,1503.0,124.835943,6.898657,103.38,120.191,125.721,129.9955,140.987


In [18]:
df.corr()

,Frequency,AngleAttack,ChordLength,FreeStreamVelocity,SuctionSide,Sound
Frequency,1.0,-0.272765,-0.003661,0.133664,-0.230107,-0.390711
AngleAttack,-0.272765,1.0,-0.504868,0.05876,0.753394,-0.156108
ChordLength,-0.003661,-0.504868,1.0,0.003787,-0.220842,-0.236162
FreeStreamVelocity,0.133664,0.05876,0.003787,1.0,-0.003974,0.125103
SuctionSide,-0.230107,0.753394,-0.220842,-0.003974,1.0,-0.31267
Sound,-0.390711,-0.156108,-0.236162,0.125103,-0.31267,1.0


In [20]:
df.isnull().sum()

Frequency             0
AngleAttack           0
ChordLength           0
FreeStreamVelocity    0
SuctionSide           0
Sound                 0
dtype: int64

#Train & Test Split

In [0]:
X = df.drop('Sound', axis=1)
y = df['Sound']

In [0]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=40)

In [26]:
print("Size of train data : ", X_train.shape)
print("Size of test data : ", X_test.shape)

Size of train data :  (1127, 5)
Size of test data :  (376, 5)


#Model Training

In [0]:
#Fit a linear regression model on the training data
lr = LinearRegression().fit(X_train, y_train)
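
After fitting, the learned coefficients can be paired with the feature names to see how each input enters the linear model. The following is a minimal sketch, assuming scikit-learn and pandas are installed; it uses made-up rows with hypothetical columns `x1` and `x2` rather than the airfoil data, so that the expected coefficients are known in advance.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up rows for illustration only (not the airfoil data).
toy_X = pd.DataFrame({
    "x1": [0.0, 1.0, 2.0, 3.0],
    "x2": [1.0, 0.0, 1.0, 0.0],
})
# The target is built from a known linear rule: y = 2*x1 - 3*x2 + 5.
toy_y = 2.0 * toy_X["x1"] - 3.0 * toy_X["x2"] + 5.0

toy_lr = LinearRegression().fit(toy_X, toy_y)

# Pair each coefficient with its column name for readability.
coefficients = pd.Series(toy_lr.coef_, index=toy_X.columns)
print(coefficients)
print("Intercept:", toy_lr.intercept_)
```

In this notebook the equivalent call would be `pd.Series(lr.coef_, index=X.columns)`. Each coefficient is expressed in the units of its feature (hertz, degrees, meters, and so on), so the raw magnitudes are not directly comparable across features.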

In [0]:
#Save the fitted model with pickle so it can be reused later, e.g. behind an API
with open('airfoil_model.pkl', 'wb') as f:
    pickle.dump(lr, f)
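
A later process, such as an API handler, can load the pickled model and predict for a single measurement. The following is a minimal sketch, assuming scikit-learn and pandas are installed. To stay self-contained it fits a throwaway model on made-up rows and writes it to a temporary file instead of `airfoil_model.pkl`; the helper `predict_sound` and the constant `FEATURES` are hypothetical names introduced for illustration.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.linear_model import LinearRegression

# Column names in the same order the model was trained on.
FEATURES = ["Frequency", "AngleAttack", "ChordLength",
            "FreeStreamVelocity", "SuctionSide"]


def predict_sound(model, frequency, angle, chord, velocity, thickness):
    """Predict the sound level for one measurement (hypothetical helper)."""
    row = pd.DataFrame([[frequency, angle, chord, velocity, thickness]],
                       columns=FEATURES)
    return float(model.predict(row)[0])


# Made-up training rows, for illustration only (not the airfoil data).
toy_X = pd.DataFrame(
    [[800, 0.0, 0.30, 71.3, 0.0027],
     [1000, 2.0, 0.20, 55.5, 0.0040],
     [1600, 4.0, 0.15, 39.6, 0.0050],
     [2000, 6.0, 0.10, 31.7, 0.0080],
     [4000, 8.0, 0.05, 71.3, 0.0120],
     [6300, 10.0, 0.03, 39.6, 0.0200]],
    columns=FEATURES)
toy_y = [126.0, 125.0, 127.0, 124.0, 120.0, 115.0]

toy_model = LinearRegression().fit(toy_X, toy_y)

# Round trip: write the model to disk, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "toy_model.pkl")
    with open(model_path, "wb") as f:
        pickle.dump(toy_model, f)
    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

original = predict_sound(toy_model, 1250, 4.0, 0.15, 39.6, 0.005)
restored = predict_sound(loaded_model, 1250, 4.0, 0.15, 39.6, 0.005)
print("Prediction matches after reload:", abs(original - restored) < 1e-9)
```

Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code. It is also safest to load the model with the same scikit-learn version that was used to save it.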

#Model Prediction & Evaluation

In [0]:
predictions = lr.predict(X_test)

In [30]:
#scikit-learn expects the true values first, then the predictions
mae = mean_absolute_error(y_test, predictions)

print("Mean Absolute Error :", round(mae, 2))

Mean Absolute Error : 3.95
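
To put an MAE value in context, it can help to compare it against a trivial baseline that always predicts the mean of the training target. The sketch below is pure Python and uses made-up toy values rather than the notebook's data; `mae` and `mean_baseline_mae` are hypothetical helper names, not part of scikit-learn.

```python
def mae(y_true, y_pred):
    """Mean absolute error: the average of |y_true - y_pred|."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_baseline_mae(y_train, y_test):
    """MAE of a baseline that always predicts the training mean."""
    baseline = sum(y_train) / len(y_train)
    return mae(y_test, [baseline] * len(y_test))


# Toy values for illustration only (not taken from the data set).
y_train_toy = [120.0, 124.0, 128.0, 132.0]
y_test_toy = [122.0, 130.0]
toy_predictions = [123.0, 128.0]

model_mae = mae(y_test_toy, toy_predictions)
baseline_mae = mean_baseline_mae(y_train_toy, y_test_toy)
print("Model MAE    :", round(model_mae, 2))
print("Baseline MAE :", round(baseline_mae, 2))
```

With the real data, the same comparison could be made by passing `y_train`, `y_test`, and `predictions` from the cells above. A model MAE well below the baseline MAE suggests that the features carry useful signal.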
