### False Alarm Prediction Model for leakage detection of Hydrogen Sulphide Gas in Chemical Industry

Flask is a web framework. It provides the tools, libraries and technologies needed to build a web application.
The cell below imports the Flask class from the flask library, jsonify, which converts data into JSON format, and request, which gives access to the incoming HTTP request.


In [1]:
from flask import Flask,jsonify,request

An object of the Flask class is created; this object, 'app', is our WSGI application.

In [2]:
app = Flask(__name__)

Importing the necessary ML/DS libraries:

1. NumPy is imported with the alias 'np' for array operations on the data.

2. Pandas is imported with the alias 'pd' for structuring data into tables and dataframes.

3. The Gaussian Naive Bayes classifier (GaussianNB) is imported from scikit-learn; it is the machine learning model used here.

4. joblib, a serialization library, is imported to save the state of the trained model to a pickle file that can later be loaded and reused multiple times.

In [3]:
import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB
import joblib

The route() function of the Flask class is a decorator that tells the application which URL should call the associated function. The '/train' URL is bound to the train function, so when '/train' is requested from the web server, the output of this function is returned. In the cell below the decorator is commented out and train() is called directly.
A function named 'train' is created to train the model:
1. Reading the data from the Excel file and storing it in a variable named df_train.
2. Splitting the data into X and y, which correspond to the independent columns and the dependent column respectively.
3. Creating an object of the classifier model (here: Gaussian NB).
4. Fitting the model on X and y with the fit function of the classifier.
5. Saving the state of the model in a pickle file named 'filename.pkl'.
6. Returning a 'model trained' message.

In [9]:
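# @app.route('/train')  # decorator disabled here; train() is called directly below
def train():
    # Load the historical cases (pandas needs an Excel engine such as openpyxl for .xlsx)
    df_train = pd.read_excel('False Alarm Cases.xlsx')
    # Keep columns 1-7: the six features plus the target column
    df_train = df_train.iloc[:, 1:8]
    X = df_train.iloc[:, 0:6]              # independent columns
    y = df_train['Spuriosity Index(0/1)']  # dependent column

    classifier = GaussianNB()
    classifier.fit(X, y)
    # Persist the fitted model so that test() can load it later
    joblib.dump(classifier, 'filename.pkl')
    return 'Model has been Trained'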

In [11]:
train()

'Model has been Trained'
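
For intuition about what fitting and predicting with a Gaussian Naive Bayes classifier involves, the following is a minimal pure-Python sketch, not scikit-learn's implementation. It estimates a prior, a mean and a variance per class and feature, then picks the class with the highest log posterior. The function names `fit_gaussian_nb` and `predict_one` are hypothetical, the fixed `1e-9` variance floor is a simplification of scikit-learn's smoothing, and the toy data uses only two columns (humidity and % of sensors) from six rows of the History_Data preview shown in this notebook.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the prior, feature means and feature variances for each class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_one(model, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # Log of the Gaussian density of v under N(m, var)
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# [Humidity(%), detected by(% of sensors)] for Cases 1, 1993, 1995, 2, 3, 4
X = [[96, 21], [94, 22], [93, 39], [83, 77], [69, 81], [80, 69]]
y = [1, 1, 1, 0, 0, 0]

model = fit_gaussian_nb(X, y)
print(predict_one(model, [95, 25]))  # high humidity, few sensors
print(predict_one(model, [75, 80]))  # lower humidity, many sensors
```

With six rows the estimates are only illustrative; the notebook's model is fitted on all six feature columns of the full dataset.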

The '/test' URL is bound to the test function, with the request method explicitly restricted to 'POST'.
The 'test' function is created:
1. The trained model is loaded into a variable named 'clf'.
2. Test data in JSON format is read via the get_json function.
3. Each independent column is assigned to its own variable, and the variables are combined in a list to form an array of test data.
4. The array of test data is reshaped and converted into a dataframe.
5. The predict function of the classifier is applied to the test dataframe.
6. A recommendation based on the output of the classifier is returned.
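
Steps 2 and 3 assume that every expected key is present in the JSON body; a missing key would raise a KeyError inside the view. As a sketch of one way to guard against that, the hypothetical helper `validate_payload` below (not part of the notebook) checks the six keys used by the test function and returns the feature list in the order the model expects.

```python
REQUIRED_KEYS = [
    'Ambient Temperature', 'Calibration', 'Unwanted substance deposition',
    'Humidity', 'H2S Content', 'detected by',
]

def validate_payload(payload):
    """Return (features, error); exactly one of the two is None."""
    if not isinstance(payload, dict):
        return None, 'JSON body must be an object'
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        return None, 'Missing keys: ' + ', '.join(missing)
    features = []
    for key in REQUIRED_KEYS:
        value = payload[key]
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None, key + ' must be a number'
        features.append(value)
    return features, None

# Values of Case # 1 from the History_Data sheet
sample = {
    'Ambient Temperature': -2, 'Calibration': 226,
    'Unwanted substance deposition': 1, 'Humidity': 96,
    'H2S Content': 9, 'detected by': 21,
}
print(validate_payload(sample))
print(validate_payload({'Humidity': 96}))
```

Inside the view, a non-None error could be returned with jsonify and a 400 status code instead of letting the request fail with a server error.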

In [5]:
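@app.route('/test', methods=['POST'])
def test():
    # Load the model saved by train()
    clf = joblib.load('filename.pkl')

    request_data = request.get_json()

    # Read the six features in the same order as the training columns
    features = [
        request_data['Ambient Temperature'],
        request_data['Calibration'],
        request_data['Unwanted substance deposition'],
        request_data['Humidity'],
        request_data['H2S Content'],
        request_data['detected by'],
    ]
    narr = np.array(features).reshape(1, 6)
    # Column names must match those seen during fit (taken from the dataframe
    # header shown in this notebook); recent scikit-learn versions reject
    # dataframes whose feature names differ
    columns = ['Ambient Temperature( deg C)', 'Calibration(days)',
               'Unwanted substance deposition(0/1)', 'Humidity(%)',
               'H2S Content(ppm)', 'detected by(% of sensors)']
    df_test = pd.DataFrame(narr, columns=columns)

    ypred = clf.predict(df_test)

    # predict() returns an array; compare its single element
    if ypred[0] == 1:
        result = 'Danger'
    else:
        result = 'No Danger'

    return jsonify({'Recommendation': result})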

app.run(port=5000)

 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
127.0.0.1 - - [16/Nov/2023 19:46:54] "GET / HTTP/1.1" 404 -
127.0.0.1 - - [16/Nov/2023 19:47:02] "GET /test HTTP/1.1" 405 -
[2023-11-16 19:47:09,059] ERROR in app: Exception on /train [GET]
Traceback (most recent call last):
  File "c:\Users\jacob\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\compat\_optional.py", line 132, in import_optional_dependency
    module = importlib.import_module(name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\jacob\AppData\Local\Programs\Python\Python311\Lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1142, in _find_and_load_unlocked
ModuleNotFoundError: No

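
The 405 response logged for `GET /test` is consistent with the route accepting only POST. As a sketch of how a client could call the endpoint, the snippet below builds a POST request with a JSON body using only the standard library. The helper name `build_test_request` is hypothetical, the URL assumes the server is running locally on port 5000 as in the log above, and the sample values are those of Case # 1 from the History_Data sheet.

```python
import json
import urllib.request

def build_test_request(sample, url='http://127.0.0.1:5000/test'):
    """Build a POST request carrying the sample as a JSON body."""
    body = json.dumps(sample).encode('utf-8')
    return urllib.request.Request(
        url,
        data=body,
        method='POST',
        # get_json() in the view expects a JSON content type
        headers={'Content-Type': 'application/json'},
    )

sample = {
    'Ambient Temperature': -2, 'Calibration': 226,
    'Unwanted substance deposition': 1, 'Humidity': 96,
    'H2S Content': 9, 'detected by': 21,
}
req = build_test_request(sample)
print(req.get_method(), req.full_url)

# With the Flask server running, the request could be sent like this:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The sending step is left commented out so that the snippet can be run without a live server.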

In [23]:
xls = pd.ExcelFile('False Alarm Cases.xlsx')
df_test = pd.read_excel(xls, "History_Data")
df_test

Unnamed: 0,Case No.,Ambient Temperature( deg C),Calibration(days),Unwanted substance deposition(0/1),Humidity(%),H2S Content(ppm),detected by(% of sensors),Spuriosity Index(0/1),Unnamed: 8,Unnamed: 9,Unnamed: 10
0,Case # 1,-2,226,1,96,9,21,1,,,
1,Case # 2,4,134,1,83,4,77,0,,,
2,Case # 3,7,163,0,69,2,81,0,,,
3,Case # 4,5,162,0,80,6,69,0,,,
4,Case # 5,2,192,1,87,3,67,0,,,
...,...,...,...,...,...,...,...,...,...,...,...
1887,Case # 1992,6,195,1,72,5,79,0,,,
1888,Case # 1993,8,134,1,94,9,22,1,,,
1889,Case # 1994,1,32,0,95,4,100,0,,,
1890,Case # 1995,6,31,0,93,6,39,1,,,


In [56]:
# Relies on train_test_split and the sklearn.metrics functions imported in cell In [55]
df = pd.read_excel(xls)
fac = df.drop('Case No.', axis=1)  # axis=1 drops a column
dum_Default = pd.get_dummies(fac, drop_first=True)
X = dum_Default.iloc[:, 0:6]

y = dum_Default['Spuriosity Index(0/1)']
# Hold out 30% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, shuffle=True)
clf = GaussianNB()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
print(accuracy_score(y_test, y_pred))

[[475   0]
 [  0  93]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00       475
           1       1.00      1.00      1.00        93

    accuracy                           1.00       568
   macro avg       1.00      1.00      1.00       568
weighted avg       1.00      1.00      1.00       568

1.0


In [58]:
# NOTE: xls_deploy is opened but never read. The evaluation below reuses df,
# the full dataset loaded earlier, which includes the rows used for training.
xls_deploy = pd.ExcelFile('../False_Alarm_Case.xlsx')
fac = df.drop('Case No.', axis=1)  # axis=1 drops a column
dum_Default = pd.get_dummies(fac, drop_first=True)
x_deploy = dum_Default.iloc[:, 0:6]

y_deploy = dum_Default['Spuriosity Index(0/1)']
y_pred_deploy = clf.predict(x_deploy)
print(confusion_matrix(y_deploy, y_pred_deploy))
print(classification_report(y_deploy, y_pred_deploy))
print(accuracy_score(y_deploy, y_pred_deploy))

[[1562    1]
 [   0  329]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00      1563
           1       1.00      1.00      1.00       329

    accuracy                           1.00      1892
   macro avg       1.00      1.00      1.00      1892
weighted avg       1.00      1.00      1.00      1892

0.9994714587737844
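
The scores in the report can be recomputed by hand from the confusion matrix printed above. The sketch below assumes scikit-learn's layout (rows are true labels, columns are predicted labels) and treats class 1 as the positive class; the variable names are illustrative.

```python
# Confusion matrix from the output above
cm = [[1562, 1],
      [0, 329]]

tn, fp = cm[0]  # true class 0: predicted 0, predicted 1
fn, tp = cm[1]  # true class 1: predicted 0, predicted 1

total = tn + fp + fn + tp
accuracy = (tn + tp) / total   # share of correct predictions
precision_1 = tp / (tp + fp)   # how many predicted 1s are truly 1
recall_1 = tp / (tp + fn)      # how many true 1s were found
recall_0 = tn / (tn + fp)      # how many true 0s were found

print(total)
print(accuracy)
print(round(precision_1, 4), recall_1, round(recall_0, 4))
```

The single off-diagonal entry is a case of class 0 predicted as class 1, which is why the accuracy is 1891/1892 rather than 1.0 while the report, rounded to two decimals, still shows 1.00 everywhere.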


In [55]:
import sys
import joblib
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score

xls = pd.ExcelFile('../False_Alarm_Cases.xlsx')
# Workaround for pickles that reference the old sklearn.externals.joblib path;
# only relevant if a previously saved model is loaded
sys.modules['sklearn.externals.joblib'] = joblib

# clf = joblib.load('filename.pkl')

df = pd.read_excel(xls)
df = df.iloc[:, 1:8]  # six features plus the target column
X = df.iloc[:, 0:6]

y = df['Spuriosity Index(0/1)']
# Hold out 30% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, shuffle=True)
clf = GaussianNB()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
print(accuracy_score(y_test, y_pred))

[[475   0]
 [  0  93]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00       475
           1       1.00      1.00      1.00        93

    accuracy                           1.00       568
   macro avg       1.00      1.00      1.00       568
weighted avg       1.00      1.00      1.00       568

1.0


In [45]:
df.describe()

Unnamed: 0,Ambient Temperature( deg C),Calibration(days),Unwanted substance deposition(0/1),Humidity(%),H2S Content(ppm),detected by(% of sensors),Spuriosity Index(0/1)
count,1892.0,1892.0,1892.0,1892.0,1892.0,1892.0,1892.0
mean,3.449789,131.633192,0.48203,82.513214,5.532241,71.610465,0.17389
std,3.323731,67.741005,0.499809,7.6599,2.271502,21.203802,0.379115
min,-2.0,10.0,0.0,69.0,2.0,20.0,0.0
25%,1.0,75.0,0.0,76.0,4.0,63.0,0.0
50%,3.0,133.0,0.0,82.0,6.0,76.0,0.0
75%,6.0,188.0,1.0,89.0,8.0,88.0,0.0
max,9.0,250.0,1.0,96.0,9.0,100.0,1.0
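
The mean of the 'Spuriosity Index(0/1)' column in the summary above gives the share of rows labelled 1. As a small worked example, the sketch below turns that mean into class counts and into the accuracy of a trivial baseline that always predicts 0, which is a useful reference point when reading the accuracy scores in this notebook. The variable names are illustrative.

```python
n_rows = 1892            # count from df.describe()
positive_rate = 0.17389  # mean of 'Spuriosity Index(0/1)' from df.describe()

n_positive = round(n_rows * positive_rate)
n_negative = n_rows - n_positive

# Accuracy of a model that predicts 0 for every row
baseline_accuracy = n_negative / n_rows

print(n_positive, n_negative)
print(round(baseline_accuracy, 4))
```

The resulting counts agree with the support values (1563 and 329) in the classification report for the full dataset.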


In [41]:
for i, j in enumerate(zip(list(y_test), list(y_pred))):
    print(i, j)

0 (0, 0)
1 (1, 1)
2 (0, 0)
3 (0, 0)
4 (0, 0)
5 (0, 0)
6 (0, 0)
7 (1, 1)
8 (0, 0)
9 (0, 0)
10 (0, 0)
11 (0, 0)
12 (0, 0)
13 (0, 0)
14 (0, 0)
15 (0, 0)
16 (0, 0)
17 (0, 0)
18 (0, 0)
19 (0, 0)
20 (0, 0)
21 (1, 1)
22 (0, 0)
23 (0, 0)
24 (0, 0)
25 (0, 0)
26 (1, 1)
27 (0, 0)
28 (0, 0)
29 (0, 0)
30 (1, 1)
31 (1, 1)
32 (0, 0)
33 (0, 0)
34 (0, 0)
35 (0, 0)
36 (0, 0)
37 (0, 0)
38 (0, 0)
39 (0, 0)
40 (0, 0)
41 (1, 1)
42 (0, 0)
43 (0, 0)
44 (0, 0)
45 (0, 0)
46 (0, 0)
47 (0, 0)
48 (0, 0)
49 (0, 0)
50 (0, 0)
51 (0, 0)
52 (0, 0)
53 (0, 0)
54 (0, 0)
55 (1, 1)
56 (0, 0)
57 (0, 0)
58 (0, 0)
59 (0, 0)
60 (0, 0)
61 (0, 0)
62 (1, 1)
63 (0, 0)
64 (0, 0)
65 (0, 0)
66 (0, 0)
67 (0, 0)
68 (0, 0)
69 (0, 0)
70 (0, 0)
71 (0, 0)
72 (0, 0)
73 (0, 0)
74 (0, 0)
75 (0, 0)
76 (0, 0)
77 (1, 1)
78 (0, 0)
79 (0, 0)
80 (0, 0)
81 (0, 0)
82 (0, 0)
83 (0, 0)
84 (1, 1)
85 (0, 0)
86 (0, 0)
87 (0, 0)
88 (0, 0)
89 (0, 0)
90 (1, 1)
91 (1, 1)
92 (0, 0)
93 (0, 0)
94 (0, 0)
95 (0, 0)
96 (0, 0)
97 (0, 0)
98 (1, 1)
99 (0, 0)
100 (0, 0)