### Task:

**dataset: https://archive.ics.uci.edu/ml/datasets/AI4I+2020+Predictive+Maintenance+Dataset**

1. Load the data: "ai4i2020.csv"
2. Profiling of the data.
3. Write your analysis of the data.
4. If there are NAN vals, Imputation to be done.
5. If the data ain't normal, then handle it.
6. Check multicollinearity.
7. Build the model and save it. (Regress against Air tempearature)
8. Compute Model accuracy.
9. 10 test cases and create a final report.

In [1]:
import pandas as pd
import numpy as np
from pandas_profiling import ProfileReport
import os

In [2]:
pwd

'D:\\iNeuron\\12. Machine Learning'

In [3]:
os.chdir("files used or created in the process")

In [4]:
os.listdir()

['advertising.csv',
 'Advertizing_report.html',
 'ai4i2020.csv',
 'ai4i2020.log',
 'myfirstmodel.sav',
 'predictive_maintenance.sav',
 'predictive_maintenance_report.html',
 'standardized_ai4i.csv']

In [5]:
### 1. Loading the data:

df = pd.read_csv("ai4i2020.csv")
df.head()

Unnamed: 0,UDI,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
0,1,M14860,M,298.1,308.6,1551,42.8,0,0,0,0,0,0,0
1,2,L47181,L,298.2,308.7,1408,46.3,3,0,0,0,0,0,0
2,3,L47182,L,298.1,308.5,1498,49.4,5,0,0,0,0,0,0
3,4,L47183,L,298.2,308.6,1433,39.5,7,0,0,0,0,0,0
4,5,L47184,L,298.2,308.7,1408,40.0,9,0,0,0,0,0,0


In [6]:
try:
    df.rename(columns = {'UDI': 'UID'}, inplace=True)
    df.set_index('UID', inplace=True)

except Excetion as e:
    print(e)

In [7]:
df.head()

Unnamed: 0_level_0,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
UID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
1,M14860,M,298.1,308.6,1551,42.8,0,0,0,0,0,0,0
2,L47181,L,298.2,308.7,1408,46.3,3,0,0,0,0,0,0
3,L47182,L,298.1,308.5,1498,49.4,5,0,0,0,0,0,0
4,L47183,L,298.2,308.6,1433,39.5,7,0,0,0,0,0,0
5,L47184,L,298.2,308.7,1408,40.0,9,0,0,0,0,0,0


In [8]:
### 2. Profiling the data:

pf_report = ProfileReport(df)

In [9]:
## daving the report genrated locally:

pf_report.to_file("predictive_maintenance_report.html")

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Generate report structure:   0%|          | 0/1 [00:00<?, ?it/s]

Render HTML:   0%|          | 0/1 [00:00<?, ?it/s]

Export report to file:   0%|          | 0/1 [00:00<?, ?it/s]

In [9]:
pf_report.to_widgets()

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Generate report structure:   0%|          | 0/1 [00:00<?, ?it/s]

Render widgets:   0%|          | 0/1 [00:00<?, ?it/s]

VBox(children=(Tab(children=(Tab(children=(GridBox(children=(VBox(children=(GridspecLayout(children=(HTML(valu…

In [11]:
### 3. My Analysis:

1. - We have 14 distinct variables in this dataset i.e. 14 feature columns and any one of them can be our label column as per the desire of our client.
   - There's not a single record containing NaN vals.
   - Out of our 14 feature variables 8 are of categorical datatype and 6 are of numerical.
   - Decent dataset, as we're having 10,000 records.
   - Each record is an unique one since there are no duplicate rows.
<br><br>
2. - Air Temperature doesn't follow perfect normal distribution.
   - Process Temperature doesn't follow normal distribution.
   - The distribution of Rotational speed is somewhat right skewed i.e. most of its vals falls right to the peak of the graph.
   - The Torque feature follows Normal distribution.
   - Tool Wear doesn't follow normal distribution.
<br><br>
3. Correlations: (i.e. if one entity increases, the other one increases or decreases accordingly.)<br>
    - Air temp. and Process temp. are positively correlated, it seems.
    - Change in Air temp. doesn't have any effect on rotational speed and Torque whatsoever.
    - Process temp. is not at all correlated with rotational speed and torque.
    - Rotational speed is negatively correlated with Torque and vice-versa.
    - Machine failure is positively correlated w HDF, PWF and OSF.
    - HDF (Heat Dissipation Failure) is positively correlated with Machine Failure.
    - PWF (Power failure) is positively correlated with Machine failure.
    - OSF (Overstrain failure) is positively correlated with Machine failure.
    <br>**Note: As mentioned in the dataset description that if at least one of the above(HDF, PWF and OSF) failure modes is true, the process fails and the 'machine failure' label is set to 1. 
       It is thus, therefore not transparent to the machine learning method, which of the failure modes has caused the process to fail.**

In [12]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10000 entries, 1 to 10000
Data columns (total 13 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   Product ID               10000 non-null  object 
 1   Type                     10000 non-null  object 
 2   Air temperature [K]      10000 non-null  float64
 3   Process temperature [K]  10000 non-null  float64
 4   Rotational speed [rpm]   10000 non-null  int64  
 5   Torque [Nm]              10000 non-null  float64
 6   Tool wear [min]          10000 non-null  int64  
 7   Machine failure          10000 non-null  int64  
 8   TWF                      10000 non-null  int64  
 9   HDF                      10000 non-null  int64  
 10  PWF                      10000 non-null  int64  
 11  OSF                      10000 non-null  int64  
 12  RNF                      10000 non-null  int64  
dtypes: float64(3), int64(8), object(2)
memory usage: 1.1+ MB


In [13]:
### 4. If there are NAN vals, Imputation to be done:

In statistics, imputation is the process of replacing missing data with substituted values. When substituting for a data point, it is known as "unit imputation"; when substituting for a component of a data point, it is known as "item imputation".
<br>**As there isn't even a single NaN val in the whole dataset, Imputation needn't be done.**

In [14]:
### 5. If the data ain't normal, then handle it.

From our fore analysis of the data, the distribution of our columns clearily indicates that the following columns doesn't follow Normal distribution:

- **Air temperature**
- **Process temperature**
- **Rotational speed** 
- **Tool wear**

In [11]:
df.columns

Index(['Product ID', 'Type', 'Air temperature [K]', 'Process temperature [K]',
       'Rotational speed [rpm]', 'Torque [Nm]', 'Tool wear [min]',
       'Machine failure', 'TWF', 'HDF', 'PWF', 'OSF', 'RNF'],
      dtype='object')

In [15]:
from sklearn.preprocessing import StandardScaler

In [16]:
def standardize(dataset):
    """
    A function to standardize a dataframe passed as an argument.
    """
    try:
        scalar = StandardScaler()
        scalar.fit(dataset)
        new = scalar.transform(dataset) # our tranformed dataset
        new = pd.DataFrame(new) # transformed dataset converted to Series datatype
        
        return new
    
    except Exception as e:
        print(e)

In [18]:
df_copy[['Torque [Nm]']] = standardize(df_copy[['Torque [Nm]']])

In [33]:
df_copy.tail()

Unnamed: 0_level_0,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
UID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
9996,M24855,M,298.8,308.4,1604,-0.821283,14,0,0,0,0,0,0
9997,H39410,H,298.9,308.4,1632,-0.660777,17,0,0,0,0,0,0
9998,M24857,M,299.0,308.6,1645,0.854005,22,0,0,0,0,0,0
9999,H39412,H,299.0,308.7,1408,0.021376,25,0,0,0,0,0,0
10000,M24859,M,299.0,308.7,1500,,30,0,0,0,0,0,0


In [17]:
# Creating a deep copy of our original dataset so that transformation changes won't be reflected in the original dataset.

df_copy = df.copy(deep = True)
df_copy.head()

Unnamed: 0_level_0,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
UID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
1,M14860,M,298.1,308.6,1551,42.8,0,0,0,0,0,0,0
2,L47181,L,298.2,308.7,1408,46.3,3,0,0,0,0,0,0
3,L47182,L,298.1,308.5,1498,49.4,5,0,0,0,0,0,0
4,L47183,L,298.2,308.6,1433,39.5,7,0,0,0,0,0,0
5,L47184,L,298.2,308.7,1408,40.0,9,0,0,0,0,0,0


In [19]:
df_copy

Unnamed: 0_level_0,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
UID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
1,M14860,M,298.1,308.6,1551,0.633308,0,0,0,0,0,0,0
2,L47181,L,298.2,308.7,1408,0.944290,3,0,0,0,0,0,0
3,L47182,L,298.1,308.5,1498,-0.048845,5,0,0,0,0,0,0
4,L47183,L,298.2,308.6,1433,0.001313,7,0,0,0,0,0,0
5,L47184,L,298.2,308.7,1408,0.191915,9,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
9996,M24855,M,298.8,308.4,1604,-0.821283,14,0,0,0,0,0,0
9997,H39410,H,298.9,308.4,1632,-0.660777,17,0,0,0,0,0,0
9998,M24857,M,299.0,308.6,1645,0.854005,22,0,0,0,0,0,0
9999,H39412,H,299.0,308.7,1408,0.021376,25,0,0,0,0,0,0


In [22]:
df[['Torque [Nm]']] = standardize(df[['Torque [Nm]']])

In [23]:
df.tail()

Unnamed: 0_level_0,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
UID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
9996,M24855,M,298.8,308.4,1604,29.5,14,0,0,0,0,0,0
9997,H39410,H,298.9,308.4,1632,31.8,17,0,0,0,0,0,0
9998,M24857,M,299.0,308.6,1645,33.4,22,0,0,0,0,0,0
9999,H39412,H,299.0,308.7,1408,48.5,25,0,0,0,0,0,0
10000,M24859,M,299.0,308.7,1500,40.2,30,0,0,0,0,0,0


In [29]:
print("Std: ", df[['Torque [Nm]']].std())
print("Mean: ", df[['Torque [Nm]']].mean())

# mean: 39.98691 (from pandas profiling)
# calculated manually: z score of last: 0.0021442017

Std:  Torque [Nm]    9.968934
dtype: float64
Mean:  Torque [Nm]    39.98691
dtype: float64


In [22]:
df_copy

Unnamed: 0_level_0,Product ID,Type,Air temperature [K],Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
UID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1
1,M14860,M,298.1,308.6,1551,0.633308,0,0,0,0,0,0,0
2,L47181,L,298.2,308.7,1408,0.944290,3,0,0,0,0,0,0
3,L47182,L,298.1,308.5,1498,-0.048845,5,0,0,0,0,0,0
4,L47183,L,298.2,308.6,1433,0.001313,7,0,0,0,0,0,0
5,L47184,L,298.2,308.7,1408,0.191915,9,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
9996,M24855,M,298.8,308.4,1604,-0.821283,14,0,0,0,0,0,0
9997,H39410,H,298.9,308.4,1632,-0.660777,17,0,0,0,0,0,0
9998,M24857,M,299.0,308.6,1645,0.854005,22,0,0,0,0,0,0
9999,H39412,H,299.0,308.7,1408,0.021376,25,0,0,0,0,0,0


In [30]:
pr1 = ProfileReport(df_copy)
pr1.to_widgets()

Summarize dataset:   0%|          | 0/5 [00:00<?, ?it/s]

Generate report structure:   0%|          | 0/1 [00:00<?, ?it/s]

Render widgets:   0%|          | 0/1 [00:00<?, ?it/s]

VBox(children=(Tab(children=(Tab(children=(GridBox(children=(VBox(children=(GridspecLayout(children=(HTML(valu…

In [None]:
### Actually checking if the Torque is normal or not:



In [32]:
df_copy.to_csv("standardized_ai4i.csv")

In [15]:
### 6. Check multicollinearity

**Feature columns:** Process temperature, Rotational speed, Torque, Tool wear, Machine failure, TWF, PWF, OSF, RNF.

**Label Column:** Air tempearature

### Multicollinearity is a term which refers to the correlation between the feature columns apart from the one that is treated to be as label column.

**So from our analysis of the profiling report, we can conclude that, there is multicollineariy between the following feature columns:**

- Rotational speed and Torque **(negative correlation).**
- Machine faiure and TWF **(positive correlation)**.
- Machine faiure and HDF **(positive correlation)**.
- Machine faiure and PWF **(positive correlation)**.
- Machine faiure and OSF **(positive correlation)**.
- Machine faiure and RNF **(positive correlation)**.

### 7. Build the model and save it. (Regress against the Air temperature)

Since we have to regress against the Air temperature (our response column or label column) against not one but various explanatory variables, we have to use the **Multiple Linear Regression** (that is just an extension of linear regression to several explanatory variables).

In [16]:
df.columns

Index(['Product ID', 'Type', 'Air temperature [K]', 'Process temperature [K]',
       'Rotational speed [rpm]', 'Torque [Nm]', 'Tool wear [min]',
       'Machine failure', 'TWF', 'HDF', 'PWF', 'OSF', 'RNF'],
      dtype='object')

**Note:** From our analysis, we do know that 'Machine failure' is highly postively correlated with 'TWF', 'HDF', 'PWF', 'OSF', 'RNF'.<br>
And it's even emphasized in the dataset description itself if one of the failures from TWF, HDF, PWF, OSF and RNF is true Machine failure is set to 1.<br>
<br>
Thus I'll put a condition in the predict method of the class LModel that if atleast one of the afore-mentioned failures is true, then set Machine failure to 1 i.e. the class LModel's **predict() method** won't be accepting Machine failure feature values.

In [17]:
from sklearn.linear_model import LinearRegression
import pickle
import logging as lg

In [18]:
class LModel:
    """
    This is a class that is specific to build a regression model to regress against the label column 'Air Temperature [K]'
    """
    
    def __init__(self):
        self.lModel = None
    
    def _features(self):
        """
        A protected method specific to pick the feature columns from the datset.
        """
        try:
            x = df[['Process temperature [K]', 'Rotational speed [rpm]', 'Torque [Nm]', 'Tool wear [min]',
                    'Machine failure', 'TWF', 'HDF', 'PWF', 'OSF', 'RNF']]
            return x
        
        except Exception as e:
            lg.error(e)
            
    def _label(self):
        """
        A protected method specific to select the label column that is to be regressed against.
        """
        try:
            y = df[['Air temperature [K]']]
            return y
        
        except Exception as e:
            lg.error(e)
            
    def build(self):
        """
        A method specific to build the desired model.
        """
        try:
            self.lModel = LinearRegression()
            lg.info("readying the model..")
            modelBuilt = self.lModel.fit(self._features(), self._label())
            lg.info("Model executed succesfully!")

            return modelBuilt

        except Exception as e:
            lg.error(e)
            
    def predict_(self, process_t, rot_speed, torque, tool_wear, twf, hdf, pwf, osf, rnf):
        """
        A method specific to yield prediction results.
        
        Note: If even one of the twf, hdf, pwf, osf and rnf failures is true, then Machine failure will be set to 1.
              For further clarification, refer to the dataset desrciption.
        """
        try:
            machine_failure = 0
            failures = [twf, hdf, pwf, osf, rnf]
            for i in failures:
                if i == 1: # i.e if any of these failures become true
                    machine_failure = 1 # set the machine failure = true
                    break
                    
#             print("Our Model predicts the value of Air temperature to be: (in Kelvins)")
            return float(self.build().predict([[process_t, rot_speed, torque, tool_wear, machine_failure, twf, hdf,
                                                pwf, osf, rnf]]))
        
        except Exception as e:
            lg.error(e)
            
    def accuracy(self):
        """
        This method calculates the accuracy of the built model.
        """
        try:
            accuracy_ = self.lModel.score(self._features(), self._label())* 100
            
            lg.info(f"The model appears to be {accuracy_}% accurate.")
            return f"The model appears to be {round(accuracy_, 3)} % accurate."
            
        except Exception as e:
            lg.error(e)
        
    def save(self):
        """
        The method to save the model locally.
        """
        try:
            pickle.dump(self.lModel, open("predictive_maintenance.sav", 'wb'))
            print("The model is saved sucessfully!")
            lg.info("The model is saved sucessfully!")
            
        except Exception as e:
            lg.error(e)

In [19]:
class Log:
    def __init__(self):
        try:
            self.logFile="ai4i2020.log"
            
            # removing the log file if already exists so as not to congest it.
            if os.path.exists(self.logFile):
                os.remove(self.logFile)
            lg.basicConfig(filename=self.logFile, level=lg.INFO, format="%(asctime)s %(levelname)s %(message)s")
            
            # Adding the StreamHandler to record logs in the console.
            self.console_log = lg.StreamHandler()
            
            # setting level to the console log.
            self.console_log.setLevel(lg.INFO) 
            
            # defining format for the console log.
            self.format = lg.Formatter("%(levelname)s %(asctime)s %(message)s")
            self.console_log.setFormatter(self.format) 
            
            # adding handler to the console log.
            lg.getLogger('').addHandler(self.console_log) 
        
        except Exception as e:
            lg.info(e)
            
        else:
            lg.info("Log Class successfully executed!")

In [20]:
Log()
linear_mod = LModel()

INFO 2022-08-21 14:27:00,228 Log Class successfully executed!


In [21]:
linear_mod.build()

INFO 2022-08-21 14:27:00,242 readying the model..
INFO 2022-08-21 14:27:00,275 Model executed succesfully!


LinearRegression()

In [22]:
### Saving the model.

linear_mod.save()

INFO 2022-08-21 14:27:00,293 The model is saved sucessfully!


The model is saved sucessfully!


In [23]:
### 8. Compute the Model Accuracy 

In [24]:
linear_mod.accuracy()

INFO 2022-08-21 14:27:00,346 The model appears to be 77.5666013666106% accurate.


'The model appears to be 77.567 % accurate.'

### 9. 10 test cases and create a final report.

In [25]:
# dataframe report for 10 test cases:

repo = df.iloc[:,3:][0:0].reset_index(drop=True)
repo

Unnamed: 0,Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF


Here I'll generate those 10 test cases in an automated way by using **np.random.randint(df[column].min(), df[column].max()+1)**.<br><br>
In layman terms, I'm generating random datapoints of a specific column by picking the random values (using np.random.randint()) from between it's minimum and maximum values.

In [26]:
def generateAndMap(column, n=10):
    """
    This function generates random datapoints of a specific column of df by picking the random values 
    (using np.random.randint()) from between it's minimum and maximum values.
    And further map those datapoints to the same column of test cases report "repo".
    
    By default 10 number of datapoints will be generated.
    """
    l = []
    for i in range(n):
        x = np.random.randint(df[column].min(), df[column].max()+1)
        l.append(x)
    
    repo[column] = pd.Series(l)

In [27]:
df.columns

Index(['Product ID', 'Type', 'Air temperature [K]', 'Process temperature [K]',
       'Rotational speed [rpm]', 'Torque [Nm]', 'Tool wear [min]',
       'Machine failure', 'TWF', 'HDF', 'PWF', 'OSF', 'RNF'],
      dtype='object')

In [28]:
for i in df.columns[3:]: # picking only the desired feature columns from df
    generateAndMap(i)

In [29]:
repo # (not yet containing the predicted values)

Unnamed: 0,Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF
0,308,1285,22,140,0,1,0,0,1,0
1,313,2284,47,93,1,1,1,1,0,1
2,308,1544,21,221,1,1,1,0,1,1
3,307,1540,72,154,0,0,1,1,0,1
4,308,2064,44,8,1,1,1,0,1,0
5,310,2348,67,129,1,0,1,0,0,0
6,312,1882,73,211,0,1,0,0,1,0
7,307,1851,24,217,1,1,1,0,1,1
8,306,1424,31,135,0,0,1,1,0,0
9,307,2631,72,74,1,1,0,1,0,0


In [30]:
# let's now add the column conataining predicted values of the Air temperatures.

In [31]:
predicted_vals = []
for i in range(10):
    predicted_vals.append( linear_mod.predict_(repo['Process temperature [K]'].iloc[i], 
                                        repo['Rotational speed [rpm]'].iloc[i],
                                        repo['Torque [Nm]'].iloc[i],
                                        repo['Tool wear [min]'].iloc[i],
                                        repo['TWF'].iloc[i], repo['HDF'].iloc[i],
                                        repo['PWF'].iloc[i], repo['OSF'].iloc[i],
                                        repo['RNF'].iloc[i]) )

INFO 2022-08-21 14:27:00,515 readying the model..
INFO 2022-08-21 14:27:00,527 Model executed succesfully!
INFO 2022-08-21 14:27:00,539 readying the model..
INFO 2022-08-21 14:27:00,548 Model executed succesfully!
INFO 2022-08-21 14:27:00,550 readying the model..
INFO 2022-08-21 14:27:00,561 Model executed succesfully!
INFO 2022-08-21 14:27:00,563 readying the model..
INFO 2022-08-21 14:27:00,572 Model executed succesfully!
INFO 2022-08-21 14:27:00,575 readying the model..
INFO 2022-08-21 14:27:00,584 Model executed succesfully!
INFO 2022-08-21 14:27:00,586 readying the model..
INFO 2022-08-21 14:27:00,597 Model executed succesfully!
INFO 2022-08-21 14:27:00,598 readying the model..
INFO 2022-08-21 14:27:00,608 Model executed succesfully!
INFO 2022-08-21 14:27:00,610 readying the model..
INFO 2022-08-21 14:27:00,619 Model executed succesfully!
INFO 2022-08-21 14:27:00,621 readying the model..
INFO 2022-08-21 14:27:00,629 Model executed succesfully!
INFO 2022-08-21 14:27:00,631 readying

In [32]:
repo["predicted Air temperature [K]"] = pd.Series(predicted_vals)

In [33]:
repo # final report (10 test cases)

Unnamed: 0,Process temperature [K],Rotational speed [rpm],Torque [Nm],Tool wear [min],Machine failure,TWF,HDF,PWF,OSF,RNF,predicted Air temperature [K]
0,308,1285,22,140,0,1,0,0,1,0,297.652502
1,313,2284,47,93,1,1,1,1,0,1,305.618569
2,308,1544,21,221,1,1,1,0,1,1,299.430384
3,307,1540,72,154,0,0,1,1,0,1,298.24455
4,308,2064,44,8,1,1,1,0,1,0,299.574693
5,310,2348,67,129,1,0,1,0,0,0,301.831362
6,312,1882,73,211,0,1,0,0,1,0,302.480792
7,307,1851,24,217,1,1,1,0,1,1,298.315487
8,306,1424,31,135,0,0,1,1,0,0,297.093364
9,307,2631,72,74,1,1,0,1,0,0,296.925058
