## Question 1:

In a video game, a meteor will fall toward the main character's home planet.<br>
Given the meteor's trajectory as a string in the form y = mx + b and the character's position as a tuple of (x, y),<br>
return True if the meteor will hit the character and False if it will not.<br>

 - **Examples**<br>
   will_hit("y = 2x - 5", (0, 0)) ➞ False<br>
   will_hit("y = -4x + 6", (1, 2)) ➞ True<br>
   will_hit("y = 2x + 6", (3, 2)) ➞ False<br>

- **Notes**<br>
  The b value will never be zero or blank.<br>
  The m value will always be an integer.<br>
  If the m value is 1, the "1" will be shown.<br>
  For example, "y = x + 5" will be shown as "y = 1x + 5".<br>
  If the m value is -1, the "-1" will be shown.<br>
  For example, "y = -x + 2" will be shown as "y = -1x + 2".<br>

## Solution:

In [1]:
def sign(string: str) -> int:
    
    # int() accepts an optional leading "+" or "-" and any number of digits,
    # so it covers negative, explicitly positive, and unsigned values alike,
    # including multi-digit ones such as "-12" or "+30"
    return int(string)


def will_hit(trajectory: str = "y = 2x - 5", position: tuple = (0, 0)) -> bool:
    
    # Split the equation to get the right-hand side, then split again to separate m and b
    RHS = trajectory.split(' = ')[1]
    RHS = RHS.split(' ', 1)
    
    # Take the m term, drop the trailing "x", then convert the signed number to int
    m = RHS[0][:-1]
    m = sign(m)
    
    # Take the b term, remove the space between sign and digits, then convert to int
    b = RHS[1].replace(" ", "")
    b = sign(b)
    
    # Unpack the tuple into separate variables for easier access
    x, y = position
    
    # Check whether the position lies on the given trajectory and return the boolean
    hit = (m*x + b == y)
    
    return hit

In [2]:
will_hit('y = -1x + 2', (3, 2))

False
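The string splitting in the solution can also be expressed as a single regular expression, which validates the input format while extracting `m` and `b`. The following is only a sketch: the names `parse_line` and `will_hit_regex` are hypothetical, and the pattern assumes exactly the `y = mx + b` layout described in the question (integer `m`, a spaced `+` or `-`, and a non-blank `b`).

```python
import re
from typing import Tuple

# Assumed input layout: "y = <m>x <+|-> <b>", e.g. "y = -4x + 6"
_LINE_PATTERN = re.compile(r"^y = (-?\d+)x ([+-]) (\d+)$")


def parse_line(trajectory: str) -> Tuple[int, int]:
    """Return (m, b) parsed from a trajectory string."""
    match = _LINE_PATTERN.match(trajectory)
    if match is None:
        raise ValueError(f"Unsupported trajectory format: {trajectory!r}")
    m = int(match.group(1))
    # Join the sign and the digits so int() applies the sign to b
    b = int(match.group(2) + match.group(3))
    return m, b


def will_hit_regex(trajectory: str, position: Tuple[int, int]) -> bool:
    """Return True if the position lies on the line y = mx + b."""
    m, b = parse_line(trajectory)
    x, y = position
    return m * x + b == y


print(will_hit_regex("y = 2x - 5", (0, 0)))    # False
print(will_hit_regex("y = -4x + 6", (1, 2)))   # True
print(will_hit_regex("y = 12x - 30", (3, 6)))  # True
```

Raising `ValueError` on a non-matching string is a design choice of this sketch; the question itself does not say how malformed input should be handled.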

## Question 2

Using object-oriented programming concepts, create an object model of a chicken (a bird).

## Solution:

In [3]:
class Bird():
    def __init__(self, feather_colour, beak_colour):
        self.feather_colour = feather_colour
        self.beak_colour = beak_colour
        # Encapsulation: underscore-prefixed attribute, exposed read-only through the sing property
        self._sing = "KAaaaa"
        
    def name(self):
        print("Bird")
        
    @property
    def sing(self):
        return self._sing
    
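    # Abstraction: the base class only declares fly(); subclasses supply the behaviour
    def fly(self):
        """Return whether the bird can fly; must be overridden by subclasses."""
        raise NotImplementedError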
        
class Chicken(Bird):
    
    # Inheritance
    def __init__(self, feather_colour, beak_colour, flying):
        super().__init__(feather_colour, beak_colour)
        self.flying = flying

    @property
    def sing(self):
        return super().sing
    
    # Polymorphism
    def name(self):
        print("Chicken")
    
    def fly(self):
        return self.flying

In [4]:
chicken = Chicken("White", "Yellow", False)
chicken.name()
print(chicken.fly())
print(chicken.sing)

Chicken
False
KAaaaa
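Python's standard `abc` module offers a stricter way to express the abstraction idea, because a class with an unimplemented abstract method cannot be instantiated at all. The sketch below is an illustration rather than part of the original solution; the class names `AbstractBird` and `Hen` are hypothetical.

```python
from abc import ABC, abstractmethod


class AbstractBird(ABC):
    """Hypothetical base class: fly() must be implemented by subclasses."""

    def __init__(self, feather_colour: str, beak_colour: str):
        self.feather_colour = feather_colour
        self.beak_colour = beak_colour

    @abstractmethod
    def fly(self) -> bool:
        """Return True if this bird can fly."""


class Hen(AbstractBird):
    def __init__(self, feather_colour: str, beak_colour: str, flying: bool = False):
        super().__init__(feather_colour, beak_colour)
        self.flying = flying

    def fly(self) -> bool:
        return self.flying


hen = Hen("White", "Yellow")
print(hen.fly())  # False

# Instantiating the abstract base class directly raises TypeError
try:
    AbstractBird("Grey", "Black")
except TypeError as error:
    print("Cannot instantiate:", type(error).__name__)
```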


## Question 3

**About the data:**
The data in `Data.csv` consists of normalized process data from a steel mill.<br>
Each row in the dataset is a coil processed in the steel mill.<br>


- **File Data.csv:**
    - Meta information columns:
        - "coil_id" - ID of the coil.
        - "steelgrade" - Type of steel grade.
        - "material_spec" - Type of sub-material for the steel grade.
        - "cgl_production_start" - Coil production time.
- **Target column:**
    - "mechanical_target_1" - The mechanical property measured at the end of the whole process.
- **Other columns are measured process parameters (we normalized the data to make it anonymous).**
- **Goal:**
    - Develop a machine learning model that predicts the mechanical property ("mechanical_target_1") using the measured process parameters.
    - Explain the black box model.

### (You will be evaluated based on the approach you take to solve the problem and programming skills)

## Solution:

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

In [2]:
# Reading Data from csv file
df = pd.read_csv('Data.csv', index_col='coil_id')
df.drop(df.columns[0], axis=1, inplace=True)
df.head(10)

coil_id,steelgrade,material_spec,cgl_production_start,hsm_temp_strip_coiling_meas_tail,hsm_temp_strip_exit_fm_meas_tail,ccm_casting_speed_tail,hsm_thickness_reduction_ratio_f1,hsm_thickness_reduction_ratio_f5,hsm_cooling_strategy,tcm_deformation_total,...,cgl_elong_spm_tail,cgl_elong_tl_tail,cgl_thick_exit_meas_tail,cgl_wr_force_spm_head,cgl_wr_force_spm_body,cgl_wr_force_spm_tail,cgl_wr_force_spm_spec_head,cgl_wr_force_spm_spec_body,cgl_wr_force_spm_spec_tail,mechanical_target_1
1,Grade_0,Material_0,2/9/18 0:59,,,,,,,0.903685,...,0.303654,0.0,0.396751,0.382642,0.407547,0.379913,0.374225,0.378177,0.337639,49.448726
2,Grade_1,Material_1,3/19/18 11:54,0.828864,0.944483,0.798438,0.84058,0.838791,0.0625,0.766175,...,0.300012,0.0,0.400495,0.209215,0.199375,0.203155,0.260019,0.235105,0.229439,36.161856
3,Grade_0,Material_0,3/14/18 18:49,0.828903,0.929989,0.798438,0.818035,0.884131,0.0625,0.945814,...,0.306589,0.0,0.268265,0.393983,0.337955,0.319281,0.484284,0.394149,0.356636,45.067861
4,Grade_0,Material_0,2/2/18 4:25,0.821872,0.928737,0.85107,0.595813,0.921914,0.0625,,...,0.424863,0.234736,0.576916,0.262058,0.292323,0.301225,0.348425,0.368769,0.363942,45.806417
5,Grade_0,Material_0,2/2/18 4:41,0.829637,0.92801,0.861352,0.600644,0.871537,0.0625,0.790953,...,0.413622,0.263496,0.570831,0.281178,0.304132,0.292848,0.374319,0.38415,0.354267,45.603351
6,Grade_0,Material_0,1/19/18 18:59,0.828349,0.928464,0.8625,0.697262,0.845088,0.0625,0.923319,...,0.427701,0.0,0.316288,0.378552,0.351742,0.348914,0.498364,0.439363,0.417415,48.216325
7,Grade_0,Material_2,2/2/18 11:44,0.825572,0.931917,0.690127,0.695652,0.867758,0.0625,0.826691,...,0.427402,0.271849,0.540012,0.478706,0.45411,0.448843,0.468176,0.421385,0.398899,51.696909
8,Grade_0,Material_2,2/2/18 11:59,0.819762,0.932128,0.744348,0.702093,0.882872,0.0625,0.82644,...,0.422716,0.233502,0.54396,0.475579,0.460355,0.445401,0.465117,0.42718,0.39584,48.522078
9,Grade_0,Material_2,2/2/18 12:15,0.814527,0.927129,0.788522,0.697262,0.879093,0.0625,0.826267,...,0.420789,0.224165,0.545763,0.476122,0.458174,0.441442,0.465648,0.425157,0.392322,48.234379
10,Grade_0,Material_2,2/2/18 12:32,0.820807,0.925844,0.798438,0.68277,0.88665,0.0625,0.826188,...,0.421605,0.106452,0.543703,0.464229,0.457221,0.439262,0.454017,0.424272,0.390384,48.051661


In [3]:
# Raw data
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 31914 entries, 1 to 31914
Data columns (total 45 columns):
 #   Column                            Non-Null Count  Dtype  
---  ------                            --------------  -----  
 0   steelgrade                        31914 non-null  object 
 1   material_spec                     31914 non-null  object 
 2   cgl_production_start              31914 non-null  object 
 3   hsm_temp_strip_coiling_meas_tail  31750 non-null  float64
 4   hsm_temp_strip_exit_fm_meas_tail  31750 non-null  float64
 5   ccm_casting_speed_tail            31749 non-null  float64
 6   hsm_thickness_reduction_ratio_f1  31750 non-null  float64
 7   hsm_thickness_reduction_ratio_f5  31750 non-null  float64
 8   hsm_cooling_strategy              31750 non-null  float64
 9   tcm_deformation_total             29887 non-null  float64
 10  tund_chem_carbon                  31647 non-null  float64
 11  tund_chem_aluminium               31647 non-null  float64
 12  tund

In [4]:
# Cleaning data: drop rows with NaN values, drop rows with negative measurements,
# and move the date column to the front
df.dropna(inplace=True)
non_negative_cols = ['cgl_thick_exit_meas_tail', 'cgl_wr_force_spm_head', 'cgl_wr_force_spm_body',
                     'cgl_wr_force_spm_spec_head', 'cgl_wr_force_spm_spec_body']
df = df[(df[non_negative_cols] >= 0).all(axis=1)].copy()
df.insert(0, 'cgl_production_start', df.pop('cgl_production_start'))

In [5]:
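# Changing column data types
df['cgl_production_start'] = pd.to_datetime(df['cgl_production_start'], format='%m/%d/%y %H:%M')
# Encode the labels by their numeric suffix, e.g. "Grade_1" -> 1.0.
# x[-1] keeps only the last character, so this assumes single-digit suffixes.
df['steelgrade'] = df['steelgrade'].apply(lambda x: float(x[-1]))
df['material_spec'] = df['material_spec'].apply(lambda x: float(x[-1]))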
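Converting the grade and material labels to numbers gives them an order and a magnitude, which a linear model will treat as meaningful. One common alternative is one-hot encoding with `pd.get_dummies`, sketched here on a few made-up rows. Only the column names come from the notebook; the frame `toy` and its values are assumptions for illustration.

```python
import pandas as pd

# Toy frame with made-up values; only the column names mirror the notebook
toy = pd.DataFrame({
    "steelgrade": ["Grade_0", "Grade_1", "Grade_0"],
    "material_spec": ["Material_0", "Material_2", "Material_2"],
    "tcm_deformation_total": [0.90, 0.77, 0.83],
})

# One indicator column per category, so no artificial ordering is implied
encoded = pd.get_dummies(toy, columns=["steelgrade", "material_spec"], dtype=float)
print(encoded.columns.tolist())
print(encoded["steelgrade_Grade_0"].tolist())
```

Whether this improves the model on `Data.csv` is not something the notebook measures, so it would need to be checked against the reported scores.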

In [6]:
# Cleaned data
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 29625 entries, 2 to 31914
Data columns (total 45 columns):
 #   Column                            Non-Null Count  Dtype         
---  ------                            --------------  -----         
 0   cgl_production_start              29625 non-null  datetime64[ns]
 1   steelgrade                        29625 non-null  float64       
 2   material_spec                     29625 non-null  float64       
 3   hsm_temp_strip_coiling_meas_tail  29625 non-null  float64       
 4   hsm_temp_strip_exit_fm_meas_tail  29625 non-null  float64       
 5   ccm_casting_speed_tail            29625 non-null  float64       
 6   hsm_thickness_reduction_ratio_f1  29625 non-null  float64       
 7   hsm_thickness_reduction_ratio_f5  29625 non-null  float64       
 8   hsm_cooling_strategy              29625 non-null  float64       
 9   tcm_deformation_total             29625 non-null  float64       
 10  tund_chem_carbon                  29625 non-nu

In [7]:
df.describe()

,steelgrade,material_spec,hsm_temp_strip_coiling_meas_tail,hsm_temp_strip_exit_fm_meas_tail,ccm_casting_speed_tail,hsm_thickness_reduction_ratio_f1,hsm_thickness_reduction_ratio_f5,hsm_cooling_strategy,tcm_deformation_total,tund_chem_carbon,...,cgl_elong_spm_tail,cgl_elong_tl_tail,cgl_thick_exit_meas_tail,cgl_wr_force_spm_head,cgl_wr_force_spm_body,cgl_wr_force_spm_tail,cgl_wr_force_spm_spec_head,cgl_wr_force_spm_spec_body,cgl_wr_force_spm_spec_tail,mechanical_target_1
count,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,...,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0,29625.0
mean,0.986937,3.791257,0.839418,0.930502,0.777481,0.712474,0.869229,0.291808,0.874055,0.352984,...,0.208775,0.033964,0.361446,0.25515,0.260227,0.253411,0.231112,0.224171,0.208352,60.414337
std,1.529615,3.001199,0.032651,0.008247,0.070997,0.07777,0.043288,0.402518,0.085588,0.32322,...,0.156707,0.06977,0.157185,0.177456,0.164561,0.166219,0.159865,0.140667,0.135729,20.935711
min,0.0,0.0,0.667799,0.872324,0.0,0.563607,0.715365,0.0,0.498998,0.006935,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.024396
25%,0.0,0.0,0.820396,0.927909,0.735937,0.658615,0.84131,0.0625,0.792659,0.142874,...,0.0,0.0,0.2189,0.101341,0.15177,0.124826,0.096037,0.138505,0.106149,48.357751
50%,0.0,4.0,0.828395,0.931279,0.7746,0.711755,0.869018,0.0625,0.898596,0.168743,...,0.227609,0.0,0.330932,0.292483,0.288976,0.277967,0.261767,0.251506,0.228914,51.580115
75%,2.0,7.0,0.857046,0.934382,0.819103,0.768116,0.897985,0.0625,0.92412,0.783801,...,0.308379,0.006853,0.503622,0.384531,0.368211,0.362454,0.344807,0.313573,0.29329,63.812811
max,9.0,9.0,1.0,1.0,1.0,0.988728,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,0.972326,0.986631,1.0,0.893787,0.843042,123.178979


In [8]:
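# Features: every column between the date column (position 0) and the target (last position)
X = df.iloc[:, 1:44]

# Target
Y = df[['mechanical_target_1']]

# Splitting data into training (60%) and testing (40%) portions
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.4, random_state=100)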

In [9]:
# Multiple linear regression
LR = LinearRegression()

# Fit by ordinary least squares on the training split
LR.fit(X_train, y_train)

# Predict using the linear model
y_pred = LR.predict(X_test)
print(X_test)  # test dataset
print(y_pred)  # predicted values

         steelgrade  material_spec  hsm_temp_strip_coiling_meas_tail  \
coil_id                                                                
26797           0.0            7.0                          0.813022   
1623            0.0            0.0                          0.829654   
5346            0.0            2.0                          0.816173   
7757            0.0            4.0                          0.826565   
22817           0.0            7.0                          0.812231   
...             ...            ...                               ...   
9531            2.0            3.0                          0.878603   
7771            0.0            4.0                          0.830162   
9412            2.0            5.0                          0.859817   
21771           0.0            4.0                          0.818263   
8630            0.0            0.0                          0.816049   

         hsm_temp_strip_exit_fm_meas_tail  ccm_casting_speed_ta

In [10]:
# coefficient of determination
score = r2_score(y_test,y_pred)

print('R2 Score = ',score)
print('Mean Squared Error = ',mean_squared_error(y_test,y_pred))
print('Root Mean Squared Error = ',np.sqrt(mean_squared_error(y_test,y_pred)))

R2 Score =  0.9360828634526998
Mean Squared Error =  28.249060215231566
Root Mean Squared Error =  5.31498449811771
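The second goal, explaining the model, can be approached by inspecting the fitted coefficients and by measuring permutation importance, that is, how much the R2 score drops when one feature column is shuffled. The sketch below uses synthetic data with three made-up features (`f0`, `f1`, `f2`) and known weights, because `Data.csv` is not reproduced here; it illustrates the technique and is not a result for the steel mill data.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the process data: the target depends strongly on f0,
# weakly on f1, and not at all on f2 (all names and weights are assumptions)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)
feature_names = ["f0", "f1", "f2"]

model = LinearRegression().fit(X, y)

# Coefficients: change in the prediction per unit change of each feature
for name, coef in zip(feature_names, model.coef_):
    print(f"{name}: coefficient = {coef:.3f}")

# Permutation importance: mean drop in R2 when one feature column is shuffled
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranked = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: mean importance = {importance:.3f}")
```

In the notebook, `Y` is a one-column DataFrame, so `LR.coef_` has shape `(1, n_features)` and would need `LR.coef_[0]` to pair with `X.columns`. Computing the importance on `X_test` and `y_test` rather than on the training data would give a view of what matters for unseen coils.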
