### Linear Regression

##### Step1: Load Datasets

In [50]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [51]:
data1 = pd.read_csv("Real estate.csv")

In [52]:
data1.head()

Unnamed: 0,No,X1 transaction date,X2 house age,X3 distance to the nearest MRT station,X4 number of convenience stores,X5 latitude,X6 longitude,Y house price of unit area
0,1,2012.917,32.0,84.87882,10,24.98298,121.54024,37.9
1,2,2012.917,19.5,306.5947,9,24.98034,121.53951,42.2
2,3,2013.583,13.3,561.9845,5,24.98746,121.54391,47.3
3,4,2013.5,13.3,561.9845,5,24.98746,121.54391,54.8
4,5,2012.833,5.0,390.5684,5,24.97937,121.54245,43.1


##### Step2: Separate input and output data

In [53]:
x = data1.loc[:,'X2 house age':'X6 longitude']
y = data1['Y house price of unit area']

##### Step3: Split datasets into Testing and Training set

In [54]:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=.2,random_state=0)

##### Scalling Of Data

In [62]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.fit_transform(x_test)

##### Step4: Model Selection

In [56]:
from sklearn.linear_model import LinearRegression
lr = LinearRegression()

##### Step5: Model Training

In [57]:
lr.fit(x_train,y_train)

##### Step6: Prediction On Training Data

In [58]:
y_pred_train = lr.predict(x_train)

##### Step7: Prediction On Testing Data

In [59]:
y_pred_test = lr.predict(x_test)

##### Step8: Model Evolution

In [61]:
from sklearn.metrics import mean_absolute_error,r2_score
print(f"The MAE of Train : {mean_absolute_error(y_train,y_pred_train)}")
print(f"The MAE of Test : {mean_absolute_error(y_test,y_pred_test)}")
print(f"The r2 Score of Train : {r2_score(y_train,y_pred_train)}")
print(f"The r2 Score of Test : {r2_score(y_test,y_pred_test)}")

The MAE of Train : 6.301172552656106
The MAE of Test : 5.701176584068512
The r2 Score of Train : 0.5544852640330532
The r2 Score of Test : 0.6421477735370715


................................................................................................................................................................................................................................................................................................................................

### Logistic Regression

##### Step1: Load Datasets

In [63]:
data2 = pd.read_csv("heart.csv")
data2.head()

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,52,1,0,125,212,0,1,168,0,1.0,2,2,3,0
1,53,1,0,140,203,1,0,155,1,3.1,0,0,3,0
2,70,1,0,145,174,0,1,125,1,2.6,0,0,3,0
3,61,1,0,148,203,0,1,161,0,0.0,2,1,3,0
4,62,0,0,138,294,1,1,106,0,1.9,1,3,2,0


- age – Age of the patient (in years)
- sex – Gender (1 = male, 0 = female)
- cp – Chest pain type (0–3, where 0 = typical angina, 1 = atypical angina, 2 = non-anginal pain, 3 = asymptomatic)
- trestbps – Resting blood pressure (in mm Hg)
- chol – Serum cholesterol (in mg/dl)
- fbs – Fasting blood sugar > 120 mg/dl (1 = true, 0 = false)
- restecg – Resting electrocardiographic results (0 = normal, 1 = ST-T wave abnormality, 2 = left ventricular hypertrophy)
- thalach – Maximum heart rate achieved
- exang – Exercise-induced angina (1 = yes, 0 = no)
- oldpeak – ST depression induced by exercise relative to rest
- slope – Slope of the peak exercise ST segment (0 = upsloping, 1 = flat, 2 = downsloping)
- ca – Number of major vessels (0–3) colored by fluoroscopy
- thal – Thalassemia (1 = normal, 2 = fixed defect, 3 = reversible defect)
- target – Presence of heart disease (1 = yes, 0 = no)

##### 🔹 Commonly important features:
- Feature	Reason
  - cp (chest pain)	Directly related to heart disease symptoms
  - thalach (max heart rate)	Indicates heart performance under stress
  - exang (exercise angina)	Whether patient experiences chest pain during exercise
  - oldpeak	Measures depression in ECG - strong indicator
  - slope	Related to ECG curve - shows abnormality
  - ca (number of major vessels)	Direct medical relevance
  - thal	Related to heart scan results

##### Step2: Separate input and output data

In [64]:
x = data2[['cp','thalach','exang','oldpeak','slope','ca','thal']]
y = data2['target']

##### Step3: Split datasets into Testing and Training set

In [65]:
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=.2,random_state=0)

##### Data Scalling

In [66]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.fit_transform(x_test)

##### Step4: Model Selection

In [67]:
from sklearn.linear_model import LogisticRegression
lgr = LogisticRegression()

##### Step5: Model Training

In [68]:
lgr.fit(x_train,y_train)

##### Step6: Prediction On Training Data

In [73]:
y_pred_train = lgr.predict(x_train)

##### Step7: Prediction On Testing Data

In [74]:
y_pred_test = lgr.predict(x_test)

##### Step8: Model Evolution

##### Accuracy Score

In [87]:
from sklearn.metrics import accuracy_score
print(f"Accuracy Score Of Train :{accuracy_score(y_train,y_pred_train)}")
print(f"Accuracy Score Of Test :{accuracy_score(y_test,y_pred_test)}")

Accuracy Score Of Train :0.8426829268292683
Accuracy Score Of Test :0.8585365853658536


##### Confusion Matrix

In [91]:
from sklearn.metrics import confusion_matrix
print(f"Confusion Metrics Of Train \n {confusion_matrix(y_train,y_pred_train)}")
print(f"Confusion Metrics Of Train \n {confusion_matrix(y_test,y_pred_test)}")

Confusion Metrics Of Train 
 [[318  83]
 [ 46 373]]
Confusion Metrics Of Train 
 [[79 19]
 [10 97]]


................................................................................................................................................................................................................................................................................................................................