# **Linear Regression with Scikit Learn - Machine Learning with Python**




## Problem Statement


> **QUESTION**: ACME Insurance Inc. offers affordable health insurance to thousands of customers all over the United States. As the lead data scientist at ACME, **you're tasked with creating an automated system to estimate the annual medical expenditure for new customers**, using information such as their age, sex, BMI, children, smoking habits and region of residence.
>
> Estimates from your system will be used to determine the annual insurance premium (amount paid every month) offered to the customer. Due to regulatory requirements, you must be able to explain why your system outputs a certain prediction.
>
> You're given a [CSV file](https://raw.githubusercontent.com/JovianML/opendatasets/master/data/medical-charges.csv) containing verified historical data, consisting of the aforementioned information and the actual medical charges incurred by over 1300 customers.

> <img src="https://i.imgur.com/87Uw0aG.png" width="480">





# **Step 1** ---- **Reading the Data**

In [None]:
# !pip install pandas
import pandas as pd

medical_df = pd.read_csv('medical.csv')  # Dataframe
medical_df

| | age | sex | bmi | children | smoker | region | charges |
|---|---|---|---|---|---|---|---|
| 0 | 19 | female | 27.900 | 0 | yes | southwest | 16884.92400 |
| 1 | 18 | male | 33.770 | 1 | no | southeast | 1725.55230 |
| 2 | 28 | male | 33.000 | 3 | no | southeast | 4449.46200 |
| 3 | 33 | male | 22.705 | 0 | no | northwest | 21984.47061 |
| 4 | 32 | male | 28.880 | 0 | no | northwest | 3866.85520 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1333 | 50 | male | 30.970 | 3 | no | northwest | 10600.54830 |
| 1334 | 18 | female | 31.920 | 0 | no | northeast | 2205.98080 |
| 1335 | 18 | female | 36.850 | 0 | no | southeast | 1629.83350 |
| 1336 | 21 | female | 25.800 | 0 | no | southwest | 2007.94500 |


# **Step 2** ---- **Making a basic model and improving manually**

In [None]:
# estimated_charges

In [None]:
# Linear model: y = m * x + c
# (y = estimated charges, x = age, m = slope, c = intercept)

In [None]:
# Define the target (y) and the input (x)
targets = medical_df.charges   # y
ages = medical_df.age          # x

# Define the model with hand-picked parameters:
# y = m * x + c, with m = 300 and c = 100
estimated_charges = ages * 300 + 100
print(estimated_charges)

# Compute the loss using RMSE (root mean squared error)
# !pip install numpy --quiet
import numpy as np


def rmse(targets, predictions):
    """Return the root mean squared error; a lower value means a better fit."""
    return np.sqrt(np.mean(np.square(targets - predictions)))


rmse(targets, estimated_charges)


0        5800
1        5500
2        8500
3       10000
4        9700
        ...  
1333    15100
1334     5500
1335     5500
1336     6400
1337    18400
Name: age, Length: 1338, dtype: int64


11652.334347112395
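
Improving the model manually means trying other values of `m` and `c` and keeping the pair with the lowest RMSE. The snippet below is a minimal sketch of that loop. It assumes only the five rows displayed in Step 1 rather than the full dataset, and the `(m, c)` guesses after the first are hypothetical values chosen for illustration.

```python
import math

# First five rows shown in Step 1 (a tiny illustrative sample, not the full data)
ages = [19, 18, 28, 33, 32]
charges = [16884.924, 1725.5523, 4449.462, 21984.47061, 3866.8552]


def rmse(targets, predictions):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def estimate_charges(ages, m, c):
    """Linear model: charges = m * age + c."""
    return [m * age + c for age in ages]


# Hypothetical (m, c) guesses; the first pair is the one used in the cell above
guesses = [(300, 100), (350, 0), (250, 2000), (400, -1000)]

results = {}
for m, c in guesses:
    results[(m, c)] = rmse(charges, estimate_charges(ages, m, c))
    print(f"m={m}, c={c}, RMSE={results[(m, c)]:.2f}")

# Keep the guess with the lowest loss
best_m, best_c = min(results, key=results.get)
print("Best guess so far:", best_m, best_c)
```

Guessing by hand quickly becomes tedious, which is the motivation for letting a library search for `m` and `c` automatically.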

# **Step 3** ---- **Using Linear regression from Sklearn**

In [None]:
medical_df[['age']]

| | age |
|---|---|
| 0 | 19 |
| 1 | 18 |
| 2 | 28 |
| 3 | 33 |
| 4 | 32 |
| ... | ... |
| 1333 | 50 |
| 1334 | 18 |
| 1335 | 18 |
| 1336 | 21 |


In [None]:
# Installation
# !pip install scikit-learn --quiet
from sklearn.linear_model import LinearRegression

# Create inputs and targets
inputs, targets = medical_df[['age']], medical_df.charges

# Create and train the model
model = LinearRegression().fit(inputs, targets)

# Generate predictions
predictions = model.predict(inputs)

# Compute loss to evaluate the model
loss = rmse(targets, predictions)
print('Loss:', loss)

# Slope (m)
print(model.coef_)
# Intercept (c); uncomment to print it
# print(model.intercept_)

Loss: 11551.66562075632
[257.72261867]
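
For a single input, the ordinary least squares line that `LinearRegression` fits has a simple closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept is `mean(y) - m * mean(x)`. The snippet below is a minimal sketch of that calculation, not scikit-learn's actual implementation. It assumes only the five rows displayed in Step 1, so its numbers will differ from the full-dataset coefficient printed above, and `fit_line` is a hypothetical helper name.

```python
# First five rows shown in Step 1 (a tiny illustrative sample, not the full data)
ages = [19, 18, 28, 33, 32]
charges = [16884.924, 1725.5523, 4449.462, 21984.47061, 3866.8552]


def fit_line(xs, ys):
    """Return (m, c) for the least squares line y = m * x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations of x (proportional to the variance)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


m, c = fit_line(ages, charges)
print("m:", m)
print("c:", c)
```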


In [None]:
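# Predict the charges for an 89-year-old customer.
# Passing a DataFrame with the training column name avoids a feature-name warning.
x = model.predict(pd.DataFrame({'age': [89]}))
print("charges :   ", x)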

charges :    [26103.19806742]




In [None]:
# To add a user interface, you could combine this Python model with Flask, HTML and CSS

In [None]:
# Create inputs and targets
inputs, targets = medical_df[['age', 'bmi']], medical_df['charges']

# Create and train the model
model = LinearRegression().fit(inputs, targets)

# Generate predictions
predictions = model.predict(inputs)

# Compute loss to evaluate the model
loss = rmse(targets, predictions)
print('Loss:', loss)

print(model.coef_)       # m1 (age), m2 (bmi)
print(model.intercept_)  # c

Loss: 11374.110466839007
[241.9307779  332.96509081]
-6424.804612240769


In [None]:
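# Predict the charges for a 10-year-old customer with a BMI of 25.6.
# Passing a DataFrame with the training column names avoids a feature-name warning.
x = model.predict(pd.DataFrame({'age': [10], 'bmi': [25.6]}))
print("charges ", x)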

charges  [4518.40949162]
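
The prediction above is just the linear formula `m1 * age + m2 * bmi + c` evaluated with the fitted parameters. As a quick check, the sketch below repeats that arithmetic by hand, using the coefficients and intercept copied from the printed output. The result should match the `model.predict` value to within the rounding of the printed coefficients.

```python
# Values copied from the output printed above
coefficients = [241.9307779, 332.96509081]   # m1 (age), m2 (bmi)
intercept = -6424.804612240769               # c

# Same inputs as the prediction above
age, bmi = 10, 25.6

# y = m1 * age + m2 * bmi + c
manual_prediction = coefficients[0] * age + coefficients[1] * bmi + intercept
print("charges", manual_prediction)
```

Because each coefficient multiplies one input directly, the model stays easy to explain: each extra year of age or unit of BMI changes the estimate by the corresponding coefficient.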


