

### Hands-on Lab - Saving and loading a SparkML model


#### Objectives:

In this lab you will:

*   Create a simple Linear Regression Model
*   Save the SparkML model
*   Load the SparkML model
*   Make predictions using the loaded SparkML model


#### Install pyspark


In [None]:
!pip install pyspark
!pip install findspark

#### Import libraries


In [None]:
import findspark
findspark.init()

In [None]:
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession

#### Creating the spark session and context


In [None]:
# Create a Spark context (getOrCreate avoids an error if one is already running)
sc = SparkContext.getOrCreate()

# Create a Spark session
spark = SparkSession \
    .builder \
    .appName("Saving and Loading a SparkML Model").getOrCreate()

#### Importing Spark ML libraries


In [None]:
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

#### Create a DataFrame with sample data


In [None]:
# Sample data set: infant height (cm) and weight (kg)
mydata = [[46,2.5],[51,3.4],[54,4.4],[57,5.1],[60,5.6],[61,6.1],[63,6.4]]

# Column names of the dataframe
columns = ["height", "weight"]

# Create the dataframe
mydf = spark.createDataFrame(mydata, columns)

# Show the dataframe
mydf.show()

#### Converting data frame columns into feature vectors

In this task we use the `VectorAssembler` transformer to combine the input columns of the dataframe into a single feature vector column.
In our example, the infant's height (in cm) is the only input feature and the weight (in kg) is the target label.


In [None]:
assembler = VectorAssembler(
    inputCols=["height"],
    outputCol="features")

data = assembler.transform(mydf).select('features','weight')

In [None]:
data.show()

#### Create and Train model

We create the model using the `LinearRegression` class and train it on the data with its `fit()` method.


In [None]:
# Create a LR model
lr = LinearRegression(featuresCol='features', labelCol='weight', maxIter=100)
lr.setRegParam(0.1)
# Fit the model
lrModel = lr.fit(data)
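
As a rough sanity check on what `fit()` estimates, the sketch below computes an ordinary least-squares line for the same seven data points in plain Python. It assumes no regularization, so the Spark model trained above with `regParam=0.1` may give slightly different values. The helper name `fit_line` is hypothetical and not part of any library.

```python
# Closed-form ordinary least squares for a single feature.
# Assumption: no regularization, unlike the Spark model above (regParam=0.1).
heights = [46, 51, 54, 57, 60, 61, 63]          # cm, the lab's sample data
weights = [2.5, 3.4, 4.4, 5.1, 5.6, 6.1, 6.4]   # kg

def fit_line(x, y):
    """Return (slope, intercept) of the line minimizing squared error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Spread of x around its mean, and co-variation of x with y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(heights, weights)
print(f"slope={slope:.4f} kg/cm, intercept={intercept:.4f} kg")
print(f"weight predicted at 70 cm: {slope * 70 + intercept:.2f} kg")
```

Comparing these numbers with `lrModel.coefficients` and `lrModel.intercept` is a quick way to confirm that the trained model is in the expected range.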

#### Save the model


In [None]:
lrModel.save('infantheight2.model')

#### Load the model


In [None]:
# You need LinearRegressionModel to load the model
from pyspark.ml.regression import LinearRegressionModel

In [None]:
model = LinearRegressionModel.load('infantheight2.model')

#### Make predictions


#### Predict the weight of an infant whose height is 70 cm.


In [None]:
# Wrap a single height (cm) in a one-row dataframe that the loaded model can score.
def predict(height):
    assembler = VectorAssembler(inputCols=["height"], outputCol="features")
    # The weight value 0 is only a placeholder; the model does not use it to predict.
    data = [[height, 0]]
    columns = ["height", "weight"]
    input_df = spark.createDataFrame(data, columns)
    features_df = assembler.transform(input_df).select('features', 'weight')
    predictions = model.transform(features_df)
    predictions.select('prediction').show()


In [None]:
predict(70)

### Practice exercises


#### Save the model as `babyweightprediction.model`


Double-click **here** for the solution.

<!-- Hint:

lrModel.save('babyweightprediction.model')
-->


#### Load the model `babyweightprediction.model`


Double-click **here** for the solution.

<!-- Hint:

model = LinearRegressionModel.load('babyweightprediction.model')
-->


#### Predict the weight of an infant whose height is 50 cm.


Double-click **here** for the solution.

<!-- Hint:

predict(50)
-->
