<a href="https://colab.research.google.com/github/mukkatharun/XGBoostChildWeightPrediction/blob/main/XGBoostChildWeightPrediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import pandas as pd
import xgboost as xgb
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.utils import shuffle
from google.cloud import bigquery

In [None]:
query="""
SELECT
  weight_pounds,
  is_male,
  mother_age,
  plurality,
  gestation_weeks
FROM
  publicdata.samples.natality
WHERE year > 2007
LIMIT 1000
"""
df = bigquery.Client().query(query).to_dataframe()
df.head()

Unnamed: 0,weight_pounds,is_male,mother_age,plurality,gestation_weeks
0,8.161513,True,29,1,39
1,7.828615,True,33,1,41
2,7.18707,True,30,1,39
3,8.631098,False,33,1,40
4,6.74835,True,33,2,38


In [None]:
df.describe()

Unnamed: 0,weight_pounds,mother_age,plurality,gestation_weeks
count,1000.0,1000.0,1000.0,1000.0
mean,7.25831,27.457,1.023,38.705
std,1.257613,6.166537,0.15651,2.39911
min,1.102311,14.0,1.0,22.0
25%,6.624891,22.0,1.0,38.0
50%,7.36344,27.0,1.0,39.0
75%,7.998922,32.0,1.0,40.0
max,10.875403,46.0,3.0,47.0


In [None]:
df['is_male'].value_counts()

False    501
True     499
Name: is_male, dtype: int64

Clean and shuffle the data, then extract the label column

In [None]:
df = df.dropna()
df = shuffle(df, random_state=2)

In [None]:
labels = df['weight_pounds']
data = df.drop(columns=['weight_pounds'])

In [None]:
data['is_male'] = data['is_male'].astype(int)

Split data into train and test sets

In [None]:
x,y = data,labels
x_train,x_test,y_train,y_test = train_test_split(x,y)
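
Called with no extra arguments, `train_test_split` shuffles the rows and holds out 25% of them for testing, and the shuffle differs on every run unless `random_state` is set. The sketch below mimics that behaviour in plain Python to make the mechanics visible. `simple_split`, `test_fraction`, and `seed` are hypothetical names used only for illustration; this is not scikit-learn's implementation.

```python
import math
import random


def simple_split(rows, test_fraction=0.25, seed=None):
    """Shuffle row indices and hold out a fraction of them for testing."""
    indices = list(range(len(rows)))
    # A seeded generator makes the shuffle (and therefore the split) repeatable.
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(len(rows) * test_fraction)
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


rows = list(range(20))  # stand-in for 20 data rows
train, test = simple_split(rows, seed=2)
print(len(train), len(test))  # prints: 15 5
```

In the notebook itself, the equivalent change would be passing `random_state` to `train_test_split` so that the evaluation numbers can be reproduced.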

Build and train the model

In [None]:
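# 'reg:linear' is deprecated in newer XGBoost releases in favour of
# 'reg:squarederror'; keep whichever name the serving runtime supports.
model = xgb.XGBRegressor(
    objective='reg:linear'
)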

In [None]:
model.fit(x_train, y_train)


XGBRegressor()

Evaluate your model on test data

In [None]:
y_pred = model.predict(x_test)

In [None]:
for i in range(5):
    print('Predicted weight: ', y_pred[i])
    print('Actual weight: ', y_test.iloc[i])
    print()

Predicted weight:  7.9718227
Actual weight:  7.31273323054

Predicted weight:  7.2785263
Actual weight:  6.35372239084

Predicted weight:  7.4629817
Actual weight:  6.9225150268

Predicted weight:  8.111164
Actual weight:  7.12313568522

Predicted weight:  7.6964064
Actual weight:  10.24929056038
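
Printing a few pairs gives a feel for the predictions, but a regression model is usually summarised with an error metric rather than `accuracy_score`, which is meant for classification. As a minimal sketch, the snippet below computes the mean absolute error and root mean squared error over the five pairs printed above. Five rows are far too few to judge the model; the same functions would normally be applied to all of `y_test` and `y_pred`. The helper names `mae` and `rmse` are introduced here for illustration.

```python
import math

# The five predicted/actual pairs printed above, in pounds.
predicted = [7.9718227, 7.2785263, 7.4629817, 8.111164, 7.6964064]
actual = [7.31273323054, 6.35372239084, 6.9225150268, 7.12313568522, 10.24929056038]


def mae(y_true, y_hat):
    """Mean absolute error: average size of the miss, in pounds."""
    return sum(abs(t - p) for t, p in zip(y_true, y_hat)) / len(y_true)


def rmse(y_true, y_hat):
    """Root mean squared error: penalises large misses more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_hat)) / len(y_true))


sample_mae = mae(actual, predicted)
sample_rmse = rmse(actual, predicted)
print(f"MAE:  {sample_mae:.3f} lb")
print(f"RMSE: {sample_rmse:.3f} lb")
```

The RMSE comes out larger than the MAE here because the last row misses by about 2.5 pounds, and squaring amplifies that single large error.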



In [None]:
model.save_model('model.bst')

Configure the project, bucket, model, and version names

In [4]:
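# Project ID passed to --project in the gcloud commands below
# (the display name 'Data Marvels- Child weight prediction' is not a valid project ID).
GCP_PROJECT = 'end2endcodelabsxgboost'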
MODEL_BUCKET = 'gs://childweightprediction'
VERSION_NAME = 'v1'
MODEL_NAME = 'baby_weight'

Create the bucket

In [None]:
!gsutil mb $MODEL_BUCKET

Creating gs://childweightprediction/...


Copy the model to the created bucket

In [None]:
!gsutil cp ./model.bst $MODEL_BUCKET

Copying file://./model.bst [Content-Type=application/octet-stream]...
/ [1 files][ 65.0 KiB/ 65.0 KiB]                                                
Operation completed over 1 objects/65.0 KiB.                                     


Deploy the model

In [None]:
!gcloud ai-platform models create $MODEL_NAME --region=us-west1 --project=end2endcodelabsxgboost


Using endpoint [https://us-west1-ml.googleapis.com/]
ERROR: (gcloud.ai-platform.models.create) Resource in projects [end2endcodelabsxgboost] is the subject of a conflict: Field: model.name Error: A model with the same name already exists.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: A model with the same name already exists.
    field: model.name


In [None]:
!gcloud ai-platform versions create $VERSION_NAME \
--model=$MODEL_NAME \
--framework='XGBOOST' \
--runtime-version=1.15 \
--origin=$MODEL_BUCKET \
--python-version=3.7 \
--project=end2endcodelabsxgboost \
--region=us-west1

Using endpoint [https://us-west1-ml.googleapis.com/]
Creating version (this might take a few minutes)......done.                    


In [None]:
!gcloud projects list

API [cloudresourcemanager.googleapis.com] not enabled on project [651437649487].
 Would you like to enable and retry (this will take a few minutes)? (y/N)?  ^C


Command killed by keyboard interrupt



In [None]:
!gcloud config set project 651437649487

Updated property [core/project].


In [2]:
%%writefile predictions.json
[0.0, 33.0, 1.0, 27.0]
[1.0, 26.0, 1.0, 40.0]

Writing predictions.json
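
The file passed to `--json-instances` is newline-delimited: each line is one instance, and the values must follow the column order the model was trained on (`is_male`, `mother_age`, `plurality`, `gestation_weeks`, as left in `data` after dropping `weight_pounds`). The sketch below builds the same two lines from named records so the ordering cannot drift. `FEATURE_ORDER`, `to_instance`, and `to_json_lines` are hypothetical helpers, not part of any Google Cloud library.

```python
import json

# Column order of `data` after dropping weight_pounds (taken from the query above).
FEATURE_ORDER = ["is_male", "mother_age", "plurality", "gestation_weeks"]


def to_instance(record):
    """Turn a dict of named features into the ordered list of floats the model expects."""
    return [float(record[name]) for name in FEATURE_ORDER]


def to_json_lines(records):
    """Serialise records as newline-delimited JSON, one instance per line."""
    return "\n".join(json.dumps(to_instance(r)) for r in records) + "\n"


records = [
    {"is_male": False, "mother_age": 33, "plurality": 1, "gestation_weeks": 27},
    {"is_male": True, "mother_age": 26, "plurality": 1, "gestation_weeks": 40},
]
payload = to_json_lines(records)
print(payload, end="")
```

Writing `payload` to `predictions.json` would give the same file contents as the `%%writefile` cell above.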


In [3]:
prediction = !gcloud ai-platform predict --project=end2endcodelabsxgboost --model=$MODEL_NAME --region=us-west1 --json-instances=predictions.json --version=$VERSION_NAME
print(prediction.s)

Using endpoint [https://us-west1-ml.googleapis.com/] [2.1842293739318848, 7.8623223304748535]
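
`prediction.s` is a single string that mixes the "Using endpoint" notice with the predicted weights. To work with the numbers, one option is to pull out the trailing bracketed list and parse it as JSON. This is a minimal sketch that assumes the output ends with a flat list of numbers, as it does above; `parse_predictions` is a hypothetical helper name.

```python
import json


def parse_predictions(raw_output):
    """Extract the last bracketed list from the captured gcloud output."""
    # Assumes a flat list: the final '[' opens the predictions, the final ']' closes them.
    start = raw_output.rindex("[")
    end = raw_output.rindex("]") + 1
    return json.loads(raw_output[start:end])


# The captured output shown above.
raw = ("Using endpoint [https://us-west1-ml.googleapis.com/] "
       "[2.1842293739318848, 7.8623223304748535]")
predictions = parse_predictions(raw)
for weight in predictions:
    print(f"Predicted weight: {weight:.2f} lb")
```

The first instance uses a gestation of 27 weeks, which is why its predicted weight is so much lower than the second.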


In [None]:
from google.cloud import aiplatform

endpoint = aiplatform.Endpoint(
    endpoint_name="projects/651437649487/locations/us-west1/endpoints/6789374994573930909"
)

In [6]:
# Feature order: is_male, mother_age, plurality, gestation_weeks
test = [0.0, 33.0, 1.0, 27.0]
response = endpoint.predict([test])

print("The weight of the baby is : ", response.predictions[0][0])

The weight of the baby is :  2.1842293739318848
