# Project 1.) First Cloud Function

### Description : Deploy a cloud function that takes a string of digits and returns a JSON object containing the sum of all of the single digits

#### Example : input ="12345"
#### output = 1+2+3+4+5 = 15
#### returns({"answer":15})
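The source of the deployed function is not shown in this notebook. Below is a minimal sketch of the digit-summing logic it presumably runs, assuming the request body is a JSON object with an `input` key, as in the examples. The names `sum_single_digits` and `handle_payload` are illustrative and are not taken from the deployed code.

```python
def sum_single_digits(digits: str) -> int:
    """Add up every decimal digit in the string, ignoring other characters."""
    return sum(int(ch) for ch in digits if ch in "0123456789")


def handle_payload(payload: dict) -> dict:
    """Mimic the response shape from the example: {"answer": <sum>}."""
    return {"answer": sum_single_digits(str(payload.get("input", "")))}


print(handle_payload({"input": "12345"}))  # {'answer': 15}
```

In a real deployment this logic would sit behind an HTTP entry point; only the pure computation is sketched here so that it can be run locally.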

In [7]:
import requests
import json

url = "https://us-central1-cloud-computing-functions.cloudfunctions.net/sum_single_digits"

# Example input
input_data = {"input": "12345"}

response = requests.post(url, json=input_data)

# Print the response
print(response.json())

{'answer': 15}


## 1.b.) Query your cloud function using requests with the example inputs "012937", "2", and "9999999999999"

In [10]:
# Function to query the cloud function
def query_cloud_function(input_string):
    input_data = {"input": input_string}
    response = requests.post(url, json=input_data)
    return response.json()

# Example inputs
inputs = ["012937", "2", "9999999999999"]

# Query the cloud function with each input and print the result
for input_string in inputs:
    result = query_cloud_function(input_string)
    print(f"Input: {input_string} => Output: {result}")

Input: 012937 => Output: {'answer': 22}
Input: 2 => Output: {'answer': 2}
Input: 9999999999999 => Output: {'answer': 117}


# Project 2.) Automated Webscraping

### Description : Find a website that can be scraped with Beautiful Soup and that updates with some frequency. Build a cloud function to programmatically scrape the useful content
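The scraping code that runs inside the cloud function is not included in this notebook. As a rough illustration of the parsing step, here is a sketch that uses the standard library's `html.parser` instead of Beautiful Soup so that it runs without extra dependencies. It assumes headlines sit in `<h3>` tags, which is a guess about the page markup, and `HeadlineParser`, `extract_headlines`, and `sample_html` are hypothetical names.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect the text of every <h3> element (assumed headline tag)."""

    def __init__(self):
        super().__init__()
        self._in_headline = False
        self._parts = []
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_headline = True
            self._parts = []

    def handle_data(self, data):
        # Keep text only while inside an <h3>, including nested <a> text
        if self._in_headline:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self._in_headline:
            text = "".join(self._parts).strip()
            if text:
                self.headlines.append(text)
            self._in_headline = False


def extract_headlines(html: str) -> list:
    parser = HeadlineParser()
    parser.feed(html)
    return parser.headlines


sample_html = (
    "<html><body><h3><a href='/a'>First story</a></h3>"
    "<p>teaser</p><h3> Second story </h3></body></html>"
)
# Same response shape the client code reads: {"headlines": [...]}
print({"headlines": extract_headlines(sample_html)})
```

With Beautiful Soup the same step would typically be a single `find_all` call on the parsed page; the manual parser above only shows what that call has to do.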

In [17]:
import requests

# The URL of the deployed cloud function
url = "https://us-central1-cloud-computing-functions.cloudfunctions.net/scrape_news"  # Use your actual function URL

def get_headlines():
    response = requests.get(url)
    if response.status_code == 200:
        return response.json().get("headlines", [])
    else:
        return f"Error: {response.status_code}"

headlines = get_headlines()
if isinstance(headlines, list):
    for idx, headline in enumerate(headlines, 1):
        print(f"{idx}. {headline}")
else:
    print(headlines)


1. Included in your subscription
2. UK election 2024 | Rishi Sunak’s election surprise
3. Le Pen’s hard right looks set to crush Macron’s centrists
4. Can anyone save the world’s most important diamond company?
5. A president’s death gives Iran’s regime a choice
6. Criminal gangs are showing their muscle as Mexico’s elections loom
7. China’s youth are rebelling against long hours
8. What does it mean to recognise Palestinian statehood?
9. Spices have their own riveting, piquant history
10. The ICC’s threat to arrest Binyamin Netanyahu has shocked Israel
11. War and climate change are overwhelming Somalia
12. China’s version of levelling up is not going well
13. Time is running out to fix America’s student-aid mess
14. Iran’s new leaders stand at a nuclear precipice
15. The revolt against Binyamin Netanyahu
16. The Israeli army is caught in a doom loop in Gaza
17. Israel has seen arms embargoes before
18. Can the rich world escape its baby crisis?
19. At long last, Europe’s economy is s

# Project 3.) Machine Learning Model as a Cloud Function

### Description : Build a machine learning model using scikit-learn and make it queryable through cloud functions

## a.) Think of a company that would use the ML app you just built. Which employees could use this app, and what would they use it for? Write a short paragraph.

A real estate advisory firm could use this ML application to quickly estimate property values for clients. Real estate agents can enter property features such as size, number of bedrooms, and location to get an instant estimate of market value, which gives them data-backed pricing information during client consultations. Financial analysts can use the model to analyze market trends and property valuations, supporting investment decisions and market assessments. Overall, the tool streamlines property valuation, improves productivity, and supports data-driven strategies within the firm.

In [21]:
import pandas as pd

# Load the dataset
df = pd.read_csv('housing.csv')

# Display the first few rows
df.head()


| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -122.23 | 37.88 | 41.0 | 880.0 | 129.0 | 322.0 | 126.0 | 8.3252 | 452600.0 | NEAR BAY |
| 1 | -122.22 | 37.86 | 21.0 | 7099.0 | 1106.0 | 2401.0 | 1138.0 | 8.3014 | 358500.0 | NEAR BAY |
| 2 | -122.24 | 37.85 | 52.0 | 1467.0 | 190.0 | 496.0 | 177.0 | 7.2574 | 352100.0 | NEAR BAY |
| 3 | -122.25 | 37.85 | 52.0 | 1274.0 | 235.0 | 558.0 | 219.0 | 5.6431 | 341300.0 | NEAR BAY |
| 4 | -122.25 | 37.85 | 52.0 | 1627.0 | 280.0 | 565.0 | 259.0 | 3.8462 | 342200.0 | NEAR BAY |


In [22]:
# Drop all rows with any NaN values
df.dropna(inplace=True)


In [23]:
df

| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -122.23 | 37.88 | 41.0 | 880.0 | 129.0 | 322.0 | 126.0 | 8.3252 | 452600.0 | NEAR BAY |
| 1 | -122.22 | 37.86 | 21.0 | 7099.0 | 1106.0 | 2401.0 | 1138.0 | 8.3014 | 358500.0 | NEAR BAY |
| 2 | -122.24 | 37.85 | 52.0 | 1467.0 | 190.0 | 496.0 | 177.0 | 7.2574 | 352100.0 | NEAR BAY |
| 3 | -122.25 | 37.85 | 52.0 | 1274.0 | 235.0 | 558.0 | 219.0 | 5.6431 | 341300.0 | NEAR BAY |
| 4 | -122.25 | 37.85 | 52.0 | 1627.0 | 280.0 | 565.0 | 259.0 | 3.8462 | 342200.0 | NEAR BAY |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 20635 | -121.09 | 39.48 | 25.0 | 1665.0 | 374.0 | 845.0 | 330.0 | 1.5603 | 78100.0 | INLAND |
| 20636 | -121.21 | 39.49 | 18.0 | 697.0 | 150.0 | 356.0 | 114.0 | 2.5568 | 77100.0 | INLAND |
| 20637 | -121.22 | 39.43 | 17.0 | 2254.0 | 485.0 | 1007.0 | 433.0 | 1.7000 | 92300.0 | INLAND |
| 20638 | -121.32 | 39.43 | 18.0 | 1860.0 | 409.0 | 741.0 | 349.0 | 1.8672 | 84700.0 | INLAND |


In [24]:
# One-hot encoding for 'ocean_proximity'
dummies = pd.get_dummies(df['ocean_proximity'], prefix='ocean_proximity', drop_first=True)

# Combine the dummies with the existing dataset
data_with_dummies = pd.concat([df, dummies], axis=1)

# Drop the original 'ocean_proximity' column
data_with_dummies.drop('ocean_proximity', axis=1, inplace=True)
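# Cast every column to int. Note that this truncates fractional values
# such as longitude, latitude, and median_income.
data_with_dummies = data_with_dummies.astype(int)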

In [25]:
data_with_dummies

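| | longitude | latitude | housing_median_age | total_rooms | total_bedrooms | population | households | median_income | median_house_value | ocean_proximity_INLAND | ocean_proximity_ISLAND | ocean_proximity_NEAR BAY | ocean_proximity_NEAR OCEAN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -122 | 37 | 41 | 880 | 129 | 322 | 126 | 8 | 452600 | 0 | 0 | 1 | 0 |
| 1 | -122 | 37 | 21 | 7099 | 1106 | 2401 | 1138 | 8 | 358500 | 0 | 0 | 1 | 0 |
| 2 | -122 | 37 | 52 | 1467 | 190 | 496 | 177 | 7 | 352100 | 0 | 0 | 1 | 0 |
| 3 | -122 | 37 | 52 | 1274 | 235 | 558 | 219 | 5 | 341300 | 0 | 0 | 1 | 0 |
| 4 | -122 | 37 | 52 | 1627 | 280 | 565 | 259 | 3 | 342200 | 0 | 0 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 20635 | -121 | 39 | 25 | 1665 | 374 | 845 | 330 | 1 | 78100 | 1 | 0 | 0 | 0 |
| 20636 | -121 | 39 | 18 | 697 | 150 | 356 | 114 | 2 | 77100 | 1 | 0 | 0 | 0 |
| 20637 | -121 | 39 | 17 | 2254 | 485 | 1007 | 433 | 1 | 92300 | 1 | 0 | 0 | 0 |
| 20638 | -121 | 39 | 18 | 1860 | 409 | 741 | 349 | 1 | 84700 | 1 | 0 | 0 | 0 |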
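Comparing this table with the original one shows the effect of `astype(int)`: the fractional part of every float is discarded rather than rounded, so longitude -122.23 becomes -122 and median_income 8.3252 becomes 8. The notebook does not say whether this loss of precision is intended. The sketch below uses plain Python `int()` as a stand-in for the pandas cast, since both truncate toward zero, and uses the values from the first row. `round()` is shown only for comparison.

```python
# Values taken from the first row of the original table
row = {"longitude": -122.23, "latitude": 37.88, "median_income": 8.3252}

# int() truncates toward zero, like a float-to-int cast
truncated = {name: int(value) for name, value in row.items()}

# round() goes to the nearest integer instead
rounded = {name: round(value) for name, value in row.items()}

print(truncated)  # {'longitude': -122, 'latitude': 37, 'median_income': 8}
print(rounded)    # {'longitude': -122, 'latitude': 38, 'median_income': 8}
```

If the fractional values matter for the model, one option would be to cast only the dummy columns to int and leave the numeric features as floats.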


In [26]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Features and target
X = data_with_dummies.drop('median_house_value', axis=1)
y = data_with_dummies['median_house_value']

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
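For reference, `StandardScaler` subtracts each column's mean and divides by its population standard deviation, using statistics learned from the training split only. The sketch below reproduces that arithmetic in plain Python for a single column, using the five `total_rooms` values from the head of the table as a toy sample. It is an illustration, not the fitted scaler.

```python
from statistics import fmean, pstdev

# Toy sample: total_rooms from the first five rows shown earlier
total_rooms = [880.0, 7099.0, 1467.0, 1274.0, 1627.0]

mean = fmean(total_rooms)
std = pstdev(total_rooms)  # population std (ddof=0), as StandardScaler uses

# z-score: how many standard deviations each value lies from the mean
scaled = [(x - mean) / std for x in total_rooms]
print([round(z, 3) for z in scaled])
```

After scaling, the column has mean 0 and standard deviation 1. The test split is transformed with the training mean and standard deviation, which is why the cell calls `transform` rather than `fit_transform` on `X_test`.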

In [27]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Train the model
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# Evaluate the model
y_pred = model.predict(X_test_scaled)
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')

Mean Squared Error: 5100440496.948736
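The MSE is expressed in squared units of `median_house_value`, which makes it hard to read. Taking the square root gives the RMSE in the same units as the target. A quick calculation using the value printed above:

```python
import math

mse = 5100440496.948736  # value printed by the evaluation cell
rmse = math.sqrt(mse)

print(f"RMSE: {rmse:,.0f}")
```

This comes to roughly 71,000, so a typical prediction error is on the order of tens of thousands in the units of `median_house_value`.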


In [28]:
import joblib

# Save the model and scaler
joblib.dump(model, 'california_housing_model.sav')
joblib.dump(scaler, 'scaler.sav')

print("Model and scaler saved successfully!")

Model and scaler saved successfully!


In [29]:
from io import BytesIO 

In [30]:
import os
from google.cloud import storage
from io import StringIO
import pandas as pd

In [31]:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "Credentials.json"

In [32]:
client = storage.Client()

In [33]:
bucket_name = "california_housing_model"
bucket = client.get_bucket(bucket_name)
blob = bucket.blob("model/scaler.sav")
blob.upload_from_filename("scaler.sav")

In [34]:
bucket_name = "california_housing_model"
bucket = client.get_bucket(bucket_name)
blob = bucket.blob("model/california_housing_model.sav")
blob.upload_from_filename("california_housing_model.sav")

In [35]:
def load_scikit_model(file_name):
    bucket_name = "california_housing_model"
    source_blob = "model/" + file_name
    
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "Credentials.json"
    client = storage.Client()
    
    bucket = client.get_bucket(bucket_name)
    blob = bucket.blob(source_blob)
    
    model_data = blob.download_as_bytes()
    
    model = joblib.load(BytesIO(model_data))
    return model

In [36]:
model = load_scikit_model("california_housing_model.sav")
preproc = load_scikit_model("scaler.sav")
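The prediction function further down calls `load_scikit_model` twice per request, so each prediction downloads both artifacts again. One possible way to avoid that is to cache the loaded objects, sketched here with `functools.lru_cache`. To keep the example runnable without cloud credentials, `fake_download` is a hypothetical stand-in for the bucket download and `load_cached` is an illustrative name.

```python
from functools import lru_cache

download_count = 0  # counts how often the stand-in download actually runs


def fake_download(file_name: str) -> bytes:
    """Hypothetical stand-in for the bucket download in load_scikit_model."""
    global download_count
    download_count += 1
    return f"contents of {file_name}".encode()


@lru_cache(maxsize=None)
def load_cached(file_name: str) -> bytes:
    """Fetch each artifact once and reuse it on later calls."""
    return fake_download(file_name)


for _ in range(3):
    load_cached("scaler.sav")
    load_cached("california_housing_model.sav")

print(download_count)  # 2: one download per distinct file
```

In a deployed function the cache would last only as long as the running instance, so the first request after a cold start would still pay the download cost.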

In [37]:
import joblib
from google.cloud import storage
from io import BytesIO
import numpy as np
import warnings

import session_info

In [72]:
session_info.show()

In [38]:
def california_housing(request_dictionary):
    try:
        with warnings.catch_warnings():
            # The scaler was fitted on a DataFrame, so passing a plain list
            # raises a feature-name UserWarning; silence it here.
            warnings.simplefilter("ignore", UserWarning)
            model = load_scikit_model("california_housing_model.sav")
            preproc = load_scikit_model("scaler.sav")

            # Feature order must match the column order used during training
            required_keys = [
                "longitude", "latitude", "housing_median_age", "total_rooms",
                "total_bedrooms", "population", "households", "median_income",
                "ocean_proximity_INLAND", "ocean_proximity_ISLAND",
                "ocean_proximity_NEAR_BAY", "ocean_proximity_NEAR_OCEAN"
            ]

            # Read each feature straight from the request instead of writing
            # to globals(), which would overwrite notebook variables of the
            # same name. Missing keys default to 0. Keys must match exactly:
            # "ocean_proximity_NEAR BAY" (with a space) is not recognized.
            features = [request_dictionary.get(key, 0) for key in required_keys]

            # Prepare the input data
            X = preproc.transform([features])

            # Make predictions (cast to a plain float so it serializes to JSON)
            prediction = float(model.predict(X)[0])

            return {"predicted_value": prediction}
    except Exception as e:
        return {"status": "error", "message": str(e)}

***Testing the ML model function locally, without calling any cloud function. This runs only the function defined in this notebook.***

In [39]:
# Test dictionary with sample values
test_request_dictionary = {
    "longitude": -122,
    "latitude": 37,
    "housing_median_age": 41,
    "total_rooms": 880,
    "total_bedrooms": 129,
    "population": 322,
    "households": 126,
    "median_income": 8,
    "ocean_proximity_INLAND": 0,
    "ocean_proximity_ISLAND": 0,
    "ocean_proximity_NEAR BAY": 1,
    "ocean_proximity_NEAR OCEAN": 0
}

# Call the function with the test dictionary
result = california_housing(test_request_dictionary)
print(result)

{'predicted_value': 421191.49634022615}
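One detail in this test is easy to miss: the dictionary spells two keys with a space (`"ocean_proximity_NEAR BAY"`, `"ocean_proximity_NEAR OCEAN"`), while the function looks for the underscore spellings. Because the lookup is an exact match, the `1` supplied for NEAR BAY is not picked up and that feature falls back to 0. The sketch below shows the mismatch and one possible remedy. `normalize_keys` is a hypothetical helper, not part of the notebook.

```python
test_keys = {"ocean_proximity_NEAR BAY": 1, "ocean_proximity_NEAR OCEAN": 0}

# Exact-match lookup: the underscore key is absent, so the default 0 is used
print(test_keys.get("ocean_proximity_NEAR_BAY", 0))  # 0


def normalize_keys(request_dictionary: dict) -> dict:
    """Hypothetical helper: replace spaces in keys with underscores."""
    return {key.replace(" ", "_"): value for key, value in request_dictionary.items()}


# After normalizing, the supplied value is found
print(normalize_keys(test_keys).get("ocean_proximity_NEAR_BAY", 0))  # 1
```

Applying such a normalization, or using the underscore spellings in the request, would change the inputs to the model and therefore the predicted value.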


In [53]:
import ipywidgets as widgets
from IPython.display import display

# Define input widgets
longitude = widgets.FloatSlider(description='Longitude:', min=-180, max=180, step=0.01, value=-122.23)
latitude = widgets.FloatSlider(description='Latitude:', min=-90, max=90, step=0.01, value=37.88)
housing_median_age = widgets.IntSlider(description='Housing Median Age:', min=0, max=100, value=41)
total_rooms = widgets.IntSlider(description='Total Rooms:', min=0, max=10000, value=880)
total_bedrooms = widgets.IntSlider(description='Total Bedrooms:', min=0, max=5000, value=129)
population = widgets.IntSlider(description='Population:', min=0, max=10000, value=322)
households = widgets.IntSlider(description='Households:', min=0, max=5000, value=126)
median_income = widgets.FloatSlider(description='Median Income:', min=0, max=15, step=0.1, value=8)
ocean_proximity_INLAND = widgets.Checkbox(description='Inland', value=False)
ocean_proximity_ISLAND = widgets.Checkbox(description='Island', value=False)
ocean_proximity_NEAR_BAY = widgets.Checkbox(description='Near Bay', value=False)
ocean_proximity_NEAR_OCEAN = widgets.Checkbox(description='Near Ocean', value=False)

# Define output widget
output = widgets.Output()

# Define function to handle button click event
def on_predict_button_clicked(b):
    with output:
        output.clear_output()
        # Prepare input dictionary
        input_data = {
            "longitude": longitude.value,
            "latitude": latitude.value,
            "housing_median_age": housing_median_age.value,
            "total_rooms": total_rooms.value,
            "total_bedrooms": total_bedrooms.value,
            "population": population.value,
            "households": households.value,
            "median_income": median_income.value,
            "ocean_proximity_INLAND": ocean_proximity_INLAND.value,
            "ocean_proximity_ISLAND": ocean_proximity_ISLAND.value,
            "ocean_proximity_NEAR_BAY": ocean_proximity_NEAR_BAY.value,
            "ocean_proximity_NEAR_OCEAN": ocean_proximity_NEAR_OCEAN.value,
        }
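        # Call the ML model function with input data
        result = california_housing(input_data)
        # The function returns an error dictionary instead of raising
        if "predicted_value" in result:
            print("Predicted Median House Value:", result["predicted_value"])
        else:
            print("Error:", result.get("message", "unknown error"))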

# Define predict button
predict_button = widgets.Button(description='Predict')
predict_button.on_click(on_predict_button_clicked)

# Display input widgets and predict button
widgets.VBox([
    longitude, latitude, housing_median_age, total_rooms, total_bedrooms, population,
    households, median_income, ocean_proximity_INLAND, ocean_proximity_ISLAND,
    ocean_proximity_NEAR_BAY, ocean_proximity_NEAR_OCEAN, predict_button, output
])


VBox(children=(FloatSlider(value=-122.23, description='Longitude:', max=180.0, min=-180.0, step=0.01), FloatSl…

***Testing the function through the cloud function URL. The assignment appears to ask for a separate notebook for this part, so the code below can be copied into a new notebook as-is. It works there because it only calls the deployed cloud function and does not depend on any function defined in this notebook.***

In [54]:
import flask
import functions_framework
import requests
import json

In [52]:
import requests
from urllib.parse import urlparse

# Define the URL of your cloud function
url = 'https://us-central1-cloud-computing-functions.cloudfunctions.net/ml_test'

# Define a test request dictionary with sample data
test_request_dictionary = {
    "longitude": -122,
    "latitude": 37,
    "housing_median_age": 41,
    "total_rooms": 880,
    "total_bedrooms": 129,
    "population": 322,
    "households": 126,
    "median_income": 8,
    "ocean_proximity_INLAND": 0,
    "ocean_proximity_ISLAND": 0,
    "ocean_proximity_NEAR BAY": 1,
    "ocean_proximity_NEAR OCEAN": 0
}

# Make a POST request to the cloud function
response = requests.post(url, json=test_request_dictionary)

# Print the raw response text
print("Response text:", response.text)

Response text: {"predicted_value": 421191.49634022615}
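The raw text above is JSON, so it can be parsed into a number for further use. The helper below is a small sketch that assumes the two response shapes seen in this notebook: `{"predicted_value": ...}` on success and `{"status": "error", "message": ...}` on failure. The name `parse_prediction` is illustrative.

```python
import json


def parse_prediction(response_text: str) -> float:
    """Return the predicted value, or raise ValueError with the server message."""
    payload = json.loads(response_text)
    if "predicted_value" in payload:
        return float(payload["predicted_value"])
    raise ValueError(payload.get("message", "unexpected response"))


# Response text copied from the cell output above
print(parse_prediction('{"predicted_value": 421191.49634022615}'))
```

Raising on the error shape keeps a failed request from being mistaken for a valid prediction later in the notebook.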


In [55]:
import ipywidgets as widgets
from IPython.display import display
import requests

# Define input widgets for each feature
longitude = widgets.FloatSlider(description='Longitude:', min=-180, max=180, step=0.01, value=-122.23)
latitude = widgets.FloatSlider(description='Latitude:', min=-90, max=90, step=0.01, value=37.88)
housing_median_age = widgets.IntSlider(description='Housing Median Age:', min=0, max=100, value=41)
total_rooms = widgets.IntSlider(description='Total Rooms:', min=0, max=10000, value=1000)
total_bedrooms = widgets.IntSlider(description='Total Bedrooms:', min=0, max=5000, value=300)
population = widgets.IntSlider(description='Population:', min=0, max=10000, value=1500)
households = widgets.IntSlider(description='Households:', min=0, max=5000, value=800)
median_income = widgets.FloatSlider(description='Median Income:', min=0, max=15, step=0.1, value=3.0)
ocean_proximity_INLAND = widgets.Checkbox(description='Inland', value=False)
ocean_proximity_ISLAND = widgets.Checkbox(description='Island', value=False)
ocean_proximity_NEAR_BAY = widgets.Checkbox(description='Near Bay', value=False)
ocean_proximity_NEAR_OCEAN = widgets.Checkbox(description='Near Ocean', value=False)

# Define output widget
output = widgets.Output()

# Define function to handle button click event
def on_predict_button_clicked(b):
    with output:
        output.clear_output()
        # Prepare input dictionary
        input_data = {
            "longitude": longitude.value,
            "latitude": latitude.value,
            "housing_median_age": housing_median_age.value,
            "total_rooms": total_rooms.value,
            "total_bedrooms": total_bedrooms.value,
            "population": population.value,
            "households": households.value,
            "median_income": median_income.value,
            "ocean_proximity_INLAND": ocean_proximity_INLAND.value,
            "ocean_proximity_ISLAND": ocean_proximity_ISLAND.value,
            "ocean_proximity_NEAR_BAY": ocean_proximity_NEAR_BAY.value,
            "ocean_proximity_NEAR_OCEAN": ocean_proximity_NEAR_OCEAN.value
        }
        # Call the cloud function with the input data
        try:
            response = requests.post(url, json=input_data)
            response_data = response.json()
            if "predicted_value" in response_data:
                predicted_value = response_data["predicted_value"]
                print("Predicted Median House Value:", predicted_value)
            elif "status" in response_data and response_data["status"] == "error":
                print("Error:", response_data["message"])
            else:
                print("Unexpected response from server.")
        except Exception as e:
            print("Error:", str(e))

# Define predict button
predict_button = widgets.Button(description='Predict')
predict_button.on_click(on_predict_button_clicked)

# Display input widgets, predict button, and output widget
widgets.VBox([
    longitude, latitude, housing_median_age, total_rooms, total_bedrooms,
    population, households, median_income, ocean_proximity_INLAND,
    ocean_proximity_ISLAND, ocean_proximity_NEAR_BAY, ocean_proximity_NEAR_OCEAN,
    predict_button, output
])


VBox(children=(FloatSlider(value=-122.23, description='Longitude:', max=180.0, min=-180.0, step=0.01), FloatSl…