# Pipeline - School Roads - Model Predict

This notebook applies a pre-trained image recognition model to detect school warning signage on roads near schools.

This notebook is one of a series orchestrated by the "pipeline_schoolroads.ipynb" notebook, found under the <this repo>/task7-feature-extraction-using-aerial-level-data/code/pipeline directory.

## Imported Data

A trained model ready to classify street-level images, plus the street-level image records read from data/{name}_school_road_points_streetview.parquet.

## Exported Data

A predicted class and probability for each image location, indicating whether school signage was found there.

In [1]:
# Parameters cell: values used by the notebook at runtime.
# Note: the values below are defaults, which are overridden when the
# notebook is executed as part of a pipeline via Prefect + Papermill

name = "usa"
link = "https://nces.ed.gov/programs/edge/data/EDGE_GEOCODE_PUBLICSCH_1819.zip"
unzip = True
target = "EDGE_GEOCODE_PUBLICSCH_1819.xlsx"
lat_colname = "LAT"
lon_colname = "LON"
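
As a rough illustration of how these defaults might be overridden, the sketch below builds a parameter dictionary and passes it to Papermill's `execute_notebook`. The helper names `build_parameters` and `run_model_predict` and the notebook paths in the final comment are hypothetical, and the Prefect task wrapping used by the actual pipeline is not shown. Papermill injects the supplied values after the cell tagged `parameters`.

```python
def build_parameters(name, link, unzip, target, lat_colname="LAT", lon_colname="LON"):
    """Collect runtime values under the same names as the parameters cell."""
    return {
        "name": name,
        "link": link,
        "unzip": unzip,
        "target": target,
        "lat_colname": lat_colname,
        "lon_colname": lon_colname,
    }


def run_model_predict(input_path, output_path, parameters):
    """Execute a notebook with Papermill, overriding its parameters cell."""
    # Imported lazily so build_parameters can be used without Papermill installed.
    import papermill as pm

    return pm.execute_notebook(input_path, output_path, parameters=parameters)


params = build_parameters(
    name="usa",
    link="https://nces.ed.gov/programs/edge/data/EDGE_GEOCODE_PUBLICSCH_1819.zip",
    unzip=True,
    target="EDGE_GEOCODE_PUBLICSCH_1819.xlsx",
)

# Hypothetical notebook paths; substitute the real ones before running:
# run_model_predict("model_predict.ipynb", "model_predict_output.ipynb", params)
```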

In [2]:
import glob
import os

import pandas as pd
from imageai.Classification.Custom import CustomImageClassification

In [3]:
df = pd.read_parquet(
    "{}/data/{}_school_road_points_streetview.parquet".format(os.getcwd(), name)
)
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 6 columns):
 #   Column                   Non-Null Count  Dtype  
---  ------                   --------------  -----  
 0   road_name                100 non-null    object 
 1   lon                      100 non-null    float64
 2   lat                      100 non-null    float64
 3   road_length              100 non-null    float64
 4   bearing                  100 non-null    float64
 5   google_streetview_image  100 non-null    object 
dtypes: float64(4), object(2)
memory usage: 4.8+ KB


In [4]:
# confirm that no rows are missing a street view image
df[df["google_streetview_image"].isna()]

Unnamed: 0,road_name,lon,lat,road_length,bearing,google_streetview_image


In [5]:
# get the most recently modified file among the trained models
list_of_files = glob.glob("{}/data/schools/resnet/models/*".format(os.getcwd()))
latest_file = max(list_of_files, key=os.path.getmtime)
os.path.basename(latest_file)

'model_ex-008_acc-1.000000.h5'
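
Calling `max` on an empty list raises `ValueError`, so a guard can make a missing model easier to diagnose. The sketch below is one possible helper and is not part of the original pipeline: `find_latest_model` and `parse_model_filename` are hypothetical names, restricting the search to `*.h5` is an assumption, and the filename pattern is inferred from the single example shown above.

```python
import glob
import os
import re
import tempfile

# Assumed pattern, inferred from 'model_ex-008_acc-1.000000.h5'.
MODEL_NAME_PATTERN = re.compile(r"model_ex-(\d+)_acc-(\d+\.\d+)\.h5$")


def find_latest_model(models_dir):
    """Return the most recently modified .h5 file in models_dir."""
    candidates = glob.glob(os.path.join(models_dir, "*.h5"))
    if not candidates:
        raise FileNotFoundError("no model files found in {}".format(models_dir))
    return max(candidates, key=os.path.getmtime)


def parse_model_filename(path):
    """Return (number after 'ex-', accuracy) or None if the name does not match."""
    match = MODEL_NAME_PATTERN.search(os.path.basename(path))
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


# Demonstration on throwaway files with explicit modification times.
with tempfile.TemporaryDirectory() as tmp:
    filenames = ["model_ex-001_acc-0.900000.h5", "model_ex-008_acc-1.000000.h5"]
    for index, filename in enumerate(filenames):
        path = os.path.join(tmp, filename)
        open(path, "w").close()
        os.utime(path, (1000 + index, 1000 + index))
    latest_name = os.path.basename(find_latest_model(tmp))
    ex_number, accuracy = parse_model_filename(latest_name)
```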

In [6]:
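# load the custom-trained classifier
# https://imageai.readthedocs.io/en/latest/custom/index.html

prediction_model = CustomImageClassification()

# the architecture must match the one used for training
prediction_model.setModelTypeAsResNet50()

prediction_model.setModelPath(latest_file)

# JSON file describing the model's classes
prediction_model.setJsonPath(
    "{}/data/schools/resnet/json/model_class.json".format(os.getcwd())
)

# num_objects is the number of classes the model was trained on
prediction_model.loadModel(num_objects=2)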

In [7]:
# define a row-wise function, for use with DataFrame.apply, that adds the
# top predicted class and its probability to each row
def gather_predictions(row, prediction_model):
    predict, probability = prediction_model.classifyImage(
        row["google_streetview_image"], result_count=1
    )
    row["schoolsign_prediction"] = predict[0]
    row["schoolsign_prediction_prob"] = probability[0]
    return row
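
If an image file is missing on disk, the classification call may fail partway through the `apply` and stop the run. The sketch below shows one way to guard against that; it is an illustration, not the notebook's method. `safe_gather_predictions` and `StubClassifier` are hypothetical: the stub only mimics the `classifyImage(image, result_count=...)` call used above so that the example runs without a trained model, and rows are plain dictionaries here, although the same indexing works on a pandas row.

```python
import os
import tempfile


class StubClassifier:
    """Hypothetical stand-in that mimics the classifyImage call used above."""

    def classifyImage(self, image_input, result_count=1):
        return ["schoolsign"] * result_count, [100.0] * result_count


def safe_gather_predictions(row, prediction_model):
    """Add prediction fields to row, using None when the image is unavailable."""
    image_path = row["google_streetview_image"]
    # Skip values that are not paths (None, NaN) and paths that do not exist.
    if not isinstance(image_path, str) or not os.path.isfile(image_path):
        row["schoolsign_prediction"] = None
        row["schoolsign_prediction_prob"] = None
        return row
    predictions, probabilities = prediction_model.classifyImage(
        image_path, result_count=1
    )
    row["schoolsign_prediction"] = predictions[0]
    row["schoolsign_prediction_prob"] = probabilities[0]
    return row


# An existing (empty) file stands in for a downloaded street view image.
with tempfile.NamedTemporaryFile(suffix=".jpg") as image_file:
    found = safe_gather_predictions(
        {"google_streetview_image": image_file.name}, StubClassifier()
    )

missing = safe_gather_predictions(
    {"google_streetview_image": "/no/such/image.jpg"}, StubClassifier()
)
```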

In [8]:
# use the model to predict on each row's image via apply,
# skipping rows without an image (None or NaN)
df = df.apply(
    lambda x: gather_predictions(x, prediction_model)
    if pd.notna(x["google_streetview_image"])
    else x,
    axis=1,
)
df[["schoolsign_prediction", "schoolsign_prediction_prob"]].head()

Unnamed: 0,schoolsign_prediction,schoolsign_prediction_prob
0,schoolsign,100.0
1,schoolsign,100.0
2,schoolsign,100.0
3,schoolsign,100.0
4,schoolsign,100.0
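
The predicted label and probability can be reduced to a single found/not-found flag per location. This is a minimal sketch under stated assumptions: the `"schoolsign"` label and the 0 to 100 probability scale are taken from the output above, while `sign_found`, the `"other"` label in the examples, and the 50.0 cutoff are hypothetical choices.

```python
SIGN_LABEL = "schoolsign"  # label seen in the prediction output above
MIN_PROBABILITY = 50.0  # hypothetical cutoff on the 0-100 scale


def sign_found(prediction, probability, min_probability=MIN_PROBABILITY):
    """Return True when the top class is the sign label with enough confidence."""
    if prediction is None or probability is None:
        return False
    return prediction == SIGN_LABEL and probability >= min_probability


# Illustrative (label, probability) pairs; "other" is a made-up class name.
examples = [
    ("schoolsign", 100.0),
    ("schoolsign", 42.0),
    ("other", 99.0),
    (None, None),
]
flags = [sign_found(label, probability) for label, probability in examples]
```

On the notebook's DataFrame, the same function could be applied row by row, for example with `df.apply(lambda r: sign_found(r["schoolsign_prediction"], r["schoolsign_prediction_prob"]), axis=1)`.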


In [9]:
# export results for a later merge; dropna() removes any row that has a
# missing value, including rows without a prediction
df.dropna().to_parquet(
    "{}/data/{}_school_road_points_streetview_predictions.parquet".format(
        os.getcwd(), name
    )
)