**Note:** To run this code, add your own keys for the Google Maps API and the Restb.ai API to the `config.py` file. The keys that appear in this project repository are no longer active.

# Vision-based classification of outdoor environments

This project uses the computer vision models provided by Restb.ai to make vision-based classifications and predictions about outdoor environments: natural surroundings and, mainly, urban areas. The problem it addresses is that most real estate listings describe the property itself and its features in detail, but say little about the features and quality of its surroundings.

The approach described here also has applications outside the real estate field. For example, it could help identify possible improvements to urban areas. Another possible application is the optimization of transport routes: customers would likely be happier passing through more natural or "green" surroundings than through areas lacking significant natural light.

In [1]:
import pandas as pd

from config import *

from google_maps_client import GoogleMapsClient
from restbai_client import RestbAiClient

pd.set_option('display.max_colwidth', None)

In [2]:
# Client classes that manage the connections to APIs
maps_client = GoogleMapsClient()
restbai_client = RestbAiClient()

In [3]:
def analyze_address(address):
    """Given an address, make vision-based predictions and classifications from its Street View image:
    the types of zone that surround the address, the exterior styles of the buildings, a condition
    score between 1.0 and 6.0, and a bool for the significant presence of natural light. An image
    caption is also included.

    :param address: The address from which to analyze the image
    :type address: str
    :return: Predictions and classifications
    :rtype: dict
    """
    data_dict = {}

    # Get the url for the static Street View image
    url = maps_client.get_static_image_url(location=address)
    data_dict['street_view_image_url'] = url
    
    # Get the predicted types of zones, in order, until their cumulative confidence reaches 80%
    type_of_zone_predictions = restbai_client.type_of_zone(url)
    type_of_zones_80pct_confidence = []

    total_confidence = 0.0
    for zone, confidence in type_of_zone_predictions:
        if total_confidence >= 0.8:
            break
        type_of_zones_80pct_confidence.append(zone)
        total_confidence += confidence
    
    # Format the list as a pretty string, to add it to the resulting dataframe
    zones_pretty_string = ', '.join([zone.replace('_', ' ') for zone in type_of_zones_80pct_confidence])
    data_dict['types_of_zones_80pct_confidence'] = zones_pretty_string

    # Get the predicted exterior styles of the buildings, in order, until their cumulative confidence reaches 80%
    building_style_predictions = restbai_client.building_exterior_style(url)
    building_styles_80pct_confidence = []

    total_confidence = 0.0
    for style, confidence in building_style_predictions:
        if total_confidence >= 0.8:
            break
        building_styles_80pct_confidence.append(style)
        total_confidence += confidence
    
    # Format the list as a pretty string, to add it to the resulting dataframe
    styles_pretty_string = ', '.join([style.replace('_', ' ') for style in building_styles_80pct_confidence])
    data_dict['building_styles_80pct_confidence'] = styles_pretty_string

    # Get the condition score
    condition_score = restbai_client.condition(url)
    data_dict['condition_score'] = condition_score

    # Get the bool that represents the presence of natural light
    # This seems to not be well optimized for exteriors, as the result is False too often
    natural_light = restbai_client.natural_light(url)
    data_dict['natural_light'] = natural_light

    # Get the auto-generated image caption (kind of more experimental data, for now)
    image_caption = restbai_client.image_caption(url)
    data_dict['image_caption'] = image_caption

    return data_dict
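
The two loops in `analyze_address` apply the same rule: keep adding labels, in the order returned, until their cumulative confidence reaches 80%. A minimal sketch of how that rule could be factored into a helper is shown here. The names `labels_up_to_confidence` and `format_labels` and the sample predictions are hypothetical, and the sketch assumes the client returns `(label, confidence)` pairs sorted by descending confidence, which the original loops rely on but the notebook does not state.

```python
def labels_up_to_confidence(predictions, threshold=0.8):
    """Return labels, in order, until their cumulative confidence reaches the threshold.

    Assumes `predictions` is an iterable of (label, confidence) pairs,
    sorted by descending confidence (an assumption, not confirmed by the API).
    """
    selected = []
    total_confidence = 0.0
    for label, confidence in predictions:
        if total_confidence >= threshold:
            break
        selected.append(label)
        total_confidence += confidence
    return selected


def format_labels(labels):
    """Join labels into a readable string, replacing underscores with spaces."""
    return ', '.join(label.replace('_', ' ') for label in labels)


# Hypothetical predictions, for illustration only
sample_predictions = [
    ('outdoor_building', 0.55),
    ('front_house', 0.30),
    ('yard', 0.10),
    ('gate', 0.05),
]

selected_labels = labels_up_to_confidence(sample_predictions)
pretty_labels = format_labels(selected_labels)
print(pretty_labels)
```

With a helper like this, the zone and style sections of `analyze_address` would differ only in which client method they call and which dictionary key they fill.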

As an example of this prototype, let's make vision-based classifications and predictions of features such as the type of zone, the building styles, the condition, and the significant presence of natural light for Tibidabo Avenue and Aribau Street in Barcelona. As a further experiment, the generation of an image caption is also included. The avenue and the street differ notably in character, and comparing the results for both should make that visible.

In [4]:
# It is possible to automate the extraction of lists of addresses using APIs, or to build the data ourselves
# by reverse geocoding. For this prototype and its testing, that automation is not yet included.
addresses_av_tibidabo = [str(i)+' Av. del Tibidabo, Barcelona, Catalunya' for i in range(1,38)]

In [5]:
# List of dictionaries which will become a dataframe
data = []

for address in addresses_av_tibidabo:
    # Dictionary which will become a row in the dataframe
    data_dict = analyze_address(address)
    data.append(data_dict)
    
av_tibidabo_df = pd.DataFrame(data)
av_tibidabo_df.to_csv('data/Tibidabo_Avenue_Barcelona.csv', sep=';')
av_tibidabo_df.head()

Unnamed: 0,street_view_image_url,types_of_zones_80pct_confidence,building_styles_80pct_confidence,condition_score,natural_light,image_caption
0,https://maps.googleapis.com/maps/api/streetview?location=1+Av.+del+Tibidabo%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,"community, playground, front house, property exterior, yard","manufactured mobile, unfinished, contemporary, italianate, raised ranch",,False,view of surrounding community
1,https://maps.googleapis.com/maps/api/streetview?location=2+Av.+del+Tibidabo%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,outdoor building,not single family,,False,view of property
2,https://maps.googleapis.com/maps/api/streetview?location=3+Av.+del+Tibidabo%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,"community, gate","contemporary, manufactured mobile, not single family, italianate, no distinct style",,False,view of surrounding community
3,https://maps.googleapis.com/maps/api/streetview?location=4+Av.+del+Tibidabo%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,outdoor building,not single family,,False,view of building exterior
4,https://maps.googleapis.com/maps/api/streetview?location=5+Av.+del+Tibidabo%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,"community, gate","contemporary, manufactured mobile, not single family, italianate, no distinct style",,False,view of home's community


In [6]:
# The predictions and classifications are based on images found on the Street View of Google Maps

#av_tibidabo_df = pd.read_csv('data/Tibidabo_Avenue_Barcelona.csv',sep=';')
av_tibidabo_df.loc[1,'street_view_image_url']

'https://maps.googleapis.com/maps/api/streetview?location=2+Av.+del+Tibidabo%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M'
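
The URL above has the shape of a Street View Static API request: an endpoint followed by URL-encoded `location`, `size` and `key` parameters. The implementation of `GoogleMapsClient.get_static_image_url` is not shown in this notebook, so the following is only a sketch of how such a URL could be assembled with the standard library. The function name `build_street_view_url` and the placeholder key are hypothetical.

```python
from urllib.parse import urlencode

STREET_VIEW_ENDPOINT = 'https://maps.googleapis.com/maps/api/streetview'


def build_street_view_url(location, api_key, size='1000x800'):
    """Build a Street View Static API URL for an address (hypothetical helper)."""
    # urlencode escapes spaces as '+' and commas as '%2C', matching the URLs shown above
    query = urlencode({'location': location, 'size': size, 'key': api_key})
    return f'{STREET_VIEW_ENDPOINT}?{query}'


# 'YOUR_API_KEY' is a placeholder, not a real key
example_url = build_street_view_url(
    '2 Av. del Tibidabo, Barcelona, Catalunya',
    api_key='YOUR_API_KEY',
)
print(example_url)
```

Passing the address through `urlencode` rather than concatenating strings keeps characters such as the apostrophe in "Carrer d'Aribau" safely escaped.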

In [7]:
addresses_st_aribau = [str(i)+" Carrer d'Aribau, Barcelona, Catalunya" for i in range(160,191)]

In [8]:
data = []

for address in addresses_st_aribau:
    # Dictionary which will become a row in the dataframe
    data_dict = analyze_address(address)
    data.append(data_dict)

st_aribau_df = pd.DataFrame(data)
st_aribau_df.to_csv('data/Aribau_Street_Barcelona.csv', sep=';')
st_aribau_df.head()

Unnamed: 0,street_view_image_url,types_of_zones_80pct_confidence,building_styles_80pct_confidence,condition_score,natural_light,image_caption
0,https://maps.googleapis.com/maps/api/streetview?location=160+Carrer+d%27Aribau%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,outdoor building,"italianate, not single family",,False,view of property
1,https://maps.googleapis.com/maps/api/streetview?location=161+Carrer+d%27Aribau%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,"outdoor building, front house","not single family, french country",,False,view of building exterior
2,https://maps.googleapis.com/maps/api/streetview?location=162+Carrer+d%27Aribau%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,outdoor building,not single family,,False,view of property
3,https://maps.googleapis.com/maps/api/streetview?location=163+Carrer+d%27Aribau%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,outdoor building,not single family,,False,view of building exterior
4,https://maps.googleapis.com/maps/api/streetview?location=164+Carrer+d%27Aribau%2C+Barcelona%2C+Catalunya&size=1000x800&key=AIzaSyCASkpJn-jhL07GOLo_SGsuuj6eh5tOP2M,outdoor building,"not single family, unfinished, oriental, raised beach house",,False,view of property
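
One way to compare the two streets is to count how often each label appears in a column. The sketch below does this for the `types_of_zones_80pct_confidence` values of the five rows displayed above for each street. The helper name `count_labels` is hypothetical, and because only the `head()` rows are used, the counts are illustrative rather than a summary of the full datasets.

```python
from collections import Counter


def count_labels(rows):
    """Count individual labels across comma-separated label strings."""
    counts = Counter()
    for row in rows:
        labels = (label.strip() for label in row.split(','))
        counts.update(label for label in labels if label)
    return counts


# Zone labels copied from the five displayed rows of each dataframe
tibidabo_zone_rows = [
    'community, playground, front house, property exterior, yard',
    'outdoor building',
    'community, gate',
    'outdoor building',
    'community, gate',
]
aribau_zone_rows = [
    'outdoor building',
    'outdoor building, front house',
    'outdoor building',
    'outdoor building',
    'outdoor building',
]

tibidabo_counts = count_labels(tibidabo_zone_rows)
aribau_counts = count_labels(aribau_zone_rows)

print(tibidabo_counts.most_common())
print(aribau_counts.most_common())
```

In the notebook itself, the same function could be applied to the full column, for example `count_labels(av_tibidabo_df['types_of_zones_80pct_confidence'])`, assuming the column contains no missing values.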


## Some comments and future steps
Some of the features that the model predicts (mainly the significant presence of natural light and the condition score) do not appear well suited to exterior scenes. In both datasets, most values of the `condition_score` column are null (though not all of them, as can be seen in the next cells), and every row has `False` as the value for the significant presence of natural light, despite the images being of outdoor environments. This points to room for further improvement of the computer vision model.

The project would also benefit greatly from an automatic way of obtaining street addresses.

As for usage and applications, we believe the next step of the project is to create a visualization of the information, so that it can be displayed more intuitively and used more easily to solve the problems that inspired its creation.

In [9]:
av_tibidabo_df['condition_score'].value_counts()

4.7    1
4.1    1
4.5    1
3.8    1
3.1    1
5.1    1
4.2    1
Name: condition_score, dtype: int64

In [10]:
st_aribau_df['condition_score'].value_counts()

5.5    1
Name: condition_score, dtype: int64
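
By default, `value_counts()` leaves out missing values, so the two outputs above list only the non-null scores. A minimal sketch of how the missing share could be reported directly is shown here. It rebuilds a series with the seven Tibidabo scores shown above plus 30 missing values, to match the 37 rows of that dataframe. The order of the rows is an assumption made for illustration.

```python
import pandas as pd

# Seven non-null scores from the output above, padded with NaN up to 37 rows.
# The position of each score within the dataframe is assumed, not known.
known_scores = [4.7, 4.1, 4.5, 3.8, 3.1, 5.1, 4.2]
condition_scores = pd.Series(known_scores + [float('nan')] * 30, name='condition_score')

missing_count = int(condition_scores.isna().sum())
missing_share = float(condition_scores.isna().mean())

# dropna=False adds a NaN entry to the counts instead of silently omitting it
counts_with_missing = condition_scores.value_counts(dropna=False)

print(f'{missing_count} of {len(condition_scores)} scores are missing ({missing_share:.1%})')
print(counts_with_missing.head())
```

In the notebook, the same calls could be made on `av_tibidabo_df['condition_score']` and `st_aribau_df['condition_score']`.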

In [11]:
av_tibidabo_df['natural_light'].value_counts()

False    37
Name: natural_light, dtype: int64

In [12]:
st_aribau_df['natural_light'].value_counts()

False    31
Name: natural_light, dtype: int64