<a href="https://colab.research.google.com/github/analyticsariel/projects/blob/master/How_to_Get_Puerto_Rico_Real_Estate_Data_%7C_Python_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# How to Get Puerto Rico Real Estate Data | Python Tutorial
<i>Python tutorial to get Puerto Rico for-sale and for-rent (short-term and long-term) listings</i>
<br>
<br>
![link text](https://drive.google.com/uc?id=10oWFppsYIkOynHBEy0nXuxS5NuVDKO2u)

## Overview
| Detail Tag            | Information                                                                                        |
|-----------------------|----------------------------------------------------------------------------------------------------|
| Originally Created By | Ariel Herrera arielherrera@analyticsariel.com |
| External References   | Rapid API |
| Input Datasets        | [Puerto Rico Real Estate API](https://bit.ly/3uLpbvt) |
| Output Datasets       | CSV file, Map HTML |
| Input Data Source     | JSON file |
| Output Data Source    | Pandas DataFrame |

## History
| Date         | Developed By  | Reason                                                |
|--------------|---------------|-------------------------------------------------------|
| 12th July 2022 | Ariel Herrera | Create notebook. |

## Background 🤓
- **Overview**: This notebook retrieves real estate property data using the [Puerto Rico Real Estate API](https://bit.ly/3uLpbvt). This includes properties for-sale and properties for rent (short-term and long-term).
- **Purpose**: This notebook is a Python tutorial on querying the [Puerto Rico Real Estate API](https://bit.ly/3uLpbvt). The examples below show how to extract listings data for towns in Puerto Rico.
- **Background**: Property data for Puerto Rico is listed on the MLS (Multiple Listing Service) and on third-party sites. Properties for-sale and for-rent are listed by agents or by property owners on websites like [Clasificados Online](https://www.clasificadosonline.com/). The data is not easily consumable, which makes it difficult to *analyze the performance of real estate markets* across Puerto Rico. The API was therefore created to scrape this data and make it programmatically available for analysis.
- **Example Use Case(s)**:
  - **Investors** - Identify median home prices and median rent prices within a town. Determine cash flow.
  - **Agents** - Provide trend analysis statistics such as days on market (DOM), inventory supply, and median home price by property type across towns.
  - **Owner-occupied buyers** - View all properties for-sale in a single spreadsheet.
  - **Expats** - View all properties for-rent in a single spreadsheet. Perform clustering to group properties by similar features.
- **Dataset**: Listings from [Clasificados Online](https://www.clasificadosonline.com/)

## Getting Started ✅
1. Copy this notebook -> File -> Save a Copy in Drive
2. Directions
  - Create an account with [RapidAPI.com](https://rapidapi.com)
  - Copy your [RapidAPI key](https://rapidapi.com/blog/api-glossary/api-key/)
  - Store your RapidAPI key in an `api_keys.csv` file OR replace the value of the `rapid_api_key` variable (a string) with your own RapidAPI key
  - Run notebook -> Runtime -> Run all
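
As a rough illustration of the key-file option, the sketch below reads a key from a CSV with `API` and `KEY` columns, which is the layout the notebook's key-loading cell expects. The helper name `load_rapid_api_key`, the temporary demo file, and the placeholder key are assumptions made for this example; it uses only the standard library so it can be tried outside Colab.

```python
import csv
import os
import tempfile


def load_rapid_api_key(csv_path, api_name="rapid"):
    """Return the KEY value for the row whose API column equals api_name."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("API") == api_name:
                return row["KEY"]
    raise KeyError(f"No row with API == {api_name!r} in {csv_path}")


# Demo with a throwaway file and a placeholder key (not a real credential)
demo_path = os.path.join(tempfile.mkdtemp(), "api_keys.csv")
with open(demo_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["API", "KEY"])
    writer.writerow(["rapid", "YOUR-RAPIDAPI-KEY"])

demo_key = load_rapid_api_key(demo_path)
print(demo_key)
```

Keeping the key in a separate file means the notebook can be shared without exposing the credential.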

## Useful Resources 📑
- [Google Colab Cheat Sheet](https://towardsdatascience.com/cheat-sheet-for-google-colab-63853778c093)
- [Clasificados Online](https://www.clasificadosonline.com/)
- [Puerto Rico Real Estate API](https://bit.ly/3uLpbvt)
- [Puerto Rico Towns](https://bit.ly/3RweCGr)

## Questions / Feedback❓
- [Contact AnalyticsAriel](https://www.analyticsariel.com/contact-me)

## <font color="blue">Install Packages</font>

In [24]:
!pip install keplergl -q # visualization

## <font color="blue">Imports</font>

In [25]:
from google.colab import drive, files, output # specific to Google Colab
import requests
import time

# data wrangling
import pandas as pd
import geopandas as gpd

# visualization
import plotly.express as px
from keplergl import KeplerGl

# settings
output.enable_custom_widget_manager() # lets interactive widgets and charts render in Google Colab

## <font color="blue">Functions</font>

### For Sale Functions

In [26]:
def get_for_sale_listings(rapid_api_key, 
                          town_id,
                          page=1, # default first page
                          property_type=None,
                          low_price_range=None,
                          high_price_range=None,
                          bedrooms=None,
                          repo=0, # default set flag off
                          opt=0, # default set flag off
                          all_pages=False): # get data for all pages
  """
  Search for property for-sale listings in a geographical area. 
  Returns the property features, image url, and coordinates.

  Parameters
  ----------
  @rapid_api_key [string]: Key to access data from Rapid API
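  @town_id [string]: Town to search, e.g. 'San Juan - Condado-Miramar'
  @page [string]: Page number
  @property_type [string]: 
    apartamento|apartamento/walkup|casa|comercial|finca|multifamiliar|solar
  @low_price_range [string]: Minimum price
  @high_price_range [string]: Maximum price
  @bedrooms [string]: Exact number of bedrooms
  @repo [number]: Flag for repossessed properties
  @opt [number]: Flag for options filter
  @all_pages [boolean]: If True, request every page of results

  Returns
  -------
  A requests.Response for a single page, or a list of requests.Response
  objects when all_pages is True and the search spans more than one page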

  """

  #####################
  #   REQUEST DATA    #
  #####################

  # parameters
  if bedrooms is not None:
    bedrooms = str(int(bedrooms))
  if low_price_range is not None:
    low_price_range = str(int(low_price_range))
  if high_price_range is not None:
    high_price_range = str(int(high_price_range))

  # request
  url = "https://puerto-rico-real-estate.p.rapidapi.com/property/search"

  querystring = {
    "town_id":town_id, # required
    "page":str(page),
    "property_type": property_type,
    "low_price_range": low_price_range, # already a string (or None) at this point
    "high_price_range": high_price_range, # already a string (or None) at this point
    "bedrooms": bedrooms,
    "repo": int(repo),
    "opt": int(opt),
  }

  headers = {
    "X-RapidAPI-Key": rapid_api_key,
    "X-RapidAPI-Host": "puerto-rico-real-estate.p.rapidapi.com"
  }

  response = requests.request("GET", url, headers=headers, params=querystring)

  #########################
  #   RESPONSE CONTENT    #
  #########################
  
  # get contents of response
  if response.status_code == 200:
    print('API request successful!')
  else:
    print('WARNING: API request unsuccessful!')

  # get total number of pages
  max_page = response.json()['meta']['max_page']

  # get request for all pages
  if (all_pages == True) and (max_page > 1):
    # create list with first page results
    response_list = [response]

    start_time = time.time()
    # iterate through each page
    print('Retrieved data for page: 1')
    for i in range(2, max_page + 1):
      print('Requesting data for page:', str(i))

      # set new page
      querystring['page'] = i

      # get response
      response = requests.request("GET", url, headers=headers, params=querystring)

      # add response to list
      response_list.append(response)

    print("--- %s seconds ---" % (time.time() - start_time))

    return response_list

  # single page requested (or only one page exists): reuse the response already retrieved
  return response
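
The optional filters above are converted by hand before the request is sent. One possible way to keep that logic in a single place is sketched below. The helper name `build_querystring` is hypothetical; the parameter names are the ones used in the function above. The `requests` library also skips dictionary entries whose value is `None`, so dropping them here mainly makes the final parameters easier to inspect.

```python
def build_querystring(town_id, page=1, **filters):
    """Build request parameters, omitting any filter left as None."""
    params = {"town_id": town_id, "page": str(page)}
    for name, value in filters.items():
        if value is None:
            continue  # filter not set, leave it out of the request
        # prices and bedroom counts are sent as whole-number strings
        if name in ("low_price_range", "high_price_range", "bedrooms"):
            value = str(int(value))
        params[name] = value
    return params


demo_params = build_querystring(
    "San Juan - Condado-Miramar",
    low_price_range=5000,
    high_price_range=500000.0,  # floats are truncated to whole numbers
    bedrooms=None,              # unset filters are dropped
)
print(demo_params)
```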

### Rental Functions

In [27]:
def get_all_rental_towns(rapid_api_key):

  url = "https://puerto-rico-real-estate.p.rapidapi.com/long_term_rental/filters/towns"

  headers = {
    "X-RapidAPI-Key": rapid_api_key,
    "X-RapidAPI-Host": "puerto-rico-real-estate.p.rapidapi.com"
  }

  return requests.request("GET", url, headers=headers)

In [28]:
def get_long_term_rentals(rapid_api_key, 
                          town_id,
                          page=1, # default first page
                          property_type=None,
                          low_price_range=None,
                          high_price_range=None,
                          bedrooms=None,
                          all_pages=False): # get data for all pages
  """
  Search for property long-term rental listings in a geographical area. 
  Returns the property features, image url, and coordinates.

  Parameters
  ----------
  @rapid_api_key [string]: Key to access data from Rapid API
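  @town_id [string]: Town to search, e.g. 'Carolina' (see get_all_rental_towns)
  @page [string]: Page number
  @property_type [string]: 
    apartamento|apartamento/walkup|casa|comercial|finca|multifamiliar|solar
  @low_price_range [string]: Minimum price
  @high_price_range [string]: Maximum price
  @bedrooms [string]: Exact number of bedrooms
  @all_pages [boolean]: If True, request every page of results

  Returns
  -------
  A requests.Response for a single page, or a list of requests.Response
  objects when all_pages is True and the search spans more than one page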

  """

  #####################
  #   REQUEST DATA    #
  #####################

  # parameters
  if bedrooms is not None:
    bedrooms = str(int(bedrooms))
  if low_price_range is not None:
    low_price_range = str(int(low_price_range))
  if high_price_range is not None:
    high_price_range = str(int(high_price_range))

  # request
  url = "https://puerto-rico-real-estate.p.rapidapi.com/long_term_rental/search"

  querystring = {
    "town_id":town_id, # required
    "page":str(page),
    "property_type": property_type,
    "low_price_range": low_price_range,
    "high_price_range": high_price_range,
    "bedrooms": bedrooms
  }

  headers = {
    "X-RapidAPI-Key": rapid_api_key,
    "X-RapidAPI-Host": "puerto-rico-real-estate.p.rapidapi.com"
  }

  response = requests.request("GET", url, headers=headers, params=querystring)

  #########################
  #   RESPONSE CONTENT    #
  #########################
  
  # get contents of response
  if response.status_code == 200:
    print('API request successful!')
  else:
    print('WARNING: API request unsuccessful!')

  # get total number of pages
  max_page = response.json()['meta']['max_page']

  # get request for all pages
  if (all_pages == True) and (max_page > 1):
    # create list with first page results
    response_list = [response]

    start_time = time.time()
    # iterate through each page
    print('Retrieved data for page: 1')
    for i in range(2, max_page + 1):
      print('Requesting data for page:', str(i))

      # set new page
      querystring['page'] = i

      # get response
      response = requests.request("GET", url, headers=headers, params=querystring)

      # add response to list
      response_list.append(response)

    print("--- %s seconds ---" % (time.time() - start_time))

    return response_list

  # single page requested (or only one page exists): reuse the response already retrieved
  return response

In [29]:
def get_short_term_rentals(rapid_api_key, 
                          town_id,
                          page=1, # default first page
                          property_type=None,
                          low_price_range=None,
                          high_price_range=None,
                          bedrooms=None,
                          all_pages=False): # get data for all pages
  """
  Search for property short-term rental listings in a geographical area. 
  Returns the property features, image url, and coordinates.

  Parameters
  ----------
  @rapid_api_key [string]: Key to access data from Rapid API
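  @town_id [string]: Town to search, e.g. 'Carolina'
  @page [string]: Page number
  @property_type [string]: 
    apartamento|apartamento/walkup|casa|comercial|finca|multifamiliar|solar
  @low_price_range [string]: Minimum price
  @high_price_range [string]: Maximum price
  @bedrooms [string]: Exact number of bedrooms
  @all_pages [boolean]: If True, request every page of results

  Returns
  -------
  A requests.Response for a single page, or a list of requests.Response
  objects when all_pages is True and the search spans more than one page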

  """

  #####################
  #   REQUEST DATA    #
  #####################

  # parameters
  if bedrooms is not None:
    bedrooms = str(int(bedrooms))
  if low_price_range is not None:
    low_price_range = str(int(low_price_range))
  if high_price_range is not None:
    high_price_range = str(int(high_price_range))

  # request
  url = "https://puerto-rico-real-estate.p.rapidapi.com/short_term_rental/search"

  querystring = {
    "town_id":town_id, # required
    "page":str(page),
    "property_type": property_type,
    "low_price_range": low_price_range,
    "high_price_range": high_price_range,
    "bedrooms": bedrooms
  }

  headers = {
    "X-RapidAPI-Key": rapid_api_key,
    "X-RapidAPI-Host": "puerto-rico-real-estate.p.rapidapi.com"
  }

  response = requests.request("GET", url, headers=headers, params=querystring)

  #########################
  #   RESPONSE CONTENT    #
  #########################
  
  # get contents of response
  if response.status_code == 200:
    print('API request successful!')
  else:
    print('WARNING: API request unsuccessful!')

  # get total number of pages
  max_page = response.json()['meta']['max_page']

  # get request for all pages
  if (all_pages == True) and (max_page > 1):
    # create list with first page results
    response_list = [response]

    start_time = time.time()
    # iterate through each page
    print('Retrieved data for page: 1')
    for i in range(2, max_page + 1):
      print('Requesting data for page:', str(i))

      # set new page
      querystring['page'] = i

      # get response
      response = requests.request("GET", url, headers=headers, params=querystring)

      # add response to list
      response_list.append(response)

    print("--- %s seconds ---" % (time.time() - start_time))

    return response_list

  # single page requested (or only one page exists): reuse the response already retrieved
  return response

### General Functions

In [30]:
def transform_listings_to_df(response_list):
  return pd.concat([pd.DataFrame(r.json()['properties']) for r in response_list])

In [31]:
def get_num_bedrooms(x):
  try:
    if (x == None) or (x == ''):
      return 0
    else:
      return int(''.join(e for e in x if e.isalnum()).strip()[0])
  except:
    return 0
def get_num_bathrooms(x):
  try:
    if (x == None) or (x == ''):
      return 0
    else:
      return int(x[0])
  except:
    return 0

In [32]:
def get_kepler_map_config():
  return {'config': {'mapState': {'bearing': 0,
   'dragRotate': False,
   'isSplit': False,
   'latitude': 18.455909,
   'longitude': -66.06917085,
   'pitch': 0,
   'zoom': 13},
  'mapStyle': {'mapStyles': {},
   'styleType': 'dark',
   'threeDBuildingColor': [9.665468314072013,
    17.18305478057247,
    31.1442867897876],
   'topLayerGroups': {},
   'visibleLayerGroups': {'3d building': False,
    'border': False,
    'building': True,
    'label': True,
    'land': True,
    'road': True,
    'water': True}},
  'visState': {'animationConfig': {'currentTime': None, 'speed': 1},
   'filters': [],
   'interactionConfig': {'brush': {'enabled': False, 'size': 0.5},
    'coordinate': {'enabled': False},
    'geocoder': {'enabled': False},
    'tooltip': {'compareMode': False,
     'compareType': 'absolute',
     'enabled': True,
     'fieldsToShow': {'data_1': [{'format': None, 'name': 'street_address'},
       {'format': None, 'name': 'bedrooms'},
       {'format': None, 'name': 'bathrooms'},
       {'format': None, 'name': 'list_price'},
       {'format': None, 'name': 'property_type'}]}}},
   'layerBlending': 'normal',
   'layers': [{'config': {'color': [18, 147, 154],
      'columns': {'altitude': None, 'lat': 'lat', 'lng': 'lon'},
      'dataId': 'data_1',
      'hidden': False,
      'highlightColor': [252, 242, 26, 255],
      'isVisible': True,
      'label': 'Point',
      'textLabel': [{'alignment': 'center',
        'anchor': 'start',
        'color': [255, 255, 255],
        'field': None,
        'offset': [0, 0],
        'size': 18}],
      'visConfig': {'colorRange': {'category': 'Uber',
        'colors': ['#5A1846',
         '#900C3F',
         '#C70039',
         '#E3611C',
         '#F1920E',
         '#FFC300'],
        'name': 'Global Warming',
        'type': 'sequential'},
       'filled': True,
       'fixedRadius': False,
       'opacity': 0.8,
       'outline': False,
       'radius': 10,
       'radiusRange': [0, 50],
       'strokeColor': None,
       'strokeColorRange': {'category': 'Uber',
        'colors': ['#5A1846',
         '#900C3F',
         '#C70039',
         '#E3611C',
         '#F1920E',
         '#FFC300'],
        'name': 'Global Warming',
        'type': 'sequential'},
       'thickness': 2}},
     'id': '2nbr5t',
     'type': 'point',
     'visualChannels': {'colorField': {'name': 'num_bedrooms',
       'type': 'integer'},
      'colorScale': 'quantile',
      'sizeField': None,
      'sizeScale': 'linear',
      'strokeColorField': None,
      'strokeColorScale': 'quantile'}},
    {'config': {'color': [221, 178, 124],
      'columns': {'altitude': None, 'lat': 'latitude', 'lng': 'longitude'},
      'dataId': 'data_1',
      'hidden': False,
      'highlightColor': [252, 242, 26, 255],
      'isVisible': False,
      'label': 'Point',
      'textLabel': [{'alignment': 'center',
        'anchor': 'start',
        'color': [255, 255, 255],
        'field': None,
        'offset': [0, 0],
        'size': 18}],
      'visConfig': {'colorRange': {'category': 'Uber',
        'colors': ['#5A1846',
         '#900C3F',
         '#C70039',
         '#E3611C',
         '#F1920E',
         '#FFC300'],
        'name': 'Global Warming',
        'type': 'sequential'},
       'filled': True,
       'fixedRadius': False,
       'opacity': 0.8,
       'outline': False,
       'radius': 10,
       'radiusRange': [0, 50],
       'strokeColor': None,
       'strokeColorRange': {'category': 'Uber',
        'colors': ['#5A1846',
         '#900C3F',
         '#C70039',
         '#E3611C',
         '#F1920E',
         '#FFC300'],
        'name': 'Global Warming',
        'type': 'sequential'},
       'thickness': 2}},
     'id': 'cxqq17x',
     'type': 'point',
     'visualChannels': {'colorField': {'name': 'num_bedrooms',
       'type': 'integer'},
      'colorScale': 'quantile',
      'sizeField': None,
      'sizeScale': 'linear',
      'strokeColorField': None,
      'strokeColorScale': 'quantile'}},
    {'config': {'color': [146, 38, 198],
      'columns': {'lat0': 'lat',
       'lat1': 'latitude',
       'lng0': 'lon',
       'lng1': 'longitude'},
      'dataId': 'data_1',
      'hidden': False,
      'highlightColor': [252, 242, 26, 255],
      'isVisible': False,
      'label': ' ->  arc',
      'textLabel': [{'alignment': 'center',
        'anchor': 'start',
        'color': [255, 255, 255],
        'field': None,
        'offset': [0, 0],
        'size': 18}],
      'visConfig': {'colorRange': {'category': 'Uber',
        'colors': ['#5A1846',
         '#900C3F',
         '#C70039',
         '#E3611C',
         '#F1920E',
         '#FFC300'],
        'name': 'Global Warming',
        'type': 'sequential'},
       'opacity': 0.8,
       'sizeRange': [0, 10],
       'targetColor': None,
       'thickness': 2}},
     'id': '2d7ikce',
     'type': 'arc',
     'visualChannels': {'colorField': None,
      'colorScale': 'quantile',
      'sizeField': None,
      'sizeScale': 'linear'}},
    {'config': {'color': [136, 87, 44],
      'columns': {'alt0': None,
       'alt1': None,
       'lat0': 'lat',
       'lat1': 'latitude',
       'lng0': 'lon',
       'lng1': 'longitude'},
      'dataId': 'data_1',
      'hidden': False,
      'highlightColor': [252, 242, 26, 255],
      'isVisible': False,
      'label': ' ->  line',
      'textLabel': [{'alignment': 'center',
        'anchor': 'start',
        'color': [255, 255, 255],
        'field': None,
        'offset': [0, 0],
        'size': 18}],
      'visConfig': {'colorRange': {'category': 'Uber',
        'colors': ['#5A1846',
         '#900C3F',
         '#C70039',
         '#E3611C',
         '#F1920E',
         '#FFC300'],
        'name': 'Global Warming',
        'type': 'sequential'},
       'elevationScale': 1,
       'opacity': 0.8,
       'sizeRange': [0, 10],
       'targetColor': None,
       'thickness': 2}},
     'id': '7hkvr37',
     'type': 'line',
     'visualChannels': {'colorField': None,
      'colorScale': 'quantile',
      'sizeField': None,
      'sizeScale': 'linear'}},
    {'config': {'color': [255, 153, 31],
      'columns': {'geojson': 'geometry'},
      'dataId': 'data_1',
      'hidden': False,
      'highlightColor': [252, 242, 26, 255],
      'isVisible': True,
      'label': 'data_1',
      'textLabel': [{'alignment': 'center',
        'anchor': 'start',
        'color': [255, 255, 255],
        'field': None,
        'offset': [0, 0],
        'size': 18}],
      'visConfig': {'colorRange': {'category': 'ColorBrewer',
        'colors': ['#ffffb2',
         '#fed976',
         '#feb24c',
         '#fd8d3c',
         '#f03b20',
         '#bd0026'],
        'name': 'ColorBrewer YlOrRd-6',
        'type': 'sequential'},
       'elevationScale': 5,
       'enable3d': False,
       'enableElevationZoomFactor': True,
       'filled': True,
       'heightRange': [0, 500],
       'opacity': 0.8,
       'radius': 20,
       'radiusRange': [0, 50],
       'sizeRange': [0, 10],
       'strokeColor': None,
       'strokeColorRange': {'category': 'Uber',
        'colors': ['#5A1846',
         '#900C3F',
         '#C70039',
         '#E3611C',
         '#F1920E',
         '#FFC300'],
        'name': 'Global Warming',
        'type': 'sequential'},
       'strokeOpacity': 0.8,
       'stroked': False,
       'thickness': 0.5,
       'wireframe': False}},
     'id': 'xetjzy',
     'type': 'geojson',
     'visualChannels': {'colorField': {'name': 'list_price_int',
       'type': 'integer'},
      'colorScale': 'quantile',
      'heightField': None,
      'heightScale': 'linear',
      'radiusField': None,
      'radiusScale': 'linear',
      'sizeField': None,
      'sizeScale': 'linear',
      'strokeColorField': None,
      'strokeColorScale': 'quantile'}}],
   'splitMaps': []}},
 'version': 'v1'}

## <font color="blue">Locals & Constants</font>

In [34]:
########################################################################
# OPTIONAL                                                             #
# 1) Create a file called 'api_keys.csv' to store your Rapid API key   #
# 2) Read in the file                                                  #
# 3) Set the Rapid API key to the rapid_api_key variable               #
########################################################################

# mount drive
drive.mount('/content/drive', force_remount=False)

# data location
file_dir = '/content/drive/My Drive/Colab Data/input/' # optional

Mounted at /content/drive


In [35]:
# read in api key file
df_api_keys = pd.read_csv(file_dir + 'api_keys.csv')

# get keys
rapid_api_key = df_api_keys.loc[df_api_keys['API'] =='rapid']['KEY'].iloc[0] # replace this with your own key

## <font color="blue">Data</font>

### <font color="green">Section #1 - API Requests</font> 🇵🇷
This section will cover how to make API requests to the [Puerto Rico Real Estate API](https://bit.ly/3cexfP2). It demonstrates how to modify your search based on different parameters.

#### <font color="purple">1. Properties For-Sale</font> 🏘

Step #1 - <b>Select a location</b> to retrieve for sale listings

In [36]:
# select a town ID 
town_id = 'San Juan - Condado-Miramar'

# get data for town
for_sale_response = get_for_sale_listings(
    rapid_api_key=rapid_api_key, 
    town_id=town_id,
    low_price_range=5000, # minimum price of property
    high_price_range=500000 # maximum price of property
)

API request successful!


Step #2 - View response in text (string) format

In [37]:
# view raw text response before we transform into a table
for_sale_response.text

'{"status":"success","properties":[{"street_address":"Apartamento de  4 habitaciones en Mir...","bedrooms":"4 Cuartos","bathrooms":"2 Baños","list_price":"$320,000","property_type":"Apt/WalkUp","town":"Sector-Miramar , San Juan - Condado-Miramar","image_url":"https://imgcache.clasificadosonline.com//PP/FS/2022/7/21/07212022182420okwwejud.jpg","url":"https://www.clasificadosonline.com/UDRealEstateDetail.asp?ID=4685784","lat":"18.4540053","lon":"-66.0832357"},{"street_address":"OFICINA CONDADO PERFECTA MEDICOS ó AB...","bedrooms":"","bathrooms":"3 Baños","list_price":"$375,000","property_type":"Comercial","town":"Condominio-Adaligia , San Juan - Condado-Miramar","image_url":"https://imgcache.clasificadosonline.com//PP/FS/2022/7/28/07282022124117s4otwwkb.jpg","url":"https://www.clasificadosonline.com/UDRealEstateDetail.asp?ID=4686889","lat":"","lon":""},{"street_address":"Cond. Kingsville, SJ, Condado","bedrooms":"","bathrooms":"1 Baños","list_price":"$260,000","property_type":"Apartament

Step #3 - View attributes of the response, including the `listing_url` and the number of pages, `max_page`

In [38]:
print('Response keys:', list(for_sale_response.json().keys()))

# view attributes in 'meta' field
for key, value in for_sale_response.json()['meta'].items():
  print(key, ':', value)

Response keys: ['status', 'properties', 'meta']
listing_url : https://www.clasificadosonline.com/UDREListing.asp?RESPueblos=San+Juan+-+Condado-Miramar&Category=%25&LowPrice=5000&HighPrice=500000&Bedrooms=%25&Area=&BtnSearchListing=See+Listing&redirecturl=%2Fudrelistingmap.asp&IncPrecio=1&offset=0
current_page : 1
max_page : 2
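
The `meta` block shown above is only present when a request succeeds. A defensive way to read `max_page` might look like the sketch below. The helper name `get_max_page` and the shape of the error payload are assumptions for illustration; the `meta`, `current_page`, and `max_page` keys come from the output above.

```python
def get_max_page(payload, default=1):
    """Return meta.max_page from a decoded response, or default if it is absent."""
    meta = payload.get("meta") or {}
    try:
        return int(meta.get("max_page", default))
    except (TypeError, ValueError):
        return default


ok_payload = {
    "status": "success",
    "properties": [],
    "meta": {"current_page": 1, "max_page": 2},
}
error_payload = {"message": "hypothetical error body"}  # assumed shape, not from the API docs

print(get_max_page(ok_payload))
print(get_max_page(error_payload))
```

Falling back to a single page keeps a failed request from raising a `KeyError` before the warning message can be acted on.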


Step #4 - Get data for ALL pages with listings

In [39]:
# get data for town
for_sale_response_list = get_for_sale_listings(
    rapid_api_key=rapid_api_key, 
    town_id=town_id,
    low_price_range=5000, # minimum price of property
    high_price_range=500000, # maximum price of property
    all_pages=True # *get data for ALL pages*
)

API request successful!
Retrieved data for page: 1
Requesting data for page: 2
--- 3.385178327560425 seconds ---
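
The paging loop behind `all_pages=True` can be pictured as follows. This is a simplified sketch, not the notebook's function: `collect_all_pages` and `fake_fetch_page` are hypothetical names, and the fetcher is a stand-in that returns canned payloads so the loop can be run without an API key. The optional pause between requests is an assumption for politeness toward the API, not something it is known to require.

```python
import time


def collect_all_pages(fetch_page, pause_seconds=0.0):
    """Call fetch_page(1), then every later page reported in meta.max_page."""
    first = fetch_page(1)
    pages = [first]
    max_page = first["meta"]["max_page"]
    for page in range(2, max_page + 1):
        time.sleep(pause_seconds)  # optional pause between requests
        pages.append(fetch_page(page))
    return pages


# Stand-in for the real API call: returns canned payloads for two pages
def fake_fetch_page(page):
    return {
        "properties": [{"page_seen": page}],
        "meta": {"current_page": page, "max_page": 2},
    }


demo_pages = collect_all_pages(fake_fetch_page)
print(len(demo_pages))
```

Because the result is always a list, downstream code can treat one-page and multi-page searches the same way.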


Step #5 - Transform text into a dataframe (table with rows and columns)

In [40]:
# transform list of responses into a single dataframe
df_for_sale = transform_listings_to_df(for_sale_response_list)
print('Num of rows:', len(df_for_sale))
print('Num of columns:', len(df_for_sale.columns))
df_for_sale.head() # preview first 5 rows

Num of rows: 59
Num of columns: 10


Unnamed: 0,street_address,bedrooms,bathrooms,list_price,property_type,town,image_url,url,lat,lon
0,Apartamento de 4 habitaciones en Mir...,4 Cuartos,2 Baños,"$320,000",Apt/WalkUp,"Sector-Miramar , San Juan - Condado-Miramar",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,18.4540053,-66.0832357
1,OFICINA CONDADO PERFECTA MEDICOS ó AB...,,3 Baños,"$375,000",Comercial,"Condominio-Adaligia , San Juan - Condado-Miramar",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,,
2,"Cond. Kingsville, SJ, Condado",,1 Baños,"$260,000",Apartamento,"Condominio-Kings Court , San Juan - Condado-Mi...",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,18.452478,-66.061168
3,"HERMOSA VISTA --- Condo. Hilltop, Mir...",2 Cuartos,2 Baños,"$375,000",Apartamento,"Condominio-Hill Top , San Juan - Condado-Miramar",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,,
4,"FULLY REMODELED & FULLY FURNISHED,GRE...",2 Cuartos,1 Baños,"$305,000",Apartamento,"Sector-Condado , San Juan - Condado-Miramar",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,18.45189,-66.06698
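
In the preview above, `lat` and `lon` arrive as text, and some listings have no coordinates at all. Before mapping, those values need to become numbers, with blank entries set aside. The sketch below shows one way to do that on plain dictionaries; the helper name `parse_coordinate` is hypothetical, and the sample rows reuse values from the preview.

```python
def parse_coordinate(value):
    """Convert a coordinate string to float; return None when it is blank or invalid."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


demo_rows = [
    {"lat": "18.4540053", "lon": "-66.0832357"},
    {"lat": "", "lon": ""},  # listing without coordinates
]

mappable_rows = []
for row in demo_rows:
    lat, lon = parse_coordinate(row["lat"]), parse_coordinate(row["lon"])
    if lat is not None and lon is not None:
        mappable_rows.append({"lat": lat, "lon": lon})

print(mappable_rows)
```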


Step #6 - View contents of dataframe

In [41]:
# view count of property types
df_grp1 = df_for_sale.groupby(['property_type'])['street_address'].count()\
  .reset_index()\
  .rename(columns={'street_address': 'count'})
df_grp1['prct'] = df_grp1['count'] / df_grp1['count'].sum()
print(df_grp1)

fig = px.pie(df_grp1, values='count', names='property_type', 
             title='Count of properties by type')
fig.show()

   property_type  count      prct
0    Apartamento     49  0.830508
1     Apt/WalkUp      3  0.050847
2           Casa      1  0.016949
3      Comercial      4  0.067797
4  MultiFamiliar      1  0.016949
5          Solar      1  0.016949
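
For readers who want to see the arithmetic behind the `prct` column, the same count-and-share calculation can be sketched without pandas. The helper name `share_by_type` is hypothetical; the demo list is rebuilt from the counts printed above.

```python
from collections import Counter


def share_by_type(property_types):
    """Return {property_type: (count, share_of_total)} for a list of type labels."""
    counts = Counter(property_types)
    total = sum(counts.values())
    return {name: (n, n / total) for name, n in counts.items()}


# Rebuild the labels from the counts in the table above (59 listings in total)
demo_types = (
    ["Apartamento"] * 49
    + ["Apt/WalkUp"] * 3
    + ["Comercial"] * 4
    + ["Casa", "MultiFamiliar", "Solar"]
)
demo_shares = share_by_type(demo_types)
print(demo_shares["Apartamento"])
```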


In [42]:
# add features
df_for_sale_feat = df_for_sale.copy()
df_for_sale_feat['num_bedrooms'] = df_for_sale_feat.apply(lambda x: 
  get_num_bedrooms(x['bedrooms']), axis=1)
df_for_sale_feat['num_bathrooms'] = df_for_sale_feat.apply(lambda x: 
  get_num_bathrooms(x['bathrooms']), axis=1)
df_for_sale_feat['list_price_int'] = df_for_sale_feat.apply(lambda x: 
  int(''.join(e for e in x['list_price'] if e.isalnum())), axis=1)
df_for_sale_feat['property_attributes'] = df_for_sale_feat.apply(lambda x: 
  'Bds: {0}, Bths: {1}'.format(str(x['num_bedrooms']), str(x['num_bathrooms'])), axis=1)
df_for_sale_feat.head(2)

Unnamed: 0,street_address,bedrooms,bathrooms,list_price,property_type,town,image_url,url,lat,lon,num_bedrooms,num_bathrooms,list_price_int,property_attributes
0,Apartamento de 4 habitaciones en Mir...,4 Cuartos,2 Baños,"$320,000",Apt/WalkUp,"Sector-Miramar , San Juan - Condado-Miramar",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,18.4540053,-66.0832357,4,2,320000,"Bds: 4, Bths: 2"
1,OFICINA CONDADO PERFECTA MEDICOS ó AB...,,3 Baños,"$375,000",Comercial,"Condominio-Adaligia , San Juan - Condado-Miramar",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,,,0,3,375000,"Bds: 0, Bths: 3"
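
The feature columns above are derived from text such as `$320,000` and `4 Cuartos`. An alternative way to parse those strings, shown here only as a sketch, is to use regular expressions so that multi-digit counts are read whole. The helper names `parse_price` and `parse_leading_count` are hypothetical; returning 0 for a missing count mirrors the notebook's helpers.

```python
import re


def parse_price(text):
    """Return the digits of a price string such as '$320,000' as an int, or None."""
    digits = re.sub(r"[^0-9]", "", text or "")
    return int(digits) if digits else None


def parse_leading_count(text):
    """Return the first whole number in text such as '4 Cuartos', or 0 if none."""
    match = re.search(r"\d+", text or "")
    return int(match.group()) if match else 0


demo_price = parse_price("$320,000")
demo_bedrooms = parse_leading_count("4 Cuartos")
demo_missing = parse_leading_count("")
print(demo_price, demo_bedrooms, demo_missing)
```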


In [43]:
# view count of property attributes (bed / bath)
df_grp2 = df_for_sale_feat.groupby(['property_attributes'])['street_address'].count()\
  .reset_index()\
  .rename(columns={'street_address': 'count'})\
  .sort_values(by=['property_attributes'])
df_grp2['prct'] = df_grp2['count'] / df_grp2['count'].sum()
print(df_grp2)

fig = px.pie(df_grp2, values='count', names='property_attributes', 
             title='Count of properties by bed / bath')
fig.show()

   property_attributes  count      prct
0      Bds: 0, Bths: 0      1  0.016949
1      Bds: 0, Bths: 1     13  0.220339
2      Bds: 0, Bths: 2      1  0.016949
3      Bds: 0, Bths: 3      1  0.016949
4      Bds: 1, Bths: 1     20  0.338983
5      Bds: 1, Bths: 2      2  0.033898
6      Bds: 2, Bths: 1      5  0.084746
7      Bds: 2, Bths: 2      5  0.084746
8      Bds: 3, Bths: 1      4  0.067797
9      Bds: 3, Bths: 2      2  0.033898
10     Bds: 3, Bths: 3      2  0.033898
11     Bds: 4, Bths: 2      1  0.016949
12     Bds: 6, Bths: 2      1  0.016949
13     Bds: 6, Bths: 5      1  0.016949


In [44]:
# view distribution of bedrooms / bathrooms / price
fig1 = px.histogram(df_for_sale_feat, x="num_bedrooms", title="Distribution of Bedrooms")
fig1.show()

fig2 = px.histogram(df_for_sale_feat, x="num_bathrooms", title="Distribution of Bathrooms")
fig2.show()

fig3 = px.box(df_for_sale_feat, y="list_price_int", title="Box Plot of List Price")
fig3.show()

Step #7 - Export file

In [21]:
# uncomment below to download file
# df_for_sale.to_csv('puerto_rico_listings_for_sale.csv', index=False)
# files.download('puerto_rico_listings_for_sale.csv')

#### <font color="purple">2. Properties For-Rent Long-Term</font> ✌

Step #1 - <b>View all towns</b> available to retrieve rental listings.

In [45]:
# get all rental towns
rental_town_response = get_all_rental_towns(rapid_api_key)

Step #2 - View response in a DataFrame

In [46]:
# transform list of responses into a single dataframe
df_rental_towns = pd.DataFrame(rental_town_response.json())
print('Num of rows:', len(df_rental_towns))
print('Num of columns:', len(df_rental_towns.columns))
df_rental_towns.head()

Num of rows: 92
Num of columns: 2


Unnamed: 0,id,name
0,%,All
1,Metro,Área Metro
2,Central,Área Central
3,Este,Área Este/ East
4,Norte,Área Norte/North


Step #3 - Get long-term rental data for multiple towns

In [47]:
# view list of towns
town_list = ['Carolina', 'Ponce']
df_rental_towns_fltr = df_rental_towns.loc[df_rental_towns['name'].isin(town_list)]
df_rental_towns_fltr

Unnamed: 0,id,name
22,Carolina,Carolina
66,Ponce,Ponce


In [48]:
long_term_rent_df_list = []

# set up a loop to get rental data for multiple towns
for town_id in df_rental_towns_fltr['id'].tolist():
  print('Getting data for {}'.format(town_id))

  # get data for town
  long_term_rent_response = get_long_term_rentals(
    rapid_api_key=rapid_api_key, 
    town_id=town_id,
    all_pages=True
  )

  # collect all responses
  _df = transform_listings_to_df(long_term_rent_response)
  long_term_rent_df_list.append(_df)

Getting data for Carolina
API request successful!
Retrieved data for page: 1
Requesting data for page: 2
Requesting data for page: 3
--- 3.8389475345611572 seconds ---
Getting data for Ponce
API request successful!
Retrieved data for page: 1
Requesting data for page: 2
Requesting data for page: 3
Requesting data for page: 4
--- 5.278387069702148 seconds ---


In [49]:
# long term rental
df_ltr = pd.concat(long_term_rent_df_list)
print('Num of rows:', len(df_ltr))
print('Num of columns:', len(df_ltr.columns))
df_ltr.head()

Num of rows: 175
Num of columns: 11


Unnamed: 0,street_address,bedrooms,bathrooms,list_price,property_type,town,sub_location,image_url,url,lat,lon
0,"Cond. Reef Tower, Carolina.",2 Cuartos,1 Baños,"$3,500",Apartamento,Condominio - Reef Tower,Carolina,https://imgcache.clasificadosonline.com//PP/FR...,https://www.clasificadosonline.com/UDRentalsDe...,18.447076,-66.03398
1,"COND TOMASVILLE PARK, APARTAMENTO EN CAROLINA",3 Cuartos,2 Baños,"$1,100",Apartamento/WalkUp,Condominio - Thomasville Park,Carolina,https://imgcache.clasificadosonline.com//PP/FR...,https://www.clasificadosonline.com/UDRentalsDe...,18.379945,-65.977515
2,CASA ABAJO DE 3 H Y 1 B. TIENE PATIO,3 Cuartos,1 Baños,$900,Casa,Urbanizacion - Villa Carolina,Carolina,https://imgcache.clasificadosonline.com//PP/FR...,https://www.clasificadosonline.com/UDRentalsDe...,18.4170787,-65.9600687
3,MARBELLA ISLA VERDE,1 Cuartos,1 Baños,"$2,350",Apartamento,Condominio - Marbella Del Caribe,Carolina,https://imgcache.clasificadosonline.com//PP/FR...,https://www.clasificadosonline.com/UDRentalsDe...,18.443203,-66.023723
4,Pequeño Estudio para 1 persona $500 Carolina,,1 Baños,$500,Apartamento,Urbanizacion - Villa Carolina,Carolina,https://imgcache.clasificadosonline.com//PP/FR...,https://www.clasificadosonline.com/UDRentalsDe...,18.4170787,-65.9600687
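
One detail worth checking after combining per-town frames: `pd.concat` keeps each frame's own index by default, so if every frame starts at 0 the combined index contains repeated labels. The sketch below uses two small stand-in frames with made-up rows (not API data) to show the difference when `ignore_index=True` is passed.

```python
import pandas as pd

# two small stand-in frames (made-up rows, not API data)
df_a = pd.DataFrame({'sub_location': ['Carolina', 'Carolina'],
                     'list_price': ['$900', '$500']})
df_b = pd.DataFrame({'sub_location': ['Ponce'],
                     'list_price': ['$800']})

# default behavior: each frame keeps its own index labels
stacked = pd.concat([df_a, df_b])
print(stacked.index.tolist())   # [0, 1, 0]

# ignore_index=True builds a fresh 0..n-1 index
combined = pd.concat([df_a, df_b], ignore_index=True)
print(combined.index.tolist())  # [0, 1, 2]
```

A unique index matters mostly for label-based lookups such as `.loc[0]`, which would otherwise return one row per town.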


Step #5 - Add features

In [50]:
# add features
df_ltr_feat = df_ltr.copy()

# remove invalid / missing list prices
df_ltr_feat = df_ltr_feat.loc[~df_ltr_feat['list_price'].isnull()]
print('Removed {0}% of data'.format( 
    round( ((len(df_ltr) - len(df_ltr_feat)) / len(df_ltr)) * 100, 2)))

df_ltr_feat['num_bedrooms'] = df_ltr_feat.apply(lambda x: 
  get_num_bedrooms(x['bedrooms']), axis=1)
df_ltr_feat['num_bathrooms'] = df_ltr_feat.apply(lambda x: 
  get_num_bathrooms(x['bathrooms']), axis=1)
# keep only the digits of the price string (e.g. "$3,500" -> 3500)
df_ltr_feat['list_price_int'] = df_ltr_feat.apply(lambda x: 
  int(''.join(e for e in x['list_price'] if e.isdigit())), axis=1)
df_ltr_feat['property_attributes'] = df_ltr_feat.apply(lambda x: 
  'Bds: {0}, Bths: {1}'.format(str(x['num_bedrooms']), str(x['num_bathrooms'])), axis=1)
df_ltr_feat.head(2)

Removed 4.57% of data


Unnamed: 0,street_address,bedrooms,bathrooms,list_price,property_type,town,sub_location,image_url,url,lat,lon,num_bedrooms,num_bathrooms,list_price_int,property_attributes
0,"Cond. Reef Tower, Carolina.",2 Cuartos,1 Baños,"$3,500",Apartamento,Condominio - Reef Tower,Carolina,https://imgcache.clasificadosonline.com//PP/FR...,https://www.clasificadosonline.com/UDRentalsDe...,18.447076,-66.03398,2,1,3500,"Bds: 2, Bths: 1"
1,"COND TOMASVILLE PARK, APARTAMENTO EN CAROLINA",3 Cuartos,2 Baños,"$1,100",Apartamento/WalkUp,Condominio - Thomasville Park,Carolina,https://imgcache.clasificadosonline.com//PP/FR...,https://www.clasificadosonline.com/UDRentalsDe...,18.379945,-65.977515,3,2,1100,"Bds: 3, Bths: 2"
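
Price strings from listings are not always a plain amount: they may be missing, or carry a text prefix such as the `Desde: $125` format that appears in the short-term rental data in this notebook. The sketch below shows one way to parse them defensively. `parse_price` is a hypothetical helper for illustration only, and it assumes prices are whole-dollar amounts written with a `$` sign.

```python
import re

def parse_price(text):
    """Return the first dollar amount in text as an int, or None (hypothetical helper)."""
    if not isinstance(text, str):
        return None  # covers None and NaN
    match = re.search(r'\$\s*([\d,]+)', text)
    if match is None:
        return None
    digits = match.group(1).replace(',', '')
    return int(digits) if digits else None

print(parse_price('$3,500'))       # 3500
print(parse_price('Desde: $125'))  # 125
print(parse_price(None))           # None
```

Returning `None` instead of raising lets the caller decide whether to drop or keep rows without a usable price.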


Step #6 - View list price statistics

❓ Questions our Data Analysis Helps to Answer:
1. What is the median rental price for each bedroom count?
2. Which town is more affordable?

In [51]:
df_ltr_feat.groupby(['sub_location', 'num_bedrooms'])\
  .agg({'list_price_int': ['median', 'mean', 'min', 'max']}).round()

sub_location,num_bedrooms,median list_price_int,mean list_price_int,min list_price_int,max list_price_int
Carolina,0,500.0,557.0,400,800
Carolina,1,500.0,587.0,350,2350
Carolina,2,690.0,897.0,500,3500
Carolina,3,1400.0,1462.0,700,3500
Carolina,4,2475.0,2475.0,950,4000
Ponce,0,940.0,6453.0,650,25000
Ponce,1,550.0,721.0,220,4000
Ponce,2,800.0,977.0,475,2100
Ponce,3,1500.0,1550.0,280,5000
Ponce,4,1700.0,1942.0,400,5000
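
In the table above, the Ponce 0-bedroom group has a mean (6453) far above its median (940) and a maximum of 25000, which suggests that a small number of very large prices pull the mean up. The sketch below uses made-up rents (not API data) to illustrate why the median is usually the safer summary for listing prices.

```python
from statistics import mean, median

# made-up monthly rents; the last value mimics a mis-listed price
rents = [650, 900, 980, 25000]

print(median(rents))  # 940.0
print(mean(rents))    # 6882.5

# dropping the extreme value barely moves the median but changes the mean a lot
typical = [r for r in rents if r < 10000]
print(median(typical))  # 900
```

The 10000 cutoff is arbitrary and only serves the illustration; a real analysis would need a threshold chosen from the data.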


#### <font color="purple">3. Properties For-Rent Short-Term</font> 🌴


Step #1 - Get short-term rental data for a single town

In [52]:
# get data for town
short_term_rent_response = get_short_term_rentals(
  rapid_api_key=rapid_api_key, 
  town_id='metro'
)

# transform the listings in the response into a dataframe
df_str = pd.DataFrame(short_term_rent_response.json()['properties'])
print('Num of rows:', len(df_str))
print('Num of columns:', len(df_str.columns))
df_str.head()

API request successful!
Num of rows: 10
Num of columns: 7


Unnamed: 0,name,property_type,price,image_url,url,lat,lon
0,"Tropical Andalucía Torre I - 3 Cuartos, 1 Baño",Apartamento,Desde: $90,https://imgcache.clasificadosonline.com/PP/Vac...,https://www.clasificadosonline.com/UDVacationD...,18.3865305,-66.0329795
1,COND. ATLANTIC BEACH-ISLA VERDE,Apartamento,Desde: $125,https://imgcache.clasificadosonline.com/PP/Vac...,https://www.clasificadosonline.com/UDVacationD...,18.4424045793644,-66.0260742628296
2,Casa Condado Hotel,Hotel,,https://imgcache.clasificadosonline.com\media\...,https://www.clasificadosonline.com/UDVacationD...,18.4557548821414,-66.0715202188965
3,"Mare St Clair (ESJ), Isla Verde",Apartamento,Desde: $125,https://imgcache.clasificadosonline.com/PP/Vac...,https://www.clasificadosonline.com/UDVacationD...,18.4446825063385,-66.0170947299619
4,San Juan-Condado-Isla Verde,Apartamento,Desde: $130,https://imgcache.clasificadosonline.com/PP/Vac...,https://www.clasificadosonline.com/UDVacationD...,18.4595205716762,-66.0777915926514
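
Unlike the long-term data, the short-term response has no bedroom or bathroom columns, but some listing names include them (for example `3 Cuartos, 1 Baño` in the first row above). The sketch below pulls those counts out with regular expressions where they exist. `parse_rooms` is a hypothetical helper, and it assumes the Spanish labels `Cuarto(s)` and `Baño(s)` seen in this output.

```python
import re

def parse_rooms(name):
    """Return (bedrooms, bathrooms) found in a listing name; None where absent."""
    beds = re.search(r'(\d+)\s*Cuartos?', name, flags=re.IGNORECASE)
    baths = re.search(r'(\d+)\s*Baños?', name, flags=re.IGNORECASE)
    return (int(beds.group(1)) if beds else None,
            int(baths.group(1)) if baths else None)

print(parse_rooms('Tropical Andalucía Torre I - 3 Cuartos, 1 Baño'))  # (3, 1)
print(parse_rooms('COND. ATLANTIC BEACH-ISLA VERDE'))                 # (None, None)
```

Many names carry no room counts at all, so the result is best treated as optional metadata rather than a complete feature.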


In [53]:
df_rental_towns.head()

Unnamed: 0,id,name
0,%,All
1,Metro,Área Metro
2,Central,Área Central
3,Este,Área Este/ East
4,Norte,Área Norte/North


Step #2 - Get short-term rental data for <b>all available</b> rental listings.

In [54]:
listing_url_list = []
max_page_list = []
for index, row in df_rental_towns.iterrows():
  print('Getting data for id:', row['id'])
  short_term_rent_response = get_short_term_rentals(
    rapid_api_key=rapid_api_key, 
    town_id=row['id']
  )
  # parse the response once and reuse the metadata
  meta = short_term_rent_response.json()['meta']
  listing_url_list.append(meta['listing_url'])
  max_page_list.append(meta['max_page'])

Getting data for id: %
API request successful!
Getting data for id: Metro
API request successful!
Getting data for id: Central
API request successful!
Getting data for id: Este
API request successful!
Getting data for id: Norte
API request successful!
Getting data for id: Sur
API request successful!
Getting data for id: Oeste
API request successful!
Getting data for id: Adjuntas
API request successful!
Getting data for id: Aguada
API request successful!
Getting data for id: Aguadilla
API request successful!
Getting data for id: Aguas Buenas
API request successful!
Getting data for id: Aibonito
API request successful!
Getting data for id: Añasco
API request successful!
Getting data for id: Arecibo
API request successful!
Getting data for id: Arroyo
API request successful!
Getting data for id: Barceloneta
API request successful!
Getting data for id: Barranquitas
API request successful!
Getting data for id: Bayamón
API request successful!
Getting data for id: Cabo Rojo
API request success

In [55]:
data = {'id': df_rental_towns['id'].tolist(),
        'listing_url': listing_url_list,
        'max_page': max_page_list}
df_all_st_listings = pd.DataFrame(data)
df_all_st_listings

Unnamed: 0,id,listing_url,max_page
0,%,https://www.clasificadosonline.com/UDVacListin...,3
1,Metro,https://www.clasificadosonline.com/UDVacListin...,1
2,Central,https://www.clasificadosonline.com/UDVacListin...,1
3,Este,https://www.clasificadosonline.com/UDVacListin...,2
4,Norte,https://www.clasificadosonline.com/UDVacListin...,1
...,...,...,...
87,Vega Baja,https://www.clasificadosonline.com/UDVacListin...,1
88,Vieques,https://www.clasificadosonline.com/UDVacListin...,1
89,Villalba,https://www.clasificadosonline.com/UDVacListin...,0
90,Yabucoa,https://www.clasificadosonline.com/UDVacListin...,1
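
The `max_page` column can help plan follow-up requests. The sketch below uses a few stand-in rows copied from the table above and assumes that `max_page` of 0 means there are no short-term listings for that id; the source does not state this explicitly. It also excludes the `%` ("All") entry so its pages are not counted twice.

```python
import pandas as pd

# stand-in rows copied from the table above
df_pages = pd.DataFrame({
    'id': ['%', 'Metro', 'Este', 'Villalba'],
    'max_page': [3, 1, 2, 0],
})

# coerce to numbers in case the API returns page counts as strings
df_pages['max_page'] = pd.to_numeric(df_pages['max_page'], errors='coerce').fillna(0)

# '%' is the "All" entry, so leave it out of per-id totals
entries = df_pages.loc[df_pages['id'] != '%']
with_listings = entries.loc[entries['max_page'] > 0, 'id'].tolist()
total_pages = int(entries['max_page'].sum())

print(with_listings)  # ['Metro', 'Este']
print(total_pages)    # 3
```

Area entries such as `Metro` or `Este` likely overlap with individual towns, so summing pages across both would still overcount listings.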


In [33]:
# uncomment below to download file
# df_all_st_listings.to_csv('puerto_rico_short_term_rentals.csv', index=False)
# files.download('puerto_rico_short_term_rentals.csv')

### <font color="green">Section #2 - Market Analysis</font> 📈
This section will cover how to derive market statistics to monitor market performance.
<br>
*In Development - Coming Soon*

### <font color="green">Section #3 - Visualize Properties</font> 🗺
This section will cover how to visualize property listings on a map.

In [56]:
# visualize for-sale listings

# prepare dataframe
df_for_sale_map = df_for_sale_feat.copy()
df_for_sale_map['latitude'] = pd.to_numeric(df_for_sale_map.lat, errors='coerce')
df_for_sale_map['longitude'] = pd.to_numeric(df_for_sale_map.lon, errors='coerce')

# Create a geopandas dataframe from a regular dataframe
gdf = gpd.GeoDataFrame(df_for_sale_map, geometry=gpd.points_from_xy(
    df_for_sale_map.longitude, df_for_sale_map.latitude))

gdf.head(2)

Unnamed: 0,street_address,bedrooms,bathrooms,list_price,property_type,town,image_url,url,lat,lon,num_bedrooms,num_bathrooms,list_price_int,property_attributes,latitude,longitude,geometry
0,Apartamento de 4 habitaciones en Mir...,4 Cuartos,2 Baños,"$320,000",Apt/WalkUp,"Sector-Miramar , San Juan - Condado-Miramar",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,18.4540053,-66.0832357,4,2,320000,"Bds: 4, Bths: 2",18.454005,-66.083236,POINT (-66.08324 18.45401)
1,OFICINA CONDADO PERFECTA MEDICOS ó AB...,,3 Baños,"$375,000",Comercial,"Condominio-Adaligia , San Juan - Condado-Miramar",https://imgcache.clasificadosonline.com//PP/FS...,https://www.clasificadosonline.com/UDRealEstat...,,,0,3,375000,"Bds: 0, Bths: 3",,,POINT EMPTY
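
The second row above has no coordinates, which produces a `POINT EMPTY` geometry. The sketch below, using pandas only and made-up rows, shows one way to drop such rows before building the map. Whether empty points cause problems in the map widget is not confirmed by the source, so treat this as an optional cleanup step.

```python
import pandas as pd

# made-up rows: one valid, one missing, one with a non-numeric latitude
df = pd.DataFrame({
    'street_address': ['listing A', 'listing B', 'listing C'],
    'lat': ['18.4540053', None, 'n/a'],
    'lon': ['-66.0832357', None, '-66.05'],
})

# coerce invalid or missing values to NaN, as in the cell above
df['latitude'] = pd.to_numeric(df['lat'], errors='coerce')
df['longitude'] = pd.to_numeric(df['lon'], errors='coerce')

# keep only rows where both coordinates are present
df_mappable = df.dropna(subset=['latitude', 'longitude'])
print(len(df), '->', len(df_mappable))  # 3 -> 1
```

Printing the before and after row counts makes it easy to see how many listings are left off the map.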


In [57]:
# create Kepler map
map_config = get_kepler_map_config()
kepler_map = KeplerGl(height=400, data={'data_1': gdf}, config=map_config)
kepler_map

User Guide: https://docs.kepler.gl/docs/keplergl-jupyter


KeplerGl(config={'config': {'mapState': {'bearing': 0, 'dragRotate': False, 'isSplit': False, 'latitude': 18.4…

# End Notebook