# Chapter 10.1 - Putting it all together

Paul E. Anderson

## Ice Breaker

What is the first thing you are going to do once all your assignments are finished and exams are over?

**Learning objectives:** Understand how most of the tools discussed in this textbook work together.

In [1]:
%load_ext autoreload
%autoreload 2

from pathlib import Path
home = str(Path.home())  # all other paths are relative to this path
# If you are working on the course server (the recommended setup), leave this as is.
# Otherwise, change home to the directory where you store the course data.

## Dataset Overview

Airbnb provides a lot of data about its listings, reviews, and neighborhoods in several formats. I have downloaded only a small subset.

In [2]:
!ls -l {home}/csc-369-student/data/airbnb/LA/02_November_2021

total 57100
-rw-r--r-- 1 jupyter-pander14 jupyter-pander14 35253051 Nov 27 16:30 calendar.csv.gz
-rw-r--r-- 1 jupyter-pander14 jupyter-pander14 22483928 Nov 27 16:30 listings.csv.gz
-rw-r--r-- 1 jupyter-pander14 jupyter-pander14     8392 Nov 27 16:30 neighbourhoods.csv
-rw-r--r-- 1 jupyter-pander14 jupyter-pander14   713261 Nov 27 16:30 neighbourhoods.geojson
-rw-r--r-- 1 jupyter-pander14 jupyter-pander14        0 Nov 27 16:30 output.geojson


## Exploring the files

In [27]:
import os

os.environ['PYSPARK_SUBMIT_ARGS'] = "--packages com.datastax.spark:spark-cassandra-connector_2.12:3.1.0,org.mongodb.spark:mongo-spark-connector_2.12:3.0.1"
os.environ['PYSPARK_SUBMIT_ARGS'] += " --conf spark.cassandra.connection.host=127.0.0.1"

os.environ['PYSPARK_SUBMIT_ARGS'] += ' --conf "spark.mongodb.input.uri=mongodb://127.0.0.1/csc-369.neighbourhoods_geo?readPreference=primaryPreferred"'
os.environ['PYSPARK_SUBMIT_ARGS'] += ' --conf "spark.mongodb.output.uri=mongodb://127.0.0.1/csc-369.neighbourhoods_geo"'
os.environ['PYSPARK_SUBMIT_ARGS'] += " pyspark-shell"
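The cell above builds `PYSPARK_SUBMIT_ARGS` by repeated string concatenation. As a minimal sketch, the same value could be assembled from a list of packages and a dictionary of settings, which makes it harder to drop a space or to misplace the trailing `pyspark-shell`. The names `packages`, `conf`, and `submit_args` are illustrative and not part of the original notebook; the values are copied from the cell above.

```python
import os

# Connector packages, copied from the cell above.
packages = [
    "com.datastax.spark:spark-cassandra-connector_2.12:3.1.0",
    "org.mongodb.spark:mongo-spark-connector_2.12:3.0.1",
]

# Spark settings, copied from the cell above.
conf = {
    "spark.cassandra.connection.host": "127.0.0.1",
    "spark.mongodb.input.uri": "mongodb://127.0.0.1/csc-369.neighbourhoods_geo?readPreference=primaryPreferred",
    "spark.mongodb.output.uri": "mongodb://127.0.0.1/csc-369.neighbourhoods_geo",
}

parts = ["--packages", ",".join(packages)]
for key, value in conf.items():
    # Quote each setting so characters such as "?" stay inside one argument.
    parts += ["--conf", f'"{key}={value}"']
parts.append("pyspark-shell")  # pyspark-shell comes last

submit_args = " ".join(parts)
os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args
print(submit_args)
```

As in the original cell, the variable has to be set before the first `SparkContext` is created, because it is only read when the JVM is launched.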


In [4]:
from pyspark import SparkConf
from pyspark.context import SparkContext

sc = SparkContext.getOrCreate(SparkConf().setMaster("local[*]"))

RDDs are great for taking a look at a file you are unfamiliar with:

### calendar.csv.gz

In [5]:
rdd = sc.textFile(f"{home}/csc-369-student/data/airbnb/LA/02_November_2021/calendar.csv.gz")

print("\n".join(rdd.take(10)))

listing_id,date,available,price,adjusted_price,minimum_nights,maximum_nights
37171,2021-11-03,f,$100.00,$100.00,1,30
106061,2021-11-03,f,$83.00,$83.00,30,365
106061,2021-11-04,f,$83.00,$83.00,30,365
106061,2021-11-05,f,$83.00,$83.00,30,365
106061,2021-11-06,f,$83.00,$83.00,30,365
106061,2021-11-07,f,$83.00,$83.00,30,365
106061,2021-11-08,f,$83.00,$83.00,30,365
106061,2021-11-09,f,$83.00,$83.00,30,365
106061,2021-11-10,f,$83.00,$83.00,30,365


We can read the same file directly with Spark SQL:

In [6]:
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Chapter10") \
    .getOrCreate()

In [7]:
df = spark.read.csv(f"{home}/csc-369-student/data/airbnb/LA/02_November_2021/calendar.csv.gz",header=True) 
df

DataFrame[listing_id: string, date: string, available: string, price: string, adjusted_price: string, minimum_nights: string, maximum_nights: string]

Spark read the file, but every column came back as a string. We need to cast each field to its proper type.

In [8]:
from pyspark.sql.functions import regexp_replace, to_date


# Cast each column from string to its proper type.
df = df.withColumn("listing_id", df["listing_id"].cast("long"))
df = df.withColumn("date", to_date(df["date"], "yyyy-MM-dd"))
# Strip the "$" before casting; the raw string keeps the regex escape intact.
df = df.withColumn('price', regexp_replace('price', r'\$', '').cast('double'))
df = df.withColumn('adjusted_price', regexp_replace('adjusted_price', r'\$', '').cast('double'))
df = df.withColumn("minimum_nights", df["minimum_nights"].cast("int"))
df = df.withColumn("maximum_nights", df["maximum_nights"].cast("int"))

df

DataFrame[listing_id: bigint, date: date, available: string, price: double, adjusted_price: double, minimum_nights: int, maximum_nights: int]
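To make the type conversions concrete, here is a minimal plain-Python sketch of the same cleaning applied to two rows, without Spark. The first data row is taken from the file preview above. The second row is hypothetical and only illustrates a price with a thousands separator, which this sketch strips along with the `$`. The function name `parse_calendar_row` is an assumption for illustration.

```python
import csv
import io
from datetime import datetime

sample = '''listing_id,date,available,price,adjusted_price,minimum_nights,maximum_nights
37171,2021-11-03,f,$100.00,$100.00,1,30
99999,2021-11-03,t,"$1,250.00","$1,250.00",2,30
'''  # the second data row is made up for illustration


def parse_price(text):
    """Drop the currency symbol and thousands separators, then convert."""
    cleaned = text.replace("$", "").replace(",", "")
    return float(cleaned) if cleaned else None


def parse_calendar_row(row):
    """Convert the string fields of one calendar row to typed values."""
    return {
        "listing_id": int(row["listing_id"]),
        "date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
        "available": row["available"] == "t",
        "price": parse_price(row["price"]),
        "adjusted_price": parse_price(row["adjusted_price"]),
        "minimum_nights": int(row["minimum_nights"]),
        "maximum_nights": int(row["maximum_nights"]),
    }


parsed = [parse_calendar_row(r) for r in csv.DictReader(io.StringIO(sample))]
for p in parsed:
    print(p)
```

Whether the real file contains prices with thousands separators is not shown in the preview, so treat the second row as an assumption to check against your own data.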

In [9]:
df = df.na.drop(subset='price') # Remove any listings without price because we want to use it in a primary key later

In [10]:
df.take(5)

[Row(listing_id=37171, date=datetime.date(2021, 11, 3), available='f', price=100.0, adjusted_price=100.0, minimum_nights=1, maximum_nights=30),
 Row(listing_id=106061, date=datetime.date(2021, 11, 3), available='f', price=83.0, adjusted_price=83.0, minimum_nights=30, maximum_nights=365),
 Row(listing_id=106061, date=datetime.date(2021, 11, 4), available='f', price=83.0, adjusted_price=83.0, minimum_nights=30, maximum_nights=365),
 Row(listing_id=106061, date=datetime.date(2021, 11, 5), available='f', price=83.0, adjusted_price=83.0, minimum_nights=30, maximum_nights=365),
 Row(listing_id=106061, date=datetime.date(2021, 11, 6), available='f', price=83.0, adjusted_price=83.0, minimum_nights=30, maximum_nights=365)]

In [11]:
df.printSchema()

root
 |-- listing_id: long (nullable = true)
 |-- date: date (nullable = true)
 |-- available: string (nullable = true)
 |-- price: double (nullable = true)
 |-- adjusted_price: double (nullable = true)
 |-- minimum_nights: integer (nullable = true)
 |-- maximum_nights: integer (nullable = true)



In practice we would have many of these files, so it makes sense to process each one and then insert it into a Cassandra database.

In [12]:
from cassandra.cluster import Cluster
cluster = Cluster(['127.0.0.1'], port=9042)  # same host as spark.cassandra.connection.host
session = cluster.connect()

In [13]:
for row in session.execute('select release_version from system.local;'):
    print(row)

Row(release_version='4.0.1')


In [14]:
session.execute("DROP KEYSPACE IF EXISTS airbnb")

<cassandra.cluster.ResultSet at 0x7fca225fcf98>

In [15]:
for row in session.execute("CREATE KEYSPACE airbnb WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 1};"):
    print(row)

# We are using the simplest replica placement strategy, SimpleStrategy,
# with a replication factor of 1, i.e., a single copy of each row.

In [16]:
for row in session.execute('DROP TABLE IF EXISTS airbnb.calendar;'):
    print(row)
for row in session.execute("""
CREATE TABLE airbnb.calendar 
( listing_id bigint, 
  date timestamp, 
  available text, 
  price double,
  adjusted_price double,
  minimum_nights int,
  maximum_nights int,
  
  PRIMARY KEY (date, price, listing_id)
);
"""):
    print(row)

# Item one (date) is the partition key.
# Item two (price) is the first clustering column, so rows within a partition are sorted by price, ascending.
# Item three (listing_id) is the second clustering column. We include it so that each row has a unique primary key.
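The effect of `PRIMARY KEY (date, price, listing_id)` can be sketched in plain Python: rows are grouped by the partition key (`date`) and, inside each partition, kept sorted by the clustering columns (`price`, then `listing_id`). This is only a toy model of the ordering, not of how Cassandra stores data, and the sample rows are illustrative.

```python
from collections import defaultdict

# Illustrative rows as (date, price, listing_id) tuples.
rows = [
    ("2021-11-03", 83.0, 106061),
    ("2021-11-03", 100.0, 37171),
    ("2021-11-04", 83.0, 106061),
    ("2021-11-03", 83.0, 5728),
]

# The partition key decides which partition a row belongs to.
partitions = defaultdict(list)
for date, price, listing_id in rows:
    partitions[date].append((price, listing_id))

# The clustering columns decide the order of rows inside a partition.
for date in partitions:
    partitions[date].sort()

for date, clustered in sorted(partitions.items()):
    print(date, clustered)
```

This ordering is consistent with the query output in the next cells, where prices come back in ascending order.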

In [17]:
df.write.format("org.apache.spark.sql.cassandra").mode('append').options(table='calendar',keyspace='airbnb').save()

In [18]:
for row in session.execute("SELECT price FROM airbnb.calendar LIMIT 50;"):
    print(row)

Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=0.0)
Row(price=10.0)
Row(price=10.0)
Row(price=10.0)
Row(price=10.0)
Row(price=10.0)
Row(price=10.0)
Row(price=11.0)
Row(price=11.0)
Row(price=12.0)
Row(price=12.0)
Row(price=14.0)
Row(price=15.0)
Row(price=16.0)
Row(price=16.0)
Row(price=16.0)
Row(price=16.0)
Row(price=16.0)
Row(price=16.0)
Row(price=17.0)
Row(price=17.0)
Row(price=17.0)
Row(price=17.0)
Row(price=17.0)
Row(price=18.0)
Row(price=18.0)
Row(price=18.0)
Row(price=18.0)
Row(price=18.0)
Row(price=18.0)
Row(price=18.0)
Row(price=18.0)
Row(price=19.0)
Row(price=19.0)
Row(price=19.0)
Row(price=19.0)
Row(price=19.0)
Row(price=19.0)
Row(price=19.0)


### listings.csv.gz

In [19]:
rdd = sc.textFile(f"{home}/csc-369-student/data/airbnb/LA/02_November_2021/listings.csv.gz")

print("\n".join(rdd.take(10)))

id,listing_url,scrape_id,last_scraped,name,description,neighborhood_overview,picture_url,host_id,host_url,host_name,host_since,host_location,host_about,host_response_time,host_response_rate,host_acceptance_rate,host_is_superhost,host_thumbnail_url,host_picture_url,host_neighbourhood,host_listings_count,host_total_listings_count,host_verifications,host_has_profile_pic,host_identity_verified,neighbourhood,neighbourhood_cleansed,neighbourhood_group_cleansed,latitude,longitude,property_type,room_type,accommodates,bathrooms,bathrooms_text,bedrooms,beds,amenities,price,minimum_nights,maximum_nights,minimum_minimum_nights,maximum_minimum_nights,minimum_maximum_nights,maximum_maximum_nights,minimum_nights_avg_ntm,maximum_nights_avg_ntm,calendar_updated,has_availability,availability_30,availability_60,availability_90,availability_365,calendar_last_scraped,number_of_reviews,number_of_reviews_ltm,number_of_reviews_l30d,first_review,last_review,review_scores_rating,review_scores_accuracy,review_sc

In [20]:
listings_df = spark.read.csv(f"{home}/csc-369-student/data/airbnb/LA/02_November_2021/listings.csv.gz",header=True,quote="\"",escape="\"",multiLine=True)
listings_df

DataFrame[id: string, listing_url: string, scrape_id: string, last_scraped: string, name: string, description: string, neighborhood_overview: string, picture_url: string, host_id: string, host_url: string, host_name: string, host_since: string, host_location: string, host_about: string, host_response_time: string, host_response_rate: string, host_acceptance_rate: string, host_is_superhost: string, host_thumbnail_url: string, host_picture_url: string, host_neighbourhood: string, host_listings_count: string, host_total_listings_count: string, host_verifications: string, host_has_profile_pic: string, host_identity_verified: string, neighbourhood: string, neighbourhood_cleansed: string, neighbourhood_group_cleansed: string, latitude: string, longitude: string, property_type: string, room_type: string, accommodates: string, bathrooms: string, bathrooms_text: string, bedrooms: string, beds: string, amenities: string, price: string, minimum_nights: string, maximum_nights: string, minimum_mini

In [21]:
listings_df = listings_df.select(['id','host_id','name','description','bedrooms','host_neighbourhood','neighbourhood_group_cleansed'])
listings_df

DataFrame[id: string, host_id: string, name: string, description: string, bedrooms: string, host_neighbourhood: string, neighbourhood_group_cleansed: string]

In [22]:
listings_df.take(5)

[Row(id='109', host_id='521', name='Amazing bright elegant condo park front *UPGRADED*', description='*** Unit upgraded with new bamboo flooring, brand new Ultra HD 50" Sony TV, new paint, new lighting, new mattresses, ultra fast cable Internet connection, Apple TV, Google Chromecast. ***<br /><br />Gorgeous and Elegant Furnished Condo in front of Culver City Fox Hills Park. <br />Upper corner unit, total silence protected by trees.<br />Short walk to the new Westfield Mall.<br />Tennis courts, heated pool and jacuzzi hot tub.<br /><br /><b>The space</b><br />*** Unit upgraded with new bamboo flooring, brand new Ultra HD 50" Sony TV, new paint, new lighting, new mattresses, ultra fast cable Internet connection. ***<br /><br />Gorgeous and Elegant Furnished Apartment in front of Culver City Fox Hills Park. <br />Upper corner unit, total silence protected by trees.<br />Short walk to the new Westfield Mall.<br />Tennis courts, heated pool and jacuzzi hot tub.<br /><br />*** Upgraded with

In [23]:
listings_df = listings_df.withColumn("id", listings_df["id"].cast("long"))
listings_df = listings_df.withColumn("host_id", listings_df["host_id"].cast("long"))
listings_df = listings_df.withColumn("bedrooms", listings_df["bedrooms"].cast("int"))

listings_df

DataFrame[id: bigint, host_id: bigint, name: string, description: string, bedrooms: int, host_neighbourhood: string, neighbourhood_group_cleansed: string]

In [24]:
for row in session.execute('DROP TABLE IF EXISTS airbnb.listings;'):
    print(row)
for row in session.execute("""
CREATE TABLE airbnb.listings 
( id bigint, 
  host_id bigint,
  name text,
  description text,
  bedrooms int,
  host_neighbourhood text,
  neighbourhood_group_cleansed text,
  PRIMARY KEY (host_id, id)
);
"""):
    print(row)

In [28]:
listings_df.write.format("org.apache.spark.sql.cassandra").mode('append').options(table='listings',keyspace='airbnb').save()

In [29]:
for row in session.execute("SELECT host_neighbourhood FROM airbnb.listings LIMIT 50;"):
    print(row)

Row(host_neighbourhood='Manhattan Beach')
Row(host_neighbourhood='South LA')
Row(host_neighbourhood='Venice')
Row(host_neighbourhood='Pico')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood=None)
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='South LA')
Row(host_neighbourhood='Signal Hill')
Row(host_neighbourhood='Signal Hill')
Row(host_neighbourhood='Silver Lake')
Row(host_neighbourhood='Westchester/Playa Del Rey')
Row(host_neighbourhood='Whittier')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood='Mid-Wilshire')
Row(host_neighbourhood=None)
Row(host_neighbourhood='West Hollywood')
Row(host_neighbourhood='Topanga')
Row(host_neighbourhood='North Campus')
Row(host_neighbourhood

### Spark Pushdown Filters

Spark tries to push filters down to the database so that only the rows we need are returned, instead of loading the whole table and filtering it afterwards.

In [30]:
df2 = spark.read.format("org.apache.spark.sql.cassandra").options(table='calendar',keyspace='airbnb').load()
df_with_pushdown = df2.filter(df2["price"] > 50)
df_with_pushdown.take(5)

[Row(date=datetime.datetime(2022, 7, 20, 0, 0), price=51.0, listing_id=2007610, adjusted_price=51.0, available='t', maximum_nights=1125, minimum_nights=5),
 Row(date=datetime.datetime(2022, 7, 20, 0, 0), price=51.0, listing_id=3782877, adjusted_price=51.0, available='t', maximum_nights=1125, minimum_nights=2),
 Row(date=datetime.datetime(2022, 7, 20, 0, 0), price=51.0, listing_id=4129717, adjusted_price=51.0, available='t', maximum_nights=1125, minimum_nights=1),
 Row(date=datetime.datetime(2022, 7, 20, 0, 0), price=51.0, listing_id=5070776, adjusted_price=51.0, available='t', maximum_nights=1125, minimum_nights=9),
 Row(date=datetime.datetime(2022, 7, 20, 0, 0), price=51.0, listing_id=7131026, adjusted_price=51.0, available='f', maximum_nights=62, minimum_nights=30)]
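The idea behind a pushdown filter can be illustrated with a toy data source in plain Python. Without pushdown, every row leaves the source and is filtered by the client. With pushdown, the predicate is evaluated at the source and only matching rows are transferred. The `scan` function and the sample table are hypothetical, and this sketch says nothing about which predicates a particular connector is actually able to push down.

```python
def scan(table, predicate=None):
    """Toy data source: return rows, optionally filtering at the source."""
    return [row for row in table if predicate is None or predicate(row)]


# Illustrative table with made-up listing ids.
table = [
    {"listing_id": i, "price": float(p)}
    for i, p in enumerate([10, 51, 83, 19, 100])
]

# Without pushdown: all rows are transferred, then filtered on the client.
transferred = scan(table)
client_filtered = [row for row in transferred if row["price"] > 50]

# With pushdown: the source applies the predicate before returning rows.
pushed = scan(table, lambda row: row["price"] > 50)

print(len(transferred), "rows transferred without pushdown")
print(len(pushed), "rows transferred with pushdown")
```

Both approaches return the same rows; the difference is how much data has to leave the database. In Spark, calling `explain()` on the filtered DataFrame prints the physical plan, which is one way to check how a filter was handled.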

### neighbourhoods.geojson

In [31]:
!jq --compact-output ".features" {home}/csc-369-student/data/airbnb/LA/02_November_2021/neighbourhoods.geojson > {home}/output.geojson
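The `jq` command above pulls the `features` array out of the GeoJSON file and writes it compactly, which is the shape `mongoimport --jsonArray` expects. A rough Python equivalent, run here on a small in-memory document rather than the real file, might look like the sketch below. The two features and their property values are made up for illustration.

```python
import json

# A tiny stand-in for neighbourhoods.geojson (contents are illustrative).
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"neighbourhood": "Example A"},
         "geometry": {"type": "MultiPolygon", "coordinates": []}},
        {"type": "Feature",
         "properties": {"neighbourhood": "Example B"},
         "geometry": {"type": "MultiPolygon", "coordinates": []}},
    ],
})

# Equivalent of jq ".features": keep only the array of features.
features = json.loads(geojson_text)["features"]

# Equivalent of --compact-output: no extra whitespace between tokens.
compact = json.dumps(features, separators=(",", ":"))
print(compact)
```

Each element of the resulting array becomes one MongoDB document when imported with `--jsonArray`.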

In [34]:
!mongoimport --db "csc-369" -c neighbourhoods_geo --file {home}/output.geojson --jsonArray

2021-12-03T08:14:59.905-0800	connected to: localhost
2021-12-03T08:14:59.965-0800	imported 270 documents


In [35]:
from pymongo import MongoClient
client = MongoClient()

db = client["csc-369"]

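# Named "collection" rather than "col" to avoid confusion with pyspark.sql.functions.col
collection = db["neighbourhoods_geo"]

In [36]:
for record in collection.find().limit(5):
    print(record)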

{'_id': ObjectId('61a2be5424cd91bbb3894ff0'), 'type': 'Feature', 'geometry': {'type': 'MultiPolygon', 'coordinates': [[[[-118.207034, 34.539023], [-118.189414, 34.538559], [-118.189506, 34.534963], [-118.185162, 34.534773], [-118.185164, 34.531246], [-118.176015, 34.531354], [-118.176189, 34.523803], [-118.167025, 34.523512], [-118.16294, 34.523716], [-118.162988, 34.527586], [-118.154267, 34.527789], [-118.154027, 34.52732], [-118.153655, 34.527429], [-118.150635, 34.52459], [-118.150644, 34.524313], [-118.150334, 34.524307], [-118.148505, 34.522586], [-118.148506, 34.521995], [-118.147866, 34.52197], [-118.142992, 34.517388], [-118.142992, 34.516884], [-118.142442, 34.516859], [-118.13288, 34.507909], [-118.131474, 34.507935], [-118.123438, 34.500307], [-118.122797, 34.498797], [-118.122389, 34.498701], [-118.122692, 34.498368], [-118.122468, 34.497536], [-118.122742, 34.49722], [-118.122089, 34.496825], [-118.122098, 34.496531], [-118.120985, 34.49508], [-118.07974, 34.495369], [-11

In [37]:
df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()

In [38]:
df.take(5)

[Row(_id=Row(oid='61a2be5424cd91bbb3894ff0'), geometry=Row(type='MultiPolygon', coordinates=[[[[-118.207034, 34.539023], [-118.189414, 34.538559], [-118.189506, 34.534963], [-118.185162, 34.534773], [-118.185164, 34.531246], [-118.176015, 34.531354], [-118.176189, 34.523803], [-118.167025, 34.523512], [-118.16294, 34.523716], [-118.162988, 34.527586], [-118.154267, 34.527789], [-118.154027, 34.52732], [-118.153655, 34.527429], [-118.150635, 34.52459], [-118.150644, 34.524313], [-118.150334, 34.524307], [-118.148505, 34.522586], [-118.148506, 34.521995], [-118.147866, 34.52197], [-118.142992, 34.517388], [-118.142992, 34.516884], [-118.142442, 34.516859], [-118.13288, 34.507909], [-118.131474, 34.507935], [-118.123438, 34.500307], [-118.122797, 34.498797], [-118.122389, 34.498701], [-118.122692, 34.498368], [-118.122468, 34.497536], [-118.122742, 34.49722], [-118.122089, 34.496825], [-118.122098, 34.496531], [-118.120985, 34.49508], [-118.07974, 34.495369], [-118.07963, 34.473521], [-11

### Viewing prices over time for each neighborhood

In [39]:
df2 = spark.read.format("org.apache.spark.sql.cassandra").options(table='calendar',keyspace='airbnb').load()
df2_with_pushdown = df2.filter(df2["price"] > 50)
df2_with_pushdown.head()

Row(date=datetime.datetime(2022, 10, 11, 0, 0), price=51.0, listing_id=5728, adjusted_price=51.0, available='f', maximum_nights=1125, minimum_nights=30)

In [40]:
df3 = spark.read.format("org.apache.spark.sql.cassandra").options(table='listings',keyspace='airbnb').load()
df3_with_pushdown = df3.filter("host_neighbourhood is not null")
df3_with_pushdown.head()

Row(host_id=52276423, id=10179859, bedrooms=1, description='Beautiful and luxury high rise in the heart of West Los Angeles. Condo with partial ocean views, Olympic pool, tennis and gym. Only 10min away from Santa Monica beach and 5min from UCLA. Many shops and restaurants within walk-in distance.<br /><br /><b>Other things to note</b><br />Home theatre system with 65" Samsung curved Ultra High Definition smart TV. Netflix and premium channels.', host_neighbourhood='West Los Angeles', name='High Rise ocean view in West LA', neighbourhood_group_cleansed='City of Los Angeles')

In [41]:
df4 = df2_with_pushdown.join(df3_with_pushdown,df2_with_pushdown["listing_id"] ==  df3_with_pushdown["id"],"inner")
df4.take(5)

[Row(date=datetime.datetime(2022, 7, 20, 0, 0), price=66.0, listing_id=1375509, adjusted_price=66.0, available='f', maximum_nights=1125, minimum_nights=30, host_id=6602626, id=1375509, bedrooms=1, description="<b>The space</b><br />A private and cozy home. Queen size bed with own PRIVATE BATHROOM, Clean linens and towels as well as complimentary soap and shampoo. Free WiFi and street parking. You are welcome to leave your luggage in case you arrive early or leave late. <br /><br />From LAX, you can take the Flyaway to the Union Station for $9. From Union Station take the gold line train to Pico Aliso Station and the house is 3 minutes away from the station.  Less than a mile from the 5 freeway, 10 freeway, 60 freeway, 101 freeway, and the 710 freeway.<br /><br />A mile away to Starbucks, Yogurtland, Wurstkuche, Pie Hole, Zip Fusion Sushi, Nola's New Orleans, District Korean BBQ, Woori Korean Market, Urth Cafe, Daikokuya Ramen House, Sushi-Gen, Mikawaya Mochi Ice Cream Store, Phillipe's

In [42]:
df4.groupBy("host_neighbourhood").count().show()

+--------------------+------+
|  host_neighbourhood| count|
+--------------------+------+
|      West Vancouver|   365|
|           Mar Vista|104120|
|     Rose Park South|  2190|
|North Alamitos Beach|  7289|
|           HollyGlen|   730|
|    Westover Estates|   365|
|             Truckee|   730|
|       Magnolia Park|   730|
|          Sagepointe|   730|
| Calabasas Highlands|   365|
|          West Hills| 15069|
| West Loop/Greektown|   973|
|        West Gateway|  1095|
|             Ipanema|   365|
|           Hollywood|381820|
|       Glassell Park| 14235|
|         Malaga Cove|  2873|
|          Menlo Park|   365|
|    Pico - Robertson| 22266|
|              Harbor|  2190|
+--------------------+------+
only showing top 20 rows



In [43]:
df4.groupBy("host_neighbourhood").avg("adjusted_price").show()

+--------------------+-------------------+
|  host_neighbourhood|avg(adjusted_price)|
+--------------------+-------------------+
|      West Vancouver|  302.8493150684931|
|           Mar Vista| 201.91054552439493|
|     Rose Park South| 205.74474885844748|
|North Alamitos Beach|  175.0854712580601|
|           HollyGlen|              174.5|
|    Westover Estates|  120.6986301369863|
|             Truckee|              295.0|
|       Magnolia Park|  92.67260273972603|
|          Sagepointe|  200.7972602739726|
| Calabasas Highlands|              179.0|
|          West Hills| 182.89521534275664|
| West Loop/Greektown|   684.726618705036|
|        West Gateway| 156.12328767123287|
|             Ipanema|  221.4054794520548|
|           Hollywood| 150.64238384579122|
|       Glassell Park| 133.84931506849315|
|         Malaga Cove|  354.7977723633832|
|          Menlo Park|              395.0|
|    Pico - Robertson|  179.5206143896524|
|              Harbor|  154.6200913242009|
+----------

In [44]:
df5 = df4.groupBy("host_neighbourhood").count()
df6 = df4.groupBy("host_neighbourhood").avg("adjusted_price")
df5.join(df6,df5['host_neighbourhood'] == df6['host_neighbourhood']).show()

+--------------------+------+--------------------+-------------------+
|  host_neighbourhood| count|  host_neighbourhood|avg(adjusted_price)|
+--------------------+------+--------------------+-------------------+
|      West Vancouver|   365|      West Vancouver|  302.8493150684931|
|           HollyGlen|   730|           HollyGlen|              174.5|
|           Mar Vista|104120|           Mar Vista| 201.91054552439493|
|North Alamitos Beach|  7289|North Alamitos Beach|  175.0854712580601|
|     Rose Park South|  2190|     Rose Park South| 205.74474885844748|
|             Truckee|   730|             Truckee|              295.0|
|    Westover Estates|   365|    Westover Estates|  120.6986301369863|
| Calabasas Highlands|   365| Calabasas Highlands|              179.0|
|       Magnolia Park|   730|       Magnolia Park|  92.67260273972603|
|          Sagepointe|   730|          Sagepointe|  200.7972602739726|
|             Ipanema|   365|             Ipanema|  221.4054794520548|
|     