# EDA project for Stakeholder (Buyer)

### Dataset
**House Sales in King County, USA**

### About Dataset

This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015.

*source: https://www.kaggle.com/datasets/harlfoxem/housesalesprediction?datasetId=128&sortBy=commentCount&searchQuery=eda+map*

### Aim of this project
Help the buyer find one house in the city center and one house in the countryside.

> **Requirement-1 (City center)**

House | Criteria
--------|-------
Location | centrally located
Rooms | 3-4
Bathrooms | min. 2
Renovation | not more than 1 year
Condition | good -> 4 [source: https://info.kingcounty.gov/assessor/esales/Glossary.aspx?type=r#b]<br> - No obvious maintenance required<br> - Not everything is new
Grade | custom design and higher quality finish work (11 and above) [source: https://info.kingcounty.gov/assessor/esales/Glossary.aspx?type=r#b]
Build year | Built after 2010
Availability | ASAP
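
The criteria in the table can be read as a single boolean mask. The sketch below is one possible translation, not the filter used later in this notebook. The rows are made-up illustrative values, `houses` and `mask` are hypothetical names, and location and availability are left out because no single column encodes them.

```python
import pandas as pd

# Illustrative rows only; the column names follow the dataset.
houses = pd.DataFrame({
    "bedrooms":  [4, 3, 4, 2],
    "bathrooms": [4.25, 3.5, 1.75, 2.0],
    "condition": [3, 4, 4, 4],
    "grade":     [11, 11, 11, 12],
    "yr_built":  [2006, 2014, 2012, 2013],
})

# Assumption: "Rooms" in the table refers to the bedrooms column.
mask = (
    houses["bedrooms"].between(3, 4)   # 3-4 rooms
    & (houses["bathrooms"] >= 2)       # min. 2 bathrooms
    & (houses["condition"] == 4)       # condition "good"
    & (houses["grade"] >= 11)          # grade 11 and above
    & (houses["yr_built"] > 2010)      # built after 2010
)
matches = houses[mask]
print(matches)
```

Writing every criterion in one expression makes it easy to see which requirement removes which listing: relax one line at a time and compare the row counts.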

In [1]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

import folium  # interactive maps


## Getting to Know the Data
**Important:** get an overview of the data before starting any visualization.

1) Column name descriptions: https://github.com/neuefische/ds-eda-project-template/blob/main/column_names.md
2) Inspect the first and last rows of the dataframe with `head()` and `tail()`. `sample()` returns random rows.
3) Check the number of rows and columns with `shape`.
4) Check data types and missing values with `info()`. `dtypes` is another option for the data types.
5) Get summary statistics for numerical columns with `describe()`. Pass `include='all'` to cover categorical columns as well. `agg(['max', 'min', 'std'])` is another option.
6) `unique()` lists the unique values of a column. `nunique()` gives the number of unique values.
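
As a minimal sketch of the steps above, the following runs the same calls on a small stand-in dataframe. The name `toy` and the four rows are illustrative only (values borrowed from the first rows of the dataset); the real notebook applies these calls to `df`.

```python
import pandas as pd

# Toy stand-in for the housing data; four illustrative rows.
toy = pd.DataFrame({
    "price": [221900.0, 538000.0, 180000.0, 604000.0],
    "bedrooms": [3, 3, 2, 4],
    "waterfront": [None, 0.0, 0.0, 0.0],
    "zipcode": [98178, 98125, 98028, 98136],
})

print(toy.head(2))                 # first rows
n_rows, n_cols = toy.shape         # size of the table
missing_per_column = toy.isna().sum()   # missing values per column
summary = toy.describe().transpose()    # statistics for numeric columns
price_range = toy["price"].agg(["max", "min", "std"])
bedroom_values = sorted(toy["bedrooms"].unique().tolist())
distinct_bedrooms = toy["bedrooms"].nunique()

print(n_rows, n_cols)
print(missing_per_column)
print(summary)
```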

In [2]:
df = pd.read_csv('data/King_County_House_prices_dataset.csv')

In [3]:
# checking the first 5 rows

df.head()

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
0,7129300520,10/13/2014,221900.0,3,1.0,1180,5650,1.0,,0.0,...,7,1180,0.0,1955,0.0,98178,47.5112,-122.257,1340,5650
1,6414100192,12/9/2014,538000.0,3,2.25,2570,7242,2.0,0.0,0.0,...,7,2170,400.0,1951,1991.0,98125,47.721,-122.319,1690,7639
2,5631500400,2/25/2015,180000.0,2,1.0,770,10000,1.0,0.0,0.0,...,6,770,0.0,1933,,98028,47.7379,-122.233,2720,8062
3,2487200875,12/9/2014,604000.0,4,3.0,1960,5000,1.0,0.0,0.0,...,7,1050,910.0,1965,0.0,98136,47.5208,-122.393,1360,5000
4,1954400510,2/18/2015,510000.0,3,2.0,1680,8080,1.0,0.0,0.0,...,8,1680,0.0,1987,0.0,98074,47.6168,-122.045,1800,7503
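
The `date` column is shown as month/day/year text. A minimal sketch of parsing it, assuming every value follows the `%m/%d/%Y` pattern seen in the rows above; `sale_dates` is a hypothetical stand-in holding just those five values.

```python
import pandas as pd

# The five dates from the head() output above.
sale_dates = pd.Series(
    ["10/13/2014", "12/9/2014", "2/25/2015", "12/9/2014", "2/18/2015"]
)

# Assumption: all dates use month/day/year; a different pattern would raise.
parsed = pd.to_datetime(sale_dates, format="%m/%d/%Y")

first_sale, last_sale = parsed.min(), parsed.max()
sales_per_month = parsed.dt.to_period("M").value_counts().sort_index()

print(first_sale.date(), last_sale.date())
print(sales_per_month)
```

Once parsed, the same `min()` and `max()` calls on the full column are a quick way to check the sale period stated in the dataset description.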


In [4]:
# random rows

df.sample(5)

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
15664,3824100020,2/3/2015,335000.0,3,1.75,1510,9720,1.0,0.0,0.0,...,7,1510,0.0,1948,,98028,47.7728,-122.258,1520,10037
21313,263000009,1/29/2015,375000.0,3,2.5,1440,1102,3.0,0.0,0.0,...,8,1440,0.0,2009,,98103,47.6995,-122.346,1440,1434
8819,726059349,3/19/2015,460000.0,3,1.75,1970,9135,1.0,0.0,0.0,...,7,1370,600.0,1961,,98011,47.7603,-122.215,1880,9650
15489,2472920780,11/26/2014,395000.0,4,2.5,2250,6840,2.0,0.0,0.0,...,9,2250,0.0,1987,0.0,98058,47.4398,-122.151,2480,7386
19383,9541800065,6/9/2014,625000.0,3,1.75,2210,16200,1.0,0.0,0.0,...,8,1390,820.0,1958,0.0,98005,47.5924,-122.175,2050,16200


In [None]:
# checking for data type and missing values

df.info()

In [None]:
# statistical summary
# Method-1

df.describe().transpose()

# Method-2

#df.select_dtypes(include='number').agg(['count', 'mean', 'median', 'std', 'max', 'min']).transpose()

In [7]:
df.columns

Index(['id', 'date', 'price', 'bedrooms', 'bathrooms', 'sqft_living',
       'sqft_lot', 'floors', 'waterfront', 'view', 'condition', 'grade',
       'sqft_above', 'sqft_basement', 'yr_built', 'yr_renovated', 'zipcode',
       'lat', 'long', 'sqft_living15', 'sqft_lot15'],
      dtype='object')

In [10]:
# Rows where yr_renovated is missing (NaN); rows with 0.0 are not included
df_no_renovation = df[df['yr_renovated'].isnull()]
df_no_renovation.transpose()


Unnamed: 0,2,12,23,26,28,40,45,52,56,58,...,21551,21553,21556,21565,21575,21576,21577,21579,21581,21583
id,5631500400,114101516,8091400200,1794500383,5101402488,5547700270,8035350320,7518505990,9478500640,7922800400,...,9521100031,6021503705,6056111067,7853420110,4140940150,1931300412,8672200110,1972201967,191100405,7202300110
date,2/25/2015,5/28/2014,5/16/2014,6/26/2014,6/24/2014,7/15/2014,7/18/2014,12/31/2014,8/19/2014,8/27/2014,...,6/18/2014,10/15/2014,7/7/2014,5/4/2015,10/2/2014,4/16/2015,3/17/2015,10/31/2014,4/21/2015,9/15/2014
price,180000.0,310000.0,252700.0,937000.0,438000.0,625000.0,488000.0,600000.0,292500.0,951000.0,...,690000.0,329000.0,230000.0,625000.0,572000.0,475000.0,1090000.0,520000.0,1580000.0,810000.0
bedrooms,2,3,2,3,3,4,3,3,4,5,...,3,2,3,3,4,3,5,2,4,4
bathrooms,1.0,1.0,1.5,1.75,1.75,2.5,2.5,1.75,2.5,3.25,...,3.25,2.5,1.75,3.0,2.75,2.25,3.75,2.25,3.25,3.0
sqft_living,770,1430,1070,2450,1520,2570,3160,1410,2250,3250,...,1540,980,1140,2780,2770,1190,4170,1530,3410,3990
sqft_lot,10000,19901,9643,2691,6380,5520,13603,4080,4495,14342,...,1428,1020,1201,6000,3852,1200,8142,981,10125,7838
floors,1.0,1.5,1.0,2.0,1.0,2.0,2.0,1.0,2.0,2.0,...,3.0,3.0,2.0,2.0,2.0,3.0,2.0,3.0,2.0,2.0
waterfront,0.0,0.0,,0.0,0.0,,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
view,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.0,...,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0


In [11]:
df_no_renovation.yr_built.unique()

array([1933, 1927, 1985, 1915, 1948, 2000, 2003, 1950, 2008, 1968, 2005,
       1956, 1941, 1979, 2006, 1978, 1984, 1900, 1987, 1926, 2014, 1923,
       1901, 1961, 1977, 1966, 1942, 1947, 1976, 1911, 1982, 1989, 1960,
       1997, 1969, 1980, 1928, 1916, 1922, 2004, 1954, 2010, 2002, 1943,
       1994, 1940, 2001, 2007, 1988, 1953, 1918, 1999, 1952, 1902, 1925,
       1990, 1910, 1924, 1967, 1921, 1965, 1934, 1974, 1963, 1983, 1955,
       1998, 2009, 1993, 1975, 1939, 1995, 1920, 1991, 1949, 1962, 1945,
       1905, 1959, 1906, 1951, 1957, 1970, 1996, 1981, 1986, 1992, 1944,
       1919, 1903, 1972, 1913, 1958, 2015, 1907, 1971, 1930, 1964, 1909,
       1946, 1973, 1917, 2011, 2013, 1938, 1937, 1914, 2012, 1935, 1912,
       1936, 1908, 1932, 1929, 1931, 1904])

In [None]:
df_no_renovation.info()

In [111]:
df.yr_built.unique()

array([1955, 1951, 1933, 1965, 1987, 2001, 1995, 1963, 1960, 2003, 1942,
       1927, 1977, 1900, 1979, 1994, 1916, 1921, 1969, 1947, 1968, 1985,
       1941, 1915, 1909, 1948, 2005, 1929, 1981, 1930, 1904, 1996, 2000,
       1984, 2014, 1922, 1959, 1966, 1953, 1950, 2008, 1991, 1954, 1973,
       1925, 1989, 1972, 1986, 1956, 2002, 1992, 1964, 1952, 1961, 2006,
       1988, 1962, 1939, 1946, 1967, 1975, 1980, 1910, 1983, 1978, 1905,
       1971, 2010, 1945, 1924, 1990, 1914, 1926, 2004, 1923, 2007, 1976,
       1949, 1999, 1901, 1993, 1920, 1997, 1943, 1957, 1940, 1918, 1928,
       1974, 1911, 1936, 1937, 1982, 1908, 1931, 1998, 1913, 2013, 1907,
       1958, 2012, 1912, 2011, 1917, 1932, 1944, 1902, 2009, 1903, 1970,
       2015, 1934, 1938, 1919, 1906, 1935])

In [14]:
df_no_renovation.id.value_counts()

7972000010    2
3262300940    2
1922059278    2
4222310010    2
1954420170    2
             ..
4289900005    1
1604590190    1
8647600020    1
2405500050    1
7202300110    1
Name: id, Length: 3837, dtype: int64
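
Some ids appear twice in the counts above, which suggests the same house was sold more than once in the period. A sketch of how such repeat sales could be inspected or reduced to one row per house; the ids are taken from the output above, while the dates and prices in `sales` are made up for illustration.

```python
import pandas as pd

# Ids come from the value_counts() output; dates and prices are hypothetical.
sales = pd.DataFrame({
    "id": [7972000010, 3262300940, 7972000010, 4289900005],
    "date": ["6/2/2014", "7/15/2014", "3/10/2015", "9/1/2014"],
    "price": [300000.0, 450000.0, 340000.0, 520000.0],
})
sales["date"] = pd.to_datetime(sales["date"], format="%m/%d/%Y")

# All rows whose id occurs more than once
repeat_sales = sales[sales["id"].duplicated(keep=False)]

# One row per house: keep only the most recent sale
latest_sale = sales.sort_values("date").drop_duplicates(subset="id", keep="last")

print(repeat_sales)
print(latest_sale)
```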

In [183]:
df.yr_built.value_counts().head(10)

2014    559
2006    453
2005    450
2004    433
2003    420
2007    417
1977    417
1978    387
1968    381
2008    367
Name: yr_built, dtype: int64

In [110]:
df.yr_renovated.value_counts()

0.0       17011
2014.0       73
2013.0       31
2003.0       31
2007.0       30
          ...  
1951.0        1
1953.0        1
1946.0        1
1976.0        1
1948.0        1
Name: yr_renovated, Length: 70, dtype: int64
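
Most rows carry `0.0` in `yr_renovated`, and others are missing entirely. The sketch below assumes that both `0.0` and NaN mean "no renovation on record"; that reading is an assumption, not something the dataset description confirms. `renovated_year` is a hypothetical stand-in series.

```python
import pandas as pd

# Illustrative values: two zeros, one missing entry, two renovation years.
renovated_year = pd.Series([0.0, 1991.0, None, 0.0, 2014.0], name="yr_renovated")

# Assumption: 0.0 and NaN both mean the house was never renovated.
was_renovated = renovated_year.fillna(0).gt(0)
never_renovated = ~was_renovated

print(int(was_renovated.sum()), int(never_renovated.sum()))
```

A single boolean flag like this avoids treating the NaN rows and the `0.0` rows as two different groups by accident.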

In [112]:
# Keep the two most frequent build years from the counts above (2014 and 2006)
df_built = df.query('yr_built == 2006 | yr_built == 2014')

In [113]:
df_built.condition.value_counts()

3    1008
4       4
Name: condition, dtype: int64

In [None]:
df_built.grade.value_counts()

In [115]:
df_grade = df_built.query('grade == 11 | grade == 12')

In [None]:
df_grade

In [166]:
df_bedrooms = df_grade.query('bedrooms == 3 | bedrooms == 4')

In [167]:
df_bedrooms

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
153,7855801670,4/1/2015,2250000.0,4,3.25,5180,19850,2.0,0.0,3.0,...,12,3540,1640.0,2006,0.0,98006,47.562,-122.162,3160,9750
1122,7237501180,6/25/2014,1200000.0,4,1.75,3990,13470,2.0,0.0,0.0,...,11,3990,0.0,2006,0.0,98059,47.5305,-122.131,5790,13709
2405,3888100128,7/28/2014,968933.0,4,3.5,4120,7304,2.0,0.0,0.0,...,11,3070,1050.0,2006,0.0,98033,47.681,-122.167,2470,9600
3098,622069006,8/20/2014,1500000.0,4,5.5,6550,217374,1.0,0.0,0.0,...,11,5400,1150.0,2006,0.0,98058,47.4302,-122.095,4110,50378
6577,8562720420,4/30/2015,1350000.0,4,3.5,4740,8611,2.0,0.0,3.0,...,11,3640,1100.0,2006,,98027,47.5375,-122.07,4042,8321
9313,7768700300,12/5/2014,2580000.0,4,4.25,5540,15408,2.0,0.0,1.0,...,11,4280,1260.0,2006,0.0,98004,47.6071,-122.212,3570,14750
10527,1623069046,3/12/2015,1700000.0,4,3.5,4070,336283,2.0,0.0,0.0,...,11,4070,0.0,2006,0.0,98027,47.478,-122.038,3020,44613
10770,8562720390,8/25/2014,1050000.0,4,4.0,4320,8709,2.0,0.0,0.0,...,11,3190,1130.0,2006,0.0,98027,47.5369,-122.07,4010,8321
11247,3629921000,11/21/2014,950000.0,4,2.5,3700,7051,2.0,0.0,0.0,...,11,3700,0.0,2006,0.0,98029,47.5427,-121.995,3580,6175
11786,3629920990,6/23/2014,905000.0,4,3.25,3440,7661,2.0,0.0,0.0,...,11,3440,0.0,2006,0.0,98029,47.5429,-121.995,3580,6478


In [None]:
#df_bedrooms.sort_values(by='price', ascending=False)

In [329]:
# For houses close to city center

# Keep the four shortlisted listings, identified by their sale price
shortlisted_prices = [1870000, 2230000, 2350000, 2580000]
df_price_4 = df_bedrooms[df_bedrooms['price'].isin(shortlisted_prices)]
df_price_4

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
9313,7768700300,12/5/2014,2580000.0,4,4.25,5540,15408,2.0,0.0,1.0,...,11,4280,1260.0,2006,0.0,98004,47.6071,-122.212,3570,14750
21185,518500460,10/8/2014,2230000.0,3,3.5,3760,5634,2.0,1.0,4.0,...,11,2830,930.0,2014,0.0,98056,47.5285,-122.205,3560,5762
21294,2154970020,7/3/2014,2350000.0,4,4.25,5010,19412,2.0,0.0,1.0,...,11,4000,1010.0,2014,0.0,98040,47.5455,-122.211,3820,17064
21498,3262300818,2/27/2015,1870000.0,4,3.75,3790,8797,2.0,0.0,0.0,...,11,3290,500.0,2006,,98039,47.6351,-122.236,2660,12150
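
To put a number on "centrally located", the great-circle distance from each shortlisted house to a reference point can be computed with the haversine formula. This is a sketch, not part of the original selection: the reference coordinates for downtown Seattle (`SEATTLE_CENTER`) and the function name `haversine_km` are assumptions introduced here, while the house coordinates are the `lat`/`long` values from the table above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
SEATTLE_CENTER = (47.6062, -122.3321)  # assumed reference point for "city center"


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Row index -> (lat, long) of the four shortlisted houses
candidates = {
    9313: (47.6071, -122.212),
    21185: (47.5285, -122.205),
    21294: (47.5455, -122.211),
    21498: (47.6351, -122.236),
}

distances = {
    idx: haversine_km(*SEATTLE_CENTER, lat, lon)
    for idx, (lat, lon) in candidates.items()
}

for idx, km in sorted(distances.items(), key=lambda item: item[1]):
    print(idx, round(km, 1), "km")
```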


### Map location

In [340]:
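# Fixed-size container for the map
s = folium.Figure(width=800, height=500)

# Center the map on the mean coordinates of the shortlisted houses
map1 = folium.Map(location=[df_price_4.lat.mean(), df_price_4.long.mean()], zoom_start=11, control_scale=True).add_to(s)

# One marker per house; the popup shows the sale price
for index, location_info in df_price_4.iterrows():
    folium.Marker([location_info["lat"], location_info["long"]], popup=location_info["price"]).add_to(map1)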

In [341]:
map1

In [345]:
map1.save('city_map.html')

> **Requirement-2 (Countryside)**

House | Criteria
--------|-------
Location | outskirts
Rooms | 4
Bathrooms | min. 3
Renovation | no
Condition | average -> 3 [source: https://info.kingcounty.gov/assessor/esales/Glossary.aspx?type=r#b]<br> - Maintenance required<br> - Renovation required
Grade | better architectural design -> 9 [source: https://info.kingcounty.gov/assessor/esales/Glossary.aspx?type=r#b]
Build year | Built in 2010
Availability | can wait for good option

In [None]:
df.head()

In [197]:
df_grade1 = df.query('grade == 9')
df_grade1

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
15,9297300055,1/24/2015,650000.0,4,3.00,2950,5000,2.0,0.0,3.0,...,9,1980,970.0,1979,0.0,98126,47.5714,-122.375,2140,4000
21,2524049179,8/26/2014,2000000.0,3,2.75,3050,44867,1.0,0.0,4.0,...,9,2330,720.0,1968,0.0,98040,47.5316,-122.233,4110,20336
40,5547700270,7/15/2014,625000.0,4,2.50,2570,5520,2.0,,0.0,...,9,2570,0.0,2000,,98074,47.6145,-122.027,2470,5669
42,7203220400,7/7/2014,861990.0,5,2.75,3595,5639,2.0,0.0,0.0,...,9,3595,?,2014,0.0,98053,47.6848,-122.016,3625,5639
47,4178300310,7/16/2014,785000.0,4,2.50,2290,13416,2.0,0.0,0.0,...,9,2290,0.0,1981,0.0,98007,47.6194,-122.151,2680,13685
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
21580,7502800100,8/13/2014,679950.0,5,2.75,3600,9437,2.0,0.0,0.0,...,9,3600,0.0,2014,0.0,98059,47.4822,-122.131,3550,9421
21582,8956200760,10/13/2014,541800.0,4,2.50,3118,7866,2.0,,2.0,...,9,3118,0.0,2014,0.0,98001,47.2931,-122.264,2673,6500
21583,7202300110,9/15/2014,810000.0,4,3.00,3990,7838,2.0,0.0,0.0,...,9,3990,0.0,2003,,98053,47.6857,-122.046,3370,6814
21589,3448900210,10/14/2014,610685.0,4,2.50,2520,6023,2.0,0.0,,...,9,2520,0.0,2014,0.0,98056,47.5137,-122.167,2520,6023
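
Row 42 above shows `?` in `sqft_basement`, so that column cannot be purely numeric as loaded. A minimal sketch of one way to handle this, assuming `?` marks an unknown basement size: convert the column with `pd.to_numeric` and let unparseable entries become NaN. The series `basement` holds illustrative values only.

```python
import pandas as pd

# Illustrative values, including the "?" placeholder seen in the output above.
basement = pd.Series(["0.0", "400.0", "?", "910.0"], name="sqft_basement")

# errors="coerce" turns anything that is not a number into NaN
basement_sqft = pd.to_numeric(basement, errors="coerce")
n_unknown = int(basement_sqft.isna().sum())

print(basement_sqft)
print(n_unknown)
```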


In [201]:
df_condition = df_grade1.query('condition == 3')
df_condition

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
15,9297300055,1/24/2015,650000.0,4,3.00,2950,5000,2.0,0.0,3.0,...,9,1980,970.0,1979,0.0,98126,47.5714,-122.375,2140,4000
21,2524049179,8/26/2014,2000000.0,3,2.75,3050,44867,1.0,0.0,4.0,...,9,2330,720.0,1968,0.0,98040,47.5316,-122.233,4110,20336
40,5547700270,7/15/2014,625000.0,4,2.50,2570,5520,2.0,,0.0,...,9,2570,0.0,2000,,98074,47.6145,-122.027,2470,5669
42,7203220400,7/7/2014,861990.0,5,2.75,3595,5639,2.0,0.0,0.0,...,9,3595,?,2014,0.0,98053,47.6848,-122.016,3625,5639
55,9822700295,5/12/2014,885000.0,4,2.50,2830,5000,2.0,,0.0,...,9,2830,0.0,1995,0.0,98105,47.6597,-122.290,1950,5000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
21580,7502800100,8/13/2014,679950.0,5,2.75,3600,9437,2.0,0.0,0.0,...,9,3600,0.0,2014,0.0,98059,47.4822,-122.131,3550,9421
21582,8956200760,10/13/2014,541800.0,4,2.50,3118,7866,2.0,,2.0,...,9,3118,0.0,2014,0.0,98001,47.2931,-122.264,2673,6500
21583,7202300110,9/15/2014,810000.0,4,3.00,3990,7838,2.0,0.0,0.0,...,9,3990,0.0,2003,,98053,47.6857,-122.046,3370,6814
21589,3448900210,10/14/2014,610685.0,4,2.50,2520,6023,2.0,0.0,,...,9,2520,0.0,2014,0.0,98056,47.5137,-122.167,2520,6023


In [203]:
df_built1 = df_condition.query('yr_built == 2010')
df_built1

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
341,1115300070,11/6/2014,684000.0,4,3.5,3040,8414,2.0,0.0,0.0,...,9,2420,620.0,2010,,98059,47.5222,-122.157,3470,8066
1875,1853081000,7/17/2014,820000.0,5,2.75,2830,6137,2.0,0.0,0.0,...,9,2830,0.0,2010,0.0,98074,47.5932,-122.058,3170,6285
5670,7933250050,10/28/2014,1420000.0,5,3.25,4020,4500,2.0,0.0,0.0,...,9,3120,900.0,2010,0.0,98004,47.6349,-122.204,3550,5775
9419,1853081250,12/29/2014,800000.0,4,2.75,3120,5000,2.0,0.0,0.0,...,9,3120,0.0,2010,0.0,98074,47.594,-122.062,3200,5000
9489,7203100730,2/10/2015,875000.0,4,3.5,3790,6874,2.5,0.0,0.0,...,9,3790,0.0,2010,0.0,98053,47.6956,-122.022,3370,6535
10051,3303860160,2/24/2015,430000.0,3,2.5,2670,12806,2.0,0.0,0.0,...,9,2670,0.0,2010,,98038,47.3686,-122.059,3010,7231
12522,6003001999,2/9/2015,530000.0,2,1.75,1170,976,2.0,0.0,0.0,...,9,780,390.0,2010,0.0,98102,47.6192,-122.316,1280,1183
12597,269000970,4/2/2015,1300000.0,5,3.75,4450,7680,2.0,0.0,0.0,...,9,3460,990.0,2010,0.0,98199,47.6418,-122.392,2550,6400
12892,5100403882,4/27/2015,967000.0,4,2.5,3100,7250,2.0,0.0,0.0,...,9,3100,0.0,2010,0.0,98115,47.6961,-122.316,1240,6670
19476,7203100850,4/27/2015,840000.0,4,3.25,3500,5960,2.0,0.0,0.0,...,9,3500,0.0,2010,,98053,47.6944,-122.022,3390,6856


In [205]:
df_bedroom1 = df_built1.query('bedrooms == 4')
df_bedroom1.transpose()

Unnamed: 0,341,9419,9489,12892,19476,19841,20189,20264,20738,20775,21081,21178
id,1115300070,1853081250,7203100730,5100403882,7203100850,662440030,7203120050,7203120020,323079065,6791900260,7203100660,3304040020
date,11/6/2014,12/29/2014,2/10/2015,4/27/2015,4/27/2015,3/26/2015,10/8/2014,8/14/2014,6/24/2014,7/8/2014,11/17/2014,12/26/2014
price,684000.0,800000.0,875000.0,967000.0,840000.0,435000.0,789500.0,785000.0,790000.0,760005.0,780000.0,375500.0
bedrooms,4,4,4,4,4,4,4,4,4,4,4,4
bathrooms,3.5,2.75,3.5,2.5,3.25,2.5,3.25,3.5,3.5,2.75,2.75,2.5
sqft_living,3040,3120,3790,3100,3500,3100,3240,3310,3190,3090,3420,2301
sqft_lot,8414,5000,6874,7250,5960,4699,4852,4850,31450,5859,6787,6452
floors,2.0,2.0,2.5,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0,2.0
waterfront,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
view,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Map location

In [344]:
# Exclude nine listings by sale price, leaving the final shortlist
excluded_prices = [375500, 435000, 684000, 760005, 785000, 789500, 840000, 875000, 967000]
df_price9 = df_bedroom1[~df_bedroom1['price'].isin(excluded_prices)]
df_price9

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,...,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
9419,1853081250,12/29/2014,800000.0,4,2.75,3120,5000,2.0,0.0,0.0,...,9,3120,0.0,2010,0.0,98074,47.594,-122.062,3200,5000
20738,323079065,6/24/2014,790000.0,4,3.5,3190,31450,2.0,0.0,0.0,...,9,3190,0.0,2010,0.0,98027,47.501,-121.902,3000,72745
21081,7203100660,11/17/2014,780000.0,4,2.75,3420,6787,2.0,0.0,0.0,...,9,3420,0.0,2010,0.0,98053,47.6962,-122.023,3450,6137


In [342]:
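# Fixed-size container for the map
f = folium.Figure(width=800, height=500)

# Center the map on the mean coordinates of the shortlisted houses
map3 = folium.Map(location=[df_price9.lat.mean(), df_price9.long.mean()], zoom_start=11, control_scale=True).add_to(f)

# One marker per house; the popup shows the sale price
for index, location_info in df_price9.iterrows():
    folium.Marker([location_info["lat"], location_info["long"]], popup=location_info["price"]).add_to(map3)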

In [343]:
map3

In [297]:
map3.save('urban_map.html')