# Housing Rental Analysis for San Francisco

In this challenge, your job is to use your data visualisation skills, including aggregation, interactive visualisations, and geospatial analysis, to find properties in the San Francisco market that are viable investment opportunities.

## Instructions

Use the `san_francisco_housing.ipynb` notebook to visualise and analyse the real-estate data.

Note that this assignment requires you to create a visualisation by using hvPlot and GeoViews. Additionally, you need to read the `sfo_neighborhoods_census_data.csv` file from the `Resources` folder into the notebook and create the DataFrame that you’ll use in the analysis.

The main task in this Challenge is to visualise and analyse the real-estate data in your Jupyter notebook. Use the `san_francisco_housing.ipynb` notebook to complete the following tasks:

* Calculate and plot the housing units per year.

* Calculate and plot the average prices per square foot.

* Compare the average prices by neighborhood.

* Build an interactive neighborhood map.

* Compose your data story.

### Calculate and Plot the Housing Units per Year

For this part of the assignment, use numerical and visual aggregation to calculate the number of housing units per year, and then visualise the results as a bar chart. To do so, complete the following steps:

1. Use the `groupby` function to group the data by year. Aggregate the results by the `mean` of the groups.

2. Use the `hvplot` function to plot the `housing_units_by_year` DataFrame as a bar chart. Make the x-axis represent the `year` and the y-axis represent the `housing_units`.

3. Style and format the line plot to ensure a professionally styled visualisation.

4. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting bar chart.](Images/zoomed-housing-units-by-year.png)

5. Answer the following question:

    * What’s the overall trend in housing units over the period that you’re analysing?

### Calculate and Plot the Average Sale Prices per Square Foot

For this part of the assignment, use numerical and visual aggregation to calculate the average prices per square foot, and then visualise the results as a bar chart. To do so, complete the following steps:

1. Group the data by year, and then average the results. What’s the lowest gross rent that’s reported for the years that the DataFrame includes?

2. Create a new DataFrame named `prices_square_foot_by_year` by filtering out the “housing_units” column. The new DataFrame should include the averages per year for only the sale price per square foot and the gross rent.

3. Use hvPlot to plot the `prices_square_foot_by_year` DataFrame as a line plot.

    > **Hint** This single plot will include lines for both `sale_price_sqr_foot` and `gross_rent`.

4. Style and format the line plot to ensure a professionally styled visualisation.

5. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting plot.](Images/avg-sale-px-sq-foot-gross-rent.png)

6. Use both the `prices_square_foot_by_year` DataFrame and interactive plots to answer the following questions:

    * Did any year experience a drop in the average sale price per square foot compared to the previous year?

    * If so, did the gross rent increase or decrease during that year?

### Compare the Average Sale Prices by Neighborhood

For this part of the assignment, use interactive visualisations and widgets to explore the average sale price per square foot by neighborhood. To do so, complete the following steps:

1. Create a new DataFrame that groups the original DataFrame by year and neighborhood. Aggregate the results by the `mean` of the groups.

2. Filter out the “housing_units” column to create a DataFrame that includes only the `sale_price_sqr_foot` and `gross_rent` averages per year.

3. Create an interactive line plot with hvPlot that visualises both `sale_price_sqr_foot` and `gross_rent`. Set the x-axis parameter to the year (`x="year"`). Use the `groupby` parameter to create an interactive widget for `neighborhood`.

4. Style and format the line plot to ensure a professionally styled visualisation.

5. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting plot.](Images/pricing-info-by-neighborhood.png)

6. Use the interactive visualisation to answer the following question:

    * For the Anza Vista neighborhood, is the average sale price per square foot for 2016 more or less than the price that’s listed for 2012? 

### Build an Interactive Neighborhood Map

For this part of the assignment, explore the geospatial relationships in the data by using interactive visualisations with hvPlot and GeoViews. To build your map, use the `sfo_data_df` DataFrame (created during the initial import), which includes the neighborhood location data with the average prices. To do all this, complete the following steps:

1. Read the `neighborhood_coordinates.csv` file from the `Resources` folder into the notebook, and create a DataFrame named `neighborhood_locations_df`. Be sure to set the `index_col` of the DataFrame as “Neighborhood”.

2. Using the original `sfo_data_df` Dataframe, create a DataFrame named `all_neighborhood_info_df` that groups the data by neighborhood. Aggregate the results by the `mean` of the group.

3. Review the two code cells that concatenate the `neighborhood_locations_df` DataFrame with the `all_neighborhood_info_df` DataFrame. Note that the first cell uses the [Pandas concat function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html) to create a DataFrame named `all_neighborhoods_df`. The second cell cleans the data and sets the “Neighborhood” column. Be sure to run these cells to create the `all_neighborhoods_df` DataFrame, which you’ll need to create the geospatial visualisation.

4. Using hvPlot with GeoViews enabled, create a `points` plot for the `all_neighborhoods_df` DataFrame. Be sure to do the following:

    * Set the `geo` parameter to True.
    * Set the `size` parameter to “sale_price_sqr_foot”.
    * Set the `color` parameter to “gross_rent”.
    * Set the `frame_width` parameter to 700.
    * Set the `frame_height` parameter to 500.
    * Include a descriptive title.

Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of a scatter plot created with hvPlot and GeoViews.](Images/6-4-geoviews-plot.png)

5. Use the interactive map to answer the following question:

    * Which neighborhood has the highest gross rent, and which has the highest sale price per square foot?

### Compose Your Data Story

Based on the visualisations that you created, answer the following questions:

* How does the trend in rental income growth compare to the trend in sales prices? Does this same trend hold true for all the neighborhoods across San Francisco?

* What insights can you share with your company about the potential one-click, buy-and-rent strategy that they're pursuing? Do neighborhoods exist that you would suggest for investment, and why?

In [1]:
# Import the required libraries and dependencies
import pandas as pd
import hvplot.pandas
from pathlib import Path
import panel as pn
import geoviews as gv

# Enable GeoViews backend
gv.extension('bokeh', 'matplotlib')

  "class": algorithms.Blowfish,


## Import the data 

In [2]:
# Using the read_csv function and Path module, create a DataFrame 
# by importing the sfo_neighborhoods_census_data.csv file from the Resources folder
sfo_data_df = pd.read_csv(Path("./Resources/sfo_neighborhoods_census_data.csv"))

# Review the first and last five rows of the DataFrame
# YOUR CODE HERE
sfo_data_df.head()
# YOUR CODE HERE
sfo_data_df.tail()

Unnamed: 0,year,neighborhood,sale_price_sqr_foot,housing_units,gross_rent
392,2016,Telegraph Hill,903.049771,384242,4390
393,2016,Twin Peaks,970.08547,384242,4390
394,2016,Van Ness/ Civic Center,552.602567,384242,4390
395,2016,Visitacion Valley,328.319007,384242,4390
396,2016,Westwood Park,631.195426,384242,4390


---

## Calculate and Plot the Housing Units per Year

For this part of the assignment, use numerical and visual aggregation to calculate the number of housing units per year, and then visualise the results as a bar chart. To do so, complete the following steps:

1. Use the `groupby` function to group the data by year. Aggregate the results by the `mean` of the groups.

2. Use the `hvplot` function to plot the `housing_units_by_year` DataFrame as a bar chart. Make the x-axis represent the `year` and the y-axis represent the `housing_units`.

3. Style and format the line plot to ensure a professionally styled visualisation.

4. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting bar chart.](Images/zoomed-housing-units-by-year.png)

5. Answer the following question:

    * What’s the overall trend in housing units over the period that you’re analysing?



### Step 1: Use the `groupby` function to group the data by year. Aggregate the results by the `mean` of the groups.

In [3]:
# Create a numerical aggregation that groups the data by the year and then averages the results.
housing_units_by_year = sfo_data_df.groupby('year')['housing_units'].mean().reset_index()

# Review the DataFrame
# YOUR CODE HERE
housing_units_by_year

Unnamed: 0,year,housing_units
0,2010,372560.0
1,2011,374507.0
2,2012,376454.0
3,2013,378401.0
4,2014,380348.0
5,2015,382295.0
6,2016,384242.0


### Step 2: Use the `hvplot` function to plot the `housing_units_by_year` DataFrame as a bar chart. Make the x-axis represent the `year` and the y-axis represent the `housing_units`.

### Step 3: Style and format the line plot to ensure a professionally styled visualisation.

In [4]:
# Create a visual aggregation explore the housing units by year
# YOUR CODE HERE
housing_units_by_year.hvplot.bar(
    x='year',
    y='housing_units',
    title='Housing Units in San Francisco per Year',
    xlabel='Year',
    ylabel='Housing Units',
    width=800,
    height=400,
    ylim=(sfo_data_df['housing_units'].min() - 1000, sfo_data_df['housing_units'].max() + 1000),
    color='#ff7f0e',  
    hover=True,
    line_color='black',
    line_width=1.5
)

### Step 5: Answer the following question:

**Question:** What is the overall trend in housing_units over the period being analysed?

**Answer:** Percentage change from the housing units over the period of 2010 to 2016 is approximately 3.14 percent. 


Calculated by: 


housing_units_2010 = housing_units_by_year[housing_units_by_year['year'] == 2010]['housing_units'].values[0]


housing_units_2016 = housing_units_by_year[housing_units_by_year['year'] == 2016]['housing_units'].values[0]


percentage_change = ((housing_units_2016 - housing_units_2010) / housing_units_2010) * 100


## Calculate and Plot the Average Sale Prices per Square Foot

For this part of the assignment, use numerical and visual aggregation to calculate the average prices per square foot, and then visualise the results as a bar chart. To do so, complete the following steps:

1. Group the data by year, and then average the results. What’s the lowest gross rent that’s reported for the years that the DataFrame includes?

2. Create a new DataFrame named `prices_square_foot_by_year` by filtering out the “housing_units” column. The new DataFrame should include the averages per year for only the sale price per square foot and the gross rent.

3. Use hvPlot to plot the `prices_square_foot_by_year` DataFrame as a line plot.

    > **Hint** This single plot will include lines for both `sale_price_sqr_foot` and `gross_rent`.

4. Style and format the line plot to ensure a professionally styled visualisation.

5. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting plot.](Images/avg-sale-px-sq-foot-gross-rent.png)

6. Use both the `prices_square_foot_by_year` DataFrame and interactive plots to answer the following questions:

    * Did any year experience a drop in the average sale price per square foot compared to the previous year?

    * If so, did the gross rent increase or decrease during that year?



### Step 1: Group the data by year, and then average the results.

In [5]:
# Create a numerical aggregation by grouping the data by year and averaging the results
prices_square_foot_groupby_year = sfo_data_df.groupby('year').agg({
    'sale_price_sqr_foot': 'mean',
    'gross_rent': 'min'
}).reset_index()

# Review the resulting DataFrame
# YOUR CODE HERE
prices_square_foot_groupby_year

Unnamed: 0,year,sale_price_sqr_foot,gross_rent
0,2010,369.344353,1239
1,2011,341.903429,1530
2,2012,399.389968,2324
3,2013,483.600304,2971
4,2014,556.277273,3528
5,2015,632.540352,3739
6,2016,697.643709,4390


**Question:** What is the lowest gross rent reported for the years included in the DataFrame?

**Answer:** # YOUR ANSWER HERE

### Step 2: Create a new DataFrame named `prices_square_foot_by_year` by filtering out the “housing_units” column. The new DataFrame should include the averages per year for only the sale price per square foot and the gross rent.

In [6]:
# Filter out the housing_units column, creating a new DataFrame 
# Keep only sale_price_sqr_foot and gross_rent averages per year
prices_square_foot_by_year = prices_square_foot_groupby_year[['year', 'sale_price_sqr_foot', 'gross_rent']]

# Review the DataFrame
# YOUR CODE HERE
prices_square_foot_by_year

Unnamed: 0,year,sale_price_sqr_foot,gross_rent
0,2010,369.344353,1239
1,2011,341.903429,1530
2,2012,399.389968,2324
3,2013,483.600304,2971
4,2014,556.277273,3528
5,2015,632.540352,3739
6,2016,697.643709,4390


### Step 3: Use hvPlot to plot the `prices_square_foot_by_year` DataFrame as a line plot.

> **Hint** This single plot will include lines for both `sale_price_sqr_foot` and `gross_rent`

### Step 4: Style and format the line plot to ensure a professionally styled visualisation.


In [7]:
# Plot prices_square_foot_by_year. 
# Inclued labels for the x- and y-axes, and a title.
# YOUR CODE HERE
plot = prices_square_foot_by_year.hvplot.line(
    x='year',
    y=['sale_price_sqr_foot', 'gross_rent'],
    xlabel='Year',
    ylabel='Average Price per Square Foot / Lowest Gross Rent',
    title='Average Sale Prices per Square Foot and Lowest Gross Rent by Year',
    legend=True,
    line_color=['blue', 'red'],
    line_width=2
)

# Show the plot
plot.opts(width=800, height=400)

### Step 6: Use both the `prices_square_foot_by_year` DataFrame and interactive plots to answer the following questions:

**Question:** Did any year experience a drop in the average sale price per square foot compared to the previous year?

**Answer:** # 2011

**Question:** If so, did the gross rent increase or decrease during that year?

**Answer:** # Yes it did not

---

## Compare the Average Sale Prices by Neighborhood

For this part of the assignment, use interactive visualisations and widgets to explore the average sale price per square foot by neighborhood. To do so, complete the following steps:

1. Create a new DataFrame that groups the original DataFrame by year and neighborhood. Aggregate the results by the `mean` of the groups.

2. Filter out the “housing_units” column to create a DataFrame that includes only the `sale_price_sqr_foot` and `gross_rent` averages per year.

3. Create an interactive line plot with hvPlot that visualises both `sale_price_sqr_foot` and `gross_rent`. Set the x-axis parameter to the year (`x="year"`). Use the `groupby` parameter to create an interactive widget for `neighborhood`.

4. Style and format the line plot to ensure a professionally styled visualisation.

5. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting plot.](Images/pricing-info-by-neighborhood.png)

6. Use the interactive visualisation to answer the following question:

    * For the Anza Vista neighborhood, is the average sale price per square foot for 2016 more or less than the price that’s listed for 2012? 


### Step 1: Create a new DataFrame that groups the original DataFrame by year and neighborhood. Aggregate the results by the `mean` of the groups.

In [8]:
# Group by year and neighborhood and then create a new dataframe of the mean values
prices_by_year_by_neighborhood = sfo_data_df.groupby(['year', 'neighborhood']).agg({
    'sale_price_sqr_foot': 'mean',
    'gross_rent': 'mean'
}).reset_index()

# Review the DataFrame
# YOUR CODE HERE
prices_by_year_by_neighborhood

Unnamed: 0,year,neighborhood,sale_price_sqr_foot,gross_rent
0,2010,Alamo Square,291.182945,1239.0
1,2010,Anza Vista,267.932583,1239.0
2,2010,Bayview,170.098665,1239.0
3,2010,Buena Vista Park,347.394919,1239.0
4,2010,Central Richmond,319.027623,1239.0
...,...,...,...,...
392,2016,Telegraph Hill,903.049771,4390.0
393,2016,Twin Peaks,970.085470,4390.0
394,2016,Van Ness/ Civic Center,552.602567,4390.0
395,2016,Visitacion Valley,328.319007,4390.0


### Step 2: Filter out the “housing_units” column to create a DataFrame that includes only the `sale_price_sqr_foot` and `gross_rent` averages per year.

In [9]:
# Filter out the housing_units
prices_by_year_by_neighborhood = prices_by_year_by_neighborhood[['year', 'neighborhood', 'sale_price_sqr_foot', 'gross_rent']]

# Review the first and last five rows of the DataFrame
# YOUR CODE HERE
first_five_rows = prices_by_year_by_neighborhood.head()
# YOUR CODE HERE
last_five_rows = prices_by_year_by_neighborhood.tail()

display(first_five_rows)
display(last_five_rows)

Unnamed: 0,year,neighborhood,sale_price_sqr_foot,gross_rent
0,2010,Alamo Square,291.182945,1239.0
1,2010,Anza Vista,267.932583,1239.0
2,2010,Bayview,170.098665,1239.0
3,2010,Buena Vista Park,347.394919,1239.0
4,2010,Central Richmond,319.027623,1239.0


Unnamed: 0,year,neighborhood,sale_price_sqr_foot,gross_rent
392,2016,Telegraph Hill,903.049771,4390.0
393,2016,Twin Peaks,970.08547,4390.0
394,2016,Van Ness/ Civic Center,552.602567,4390.0
395,2016,Visitacion Valley,328.319007,4390.0
396,2016,Westwood Park,631.195426,4390.0


### Step 3: Create an interactive line plot with hvPlot that visualises both `sale_price_sqr_foot` and `gross_rent`. Set the x-axis parameter to the year (`x="year"`). Use the `groupby` parameter to create an interactive widget for `neighborhood`.

### Step 4: Style and format the line plot to ensure a professionally styled visualisation.

In [10]:
# Use hvplot to create an interactive line plot of the average price per square foot
# The plot should have a dropdown selector for the neighborhood
# YOUR CODE HERE

# Define a function to create an interactive line plot
def create_line_plot(neighborhood):
    filtered_data = prices_by_year_by_neighborhood[prices_by_year_by_neighborhood['neighborhood'] == neighborhood]
    plot = filtered_data.hvplot.line(
        x='year',
        y='sale_price_sqr_foot',
        xlabel='Year',
        ylabel='Average Price per Square Foot',
        title=f'Average Sale Prices per Square Foot in {neighborhood}',
        line_width=2
    )
    return plot

# Create a list of unique neighborhoods for the widget
neighborhoods = prices_by_year_by_neighborhood['neighborhood'].unique().tolist()

# Create a widget for selecting neighborhoods
neighborhood_selector = pn.widgets.Select(name='Select Neighborhood', options=neighborhoods)

# Define a function to update the plot based on the selected neighborhood
def update_plot(event):
    selected_neighborhood = neighborhood_selector.value
    plot = create_line_plot(selected_neighborhood)
    pn.panel(plot).servable()

# Bind the update_plot function to the neighborhood_selector widget
neighborhood_selector.param.watch(update_plot, 'value')

# Create an initial plot with the first neighborhood in the list
initial_neighborhood = neighborhoods[0]
initial_plot = create_line_plot(initial_neighborhood)

# Display the initial plot and the neighborhood selector
pn.Row(neighborhood_selector, pn.panel(initial_plot)).servable()


BokehModel(combine_events=True, render_bundle={'docs_json': {'ad6a34d5-6511-48a8-9aaf-8a10d28762cd': {'version…

### Step 6: Use the interactive visualisation to answer the following question:

**Question:** For the Anza Vista neighborhood, is the average sale price per square foot for 2016 more or less than the price that’s listed for 2012? 

**Answer:** # Less than 2012 

---

## Build an Interactive Neighborhood Map

For this part of the assignment, explore the geospatial relationships in the data by using interactive visualisations with hvPlot and GeoViews. To build your map, use the `sfo_data_df` DataFrame (created during the initial import), which includes the neighborhood location data with the average prices. To do all this, complete the following steps:

1. Read the `neighborhood_coordinates.csv` file from the `Resources` folder into the notebook, and create a DataFrame named `neighborhood_locations_df`. Be sure to set the `index_col` of the DataFrame as “Neighborhood”.

2. Using the original `sfo_data_df` Dataframe, create a DataFrame named `all_neighborhood_info_df` that groups the data by neighborhood. Aggregate the results by the `mean` of the group.

3. Review the two code cells that concatenate the `neighborhood_locations_df` DataFrame with the `all_neighborhood_info_df` DataFrame. Note that the first cell uses the [Pandas concat function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html) to create a DataFrame named `all_neighborhoods_df`. The second cell cleans the data and sets the “Neighborhood” column. Be sure to run these cells to create the `all_neighborhoods_df` DataFrame, which you’ll need to create the geospatial visualisation.

4. Using hvPlot with GeoViews enabled, create a `points` plot for the `all_neighborhoods_df` DataFrame. Be sure to do the following:

    * Set the `size` parameter to “sale_price_sqr_foot”.

    * Set the `color` parameter to “gross_rent”.

    * Set the `size_max` parameter to “25”.

    * Set the `zoom` parameter to “11”.

Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of a scatter plot created with hvPlot and GeoViews.](Images/6-4-geoviews-plot.png)

5. Use the interactive map to answer the following question:

    * Which neighborhood has the highest gross rent, and which has the highest sale price per square foot?


### Step 1: Read the `neighborhood_coordinates.csv` file from the `Resources` folder into the notebook, and create a DataFrame named `neighborhood_locations_df`. Be sure to set the `index_col` of the DataFrame as “Neighborhood”.

In [11]:
# Load neighborhoods coordinates data
neighborhood_locations_df = pd.read_csv(Path('./Resources/neighborhoods_coordinates.csv'), index_col='Neighborhood')

# Review the DataFrame
# YOUR CODE HERE
neighborhood_locations_df

Unnamed: 0_level_0,Lat,Lon
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1
Alamo Square,37.791012,-122.402100
Anza Vista,37.779598,-122.443451
Bayview,37.734670,-122.401060
Bayview Heights,37.728740,-122.410980
Bernal Heights,37.728630,-122.443050
...,...,...
West Portal,37.740260,-122.463880
Western Addition,37.792980,-122.435790
Westwood Highlands,37.734700,-122.456854
Westwood Park,37.734150,-122.457000


### Step 2: Using the original `sfo_data_df` Dataframe, create a DataFrame named `all_neighborhood_info_df` that groups the data by neighborhood. Aggregate the results by the `mean` of the group.

In [12]:
# Calculate the mean values for each neighborhood
all_neighborhood_info_df = sfo_data_df.groupby('neighborhood').mean()

# Review the resulting DataFrame
# YOUR CODE HERE
all_neighborhood_info_df

Unnamed: 0_level_0,year,sale_price_sqr_foot,housing_units,gross_rent
neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Alamo Square,2013.000000,366.020712,378401.00,2817.285714
Anza Vista,2013.333333,373.382198,379050.00,3031.833333
Bayview,2012.000000,204.588623,376454.00,2318.400000
Bayview Heights,2015.000000,590.792839,382295.00,3739.000000
Bernal Heights,2013.500000,576.746488,379374.50,3080.333333
...,...,...,...,...
West Portal,2012.250000,498.488485,376940.75,2515.500000
Western Addition,2012.500000,307.562201,377427.50,2555.166667
Westwood Highlands,2012.000000,533.703935,376454.00,2250.500000
Westwood Park,2015.000000,687.087575,382295.00,3959.000000


### Step 3: Review the two code cells that concatenate the `neighborhood_locations_df` DataFrame with the `all_neighborhood_info_df` DataFrame. 

Note that the first cell uses the [Pandas concat function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html) to create a DataFrame named `all_neighborhoods_df`. 

The second cell cleans the data and sets the “Neighborhood” column. 

Be sure to run these cells to create the `all_neighborhoods_df` DataFrame, which you’ll need to create the geospatial visualisation.

In [13]:
# Using the Pandas `concat` function, join the 
# neighborhood_locations_df and the all_neighborhood_info_df DataFrame
# The axis of the concatenation is "columns".
# The concat function will automatially combine columns with
# identical information, while keeping the additional columns.
all_neighborhoods_df = pd.concat([neighborhood_locations_df, all_neighborhood_info_df], axis=1)
all_neighborhoods_df.reset_index(inplace=True)

# Review the resulting DataFrame
display(all_neighborhoods_df.head())
display(all_neighborhoods_df.tail())


Unnamed: 0,index,Lat,Lon,year,sale_price_sqr_foot,housing_units,gross_rent
0,Alamo Square,37.791012,-122.4021,2013.0,366.020712,378401.0,2817.285714
1,Anza Vista,37.779598,-122.443451,2013.333333,373.382198,379050.0,3031.833333
2,Bayview,37.73467,-122.40106,2012.0,204.588623,376454.0,2318.4
3,Bayview Heights,37.72874,-122.41098,2015.0,590.792839,382295.0,3739.0
4,Bernal Heights,37.72863,-122.44305,,,,


Unnamed: 0,index,Lat,Lon,year,sale_price_sqr_foot,housing_units,gross_rent
72,Yerba Buena,37.79298,-122.39636,2012.5,576.709848,377427.5,2555.166667
73,Bernal Heights,,,2013.5,576.746488,379374.5,3080.333333
74,Downtown,,,2013.0,391.434378,378401.0,2817.285714
75,Ingleside,,,2012.5,367.895144,377427.5,2509.0
76,Outer Richmond,,,2013.0,473.900773,378401.0,2817.285714


In [22]:
# Call the dropna function to remove any neighborhoods that do not have data
all_neighborhoods_df = all_neighborhoods_df.reset_index().dropna()

# Rename the "index" column as "Neighborhood" for use in the visualisation
all_neighborhoods_df = all_neighborhoods_df.rename(columns={"index": "Neighborhood"})

# Review the resulting DataFrame
display(all_neighborhoods_df.head())
display(all_neighborhoods_df.tail())

Unnamed: 0,Neighborhood,Neighborhood.1,Neighborhood.2,Neighborhood.3,Neighborhood.4,level_0,Neighborhood.5,Lat,Lon,year,sale_price_sqr_foot,housing_units,gross_rent
0,0,0,0,0,0,0,Alamo Square,37.791012,-122.4021,2013.0,366.020712,378401.0,2817.285714
1,1,1,1,1,1,1,Anza Vista,37.779598,-122.443451,2013.333333,373.382198,379050.0,3031.833333
2,2,2,2,2,2,2,Bayview,37.73467,-122.40106,2012.0,204.588623,376454.0,2318.4
3,3,3,3,3,3,3,Bayview Heights,37.72874,-122.41098,2015.0,590.792839,382295.0,3739.0
4,4,4,4,4,5,5,Buena Vista Park,37.76816,-122.43933,2012.833333,452.680591,378076.5,2698.833333


Unnamed: 0,Neighborhood,Neighborhood.1,Neighborhood.2,Neighborhood.3,Neighborhood.4,level_0,Neighborhood.5,Lat,Lon,year,sale_price_sqr_foot,housing_units,gross_rent
64,64,64,64,64,68,68,West Portal,37.74026,-122.46388,2012.25,498.488485,376940.75,2515.5
65,65,65,65,65,69,69,Western Addition,37.79298,-122.43579,2012.5,307.562201,377427.5,2555.166667
66,66,66,66,66,70,70,Westwood Highlands,37.7347,-122.456854,2012.0,533.703935,376454.0,2250.5
67,67,67,67,67,71,71,Westwood Park,37.73415,-122.457,2015.0,687.087575,382295.0,3959.0
68,68,68,68,68,72,72,Yerba Buena,37.79298,-122.39636,2012.5,576.709848,377427.5,2555.166667


### Step 4: Using hvPlot with GeoViews enabled, create a `points` plot for the `all_neighborhoods_df` DataFrame. Be sure to do the following:

* Set the `geo` parameter to True.
* Set the `size` parameter to “sale_price_sqr_foot”.
* Set the `color` parameter to “gross_rent”.
* Set the `frame_width` parameter to 700.
* Set the `frame_height` parameter to 500.
* Include a descriptive title.

In [17]:
# Create a plot to analyse neighborhood info
# YOUR CODE HERE
map_plot = all_neighborhoods_df.hvplot.points(
    x='Lon',
    y='Lat',
    geo=True,
    size='sale_price_sqr_foot',
    color='gross_rent',
    frame_width=800,
    frame_height=600,
    title='San Francisco Neighborhoods Map',
)

map_plot

### Step 5: Use the interactive map to answer the following question:

**Question:** Which neighborhood has the highest gross rent, and which has the highest sale price per square foot?

**Answer:** 

The neighborhood with the highest gross rent in 2015 is Presidio Heights with a gross rent of $3,739.


The neighborhood with the highest sale price per square foot in 2015 is Union Square District with a sale price per square foot of approximately $1,119.84.

## Compose Your Data Story

Based on the visualisations that you have created, compose a data story that synthesizes your analysis by answering the following questions:

**Question:**  How does the trend in rental income growth compare to the trend in sales prices? Does this same trend hold true for all the neighborhoods across San Francisco?

**Answer:** 

## Comparing Rental Income Growth and Sales Prices in San Francisco

Comparing the trends in rental income growth and sales prices in San Francisco can provide valuable insights into the real estate market. However, it's important to note that these trends can vary significantly by neighborhood. Here's a general overview of how the trends might compare and why they can differ across neighborhoods:

### Trend in Rental Income Growth vs. Sales Prices:

**Rental Income Growth:** Rental income growth tends to be influenced by factors such as demand for rental properties, job opportunities, and population growth. In cities like San Francisco with a strong job market, rental income can increase steadily over time, especially in neighborhoods close to employment hubs.

**Sales Prices:** Sales prices of properties, on the other hand, are influenced by various factors, including supply and demand dynamics, investor sentiment, interest rates, and economic conditions. Property prices can appreciate over the long term, but the rate of appreciation can fluctuate.

### Variations Across Neighborhoods:

The trends in rental income growth and sales prices can vary significantly across San Francisco neighborhoods due to the following factors:

- **Location:** Neighborhoods closer to the city center and major job centers tend to have higher demand for both rentals and sales, which can lead to more significant price and rental income growth. Areas farther from the city center may experience different trends.

- **Housing Type:** Different neighborhoods have varying housing types, such as single-family homes, apartments, and condos. The type of housing in a neighborhood can influence both rental income and sales prices.

- **Economic Factors:** Economic conditions, such as the presence of tech companies, can impact rental demand and sales prices. Tech hubs often drive higher demand for rentals and sales in nearby neighborhoods.

- **Development and Infrastructure:** Neighborhoods experiencing new developments, improved infrastructure, or revitalization efforts may see higher appreciation in sales prices. This can also impact rental income if it attracts more renters.

- **Market Cycles:** Real estate markets go through cycles of growth, stabilization, and correction. Different neighborhoods may be at different points in these cycles, affecting trends in rental income and sales prices.

- **Regulations:** Local regulations, zoning laws, and rent control policies can have a significant impact on rental income growth and property values. Regulations may vary by neighborhood.

In summary, while San Francisco as a whole may have experienced strong rental income growth and sales price appreciation in recent years, these trends can vary widely across neighborhoods due to location, housing types, economic factors, development, and market conditions. It's essential for investors to conduct neighborhood-specific research and consult with local real estate experts to make informed decisions based on their investment goals and risk tolerance.


**Question:** What insights can you share with your company about the potential one-click, buy-and-rent strategy that they're pursuing? Do neighborhoods exist that you would suggest for investment, and why?

**Answer:** 

## One-Click Buy-and-Rent Strategy Insights

**Advantages of the One-Click, Buy-and-Rent Strategy:**

1. **Convenience:** One-click purchasing can streamline the acquisition process, saving time and effort for investors.

2. **Diversification:** By offering properties in different neighborhoods, investors can diversify their real estate portfolio and reduce risk.

3. **Rental Demand:** San Francisco historically has strong rental demand due to its tech industry presence, so rental properties can generate consistent income.

4. **Potential for Appreciation:** Some neighborhoods in San Francisco have a history of property value appreciation, which can provide long-term capital gains.

**Considerations and Risks:**

1. **Market Volatility:** The real estate market can be volatile, and property values can fluctuate. It's crucial to conduct thorough market research and due diligence before making investments.

2. **Maintenance and Management:** Owning rental properties involves responsibilities such as maintenance, tenant management, and property upkeep, which can be time-consuming and require additional resources.

3. **Regulations:** Be aware of local rental regulations and tenant laws in San Francisco, as they can impact your ability to rent out properties and your obligations as a landlord.

4. **Property Selection:** Not all neighborhoods in San Francisco offer the same investment potential. Carefully select neighborhoods based on factors like rental yield, historical appreciation, and future growth prospects.

**Neighborhoods for Investment:**

While I can't provide real-time data, here are some neighborhoods in San Francisco that have historically been considered attractive for real estate investment:

1. **SoMa (South of Market):** Known for its proximity to tech companies, SoMa has seen strong demand for rental properties. However, property prices can be high.

2. **Mission District:** This neighborhood offers a mix of housing types and attracts a diverse demographic. It has seen appreciation in property values in the past.

3. **Inner Sunset:** Located close to Golden Gate Park and the University of California, San Francisco (UCSF), it has been popular among students and healthcare professionals, potentially providing a stable tenant base.

4. **Outer Richmond:** This neighborhood has a suburban feel and can offer more affordable investment options compared to some other parts of the city.

5. **Bayview-Hunters Point:** Historically an area with lower property prices, it has shown potential for future growth due to redevelopment projects and its proximity to the waterfront.

However, the attractiveness of neighborhoods can change over time, so it's essential to consult with local real estate experts, conduct thorough market research, and consider your investment goals and risk tolerance before deciding on specific neighborhoods for investment.

Additionally, consider working with a real estate agent or property management company with local expertise to help you navigate the San Francisco real estate market effectively. They can provide valuable insights and assistance in managing your rental properties.
