# Housing Rental Analysis for San Francisco

In this challenge, your job is to use your data visualization skills, including aggregation, interactive visualizations, and geospatial analysis, to find properties in the San Francisco market that are viable investment opportunities.

## Instructions

Use the `san_francisco_housing.ipynb` notebook to visualize and analyze the real-estate data.

Note that this assignment requires you to create a visualization by using hvPlot and GeoViews. Additionally, you need to read the `sfo_neighborhoods_census_data.csv` file from the `Resources` folder into the notebook and create the DataFrame that you’ll use in the analysis.

The main task in this Challenge is to visualize and analyze the real-estate data in your Jupyter notebook. Use the `san_francisco_housing.ipynb` notebook to complete the following tasks:

* Calculate and plot the housing units per year.

* Calculate and plot the average prices per square foot.

* Compare the average prices by neighborhood.

* Build an interactive neighborhood map.

* Compose your data story.

### Calculate and Plot the Housing Units per Year

For this part of the assignment, use numerical and visual aggregation to calculate the number of housing units per year, and then visualize the results as a bar chart. To do so, complete the following steps:

1. Use the `groupby` function to group the data by year. Aggregate the results by the `mean` of the groups.

2. Use the `hvplot` function to plot the `housing_units_by_year` DataFrame as a bar chart. Make the x-axis represent the `year` and the y-axis represent the `housing_units`.

3. Style and format the bar chart to ensure a professionally styled visualization.

4. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting bar chart.](Images/zoomed-housing-units-by-year.png)

5. Answer the following question:

    * What’s the overall trend in housing units over the period that you’re analyzing?

### Calculate and Plot the Average Sale Prices per Square Foot

For this part of the assignment, use numerical and visual aggregation to calculate the average prices per square foot, and then visualize the results as a line plot. To do so, complete the following steps:

1. Group the data by year, and then average the results. What’s the lowest gross rent that’s reported for the years that the DataFrame includes?

2. Create a new DataFrame named `prices_square_foot_by_year` by filtering out the “housing_units” column. The new DataFrame should include the averages per year for only the sale price per square foot and the gross rent.

3. Use hvPlot to plot the `prices_square_foot_by_year` DataFrame as a line plot.

    > **Hint** This single plot will include lines for both `sale_price_sqr_foot` and `gross_rent`.

4. Style and format the line plot to ensure a professionally styled visualization.

5. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting plot.](Images/avg-sale-px-sq-foot-gross-rent.png)

6. Use both the `prices_square_foot_by_year` DataFrame and interactive plots to answer the following questions:

    * Did any year experience a drop in the average sale price per square foot compared to the previous year?

    * If so, did the gross rent increase or decrease during that year?

### Compare the Average Sale Prices by Neighborhood

For this part of the assignment, use interactive visualizations and widgets to explore the average sale price per square foot by neighborhood. To do so, complete the following steps:

1. Create a new DataFrame that groups the original DataFrame by year and neighborhood. Aggregate the results by the `mean` of the groups.

2. Filter out the “housing_units” column to create a DataFrame that includes only the `sale_price_sqr_foot` and `gross_rent` averages per year.

3. Create an interactive line plot with hvPlot that visualizes both `sale_price_sqr_foot` and `gross_rent`. Set the x-axis parameter to the year (`x="year"`). Use the `groupby` parameter to create an interactive widget for `neighborhood`.

4. Style and format the line plot to ensure a professionally styled visualization.

5. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting plot.](Images/pricing-info-by-neighborhood.png)

6. Use the interactive visualization to answer the following question:

    * For the Anza Vista neighborhood, is the average sale price per square foot for 2016 more or less than the price that’s listed for 2012? 

### Build an Interactive Neighborhood Map

For this part of the assignment, explore the geospatial relationships in the data by using interactive visualizations with hvPlot and GeoViews. To build your map, use the `sfo_data_df` DataFrame (created during the initial import), which includes the neighborhood location data with the average prices. To do all this, complete the following steps:

1. Read the `neighborhoods_coordinates.csv` file from the `Resources` folder into the notebook, and create a DataFrame named `neighborhood_locations_df`. Be sure to set the `index_col` of the DataFrame as “Neighborhood”.

2. Using the original `sfo_data_df` DataFrame, create a DataFrame named `all_neighborhood_info_df` that groups the data by neighborhood. Aggregate the results by the `mean` of the group.

3. Review the two code cells that concatenate the `neighborhood_locations_df` DataFrame with the `all_neighborhood_info_df` DataFrame. Note that the first cell uses the [Pandas concat function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html) to create a DataFrame named `all_neighborhoods_df`. The second cell cleans the data and sets the “Neighborhood” column. Be sure to run these cells to create the `all_neighborhoods_df` DataFrame, which you’ll need to create the geospatial visualization.

4. Using hvPlot with GeoViews enabled, create a `points` plot for the `all_neighborhoods_df` DataFrame. Be sure to do the following:

    * Set the `geo` parameter to True.
    * Set the `size` parameter to “sale_price_sqr_foot”.
    * Set the `color` parameter to “gross_rent”.
    * Set the `frame_width` parameter to 700.
    * Set the `frame_height` parameter to 500.
    * Include a descriptive title.

Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of a scatter plot created with hvPlot and GeoViews.](Images/6-4-geoviews-plot.png)

5. Use the interactive map to answer the following question:

    * Which neighborhood has the highest gross rent, and which has the highest sale price per square foot?

### Compose Your Data Story

Based on the visualizations that you created, answer the following questions:

* How does the trend in rental income growth compare to the trend in sales prices? Does this same trend hold true for all the neighborhoods across San Francisco?

* What insights can you share with your company about the potential one-click, buy-and-rent strategy that they're pursuing? Do neighborhoods exist that you would suggest for investment, and why?

In [1]:
# Import the required libraries and dependencies
import pandas as pd
import hvplot.pandas
from pathlib import Path

## Import the data 

In [2]:
# Using the read_csv function and Path module, create a DataFrame 
# by importing the sfo_neighborhoods_census_data.csv file from the Resources folder

# Define the file path to the CSV file
csv_file_path = Path("sfo_neighborhoods_census_data.csv")

# Check if the file exists
if csv_file_path.exists():
    # Use the read_csv function to create a DataFrame
    sfo_data_df = pd.read_csv(csv_file_path)
else:
    print("File does not exist: ", csv_file_path)

# Review the first and last five rows of the DataFrame
# Display the first five rows of the DataFrame
print(sfo_data_df.head())

# Display the last five rows of the DataFrame
print(sfo_data_df.tail())


   year      neighborhood  sale_price_sqr_foot  housing_units  gross_rent
0  2010      Alamo Square           291.182945         372560        1239
1  2010        Anza Vista           267.932583         372560        1239
2  2010           Bayview           170.098665         372560        1239
3  2010  Buena Vista Park           347.394919         372560        1239
4  2010  Central Richmond           319.027623         372560        1239
     year            neighborhood  sale_price_sqr_foot  housing_units  \
392  2016          Telegraph Hill           903.049771         384242   
393  2016              Twin Peaks           970.085470         384242   
394  2016  Van Ness/ Civic Center           552.602567         384242   
395  2016       Visitacion Valley           328.319007         384242   
396  2016           Westwood Park           631.195426         384242   

     gross_rent  
392        4390  
393        4390  
394        4390  
395        4390  
396        4390  
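
The import cell above only prints a message when the CSV file is missing, so `sfo_data_df` stays undefined and the later cells fail with a `NameError`. A minimal sketch of a stricter loader follows; `load_csv` is a hypothetical helper name, not part of the assignment starter code.

```python
from pathlib import Path

import pandas as pd


def load_csv(csv_file_path, **read_csv_kwargs):
    """Hypothetical helper: read a CSV file, or raise if it is missing."""
    csv_file_path = Path(csv_file_path)
    if not csv_file_path.exists():
        # Fail immediately instead of leaving the DataFrame undefined
        raise FileNotFoundError(f"File does not exist: {csv_file_path}")
    # Extra keyword arguments (for example index_col) go straight to read_csv
    return pd.read_csv(csv_file_path, **read_csv_kwargs)
```

With this helper, the import could be written as `sfo_data_df = load_csv("sfo_neighborhoods_census_data.csv")`, using the same path as the cell above.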


---

## Calculate and Plot the Housing Units per Year

For this part of the assignment, use numerical and visual aggregation to calculate the number of housing units per year, and then visualize the results as a bar chart. To do so, complete the following steps:

1. Use the `groupby` function to group the data by year. Aggregate the results by the `mean` of the groups.

2. Use the `hvplot` function to plot the `housing_units_by_year` DataFrame as a bar chart. Make the x-axis represent the `year` and the y-axis represent the `housing_units`.

3. Style and format the bar chart to ensure a professionally styled visualization.

4. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting bar chart.](Images/zoomed-housing-units-by-year.png)

5. Answer the following question:

    * What’s the overall trend in housing units over the period that you’re analyzing?

     

### Step 1: Use the `groupby` function to group the data by year. Aggregate the results by the `mean` of the groups.

In [3]:
# pandas and hvplot.pandas were already imported in the first cell

# Use the groupby function to group the data by year and calculate the mean
housing_units_by_year = sfo_data_df.groupby('year')['housing_units'].mean()

### Step 2: Use the `hvplot` function to plot the `housing_units_by_year` DataFrame as a bar chart. Make the x-axis represent the `year` and the y-axis represent the `housing_units`.

### Step 3: Style and format the bar chart to ensure a professionally styled visualization.

In [4]:
# Create a visual aggregation to explore the housing units by year.
# This reuses housing_units_by_year from Step 1.

# Create the bar chart using hvplot
bar_chart = housing_units_by_year.hvplot.bar(
    x='year',
    y='housing_units',
    xlabel='Year',
    ylabel='Housing Units',
    title='Housing Units Per Year',
    color='blue',
    width=800,
    height=400
)

# Style and format the bar chart.
# The y-axis is zoomed to 350,000-400,000 so the year-to-year change is visible.
bar_chart.opts(
    show_legend=False,
    tools=['hover'],
    fontscale=1.2,
    xrotation=45,
    ylim=(350000, 400000),
    xlim=(2010, 2016),
    yformatter='%.0f'
)


### Step 5: Answer the following question:

**Question:** What is the overall trend in housing_units over the period being analyzed?

**Answer:** The overall trend is that the number of housing units grows steadily, in a roughly linear manner, from 372,560 in 2010 to 384,242 in 2016.
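
The trend can also be checked numerically rather than read off the chart. The sketch below is a minimal example under stated assumptions: `sample_df` is a stand-in for `sfo_data_df` that holds only the 2010 and 2016 `housing_units` values visible in the head and tail output above, so it shows the technique rather than the full yearly series.

```python
import pandas as pd

# Stand-in for sfo_data_df: only the 2010 and 2016 values shown earlier
sample_df = pd.DataFrame({
    "year": [2010, 2010, 2016, 2016],
    "housing_units": [372560, 372560, 384242, 384242],
})

# Same aggregation as Step 1
housing_units_by_year = sample_df.groupby("year")["housing_units"].mean()

# True when every year has at least as many units as the year before
is_growing = housing_units_by_year.is_monotonic_increasing

# Total change and average change per year across the period
total_change = housing_units_by_year.iloc[-1] - housing_units_by_year.iloc[0]
years_elapsed = housing_units_by_year.index[-1] - housing_units_by_year.index[0]
average_change_per_year = total_change / years_elapsed

print(is_growing, total_change, average_change_per_year)
```

Running the same three calculations on the full `housing_units_by_year` result would show whether the growth is monotonic in every year.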

---

## Calculate and Plot the Average Sale Prices per Square Foot

For this part of the assignment, use numerical and visual aggregation to calculate the average prices per square foot, and then visualize the results as a line plot. To do so, complete the following steps:

1. Group the data by year, and then average the results. What’s the lowest gross rent that’s reported for the years that the DataFrame includes?

2. Create a new DataFrame named `prices_square_foot_by_year` by filtering out the “housing_units” column. The new DataFrame should include the averages per year for only the sale price per square foot and the gross rent.

3. Use hvPlot to plot the `prices_square_foot_by_year` DataFrame as a line plot.

    > **Hint** This single plot will include lines for both `sale_price_sqr_foot` and `gross_rent`.

4. Style and format the line plot to ensure a professionally styled visualization.

5. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting plot.](Images/avg-sale-px-sq-foot-gross-rent.png)

6. Use both the `prices_square_foot_by_year` DataFrame and interactive plots to answer the following questions:

    * Did any year experience a drop in the average sale price per square foot compared to the previous year?

    * If so, did the gross rent increase or decrease during that year?



### Step 1: Group the data by year, and then average the results.

In [5]:
# Group the data by year and calculate the mean of "sale_price_sqr_foot" and "gross_rent"
average_prices_by_year = sfo_data_df.groupby('year')[['sale_price_sqr_foot', 'gross_rent']].mean()

# Move "year" from the index back to a regular column for display
average_prices_by_year = average_prices_by_year.reset_index()

# Find the lowest reported gross rent
lowest_gross_rent = sfo_data_df['gross_rent'].min()

# Review the resulting DataFrame and the lowest gross rent
print(average_prices_by_year)
print(f'Lowest Gross Rent: {lowest_gross_rent}')


   year  sale_price_sqr_foot  gross_rent
0  2010           369.344353      1239.0
1  2011           341.903429      1530.0
2  2012           399.389968      2324.0
3  2013           483.600304      2971.0
4  2014           556.277273      3528.0
5  2015           632.540352      3739.0
6  2016           697.643709      4390.0
Lowest Gross Rent: 1239


**Question:** What is the lowest gross rent reported for the years included in the DataFrame?

**Answer:** The lowest gross rent reported is 1,239, for the year 2010.

### Step 2: Create a new DataFrame named `prices_square_foot_by_year` by filtering out the “housing_units” column. The new DataFrame should include the averages per year for only the sale price per square foot and the gross rent.

In [6]:
# Group the data by year and calculate the mean of "sale_price_sqr_foot" and "gross_rent".
# Selecting only these two columns leaves out "housing_units".
average_prices_by_year = sfo_data_df.groupby('year')[['sale_price_sqr_foot', 'gross_rent']].mean()

# Move "year" from the index back to a regular column so it can be used as the x-axis
prices_square_foot_by_year = average_prices_by_year.reset_index()

# Review the resulting DataFrame
print(prices_square_foot_by_year)


   year  sale_price_sqr_foot  gross_rent
0  2010           369.344353      1239.0
1  2011           341.903429      1530.0
2  2012           399.389968      2324.0
3  2013           483.600304      2971.0
4  2014           556.277273      3528.0
5  2015           632.540352      3739.0
6  2016           697.643709      4390.0


### Step 3: Use hvPlot to plot the `prices_square_foot_by_year` DataFrame as a line plot.

> **Hint** This single plot will include lines for both `sale_price_sqr_foot` and `gross_rent`

### Step 4: Style and format the line plot to ensure a professionally styled visualization.


In [7]:
#  Use hvPlot to create a line plot for "sale_price_sqr_foot" and "gross_rent"
line_plot = prices_square_foot_by_year.hvplot.line(
    x='year',
    y=['sale_price_sqr_foot', 'gross_rent'],
    xlabel='Year',
    ylabel='Average Sale Price per Sqft / Gross Rent',
    title='Average Sale Price per Sqft and Gross Rent Per Year',
    color=['blue', 'green'],
    width=800,
    height=400
)

#  Style and format the line plot
line_plot.opts(
    show_legend=True,
    tools=['hover'],
    fontscale=1.2,
    xrotation=45
)

### Step 6: Use both the `prices_square_foot_by_year` DataFrame and interactive plots to answer the following questions:

**Question:** Did any year experience a drop in the average sale price per square foot compared to the previous year?

**Answer:** Yes. The average sale price per square foot dropped from about 369.34 in 2010 to about 341.90 in 2011. Every later year shows an increase over the previous year.

**Question:** If so, did the gross rent increase or decrease during that year?

**Answer:** Gross rent increased during that year, from 1,239 in 2010 to 1,530 in 2011, and it continued to rise steadily through 2016.
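
Both questions can be checked with a year-over-year difference instead of reading the plot. The sketch below is a minimal example that rebuilds `prices_square_foot_by_year` from the values printed in Step 2 (with `year` as the index, which is a choice made here for convenience) and then lists the years in which the sale price fell.

```python
import pandas as pd

# Yearly averages copied from the Step 2 output above
prices_square_foot_by_year = pd.DataFrame({
    "year": [2010, 2011, 2012, 2013, 2014, 2015, 2016],
    "sale_price_sqr_foot": [369.344353, 341.903429, 399.389968, 483.600304,
                            556.277273, 632.540352, 697.643709],
    "gross_rent": [1239.0, 1530.0, 2324.0, 2971.0, 3528.0, 3739.0, 4390.0],
}).set_index("year")

# Change in each column compared to the previous year (first row is NaN)
yearly_change = prices_square_foot_by_year.diff()

# Keep only the years in which the sale price per square foot went down
drop_years = yearly_change[yearly_change["sale_price_sqr_foot"] < 0]

print(drop_years)
```

The `gross_rent` column of `drop_years` shows how rent moved in each of those years: a positive value means rent rose while the sale price fell.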

---

## Compare the Average Sale Prices by Neighborhood

For this part of the assignment, use interactive visualizations and widgets to explore the average sale price per square foot by neighborhood. To do so, complete the following steps:

1. Create a new DataFrame that groups the original DataFrame by year and neighborhood. Aggregate the results by the `mean` of the groups.

2. Filter out the “housing_units” column to create a DataFrame that includes only the `sale_price_sqr_foot` and `gross_rent` averages per year.

3. Create an interactive line plot with hvPlot that visualizes both `sale_price_sqr_foot` and `gross_rent`. Set the x-axis parameter to the year (`x="year"`). Use the `groupby` parameter to create an interactive widget for `neighborhood`.

4. Style and format the line plot to ensure a professionally styled visualization.

5. Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of the resulting plot.](Images/pricing-info-by-neighborhood.png)

6. Use the interactive visualization to answer the following question:

    * For the Anza Vista neighborhood, is the average sale price per square foot for 2016 more or less than the price that’s listed for 2012? 


### Step 1: Create a new DataFrame that groups the original DataFrame by year and neighborhood. Aggregate the results by the `mean` of the groups.

In [8]:
# Group the data by year and neighborhood, and calculate the mean of "sale_price_sqr_foot"
prices_by_year_by_neighborhood = sfo_data_df.groupby(['year', 'neighborhood'])['sale_price_sqr_foot'].mean().reset_index()

# Review the resulting DataFrame
print(prices_by_year_by_neighborhood)


     year            neighborhood  sale_price_sqr_foot
0    2010            Alamo Square           291.182945
1    2010              Anza Vista           267.932583
2    2010                 Bayview           170.098665
3    2010        Buena Vista Park           347.394919
4    2010        Central Richmond           319.027623
..    ...                     ...                  ...
392  2016          Telegraph Hill           903.049771
393  2016              Twin Peaks           970.085470
394  2016  Van Ness/ Civic Center           552.602567
395  2016       Visitacion Valley           328.319007
396  2016           Westwood Park           631.195426

[397 rows x 3 columns]


### Step 2: Filter out the “housing_units” column to create a DataFrame that includes only the `sale_price_sqr_foot` and `gross_rent` averages per year.

In [9]:
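# Keep only the year, neighborhood, and sale price columns.
# This drops "housing_units" (and also "gross_rent", which the plotting cell selects again).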
prices_by_year_by_neighborhood = sfo_data_df[['year', 'neighborhood', 'sale_price_sqr_foot']]

# Review the first and last five rows of the DataFrame
first_five_rows = prices_by_year_by_neighborhood.head()
last_five_rows = prices_by_year_by_neighborhood.tail()

# Print both the first and last five rows
print("First Five Rows:")
print(first_five_rows)
print("\nLast Five Rows:")
print(last_five_rows)


First Five Rows:
   year      neighborhood  sale_price_sqr_foot
0  2010      Alamo Square           291.182945
1  2010        Anza Vista           267.932583
2  2010           Bayview           170.098665
3  2010  Buena Vista Park           347.394919
4  2010  Central Richmond           319.027623

Last Five Rows:
     year            neighborhood  sale_price_sqr_foot
392  2016          Telegraph Hill           903.049771
393  2016              Twin Peaks           970.085470
394  2016  Van Ness/ Civic Center           552.602567
395  2016       Visitacion Valley           328.319007
396  2016           Westwood Park           631.195426
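
Step 2 asks for a DataFrame that keeps both `sale_price_sqr_foot` and `gross_rent` after grouping. A minimal sketch of one way to do that follows. It assumes a small `sample_df` built from a few rows of the head and tail output shown at import time as a stand-in for `sfo_data_df`.

```python
import pandas as pd

# Stand-in for sfo_data_df: four rows copied from the import output
sample_df = pd.DataFrame({
    "year": [2010, 2010, 2016, 2016],
    "neighborhood": ["Alamo Square", "Anza Vista", "Telegraph Hill", "Twin Peaks"],
    "sale_price_sqr_foot": [291.182945, 267.932583, 903.049771, 970.085470],
    "housing_units": [372560, 372560, 384242, 384242],
    "gross_rent": [1239, 1239, 4390, 4390],
})

# Step 1: group by year and neighborhood, then average every remaining column
grouped = sample_df.groupby(["year", "neighborhood"]).mean()

# Step 2: drop only "housing_units" so both price columns are kept
prices_by_year_by_neighborhood = grouped.drop(columns=["housing_units"])

print(prices_by_year_by_neighborhood)
```

Dropping the unwanted column, rather than selecting a single column, keeps `gross_rent` available for the interactive plot in Step 3.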


### Step 3: Create an interactive line plot with hvPlot that visualizes both `sale_price_sqr_foot` and `gross_rent`. Set the x-axis parameter to the year (`x="year"`). Use the `groupby` parameter to create an interactive widget for `neighborhood`.

### Step 4: Style and format the line plot to ensure a professionally styled visualization.

In [10]:
# Filter the data to include relevant columns
# (pandas and hvplot.pandas were already imported in the first cell)
neighborhood_prices = sfo_data_df[['year', 'neighborhood', 'sale_price_sqr_foot', 'gross_rent']]

#  Create the interactive line plot with a dropdown selector for the neighborhood
line_plot = neighborhood_prices.hvplot.line(
    x="year",
    y=["sale_price_sqr_foot", "gross_rent"],
    groupby="neighborhood",
    xlabel="Year",
    ylabel="Average Price per Sqft / Gross Rent",
    title="Average Price per Sqft and Gross Rent by Neighborhood",
    width=800,
    height=400
)

#  Style and format the line plot
line_plot.opts(
    show_legend=True,
    tools=['hover'],
    fontscale=1.2,
    xrotation=45
)



### Step 6: Use the interactive visualization to answer the following question:

**Question:** For the Anza Vista neighborhood, is the average sale price per square foot for 2016 more or less than the price that’s listed for 2012? 

**Answer:** For Anza Vista, the average sale price per square foot in 2016 is less than the price listed for 2012.
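
A comparison like this can also be made without the widget. The sketch below is a minimal example under stated assumptions: `compare_years` is a hypothetical helper, it assumes one row per year for each neighborhood, and the `example_df` values are placeholders for illustration, not figures from the dataset.

```python
import pandas as pd


def compare_years(df, neighborhood, first_year, second_year,
                  column="sale_price_sqr_foot"):
    """Hypothetical helper: say whether `column` in second_year is
    'more', 'less', or 'equal' compared to first_year."""
    # Assumes exactly one row per year for the chosen neighborhood
    values = df[df["neighborhood"] == neighborhood].set_index("year")[column]
    first_value = values.loc[first_year]
    second_value = values.loc[second_year]
    if second_value > first_value:
        return "more"
    if second_value < first_value:
        return "less"
    return "equal"


# Placeholder data: NOT taken from the San Francisco dataset
example_df = pd.DataFrame({
    "year": [2012, 2016],
    "neighborhood": ["Example Hill", "Example Hill"],
    "sale_price_sqr_foot": [100.0, 80.0],
})

result = compare_years(example_df, "Example Hill", 2012, 2016)
print(result)
```

In the notebook, the equivalent call would be `compare_years(sfo_data_df, "Anza Vista", 2012, 2016)`.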

---

## Build an Interactive Neighborhood Map

For this part of the assignment, explore the geospatial relationships in the data by using interactive visualizations with hvPlot and GeoViews. To build your map, use the `sfo_data_df` DataFrame (created during the initial import), which includes the neighborhood location data with the average prices. To do all this, complete the following steps:

1. Read the `neighborhoods_coordinates.csv` file from the `Resources` folder into the notebook, and create a DataFrame named `neighborhood_locations_df`. Be sure to set the `index_col` of the DataFrame as “Neighborhood”.

2. Using the original `sfo_data_df` DataFrame, create a DataFrame named `all_neighborhood_info_df` that groups the data by neighborhood. Aggregate the results by the `mean` of the group.

3. Review the two code cells that concatenate the `neighborhood_locations_df` DataFrame with the `all_neighborhood_info_df` DataFrame. Note that the first cell uses the [Pandas concat function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html) to create a DataFrame named `all_neighborhoods_df`. The second cell cleans the data and sets the “Neighborhood” column. Be sure to run these cells to create the `all_neighborhoods_df` DataFrame, which you’ll need to create the geospatial visualization.

4. Using hvPlot with GeoViews enabled, create a `points` plot for the `all_neighborhoods_df` DataFrame. Be sure to do the following:

    * Set the `size` parameter to “sale_price_sqr_foot”.

    * Set the `color` parameter to “gross_rent”.

    * Set the `size_max` parameter to “25”.

    * Set the `zoom` parameter to “11”.

Note that your resulting plot should appear similar to the following image:

![A screenshot depicts an example of a scatter plot created with hvPlot and GeoViews.](Images/6-4-geoviews-plot.png)

5. Use the interactive map to answer the following question:

    * Which neighborhood has the highest gross rent, and which has the highest sale price per square foot?


### Step 1: Read the `neighborhoods_coordinates.csv` file from the `Resources` folder into the notebook, and create a DataFrame named `neighborhood_locations_df`. Be sure to set the `index_col` of the DataFrame as “Neighborhood”.

In [11]:
import pandas as pd
from pathlib import Path

# Define the file path to the CSV file
csv_file_path = Path("neighborhoods_coordinates.csv")

# Check if the file exists
if csv_file_path.exists():
    # Use the read_csv function to create the DataFrame
    neighborhood_locations_df = pd.read_csv(csv_file_path, index_col="Neighborhood")
else:
    print("File does not exist: ", csv_file_path)

# Review the DataFrame
print(neighborhood_locations_df)


                          Lat         Lon
Neighborhood                             
Alamo Square        37.791012 -122.402100
Anza Vista          37.779598 -122.443451
Bayview             37.734670 -122.401060
Bayview Heights     37.728740 -122.410980
Bernal Heights      37.728630 -122.443050
...                       ...         ...
West Portal         37.740260 -122.463880
Western Addition    37.792980 -122.435790
Westwood Highlands  37.734700 -122.456854
Westwood Park       37.734150 -122.457000
Yerba Buena         37.792980 -122.396360

[73 rows x 2 columns]


### Step 2: Using the original `sfo_data_df` DataFrame, create a DataFrame named `all_neighborhood_info_df` that groups the data by neighborhood. Aggregate the results by the `mean` of the group.

In [12]:
# Calculate the mean values for each neighborhood
all_neighborhood_info_df = sfo_data_df.groupby('neighborhood').mean()

# Review the resulting DataFrame
print(all_neighborhood_info_df)


                           year  sale_price_sqr_foot  housing_units  \
neighborhood                                                          
Alamo Square        2013.000000           366.020712      378401.00   
Anza Vista          2013.333333           373.382198      379050.00   
Bayview             2012.000000           204.588623      376454.00   
Bayview Heights     2015.000000           590.792839      382295.00   
Bernal Heights      2013.500000           576.746488      379374.50   
...                         ...                  ...            ...   
West Portal         2012.250000           498.488485      376940.75   
Western Addition    2012.500000           307.562201      377427.50   
Westwood Highlands  2012.000000           533.703935      376454.00   
Westwood Park       2015.000000           687.087575      382295.00   
Yerba Buena         2012.500000           576.709848      377427.50   

                     gross_rent  
neighborhood                     
Alamo Sq

### Step 3: Review the two code cells that concatenate the `neighborhood_locations_df` DataFrame with the `all_neighborhood_info_df` DataFrame. 

Note that the first cell uses the [Pandas concat function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html) to create a DataFrame named `all_neighborhoods_df`. 

The second cell cleans the data and sets the “Neighborhood” column. 

Be sure to run these cells to create the `all_neighborhoods_df` DataFrame, which you’ll need to create the geospatial visualization.

In [13]:
# Using the Pandas `concat` function, join the 
# neighborhood_locations_df and the all_neighborhood_info_df DataFrame
# The axis of the concatenation is "columns".
# The concat function aligns rows on their shared index labels
# and keeps the columns from both DataFrames. Labels found in only
# one DataFrame get NaN in the other DataFrame's columns.

all_neighborhoods_df = pd.concat(
    [neighborhood_locations_df, all_neighborhood_info_df], 
    axis="columns",
    sort=False
)

# Review the resulting DataFrame
display(all_neighborhoods_df.head())
display(all_neighborhoods_df.tail())


| | Lat | Lon | year | sale_price_sqr_foot | housing_units | gross_rent |
|---|---|---|---|---|---|---|
| Alamo Square | 37.791012 | -122.4021 | 2013.0 | 366.020712 | 378401.0 | 2817.285714 |
| Anza Vista | 37.779598 | -122.443451 | 2013.333333 | 373.382198 | 379050.0 | 3031.833333 |
| Bayview | 37.73467 | -122.40106 | 2012.0 | 204.588623 | 376454.0 | 2318.4 |
| Bayview Heights | 37.72874 | -122.41098 | 2015.0 | 590.792839 | 382295.0 | 3739.0 |
| Bernal Heights | 37.72863 | -122.44305 | NaN | NaN | NaN | NaN |


| | Lat | Lon | year | sale_price_sqr_foot | housing_units | gross_rent |
|---|---|---|---|---|---|---|
| Yerba Buena | 37.79298 | -122.39636 | 2012.5 | 576.709848 | 377427.5 | 2555.166667 |
| Bernal Heights | NaN | NaN | 2013.5 | 576.746488 | 379374.5 | 3080.333333 |
| Downtown | NaN | NaN | 2013.0 | 391.434378 | 378401.0 | 2817.285714 |
| Ingleside | NaN | NaN | 2012.5 | 367.895144 | 377427.5 | 2509.0 |
| Outer Richmond | NaN | NaN | 2013.0 | 473.900773 | 378401.0 | 2817.285714 |
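
In the output above, Bernal Heights appears twice: once with coordinates but no prices, and once with prices but no coordinates. That pattern suggests the two index labels do not match exactly. The sketch below is a minimal example of how `concat` behaves in that case; it assumes trailing whitespace as the cause, which is one common possibility and has not been confirmed for this dataset. The frame names `locations` and `info` are stand-ins.

```python
import pandas as pd

# Stand-ins for neighborhood_locations_df and all_neighborhood_info_df
locations = pd.DataFrame(
    {"Lat": [37.791012, 37.72863], "Lon": [-122.4021, -122.44305]},
    index=["Alamo Square", "Bernal Heights"],
)
info = pd.DataFrame(
    {"sale_price_sqr_foot": [366.020712, 576.746488]},
    # Assumed cause of the mismatch: a trailing space in one label
    index=["Alamo Square", "Bernal Heights "],
)

# concat aligns on index labels, so the mismatched label becomes its own row
misaligned = pd.concat([locations, info], axis="columns", sort=False)

# Normalizing the labels first lets the rows line up
info.index = info.index.str.strip()
aligned = pd.concat([locations, info], axis="columns", sort=False)

print(misaligned)
print(aligned)
```

Because the next cell calls `dropna`, any neighborhood whose labels fail to align is removed from the map entirely, so checking the labels before concatenating can preserve those rows.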


In [14]:
# Call the dropna function to remove any neighborhoods that do not have data
all_neighborhoods_df = all_neighborhoods_df.reset_index().dropna()

# Rename the "index" column as "Neighborhood" for use in the Visualization
all_neighborhoods_df = all_neighborhoods_df.rename(columns={"index": "Neighborhood"})

# Review the resulting DataFrame
display(all_neighborhoods_df.head())
display(all_neighborhoods_df.tail())

| | Neighborhood | Lat | Lon | year | sale_price_sqr_foot | housing_units | gross_rent |
|---|---|---|---|---|---|---|---|
| 0 | Alamo Square | 37.791012 | -122.4021 | 2013.0 | 366.020712 | 378401.0 | 2817.285714 |
| 1 | Anza Vista | 37.779598 | -122.443451 | 2013.333333 | 373.382198 | 379050.0 | 3031.833333 |
| 2 | Bayview | 37.73467 | -122.40106 | 2012.0 | 204.588623 | 376454.0 | 2318.4 |
| 3 | Bayview Heights | 37.72874 | -122.41098 | 2015.0 | 590.792839 | 382295.0 | 3739.0 |
| 5 | Buena Vista Park | 37.76816 | -122.43933 | 2012.833333 | 452.680591 | 378076.5 | 2698.833333 |


| | Neighborhood | Lat | Lon | year | sale_price_sqr_foot | housing_units | gross_rent |
|---|---|---|---|---|---|---|---|
| 68 | West Portal | 37.74026 | -122.46388 | 2012.25 | 498.488485 | 376940.75 | 2515.5 |
| 69 | Western Addition | 37.79298 | -122.43579 | 2012.5 | 307.562201 | 377427.5 | 2555.166667 |
| 70 | Westwood Highlands | 37.7347 | -122.456854 | 2012.0 | 533.703935 | 376454.0 | 2250.5 |
| 71 | Westwood Park | 37.73415 | -122.457 | 2015.0 | 687.087575 | 382295.0 | 3959.0 |
| 72 | Yerba Buena | 37.79298 | -122.39636 | 2012.5 | 576.709848 | 377427.5 | 2555.166667 |


### Step 4: Using hvPlot with GeoViews enabled, create a `points` plot for the `all_neighborhoods_df` DataFrame. Be sure to do the following:

* Set the `geo` parameter to True.
* Set the `size` parameter to “sale_price_sqr_foot”.
* Set the `color` parameter to “gross_rent”.
* Set the `frame_width` parameter to 700.
* Set the `frame_height` parameter to 500.
* Include a descriptive title.

In [15]:
import hvplot.pandas

# Create a plot to analyze neighborhood info
neighborhood_plot = all_neighborhoods_df.hvplot.points(
    x="Lon",
    y="Lat",
    geo=True,
    size="sale_price_sqr_foot",
    color="gross_rent",
    size_max=25,
    zoom=11,
    frame_width=700,
    frame_height=500,
    title="San Francisco Neighborhoods Analysis",
)

neighborhood_plot




### Step 5: Use the interactive map to answer the following question:

**Question:** Which neighborhood has the highest gross rent, and which has the highest sale price per square foot?

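**Answer:** The hover readings recorded from the map are:

* Lon: -122.457, Lat: 37.734 (Westwood Park in `all_neighborhoods_df`): sale_price_sqr_foot: 687.088, gross_rent: 3959
* Other hovered points: sale_price_sqr_foot: 779.811, 528.318, and 533.704 (the last value matches Westwood Highlands in `all_neighborhoods_df`)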
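
Hover readings are easy to misread when points overlap, so the same question can be answered directly from the DataFrame with `idxmax`. The sketch below is a minimal example: `sample` holds only five rows copied from the `all_neighborhoods_df` output above, so its result applies to those rows, not to the full set of neighborhoods.

```python
import pandas as pd

# Five rows copied from the all_neighborhoods_df output above
sample = pd.DataFrame({
    "Neighborhood": ["Alamo Square", "Bayview Heights", "Westwood Highlands",
                     "Westwood Park", "Yerba Buena"],
    "sale_price_sqr_foot": [366.020712, 590.792839, 533.703935,
                            687.087575, 576.709848],
    "gross_rent": [2817.285714, 3739.0, 2250.5, 3959.0, 2555.166667],
}).set_index("Neighborhood")

# idxmax returns the index label (the neighborhood) of the largest value
highest_rent = sample["gross_rent"].idxmax()
highest_price = sample["sale_price_sqr_foot"].idxmax()

print(highest_rent, highest_price)
```

Applying the same two `idxmax` calls to the full `all_neighborhoods_df` (after `set_index("Neighborhood")`) would name the neighborhoods for the whole map.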

## Compose Your Data Story

Based on the visualizations that you have created, compose a data story that synthesizes your analysis by answering the following questions:

**Question:**  How does the trend in rental income growth compare to the trend in sales prices? Does this same trend hold true for all the neighborhoods across San Francisco?

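**Answer:** Citywide, both measures rise over the period: the average gross rent grows from 1,239 in 2010 to 4,390 in 2016, while the average sale price per square foot grows from about 369 to about 698, with a dip in 2011. The trend does not hold uniformly, though. Each neighborhood across San Francisco shows a different pattern of rental income growth compared to the sale prices in that area.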
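
To put numbers on the comparison between rent growth and sale price growth, a minimal sketch follows. It uses the citywide 2010 and 2016 averages printed in the `prices_square_foot_by_year` output earlier in the notebook; the name `growth_pct` is introduced here for illustration.

```python
import pandas as pd

# Citywide averages for the first and last year, from the earlier output
endpoints = pd.DataFrame(
    {
        "sale_price_sqr_foot": [369.344353, 697.643709],
        "gross_rent": [1239.0, 4390.0],
    },
    index=[2010, 2016],
)

# Percentage growth from 2010 to 2016 for each column
growth_pct = (endpoints.loc[2016] / endpoints.loc[2010] - 1) * 100

# True when rent grew faster than the sale price per square foot
rent_outpaced_prices = growth_pct["gross_rent"] > growth_pct["sale_price_sqr_foot"]

print(growth_pct.round(1))
print(rent_outpaced_prices)
```

The same calculation could be repeated per neighborhood, by grouping on `neighborhood` first, to see where the citywide pattern does and does not hold.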

**Question:** What insights can you share with your company about the potential one-click, buy-and-rent strategy that they're pursuing? Do neighborhoods exist that you would suggest for investment, and why?

**Answer:** The buy-and-rent strategy should be decided neighborhood by neighborhood across San Francisco, because the location of a property drives how much an investor is willing to pay to buy or rent there.