Python libraries used:
- matplotlib
- matplotlib.pyplot
- pandas
- seaborn
- numpy
- statsmodels.api
Python skills demonstrated:
- Subsetting a Pandas DataFrame using [] and boolean operators
- Counting records with value_counts()
- Creating calculated fields
- Group By in Pandas
- Creating Bar Plots with Matplotlib
Project tasks:
- Count how many Airbnb listings are in each of the 5 Neighbourhood Groups (Manhattan, Brooklyn, Queens, Bronx, Staten Island), then identify which Neighbourhood Groups have the greatest number of Airbnb listings.
- Calculate the percentage of Airbnb listings that each Neighbourhood Group contains.
- Create a new calculated field called Revenue and add it to the Airbnb DataFrame, calculated as the Price column multiplied by the Number_Of_Reviews column.
- Create a Bar Plot that shows which Neighbourhood Group has the highest average revenue.
- Filter the Airbnb DataFrame to include only the Neighbourhood Groups Manhattan, Brooklyn, and Queens.
- Identify the top 3 revenue-generating neighbourhoods within each of the 3 Neighbourhood Groups. This should give 9 rows overall: the top 3 neighbourhoods within each of the 3 Neighbourhood Groups.
- Filter the Airbnb DataFrame to include only the top 3 neighbourhoods within each Neighbourhood Group.
- Identify the room type with the highest average revenue in each of the nine neighbourhoods and plot this in a Bar Chart.
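
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook. The lowercase column names (neighbourhood_group, neighbourhood, room_type, price, number_of_reviews, revenue) are assumptions, the small DataFrame is invented toy data standing in for the real listings, and "top revenue-generating" is read here as highest total revenue per neighbourhood, which is one possible interpretation.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for the Airbnb listings; all values are invented for illustration.
rows = [
    ("Manhattan", "Harlem", "Private room", 100, 10),
    ("Manhattan", "Harlem", "Entire home/apt", 200, 8),
    ("Manhattan", "Midtown", "Entire home/apt", 250, 4),
    ("Manhattan", "Chelsea", "Entire home/apt", 300, 2),
    ("Manhattan", "SoHo", "Private room", 150, 1),
    ("Brooklyn", "Williamsburg", "Entire home/apt", 180, 20),
    ("Brooklyn", "Bushwick", "Private room", 70, 30),
    ("Brooklyn", "Park Slope", "Entire home/apt", 160, 5),
    ("Brooklyn", "Greenpoint", "Private room", 80, 3),
    ("Queens", "Astoria", "Private room", 90, 5),
    ("Queens", "Flushing", "Private room", 60, 12),
    ("Queens", "Long Island City", "Entire home/apt", 140, 6),
    ("Queens", "Jamaica", "Shared room", 40, 2),
    ("Bronx", "Fordham", "Shared room", 40, 2),
    ("Staten Island", "St. George", "Private room", 75, 4),
]
columns = ["neighbourhood_group", "neighbourhood", "room_type",
           "price", "number_of_reviews"]  # assumed column names
airbnb = pd.DataFrame(rows, columns=columns)

# Count listings per Neighbourhood Group, then express the counts as percentages.
counts = airbnb["neighbourhood_group"].value_counts()
percentages = airbnb["neighbourhood_group"].value_counts(normalize=True) * 100

# Calculated field: revenue = price x number_of_reviews.
airbnb["revenue"] = airbnb["price"] * airbnb["number_of_reviews"]

# Average revenue per Neighbourhood Group, shown as a bar plot.
avg_revenue = (airbnb.groupby("neighbourhood_group")["revenue"]
               .mean()
               .sort_values(ascending=False))
fig, ax = plt.subplots()
ax.bar(avg_revenue.index, avg_revenue.values)
ax.set_title("Average revenue by Neighbourhood Group")
ax.set_ylabel("Average revenue")

# Subset with [] and boolean operators: keep Manhattan, Brooklyn and Queens.
subset = airbnb[(airbnb["neighbourhood_group"] == "Manhattan")
                | (airbnb["neighbourhood_group"] == "Brooklyn")
                | (airbnb["neighbourhood_group"] == "Queens")]

# Top 3 neighbourhoods by total revenue within each group (assumed interpretation).
nbhd_revenue = (subset.groupby(["neighbourhood_group", "neighbourhood"])["revenue"]
                .sum()
                .reset_index())
top3 = (nbhd_revenue.sort_values("revenue", ascending=False)
        .groupby("neighbourhood_group")
        .head(3))

# Keep only listings in those top neighbourhoods; merging on both keys
# avoids mixing up neighbourhoods that share a name across groups.
keys = ["neighbourhood_group", "neighbourhood"]
top_listings = subset.merge(top3[keys], on=keys)

# Room type with the highest average revenue in each top neighbourhood.
room_revenue = (top_listings.groupby(["neighbourhood", "room_type"])["revenue"]
                .mean()
                .reset_index())
best_room = room_revenue.loc[
    room_revenue.groupby("neighbourhood")["revenue"].idxmax()
]

# Bar chart of the best room type per neighbourhood.
labels = best_room["neighbourhood"] + " (" + best_room["room_type"] + ")"
fig2, ax2 = plt.subplots()
ax2.bar(labels, best_room["revenue"])
ax2.set_ylabel("Average revenue")
ax2.tick_params(axis="x", rotation=90)
fig2.tight_layout()
```

In an interactive session, dropping the matplotlib.use("Agg") line and calling plt.show() displays the two charts. On the real dataset the toy rows would be replaced by something like pd.read_csv on the listings file, and the column names adjusted to match it.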