Direction :
- “All the files for this group project are located in the Resources folder.”
Timeline :
- 10/16/2024
- Decided on the dataset the group would use for the project
- Distributed the analysis tasks among the group members; each person is responsible for a specific part of the analysis
- 10/17/2024
- Each group member worked on their assigned part of the analysis
- 10/18/2024
- We tried to resolve the GitHub push issues
- 10/20/2024
- Each group member presented their analysis, and we gave feedback and helped each other improve it.
- 10/21/2024
- We worked together during the classroom meeting on Monday to finalize all the code and resolve some GitHub issues.
- 10/22/2024
- All group members have finalized their analyses. We discussed the presentation slides.
- 10/23/2024
- We have finalized the presentation slides and are ready for today’s 'Project-One' presentation.
Steps :
- Clean the raw data and create a new, clean data frame that all group members will use for further analysis.
- The dataset we analyze is titled Melbourne Housing Snapshot.
- Resources : https://www.kaggle.com/datasets/dansbecker/melbourne-housing-snapshot
- What we analyze :
- Top 5 agents by revenue
- Bottom 5 agents by revenue
- Top 5 agents by houses sold
- Bottom 5 agents by houses sold
- Using pie charts : to show the distribution of houses sold per year for each property type
- Using boxplots : to analyze each region's distance from the CBD and how it affects the price
- Using maps : to analyze sales per region, per property type, and per number of rooms
- Using bar charts : to analyze the seasonality / trends of sales per property type and per number of rooms, to find the best time to buy
- Using bar charts : to analyze overall seasonality / sale prices for each property type and for each number of rooms
- Using grouped bar charts : to compare sales trends and prices per property type and number of rooms
- Using boxplots and scatter plots : to analyze the correlation between average price and number of bedrooms
- Find the correlation between latitude/longitude and property prices to identify high-demand areas
- From the latitude/longitude and property price calculations, fit a linear regression for each region
- Using bar chart : to show the average price by month and average price by quarter
- Calculate the property count trends over time in booming regions
- Using bar charts : to compare the average property price per month and per quarter, to see in which season properties are cheaper or more expensive to buy
- Using line charts : to show property count trends over time in booming regions
- Using scatter plots and linear regression : to find the correlation with distance for houses with a specific number of bedrooms
- Using scatter plots and linear regression : to find the correlation between the number of garage car spaces and the price
- Using a heatmap : to show the property distribution by region and property type
- Make the summary analysis
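
The cleaning, agent ranking, and average-price steps above could be sketched roughly as follows. This is a minimal illustration, not the group's actual code: it assumes the column names `SellerG` (agent), `Price`, and `Date` (day/month/year strings) from the Kaggle CSV, treats "revenue" as the sum of sale prices per agent, and uses a few made-up rows with hypothetical agent names in place of the real file. The helper names `clean_housing`, `agent_ranking`, and `average_price_by_period` are likewise invented for this example.

```python
import pandas as pd


def clean_housing(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: parsed dates, no rows missing price, agent, or date."""
    df = raw.copy()
    # Assumption: dates are stored as day/month/year strings, e.g. "3/12/2016".
    df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y", errors="coerce")
    df = df.dropna(subset=["Price", "SellerG", "Date"])
    return df.reset_index(drop=True)


def agent_ranking(df: pd.DataFrame) -> pd.DataFrame:
    """Revenue (sum of sale prices) and houses sold per agent, highest revenue first."""
    return (
        df.groupby("SellerG")["Price"]
        .agg(revenue="sum", houses_sold="count")
        .sort_values("revenue", ascending=False)
    )


def average_price_by_period(df: pd.DataFrame):
    """Average sale price per calendar month and per quarter."""
    by_month = df.groupby(df["Date"].dt.month)["Price"].mean()
    by_quarter = df.groupby(df["Date"].dt.quarter)["Price"].mean()
    return by_month, by_quarter


# Made-up sample rows standing in for the real CSV (hypothetical agents).
raw = pd.DataFrame(
    {
        "SellerG": ["AgentA", "AgentA", "AgentB", "AgentC", None],
        "Price": [1_000_000.0, 500_000.0, 800_000.0, None, 700_000.0],
        "Date": ["3/12/2016", "4/02/2017", "4/03/2017", "4/03/2017", "1/07/2017"],
    }
)

clean = clean_housing(raw)
ranking = agent_ranking(clean)
top5_by_revenue = ranking.head(5)
bottom5_by_revenue = ranking.tail(5)
top5_by_houses_sold = ranking.sort_values("houses_sold", ascending=False).head(5)
by_month, by_quarter = average_price_by_period(clean)
```

With the real data, `raw` would instead come from `pd.read_csv(...)` on the file in the Resources folder, and series such as `by_month` could be passed to `.plot(kind="bar")` to produce the bar charts listed above.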
Limitations of the Data and Analysis :
- The data was collected in 2016-2017, so it is over five years old. It can still be used to analyze the housing market, but it may be less effective for understanding current trends or predicting the market's future.