This repo contains R scripts for scraping the Airbnb website. We scraped the Hungarian accommodations, and our current study based on these data is reported at https://marcellgranat.github.io/airbnb-research/
Public perception of shared goods and services has changed significantly in the last few years. Shared accommodations have gained such popularity that house- and flat-sharing platforms like Airbnb now rival some of the world's largest businesses in hospitality. Sharing personal property gives owners an opportunity to lower the transaction costs of operating short-term rentals, and online rental marketplaces connect people who want to rent out their dwellings with those who are looking for accommodation. This study aims to determine the perceived behavior of individuals choosing Airbnb and to explore the factors that influence user ratings and consumer adoption of Airbnb, assuming that customer feedback contributes significantly to consumer choice. We also analyze the market trends of Hungarian Airbnb accommodations as primary examples of the sharing or collaborative economy. Weekly data were collected on accommodation establishments all over Hungary. We aimed to build a complete dataset of the active suppliers by using automated "web scraping" techniques during a certain window of time. Our database contained customer ratings, reviews, and public information about the rooms. We performed a TF-IDF analysis and a Lasso-based feature selection on these variables. Our key finding was that four attributes account for the vast majority of online review comments: "amenities", "host", "location" and "cleanliness". Contrary to our expectations, "price" was not identified as a key determinant of customer satisfaction. A positivity bias can be detected in Airbnb users' comments (that is, an overwhelmingly large share of the comments are positive), along with a higher degree of intimacy between users and hosts than in the case of traditional hotels. Negative feedback is usually related to "location" (safety issues), "noise" and poor-quality "amenities".
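
For readers unfamiliar with the TF-IDF step mentioned above, the following is a minimal base-R sketch of the idea. It is not the code used in the study: the three toy reviews are invented for illustration, the tokenization is deliberately naive, and it assumes the plain `log(N / df)` weighting, which may differ from the variant used in the actual analysis.

```r
# Toy reviews, invented for illustration (not taken from the scraped data)
reviews <- c(
  "great host and clean flat",
  "great location but noisy street",
  "clean flat with great amenities"
)

# Split each review into lowercase word tokens (naive tokenization)
tokens <- strsplit(tolower(reviews), "[^a-z]+")

# Vocabulary across all reviews
vocab <- sort(unique(unlist(tokens)))

# Term frequency: share of each word within a review (rows = reviews)
tf <- t(sapply(tokens, function(words) {
  table(factor(words, levels = vocab)) / length(words)
}))

# Inverse document frequency: log(N / number of reviews containing the word)
doc_freq <- colSums(tf > 0)
idf <- log(length(reviews) / doc_freq)

# TF-IDF: weight each term frequency by the word's IDF
tf_idf <- sweep(tf, 2, idf, `*`)

print(round(tf_idf, 3))
```

In this toy example the word "great" occurs in every review, so its IDF is `log(3 / 3) = 0` and its TF-IDF weight is zero everywhere, while a word such as "host", which occurs in a single review, receives a positive weight. This is the property that lets TF-IDF surface terms that distinguish reviews from one another.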