Joanna Broniarek, Alice Shirinà, Daniele Sanna.
The project consists in analyzing the text of Airbnb property listings and building two different Search Engines that, given as input a query, return the houses that match the query.
- Airbnb_Texas_Rentals.csv
https://www.kaggle.com/PromptCloudHQ/airbnb-property-data-from-texas
- Homework_3.ipynb - This jupyter notebook contains the implementation of Search_Engine_1, Search_Engine_2 and the definition of scoring functions. Some of the used functions are located in the function.py file.
For correct working of Search Engines it is necessary to run Creating_Files.ipyn notebook. Search Engine 1 is using "vocabulary.txt", "inv_indx.txt" files. Search Engine 2 is using "vocabulary.txt", "inv_indx_tfidf.txt" files.
-
Creating_Files.ipynb - In this notebook we create and save the following files "vocabulary.txt", "inv_indx.txt", "inv_indx_tfidf.txt" according to the data.
-
GeoMap.ipynb - There is an implementation of the Geomap for searching documents according to their locations.
- functions.py - external file with definitions of functions used in the Homework_3 notebook.
- Python 3.6.4