Trip-scraping

Before any machine learning task, data has to be processed and have its features extracted. This project contains a Python script that extracts data about the trip offers listed on tourism websites and organizes it in a MongoDB NoSQL database, ready for use in a machine learning pipeline. The script uses Python libraries such as BeautifulSoup for the scraping.
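As a rough illustration of the scrape-then-store flow described above, here is a minimal sketch. It is not the project's actual script: the HTML markup, the class names (`offer`, `title`, `price`, `duration`), the field names, and the helpers `parse_offers` and `save_offers` are all assumptions made for the example. Real tourism sites will need their own selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: real tourism sites use different tags and class names.
SAMPLE_HTML = """
<div class="offer">
  <h2 class="title">Lisbon city break</h2>
  <span class="price">499 EUR</span>
  <span class="duration">4 days</span>
</div>
<div class="offer">
  <h2 class="title">Sahara desert tour</h2>
  <span class="price">820 EUR</span>
  <span class="duration">7 days</span>
</div>
"""


def parse_offers(html):
    """Turn offer cards into plain dicts that can be stored as documents."""
    soup = BeautifulSoup(html, "html.parser")
    offers = []
    for card in soup.find_all("div", class_="offer"):
        # Assumed text formats: "<amount> <currency>" and "<n> days".
        amount, currency = card.find(class_="price").get_text(strip=True).split()
        days = card.find(class_="duration").get_text(strip=True).split()[0]
        offers.append({
            "title": card.find(class_="title").get_text(strip=True),
            "price": float(amount),
            "currency": currency,
            "duration_days": int(days),
        })
    return offers


def save_offers(collection, offers):
    """Insert the parsed offers into a MongoDB collection (e.g. from pymongo)."""
    if offers:  # insert_many rejects an empty list
        collection.insert_many(offers)


offers = parse_offers(SAMPLE_HTML)
```

With a running MongoDB instance and `pymongo` installed, the collection could be obtained with something like `MongoClient("mongodb://localhost:27017")["trips"]["offers"]` and passed to `save_offers`; the database and collection names here are placeholders.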

Future features:

  • JavaScript scripts for scraping
  • Ready containers for scraping data regularly