- First, install the required Python modules:
  pip install selenium
  pip install bs4
  pip install pyautogui
  pip install webdrivermanager
  pip install matplotlib
  pip install pandas
- Then clone the repository:
  git clone https://github.com/GabrielZuany/PriceComparator.git
- After running the code, all you need to do is enter the product name and click 'OK'.
This project automates searching, tabulating, and manipulating information from different data sources, each with its own structure. In this case, the data sources are a number of commercial websites, each of which builds its HTML and CSS differently from the others. The project started when I realized how much time and effort it takes to collect all this information manually: it is not easy, and it is an inefficient use of time. So the idea was to automate the process.
Basically, it works by automating internet browsing using:
- Selenium and WebDriver to open the browser and click or submit.
- BeautifulSoup to get the HTML page once a stable connection is established (HTTP response 200).
- Pandas and Matplotlib to analyze the collected data and build plots from it.
When the connection is stable, the script pulls the HTML content of the page. Once I have the HTML, I can look for the elements I am interested in on the current page and store them in a list.
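
The extraction step described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the HTML snippet, the class names (`product`, `title`, `price`), and the helpers `parse_price` and `extract_products` are all hypothetical, and the price format is assumed to be Brazilian (`R$ 1.299,90`). Each real website needs its own selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: real shops each use their own tags and class names.
SAMPLE_HTML = """
<ul>
  <li class="product">
    <a class="title" href="https://example.com/p/1">Mouse A</a>
    <span class="price">R$ 1.299,90</span>
  </li>
  <li class="product">
    <a class="title" href="https://example.com/p/2">Mouse B</a>
    <span class="price">R$ 89,00</span>
  </li>
</ul>
"""


def parse_price(text):
    """Convert a price such as 'R$ 1.299,90' (assumed format) to a float."""
    cleaned = text.replace("R$", "").strip()
    # Drop the thousands separator, then turn the decimal comma into a dot.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return float(cleaned)


def extract_products(html):
    """Return one dict per product card found in the page."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for card in soup.select("li.product"):
        link = card.select_one("a.title")
        price = card.select_one("span.price")
        if link is None or price is None:
            continue  # skip cards that do not match the assumed layout
        products.append({
            "name": link.get_text(strip=True),
            "price": parse_price(price.get_text()),
            "link": link["href"],
        })
    return products


products = extract_products(SAMPLE_HTML)
```

Keeping the selectors inside one function per site makes it easier to add a new shop without touching the rest of the pipeline.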
With Selenium going through the websites' pages and bs4 extracting all the data I want, the automation is almost 100% complete.
When my product list is finished, I build plots (histograms) with Matplotlib to analyze the collected data.
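
A price histogram of this kind could be produced along these lines. The prices, the bin count, the labels, and the output file name `price_histogram.png` are made up for illustration. The `Agg` backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt

# Made-up prices standing in for the scraped product list.
prices = [89.0, 95.5, 102.0, 110.0, 129.9, 135.0, 149.9, 310.0]

fig, ax = plt.subplots()
# hist returns the count per bin and the bin edges (one more edge than bins).
counts, bin_edges, _ = ax.hist(prices, bins=5, edgecolor="black")
ax.set_xlabel("Price")
ax.set_ylabel("Number of offers")
ax.set_title("Price distribution")
fig.savefig("price_histogram.png")  # hypothetical output file name
plt.close(fig)
```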
Then I read the price column into a Pandas DataFrame and manipulate it to obtain important values such as the mean and median (other statistics are planned for the future).
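
As a rough sketch of this step, assuming the scraped products have already been turned into rows with hypothetical `name` and `price` columns, the summary values can be read straight off the price column:

```python
import pandas as pd

# Hypothetical rows; the real script fills these from the scraped pages.
df = pd.DataFrame({
    "name": ["Mouse A", "Mouse B", "Mouse C", "Mouse D"],
    "price": [89.0, 129.9, 110.0, 310.0],
})

mean_price = df["price"].mean()
median_price = df["price"].median()
# Row of the cheapest offer, located by the index label of the minimum price.
cheapest = df.loc[df["price"].idxmin()]

print(f"mean={mean_price:.2f} median={median_price:.2f} cheapest={cheapest['name']}")
```

With a single expensive outlier such as the last row, the median stays close to the typical price while the mean is pulled upward, which is one reason to report both.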
At the end, I have .xlsx files with all the product information (name, price, link...) and two statistical graphs for interpreting the data.
- Displays a window asking the user to enter the product name they want to search.
- Author: Gabriel Zuany Duarte Vargas. (ES, Brazil)
- Date: 18/11/2022 (last update)
- Please feel free to suggest improvements and new features!
- You can contact me at gzuanydev@gmail.com.