
A web scraper that helps individuals or companies find out which products are recommended more than others in a specific category.


RaminMammadzada/ruby-web-scraper


Microverse Ruby Capstone Project -> Custom Web Scraper

This is a fashion follower application that helps individuals or companies find out which products are recommended more than others in a category of interest. The project is a web scraper that uses the kimurai gem to fetch data from a Single Page Application. By default, the script returns the 5 most-reviewed products from the given category on Trendyol.com. Both the category and an optional price range are entered through the command-line interface. The category can be composed of several words, but it must be given in Turkish.

  • Milestone 1 - CLI user interface is implemented.
  • Milestone 2 - Implement logic to retrieve data from the webpage and store them in a JSON file.
  • Milestone 3 - Update codebase to follow OOP principles
  • Milestone 4 - Scraping data from Single Page Application
  • Milestone 5 - Rspec tests

Video Presentation

Rules To Run

  • The category can be composed of several words, but it must be given in the Turkish language. You can see the examples below:

If the user enters hp bilgisayar ("bilgisayar" means "computer" in Turkish) and 300-4000 as a price range, then it will search for all HP-brand computers that cost between 300 and 4000 Turkish Lira.

If the user enters Nike siyah erkek ayakkabi, it will search for black Nike shoes for men.
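The two inputs described above could be normalized before a search along the following lines. This is only a minimal sketch: parse_price_range and encode_keywords are hypothetical helper names that do not come from the project, and the way the project actually builds its search request is not shown here.

```ruby
require 'uri'

# Hypothetical helper: turns "300-4000" into [300, 4000].
# Returns nil for blank or malformed input, which the caller can treat as "no price filter".
def parse_price_range(input)
  match = input.to_s.strip.match(/\A(\d+)\s*-\s*(\d+)\z/)
  return nil unless match

  # Sort so that a reversed range such as "4000-300" still works.
  [match[1].to_i, match[2].to_i].sort
end

# Hypothetical helper: URL-encodes multi-word keywords for use in a query string.
def encode_keywords(keywords)
  URI.encode_www_form_component(keywords.strip.squeeze(' '))
end

p parse_price_range('300-4000')
p parse_price_range('')
puts encode_keywords('Nike siyah erkek ayakkabi')
```

Returning nil for an empty price range mirrors the option to skip the range by pressing Enter.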

Built With

  • Kimurai (it depends on the Nokogiri gem)
  • Ruby
  • Rspec
  • Rubocop

Getting Started

To get a local copy up and running follow these simple example steps.

Install

You can run the scraper in your own local environment. To do so, you need Ruby installed on your computer. For Windows, use the Ruby installer; for macOS and Linux, see the official Ruby site for installation instructions.

Then you can clone the project by typing git clone https://github.com/RaminMammadzada/ruby-web-scraper.git

Dependencies

You must install the dependencies by bundling the Gemfile:

  • First go to the root of the project by typing cd ruby-web-scraper
  • Go to the feature branch by typing git checkout feature
  • gem install bundler
  • bundle update
  • bundle install

You must also have chromedriver in your local development environment; otherwise you will get an error. It cannot be installed as a Ruby gem.

  • On Mac OS:

    • Method 1: with brew
      • brew install chromedriver
    • Method 2: normal installation
      • curl http://chromedriver.storage.googleapis.com/2.38/chromedriver_mac64.zip -o chromedriver_mac64.zip
      • unzip chromedriver_mac64.zip
      • mv chromedriver /usr/local/bin
      • chmod +x /usr/local/bin/chromedriver
  • On Linux, install Chrome and chromedriver (use the proper path for your system and version):

    • Install chrome first
      • wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
      • sh -c 'echo "deb http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list'
      • apt-get update
      • apt-get install google-chrome-stable
    • Install chromedriver (use the proper path for your system and version; this version may not work for you)
      • cd /tmp
      • wget https://chromedriver.storage.googleapis.com/2.44/chromedriver_linux64.zip
      • sudo unzip chromedriver_linux64.zip -d /usr/local/bin
      • rm -f chromedriver_linux64.zip
      • Now you can leave the /tmp directory.
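After installing, it can be useful to confirm that chromedriver is reachable before running the scraper. The following is a minimal sketch using only the Ruby standard library; find_executable is a hypothetical helper name and is not part of the project.

```ruby
# Hypothetical helper: returns the full path of an executable found on PATH,
# or nil when no matching executable file exists in any PATH directory.
def find_executable(name, path_env = ENV.fetch('PATH', ''))
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

if (driver_path = find_executable('chromedriver'))
  puts "chromedriver found at #{driver_path}"
else
  puts 'chromedriver not found on PATH; install it before running the scraper'
end
```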

Run app

  1. In a terminal window type:
    • Go to the project root by typing cd ruby-web-scraper
    • Go to the feature branch by typing git checkout feature
    • Then type ./bin/main.rb in the root directory of the project. You can also type ruby bin/main.rb there.
  2. Follow the instructions given in the command-line interface.
    • Enter the category keywords.
    • Enter the price range (you can skip this by pressing Enter).
    • Press Enter and wait a moment for the results. The time it takes depends on the total number of products found.
      • The information for all fetched products is saved to the product_search_result.json file.
      • If no product is found, either no category matches your keywords or the price range is too narrow. Run the app again and try other keywords or another price range.
    • Here you go! You get the 5 most-reviewed products with their information.
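The last two steps, saving the fetched products and picking the 5 most-reviewed ones, could be sketched as follows. The product records, their field names (name, price, review_count), and the helper names are assumptions for illustration; the project's real data structure may differ. The sketch writes to the system temporary directory rather than the project folder.

```ruby
require 'json'
require 'tmpdir'

# Hypothetical product records; the field names are assumptions for illustration.
PRODUCTS = [
  { name: 'Product A', price: 350.0, review_count: 12 },
  { name: 'Product B', price: 1200.0, review_count: 87 },
  { name: 'Product C', price: 899.9, review_count: 45 },
  { name: 'Product D', price: 3999.0, review_count: 3 },
  { name: 'Product E', price: 560.0, review_count: 230 },
  { name: 'Product F', price: 2100.0, review_count: 64 }
].freeze

# Returns the `limit` products with the highest review counts, most reviewed first.
def top_reviewed(products, limit: 5)
  products.max_by(limit) { |product| product[:review_count] }
end

# Writes the product records to a JSON file and returns the path.
def save_products(products, path)
  File.write(path, JSON.pretty_generate(products))
  path
end

path = save_products(PRODUCTS, File.join(Dir.tmpdir, 'product_search_result.json'))
puts "Saved #{PRODUCTS.size} products to #{path}"
top_reviewed(PRODUCTS).each do |product|
  puts "#{product[:name]}: #{product[:review_count]} reviews"
end
```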

Testing the script

This script was tested with RSpec, a Ruby testing tool. All public methods are tested.

Install

  • In a terminal window type gem install rspec
  • Once rspec install has finished, go to the project directory and type rspec --init
  • You will see a folder spec and a file .rspec
  • Inside spec folder you'll see a spec_helper.rb file.

Run the test

  • Open a terminal window and type rspec

  • All tests should pass.

Authors

👤 Ramin Mammadzada

🤝 Contributing

Contributions, issues, and feature requests are welcome!

Feel free to check the issues page.

Show your support

Give a ⭐️ if you like this project!

Acknowledgments

  • Microverse
  • Odin project

📝 License

This project is MIT licensed.
