Guide-Analytics/simple_amazon_author_scraper

Simple Amazon Author Scraper

Install the following packages:

  • selenium: Automates the browser for scraping. Install it with pip from the command line:
    pip install selenium

  • webdriver-manager: Downloads and sets up the Chrome driver automatically, so you do not need to download it yourself. Install it with:
    pip install webdriver-manager

  • pandas: Handles file manipulation (saving data to CSV). Install it with:
    pip install pandas

  • word2number: Converts number words to numbers. Install it with:
    pip install word2number

Currently, this script works with the Chrome browser.
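As an illustration of how selenium and webdriver-manager fit together, here is a minimal sketch of creating a Chrome driver. The function name build_chrome_driver and the headless option are assumptions made for this example; they are not taken from DriverSetup.py, which may set up the driver differently.

```python
def build_chrome_driver(headless: bool = True):
    """Create a Chrome WebDriver (hypothetical helper, not the repo's own code)."""
    # Imported inside the function so this module can be loaded
    # even before the packages above are installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = Options()
    if headless:
        # Assumption: run without a visible browser window.
        options.add_argument("--headless=new")

    # webdriver-manager downloads a matching chromedriver and returns its path.
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service, options=options)
```

Calling build_chrome_driver() requires Chrome to be installed locally; remember to call driver.quit() when scraping is finished.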

File structure:
--AuthorProfileConfigConfig.py: Contains user-defined functions to retrieve data.
--DriverSetup.py: Defines and initializes the Selenium webdriver object.
--main.py: Run this file to scrape data for author profiles.
--ProductMain.py: Run this file to scrape data for all the subproducts related to each author.
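To give a sense of what the data-retrieval helpers might look like, the sketch below shows one way word2number and pandas could be used together. Both function names (parse_count, save_rows) and the example strings are hypothetical; the actual functions in the repository are not shown in this README and may differ.

```python
import re


def parse_count(text: str) -> int:
    """Turn scraped text such as '1,234 ratings' or 'twenty one' into an int."""
    digits = re.search(r"\d[\d,]*", text)
    if digits:
        # Numeric text: drop thousands separators and convert directly.
        return int(digits.group().replace(",", ""))
    # Spelled-out numbers: fall back to word2number (imported only when needed).
    from word2number import w2n
    return w2n.word_to_num(text)


def save_rows(rows, path):
    """Write a list of dicts to a CSV file without the index column."""
    import pandas as pd
    pd.DataFrame(rows).to_csv(path, index=False)
```

For example, save_rows([{"author": "A. Writer", "reviews": parse_count("1,234 ratings")}], "authors.csv") would write a single-row CSV file; the column names here are placeholders.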

To run:

  • Run main.py. The script takes its input from the main_product folder, which contains all the main product data, and stores the scraped data in the reviewers folder.
  • Run ProductMain.py. The script takes its input from the reviewers folder, which contains all the author profile data, and stores the scraped data in the reviews folder. For example: \data_scraping_v2\
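The two steps above form a small pipeline in which the output folder of the first script is the input folder of the second. The sketch below only illustrates that folder flow. It assumes the data files are CSV files and that all three folders sit under one base directory; neither detail is confirmed by this README, and plan_outputs is a hypothetical helper.

```python
from pathlib import Path

# (script, input folder, output folder), as described in the steps above.
STAGES = [
    ("main.py", "main_product", "reviewers"),
    ("ProductMain.py", "reviewers", "reviews"),
]


def plan_outputs(base_dir, input_name, output_name, pattern="*.csv"):
    """Pair each input file with the path its scraped output would be saved to."""
    base = Path(base_dir)
    out_dir = base / output_name
    # Create the output folder if it does not exist yet.
    out_dir.mkdir(parents=True, exist_ok=True)
    inputs = sorted((base / input_name).glob(pattern))
    return [(src, out_dir / src.name) for src in inputs]
```

Running plan_outputs(".", "main_product", "reviewers") lists which files the first stage would read and where its results would go, without launching a browser.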
