Based on the original Perl project.

Should work on both Python 2.7+ and Python 3. Python 3 is preferred; Python 2 support will likely be dropped.


This version adds minimal handling of Amazon's robot detection.

Crawling command example:

> python -d it B00WPW3CQ0 -o reviews

Use -h to see all the options:

> python -h
usage: [-h] [-d DOMAIN] [-f] [-r MAXRETRIES] [-t TIMEOUT]
                         [-p PAUSE] [-m MAXREVIEWS] [-o OUT] [-c]
                         ID [ID ...]

positional arguments:
  ID                    Product IDs for which to download reviews

optional arguments:
  -h, --help            show this help message and exit
  -d DOMAIN, --domain DOMAIN
                        Domain from which to download the reviews. Default:
  -f, --force           Force download even if already successfully downloaded
  -r MAXRETRIES, --maxretries MAXRETRIES
                        Max retries to download a file. Default: 3
  -t TIMEOUT, --timeout TIMEOUT
                        Timeout in seconds for http connections. Default: 180
  -p PAUSE, --pause PAUSE
                        Seconds to wait between http requests. Default: 1
  -m MAXREVIEWS, --maxreviews MAXREVIEWS
                        Maximum number of reviews per item to download.
  -o OUT, --out OUT     Output base path. Default: amazonreviews
  -c, --captcha         Retry on captcha pages until captcha is not asked.
                        Default: skip
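
How the -r, -p and -c options interact can be illustrated with a minimal sketch. This is not the tool's actual code: `fetch_with_retries`, `looks_like_captcha`, the `fetch` callable and the substring-based captcha check are all hypothetical names and assumptions made for illustration. The sketch also assumes that "max retries" means the number of failed attempts tolerated before giving up, and that captcha pages do not count against that limit.

```python
import time


def looks_like_captcha(text):
    # Assumed heuristic for illustration only; the real check may differ.
    return "captcha" in text.lower()


def fetch_with_retries(fetch, url, max_retries=3, pause=1.0,
                       retry_on_captcha=False, sleep=time.sleep):
    """Return the page text, or None if the download is given up.

    fetch(url) is a hypothetical callable that returns the page text
    or raises IOError on a network problem.
    """
    failures = 0
    while failures < max_retries:
        try:
            text = fetch(url)
        except IOError:
            failures += 1      # counts against -r / --maxretries
            sleep(pause)       # -p / --pause between requests
            continue
        if not looks_like_captcha(text):
            return text
        if not retry_on_captcha:
            return None        # default behaviour: skip captcha pages
        sleep(pause)           # -c: ask again until no captcha is served
    return None
```

With `retry_on_captcha=True` the loop keeps requesting the same page for as long as a captcha is served, so a longer pause is probably sensible in that mode.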


Once downloaded, the reviews can be extracted to a CSV file with the command:

> python -d reviews -o reviews.csv

Use -h to see all the options:

> python -h
usage: [-h] -d DIR -o OUTFILE

Amazon review parser

optional arguments:
  -h, --help            show this help message and exit
  -d DIR, --dir DIR     Directory with the data for parsing
  -o OUTFILE, --outfile OUTFILE
                        Output file path for saving the reviews in csv format
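
The extraction step can be pictured roughly as follows. This is a sketch under stated assumptions, not the parser's real implementation: `write_reviews_csv` is a hypothetical name, `extract_reviews` stands in for whatever HTML parsing the tool performs, and the column names are supplied by the caller because the actual CSV layout is not described here. It is written for Python 3.

```python
import csv
import os


def write_reviews_csv(in_dir, out_path, extract_reviews, fields):
    """Walk in_dir and write one CSV row per extracted review.

    extract_reviews(html) is a hypothetical callable that yields one
    dict per review; fields lists the CSV columns to keep.
    Returns the number of rows written.
    """
    count = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        for root, dirs, files in os.walk(in_dir):
            dirs.sort()                      # deterministic traversal order
            for name in sorted(files):
                path = os.path.join(root, name)
                with open(path, encoding="utf-8", errors="replace") as page:
                    html = page.read()
                for review in extract_reviews(html):
                    writer.writerow(review)  # missing keys become empty cells
                    count += 1
    return count
```

The output file should live outside the input directory, as in the command above, so that the walk does not pick up the CSV being written.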


I provide the tool to download the reviews, not the right to download them. You must respect Amazon's rights to its own data. Do not release the data you download without Amazon's consent.