Install:
pip3 install app-store-scraper
Scrape reviews for an app:
from app_store_scraper import AppStore
from pprint import pprint
facebook = AppStore(country="us", app_name="facebook")
facebook.review(how_many=20)
pprint(facebook.reviews)
pprint(facebook.reviews_count)
Scrape reviews for a podcast:
from app_store_scraper import Podcast
from pprint import pprint
sysk = Podcast(country="us", app_name="stuff you should know")
sysk.review(how_many=20)
pprint(sysk.reviews)
pprint(sysk.reviews_count)
There are two required parameters and one optional parameter:
country
- (required) two-letter country code following the ISO 3166-1 alpha-2 standard
app_name
- (required) name of the iOS application to fetch reviews for
- also used by the search_id() method to look up app_id internally
app_id
- (optional) can be passed directly
- if omitted, it is obtained internally by the search_id() method
Once instantiated, the object can be examined:
>>> facebook
AppStore(country='us', app_name='facebook', app_id=284882215)
>>> print(facebook)
Country | us
Name | facebook
ID | 284882215
URL | https://apps.apple.com/us/app/facebook/id284882215
Review count | 0
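The URL line in the printed summary appears to be composed from the three instantiation parameters. A minimal sketch of that composition follows; landing_page_url is a hypothetical helper written for illustration, not part of the library, and it assumes a simple one-word app name.

```python
def landing_page_url(country, app_name, app_id):
    """Compose a landing-page URL shaped like the one in the summary above.

    Hypothetical helper for illustration only; the library may normalise
    app names (spaces, punctuation) differently.
    """
    return f"https://apps.apple.com/{country}/app/{app_name}/id{app_id}"


url = landing_page_url("us", "facebook", 284882215)
print(url)  # https://apps.apple.com/us/app/facebook/id284882215
```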
Other optional parameters are:
log_format
- passed directly to logging.basicConfig(format=log_format)
- default is "%(asctime)s [%(levelname)s] %(name)s - %(message)s"
log_level
- passed directly to logging.basicConfig(level=log_level)
- default is "INFO"
log_interval
- a log is produced every 5 seconds (by default) as a "heartbeat" (useful for a long scraping session)
- default is 5
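To see what the default log_format and log_level produce, the same values can be handed to logging.basicConfig using only the standard library. This is a sketch of the logging setup, not of the scraper itself: the logger name "demo", the messages, and the in-memory stream are assumptions made for the example, and force=True (Python 3.8+) is used only so the snippet works even if logging was already configured.

```python
import io
import logging

# Default values quoted in the parameter list above.
LOG_FORMAT = "%(asctime)s [%(levelname)s] %(name)s - %(message)s"
LOG_LEVEL = "INFO"

# Capture output in memory so it can be inspected (assumption for the demo;
# by default basicConfig writes to stderr).
stream = io.StringIO()
logging.basicConfig(format=LOG_FORMAT, level=LOG_LEVEL, stream=stream, force=True)

logger = logging.getLogger("demo")  # hypothetical logger name
logger.info("scraping started")     # emitted: INFO meets the threshold
logger.debug("request details")     # suppressed: DEBUG is below INFO

output = stream.getvalue()
print(output, end="")
```

Each emitted line starts with a timestamp, followed by the level in square brackets, the logger name, and the message.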
The maximum number of reviews fetched per request is 20. To minimise the number of calls, this limit of 20 is hardcoded. As a result, the review() method fetches reviews in increments of 20 and may return more than the how_many argument requests.
>>> facebook.review(how_many=33)
>>> facebook.reviews_count
40
If how_many is not provided, review() will terminate after all reviews are fetched.
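The batching behaviour can be pictured with a small simulation. This is only a sketch of a fetch loop that works in batches of 20; simulate_review and total_available are hypothetical names, no network calls are made, and the stopping rule (stop once at least how_many reviews are collected or none remain) is an assumption that matches the 33 to 40 example above.

```python
import sys

BATCH_SIZE = 20  # per-request limit described above


def simulate_review(total_available, how_many=sys.maxsize):
    """Return how many reviews a batch-of-20 loop would collect.

    Hypothetical stand-in for a paginated fetch: each pass adds up to
    BATCH_SIZE reviews, and the loop stops once how_many is reached
    or the reviews run out.
    """
    fetched = 0
    offset = 0
    while offset < total_available and fetched < how_many:
        batch = min(BATCH_SIZE, total_available - offset)
        fetched += batch
        offset += batch
    return fetched


print(simulate_review(total_available=1000, how_many=33))  # 40
print(simulate_review(total_available=1000, how_many=40))  # 40
print(simulate_review(total_available=50))                 # 50
```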
NOTE: the review count seen on the landing page differs from the actual number of reviews fetched, because only some of the users who rate the app also leave a review.
The review() method also accepts the following optional arguments:
after
- a datetime object to filter older reviews
sleep
- an int to specify seconds to sleep between each call
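The effect of an after cutoff can be illustrated on plain dictionaries. The sketch below is not the library's code: keep_recent is a hypothetical helper, the sample data is invented, and treating the cutoff as inclusive is an assumption. With the scraper itself, a call would presumably look like facebook.review(how_many=20, after=datetime(2020, 1, 1), sleep=1).

```python
from datetime import datetime


def keep_recent(reviews, after):
    """Keep reviews dated on or after `after` (inclusive cutoff is assumed)."""
    return [r for r in reviews if r["date"] >= after]


# Invented sample data using keys that appear in fetched reviews.
sample = [
    {"userName": "a", "rating": 5, "date": datetime(2019, 12, 31)},
    {"userName": "b", "rating": 3, "date": datetime(2020, 6, 1)},
]

recent = keep_recent(sample, after=datetime(2020, 1, 1))
print([r["userName"] for r in recent])  # ['b']
```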
The fetched review data are held in memory in the reviews attribute as a list of dicts.
>>> facebook.reviews
[{'userName': 'someone', 'rating': 5, 'date': datetime.datetime(...
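Because the reviews are an ordinary list of dicts, they can be summarised with standard-library tools. The sketch below tallies ratings and computes an average; sample_reviews is invented data standing in for facebook.reviews, using only keys visible in the output above.

```python
from collections import Counter
from datetime import datetime

# Invented stand-in for facebook.reviews.
sample_reviews = [
    {"userName": "someone", "rating": 5, "date": datetime(2020, 1, 5)},
    {"userName": "another", "rating": 4, "date": datetime(2020, 2, 9)},
    {"userName": "third", "rating": 5, "date": datetime(2020, 3, 1)},
    {"userName": "fourth", "rating": 1, "date": datetime(2020, 3, 7)},
]

# Number of reviews per star rating.
rating_counts = Counter(r["rating"] for r in sample_reviews)

# Mean rating across the fetched reviews.
average = sum(r["rating"] for r in sample_reviews) / len(sample_reviews)

print(rating_counts[5])  # 2
print(average)           # 3.75
```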
Each review dictionary has the following schema:
{
"date": datetime.datetime,
"isEdited": bool,
"rating": int,
"review": str,
"title": str,
"userName": str
}
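Since "date" is a datetime object, json.dumps cannot serialise a review dictionary without help. One possible approach, shown here as a sketch, is to convert datetimes to ISO 8601 strings through the default hook; to_json and encode_datetime are hypothetical helpers and sample_reviews is invented data that follows the schema above.

```python
import json
from datetime import datetime

# Invented review that follows the schema above.
sample_reviews = [
    {
        "date": datetime(2020, 5, 17, 12, 30, 0),
        "isEdited": False,
        "rating": 5,
        "review": "Works well.",
        "title": "Great",
        "userName": "someone",
    }
]


def encode_datetime(value):
    """Convert datetime values to ISO 8601 strings; reject anything else."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Cannot serialise {type(value).__name__}")


def to_json(reviews):
    """Serialise a list of review dicts to a JSON string."""
    return json.dumps(reviews, default=encode_datetime, indent=2)


json_text = to_json(sample_reviews)
print(json.loads(json_text)[0]["date"])  # 2020-05-17T12:30:00
```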