Describe the bug
In rare situations, an app is listed in the search results even though it has been (temporarily) removed from the Play Store. With the detailed=True argument, the package then raises an error when it scrapes the missing app, because it tries to access the app's detail page.
To Reproduce
Steps to reproduce the behavior, e.g. the full example code, not just a snippet of where the error occurs!
>>> import play_scraper
>>> print(play_scraper.search('CAUTI', gl='nl', detailed=True, page=6))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/api.py", line 79, in search
    return s.search(query, page, detailed)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/scraper.py", line 224, in search
    apps = self._parse_multiple_apps(response)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/scraper.py", line 71, in _parse_multiple_apps
    return multi_futures_app_request(app_ids, params=self.params)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/utils.py", line 531, in multi_futures_app_request
    result = response.result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "(blahblah)/.env/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "(blahblah)/.env/lib/python3.6/site-packages/requests/sessions.py", line 653, in send
    r = dispatch_hook('response', hooks, r, **kwargs)
  File "(blahblah)/.env/lib/python3.6/site-packages/requests/hooks.py", line 31, in dispatch_hook
    _hook_data = hook(hook_data, **kwargs)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/utils.py", line 504, in parse_app_details_response_hook
    details = parse_app_details(soup)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/utils.py", line 239, in parse_app_details
    title = soup.select_one('h1[itemprop="name"] span').text
AttributeError: 'NoneType' object has no attribute 'text'
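The last frame suggests that select_one found no title element (it returns None when nothing matches), which is what one would expect when the response is a 404 page rather than an app page. A minimal sketch of a guard for that case follows. The names extract_title and MissingAppError are hypothetical and not part of play_scraper; the function only assumes an object with a select_one method, such as a BeautifulSoup document.

```python
class MissingAppError(Exception):
    """Hypothetical error type: the page does not look like an app page."""


def extract_title(soup):
    """Return the app title, or raise MissingAppError if it is absent.

    `soup` is assumed to expose select_one(selector), which returns
    None when no element matches (as BeautifulSoup does).
    """
    node = soup.select_one('h1[itemprop="name"] span')
    if node is None:
        # Raise a descriptive error instead of failing on `.text` of None.
        raise MissingAppError("title element not found; the app page may be missing")
    return node.text
```

A dedicated exception like this would let a caller tell "app is gone" apart from other parsing failures and skip only that app.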
In the original use case (a function that iterated over the pages using Celery), the following error was logged as well:
[2019-10-28 10:48:41,362: ERROR/ForkPoolWorker-1] Error occurred fetching uk.incrediblesoftware.mpcmachine.demo: 404 Client Error: Not Found for url: https://play.google.com/store/apps/details?id=uk.incrediblesoftware.mpcmachine.demo&hl=en&gl=nl&q=CAUTI&c=apps
Based on this, I checked the actual Play Store page for uk.incrediblesoftware.mpcmachine.demo, which, as expected, returns an HTTP 404 error.
Expected behavior
I hoped the package would log the 404 error, skip the missing app, and still return the remaining results. I can catch the error in my own code to prevent a crash, but then the entire page of apps is still excluded from the results.
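The behavior described above could look roughly like the sketch below: each app is fetched in its own future, and a failure for one app is logged and skipped rather than aborting the whole page. This is not the package's actual implementation. fetch_all_skipping_failures and its fetch_one parameter are hypothetical names, and fetch_one stands in for whatever callable retrieves the details of a single app.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logger = logging.getLogger(__name__)


def fetch_all_skipping_failures(app_ids, fetch_one, max_workers=8):
    """Fetch details for every app ID, skipping the ones that fail.

    `fetch_one` is a caller-supplied callable (hypothetical here) that
    takes an app ID and returns its details, or raises on failure.
    Results are returned in completion order, not input order.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its app ID so failures can be reported.
        futures = {pool.submit(fetch_one, app_id): app_id for app_id in app_ids}
        for future in as_completed(futures):
            app_id = futures[future]
            try:
                results.append(future.result())
            except Exception as exc:
                # Log and continue instead of losing the whole page.
                logger.warning("Skipping %s: %s", app_id, exc)
    return results
```

As a workaround on the caller's side, something like this might be combined with a non-detailed search, passing a fetch_one that wraps a per-app details lookup. That combination is an assumption and has not been tested against play_scraper 0.6.0.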
Desktop (please complete the following information):
OS: Windows 10 - Running WSL Ubuntu 18.04
Python Version 3.6.8
play_scraper Version 0.6.0