Missing apps are not skipped #63

Open

Holtder opened this issue Oct 28, 2019 · 1 comment

Holtder commented Oct 28, 2019

Describe the bug
In rare situations, an app is listed as a result by the search function even though it has (temporarily) been removed from the Play Store. When the detailed=True argument is used, the package throws an error as soon as the missing app is scraped, because it tries to access the app's actual store page.

To Reproduce
Steps to reproduce the behavior:

 >>> import play_scraper
 >>> print(play_scraper.search('CAUTI', gl='nl', detailed=True, page=6))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/api.py", line 79, in search
    return s.search(query, page, detailed)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/scraper.py", line 224, in search
    apps = self._parse_multiple_apps(response)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/scraper.py", line 71, in _parse_multiple_apps
    return multi_futures_app_request(app_ids, params=self.params)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/utils.py", line 531, in multi_futures_app_request
    result = response.result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "(blahblah)/.env/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "(blahblah)/.env/lib/python3.6/site-packages/requests/sessions.py", line 653, in send
    r = dispatch_hook('response', hooks, r, **kwargs)
  File "(blahblah)/.env/lib/python3.6/site-packages/requests/hooks.py", line 31, in dispatch_hook
    _hook_data = hook(hook_data, **kwargs)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/utils.py", line 504, in parse_app_details_response_hook
    details = parse_app_details(soup)
  File "(blahblah)/.env/lib/python3.6/site-packages/play_scraper/utils.py", line 239, in parse_app_details
    title = soup.select_one('h1[itemprop="name"] span').text
AttributeError: 'NoneType' object has no attribute 'text'
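
For context, BeautifulSoup's select_one returns None when a selector matches nothing, so the .text access on the missing title element is what raises here. A minimal sketch of a guard at that spot follows; the early return is my assumption of one possible fix, not the library's actual code:

    # Sketch of a guard inside play_scraper.utils.parse_app_details (assumed fix, not current library code)
    title_tag = soup.select_one('h1[itemprop="name"] span')
    if title_tag is None:
        # The details page is missing or has an unexpected layout; signal this
        # instead of crashing so the caller can skip the app.
        return None
    title = title_tag.text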

In the original use case (a function that iterated over the pages using Celery), the following error was thrown as well:

[2019-10-28 10:48:41,362: ERROR/ForkPoolWorker-1] Error occurred fetching uk.incrediblesoftware.mpcmachine.demo: 404 Client Error: Not Found for url: https://play.google.com/store/apps/details?id=uk.incrediblesoftware.mpcmachine.demo&hl=en&gl=nl&q=CAUTI&c=apps

Based on this, I tried to open the actual Play Store page for uk.incrediblesoftware.mpcmachine.demo, which, as expected, returns an HTTP 404 error.

Expected behavior
I would expect the package to log the 404 error, skip the missing app, and still return the remaining results. I can catch the error in my own code to prevent a crash, but that way an entire page of apps is still excluded from the results.
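
As a workaround on the caller's side, something along these lines should give that behaviour: run the search without detailed=True and fetch the details per app id, skipping the ones that fail. The exception types caught and the 'app_id' key are assumptions based on the traceback and the basic search output; treat this as a sketch, not part of the library.

    import play_scraper
    from requests.exceptions import HTTPError

    def search_detailed_skip_missing(query, **search_kwargs):
        """Search without detailed=True, fetch details per app, and skip apps that fail."""
        results = []
        for app in play_scraper.search(query, **search_kwargs):  # basic results only
            try:
                results.append(play_scraper.details(app['app_id']))
            except (HTTPError, AttributeError) as exc:
                # App page is gone (404) or could not be parsed; report it and move on.
                print("Skipping {}: {}".format(app['app_id'], exc))
        return results

    # Hypothetical usage mirroring the report above:
    # apps = search_detailed_skip_missing('CAUTI', gl='nl', page=6)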

Desktop (please complete the following information):

  • OS: Windows 10, running WSL Ubuntu 18.04
  • Python version: 3.6.8
  • play_scraper version: 0.6.0
Holtder added the bug label Oct 28, 2019

Holtder commented Nov 5, 2019

It should be noted that the app I mentioned is back online again, so apparently some apps just go missing from time to time.
