
pagingControls Error #14

Open
wangrunzu opened this issue May 31, 2019 · 10 comments
@wangrunzu

wangrunzu commented May 31, 2019

I got the following error about the paging control when I tried to scrape the data.

python.exe main.py --headless --url "https://www.glassdoor.com/Reviews/Walmart-Reviews-E715.htm" --limit 100 -f test.csv

2019-05-31 15:06:49,643 INFO 377 :main.py(17796) - Configuring browser

DevTools listening on ws://127.0.0.1:50831/devtools/browser/8c7890e8-fe24-41f7-b77f-d22dae3f6c3e
2019-05-31 15:06:51,700 INFO 419 :main.py(17796) - Scraping up to 100 reviews.
2019-05-31 15:06:51,717 INFO 358 :main.py(17796) - Signing in to ******@ou.edu
2019-05-31 15:06:55,478 INFO 339 :main.py(17796) - Navigating to company reviews
2019-05-31 15:07:08,137 INFO 286 :main.py(17796) - Extracting reviews from page 1
2019-05-31 15:07:08,200 INFO 291 :main.py(17796) - Found 10 reviews on page 1
2019-05-31 15:07:08,677 INFO 297 :main.py(17796) - Scraped data for "The Best in Retail"(Thu May 30 2019 20:24:44 GMT-0500 (Central Daylight Time))
2019-05-31 15:07:09,171 INFO 297 :main.py(17796) - Scraped data for "Walmart needs to bring worker dignity back into focus"(Wed May 29 2019 18:04:43 GMT-0500 (Central Daylight Time))
2019-05-31 15:07:09,673 INFO 297 :main.py(17796) - Scraped data for "Great for college students"(Thu May 30 2019 12:25:57 GMT-0500 (Central Daylight Time))
2019-05-31 15:07:10,042 INFO 297 :main.py(17796) - Scraped data for "Retail"(Thu May 30 2019 17:09:02 GMT-0500 (Central Daylight Time))
2019-05-31 15:07:10,497 INFO 297 :main.py(17796) - Scraped data for "walmart"(Mon May 27 2019 17:17:41 GMT-0500 (Central Daylight Time))
2019-05-31 15:07:10,966 INFO 297 :main.py(17796) - Scraped data for "Maintenance is well taken care of"(Tue May 28 2019 08:32:17 GMT-0500
(Central Daylight Time))
2019-05-31 15:07:11,437 INFO 297 :main.py(17796) - Scraped data for "It was the best job that I had to be honest"(Wed May 29 2019 20:29:39 GMT-0500 (Central Daylight Time))
2019-05-31 15:07:11,896 INFO 297 :main.py(17796) - Scraped data for "Great"(Wed May 29 2019 20:36:02 GMT-0500 (Central Daylight Time))
2019-05-31 15:07:12,281 INFO 297 :main.py(17796) - Scraped data for "floater pharmacist"(Wed May 29 2019 21:10:58 GMT-0500 (Central Daylight Time))
2019-05-31 15:07:12,708 INFO 297 :main.py(17796) - Scraped data for "cashier"(Wed May 29 2019 23:11:49 GMT-0500 (Central Daylight Time))
Traceback (most recent call last):
  File "main.py", line 461, in
    main()
  File "main.py", line 446, in main
    while more_pages() and
  File "main.py", line 314, in more_pages
    paging_control = browser.find_element_by_class_name('pagingControls')
  File "C:\Users\wang0040\AppData\Local\Continuum\miniconda3\envs\Default\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 564, in find_element_by_class_name
    return self.find_element(by=By.CLASS_NAME, value=name)
  File "C:\Users\wang0040\AppData\Local\Continuum\miniconda3\envs\Default\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "C:\Users\wang0040\AppData\Local\Continuum\miniconda3\envs\Default\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\wang0040\AppData\Local\Continuum\miniconda3\envs\Default\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"class name","selector":"pagingControls"}
  (Session info: headless chrome=74.0.3729.169)
  (Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Windows NT 6.1.7601 SP1 x86_64)

I also got the NoSuchElementException error from #8, but worked around it by commenting out the scrape_years part. I do not think that change caused the issue above, but I am not sure.

@jhatamyar

I am suddenly getting the exact same exception using chromedriver 73.0.3683.6 on Mac OS X 10.13.6. The code was working perfectly a few weeks ago. I am looking into get_current_page(), as I'm curious whether find_element by class name or XPath might be the problem, but I am a total beginner with Selenium. Hoping the author can help.

@MatthewChatham
Owner

Thanks folks, I may have time to look at this in the coming week. But if you're able to figure it out and make a PR to fix, I'll merge it!

@guoruijiao

I'm seeing the exact same error as above. It would be great if this could be resolved.

@heraldnithesh

Hi, is this resolved?

@batordavid

Replacing some lines of code helped me.

Original (3 places in the code):
paging_control = browser.find_element_by_class_name('pagingControls')
Updated:
paging_control = browser.find_element_by_css_selector('.eiReviews__EIReviewsPageContainerStyles__pagination.noTabover.mt')

Original (2 places in the code):
next_ = paging_control.find_element_by_class_name('next')
Updated:
next_ = paging_control.find_element_by_class_name('pagination__PaginationStyle__next')

@tsp2123

tsp2123 commented Jul 28, 2019

Hey, does anyone else have an issue where they fix the paging_control selectors but the code breaks later on? I'm trying to scrape around 30k reviews, and the code keeps breaking for me at around page 176. I used the following for paging_control:

```python
def more_pages():
    paging_control = browser.find_element_by_css_selector('.eiReviews__EIReviewsPageContainerStyles__pagination.noTabover.mt')
    next_ = paging_control.find_element_by_class_name('pagination__PaginationStyle__next')
    try:
        next_.find_element_by_tag_name('a')
        return True
    except selenium.common.exceptions.NoSuchElementException:
        return False

def go_to_next_page():
    logger.info(f'Going to page {page[0] + 1}')
    paging_control = browser.find_element_by_class_name('pagination__PaginationStyle__pagination')
    next_ = paging_control.find_element_by_class_name(
        'pagination__PaginationStyle__next').find_element_by_tag_name('a')
    browser.get(next_.get_attribute('href'))
    time.sleep(1)
    page[0] = page[0] + 1
```

I'm messing around with both to see what works, but my code keeps breaking before it gets even a quarter of the way through the scraping. Does anyone have a workaround?

@carlotorniai

Hi all, I've tried both suggestions and the code still breaks.
Any clue?
Traceback below:
Traceback (most recent call last):
  File "main.py", line 483, in
    main()
  File "main.py", line 468, in main
    while more_pages() and
  File "main.py", line 315, in more_pages
    paging_control = browser.find_element_by_css_selector('.eiReviews__EIReviewsPageContainerStyles__pagination.noTabover.mt')
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 598, in find_element_by_css_selector
    return self.find_element(by=By.CSS_SELECTOR, value=css_selector)
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 978, in find_element
    'value': value})['value']
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".eiReviews__EIReviewsPageContainerStyles__pagination.noTabover.mt"}
  (Session info: headless chrome=79.0.3945.130)

@carlotorniai

Getting the latest code from MuhammadMehran's pull request fixed the issue.

@EdiLacic123

@carlotorniai Could you post the code by any chance? I have been trying to fix the same issue as well. Thanks

@carlotorniai

@EdiLacic123 just grab main.py, test.py, and schema.py from this pull request: https://github.com/MatthewChatham/glassdoor-review-scraper/pull/37/files
