dynamic_crawler 0.1.0

A crawler that uses ad hoc Selenium code for dynamically loaded websites and elements. The example code crawls category drop-down lists and URLs, and can be adapted for other uses.

This project demonstrates how to use the Selenium Python APIs to crawl websites with dynamically loaded content, with an emphasis on waiting until elements are accessible again after a reload. The crawler works around clickable elements that are not visible (where element.click fails) by clicking them through JavaScript.
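
The wait-then-click idea can be sketched as follows. This is a minimal illustration, not the code from all_categories_crawler.py: the helper names js_click and wait_and_click are hypothetical, and the sketch assumes Selenium 4 is installed.

```python
def js_click(driver, element):
    """Click through JavaScript, which still works when element.click()
    fails because the element is hidden or covered by another element."""
    driver.execute_script("arguments[0].click();", element)


def wait_and_click(driver, locator, timeout=10):
    """Wait until the element is present in the DOM, then JS-click it."""
    # Imported lazily so js_click can be used and tested without Selenium.
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # presence_of_element_located does not require visibility, which is
    # why the click itself goes through JavaScript.
    element = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located(locator)
    )
    js_click(driver, element)
    return element
```

A call might look like wait_and_click(driver, (By.ID, "go")), where the "go" id is an assumption about the target page.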

It contains example code for a page with a primary and a secondary dropdown list and a redirect button; the list selections take effect only when the button is clicked.

The example code all_categories_crawler.py contains the following steps:

Load the page from a base URL
Scrape the contents of the two dropdown list boxes
Iterate over the selections of the two list boxes and press a “Go” button to refresh the domain list in a table
Write the scraped data to a file
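
The four steps above could be sketched roughly as below. This is not the actual all_categories_crawler.py: the function names, the element ids ("primary", "secondary", "go"), the table selector, and the CSV layout are all assumptions made for illustration. The sketch also assumes that the secondary options do not depend on the primary selection, that “Go” triggers a full page reload, and that every selection yields at least one table cell.

```python
import csv


def write_rows(path, rows):
    """Step 4: write (primary, secondary, domain) tuples to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["primary", "secondary", "domain"])
        writer.writerows(rows)


def crawl(base_url, out_path, timeout=10):
    # Selenium is imported here so write_rows can be reused without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import Select, WebDriverWait

    # Hypothetical locators; replace them with those of the target page.
    primary_loc = (By.ID, "primary")
    secondary_loc = (By.ID, "secondary")
    go_loc = (By.ID, "go")
    cells_loc = (By.CSS_SELECTOR, "table td")

    driver = webdriver.Chrome()
    wait = WebDriverWait(driver, timeout)

    def find(locator):
        # Re-locate on every use: old references go stale after a reload.
        return wait.until(EC.presence_of_element_located(locator))

    rows = []
    try:
        driver.get(base_url)  # step 1: load the page
        # Step 2: scrape the option texts of both dropdowns.
        primaries = [o.text for o in Select(find(primary_loc)).options]
        secondaries = [o.text for o in Select(find(secondary_loc)).options]
        # Step 3: iterate over the selections and press "Go" each time.
        for p in primaries:
            for s in secondaries:
                Select(find(primary_loc)).select_by_visible_text(p)
                Select(find(secondary_loc)).select_by_visible_text(s)
                old_page = driver.find_element(By.TAG_NAME, "html")
                driver.execute_script("arguments[0].click();", find(go_loc))
                wait.until(EC.staleness_of(old_page))  # assumes a full reload
                cells = wait.until(EC.presence_of_all_elements_located(cells_loc))
                rows.extend((p, s, cell.text) for cell in cells)
    finally:
        driver.quit()
    write_rows(out_path, rows)
```

Elements are looked up again inside the loop rather than cached, because references obtained before a reload raise a stale-element error afterwards.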

The example code can be modified to fit your own use case. Most functions in the code are general-purpose and can be called directly.

Many complex websites were crawled during development. Some websites automatically refresh after an element is clicked; others require an additional step of clicking another button that redirects the user (e.g. “Go”). This crawler code accommodates both cases.
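
One way to cover both behaviours with a single helper is sketched below. The names apply_selection and wait_for_page_reload are hypothetical and do not come from the repository; the sketch assumes that a refresh replaces the whole document, so the old html element goes stale.

```python
def wait_for_page_reload(driver, old_root, timeout=10):
    """Block until old_root is detached, i.e. the page has been replaced."""
    # Imported lazily so apply_selection can be exercised without Selenium.
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    WebDriverWait(driver, timeout).until(EC.staleness_of(old_root))


def apply_selection(driver, make_selection, go_locator=None,
                    wait_for_reload=wait_for_page_reload):
    """Make a selection, then wait for the refresh that follows it.

    make_selection: callable that performs the click or dropdown choice.
    go_locator: (by, value) tuple for an extra "Go"-style button, or None
                for sites that refresh on their own after the selection.
    """
    # Grab the root element before acting so the reload can be detected.
    old_root = driver.find_element("tag name", "html")
    make_selection()
    if go_locator is not None:
        go_button = driver.find_element(*go_locator)
        # JavaScript click, in case the button is not visibly clickable.
        driver.execute_script("arguments[0].click();", go_button)
    wait_for_reload(driver, old_root)
```

For an auto-refreshing site the call omits go_locator; for a site with a redirect button it passes something like go_locator=("id", "go"), where the id is an assumption about the page.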

Maoshu413/dynamic_crawler
