This repo contains working Selenium examples for web scraping in Python and Node.js.
It covers driver setup, navigation, waits, element extraction, downloads, network and proxy management, scaling strategies (Selenium Grid), and async scraping.
## Table of Contents

- Requirements
- Project Structure
- Python Selenium Examples
- Node.js Selenium Examples
- Notes
- More Resources
## Requirements

- Python 3.8+
- pip
- Node.js 18+
- npm or yarn
## Project Structure

```
selenium-scraping/
│
├── python/
│   ├── 01_install_selenium.md
│   ├── 02_driver_initialization.py
│   ├── 03_context_manager.py
│   ├── 04_options_headless.py
│   ├── 05_prefs_and_cdp.py
│   ├── 06_navigation.py
│   ├── 07_locators_and_find.py
│   ├── 08_shadow_dom.py
│   ├── 09_data_extraction.py
│   ├── 10_lists_and_tables.py
│   ├── 11_export_csv_json.py
│   ├── 12_waits_and_sync.py
│   ├── 13_stale_handling.py
│   ├── 14_interactions.py
│   ├── 15_scrolling.py
│   ├── 16_tabs_frames_alerts.py
│   ├── 17_sessions_and_auth.py
│   ├── 18_downloads_and_monitoring.py
│   ├── 19_debugging_and_logging.py
│   ├── 20_network_and_proxy.py
│   ├── 21_grid_examples.py
│   ├── 22_scrapy_integration.py
│   └── 23_hasdata_async_example.py
│
├── nodejs/
│   ├── 01_install_selenium.md
│   ├── 02_launch_browser.js
│   ├── 03_options_headless.js
│   ├── 04_block_resources_cdp.js
│   ├── 05_navigation_and_locators.js
│   ├── 06_shadow_dom.js
│   ├── 07_data_extraction.js
│   ├── 08_tables_and_export.js
│   ├── 09_waits_and_retry.js
│   ├── 10_downloads_and_monitoring.js
│   └── 11_proxy_and_stealth.js
│
└── README.md
```
## Python Selenium Examples

All examples use Selenium 4+:

- `02_driver_initialization.py` — Chrome/Firefox/Edge/Safari drivers
- `03_context_manager.py` — auto-quit with context manager
- `04_options_headless.py` — headless mode, window size, user-agent
- `05_prefs_and_cdp.py` — prefs and CDP commands
- `06_navigation.py` — get, refresh, back, forward
- `07_locators_and_find.py` — By API, find_element(s)
- `08_shadow_dom.py` — accessing shadow root
- `09_data_extraction.py` — element.text normalization, get_attribute
- `10_lists_and_tables.py` — scrape lists and tables
- `11_export_csv_json.py` — CSV/JSON export
- `12_waits_and_sync.py` — implicit vs explicit waits
- `13_stale_handling.py` — StaleElementReferenceException
- `14_interactions.py` — clicks, send_keys, forms
- `15_scrolling.py` — scrollIntoView, infinite scroll
- `16_tabs_frames_alerts.py` — tabs, iframes, alerts
- `17_sessions_and_auth.py` — login and cookies
- `18_downloads_and_monitoring.py` — auto-download, monitor completion
- `19_debugging_and_logging.py` — screenshots, logs
- `20_network_and_proxy.py` — CDP, Selenium Wire, proxies
- `21_grid_examples.py` — Selenium Grid remote driver examples
- `22_scrapy_integration.py` — scrapy-selenium integration
- `23_hasdata_async_example.py` — async scraping example
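The explicit-wait pattern covered in `12_waits_and_sync.py` and `13_stale_handling.py` boils down to polling a condition until it succeeds or a timeout expires. A framework-agnostic sketch of that loop (the repo's scripts use Selenium's `WebDriverWait`; this version runs without a browser, and `element_ready` is a simulated condition, not real Selenium code):

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    Mirrors what WebDriverWait(driver, timeout).until(...) does: swallow
    transient failures (e.g. StaleElementReferenceException), retry every
    `poll` seconds, and raise once the deadline passes.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while time.monotonic() < deadline:
        try:
            result = condition()
            if result:
                return result
        except Exception as exc:  # transient lookup failure: retry
            last_error = exc
        time.sleep(poll)
    raise TimeoutError(f"condition not met within {timeout}s") from last_error


# Usage: succeed once a (simulated) element becomes available.
attempts = {"n": 0}

def element_ready():
    attempts["n"] += 1
    return "element" if attempts["n"] >= 3 else None

print(wait_until(element_ready, timeout=5, poll=0.01))  # prints "element"
```

In real Selenium code the condition would be a lambda calling `driver.find_element(...)`, or one of the predicates in `selenium.webdriver.support.expected_conditions`.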
## Node.js Selenium Examples

All examples use `selenium-webdriver`:

- `02_launch_browser.js` — Chrome/Firefox drivers
- `03_options_headless.js` — headless mode, options, window size
- `04_block_resources_cdp.js` — CDP commands to block images/fonts
- `05_navigation_and_locators.js` — get, click, locators
- `06_shadow_dom.js` — access shadow DOM via JS execution
- `07_data_extraction.js` — text and attribute extraction
- `08_tables_and_export.js` — build CSV/JSON from scraped data
- `09_waits_and_retry.js` — implicit/explicit waits and retry
- `10_downloads_and_monitoring.js` — download handling
- `11_proxy_and_stealth.js` — proxies, stealth patterns
## Notes

These examples are for educational purposes only. Learn more about the legality of web scraping.
- Use context managers in Python to avoid leftover browser processes.
- CDP features work best with Chrome/Chromium.
- For large-scale scraping, consider Selenium Grid or Scrapy integration.
- All code is ready-to-run: adapt snippets to your own scraping tasks.
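The context-manager advice above works because Selenium 4's Python drivers implement `__enter__`/`__exit__`, calling `quit()` on exit. A sketch with a stand-in driver class (hypothetical, so it runs without a browser) shows the guarantee: cleanup happens even when the scrape raises mid-page.

```python
class FakeDriver:
    """Stand-in for selenium.webdriver.Chrome; records whether quit() ran."""

    def __init__(self):
        self.quit_called = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.quit()  # runs on normal exit AND when the block raises

    def quit(self):
        self.quit_called = True


driver = FakeDriver()
try:
    with driver:
        raise RuntimeError("scrape failed mid-page")
except RuntimeError:
    pass

print(driver.quit_called)  # prints True: no leftover browser process
```

With the real driver the pattern is the same: `with webdriver.Chrome() as driver: ...`, as demonstrated in `03_context_manager.py`.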