🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...
Updated Jun 26, 2024 - Python
🎭 Playwright integration for Scrapy
A package acting as a wrapper around the headless mode of existing web browsers to generate images from URLs and from HTML+CSS strings or files.
Run Selenium with Python via GitHub Actions using headless or non-headless browsers.
Pyppeteer integration for Scrapy
Example of username and password proxy authentication for use in Selenium
🗄 Save an archived copy of websites from Pocket/Pinboard/Bookmarks/RSS. Outputs HTML, PDFs, and more...
Scrapfly Python SDK for headless browsers and proxy rotation
An embeddable headless browser package for Python that provides a simplified interface for interacting with web pages using Selenium and Selenium Hub.
Web crawler and scraper based on Scrapy and Playwright's headless browser.
COVID-19 Apple Mobility Trends Reports
Automated Selenium-based scraper for extracting data from Myntra
Dare2024.com Solver is a Python automation script for seamlessly solving Dare2024.com quizzes. Impress your friends with correct answers effortlessly. Compatible with all dare2024.com versions and future updates.
Automated Selenium-based scraper for extracting and analyzing job listings from Glassdoor
This repository contains a Python script that simulates views on a GitHub profile by repeatedly reloading the profile page. The script uses the selenium and requests libraries to fetch the profile page and then reloads it in a headless Firefox browser.