PyWebScrapr

An open-source Python library for web scraping tasks. Includes support for both text and image scraping.

Installation

You can install PyWebScrapr using pip:

pip install pywebscrapr

Supported Python Versions

PyWebScrapr supports the following Python versions:

Python 3.6
Python 3.7
Python 3.8
Python 3.9
Python 3.10
Python 3.11 or later (preferred)

Please ensure that you have one of these Python versions installed before using PyWebScrapr. It may not work as expected on Python versions older than 3.6.
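
If you want a script to fail early on an unsupported interpreter, a check along these lines is one option. This is a minimal sketch using only the standard library; `MIN_SUPPORTED` and `python_is_supported` are hypothetical names chosen for illustration and are not part of PyWebScrapr.

```python
import sys

# Lowest version in the supported list above (an assumption taken from this README).
MIN_SUPPORTED = (3, 6)


def python_is_supported(version_info=sys.version_info, minimum=MIN_SUPPORTED):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not python_is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        # str.format is used so this file also parses on very old interpreters.
        raise SystemExit("PyWebScrapr lists Python 3.6+ as supported; found {}".format(found))
    print("Python version is in the supported range.")
```

Comparing only the major and minor parts keeps the check independent of patch releases.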

Features

Text Scraping: Extract textual content from specified URLs.
Image Scraping: Download images from specified URLs.

*For a full list, see the PyWebScrapr documentation.

Usage

Text Scraping

from pywebscrapr import scrape_text

# Specify links in a file or list
links_file = 'links.txt'
links_array = ['https://example.com/page1', 'https://example.com/page2']

# Scrape text and save to the 'output.txt' file
scrape_text(links_file=links_file, links_array=links_array, output_file='output.txt')
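
Before passing a list to `links_array`, it can help to drop entries that are not well-formed web links. The sketch below does this with the standard library only. `keep_http_links` is a hypothetical helper, not part of PyWebScrapr, and it assumes you only want `http` and `https` URLs.

```python
from urllib.parse import urlparse


def keep_http_links(links):
    """Return stripped links that have an http(s) scheme and a host, in order."""
    valid = []
    for link in links:
        candidate = link.strip()
        parsed = urlparse(candidate)
        # Require both a web scheme and a host name; everything else is skipped.
        if parsed.scheme in ("http", "https") and parsed.netloc:
            valid.append(candidate)
    return valid


links_array = keep_http_links([
    "https://example.com/page1",
    "  https://example.com/page2\n",
    "ftp://example.com/file",
    "not a url",
])
# links_array now holds only the two https links and can be passed as links_array=
```

Whether PyWebScrapr validates links itself is not stated here, so treat this as an optional pre-check rather than a requirement.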

Image Scraping

from pywebscrapr import scrape_images

# Specify links in a file or list
links_file = 'image_links.txt'
links_array = ['https://example.com/image1.jpg', 'https://example.com/image2.png']

# Scrape images and save to the 'images' folder
scrape_images(links_file=links_file, links_array=links_array, save_folder='images')
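
The example above passes direct image links. If your list comes from a mixed source, a filter like the following can keep only URLs whose path ends in a common image extension. This is a sketch under assumptions: `IMAGE_EXTENSIONS` and `has_image_extension` are hypothetical names, the extension set is illustrative, and PyWebScrapr may well accept other kinds of links.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative set of extensions; adjust to the formats you expect.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def has_image_extension(url):
    """Return True if the URL path ends in one of IMAGE_EXTENSIONS (case-insensitive)."""
    path = urlparse(url).path  # the query string and fragment are not part of the path
    extension = posixpath.splitext(path)[1].lower()
    return extension in IMAGE_EXTENSIONS


candidates = [
    "https://example.com/image1.jpg",
    "https://example.com/photo.PNG?size=large",
    "https://example.com/page.html",
]
image_links = [url for url in candidates if has_image_extension(url)]
# image_links keeps the .jpg and .PNG entries and can be passed as links_array=
```

Checking the extension is only a heuristic, since a server can serve an image from a URL with no extension at all.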

Contributing

Contributions are welcome! If you encounter any issues, have suggestions, or want to contribute to PyWebScrapr, please open an issue or submit a pull request on GitHub.

License

PyWebScrapr is released under the terms of the MIT License (Modified). Please see the LICENSE file for the full text.

Modified License Clause

The modified license clause grants users the permission to make derivative works based on the PyWebScrapr software. However, it requires any substantial changes to the software to be clearly distinguished from the original work and distributed under a different name.
