SlamSb/basic-webscraper

Full Python Web Scraper

Overview

This Python web scraper fetches a webpage and outputs the entire HTML structure as formatted JSON.
It recursively extracts every element, tag, attribute, and text node, providing a faithful JSON representation of the DOM tree.
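The recursive extraction described above could look roughly like the following. This is a minimal sketch, not the repository's actual code: the function names `node_to_dict` and `html_to_json` and the output keys (`tag`, `attributes`, `children`, `text`, `comment`) are assumptions made for illustration, and it assumes whitespace-only text nodes can be dropped.

```python
import json

from bs4 import BeautifulSoup
from bs4.element import Comment, NavigableString, Tag


def node_to_dict(node):
    """Convert one parsed node into a JSON-serializable structure (hypothetical helper)."""
    # Comment is a subclass of NavigableString, so it must be checked first.
    if isinstance(node, Comment):
        return {"comment": str(node)}
    if isinstance(node, NavigableString):
        # Assumption: whitespace-only text nodes are noise and are skipped.
        text = str(node).strip()
        return {"text": text} if text else None
    if isinstance(node, Tag):
        # Recurse into every child, keeping document order.
        children = [node_to_dict(child) for child in node.children]
        return {
            "tag": node.name,
            "attributes": dict(node.attrs),
            "children": [c for c in children if c is not None],
        }
    return None


def html_to_json(html):
    """Parse an HTML string and return the whole tree as formatted JSON."""
    soup = BeautifulSoup(html, "html.parser")
    roots = [node_to_dict(child) for child in soup.children]
    return json.dumps([r for r in roots if r is not None], indent=2)


if __name__ == "__main__":
    print(html_to_json('<div id="a"><p class="x y">Hi</p></div>'))
```

Multi-valued attributes such as `class` come back from Beautiful Soup as lists, so they serialize as JSON arrays. Other string-like nodes, such as a doctype, fall into the plain text branch in this sketch.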


Features

  • Fetches any page via URL
  • Recursively parses all HTML elements, attributes, and text
  • Outputs structured JSON representing the full DOM
  • Handles errors gracefully
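One way the fetching and error-handling features might fit together is sketched below. The README does not say how errors are reported, so the `fetch_html` name, the `(html, error)` return shape, and the 10-second default timeout are all assumptions made for illustration.

```python
import requests


def fetch_html(url, timeout=10):
    """Return (html, error); exactly one of the two is None (hypothetical helper)."""
    try:
        response = requests.get(url, timeout=timeout)
        # Turn 4xx/5xx responses into exceptions so they are reported too.
        response.raise_for_status()
    except requests.exceptions.RequestException as exc:
        # Covers malformed URLs, connection failures, timeouts, and HTTP errors.
        return None, f"{type(exc).__name__}: {exc}"
    return response.text, None


if __name__ == "__main__":
    html, error = fetch_html("https://example.com")
    if error:
        print(f"Could not fetch page: {error}")
    else:
        print(f"Fetched {len(html)} characters")
```

Catching `requests.exceptions.RequestException` rather than a bare `except` keeps programming errors visible while still reporting every network-level failure as a message instead of a traceback.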

Prerequisites

  • Python 3.6+
  • Required libraries:
    • requests
    • beautifulsoup4

Install dependencies with:

pip3 install requests beautifulsoup4

About

A basic web scraper that captures a page's full HTML, built with Python's Requests and Beautiful Soup 4 (bs4) libraries.
