josepmartorell/BeautifulSoup4

BeautifulSoup4

Built to parse HTML and XML files, BeautifulSoup loads a document into memory as a navigable parse tree. This makes it easy to search and modify the document's contents, which makes it a powerful and convenient tool for extracting and analyzing data :)
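To make the idea of an in-memory tree concrete, here is a minimal sketch. The HTML snippet, its tag names, and its classes are made up for illustration and are not taken from this project. It assumes `beautifulsoup4` is installed and uses the `html.parser` backend from the Python standard library.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, used only for illustration.
html = """
<html><body>
  <h1 id="title">Old title</h1>
  <p class="note">First note</p>
  <p class="note">Second note</p>
</body></html>
"""

# Build the parse tree in memory with the standard-library parser.
soup = BeautifulSoup(html, "html.parser")

# Search the tree: collect the text of every <p class="note">.
notes = [p.get_text() for p in soup.find_all("p", class_="note")]

# Modify the tree in place: replace the text of the <h1>.
soup.find(id="title").string = "New title"
title_text = soup.h1.get_text()

print(notes)
print(title_text)
```

Because the whole document lives in memory, the change to the heading is visible right away when the tree is printed or serialized with `str(soup)`.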

How to Install 🤖

First, open a terminal and change to the directory where you want to save the project. Then clone the repository and install its dependencies by running the following commands in a Linux shell:

(env)$ git clone https://github.com/josepmartorell/BeautifulSoup4.git

(env)$ cd BeautifulSoup4

(env)$ pip install -r requirements.txt

Table of Contents

  • Context
  • Manuals
  • Credits
  • License

Context

This is the first in a series of projects aimed at automating search and data collection tasks for a business and travel agency in my city. This initial project lays the groundwork for the web scraping techniques needed to extract the best deals from hotel wholesalers. Although Scrapy is usually the favorite for this kind of work, bs4 has made my job much easier, because its way of handling HTML and XML content is clear and easy to work with.
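As a rough sketch of what extracting the best deal can look like with bs4, the example below parses a made-up hotel listing and picks the cheapest offer. The markup, the class names (`hotel`, `name`, `price`), and the `parse_offers` helper are all hypothetical. A real wholesaler page will be structured differently and will need its own selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical wholesaler listing; real sites will use different markup.
listing_html = """
<ul id="hotels">
  <li class="hotel"><span class="name">Hotel Mar</span><span class="price">120.50</span></li>
  <li class="hotel"><span class="name">Hotel Sol</span><span class="price">95.00</span></li>
  <li class="hotel"><span class="name">Hotel Luna</span><span class="price">140.00</span></li>
</ul>
"""


def parse_offers(html):
    """Return a list of (name, price) tuples from the hypothetical listing."""
    soup = BeautifulSoup(html, "html.parser")
    offers = []
    for item in soup.find_all("li", class_="hotel"):
        name = item.find("span", class_="name").get_text(strip=True)
        price = float(item.find("span", class_="price").get_text(strip=True))
        offers.append((name, price))
    return offers


offers = parse_offers(listing_html)

# The best deal here is simply the offer with the lowest price.
best_deal = min(offers, key=lambda offer: offer[1])
print(best_deal)
```

Keeping the parsing in one small function means that only the selectors need to change when the source page changes, while the comparison logic stays the same.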

Manuals

Credits

License

    • Apache License 2.0
