This GitHub repository contains the source code for the Colab notebooks presented during the "Data Scraping Workshop" conducted by GDSC - PUP Manila. The source code is in the src folder:
- scraping-github-profiles: Contains two Jupyter notebooks (github_profiles_bs4.ipynb and github_profiles_selectolax.ipynb) that demonstrate how to scrape GitHub profiles using Beautiful Soup and Selectolax, respectively.
- scraping-quotestoscrape: Contains two Jupyter notebooks (quotestoscrape_bs4.ipynb and quotestoscrape_selectolax.ipynb) that demonstrate how to scrape the "Quotes to Scrape" website using Beautiful Soup and Selectolax, respectively.
The code in this repository depends on the following Python libraries:
- beautifulsoup4: For parsing HTML and XML documents.
- selectolax: For parsing HTML documents with CSS selectors.
- requests: For making HTTP requests.
You can install these dependencies by running the following command:
pip install beautifulsoup4 selectolax requests
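After installing, a quick way to confirm that the dependencies are available is to look up their installed versions. The sketch below uses only the standard library (Python 3.8 or later); the helper name installed_versions is hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI.
REQUIRED = ("beautifulsoup4", "selectolax", "requests")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can be installed with pip before running the notebooks locally.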
Download the following Google Colab notebooks as .ipynb or .py files: