In this repository, I extracted, transformed, and loaded datasets from disparate sources using Jupyter Notebook.
- Web Scraping in Python
- Extract, Transform, and Load (ETL)
I used Python libraries such as bs4 (Beautiful Soup), requests, and pandas to extract data from web pages:
- Download and scrape the contents of a web page
- Scrape all image tags
- Scrape data from HTML tables
- Scrape data from HTML tables into a DataFrame using BeautifulSoup and pandas
- Scrape data from HTML tables into a DataFrame using BeautifulSoup and pandas read_html
- Scrape data from HTML tables into a DataFrame using pandas read_html alone
- Read CSV and JSON files.
- Extract data from the above file types.
- Transform data.
- Save the transformed data in a ready-to-load format for loading into a database or for analysis.
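
The scraping steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: the HTML snippet, its image paths, and its table values are made up for the example, and in practice the page text would more likely come from something like `requests.get(url).text` for a real URL.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Made-up page content used only for illustration; a real page could be
# fetched with requests.get(url).text instead.
HTML = """
<html><body>
  <img src="/static/logo.png" alt="Logo">
  <img src="/static/chart.png" alt="Chart">
  <table>
    <tr><th>Country</th><th>Population</th></tr>
    <tr><td>A</td><td>1,000</td></tr>
    <tr><td>B</td><td>2,500</td></tr>
  </table>
</body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# Collect the src attribute of every <img> tag on the page.
image_sources = [img.get("src") for img in soup.find_all("img")]

# Walk the first <table> row by row and keep the text of each cell.
rows = []
for tr in soup.find("table").find_all("tr"):
    cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    rows.append(cells)

# Treat the first row as the header and the remaining rows as data.
table_df = pd.DataFrame(rows[1:], columns=rows[0])

# Strip thousands separators so the column can be stored as integers.
table_df["Population"] = (
    table_df["Population"].str.replace(",", "", regex=False).astype(int)
)

print(image_sources)
print(table_df)
```

Building the DataFrame by hand like this gives full control over how each cell is cleaned, whereas pandas `read_html` is shorter but needs an extra HTML parser such as lxml to be installed.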
The notebooks contain code, dataset links, and comments on code execution.
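
As a rough sketch of the file-based extract, transform, and save steps listed earlier, the example below reads CSV and JSON data, applies a simple transformation, and writes the result to a CSV file. The sample records, the column names, the unit conversions, and the function names `extract`, `transform`, and `load` are all assumptions made for illustration; the notebooks may organize these steps differently.

```python
import io
import os
import tempfile

import pandas as pd

# Made-up sample data standing in for the CSV and JSON source files.
CSV_TEXT = "name,height_in,weight_lb\nalex,70,150\nsam,65,130\n"
JSON_TEXT = '[{"name": "kim", "height_in": 68, "weight_lb": 160}]'


def extract():
    """Read both sources and combine them into one DataFrame."""
    csv_df = pd.read_csv(io.StringIO(CSV_TEXT))
    json_df = pd.read_json(io.StringIO(JSON_TEXT))
    return pd.concat([csv_df, json_df], ignore_index=True)


def transform(df):
    """Convert inches to metres and pounds to kilograms (illustrative)."""
    out = pd.DataFrame({"name": df["name"]})
    out["height_m"] = (df["height_in"] * 0.0254).round(2)
    out["weight_kg"] = (df["weight_lb"] * 0.45359237).round(2)
    return out


def load(df, path):
    """Write the transformed data to a CSV file ready for loading."""
    df.to_csv(path, index=False)


transformed = transform(extract())

# Save into a temporary directory so the sketch leaves no stray files behind.
target_path = os.path.join(tempfile.mkdtemp(), "transformed_data.csv")
load(transformed, target_path)

print(transformed)
```

With real files, the `io.StringIO(...)` arguments would simply be replaced by file paths, and the saved CSV could then be loaded into a database or read back for analysis.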