This project implements a simple ETL (Extract, Transform, Load) pipeline that collects data about the world's largest banks from an archived Wikipedia page, processes it, and saves it both as a CSV file and in an SQLite database.
The main goal is to automate the entire data workflow, from extraction through transformation to loading and querying.
- Scrapes the table of the world's largest banks from an archived copy of the Wikipedia page *List of largest banks* (see the extraction sketch below).
- Extracts each bank's name and market capitalization in billions of USD.
- Reads exchange rates from `exchange_rate.csv` and converts market capitalization from USD to GBP, EUR, and INR, adding the converted values as new DataFrame columns (see the transformation sketch below).
- Saves the transformed data to `Largest_banks_data.csv`.
- Loads the data into an SQLite database, `Banks.db`, as the table `Largest_banks` (see the loading sketch below).
- Queries the table: displays all rows and calculates the average market capitalization in GBP (see the query sketch below).
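
A minimal extraction sketch using requests, BeautifulSoup, and pandas. The snapshot URL and the table and cell positions are assumptions about the archived page's layout, not verified values:

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Hypothetical archive snapshot; substitute the actual web.archive.org URL.
URL = "https://web.archive.org/web/20230908091635/https://en.wikipedia.org/wiki/List_of_largest_banks"

def extract(url: str) -> pd.DataFrame:
    """Scrape the largest-banks table into a DataFrame of Name and MC_USD_Billion."""
    html = requests.get(url).text
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find_all("tbody")[0]  # assumes the target table is the first on the page
    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) >= 3:  # skip header and malformed rows
            name = cells[1].get_text(strip=True)  # assumes column order: rank, name, market cap
            mc = float(cells[2].get_text(strip=True).replace(",", ""))
            rows.append({"Name": name, "MC_USD_Billion": mc})
    return pd.DataFrame(rows)
```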
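A sketch of the transformation step. The derived column names (`MC_GBP_Billion`, etc.) and the layout of `exchange_rate.csv` (columns `Currency` and `Rate`) are assumptions:

```python
import pandas as pd

def transform(df: pd.DataFrame, rates_path: str = "exchange_rate.csv") -> pd.DataFrame:
    """Add GBP, EUR, and INR market-cap columns using rates from a CSV file."""
    # Assumes the CSV has rows like: GBP,0.8 under headers Currency,Rate.
    rates = pd.read_csv(rates_path).set_index("Currency")["Rate"]
    for cur in ("GBP", "EUR", "INR"):
        df[f"MC_{cur}_Billion"] = (df["MC_USD_Billion"] * rates[cur]).round(2)
    return df
```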
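Persisting the result is straightforward with pandas and the standard-library sqlite3 module; the file and table names follow the ones listed above:

```python
import sqlite3
import pandas as pd

def load(df: pd.DataFrame,
         csv_path: str = "Largest_banks_data.csv",
         db_path: str = "Banks.db") -> None:
    """Save the transformed DataFrame as a CSV file and as an SQLite table."""
    df.to_csv(csv_path, index=False)
    with sqlite3.connect(db_path) as conn:
        df.to_sql("Largest_banks", conn, if_exists="replace", index=False)
```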
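And a sketch of the two queries, again assuming the `MC_GBP_Billion` column name from the transform step:

```python
import sqlite3
import pandas as pd

def run_queries(db_path: str = "Banks.db") -> None:
    """Print all rows, then the average market capitalization in GBP."""
    with sqlite3.connect(db_path) as conn:
        print(pd.read_sql("SELECT * FROM Largest_banks", conn))
        print(pd.read_sql("SELECT AVG(MC_GBP_Billion) FROM Largest_banks", conn))
```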