ETL Project: Users & Products with SQL Server

Overview

This project is a Java-based ETL (Extract, Transform, Load) pipeline that moves data from external APIs into a SQL Server database. It also supports basic CRUD operations on product data and logs all operations for verification.

It demonstrates how to:

- Extract data from APIs
- Transform and filter data
- Load data into database tables
- Handle duplicates and updates automatically
- Log and visualize data

Flow of Data

1️⃣ Extract

User data and Product data are fetched from external APIs.
The APIs return JSON responses, which are converted into Java objects.
This process is handled by:
- APIExtractor.java → Extracts user data
- ProductAPIExtractor.java → Extracts product data

At this stage, data is still in memory and ready for transformation.
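The extract step above could be sketched roughly as follows. This is a minimal illustration, not the project's actual `APIExtractor.java`: it assumes Java 16+ (for records) and the standard `java.net.http.HttpClient`. The `User` fields (`id`, `name`, `city`), the class name `ExtractSketch`, and the sample JSON are hypothetical. The README does not say which JSON library the project uses, so a deliberately naive regular expression stands in for real JSON parsing here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractSketch {
    // Naive pattern for flat objects such as {"id": 1, "name": "Ann", "city": "Pune"}.
    // Stand-in only: it ignores escapes, nesting, and field order.
    private static final Pattern USER_OBJECT = Pattern.compile(
        "\\{\\s*\"id\"\\s*:\\s*(\\d+)\\s*,"
        + "\\s*\"name\"\\s*:\\s*\"([^\"]*)\"\\s*,"
        + "\\s*\"city\"\\s*:\\s*\"([^\"]*)\"\\s*\\}");

    // Downloads the response body as text; the caller supplies the API URL.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
            client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    // Converts the JSON text into in-memory Java objects.
    static List<User> parseUsers(String json) {
        List<User> users = new ArrayList<>();
        Matcher m = USER_OBJECT.matcher(json);
        while (m.find()) {
            users.add(new User(Integer.parseInt(m.group(1)), m.group(2), m.group(3)));
        }
        return users;
    }

    public static void main(String[] args) {
        // Sample payload instead of a live call, so the sketch runs offline.
        String sample = "[{\"id\": 1, \"name\": \"Ann\", \"city\": \"Pune\"}]";
        System.out.println(parseUsers(sample));
    }
}

// Hypothetical user shape; the project's real fields may differ.
record User(int id, String name, String city) {}
```

In a real extractor, `fetch` would be called with the API's URL and its result passed to a proper JSON parser rather than to the regex above.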

2️⃣ Transform

The extracted user data may need filtering or modifications before loading.
For example, only users with a valid city are retained.
This transformation ensures the database only receives clean and relevant data.
Transformation is handled by:
- DataTransformer.java → Applies filtering and transformation logic
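As an illustration of the city filter described above, here is a minimal sketch. It assumes that "valid city" means non-null and non-blank, which the README does not define, and it also trims surrounding whitespace as an example of a modification. The `User` record and the `TransformSketch` class are hypothetical names, not taken from `DataTransformer.java`. Requires Java 16+ for records.

```java
import java.util.List;
import java.util.stream.Collectors;

class TransformSketch {
    // Keeps users whose city is present and not blank, and trims the city value.
    static List<User> keepUsersWithCity(List<User> users) {
        return users.stream()
            .filter(u -> u.city() != null && !u.city().isBlank())
            .map(u -> new User(u.id(), u.name(), u.city().trim()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> raw = List.of(
            new User(1, "Ann", " Pune "),
            new User(2, "Bob", null),   // dropped: no city
            new User(3, "Cy", "   "));  // dropped: blank city
        System.out.println(keepUsersWithCity(raw));
    }
}

// Hypothetical user shape; the project's real fields may differ.
record User(int id, String name, String city) {}
```

Returning a new list instead of mutating the input keeps the extracted data intact, which makes it easy to log record counts before and after the transformation.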

3️⃣ Load

Transformed data is written to SQL Server tables.
Users are inserted directly into the Users table.
Products use a MERGE strategy:
- Existing product records are updated
- New products are inserted automatically
Loading is handled by:
- DatabaseLoader.java → Loads user data
- ProductDAO.java → Handles product data, including insert, update, and delete

This step moves data from Java objects into permanent storage in the database.
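The MERGE strategy for products might look something like the sketch below. The table name `Products`, its columns (`id`, `name`, `price`), and the class name `LoadSketch` are assumptions, since the README does not show the schema. The sketch also assumes that `id` comes from the API and is not an IDENTITY column. It uses only standard JDBC calls and requires Java 16+ for the record and text block.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class LoadSketch {
    // T-SQL MERGE: update the row when the id already exists, insert it otherwise.
    // Table and column names are hypothetical.
    static final String MERGE_SQL = """
        MERGE INTO Products AS target
        USING (VALUES (?, ?, ?)) AS source (id, name, price)
        ON target.id = source.id
        WHEN MATCHED THEN
            UPDATE SET name = source.name, price = source.price
        WHEN NOT MATCHED THEN
            INSERT (id, name, price) VALUES (source.id, source.name, source.price);
        """;

    // Sends all products to the database in one JDBC batch.
    static void upsertProducts(Connection conn, List<Product> products) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(MERGE_SQL)) {
            for (Product p : products) {
                ps.setInt(1, p.id());
                ps.setString(2, p.name());
                ps.setDouble(3, p.price());
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}

// Hypothetical product shape; the project's real fields may differ.
record Product(int id, String name, double price) {}
```

A caller would obtain the `Connection` from `DriverManager.getConnection` with the project's own SQL Server JDBC URL and pass in the product list. Running the same load twice should then update the existing rows rather than create duplicates.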

4️⃣ CRUD Operations on Products

After loading, products can be read, updated, or deleted as needed.
ProductDAO.java provides methods to:
- Read all products
- Insert or update products (using MERGE)
- Delete products by ID
A separate update call is optional, because the MERGE-based insert-or-update already applies changes to existing records.
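The read and delete operations could be sketched with plain JDBC as shown below. The method names `findAll` and `deleteById`, the class name `ProductCrudSketch`, and the `Products` table with `id`, `name`, and `price` columns are assumptions for illustration. The real `ProductDAO.java` may be organized differently. Requires Java 16+ for the record.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class ProductCrudSketch {
    // Reads every product row into a list of Java objects.
    static List<Product> findAll(Connection conn) throws SQLException {
        String sql = "SELECT id, name, price FROM Products ORDER BY id";
        List<Product> products = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                products.add(new Product(
                    rs.getInt("id"), rs.getString("name"), rs.getDouble("price")));
            }
        }
        return products;
    }

    // Deletes one product; returns true if a row was actually removed.
    static boolean deleteById(Connection conn, int id) throws SQLException {
        String sql = "DELETE FROM Products WHERE id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, id);
            return ps.executeUpdate() > 0;
        }
    }
}

// Hypothetical product shape; the project's real fields may differ.
record Product(int id, String name, double price) {}
```

Both methods take the `Connection` as a parameter and leave it open, so the caller decides when to close it or whether to wrap several operations in one transaction.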

5️⃣ Logging & Visualization

All steps in the pipeline are logged for tracking and debugging.
TablePrinter.java prints formatted tables of users and products to the console.
Logs show:
- API data received
- Records after transformation
- Data successfully loaded into SQL Server
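A console table like the one `TablePrinter.java` produces could be built with `String.format`, as in this minimal sketch. The column widths, the `formatProducts` method, the `TablePrinterSketch` class, and the use of `java.util.logging` are all assumptions. The README names neither the logging framework nor the table layout. Requires Java 16+ for the record.

```java
import java.util.List;
import java.util.Locale;
import java.util.logging.Logger;

class TablePrinterSketch {
    private static final Logger LOG = Logger.getLogger("etl");

    // Builds a fixed-width table: ID (4), Name (20), Price (10).
    // Names longer than 20 characters would break the alignment.
    static String formatProducts(List<Product> products) {
        String rule = "+" + "-".repeat(6) + "+" + "-".repeat(22)
            + "+" + "-".repeat(12) + "+\n";
        StringBuilder sb = new StringBuilder(rule);
        sb.append(String.format(Locale.ROOT,
            "| %-4s | %-20s | %10s |\n", "ID", "Name", "Price"));
        sb.append(rule);
        for (Product p : products) {
            sb.append(String.format(Locale.ROOT,
                "| %-4d | %-20s | %10.2f |\n", p.id(), p.name(), p.price()));
        }
        return sb.append(rule).toString();
    }

    public static void main(String[] args) {
        List<Product> products = List.of(
            new Product(1, "Keyboard", 49.99),
            new Product(2, "Mouse", 19.5));
        // One log line per pipeline stage, here with the record count.
        LOG.info("Products ready to print: " + products.size());
        System.out.print(formatProducts(products));
    }
}

// Hypothetical product shape; the project's real fields may differ.
record Product(int id, String name, double price) {}
```

Returning the table as a string, rather than printing inside the method, lets the same output go to the console or to a log file.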

Key Takeaways

Data moves in a linear flow: Extract → Transform → Load → CRUD.
Duplicate product handling is automatic using MERGE, avoiding manual update checks.
Each Java file has a clear responsibility:
- Extractors fetch API data
- Transformer filters or modifies data
- Loaders / DAO move data to the database and manage CRUD operations
- TablePrinter helps visualize data
Logging ensures transparency and aids debugging for new developers.
