This project is a Java-based ETL (Extract, Transform, Load) pipeline that moves data from external APIs into a SQL Server database. It also supports basic CRUD operations on product data and logs all operations for verification.
It demonstrates how to:
- Extract data from APIs
- Transform and filter data
- Load data into database tables
- Handle duplicates and updates automatically
- Log and visualize data
- User data and product data are fetched from external APIs.
- The APIs return JSON responses, which are converted into Java objects.
- This process is handled by:
  - `APIExtractor.java` → Extracts user data
  - `ProductAPIExtractor.java` → Extracts product data
At this stage, data is still in memory and ready for transformation.
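As a rough illustration of the extraction step, the sketch below fetches a JSON payload with the JDK's built-in `java.net.http.HttpClient`. The class name `ApiFetchSketch`, the method names, and the 10-second timeout are assumptions made for this example and are not taken from `APIExtractor.java`. Converting the returned JSON string into Java objects is left to whichever JSON library the project uses.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper; the project's extractors may be structured differently.
class ApiFetchSketch {

    // Builds a GET request that asks the API for a JSON response.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10)) // assumed timeout
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the raw JSON body as a string.
    static String fetchJson(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Keeping request construction separate from sending makes the request easy to inspect without touching the network.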
- The extracted user data may need filtering or modifications before loading.
- For example, only users with a valid city are retained.
- This transformation ensures the database only receives clean and relevant data.
- Transformation is handled by:
  - `DataTransformer.java` → Applies filtering and transformation logic
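The city filter described above could look roughly like the following. This is a minimal sketch: the `User` record, its fields, and the rule that "valid" means non-null and non-blank are assumptions for illustration, and `DataTransformer.java` may apply different or additional rules.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the filtering step.
class UserFilterSketch {

    // Assumed user shape; the real model class may carry more fields.
    record User(int id, String name, String city) {}

    // Keeps only users whose city is non-null and not blank.
    static List<User> retainUsersWithCity(List<User> users) {
        return users.stream()
                .filter(u -> u.city() != null && !u.city().isBlank())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> extracted = List.of(
                new User(1, "Ana", "Lisbon"),
                new User(2, "Ben", ""),
                new User(3, "Chi", null));
        // Only the first user has a usable city, so only that one is kept.
        System.out.println(retainUsersWithCity(extracted));
    }
}
```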
- Transformed data is written to SQL Server tables.
- Users are inserted directly into the Users table.
- Products use a MERGE strategy:
  - Existing product records are updated
  - New products are inserted automatically
- Loading is handled by:
  - `DatabaseLoader.java` → Loads user data
  - `ProductDAO.java` → Handles product data, including insert, update, and delete
This step moves data from Java objects into permanent storage in the database.
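To make the MERGE strategy concrete, here is a hedged sketch of an upsert through JDBC. The table name `Products`, the columns `id`, `title`, and `price`, and the class and method names are all assumptions for illustration; the actual statement in `ProductDAO.java` may differ. The sketch also assumes `id` is not an identity column, since it is inserted explicitly.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch of a MERGE-based upsert for SQL Server.
class ProductMergeSketch {

    // Assumed schema: Products(id, title, price).
    static final String MERGE_SQL =
            "MERGE INTO Products AS target "
          + "USING (VALUES (?, ?, ?)) AS source (id, title, price) "
          + "ON target.id = source.id "
          + "WHEN MATCHED THEN "
          + "UPDATE SET title = source.title, price = source.price "
          + "WHEN NOT MATCHED THEN "
          + "INSERT (id, title, price) VALUES (source.id, source.title, source.price);";

    // Updates the row when the id already exists, otherwise inserts it.
    // Returns the number of affected rows reported by the driver.
    static int upsert(Connection conn, int id, String title, BigDecimal price)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(MERGE_SQL)) {
            ps.setInt(1, id);
            ps.setString(2, title);
            ps.setBigDecimal(3, price);
            return ps.executeUpdate();
        }
    }
}
```

Because the match condition lives in the SQL statement, the Java side does not need a separate existence check before writing a product.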
- After loading, products can be read, updated, or deleted as needed.
- `ProductDAO.java` provides methods to:
  - Read all products
  - Insert or update products (using MERGE)
  - Delete products by ID
- A separate update call is optional because MERGE already applies changes to existing records.
- All steps in the pipeline are logged for tracking and debugging.
- `TablePrinter.java` prints formatted tables of users and products to the console.
- Logs show:
  - API data received
  - Records after transformation
  - Data successfully loaded into SQL Server
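Console table output of this kind can be produced with fixed-width format strings. The sketch below is illustrative only: the `Product` record, the column widths, and the class name are assumptions, and `TablePrinter.java` may format its tables differently.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of fixed-width table formatting.
class TablePrintSketch {

    // Assumed product shape for the example.
    record Product(int id, String title, double price) {}

    // Builds a header row followed by one aligned row per product.
    static String format(List<Product> products) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(Locale.ROOT, "%-5s %-20s %10s%n", "ID", "TITLE", "PRICE"));
        for (Product p : products) {
            sb.append(String.format(Locale.ROOT, "%-5d %-20s %10.2f%n",
                    p.id(), p.title(), p.price()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(List.of(
                new Product(1, "Lamp", 19.99),
                new Product(2, "Desk", 120.0))));
    }
}
```

`Locale.ROOT` keeps the decimal separator stable regardless of the machine's regional settings.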
- Data moves in a linear flow: Extract → Transform → Load → CRUD.
- Duplicate product handling is automatic using MERGE, avoiding manual update checks.
- Each Java file has a clear responsibility:
  - Extractors fetch API data
  - The transformer filters or modifies data
  - The loader and DAO move data to the database and manage CRUD operations
  - TablePrinter helps visualize data
- Logging ensures transparency and aids debugging for new developers.