# 🧹 Data Cleaning and Exploratory Analysis

In this notebook, we connect to the candidate database, explore the raw data for patterns and inconsistencies, and then clean and transform it for further use.

## 🚀 Steps Overview

1. **Database Connection** 🔗
2. **Exploratory Data Analysis (EDA)** 📊
3. **Data Cleaning and Transformation** 🧼



---

## 1. 🔗 Database Connection

We start by connecting to the PostgreSQL database to retrieve the raw data for analysis.



In [4]:
# A single cell for all imports
import sys
import os
from dotenv import load_dotenv

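# Load environment variables from the .env file
load_dotenv()
work_dir = os.getenv('WORK_DIR')

# Ensure the working directory is in sys.path so the src package can be imported
if work_dir is None:
    raise RuntimeError("WORK_DIR is not set; define it in the .env file")
if work_dir not in sys.path:
    sys.path.append(work_dir)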

from src.model import CandidatesRaw, CandidatesTransformed
from sqlalchemy import inspect
from sqlalchemy.orm import sessionmaker
from sqlalchemy.exc import SQLAlchemyError

from src.db_connection import build_engine
from src.transform_data import DataTransformer

In [5]:
# Connect to the database
engine = build_engine()
Session = sessionmaker(bind=engine)
session = Session()

Successfully connected to the database postgres!
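
The body of `build_engine` lives in `src/db_connection.py` and is not shown in this notebook. As a rough illustration only, a helper like it might assemble the PostgreSQL connection URL from environment variables. The function name `build_db_url` and the variable names `PG_USER`, `PG_PASSWORD`, `PG_HOST`, `PG_PORT`, and `PG_DATABASE` are assumptions made for this sketch, not the project's actual configuration.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=None):
    """Assemble a PostgreSQL URL from environment-style settings.

    Hypothetical helper: the PG_* variable names are assumptions,
    not taken from the project's real configuration.
    """
    env = os.environ if env is None else env

    # Fail early with a readable message if a required setting is absent.
    required = ("PG_USER", "PG_PASSWORD", "PG_HOST", "PG_DATABASE")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing database settings: {', '.join(missing)}")

    # Percent-encode credentials so characters such as '@' or '/'
    # cannot break the URL structure.
    user = quote_plus(env["PG_USER"])
    password = quote_plus(env["PG_PASSWORD"])
    host = env["PG_HOST"]
    port = env.get("PG_PORT", "5432")  # default PostgreSQL port
    database = env["PG_DATABASE"]
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"


# Example with made-up values
example = {
    "PG_USER": "etl",
    "PG_PASSWORD": "p@ss/word",
    "PG_HOST": "localhost",
    "PG_DATABASE": "postgres",
}
print(build_db_url(example))
# → postgresql://etl:p%40ss%2Fword@localhost:5432/postgres
```

A URL in this form can be passed to `sqlalchemy.create_engine`. Whatever the real helper does, it is good practice to call `session.close()` once the analysis is finished so the connection returns to the pool.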


---
## 2. 📊 Exploratory Data Analysis (EDA)
Here, we explore the dataset to understand its structure, identify any inconsistencies, and visualize key patterns.