# DA_PROJECT: Day 1 – Setting Up Project and Core Utilities

Today marked the beginning of my hands-on journey in building a **Data Analysis project from scratch**. The primary goal was to **finalize the project structure**, implement foundational backend utilities, and push the first version to GitHub.

I referred to resources like the *Python Data Science Handbook* by Jake VanderPlas and other online tutorials to structure the project and handle data safely.

## 1. Finalizing Folder Structure

Before writing any code, I spent time carefully designing a **human-readable and maintainable folder structure** for the project. The main idea was to separate the **backend logic**, **frontend (Streamlit) UI**, **raw and cleaned data**, and **notebooks** for experiments.

The structure looks like this:
```
DA_PROJECT/
├── app_backend/          # Backend logic
│   ├── __init__.py
│   ├── data_loader.py
│   ├── data_cleaning.py
│   └── helpers.py
├── notebooks/            # Jupyter notebooks for experimentation
│   └── day1.ipynb
├── streamlit_app/        # Streamlit frontend
├── data/                 # Raw and cleaned datasets
├── README.md
└── requirements.txt
```
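
The layout above can also be scaffolded with a few lines of Python rather than by hand. This is only a minimal sketch: the function name `create_skeleton` and the two constants are my own placeholders, and `day1.ipynb` is skipped on purpose because a valid notebook needs JSON content rather than an empty file.

```python
import tempfile
from pathlib import Path

# Relative paths copied from the tree above
DIRECTORIES = ["app_backend", "notebooks", "streamlit_app", "data"]
FILES = [
    "app_backend/__init__.py",
    "app_backend/data_loader.py",
    "app_backend/data_cleaning.py",
    "app_backend/helpers.py",
    "README.md",
    "requirements.txt",
]

def create_skeleton(root):
    """Create the project folders and empty placeholder files under root."""
    root = Path(root)
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for file in FILES:
        # touch() does not overwrite the content of files that already exist
        (root / file).touch(exist_ok=True)
    return root

# Demo in a throwaway directory so nothing is written to the working directory
with tempfile.TemporaryDirectory() as tmp:
    project_root = create_skeleton(Path(tmp) / "DA_PROJECT")
    created = sorted(
        p.relative_to(project_root).as_posix() for p in project_root.rglob("*")
    )

print(created)
```

Calling `create_skeleton("DA_PROJECT")` from the parent folder would produce the same layout on disk. Because both `mkdir` and `touch` are called with `exist_ok=True`, running it again on an existing project is harmless.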

## 2. Implementing `helpers.py`

To make the backend **more robust against common runtime errors**, I implemented a `safe_execute` decorator in `helpers.py`.

**Purpose:**
- Wraps backend methods to **catch common exceptions**: `FileNotFoundError`, `KeyError`, `ValueError`, `TypeError`, `ZeroDivisionError`, and `MemoryError`.
- Logs any other, unexpected error to the file `project_errors.log` for later debugging.
- Optionally supports an **interactive mode** that prompts the user for a new `input_value` when a `ValueError` occurs.

Here is the code:

```python
import functools
import logging

# Log unexpected errors to a file for later debugging
logging.basicConfig(
    filename="project_errors.log",
    level=logging.ERROR,
    format="%(asctime)s - %(levelname)s - %(message)s"
)

# Safe execution decorator factory: apply it as @safe_execute() or
# @safe_execute(interactive=True)
def safe_execute(interactive=False):
    def decorator(func):
        @functools.wraps(func)  # preserve the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            while True:
                try:
                    return func(*args, **kwargs)
                except FileNotFoundError as e:
                    print(f"File not found: {e}")
                    break
                except KeyError as e:
                    print(f"Missing column/key: {e}")
                    break
                except ValueError as e:
                    print(f"Invalid value: {e}")
                    # Retry only when there is an input to replace; retrying
                    # the identical call would loop forever
                    if interactive and 'input_value' in kwargs:
                        kwargs['input_value'] = input("Please enter a valid value: ")
                        continue
                    break
                except TypeError as e:
                    # Not retried: repeating the same call raises the same error
                    print(f"Type error: {e}")
                    break
                except ZeroDivisionError:
                    print("Cannot divide by zero!")
                    break
                except MemoryError:
                    print("Memory Error: Data too large!")
                    break
                except Exception as e:
                    logging.error(f"Unexpected error in {func.__name__}", exc_info=True)
                    print(f"An unexpected error occurred: {e}")
                    break
            return None  # reached only after a handled error
        return wrapper
    return decorator
```
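
To show how the decorator behaves from the caller's side, here is a small usage sketch. So that it runs on its own, it uses a condensed stand-in for `safe_execute` with only two handlers and no retry loop. The functions `average` and `lookup` are hypothetical examples, not part of the project.

```python
import functools

def safe_execute(interactive=False):
    # Condensed stand-in for the decorator above; `interactive` is accepted
    # but unused here to keep the sketch short
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except ZeroDivisionError:
                print("Cannot divide by zero!")
            except KeyError as e:
                print(f"Missing column/key: {e}")
            return None  # a handled error yields None instead of raising
        return wrapper
    return decorator

@safe_execute()  # parentheses are required: safe_execute returns the real decorator
def average(total, count):
    return total / count

@safe_execute()
def lookup(record, key):
    return record[key]

ok = average(10, 4)               # normal call, result passes through
failed = average(10, 0)           # ZeroDivisionError is caught and reported
missing = lookup({"a": 1}, "b")   # KeyError is caught and reported

print(ok, failed, missing)
```

The trade-off of this design is that a handled error turns into a `None` return value, so calling code should check for `None` before using the result.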

## 3. Implementing `data_loader.py`

Next, I created a `data_loader` class in `data_loader.py` to **load datasets safely and provide easy access** to data for exploration and analysis.

Key features include:
- Loading CSV datasets and storing them as a DataFrame.
- Previewing top rows for a quick glance.
- Getting column names and shape.
- Creating a copy of the DataFrame for safe downstream operations.

Here is the basic implementation:

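```python
import pandas as pd
from helpers import safe_execute  # assumes app_backend/ is on the import path

class data_loader:
    """Load a CSV file into a DataFrame and expose simple accessors."""

    # safe_execute is a decorator factory, so it must be called: @safe_execute()
    @safe_execute()
    def __init__(self, dataset):
        self.df = pd.read_csv(dataset)
        print(f"Loaded data preview:\n{self.df.head(4)}")

    @safe_execute()
    def get_columns(self):
        return self.df.columns

    @safe_execute()
    def shape(self):
        return self.df.shape

    @safe_execute()
    def preview_by_index(self, n):
        # Row at integer position n
        return self.df.iloc[n]

    @safe_execute()
    def preview_by_value(self, val):
        # Row(s) whose index label equals val
        return self.df.loc[val]

    @safe_execute()
    def df_copy(self):
        # Independent copy, so downstream edits do not touch self.df
        return self.df.copy()
```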
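
The two preview methods rely on different pandas indexers, which is easy to mix up: `iloc` selects by integer position, while `loc` selects by index label (not by a cell value). The sketch below illustrates the difference on a small made-up DataFrame. The column names and values are placeholders, not project data.

```python
import pandas as pd

# Hypothetical data, indexed by string labels instead of the default 0..n-1
df = pd.DataFrame(
    {"city": ["Pune", "Delhi", "Mumbai"], "sales": [120, 340, 560]},
    index=["a", "b", "c"],
)

by_position = df.iloc[1]   # second row, regardless of the index labels
by_label = df.loc["b"]     # row whose index label is "b"

print(by_position["city"], by_label["city"])

# With a string index, an integer is not a valid label for loc
try:
    df.loc[1]
    loc_with_int_failed = False
except KeyError:
    loc_with_int_failed = True
    print("loc[1] raised KeyError: 1 is a position, not a label in this index")
```

With the default integer index that `pd.read_csv` produces, positions and labels coincide at first, so the two methods return the same row until the DataFrame is filtered, sorted, or re-indexed.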

## 4. Pushing to GitHub

After creating `helpers.py` and `data_loader.py`, I initialized a Git repository and pushed the first version of the project:
- Added the project structure.
- Added `helpers.py` and `data_loader.py`.
- Added `day1.ipynb` for documentation.

This sets the foundation for day-to-day incremental development and tracking progress on GitHub.