# House Price Prediction with XGBoost (AWS SageMaker)

An end-to-end machine learning pipeline on AWS SageMaker that uses XGBoost to predict house prices from a Kaggle dataset.

## Project Overview

This project demonstrates:

- Data preprocessing and feature engineering
- Training an XGBoost model on SageMaker
- Model deployment
- Batch inference
- Kaggle submission generation
- Endpoint cleanup to stop billing

## Kaggle Score

Public Score: **0.20244**

## Tech Stack

- Python
- Pandas / NumPy
- Scikit-learn
- XGBoost
- AWS SageMaker
- Boto3

## Notebooks

### 01_preprocessing.ipynb
- Cleans data
- One-hot encodes features
- Splits train/validation
- Uploads to S3
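
The preprocessing steps listed above could look roughly like the sketch below. It is not the notebook's actual code: the function name `prepare_for_xgboost`, the median and `"Missing"` fill strategy, and the 80/20 random split are assumptions chosen for illustration. The one firm constraint it follows is that SageMaker's built-in XGBoost expects CSV training data with the target in the first column and no header row.

```python
import pandas as pd


def prepare_for_xgboost(df: pd.DataFrame, target: str,
                        valid_frac: float = 0.2, seed: int = 42):
    """Clean, one-hot encode, and split a DataFrame (illustrative helper)."""
    df = df.copy()

    # Assumed cleaning strategy: median for numeric gaps, a placeholder
    # category for missing categorical values.
    numeric_cols = df.select_dtypes(include="number").columns
    categorical_cols = df.columns.difference(numeric_cols)
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())
    df[categorical_cols] = df[categorical_cols].fillna("Missing")

    # One-hot encode the categorical columns as 0/1 integers.
    df = pd.get_dummies(df, columns=list(categorical_cols), dtype=int)

    # Put the target first, as the built-in algorithm expects for CSV input.
    df = df[[target] + [c for c in df.columns if c != target]]

    # Hold out a random validation split.
    valid = df.sample(frac=valid_frac, random_state=seed)
    train = df.drop(valid.index)
    return train, valid
```

Each split would then be written with `train.to_csv("train.csv", header=False, index=False)` before being uploaded to S3, for example with boto3's `upload_file`.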

### 02_training_and_inference.ipynb
- Trains XGBoost on SageMaker
- Deploys endpoint
- Generates predictions
- Creates submission.csv
- Deletes endpoint to stop billing
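
As a rough illustration of the last prediction steps, the sketch below turns a CSV response body from the endpoint into a submission file. The helper names are hypothetical, and the `Id` / `SalePrice` column names are assumptions about the competition's submission format, so adjust them to match the actual sample submission.

```python
import csv


def parse_predictions(body: str) -> list[float]:
    """Parse a CSV response body into a flat list of floats.

    Accepts values separated by commas, newlines, or both.
    """
    tokens = body.replace("\n", ",").split(",")
    return [float(t) for t in tokens if t.strip()]


def write_submission(ids, predictions, path="submission.csv"):
    """Write one (Id, prediction) row per test example."""
    if len(ids) != len(predictions):
        raise ValueError(
            f"{len(ids)} ids but {len(predictions)} predictions"
        )
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Id", "SalePrice"])  # assumed column names
        writer.writerows(zip(ids, predictions))
```

The length check guards against a truncated or partially failed batch, which would otherwise misalign predictions with their ids without raising an error.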

## Important
Always delete SageMaker endpoints after inference. A deployed endpoint keeps accruing AWS charges for as long as it is running, whether or not it receives requests.
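
A minimal cleanup sketch using the boto3 SageMaker client is shown below. The helper name `delete_endpoint_and_config` is hypothetical, and the client is passed in as an argument so the function can be exercised without AWS credentials. Deleting the endpoint is what stops the instance charges; removing the endpoint configuration is housekeeping.

```python
def delete_endpoint_and_config(sm_client, endpoint_name: str) -> str:
    """Delete an endpoint and its endpoint configuration.

    Returns the name of the deleted endpoint configuration.
    """
    # Look up the configuration name while the endpoint still exists.
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description["EndpointConfigName"]

    # Deleting the endpoint releases the hosting instances.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name


# Usage (endpoint name is a placeholder):
#   import boto3
#   delete_endpoint_and_config(boto3.client("sagemaker"), "my-endpoint")
```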

---

Built as part of applied machine learning practice.


In [1]:
import shutil

shutil.make_archive(
    "house-price-ml-project",  # name of zip file (without .zip)
    "zip",                     # format
    "HousePricePrediction-MLProject"  # folder to zip
)

print("Zip file created successfully.")

FileNotFoundError: [Errno 2] No such file or directory: 'HousePricePrediction-MLProject'

In [2]:
!pwd

/home/sagemaker-user/HousePricePrediction-MLProject


In [3]:
!ls

README.ipynb  data  notebooks  processed  requirements.txt  submission.csv
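
The `FileNotFoundError` in cell 1 follows from the `!pwd` output: the notebook is already running inside `HousePricePrediction-MLProject`, so the relative path `"HousePricePrediction-MLProject"` is looked up inside the project folder itself, where no such directory exists. One possible fix, sketched below with a hypothetical helper `zip_project`, resolves the project folder to an absolute path and writes the archive next to it rather than inside it, so the zip file does not end up including itself.

```python
import shutil
from pathlib import Path


def zip_project(project_dir, archive_stem="house-price-ml-project"):
    """Zip project_dir into <parent>/<archive_stem>.zip; return the archive path."""
    project_dir = Path(project_dir).resolve()
    parent = project_dir.parent
    return shutil.make_archive(
        base_name=str(parent / archive_stem),  # output path without ".zip"
        format="zip",
        root_dir=str(parent),                  # directory to archive from
        base_dir=project_dir.name,             # folder name kept inside the zip
    )
```

From a notebook whose working directory is the project folder, calling `zip_project(Path.cwd())` would create `house-price-ml-project.zip` in `/home/sagemaker-user`.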
