# Kaggle Coin Price Forecasting Challenge
Beyond the training data: Test your forecasting model's adaptability on an unseen asset.

## Dataset Description
### Data Description
Welcome to the dataset for the Kaggle Coin Price Forecasting Challenge! The data provided enables you to train time series models and generate predictions according to the competition's unique evaluation structure.

The core task involves using historical 15-minute price data for "Kaggle Coin" to predict its Close price at future 4-hour intervals.

### 📁 Files Provided
You are provided with the following files:

- **train.csv**: Contains the historical training data for Kaggle Coin.
- **test.csv**: Defines the future time points where predictions are required. This file does not include target values.
- **sample_submission.csv**: Demonstrates the exact format your submission must follow.

#### 📘 train.csv — Training Data
Your primary resource for model training, feature engineering, and temporal pattern recognition.

- *Content*: Approximately 1 year of historical OHLCV (Open, High, Low, Close, Volume) data.
- *Frequency*: 15-minute intervals (candlesticks).
- *Format*: Time-indexed data.

##### Columns
- *timestamp*: (DateTime, UTC) Start time of the 15-minute interval. Use it to keep rows in chronological order.
- *open*: (float) Price at the beginning of the interval.
- *high*: (float) Maximum price during the interval.
- *low*: (float) Minimum price during the interval.
- *close*: (float) Price at the end of the interval — this is what you'll be forecasting.
- *volume*: (float) Total amount traded during the interval.
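
As a minimal sketch of working with these columns, the snippet below loads data in the train.csv layout with pandas and aggregates the 15-minute candles into 4-hour bars. The rows are synthetic stand-ins generated in the code, not real competition values, and the helper names `load_ohlcv` and `resample_to_4h` are hypothetical. Resampling is one possible way to line the data up with the 4-hour targets, not a required step.

```python
import io

import numpy as np
import pandas as pd


def load_ohlcv(source):
    """Read a CSV in the train.csv column layout and index it by UTC timestamp."""
    df = pd.read_csv(source)
    df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)
    return df.set_index("timestamp").sort_index()


def resample_to_4h(candles):
    """Aggregate 15-minute candles into 4-hour bars labelled by bar start time."""
    rules = {"open": "first", "high": "max", "low": "min",
             "close": "last", "volume": "sum"}
    return candles.resample(pd.Timedelta(hours=4)).agg(rules)


# Synthetic stand-in for train.csv: 32 made-up candles covering 8 hours.
# The start date is arbitrary and only serves the illustration.
stamps = pd.date_range("2023-01-01", periods=32, freq="15min", tz="UTC")
close = 1.0 + 0.01 * np.arange(32)
fake = pd.DataFrame({
    "timestamp": stamps,
    "open": close - 0.005,
    "high": close + 0.01,
    "low": close - 0.01,
    "close": close,
    "volume": 100.0,
})

# With the real file this would be load_ohlcv("train.csv").
candles = load_ohlcv(io.StringIO(fake.to_csv(index=False)))
bars = resample_to_4h(candles)
print(bars)
```

Each 4-hour bar is labelled with its start time, so the bar labelled 00:00 takes its close from the candle that starts at 03:45.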

⚠️ Note: While this data is derived from real historical crypto data, such as MANA, the final evaluation will use a completely different, hidden asset. Avoid overfitting to asset-specific patterns.

#### 📘 test.csv — Prediction Timepoints
This file outlines the future points in time (after the train period ends) for which your model should generate predictions.

- *Purpose*: Defines which timestamps require forecasts.
- *Structure*: Contains day_id entries representing abstracted points in time (e.g., D3660400 = Day 366, 04:00).
- *Target*: These rows do not include target prices — they only indicate what your model should predict.
- *Usage*: Use the train.csv data (i.e., the past 365 days) to make predictions for each day_id in test.csv.

This setup simulates a realistic forecasting scenario where the model is asked to generalize beyond the training window.
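
To make the workflow concrete, here is a sketch of a naive persistence baseline: it repeats the last observed close for every requested day_id and writes the result with the day_id and close columns that the submission uses. This is only an illustrative starting point, not a recommended model. The helper names and the sample prices are hypothetical.

```python
import csv
import io


def persistence_submission(train_rows, day_ids):
    """Predict the last observed close for every requested day_id."""
    last_close = float(train_rows[-1]["close"])
    return [{"day_id": day_id, "close": last_close} for day_id in day_ids]


def write_submission(rows, handle):
    """Write predictions with a day_id,close header row."""
    writer = csv.DictWriter(handle, fieldnames=["day_id", "close"])
    writer.writeheader()
    writer.writerows(rows)


# Made-up rows shaped like csv.DictReader output for train.csv (oldest first).
train_rows = [
    {"timestamp": "2023-12-31 23:30:00", "close": "0.5100"},
    {"timestamp": "2023-12-31 23:45:00", "close": "0.5123"},
]
# With the real files these identifiers would be read from test.csv.
day_ids = ["D3660400", "D3661600", "D3670000"]

submission = persistence_submission(train_rows, day_ids)
buffer = io.StringIO()  # swap for open("submission.csv", "w", newline="")
write_submission(submission, buffer)
print(buffer.getvalue())
```

A baseline like this gives a reference score to compare against before investing in more elaborate models.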

#### 📘 sample_submission.csv — Submission Format Guide
This file shows the required format for your prediction file.

##### Columns
- *day_id*: A unique synthetic timestamp identifier matching those in test.csv (e.g., D3661600)
- *close*: Your predicted numeric Close price for that day_id

##### day_id Format

Each day_id follows the pattern:

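- *D/DDD/HHMM*, where the slashes only separate the parts and do not appear in the identifier: the literal letter D, then DDD, then HHMM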
    - DDD = Day index (e.g., 366 = day 366)
    - HHMM = Time of day in 24-hour format (e.g., 1600 = 16:00)
    
    Example: D3670000 refers to Day 367, 00:00.

This anonymized format hides real calendar dates to ensure fairness, while maintaining correct chronological order and intra-day time resolution.
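
The day_id pattern above can be decoded with a small parser. The sketch below assumes the day index has at least three digits and that the last four digits are always HHMM, as in the documented examples. The function names are hypothetical.

```python
import re
from datetime import timedelta

# Literal "D", a day index of three or more digits, then HH and MM.
DAY_ID_PATTERN = re.compile(r"^D(\d{3,})(\d{2})(\d{2})$")


def parse_day_id(day_id):
    """Split a day_id such as 'D3660400' into (day, hour, minute)."""
    match = DAY_ID_PATTERN.match(day_id)
    if match is None:
        raise ValueError(f"Unrecognized day_id: {day_id!r}")
    day, hour, minute = (int(part) for part in match.groups())
    if hour > 23 or minute > 59:
        raise ValueError(f"Invalid time of day in day_id: {day_id!r}")
    return day, hour, minute


def day_id_to_offset(day_id):
    """Convert a day_id to a timedelta, which is handy for sorting and spacing."""
    day, hour, minute = parse_day_id(day_id)
    return timedelta(days=day, hours=hour, minutes=minute)


print(parse_day_id("D3660400"))  # (366, 4, 0)
ordered = sorted(["D3670000", "D3660400", "D3661600"], key=day_id_to_offset)
print(ordered)  # ['D3660400', 'D3661600', 'D3670000']
```

Sorting by the decoded offset keeps predictions in chronological order even if the identifiers are shuffled.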



### ⏳ Time and Forecasting Structure
- **train.csv** uses standard UTC timestamps.
- **test.csv** and **sample_submission.csv** use synthetic day_id identifiers.

This design allows for temporal modeling without revealing real-world dates, thereby supporting the competition's core focus on generalization.
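
One way to put both files on a single time axis is to relabel the train.csv timestamps in the same synthetic scheme. The sketch below rests on an unconfirmed assumption: that Day 1 corresponds to the first calendar day (UTC) present in train.csv. Verify this against the actual files before relying on it. The function name and the dates used are hypothetical.

```python
from datetime import datetime, timezone


def to_day_id(ts, first_day):
    """Relabel a UTC timestamp as a synthetic day_id.

    ASSUMPTION: Day 1 is the calendar day of `first_day`, i.e. the first
    day present in train.csv. The description does not confirm this.
    """
    day_index = (ts.date() - first_day.date()).days + 1
    return f"D{day_index:03d}{ts:%H%M}"


# Hypothetical dates chosen only for illustration.
first_day = datetime(2023, 1, 1, tzinfo=timezone.utc)
print(to_day_id(datetime(2023, 1, 1, 0, 15, tzinfo=timezone.utc), first_day))  # D0010015
print(to_day_id(datetime(2024, 1, 1, 4, 0, tzinfo=timezone.utc), first_day))   # D3660400
```

Under this assumption, a training window of 365 days ends on Day 365 and the first test points fall on Day 366, which matches the identifiers shown in the examples.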