## Niyaz Gubaidullin

## Task 1: Exploratory Data Analysis (EDA) and Preprocessing (25 points)

## 1. Load the Data
- **Read `opsd_raw.csv` into a DataFrame.** Identify the relevant columns.
- **Manually confirm** which columns hold Denmark’s power load and generation by referencing `README.md`.

In [None]:
import numpy as np
import pandas as pd
from datetime import datetime
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)
df = pd.read_csv('opsd_raw.csv')
required_columns = [
    'utc_timestamp',
    'DK_load_actual_entsoe_transparency',
    'DK_wind_generation_actual',
    'DK_solar_generation_actual'
]
df = df[required_columns]



## 2. Data Inspection
- Print the shape (rows × columns) of the raw data and the first few lines.
- Check for missing values (NaNs). Decide how you’ll handle them (drop rows,
fill forward, etc.).

In [None]:
print("=== Data Inspection ===")
print(f"Shape: {df.shape}")
print("\nFirst 5 rows:")
print(df.head())
print("\nMissing values:")
print(df.isnull().sum())
df = df.dropna()

Only **15 values are missing** across roughly **50,000 rows**, so the affected rows can simply be dropped. Any day that loses an hour this way is excluded later by the full-day check.

### Missing values:
- `utc_timestamp`: **0** missing values
- `DK_load_actual_entsoe_transparency`: **2** missing values
- `DK_wind_generation_actual`: **2** missing values
- `DK_solar_generation_actual`: **11** missing values
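
The task lists dropping rows and forward filling as options. As a rough illustration on an invented toy frame (the `load` column and its values are assumptions, not data from `opsd_raw.csv`), the two strategies differ as follows:

```python
import numpy as np
import pandas as pd

# Toy hourly series with one gap; values are invented for illustration.
stamps = pd.to_datetime("2020-01-01") + pd.to_timedelta(np.arange(6), unit="h")
toy = pd.DataFrame({
    "utc_timestamp": stamps,
    "load": [3100.0, 3050.0, np.nan, 2990.0, 3010.0, 3200.0],
})

missing_per_column = toy.isnull().sum()   # count NaNs per column
dropped = toy.dropna()                    # removes the incomplete hour
filled = toy.ffill()                      # repeats the last known value

print(missing_per_column["load"], len(dropped), len(filled))
```

Dropping removes the whole hourly row, so the affected day ends up with fewer than 24 rows. Forward filling keeps the row count intact at the cost of repeating a value.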


## 3. Form 24-Hour Arrays and Label Seasons.
- Convert the hourly data into daily slices of length 24.
- Ensure each daily record lines up with a single date (e.g., from midnight to
midnight).
- Show at least 5 sample arrays to confirm correctness.

In [None]:
##function to get season by date
def get_season(date):
    month = date.month
    if 3 <= month <= 5:
        return 'spring'
    elif 6 <= month <= 8:
        return 'summer'
    elif 9 <= month <= 11:
        return 'autumn'
    else:
        return 'winter'


df['utc_timestamp'] = pd.to_datetime(df['utc_timestamp'])
df = df.sort_values('utc_timestamp')
df['date'] = df['utc_timestamp'].dt.date
df['season'] = df['utc_timestamp'].apply(get_season)

daily_data = []
for date, group in df.groupby('date'):
    if len(group) == 24:  # we only add full days
        daily_data.append({
            'date': date,
            'season': group['season'].iloc[0],
            'load_24h': group['DK_load_actual_entsoe_transparency'].values,
            'wind_24h': group['DK_wind_generation_actual'].values,
            'solar_24h': group['DK_solar_generation_actual'].values
        })

daily_df = pd.DataFrame(daily_data)
print("\n=== 5 Sample Days ===")
for i in range(5):  ## print the first 5 days
    print(f"\nDate: {daily_df['date'].iloc[i]}, Season: {daily_df['season'].iloc[i]}")
    print(f"Load (24h): {daily_df['load_24h'].iloc[i]}")
    print(f"Wind (24h): {daily_df['wind_24h'].iloc[i]}")
    print(f"Solar (24h): {daily_df['solar_24h'].iloc[i]}")

print("\n=== Season Distribution ===")
print(daily_df['season'].value_counts())

## season distribution:
# === Season Distribution ===
# season
# spring    552
# summer    552
# winter    510
# autumn    484
# Name: count, dtype: int64
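
To sanity-check the full-day filter, the sketch below rebuilds the same grouping on two synthetic days (made-up values and a single hypothetical `load` column) where one hour has been removed from the second day:

```python
import numpy as np
import pandas as pd

# Two synthetic days of hourly data; the second day loses one hour.
stamps = pd.to_datetime("2020-01-01") + pd.to_timedelta(np.arange(48), unit="h")
toy = pd.DataFrame({"utc_timestamp": stamps, "load": np.arange(48, dtype=float)})
toy = toy.drop(index=30)                      # simulate a dropped NaN row
toy["date"] = toy["utc_timestamp"].dt.date

# Keep only dates that still have all 24 hourly rows.
full_days = {
    date: group["load"].to_numpy()
    for date, group in toy.groupby("date")
    if len(group) == 24
}

print(len(full_days))                         # number of complete days kept
```

Only the first day survives, because the second one has 23 rows after the drop.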



## 5. Train-test split.
- Split the daily records into train (70%), validation (15%), and test
(15%) sets.

In [None]:
train_df, temp_df = train_test_split(daily_df, test_size=0.3, random_state=42)
val_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42)
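
The two calls above first hold out 30% of the days and then halve that part, which yields a 70/15/15 split. A NumPy-only sketch of the same proportions (a stand-in for `train_test_split`, not the scikit-learn call itself, using the day count implied by the season distribution printed earlier) shows the resulting sizes:

```python
import numpy as np

n_days = 552 + 552 + 510 + 484       # total from the printed season counts
rng = np.random.default_rng(42)      # seed chosen for illustration only
order = rng.permutation(n_days)      # shuffled day indices

n_train = int(0.70 * n_days)         # 70% for training
n_val = (n_days - n_train) // 2      # half of the remainder for validation
train_idx = order[:n_train]
val_idx = order[n_train:n_train + n_val]
test_idx = order[n_train + n_val:]

print(len(train_idx), len(val_idx), len(test_idx))  # 1468 315 315
```

Because the split is random rather than stratified, the season proportions in each subset can differ slightly from the overall distribution.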

## 6. Data scaling.
- Apply one of the standardisation (or scaling) methods you've learnt on
the IML course to every feature column. Avoid data leakage when fitting the scaler.

In [None]:
def scale_data(df, scalers, fit=False):
    scaled = df.copy()
    for feature, scaler in scalers.items():
        data = np.vstack(scaled[f'{feature}_24h'])

        if fit:
            ## fit the scaler on the training data only, then scale it
            scaled_data = scaler.fit_transform(data)
        else:
            ## only transform: validation/test data reuse the training statistics
            scaled_data = scaler.transform(data)
        scaled[f'{feature}_24h'] = [row.tolist() for row in scaled_data]
    return scaled


## one scaler per feature, so each series keeps its own statistics
scalers = {
    'load': StandardScaler(),
    'wind': StandardScaler(),
    'solar': StandardScaler()
}
train_df_scaled = scale_data(train_df, scalers, fit=True)
val_df_scaled = scale_data(val_df, scalers)
test_df_scaled = scale_data(test_df, scalers)

## check the scaling on the training set: the mean must be 0 and the std must be 1
for feature in ['load', 'wind', 'solar']:
    data = np.vstack(train_df_scaled[f'{feature}_24h'])
    print(f"{feature} - mean: {data.mean():.2f}, std: {data.std():.2f}")
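
To make the leakage point concrete, here is a NumPy sketch of what standardisation does when the statistics come from the training rows only. The data is synthetic, and the manual formula stands in for `StandardScaler`, which likewise uses one mean and one population standard deviation per column:

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=3000.0, scale=400.0, size=(100, 24))  # synthetic "days"
val = rng.normal(loc=3000.0, scale=400.0, size=(20, 24))

# Statistics come from the training rows only (one mean/std per hour column).
mean = train.mean(axis=0)
std = train.std(axis=0)              # population std (ddof=0)

train_scaled = (train - mean) / std
val_scaled = (val - mean) / std      # reuse train statistics: no leakage

print(train_scaled.mean(axis=0).round(6)[:3], train_scaled.std(axis=0).round(6)[:3])
```

The validation set is close to, but not exactly, zero mean and unit variance. That is expected, since its own statistics were never used.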

## 7. Brief Analysis
- Plot at least one example of a daily consumption profile for each season.
- Include 1–2 personal observations (e.g., “We see that winter consumption is
generally higher than summer.”). This helps ensure you’re truly analyzing the
data beyond a purely automated approach.

I plot one sample day per season, taken as the first matching record in the scaled training set.

In [None]:
plt.figure(figsize=(18, 12))
seasons = ['winter', 'spring', 'summer', 'autumn']

for i, season in enumerate(seasons, 1):
    # take the first training sample of each season
    season_data = train_df_scaled[train_df_scaled['season'] == season].iloc[0]

    plt.subplot(2, 2, i)

    # plot lines for load, solar and wind
    plt.plot(season_data['load_24h'], label='Load', color='blue')
    plt.plot(season_data['wind_24h'], label='Wind', color='green')
    plt.plot(season_data['solar_24h'], label='Solar', color='orange')

    plt.title(f'{season.capitalize()} (Scaled)')
    plt.xlabel('Hour of Day')
    plt.ylabel('Scaled Value')
    plt.axhline(y=0, color='gray', linestyle='--')  # zero line
    plt.legend()
    plt.grid(True)

plt.tight_layout()
plt.show()

- In the plotted sample days, energy consumption is noticeably higher in winter than in summer.
- Solar generation is higher in summer than in winter.

# Task 2: Baseline MLP (Fully-Connected Network) (15 points)

### 1. Implement an MLP in PyTorch with at least:
- One hidden layer (e.g., 32–128 neurons).
- Non-linear activation (e.g., ReLU).

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from matplotlib import pyplot as plt
from torch.utils.data import DataLoader, TensorDataset
import numpy as np
from sklearn.metrics import accuracy_score, classification_report


class MLP(nn.Module):
    # hidden_size=64 combined with a learning rate of 0.001 gave a good balance
    # between training speed and model quality for this task.
    def __init__(self, input_size=72, hidden_size=64, output_size=4):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

### 2. Input Representation
- Flatten the 24-hour sequence into a 24-dimensional vector.
- Normalize or standardize the inputs if needed.

In [None]:
def prepare_data(df):
    features = np.array([np.concatenate([day['load_24h'],
                                         day['wind_24h'],
                                         day['solar_24h']])
                         for day in df.to_dict('records')])
    season_to_idx = {'winter': 0, 'spring': 1, 'summer': 2, 'autumn': 3}
    labels = np.array([season_to_idx[day['season']] for day in df.to_dict('records')])
    return torch.FloatTensor(features), torch.LongTensor(labels)
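
The task text speaks of a 24-dimensional vector for a single series. Here three series (load, wind, solar) are concatenated, so each day becomes a 72-dimensional input, matching `input_size=72` in the MLP. A small NumPy check with placeholder values illustrates the layout:

```python
import numpy as np

# One synthetic day: three 24-hour channels (values are placeholders).
day = {
    "load_24h": np.zeros(24),
    "wind_24h": np.ones(24),
    "solar_24h": np.full(24, 2.0),
}

# Same concatenation order as in prepare_data: load, wind, solar.
flat = np.concatenate([day["load_24h"], day["wind_24h"], day["solar_24h"]])

print(flat.shape)   # (72,)
```

Indices 0 to 23 hold the load, 24 to 47 the wind, and 48 to 71 the solar values.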

### 3. Train & Evaluate
- Show training curves for loss and accuracy.
- Evaluate on both validation and test sets, and report final accuracy.

In [None]:
## train the model for 50 epochs with a batch size of 32

def train_model(train_df, val_df, epochs=50, batch_size=32):
    # prepare data
    X_train, y_train = prepare_data(train_df)
    X_val, y_val = prepare_data(val_df)

    train_dataset = TensorDataset(X_train, y_train)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    # create model
    model = MLP(input_size=72, hidden_size=64, output_size=4)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    #save data for plot
    train_losses, val_losses = [], []
    train_accs, val_accs = [], []

    for epoch in range(epochs):
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0

        for inputs, labels in train_loader:
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

        model.eval()
        with torch.no_grad():
            val_outputs = model(X_val)
            val_loss = criterion(val_outputs, y_val)
            _, val_predicted = torch.max(val_outputs.data, 1)
            val_acc = accuracy_score(y_val.numpy(), val_predicted.numpy())

        train_loss = running_loss / len(train_loader)
        train_acc = correct / total
        train_losses.append(train_loss)
        val_losses.append(val_loss.item())
        train_accs.append(train_acc)
        val_accs.append(val_acc)

        # print results of epochs
        print(f'Epoch {epoch + 1}/{epochs}: '
              f'Train Loss: {train_loss:.4f}, Val Loss: {val_loss.item():.4f}, '
              f'Train Acc: {train_acc:.4f}, Val Acc: {val_acc:.4f}')

    return model, train_losses, val_losses, train_accs, val_accs


model, train_losses, val_losses, train_accs, val_accs = train_model(train_df_scaled, val_df_scaled, epochs=50)

### Check accuracy on test data: print accuracy, final test-set predictions and classification report

- The test accuracy is 0.8730.
- Classification report: run the code to view it.

In [None]:
def evaluate_model(model, test_df):
    X_test, y_test = prepare_data(test_df)
    model.eval()
    with torch.no_grad():
        outputs = model(X_test)
        _, predicted = torch.max(outputs.data, 1)
        accuracy = accuracy_score(y_test.numpy(), predicted.numpy())
    print(f'Test Accuracy: {accuracy:.4f}')

    print("\nTest Set Predictions:")
    print(predicted.numpy())

    print("\nClassification Report:")
    print(classification_report(y_test.numpy(), predicted.numpy(),
                                target_names=['winter', 'spring', 'summer', 'autumn']))
    return accuracy


evaluate_model(model, test_df_scaled)

### Plots

In [None]:
def plot_metrics(train_losses, val_losses, train_accs, val_accs):
    plt.figure(figsize=(12, 5))

    plt.subplot(1, 2, 1)
    plt.plot(train_losses, label='Train Loss')
    plt.plot(val_losses, label='Val Loss')
    plt.title('Training and Validation Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(train_accs, label='Train Accuracy')
    plt.plot(val_accs, label='Val Accuracy')
    plt.title('Training and Validation Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()

    plt.tight_layout()
    plt.show()

plot_metrics(train_losses, val_losses, train_accs, val_accs)

# Task 3: 1D-CNN on Raw Time-Series (20 points)

### 1. 1D Convolution Architecture
- Use PyTorch Conv1d layers to process 3-channel sequences of length 24.
- At least one convolutional layer, one pooling layer, and a final fully connected
layer for classification.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import pandas as pd

class CNN1D(nn.Module):
    def __init__(self, input_channels=3, seq_len=24, num_classes=4):
        super(CNN1D, self).__init__()

        # (batch_size, 3, 24) -> (batch_size, 32, 22)
        self.conv1 = nn.Conv1d(input_channels, 32, kernel_size=3, padding='valid')
        self.relu = nn.ReLU()

        # (batch_size, 32, 22) -> (batch_size, 32, 11)
        self.pool = nn.MaxPool1d(kernel_size=2)

        # (batch_size, 32 * 11) -> (batch_size, 64)
        self.fc1 = nn.Linear(32 * 11, 64)

        # (batch_size, 64) -> (batch_size, 4)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.relu(x)
        x = self.pool(x)

        x = x.view(x.size(0), -1)  # Flatten
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)

        return x

def prepare_cnn_data(df):
    features = np.array([np.stack([day['load_24h'],
                                   day['wind_24h'],
                                   day['solar_24h']], axis=0)
                         for day in df.to_dict('records')])

    season_to_idx = {'winter': 0, 'spring': 1, 'summer': 2, 'autumn': 3}
    labels = np.array([season_to_idx[day['season']] for day in df.to_dict('records')])

    return torch.FloatTensor(features), torch.LongTensor(labels)
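
The shape comments in `CNN1D` can be verified with the standard output-length formula for convolution and pooling windows. The helper name `conv_out_len` is introduced here for illustration only:

```python
def conv_out_len(length, kernel_size, stride=1, padding=0):
    """Output length of a 1D convolution or pooling window (dilation 1)."""
    return (length + 2 * padding - kernel_size) // stride + 1

after_conv = conv_out_len(24, kernel_size=3)                    # 'valid' padding
after_pool = conv_out_len(after_conv, kernel_size=2, stride=2)  # MaxPool1d(2)
flat_features = 32 * after_pool                                 # 32 channels

print(after_conv, after_pool, flat_features)   # 22 11 352
```

The value 352 is the `32 * 11` used as the input size of `fc1`.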

### Train and validate cnn

In [None]:
def train_cnn(train_df, val_df, epochs=50, batch_size=32):
    X_train, y_train = prepare_cnn_data(train_df)
    X_val, y_val = prepare_cnn_data(val_df)

    train_dataset = TensorDataset(X_train, y_train)
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

    model = CNN1D(input_channels=3, seq_len=24, num_classes=4)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    train_losses, val_losses = [], []
    train_accs, val_accs = [], []

    for epoch in range(epochs):
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0

        for inputs, labels in train_loader:
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()


        model.eval()
        with torch.no_grad():
            val_outputs = model(X_val)
            val_loss = criterion(val_outputs, y_val)
            _, val_predicted = torch.max(val_outputs.data, 1)
            val_acc = accuracy_score(y_val.numpy(), val_predicted.numpy())

        train_loss = running_loss / len(train_loader)
        train_acc = correct / total
        train_losses.append(train_loss)
        val_losses.append(val_loss.item())
        train_accs.append(train_acc)
        val_accs.append(val_acc)

        print(f'Epoch {epoch + 1}/{epochs}: '
              f'Train Loss: {train_loss:.4f}, Val Loss: {val_loss.item():.4f}, '
              f'Train Acc: {train_acc:.4f}, Val Acc: {val_acc:.4f}')

    return model, train_losses, val_losses, train_accs, val_accs

model_cnn, train_losses_cnn, val_losses_cnn, train_accs_cnn, val_accs_cnn = train_cnn(
        train_df_scaled, val_df_scaled, epochs=50)


### CNN Test Accuracy, Test Set Predictions and Classification Report

In [None]:
def evaluate_cnn(model, test_df):
    X_test, y_test = prepare_cnn_data(test_df)
    model.eval()
    with torch.no_grad():
        outputs = model(X_test)
        _, predicted = torch.max(outputs, 1)

        accuracy = accuracy_score(y_test.numpy(), predicted.numpy())
        print(f'Test Accuracy: {accuracy:.4f}')

        print("\nTest Set Predictions:")
        print(predicted.numpy())

        print("\nClassification Report:")
        print(classification_report(y_test.numpy(), predicted.numpy(),
                                    target_names=['winter', 'spring', 'summer', 'autumn']))

    return accuracy  # returned so that test_acc below is not None

test_acc = evaluate_cnn(model_cnn, test_df_scaled)



### MLP vs CNN

- CNN accuracy: 0.8889
- MLP accuracy: 0.8730

# Task 4: 2D Transform & 2D-CNN (20 points)

### 1. Choose a transformation from the possible options:
- Option A: Use pyts (e.g., GramianAngularField, RecurrencePlot, or
MarkovTransitionField).
- Option B: Create a simple matrix that arranges the data in a 2D pattern
(though 24 points is small, you might artificially reshape it into 6×4 or some
other arrangement).
- Option C: Another custom transformation that you can justify

I chose Option A with the Gramian Angular Field (GAF), because it transforms a time series into a spatial representation that keeps its temporal structure, which suits 2D CNNs. GAF is also commonly used for image-based time-series classification (for example, in energy and bioinformatics).

In [None]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from pyts.image import GramianAngularField
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder, StandardScaler



def transform_to_gaf(data_2d, image_size=24):
    transformer = GramianAngularField(image_size=image_size, method='summation')
    return transformer.transform(data_2d)


def apply_gaf_transformation(df):
    df_gaf = df.copy()
    features = ['load', 'wind', 'solar']

    for feature in features:
        data = np.vstack(df[f'{feature}_24h'])
        gaf_images = transform_to_gaf(data)
        df_gaf[f'{feature}_gaf'] = [img for img in gaf_images]

    return df_gaf


# Dataset and DataLoader
class EnergyDataset(Dataset):
    def __init__(self, df, labels):
        self.load_images = np.stack(df['load_gaf'])
        self.wind_images = np.stack(df['wind_gaf'])
        self.solar_images = np.stack(df['solar_gaf'])
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        image = np.stack([
            self.load_images[idx],
            self.wind_images[idx],
            self.solar_images[idx]
        ], axis=0)
        return torch.FloatTensor(image), torch.LongTensor([self.labels[idx]])

### 2. Train a 2D CNN
- Use standard 2D convolutions (Conv2d in PyTorch).
- At least one pooling layer, at least one hidden conv layer, and a final fully connected block.

In [None]:
class SeasonClassifier2DCNN(nn.Module): #CNN2D class
    def __init__(self, num_classes=4):
        super(SeasonClassifier2DCNN, self).__init__()
        # Input shape: (batch_size, 3, 24, 24)
        # Output shape: (batch_size, 16, 24, 24) [padding=1 preserves spatial dims]
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()  # Shape preserved: (batch_size, 16, 24, 24)

        # Input: (batch_size, 16, 24, 24)
        # Output: (batch_size, 16, 12, 12) [kernel_size=2 halves the dimensions]
        self.pool1 = nn.MaxPool2d(kernel_size=2)

        # Input: (batch_size, 16, 12, 12)
        # Output: (batch_size, 32, 12, 12) [padding=1 preserves spatial dims]
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()  # Shape preserved: (batch_size, 32, 12, 12)

        # Input: (batch_size, 32, 12, 12)
        # Output: (batch_size, 32, 6, 6) [kernel_size=2 halves the dimensions]
        self.pool2 = nn.MaxPool2d(kernel_size=2)

        # Flatten occurs here in forward(): (batch_size, 32, 6, 6) -> (batch_size, 32*6*6=1152)

        # Input: (batch_size, 1152)
        # Output: (batch_size, 128)
        self.fc1 = nn.Linear(32 * 6 * 6, 128)
        self.relu3 = nn.ReLU()  # Shape preserved: (batch_size, 128)

        # Input: (batch_size, 128)
        # Output: (batch_size, num_classes)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.relu3(self.fc1(x))
        x = self.fc2(x)
        return x

def train_model(model, train_loader, val_loader, epochs=20): ## train model (redefines train_model from Task 2)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    train_losses = []
    val_losses = []
    train_accuracies = []
    val_accuracies = []

    for epoch in range(epochs):
        model.train()
        running_loss = 0.0
        train_correct = 0
        train_total = 0
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels.squeeze())
            loss.backward()
            optimizer.step()

            running_loss += loss.item()


            _, predicted = torch.max(outputs.data, 1)
            train_total += labels.size(0)
            train_correct += (predicted == labels.squeeze()).sum().item()

        train_loss = running_loss / len(train_loader)
        train_accuracy = train_correct / train_total

        train_losses.append(train_loss)
        train_accuracies.append(train_accuracy)

        # Validation
        model.eval()
        val_loss = 0.0
        val_correct = 0
        val_total = 0

        with torch.no_grad():
            for inputs, labels in val_loader:
                outputs = model(inputs)
                loss = criterion(outputs, labels.squeeze())
                val_loss += loss.item()

                _, predicted = torch.max(outputs.data, 1)
                val_total += labels.size(0)
                val_correct += (predicted == labels.squeeze()).sum().item()

        val_loss = val_loss / len(val_loader)
        val_accuracy = val_correct / val_total

        val_losses.append(val_loss)
        val_accuracies.append(val_accuracy)
        # display results
        print(f'Epoch {epoch + 1}/{epochs}: '
              f'Train Loss: {train_loss:.4f}, '
              f'Train Acc: {train_accuracy:.4f}, '
              f'Val Loss: {val_loss:.4f}, '
              f'Val Acc: {val_accuracy:.4f}')

    return train_losses, train_accuracies, val_losses, val_accuracies


## Train the model and print training and validation accuracy

In [None]:
train_gaf = apply_gaf_transformation(train_df_scaled)
val_gaf = apply_gaf_transformation(val_df_scaled)
test_gaf = apply_gaf_transformation(test_df_scaled)

# 3. Prepare the DataLoaders
le = LabelEncoder()
train_labels = le.fit_transform(train_gaf['season'])
val_labels = le.transform(val_gaf['season'])
test_labels = le.transform(test_gaf['season'])

batch_size = 32
train_dataset = EnergyDataset(train_gaf, train_labels)
val_dataset = EnergyDataset(val_gaf, val_labels)
test_dataset = EnergyDataset(test_gaf, test_labels)

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size)
test_loader = DataLoader(test_dataset, batch_size=batch_size)

# 4. Create and train the model
model = SeasonClassifier2DCNN()
# unpack in the order train_model returns its results
train_losses, train_accuracies, val_losses, val_accuracies = train_model(model, train_loader, val_loader)

## Test model on test data, classification report

In [None]:
from sklearn.metrics import classification_report

model.eval()
correct = 0
total = 0
all_preds = []
all_labels = []

with torch.no_grad():
    for inputs, labels in test_loader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels.squeeze()).sum().item()

        all_preds.extend(predicted.cpu().numpy())
        all_labels.extend(labels.squeeze().cpu().numpy())

test_accuracy = correct / total
print(f'Test Accuracy: {test_accuracy:.4f}')

# Classification Report
# LabelEncoder sorts class names alphabetically, so take the label order from it
class_names = list(le.classes_)
print("\nClassification Report:")
print(classification_report(
    all_labels,
    all_preds,
    target_names=class_names,
    digits=4
))
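
`LabelEncoder` numbers classes in sorted order of their names, so index 0 is `autumn`, not `winter`. The names passed as `target_names` need to follow that same order, otherwise rows of the report are attributed to the wrong season. The NumPy sketch below mimics the ordering with `np.unique` (a stand-in, not the scikit-learn call itself):

```python
import numpy as np

seasons = np.array(["winter", "spring", "summer", "autumn", "winter"])

# np.unique returns the sorted unique values plus each element's index
# into them, which is the same ordering LabelEncoder stores in classes_.
classes, encoded = np.unique(seasons, return_inverse=True)

print(classes.tolist())   # ['autumn', 'spring', 'summer', 'winter']
print(encoded.tolist())   # [3, 1, 2, 0, 3]
```

This differs from the manual `season_to_idx` mapping used for the MLP and the 1D-CNN, where `winter` is index 0.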

# Model Comparison Results

**1D-CNN** achieved the highest test accuracy (**0.8889**), ahead of:
- MLP (**0.8730**)
- 2D-CNN (**0.8190**)
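
For context, the three models differ considerably in size. The sketch below counts trainable parameters directly from the layer definitions given earlier; the helper names are introduced here for illustration:

```python
def linear_params(n_in, n_out):
    return n_in * n_out + n_out                 # weights + biases

def conv_params(c_in, c_out, kernel_elems):
    return c_in * c_out * kernel_elems + c_out  # weights + biases

# Layer sizes copied from the MLP, CNN1D and SeasonClassifier2DCNN classes.
mlp = linear_params(72, 64) + linear_params(64, 4)
cnn1d = conv_params(3, 32, 3) + linear_params(32 * 11, 64) + linear_params(64, 4)
cnn2d = (conv_params(3, 16, 3 * 3) + conv_params(16, 32, 3 * 3)
         + linear_params(32 * 6 * 6, 128) + linear_params(128, 4))

print(mlp, cnn1d, cnn2d)   # 4932 23172 153188
```

The largest model is the least accurate here, so the ranking is not simply a matter of capacity. The 2D-CNN was also trained for 20 epochs instead of 50, which may contribute to the gap.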

## Key Observations:
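- The **GAF conversion**, although visually interpretable, **did not improve accuracy**
  → A likely reason is that GAF rescales each daily series to [-1, 1] before encoding it, so absolute magnitudes (for example, how much solar power is generated) are lost and only the shape of the curve remains.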

## Conclusion:
For this specific task, **raw time series (1D-CNN)** proved more accurate than the image-based (2D-CNN) approach.

# Explanation of Gramian Angular Field (GAF) Choice

The **Gramian Angular Field (GAF)** was chosen for transforming 1D time-series data into 2D images because it effectively preserves temporal correlations by encoding values into polar coordinates and calculating trigonometric relationships. This method is particularly useful for periodic or seasonal patterns (like energy consumption data), as it highlights:

- **Time-dependent dynamics**: Each point in the GAF matrix represents the temporal interaction between two time steps.

- **Invariance to scaling**: The angular representation normalizes amplitude variations, focusing on shape.
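
The encoding described above can be sketched in a few lines of NumPy. This is an illustrative version of the summation variant, not the pyts implementation, and the sine curve is a synthetic stand-in for a daily profile:

```python
import numpy as np

def gasf(series):
    """Gramian Angular Summation Field of a 1D series (NumPy sketch)."""
    x = np.asarray(series, dtype=float)
    # Rescale to [-1, 1] so that arccos is defined.
    x = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    phi = np.arccos(np.clip(x, -1.0, 1.0))      # polar angle per time step
    return np.cos(phi[:, None] + phi[None, :])  # entry (i, j) = cos(phi_i + phi_j)

hours = np.arange(24)
profile = np.sin(2 * np.pi * hours / 24)        # synthetic daily curve
image = gasf(profile)
image_scaled = gasf(5.0 * profile + 100.0)      # same shape, different amplitude

print(image.shape, np.allclose(image, image_scaled))
```

The two images coincide, which is the scaling invariance mentioned above. The flip side is that two days with the same curve shape but very different magnitudes become indistinguishable.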

## Why GAF helps in this task:

1. **Seasonal Patterns**: Solar generation has strong daily/seasonal cycles, which GAF captures as distinct diagonal structures.

2. **Multi-channel compatibility**: GAF processes each feature (load/wind/solar) separately, allowing the CNN to learn cross-feature spatial patterns.

> **Citation**:
> *"The Gramian Angular Field (GAF) represents time series in a polar coordinate system, followed by a trigonometric operation to identify temporal correlations"*
> — PyTS Documentation, https://pyts.readthedocs.io/en/stable/generated/pyts.image.GramianAngularField.html