# DimABSA 2026 - DESS Model with VA Regression
## Dimensional Aspect Sentiment Triplet Extraction (Subtask 2)

**Task**: Extract (Aspect, Opinion, VA) triplets with continuous Valence-Arousal scores

**Dataset**: Combined Restaurant + Laptop (3,727 training samples)

**Model**: DESS with a VA regression head (two continuous outputs, valence and arousal, replacing the 3-class classification head)

## 1. Clone Repository from GitHub

In [None]:
!git clone https://github.com/VishalRepos/dimabsa-2026.git
%cd dimabsa-2026/DESS/Codebase
!echo "\n=== Repository Cloned ==="
!git log --oneline -3

## 2. Check GPU Availability

In [None]:
!nvidia-smi

## 3. Install Dependencies

In [None]:
# Install required packages
!pip install -q torch==2.1.0
!pip install -q transformers==4.36.0
!pip install -q numpy==1.26.4
!pip install -q scikit-learn==1.3.2
!pip install -q tqdm==4.65.0
!pip install -q tensorboardX==2.6
!pip install -q matplotlib==3.8.0
!pip install -q torch_geometric==2.4.0

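# Restart the kernel so the newly installed versions are loaded.
# The cell will appear to crash; continue with the next cell once the kernel is back.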
import os
os.kill(os.getpid(), 9)

## 4. Verify Setup

In [None]:
import torch
import numpy as np
import transformers

print("=== Package Versions ===")
print(f"PyTorch: {torch.__version__}")
print(f"Transformers: {transformers.__version__}")
print(f"NumPy: {np.__version__}")

print("\n=== GPU Information ===")
print(f"CUDA Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
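
The version printout above only covers three packages. As a minimal sketch, the pins from the install cell can be compared against what is actually installed using the standard-library `importlib.metadata`. The `PINS` dictionary and the `find_mismatches` helper are illustrative names, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install cell in Section 3.
PINS = {
    "torch": "2.1.0",
    "transformers": "4.36.0",
    "numpy": "1.26.4",
    "scikit-learn": "1.3.2",
    "tqdm": "4.65.0",
    "tensorboardX": "2.6",
    "matplotlib": "3.8.0",
    "torch_geometric": "2.4.0",
}


def find_mismatches(pins, lookup=version):
    """Return {package: (expected, installed_or_None)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        # Wheels such as torch may carry a local suffix like "+cu121"; ignore it.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


# Report every package whose installed version differs from its pin.
for name, (expected, installed) in find_mismatches(PINS).items():
    print(f"{name}: expected {expected}, found {installed}")
```

If the loop prints nothing, every pinned package resolved to the expected version. The `lookup` parameter exists only so the helper can be exercised without touching the real environment.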

## 5. Verify Data Files

In [None]:
import json

# Check data directory
!ls -lh data/dimabsa_combined/

# Load and verify data (context managers make sure the files are closed)
with open('data/dimabsa_combined/train_dep_triple_polarity_result.json') as f:
    train_data = json.load(f)
with open('data/dimabsa_combined/test_dep_triple_polarity_result.json') as f:
    test_data = json.load(f)

print("\n=== Data Verification ===")
print(f"Training samples: {len(train_data)}")
print(f"Test samples: {len(test_data)}")

# Check sample structure
sample = train_data[0]
print("\nSample structure:")
print(f"  Tokens: {len(sample['tokens'])}")
print(f"  Entities: {len(sample['entities'])}")
print(f"  Sentiments: {len(sample['sentiments'])}")
if sample['sentiments']:
    print(f"  VA format: {sample['sentiments'][0]['type']}")
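
The check above inspects only the first sample. A rough sketch of a whole-dataset summary follows, assuming every record carries the same `tokens`, `entities` and `sentiments` keys as the first one. The `summarize` helper and the toy records are hypothetical and only mimic the key layout.

```python
from collections import Counter


def summarize(samples):
    """Aggregate basic counts over samples with 'tokens', 'entities' and 'sentiments' keys."""
    stats = Counter()
    for sample in samples:
        stats["samples"] += 1
        stats["tokens"] += len(sample["tokens"])
        stats["entities"] += len(sample["entities"])
        stats["sentiments"] += len(sample["sentiments"])
        # Track records that would contribute no triplet to training.
        if not sample["sentiments"]:
            stats["samples_without_sentiments"] += 1
    return dict(stats)


# Toy records that only mimic the key layout; the values are made up for illustration.
toy = [
    {"tokens": ["great", "pizza"], "entities": [{}, {}], "sentiments": [{"type": "placeholder"}]},
    {"tokens": ["ok"], "entities": [], "sentiments": []},
]
print(summarize(toy))
```

In the notebook, `summarize(train_data)` and `summarize(test_data)` would give a quick view of how many samples carry no sentiment annotation. The `samples_without_sentiments` key is absent when every sample has at least one.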

## 6. Training Configuration

In [None]:
# Training parameters
DATASET = "dimabsa_combined"
EPOCHS = 10
BATCH_SIZE = 4
MAX_SPAN_SIZE = 10
SEED = 42

print("=== Training Configuration ===")
print(f"Dataset: {DATASET}")
print(f"Epochs: {EPOCHS}")
print(f"Batch Size: {BATCH_SIZE}")
print(f"Max Span Size: {MAX_SPAN_SIZE}")
print(f"Seed: {SEED}")

## 7. Run Training - VA Regression
### Expected: the model learns to predict continuous VA scores in the range [1.0, 9.0]

In [None]:
!python train.py \
    --seed {SEED} \
    --max_span_size {MAX_SPAN_SIZE} \
    --batch_size {BATCH_SIZE} \
    --epochs {EPOCHS} \
    --dataset {DATASET} \
    --pretrained_deberta_name microsoft/deberta-v3-base \
    --deberta_feature_dim 768 \
    --hidden_dim 384 \
    --emb_dim 768 \
    --lstm_layers 1

## 8. View Training Logs

In [None]:
!ls -lh log/dimabsa_combined/
!echo "\n=== Latest Training Log ==="
!tail -50 log/dimabsa_combined/train_*.log

## 9. Check Best Model

In [None]:
!ls -lh savemodels/dimabsa_combined/

## 10. Evaluate Results

In [None]:
import glob
import json
import os  # re-imported: the kernel restart in Section 3 cleared earlier imports

# Find latest results
result_files = glob.glob('log/dimabsa_combined/*.json')
if result_files:
    latest_file = max(result_files, key=lambda x: os.path.getmtime(x))
    print(f"Results file: {latest_file}\n")
    
    with open(latest_file, 'r') as f:
        results = json.load(f)
    
    print("=== Best Results ===")
    for key, value in results.items():
        print(f"{key}: {value}")
else:
    print("No results found. Training may still be running.")

## 11. Download Trained Model

In [None]:
# List saved models
!ls -lh savemodels/dimabsa_combined/

print("\n=== Download Instructions ===")
print("1. Go to Output panel (right side)")
print("2. Navigate to savemodels/dimabsa_combined/")
print("3. Download the best model file")
print("4. Also download training logs from log/dimabsa_combined/")

## 12. Training Summary

### Model Configuration:
- **Base Model**: DeBERTa-v3-base
- **Task**: VA Regression (2 outputs: valence, arousal)
- **Loss**: MSE Loss for continuous prediction
- **Dataset**: 3,727 training samples (restaurant + laptop)
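
To make the regression objective concrete, here is a dependency-free sketch of mean squared error over (valence, arousal) pairs. Averaging over all 2N values mirrors the default `mean` reduction of `torch.nn.MSELoss`; whether `train.py` uses that reduction, and whether it clamps predictions to the [1.0, 9.0] score range, are assumptions. `va_mse` and `clamp_va` are illustrative names, not functions from the codebase.

```python
def va_mse(predictions, targets):
    """Mean squared error over (valence, arousal) pairs, averaged over all 2*N values."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("predictions and targets must be non-empty and equally long")
    total = 0.0
    for (pv, pa), (tv, ta) in zip(predictions, targets):
        total += (pv - tv) ** 2 + (pa - ta) ** 2
    return total / (2 * len(predictions))


def clamp_va(pair, low=1.0, high=9.0):
    """Clip a raw (valence, arousal) prediction into the [low, high] score range."""
    return tuple(min(max(x, low), high) for x in pair)


# Made-up predictions and gold scores, for illustration only.
preds = [(7.0, 6.0), (2.0, 9.5)]
golds = [(8.0, 6.0), (2.0, 8.5)]
print(va_mse(preds, golds))
print(clamp_va((2.0, 9.5)))
```

Clamping at inference time keeps an unbounded regression output inside the valid score range without changing the training loss.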

### Expected Output:
- Trained model checkpoint
- Training logs with loss curves
- Evaluation metrics

### Next Steps:
1. Download trained model
2. Run inference on test data
3. Generate submission file
4. Submit to DimABSA competition

### Files to Download:
```
savemodels/dimabsa_combined/best_model.pt
log/dimabsa_combined/train_*.log
log/dimabsa_combined/results_*.json
```