# Analyzing and Mitigating Dataset Artifacts in NLI

**Project:** Final Project - CS388  
**Dataset:** SNLI (Stanford Natural Language Inference)  
**Model:** ELECTRA-small  
**Goal:** Detect and mitigate dataset artifacts using hypothesis-only baselines and ensemble debiasing

## Project Structure
- **Part 1: Analysis** - Detect artifacts and analyze model errors
- **Part 2: Fix** - Implement and evaluate debiasing method


## Setup and Installation


In [None]:
# Connecting using personal token

import os
from google.colab import userdata

os.environ['gituser'] = userdata.get('gituser')
os.environ['gitpw'] = userdata.get('gitpw')
os.environ['REPO'] = 'fp-dataset-artifacts'

!git clone https://$gituser:$gitpw@github.com/$gituser/$REPO.git

In [None]:
# Install required packages
%pip install -q -r requirements.txt


## Part 1: Analysis

### Part 1.1: Baseline Model Training

Train a standard NLI model on SNLI dataset using both premise and hypothesis.


In [None]:
!python train/run.py --do_train --do_eval --task nli --dataset snli --model google/electra-small-discriminator --output_dir ./outputs/evaluations/baseline_100k/ --max_train_samples 100000 --num_train_epochs 3 --per_device_train_batch_size 32 --per_device_eval_batch_size 32 --max_length 128 --learning_rate 2e-5


In [None]:
# Check baseline results
with open(os.path.join(PROJECT_ROOT, 'outputs', 'evaluations', 'baseline_100k', 'eval_metrics.json'), 'r') as f:
    baseline_metrics = json.load(f)

print("=" * 80)
print("Baseline Model Results")
print("=" * 80)
print(f"Accuracy: {baseline_metrics['eval_accuracy']:.4f} ({baseline_metrics['eval_accuracy']*100:.2f}%)")
print(f"Eval Loss: {baseline_metrics.get('eval_loss', 'N/A')}")


### Part 1.2: Artifact Detection - Hypothesis-Only Model

Train a model that only sees the hypothesis (not the premise) to detect dataset artifacts.  
If this model achieves >33.33% accuracy (random baseline), it indicates strong artifacts exist.


In [None]:
!python train/train_hypothesis_only.py


In [None]:
# Check hypothesis-only results
with open(os.path.join(PROJECT_ROOT, 'outputs', 'evaluations', 'hypothesis_only_model', 'eval_metrics.json'), 'r') as f:
    hyp_metrics = json.load(f)

hyp_accuracy = hyp_metrics['eval_accuracy']
random_baseline = 1.0 / 3.0
above_random = hyp_accuracy - random_baseline

print("=" * 80)
print("Hypothesis-Only Model Results (Artifact Detection)")
print("=" * 80)
print(f"Accuracy: {hyp_accuracy:.4f} ({hyp_accuracy*100:.2f}%)")
print(f"Random Baseline: {random_baseline:.4f} ({random_baseline*100:.2f}%)")
print(f"Above Random: {above_random:.4f} ({above_random*100:.2f}%)")
print(f"\n{'STRONG ARTIFACTS DETECTED!' if above_random > 0.2 else 'Weak artifacts detected' if above_random > 0.1 else 'No significant artifacts'}")


### Part 1.3: Baseline Error Analysis

Analyze the baseline model's errors, confusion patterns, and identify artifact-related mistakes.


In [None]:
!python analyze/error_analysis.py


### Part 1.4: Visualizations - Baseline Model

Create visualizations to show error patterns and confusion matrices.


In [None]:
!python analyze/visualize_baseline.py


## Part 2: Fix - Debiasing Implementation

### Part 2.1: Train Debiased Model

Train a debiased model using confidence-based reweighting.  
Examples where the hypothesis-only model is confident (likely artifacts) are downweighted.


In [None]:
!python train/train_debiased.py


In [None]:
# Check debiased results
import json
with open(os.path.join(PROJECT_ROOT, 'outputs', 'evaluations', 'debiased_model', 'eval_metrics.json'), 'r') as f:
    debiased_metrics = json.load(f)

print("=" * 80)
print("Debiased Model Results")
print("=" * 80)
print(f"Accuracy: {debiased_metrics['eval_accuracy']:.4f} ({debiased_metrics['eval_accuracy']*100:.2f}%)")
print(f"Eval Loss: {debiased_metrics.get('eval_loss', 'N/A')}")


In [None]:
### Part 2.2: Results Comparison and Analysis

Compare baseline vs debiased model performance.


In [None]:
!python analyze/compare_results.py


### Part 2.3: Visualizations - Comparison

Create visualizations comparing baseline and debiased models.


In [1]:
!python analyze/visualize_comparison.py


python3: can't open file '/content/analyze/visualize_comparison.py': [Errno 2] No such file or directory


In [None]:
!python analyze/show_fixes.py


python3: can't open file '/content/analyze/show_fixes.py': [Errno 2] No such file or directory


## Summary

### Key Results:
- **Hypothesis-Only**: 60.80% (proves strong artifacts exist - 27.47% above random)
- **Baseline**: 86.54% (standard model performance)
- **Debiased**: 86.42% (maintains performance while reducing artifact dependence)

### Conclusions:
1. **Strong artifacts detected** in SNLI dataset
2. **Debiasing method works** - maintains overall accuracy
3. **Framework provides** quantitative artifact detection and mitigation

### Next Steps:
- Use these results for paper writing
- Reference `ANALYSIS_RESULTS.md` and `PAPER_OUTLINE.md` for detailed analysis
- All results saved in `outputs/evaluations/` directory
