# 07 - Real Dataset Walkthrough (D4RL)

This notebook shows how to load a D4RL dataset and map it to the CRL data
contract. D4RL datasets do **not** include behavior propensities, so
importance-sampling (IS) and doubly robust (DR) estimators need an estimated
logging policy; without one, use a model-based estimator instead.

If D4RL is not installed, we fall back to a synthetic example to keep the
notebook executable.
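
To make the propensity requirement concrete, here is a minimal sketch of a plain importance-sampling estimate in a bandit setting. It uses only NumPy; `importance_sampling_value` and the toy arrays are hypothetical and are not part of the `crl` API. The point is that `behavior_probs` appears in the denominator of every weight, so the estimate cannot be formed from logs that omit it.

```python
import numpy as np


def importance_sampling_value(rewards, target_probs, behavior_probs):
    """Plain IS estimate of a target policy's value (hypothetical helper).

    Each logged reward is reweighted by pi_target(a|x) / pi_behavior(a|x).
    """
    rewards = np.asarray(rewards, dtype=float)
    target_probs = np.asarray(target_probs, dtype=float)
    behavior_probs = np.asarray(behavior_probs, dtype=float)
    if np.any(behavior_probs <= 0.0):
        raise ValueError("behavior propensities must be strictly positive")
    weights = target_probs / behavior_probs
    return float(np.mean(weights * rewards))


# Toy logged data: four samples, each logged action chosen with probability 0.5.
rewards = np.array([1.0, 0.0, 1.0, 1.0])
behavior_probs = np.array([0.5, 0.5, 0.5, 0.5])
target_probs = np.array([0.8, 0.2, 0.8, 0.8])

is_value = importance_sampling_value(rewards, target_probs, behavior_probs)
print(is_value)
```

With D4RL logs, `behavior_probs` would have to come from a fitted model of the logging policy rather than from the dataset itself.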

## Setup

```
pip install "causalrl[d4rl]"
```

In [1]:
from __future__ import annotations

from pathlib import Path

import numpy as np

from crl.ope import evaluate
from crl.utils.seeding import set_seed

In [2]:
set_seed(0)
np.random.seed(0)

## Load D4RL

In [3]:
dataset = None
report = None
try:
    from crl.adapters.d4rl import load_d4rl_dataset

    dataset = load_d4rl_dataset("hopper-medium-v2")
    dataset.describe()
except Exception as exc:
    print("D4RL unavailable; falling back to synthetic data:", exc)

D4RL unavailable; falling back to synthetic data: d4rl and gym are required for D4RL dataset loading.


## Load RL Unplugged (optional)

RL Unplugged datasets are accessed via tensorflow-datasets. We keep this
example small and optional so the notebook still runs without TFDS.

In [None]:
rlu_dataset = None
try:
    from crl.adapters.rl_unplugged import load_rl_unplugged_dataset

    rlu_dataset = load_rl_unplugged_dataset(
        dataset_name="d4rl_mujoco_hopper/v2-medium",
        split="train",
        max_episodes=1,
        return_type="transition",
    )
    rlu_dataset.describe()
except Exception as exc:
    print("RL Unplugged unavailable; skipping:", exc)

## Fallback: synthetic dataset for report demo

We still generate a report artifact so reviewers can see the pipeline end-to-end.

In [4]:
if dataset is None:
    from crl.benchmarks.bandit_synth import SyntheticBandit, SyntheticBanditConfig

    benchmark = SyntheticBandit(SyntheticBanditConfig(seed=0))
    dataset = benchmark.sample(num_samples=1_000, seed=1)
    report = evaluate(dataset=dataset, policy=benchmark.target_policy)
    report.summary_table()
else:
    print(
        "D4RL dataset loaded. OPE estimators requiring propensities are not applicable."
    )

## Report snapshot

When using the synthetic fallback, we show a quick estimator comparison.

In [None]:
if report is not None:
    summary = report.summary_table()
    print(
        summary[["estimator", "value", "lower_bound", "upper_bound"]]
        .round(3)
        .to_string(index=False)
    )
    summary

In [None]:
if report is not None:
    fig = report.plot_estimator_comparison()
    fig

In [None]:
if report is not None and dataset.behavior_action_probs is not None:
    weights = (
        benchmark.target_policy.action_prob(dataset.contexts, dataset.actions)
        / dataset.behavior_action_probs
    )
    fig_w = report.plot_importance_weights(weights, logy=True)
    fig_w
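
Alongside the weight plot, a single-number summary can help judge how concentrated the importance weights are. The sketch below computes the Kish effective sample size, $(\sum_i w_i)^2 / \sum_i w_i^2$, with NumPy only. `effective_sample_size` is a hypothetical helper written for this example, not a `crl` function, and the weight arrays are made up.

```python
import numpy as np


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2). Hypothetical helper."""
    weights = np.asarray(weights, dtype=float)
    return float(weights.sum() ** 2 / np.sum(weights ** 2))


# Uniform weights: every sample counts equally, so ESS equals the sample count.
uniform_ess = effective_sample_size(np.ones(100))

# One dominant weight: the estimate effectively rests on a handful of samples.
skewed_ess = effective_sample_size(np.array([10.0] + [0.1] * 99))

print(uniform_ess, skewed_ess)
```

An effective sample size far below the number of logged samples suggests that IS-style estimates will have high variance.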

## Save HTML report artifact

In [5]:
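# Only the synthetic fallback produces a report; guard on it rather than on the
# dataset so we never call save_html on None.
if report is not None:
    output_dir = Path("docs/assets/reports")
    output_dir.mkdir(parents=True, exist_ok=True)
    report_path = output_dir / "d4rl_report.html"
    try:
        report.save_html(str(report_path))
        print("Saved report to", report_path)
    except Exception as exc:
        print("Report generation skipped:", exc)
else:
    print("No report to save: no OPE report was generated for the D4RL dataset.")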

## What went wrong (and how to fix it)

- D4RL logs do not include behavior propensities.
- IS/DR estimators require propensities or an estimated logging policy.
- Estimate the behavior policy (if the action space is discrete) or use
  model-based OPE until logged propensities are available.
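
As a rough illustration of behavior estimation in the discrete case, the sketch below fits a count-based estimate of the logging policy with Laplace smoothing. It assumes contexts and actions are already encoded as small integer indices, which is a simplification: continuous-control data such as hopper would need a density model instead. `estimate_behavior_probs` and the toy logs are hypothetical and not part of `crl`.

```python
import numpy as np


def estimate_behavior_probs(contexts, actions, num_contexts, num_actions, smoothing=1.0):
    """Count-based estimate of pi_behavior(a | x) for integer-coded data.

    Returns the estimated propensity of each logged action and the full
    (num_contexts, num_actions) policy table. Hypothetical helper.
    """
    contexts = np.asarray(contexts, dtype=int)
    actions = np.asarray(actions, dtype=int)
    # Start every cell at `smoothing` so unseen actions keep non-zero probability.
    counts = np.full((num_contexts, num_actions), smoothing, dtype=float)
    # np.add.at accumulates correctly when (context, action) pairs repeat.
    np.add.at(counts, (contexts, actions), 1.0)
    policy = counts / counts.sum(axis=1, keepdims=True)
    return policy[contexts, actions], policy


# Toy logs: two contexts, two actions.
contexts = np.array([0, 0, 0, 1, 1])
actions = np.array([0, 0, 1, 1, 1])

logged_probs, policy_table = estimate_behavior_probs(
    contexts, actions, num_contexts=2, num_actions=2
)
print(policy_table)
print(logged_probs)
```

The estimated propensities can then stand in for the missing logged ones, with the caveat that any error in the fitted policy carries over into IS/DR estimates.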