# 05 - Advanced DR Family (WDR, MAGIC, MRDR)

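This notebook compares four estimators from the doubly robust (DR) family for
off-policy evaluation: DR, weighted DR (WDR), MAGIC, and more robust DR (MRDR).
Each blends model-based and importance-weighted signals to reduce variance.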
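
To make the blend of model-based and importance-weighted signals concrete, here is a minimal NumPy sketch of the standard per-step DR recursion, `V_t = v_hat_t + ratio_t * (r_t + discount * V_{t+1} - q_hat_t)`. It is an illustration only, not the `crl` implementation: the function name `dr_value` and the array layout are assumptions made for this example.

```python
import numpy as np


def dr_value(rewards, ratios, q_hat, v_hat, discount=1.0):
    """Per-trajectory doubly robust estimates via a backward recursion.

    All inputs have shape (n_trajectories, horizon) (assumed layout):
      ratios[i, t] = pi_target(a_t | s_t) / pi_behavior(a_t | s_t)
      q_hat[i, t]  = model estimate of Q(s_t, a_t)
      v_hat[i, t]  = model estimate of V(s_t) under the target policy
    """
    n, horizon = rewards.shape
    v_dr = np.zeros(n)  # value after the final step is zero
    for t in reversed(range(horizon)):
        # Model baseline plus an importance-weighted correction of its error.
        correction = rewards[:, t] + discount * v_dr - q_hat[:, t]
        v_dr = v_hat[:, t] + ratios[:, t] * correction
    return v_dr


rng = np.random.default_rng(0)
rewards = rng.normal(size=(4, 3))
zeros = np.zeros_like(rewards)

# With a zero model and unit ratios, DR collapses to the plain return.
on_policy = dr_value(rewards, np.ones_like(rewards), zeros, zeros)

# With zero ratios, only the model's initial-state value survives.
v_hat = rng.normal(size=(4, 3))
model_only = dr_value(rewards, zeros, zeros, v_hat)
```

The two limiting cases show the trade-off: the importance ratios carry the unbiased correction (and the variance), while the model terms act as a baseline.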

## Setup

```
pip install "causalrl[plots]"
```

In [1]:
from __future__ import annotations

import numpy as np

from crl.assumptions import AssumptionSet
from crl.assumptions_catalog import MARKOV, OVERLAP, SEQUENTIAL_IGNORABILITY
from crl.benchmarks.mdp_synth import SyntheticMDP, SyntheticMDPConfig
from crl.estimands.policy_value import PolicyValueEstimand
from crl.estimators.dr import DRCrossFitConfig, DoublyRobustEstimator
from crl.estimators.magic import MAGICConfig, MAGICEstimator
from crl.estimators.mrdr import MRDRConfig, MRDREstimator
from crl.estimators.wdr import WDRConfig, WeightedDoublyRobustEstimator
from crl.ope import evaluate
from crl.utils.seeding import set_seed

In [2]:
set_seed(0)
np.random.seed(0)

## Run estimators

In [3]:
benchmark = SyntheticMDP(SyntheticMDPConfig(seed=0, horizon=5))
dataset = benchmark.sample(num_trajectories=200, seed=1)
true_value = benchmark.true_policy_value(benchmark.target_policy)

report = evaluate(
    dataset=dataset,
    policy=benchmark.target_policy,
    estimators=["dr", "wdr", "magic", "mrdr"],
)
report.summary_table()

|  | value | stderr | ci | diagnostics | assumptions_checked | assumptions_flagged | warnings | metadata | lower_bound | upper_bound | estimator |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -85.13446 | 87.018072 | (-255.68988233413336, 85.42096150884583) | {'overlap': {'min_behavior_prob': 0.0140824109... | [sequential_ignorability, overlap, markov] | [] | [Effective sample size ratio below threshold; ... | {'estimator': 'DR', 'config': {'num_folds': 2,... | -255.689882 | 85.420962 | DR |
| 1 | 0.013405 | 0.00625 | (0.0011540526255607413, 0.025655608868344726) | {'overlap': {'min_behavior_prob': 0.0140824109... | [sequential_ignorability, overlap, markov] | [] | [Effective sample size ratio below threshold; ... | {'estimator': 'WDR', 'config': {'num_folds': 2... | 0.001154 | 0.025656 | WDR |
| 2 | 1.195299 | 0.034648 | (1.127389973841737, 1.2632090232610396) | {'overlap': {'min_behavior_prob': 0.0140824109... | [sequential_ignorability, overlap, markov] | [] | [Effective sample size ratio below threshold; ... | {'estimator': 'MAGIC', 'weights': [0.945461714... | 1.12739 | 1.263209 | MAGIC |
| 3 | -85.250828 | 87.134798 | (-256.0350326585504, 85.53337656687883) | {'overlap': {'min_behavior_prob': 0.0140824109... | [sequential_ignorability, overlap, markov] | [] | [Effective sample size ratio below threshold; ... | {'estimator': 'MRDR', 'config': {'num_folds': ... | -256.035033 | 85.533377 | MRDR |
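
The reported intervals appear consistent with a normal approximation, `value ± 1.96 * stderr`. That the library computes them this way is an inference from the printed numbers, not something the notebook states. The sketch below recomputes the bounds from the rounded `value` and `stderr` columns, so small rounding differences are expected; `normal_ci` is a helper introduced for this example.

```python
# (value, stderr) pairs copied from the summary table.
rows = {
    "DR": (-85.13446, 87.018072),
    "WDR": (0.013405, 0.00625),
    "MAGIC": (1.195299, 0.034648),
    "MRDR": (-85.250828, 87.134798),
}
Z_95 = 1.96  # assumed multiplier for a two-sided 95% normal interval


def normal_ci(value, stderr, z=Z_95):
    """Return (lower, upper) for value +/- z * stderr."""
    return value - z * stderr, value + z * stderr


intervals = {name: normal_ci(v, se) for name, (v, se) in rows.items()}
for name, (low, high) in intervals.items():
    print(f"{name:>5}: [{low:.4f}, {high:.4f}]  width={high - low:.4f}")
```

Comparing widths makes the variance gap visible: the DR and MRDR intervals span hundreds of reward units, while the WDR and MAGIC intervals are well under one unit wide.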


## Custom DR-family configuration

Each estimator exposes a config object for cross-fitting, ridge strengths, and
mixing parameters. We instantiate them directly to show the controls.

In [None]:
estimand = PolicyValueEstimand(
    policy=benchmark.target_policy,
    discount=dataset.discount,
    horizon=dataset.horizon,
    assumptions=AssumptionSet([SEQUENTIAL_IGNORABILITY, OVERLAP, MARKOV]),
)

dr_config = DRCrossFitConfig(num_folds=3, num_iterations=3, ridge=5e-3, seed=0)
wdr_config = WDRConfig(num_folds=3, num_iterations=3, ridge=5e-3, seed=0)
magic_config = MAGICConfig(num_iterations=4, ridge=1e-3, mixing_horizons=(0, 2, 4))
mrdr_config = MRDRConfig(num_folds=3, num_iterations=3, ridge=5e-3, seed=0)

custom_estimators = [
    DoublyRobustEstimator(estimand, config=dr_config),
    WeightedDoublyRobustEstimator(estimand, config=wdr_config),
    MAGICEstimator(estimand, config=magic_config),
    MRDREstimator(estimand, config=mrdr_config),
]

rows = []
for estimator in custom_estimators:
    report = estimator.estimate(dataset)
    rows.append(
        {
            "estimator": report.metadata["estimator"],
            "value": report.value,
            "stderr": report.stderr,
            "ci": report.ci,
        }
    )

rows
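
The `num_folds` and `ridge` settings suggest cross-fitted ridge regression for the model component. As a rough picture of what such controls usually mean, the NumPy sketch below fits a closed-form ridge model on all folds but one and predicts on the held-out fold. This is a generic illustration, not the code behind `DRCrossFitConfig`; `ridge_fit` and `cross_fit_predictions` are hypothetical helpers.

```python
import numpy as np


def ridge_fit(features, targets, ridge):
    """Closed-form ridge regression: solve (X'X + ridge * I) w = X'y."""
    dim = features.shape[1]
    gram = features.T @ features + ridge * np.eye(dim)
    return np.linalg.solve(gram, features.T @ targets)


def cross_fit_predictions(features, targets, num_folds=3, ridge=5e-3, seed=0):
    """Predict every row with a model trained only on the other folds."""
    n = features.shape[0]
    rng = np.random.default_rng(seed)
    fold_ids = rng.permutation(n) % num_folds  # balanced random folds
    predictions = np.empty(n)
    for fold in range(num_folds):
        held_out = fold_ids == fold
        coef = ridge_fit(features[~held_out], targets[~held_out], ridge)
        predictions[held_out] = features[held_out] @ coef
    return predictions, fold_ids


# Synthetic regression problem, for illustration only.
rng = np.random.default_rng(1)
features = rng.normal(size=(60, 4))
true_coef = np.array([1.0, -2.0, 0.5, 0.0])
targets = features @ true_coef + 0.1 * rng.normal(size=60)

predictions, fold_ids = cross_fit_predictions(features, targets)
mse = float(np.mean((predictions - targets) ** 2))
```

Because no row is predicted by a model that saw it during training, the model error stays independent of the sample it corrects, which is the usual motivation for cross-fitting.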

## Diagnostics

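Advanced DR estimators report weight normalization behavior and per-step ESS
to help you understand variance tradeoffs. In the output below, the ESS is about
1.01 out of 200 trajectories (`ess_ratio` ≈ 0.005) and the largest weight is
roughly 13,585, so a single trajectory dominates the importance-weighted terms.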

In [4]:
report.diagnostics

{'overlap': {'min_behavior_prob': 0.014082410934770993,
  'min_target_prob': 0.028531739031223108,
  'fraction_behavior_below_threshold': 0.0,
  'fraction_target_below_threshold': 0.0,
  'ratio_min': 0.046320675647481456,
  'ratio_max': 38.73110147278951,
  'ratio_q50': 0.7088565782300077,
  'ratio_q90': 2.443063185546507,
  'ratio_q99': 7.510195773254278,
  'support_violations': 0},
 'ess': {'ess': 1.0123257289260863, 'ess_ratio': 0.005061628644630432},
 'weights': {'min': 8.867227348146143e-06,
  'max': 13584.63027719639,
  'mean': 68.34054500690395,
  'std': 958.1453329642768,
  'q95': 1.7178894396174047,
  'q99': 8.839043479963573,
  'skew': 14.035805359837088,
  'kurtosis': 195.00423352742501,
  'tail_fraction': 0.005},
 'max_weight': 13584.63027719639,
 'model': {},
 'shift': {'mmd_rbf': 0.0012203607446121811,
  'mean_shift_norm': 0.06361967577728,
  'cov_shift_fro': 0.13102568876799836,
  'ess': 127.20585400425024}}
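
The top-level `ess` value can be cross-checked against the weight statistics in the same dictionary. Assuming `ess` is the Kish effective sample size, `(sum w)^2 / sum w^2`, and that `std` is the population standard deviation over the 200 trajectories, the sketch below rebuilds it from `mean` and `std` alone. Both assumptions are inferred from the numbers rather than documented here.

```python
num_trajectories = 200            # from benchmark.sample(num_trajectories=200, ...)
weight_mean = 68.34054500690395   # diagnostics["weights"]["mean"]
weight_std = 958.1453329642768    # diagnostics["weights"]["std"]
weight_max = 13584.63027719639    # diagnostics["max_weight"]

# Kish ESS = (sum w)^2 / sum w^2. With population moments:
#   sum w   = n * mean
#   sum w^2 = n * (std^2 + mean^2)
ess = num_trajectories * weight_mean**2 / (weight_mean**2 + weight_std**2)
ess_ratio = ess / num_trajectories

# Share of the total weight carried by the single largest trajectory.
max_share = weight_max / (num_trajectories * weight_mean)

print(f"ess={ess:.4f}  ess_ratio={ess_ratio:.6f}  max_share={max_share:.4f}")
```

The recomputed values closely match the reported `ess` and `ess_ratio`, and the largest weight accounts for more than 99% of the total, which is why the unnormalized estimators show such large standard errors.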

## Takeaways

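- WDR normalizes per-step weights to reduce variance; in the run above its standard error is 0.006, versus 87.0 for plain DR.
- MAGIC blends truncated estimators based on variance estimates; here it reports a standard error of 0.035.
- MRDR trains the model component to minimize DR variance; with an ESS ratio near 0.005, its estimate (value -85.25, standard error 87.13) stays close to DR's in this run.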
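
The first takeaway can be sketched in a few lines of NumPy. WDR-style self-normalization divides each trajectory's cumulative importance weight at step `t` by the sum of those weights across trajectories, so every per-step weight lies in `[0, 1]`. The helper name `normalized_cumulative_weights` and the array layout are assumptions for this example, not part of `crl`.

```python
import numpy as np


def normalized_cumulative_weights(ratios):
    """Self-normalize cumulative importance weights at every time step.

    ratios has shape (n_trajectories, horizon) and holds per-step
    target/behavior probability ratios (assumed layout).
    """
    cumulative = np.cumprod(ratios, axis=1)          # w_{i,t} = prod of ratios up to t
    totals = cumulative.sum(axis=0, keepdims=True)   # sum over trajectories, per step
    normalized = np.zeros_like(cumulative)
    # out= gives masked entries a defined value (0) when a column total is 0.
    np.divide(cumulative, totals, out=normalized, where=totals > 0)
    return normalized


rng = np.random.default_rng(0)
ratios = rng.lognormal(mean=0.0, sigma=1.0, size=(200, 5))  # heavy-tailed toy ratios
weights = normalized_cumulative_weights(ratios)

raw_max = float(np.cumprod(ratios, axis=1).max())
norm_max = float(weights.max())
```

After normalization no single trajectory can contribute more than the whole sample's weight at a step, which bounds the estimator's range at the cost of a small bias.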