# Univariate Drift Analysis with tab-right
This notebook demonstrates how to perform univariate drift analysis using the `tab-right` package.
We use the UCI Adult dataset (census income) from OpenML, as in the classification example.

In [None]:
# Install dependencies if needed
# !pip install pandas scikit-learn tab-right

In [None]:
from sklearn.datasets import fetch_openml

from tab_right.drift.univariate import detect_univariate_drift_df
from tab_right.plotting.plotly import plot_drift

## Load Example Dataset
We'll use the UCI Adult dataset (census income) from OpenML.

In [None]:
data = fetch_openml("adult", version=2, as_frame=True)
df = data.frame.copy()
df = df.sample(frac=1, random_state=42).reset_index(drop=True)  # Shuffle
df = df.dropna()  # Drop missing for simplicity
df["target"] = (df["class"] == ">50K").astype(int)
df = df.drop(columns=["class"])
df.head()

## Split Data: Reference vs. Current
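The Adult dataset has no timestamp column, so we mimic a temporal split by row position: the first 70% of rows form the reference set and the last 30% form the current set. Because the rows were shuffled above, both sets are random samples of the same population, so the drift scores below should be small.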

In [None]:
split_idx = int(0.7 * len(df))
df_ref = df.iloc[:split_idx].reset_index(drop=True)
df_cur = df.iloc[split_idx:].reset_index(drop=True)
print(f"Reference: {df_ref.shape}, Current: {df_cur.shape}")
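To make the effect of a positional split on shuffled data concrete, here is a minimal sketch in plain Python that runs without downloading the dataset. The `ages` list is synthetic stand-in data, and the +10 shift is a hypothetical perturbation added only for illustration; neither comes from the Adult dataset or from `tab-right`.

```python
import random
from statistics import mean

rng = random.Random(42)
ages = [rng.randint(18, 80) for _ in range(1000)]  # synthetic stand-in for df["age"]
rng.shuffle(ages)  # plays the role of df.sample(frac=1, ...)

# Same 70/30 positional split as in the cell above
split_idx = int(0.7 * len(ages))
ages_ref, ages_cur = ages[:split_idx], ages[split_idx:]

# Hypothetical perturbation: pretend the current population is 10 years older
ages_cur_shifted = [a + 10 for a in ages_cur]

print(f"reference mean:       {mean(ages_ref):.2f}")
print(f"current mean:         {mean(ages_cur):.2f}")
print(f"shifted current mean: {mean(ages_cur_shifted):.2f}")
```

The reference and current means should be close to each other, since both halves are random samples of the same list, while the shifted copy sits exactly 10 above the current mean. Applying a similar perturbation to a column of `df_cur` is one way to check that the drift analysis reacts as expected.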

## Univariate Drift Analysis
Let's compute drift for all features using the recommended metrics.

In [None]:
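drift_df = detect_univariate_drift_df(df_ref, df_cur)
# Print the 10 features with the largest drift values; a bare expression
# is only displayed when it is the last statement of a cell
print(drift_df.sort_values("value", ascending=False).head(10))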

# Plot drift values for all features
fig = plot_drift(drift_df)
fig.show()
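For intuition about what a univariate drift score measures, the sketch below computes two common distances by hand: the 1-D Wasserstein distance for a numeric feature and the total variation distance for a categorical one. This is an illustration under stated assumptions, not `tab-right`'s implementation; the notebook does not say which metrics `detect_univariate_drift_df` uses, and the helper names and toy values here are hypothetical.

```python
from collections import Counter


def wasserstein_1d(ref, cur):
    """1-D Wasserstein-1 distance between two equal-sized samples.

    For equal sample sizes this is the mean absolute gap between sorted values.
    """
    if len(ref) != len(cur):
        raise ValueError("this simplified version needs equal-sized samples")
    return sum(abs(a - b) for a, b in zip(sorted(ref), sorted(cur))) / len(ref)


def total_variation(ref, cur):
    """Total variation distance between two categorical samples.

    0 means identical category shares, 1 means no category in common.
    """
    p, q = Counter(ref), Counter(cur)
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p[c] / len(ref) - q[c] / len(cur)) for c in categories)


# Toy data, made up for illustration
ages_ref = [25, 32, 41, 58]
ages_cur = [30, 37, 46, 63]  # every value shifted by +5
jobs_ref = ["Private", "Private", "State-gov", "Self-emp"]
jobs_cur = ["Private", "State-gov", "State-gov", "Self-emp"]

print(wasserstein_1d(ages_ref, ages_cur))   # 5.0
print(total_variation(jobs_ref, jobs_cur))  # 0.25
```

The numeric distance is in the feature's own units (years here), while the categorical distance is bounded between 0 and 1, so scores from different metric types are not directly comparable.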

## Plot Drift for an Individual Feature
Let's visualize the drift for a single feature using the `plot_feature_drift` function.

In [None]:
from tab_right.plotting.plotly import plot_feature_drift

# Select a feature to visualize (e.g., 'age')
feature = "age"
fig = plot_feature_drift(df_ref[feature], df_cur[feature], feature_name=feature)
fig.show()
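A per-feature comparison can also be read as numbers. The sketch below bins a numeric feature on shared edges and prints the share of each sample per bin. It assumes that comparing the two distributions bin by bin is a useful companion to the plot; it does not reproduce what `plot_feature_drift` draws, and the helper name, bin edges, and toy values are hypothetical.

```python
def binned_proportions(values, edges):
    """Share of `values` in each [edges[i], edges[i+1]) bin; the last bin is closed.

    Values outside the edges are not counted, so shares may sum to less than 1.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


# Toy data, made up for illustration
edges = [20, 40, 60, 80]
ref_ages = [25, 32, 41, 58, 63, 39]
cur_ages = [45, 52, 61, 58, 70, 39]

ref_share = binned_proportions(ref_ages, edges)
cur_share = binned_proportions(cur_ages, edges)
for lo, hi, r, c in zip(edges, edges[1:], ref_share, cur_share):
    print(f"[{lo}, {hi}): ref={r:.2f} cur={c:.2f}")
```

Using the same edges for both samples matters: bins derived separately from each sample would make the shares incomparable.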