# ModSSC | Views

Build and inspect the multi-view representations used by some semi-supervised learning (SSL) methods.

## Objective
- Show the minimal steps to run this component in a notebook setting.
- Provide the exact objects to look at (outputs, shapes, metrics) to confirm it worked.

## Prerequisites
- Python 3.11+.
- `pip install modssc`.
- Which optional dependencies you need depends on the datasets and backends you use. If an import fails, install the matching extra and rerun.

## Outline
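1) Imports and configuration
2) Core run: load a dataset, define a preprocessing plan, build the split plan, and generate the views (steps 1 to 4 below)
3) Sanity checks and outputs: view shapes and the Co-Training integration idea (step 5 below)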



## Notebook notes

This notebook demonstrates **feature view generation** with `modssc.views`.

A *view* is a 2D feature matrix derived from a dataset. Several views of the same dataset are typically consumed together by classic multi-view SSL methods such as **Co-Training**.

> Augmentation-based multi-view training (weak/strong augmentation) belongs to `modssc.data_augmentation`.


## Imports and configuration



In [None]:
from modssc.data_loader import load_dataset
from modssc.preprocess.plan import PreprocessPlan, StepConfig
from modssc.views import generate_views, two_view_random_feature_split

## 1) Load a dataset

We'll use the built-in deterministic **toy** dataset so the notebook runs fast.

In [None]:
ds = load_dataset("toy")
print(ds.train.X.shape, ds.test.X.shape)

## 2) Define a preprocessing plan (optional)

The plan below ensures that the features are 2D and casts them to `float32`.

In [None]:
pre = PreprocessPlan(
    steps=(
        StepConfig(step_id="core.ensure_2d"),
        StepConfig(step_id="core.cast_dtype", params={"dtype": "float32"}),
    ),
    output_key="features.X",
)

## 3) Build a 2-view random feature split plan

- View A selects a random subset of columns.
- View B is the complement of View A.

This is a common setup for classic Co-Training when no natural multi-view feature sets exist.

In [None]:
plan = two_view_random_feature_split(preprocess=pre, fraction=0.5)
plan
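
To make the idea of a random feature split concrete, here is a minimal standard-library sketch of how two complementary column subsets could be drawn. It is an illustration only and is not how `modssc.views` implements the split: the helper names `random_two_view_split` and `take_columns` are hypothetical, and the assumption that `fraction` is the share of columns given to View A is ours.

```python
import random


def random_two_view_split(n_features, fraction=0.5, seed=0):
    """Return (cols_a, cols_b): a random column subset and its complement.

    Hypothetical helper for illustration; not part of modssc.
    """
    if n_features < 2:
        raise ValueError("need at least 2 features to build two views")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")

    rng = random.Random(seed)      # seeded, so the split is reproducible
    cols = list(range(n_features))
    rng.shuffle(cols)

    # Clamp so that both views keep at least one column.
    n_a = max(1, min(n_features - 1, round(n_features * fraction)))
    cols_a = sorted(cols[:n_a])    # View A: random subset of columns
    cols_b = sorted(cols[n_a:])    # View B: all remaining columns
    return cols_a, cols_b


def take_columns(X, cols):
    """Select the given columns from a row-major list-of-lists matrix."""
    return [[row[j] for j in cols] for row in X]


# Usage on a tiny 2 x 4 matrix.
X = [[0.0, 1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0, 7.0]]
cols_a, cols_b = random_two_view_split(n_features=4, fraction=0.5, seed=0)
view_a = take_columns(X, cols_a)
view_b = take_columns(X, cols_b)
print("A columns:", cols_a, "B columns:", cols_b)
```

Because View B is defined as the complement of View A, every original column ends up in exactly one of the two views.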

## 4) Generate the views

In [None]:
res = generate_views(ds, plan=plan, seed=0, cache=True)
res.columns

In [None]:
ds_a = res.views["view_a"]
ds_b = res.views["view_b"]

print("A train:", ds_a.train.X.shape, "test:", ds_a.test.X.shape)
print("B train:", ds_b.train.X.shape, "test:", ds_b.test.X.shape)
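
As a sanity check on the printed shapes, the sketch below verifies that two views partition the columns of a full matrix: each view keeps every row, and the column counts add up to the original width. `check_complementary_views` is a hypothetical helper, not a ModSSC function, and it assumes you compare against the shape of the matrix after preprocessing.

```python
def check_complementary_views(full_shape, a_shape, b_shape):
    """Raise ValueError unless two view shapes partition the full matrix.

    Hypothetical helper for illustration; all shapes are (n_rows, n_cols).
    """
    n_rows, n_cols = full_shape
    if a_shape[0] != n_rows or b_shape[0] != n_rows:
        raise ValueError("each view must keep every row of the full matrix")
    if a_shape[1] + b_shape[1] != n_cols:
        raise ValueError("view column counts must sum to the full width")
    return True


# Usage with literal shapes; in the notebook these would come from the
# shapes printed above and from the preprocessed feature matrix.
print(check_complementary_views((100, 8), (100, 4), (100, 4)))
```

The check only looks at shapes, so it cannot detect a column that appears in both views; it is meant as a quick first test rather than a full validation.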

## 5) Integration idea (Co-Training)

A Co-Training implementation can consume the two view datasets:

- `ds_a.train.X`, `ds_a.train.y`
- `ds_b.train.X`, `ds_b.train.y`

The algorithm can train two classifiers, one per view, and exchange pseudo-labeled samples between them.

> The exact Co-Training loop depends on the paper/variant, so the view brick intentionally stays independent.
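
For orientation, here is one deliberately simplified Co-Training variant, written with the standard library only. It is a sketch under stated assumptions, not ModSSC's implementation and not a faithful reproduction of any particular paper: each view uses a nearest-centroid classifier, confidence is the distance gap between the two closest centroids, and both classifiers add pseudo-labels to a shared label list. The names `fit_centroids`, `predict_with_margin`, and `co_train` are hypothetical.

```python
import math


def fit_centroids(X, y):
    """Return {label: centroid} computed from labeled rows."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_with_margin(centroids, row):
    """Return (label, margin): nearest centroid and gap to the runner-up."""
    dists = sorted((math.dist(row, c), label) for label, c in centroids.items())
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else 0.0
    return best_label, margin


def co_train(Xa, Xb, y, rounds=5, per_round=1):
    """Fill in labels marked None by alternating between the two views.

    In each round, the classifier of each view pseudo-labels its
    `per_round` most confident unlabeled rows; the other view's classifier
    then trains on the enlarged shared label list.
    """
    y = list(y)  # work on a copy; None marks an unlabeled row
    for _ in range(rounds):
        for X in (Xa, Xb):
            labeled = [i for i, t in enumerate(y) if t is not None]
            unlabeled = [i for i, t in enumerate(y) if t is None]
            if not unlabeled:
                return y
            centroids = fit_centroids([X[i] for i in labeled],
                                      [y[i] for i in labeled])
            scored = [(predict_with_margin(centroids, X[i]), i)
                      for i in unlabeled]
            scored.sort(key=lambda item: item[0][1], reverse=True)
            for (label, _margin), i in scored[:per_round]:
                y[i] = label
    return y


# Usage: two one-column views of six samples, two of them labeled.
Xa = [[0.0], [0.1], [1.0], [0.9], [0.2], [0.8]]
Xb = [[0.0], [0.2], [1.0], [1.1], [0.1], [0.9]]
y = [0, None, 1, None, None, None]
print(co_train(Xa, Xb, y))
```

In the notebook, the two input matrices would correspond to `ds_a.train.X` and `ds_b.train.X`. How unlabeled samples are marked in `train.y` is not specified here, so the `None` convention above is an assumption of this sketch.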


## Outputs

- The final code cells display `res.columns` and print the train and test shapes of both views.
- If something fails early, the error should point to a missing optional dependency.


## Next steps
- Explore the adjacent notebooks in this folder for the other pipeline components.
- If you hit an optional dependency error, install the suggested extra and rerun.
