## **Co-Training**

**Type:** Semi-Supervised Learning Technique

---

### **Definition**

Co-training is a semi-supervised learning method where **two (or more) models**, each trained on **different, independent “views” of the same data**, iteratively label unlabeled examples for each other.
A “view” means a distinct subset of features that can independently predict the target.

---

### **Key Idea**

It’s like two friends studying for an exam: each has different notes (views) on the same topics. They solve problems separately, then swap their confident answers so both can learn more.

---

### **Core Steps**

1. **Split Features into Two Views** — Example: webpage classification using *page content* vs *link text*.
2. **Train Two Independent Models** — Each model learns from its own view using the labeled data.
3. **Label Unlabeled Data** — Each model predicts labels for the unlabeled set.
4. **Exchange Confident Labels** — Model A gives its confident predictions to Model B, and vice versa.
5. **Retrain Models** — Both models train again with the newly labeled data from their partner.
6. **Repeat** until validation performance stops improving or no confident new samples remain.
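The steps above can be sketched as a small generic loop. This is only an illustration under stated assumptions: `co_train`, `fit`, `predict`, `k`, and `max_rounds` are names invented for this sketch, `fit` is assumed to take a list of `(features, label)` pairs and return a model, and `predict` is assumed to return a `(label, confidence)` pair.

```python
def co_train(labeled_a, labeled_b, pool, fit, predict, k=2, max_rounds=10):
    """Sketch of co-training with two views.

    labeled_a / labeled_b: lists of (features_in_that_view, label)
    pool: dict mapping example id -> (features_a, features_b)
    fit(data) -> model; predict(model, x) -> (label, confidence)
    """
    labeled_a, labeled_b = list(labeled_a), list(labeled_b)
    pool = dict(pool)
    # Step 2: train one model per view on the labeled data.
    model_a, model_b = fit(labeled_a), fit(labeled_b)

    for _ in range(max_rounds):
        if not pool:
            break  # nothing left to label
        # Step 3: each model scores the pool using only its own view.
        scored_a = {u: predict(model_a, xa) for u, (xa, _) in pool.items()}
        scored_b = {u: predict(model_b, xb) for u, (_, xb) in pool.items()}
        top_a = sorted(scored_a, key=lambda u: scored_a[u][1], reverse=True)[:k]
        top_b = sorted(scored_b, key=lambda u: scored_b[u][1], reverse=True)[:k]
        # Step 4: exchange. A's confident labels extend B's training set
        # (with B's features), and B's extend A's (with A's features).
        labeled_b += [(pool[u][1], scored_a[u][0]) for u in top_a]
        labeled_a += [(pool[u][0], scored_b[u][0]) for u in top_b]
        for u in set(top_a) | set(top_b):
            del pool[u]
        # Step 5: retrain both models on their enlarged training sets.
        model_a, model_b = fit(labeled_a), fit(labeled_b)

    return model_a, model_b
```

Here the loop stops when the pool is empty or after `max_rounds`; a confidence threshold or a validation check could replace the fixed top-`k` selection.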

---

### **Pros**

* Uses complementary information from different views.
* Can achieve strong performance with very few initial labels.

### **Cons**

* Requires views that are each sufficient to predict the label on their own and, ideally, conditionally independent given the label (not always possible).
* Wrong confident labels can mislead the partner model.

---

### **Real-Life Examples**

* **Webpage classification:** One model uses page text, another uses anchor text from inbound links.
* **Video analysis:** One model uses visual features, another uses audio features.
* **Medical data:** One model uses lab test results, another uses imaging data.


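**Toy setup for the walkthrough below:** each example has two numeric features, `a` and `b`, and each feature serves as one view.

* **View A** = feature `a`
* **View B** = feature `b`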

---

# Tiny Walkthrough (Two Views, One Exchange Round)

## 1) Labeled data (start small)

| ID | a (View A) | b (View B) |  y  |
| -: | :--------: | :--------: | :-: |
| L1 |     1.0    |     5.0    |  0  |
| L2 |     1.2    |     4.8    |  0  |
| L3 |     3.0    |     1.0    |  1  |
| L4 |     3.2    |     1.2    |  1  |

**Class centers (averages):**

* View A: class-0 mean = (1.0+1.2)/2 = **1.1**; class-1 mean = (3.0+3.2)/2 = **3.1** → boundary ≈ **2.1**
* View B: class-0 mean = (5.0+4.8)/2 = **4.9**; class-1 mean = (1.0+1.2)/2 = **1.1** → boundary ≈ **3.0**

**Simple rule per view:**

* **View A:** if `a < 2.1` ⇒ predict **0**, else **1**
* **View B:** if `b > 3.0` ⇒ predict **0**, else **1**

**Confidence (super simple):** farther from the boundary = more confident.
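The class-center rule and the distance-based confidence above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the helper names `fit_view` and `predict` are made up for this example, and the model is just a nearest-class-mean rule on a single feature.

```python
def fit_view(values, labels):
    """Return (mean of class 0, mean of class 1) for one view's feature."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    return sum(class0) / len(class0), sum(class1) / len(class1)


def predict(x, model):
    """Predict the class with the nearer mean (ties go to class 1).

    Confidence is the distance from the midpoint boundary.
    """
    mean0, mean1 = model
    boundary = (mean0 + mean1) / 2
    label = 0 if abs(x - mean0) < abs(x - mean1) else 1
    return label, abs(x - boundary)


y = [0, 0, 1, 1]                              # labels for L1..L4
model_a = fit_view([1.0, 1.2, 3.0, 3.2], y)   # View A uses feature a only
model_b = fit_view([5.0, 4.8, 1.0, 1.2], y)   # View B uses feature b only

boundary_a = sum(model_a) / 2                 # about 2.1
boundary_b = sum(model_b) / 2                 # about 3.0
```

Choosing the nearer class mean gives the same predictions as the two threshold rules, without hard-coding which side of the boundary belongs to which class.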

---

## 2) Unlabeled pool

| ID |  a  |  b  |
| -: | :-: | :-: |
| U1 | 0.9 | 5.1 |
| U2 | 1.1 | 4.9 |
| U3 | 2.9 | 1.1 |
| U4 | 3.3 | 0.8 |

---

## 3) Each model predicts (on its **own** view)

### View A (boundary = 2.1)

* U1: a=0.9 → **0**, distance from boundary = |0.9−2.1|= **1.2** (very confident)
* U2: a=1.1 → **0**, distance = **1.0**
* U3: a=2.9 → **1**, distance = **0.8**
* U4: a=3.3 → **1**, distance = **1.2** (very confident)

**A’s most confident (pick 2):** U1→0 and U4→1.

### View B (boundary = 3.0)

* U1: b=5.1 → **0**, distance = |5.1−3.0|= **2.1** (very confident)
* U2: b=4.9 → **0**, distance = **1.9**
* U3: b=1.1 → **1**, distance = **1.9**
* U4: b=0.8 → **1**, distance = **2.2** (very confident)

**B’s most confident (pick 2):** U4→1 and U1→0.
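The predictions and "pick 2" selections above can be reproduced with a short script. It is a sketch that hard-codes the two boundaries from step 1; the function names `predict_a`, `predict_b`, and `top_k` are hypothetical helpers for this example.

```python
# Unlabeled pool: id -> (a, b)
pool = {"U1": (0.9, 5.1), "U2": (1.1, 4.9), "U3": (2.9, 1.1), "U4": (3.3, 0.8)}


def predict_a(a, boundary=2.1):
    """View A rule: a < boundary => class 0, else class 1."""
    return (0 if a < boundary else 1), abs(a - boundary)


def predict_b(b, boundary=3.0):
    """View B rule: b > boundary => class 0, else class 1."""
    return (0 if b > boundary else 1), abs(b - boundary)


def top_k(predict, view_index, k=2):
    """Return {id: label} for the k predictions farthest from the boundary."""
    scored = [(uid, *predict(feats[view_index])) for uid, feats in pool.items()]
    scored.sort(key=lambda item: item[2], reverse=True)
    return {uid: label for uid, label, _ in scored[:k]}


picks_a = top_k(predict_a, view_index=0)   # U1 -> 0 and U4 -> 1
picks_b = top_k(predict_b, view_index=1)   # U4 -> 1 and U1 -> 0
```

Ranking by raw distance works here because both classes are represented among the top picks; on skewed data one might instead pick a fixed number of examples per class.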

---

## 4) Exchange confident labels

* **A → B**: add **U1=0**, **U4=1** to B’s training set (B uses only `b`).
* **B → A**: add **U4=1**, **U1=0** to A’s training set (A uses only `a`).

(Both models happened to pick the same two examples with the same labels, which is fine. In general the picks can differ.)

---

## 5) Retrain with the new labeled points

### View A (use `a` only)

* Class 0 `a`: {1.0, 1.2, **0.9**} → new mean = (3.1)/3 = **1.033**
* Class 1 `a`: {3.0, 3.2, **3.3**} → new mean = (9.5)/3 = **3.167**
* New boundary ≈ (1.033 + 3.167)/2 = **2.10** (unchanged: the two class means moved by equal and opposite amounts)

### View B (use `b` only)

* Class 0 `b`: {5.0, 4.8, **5.1**} → new mean = 14.9/3 = **4.967**
* Class 1 `b`: {1.0, 1.2, **0.8**} → new mean = 3.0/3 = **1.000**
* New boundary ≈ (4.967 + 1.000)/2 = **2.98** (moved slightly)

**What improved?**
Each model nudged its class centers in a direction supported by the **other view’s** confident labels—this is the essence of co-training.
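The retraining arithmetic in this step can be checked with a few lines of Python. This is only a sketch of the toy example: the variable names are invented here, and "retraining" simply means recomputing class means after appending the exchanged points.

```python
def mean(values):
    return sum(values) / len(values)


def boundary(by_class):
    """Midpoint between the class-0 mean and the class-1 mean."""
    return (mean(by_class[0]) + mean(by_class[1])) / 2


# Original labeled values per class, one dict per view.
a_by_class = {0: [1.0, 1.2], 1: [3.0, 3.2]}
b_by_class = {0: [5.0, 4.8], 1: [1.0, 1.2]}

# Exchanged examples: id -> (a, b), plus the label supplied by the partner.
exchanged_features = {"U1": (0.9, 5.1), "U4": (3.3, 0.8)}
exchanged_labels = {"U1": 0, "U4": 1}   # identical in both directions here

for uid, label in exchanged_labels.items():
    a, b = exchanged_features[uid]
    a_by_class[label].append(a)   # A trains on feature a only
    b_by_class[label].append(b)   # B trains on feature b only

new_boundary_a = boundary(a_by_class)   # about 2.100
new_boundary_b = boundary(b_by_class)   # about 2.983
```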

---

## 6) Next round (optional)

Re-run on remaining unlabeled points (U2, U3), pick confident ones again, exchange, retrain… stop when nothing new is confidently labeled or validation stops improving.
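A second round with a stopping check might look like the sketch below. It assumes the retrained boundaries from step 5 and swaps the "pick 2" rule for a confidence cutoff; `MIN_CONFIDENCE` and its value of 0.5 are hypothetical choices made for this illustration, not part of the walkthrough.

```python
remaining = {"U2": (1.1, 4.9), "U3": (2.9, 1.1)}   # id -> (a, b)

BOUNDARY_A = 2.1                     # View A boundary after retraining
BOUNDARY_B = (14.9 / 3 + 1.0) / 2    # View B boundary after retraining, about 2.98
MIN_CONFIDENCE = 0.5                 # hypothetical cutoff for "confident"


def predict_a(a):
    return (0 if a < BOUNDARY_A else 1), abs(a - BOUNDARY_A)


def predict_b(b):
    return (0 if b > BOUNDARY_B else 1), abs(b - BOUNDARY_B)


def confident(predict, view_index, pool):
    """Return {id: label} for predictions at or above the cutoff."""
    picks = {}
    for uid, feats in pool.items():
        label, confidence = predict(feats[view_index])
        if confidence >= MIN_CONFIDENCE:
            picks[uid] = label
    return picks


picks_a = confident(predict_a, 0, remaining)   # labels A would hand to B
picks_b = confident(predict_b, 1, remaining)   # labels B would hand to A

newly_labeled = set(picks_a) | set(picks_b)
remaining = {u: f for u, f in remaining.items() if u not in newly_labeled}

# Stop when a round adds nothing new or the pool is exhausted.
should_stop = not newly_labeled or not remaining
```

With this cutoff both models label U2 as 0 and U3 as 1, the pool empties, and the loop would stop after this round.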

---

### TL;DR (recipe)

1. Train Model A on `a`, Model B on `b`.
2. Each predicts unlabeled examples; pick the **most confident**.
3. **Swap** those labels.
4. Retrain each on its own view + partner’s labels.
5. **Repeat** until stable.