<a href="https://colab.research.google.com/github/hideaki-kyutech/softcomp2025/blob/main/Week3_FuzzyCartPole_Student.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advanced Soft Computing — Week 3
## Fuzzy Control of CartPole: Baseline Rule Set A → Assignment Rule Set B

**What is this notebook for?**
- You can run the notebook end-to-end **immediately** with a baseline controller (Rule Set A).
- You then implement the assignment part (Rule Set B) to improve performance.

**Goal (Week 3):**
1) Run CartPole with **Rule Set A** (baseline, usually < 100 steps)
2) Upgrade to **Rule Set B** (Angle + Speed) to reach **100+ steps**
3) Export an MP4 for Moodle submission


## 0. Setup
Run the cells below to prepare the environment (Colab).


In [None]:
!pip -q install "gymnasium[classic-control]"
!apt-get -qq install -y ffmpeg


In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation
from IPython.display import HTML
import gymnasium as gym

np.random.seed(0)


## 1. Membership Functions (MFs)
We define fuzzy sets for:
- **Angle θ**: Negative / Zero / Positive
- **Speed |θ̇|**: Slow / Fast

In Rule Set A, we only use **θ** (angle). In Rule Set B, we use both **θ** and **|θ̇|**.


In [None]:
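def triangular_mf(x, a, b, c):
    """Triangular MF: rises from a to the peak at b, then falls to c (assumes a < b < c)."""
    x = np.asarray(x, dtype=float)
    left = (x - a) / (b - a + 1e-12)
    right = (c - x) / (c - b + 1e-12)
    return np.maximum(np.minimum(left, right), 0.0)

def trapezoidal_mf(x, a, b, c, d):
    """Trapezoidal MF: feet at a and d, plateau on [b, c]; a == b or c == d gives a shoulder."""
    x = np.asarray(x, dtype=float)
    # Force μ = 1 on the plateau so a shoulder edge (e.g. x == a == b) is not mapped to 0.
    left = np.where(x >= b, 1.0, (x - a) / (b - a + 1e-12))
    right = np.where(x <= c, 1.0, (d - x) / (d - c + 1e-12))
    return np.clip(np.minimum(left, right), 0.0, 1.0)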


### 1.1 MF parameter design (EDITABLE)
Baseline design (you may tune these ranges later).


In [None]:
# Angle θ MFs (Negative / Zero / Positive)
theta_neg_params  = (-0.5, -0.2, 0.0)
theta_zero_params = (-0.15, 0.0, 0.15)
theta_pos_params  = (0.0, 0.2, 0.5)

# Speed |θ̇| MFs (Slow / Fast) -- magnitude only
slow_params = (0.0, 0.0, 0.4, 0.9)
fast_params = (0.6, 1.2, 2.5, 2.5)  # right shoulder at 2.5; speeds above 2.5 get μ = 0 in both sets


### 1.2 Visualize MFs (recommended)


In [None]:
theta_grid = np.linspace(-0.5, 0.5, 600)
neg = triangular_mf(theta_grid, *theta_neg_params)
zero = triangular_mf(theta_grid, *theta_zero_params)
pos = triangular_mf(theta_grid, *theta_pos_params)

plt.figure(figsize=(7.5, 3.0))
plt.plot(theta_grid, neg, label="Negative")
plt.plot(theta_grid, zero, label="Zero")
plt.plot(theta_grid, pos, label="Positive")
plt.xlabel("θ (radians)")
plt.ylabel("μ")
plt.ylim(-0.05, 1.05)
plt.grid(True, alpha=0.3)
plt.legend()
plt.title("Angle θ Membership Functions")
plt.show()

speed_grid = np.linspace(0.0, 2.5, 600)
slow = trapezoidal_mf(speed_grid, *slow_params)
fast = trapezoidal_mf(speed_grid, *fast_params)

plt.figure(figsize=(7.5, 3.0))
plt.plot(speed_grid, slow, label="Slow")
plt.plot(speed_grid, fast, label="Fast")
plt.xlabel("|θ̇| (rad/s)")
plt.ylabel("μ")
plt.ylim(-0.05, 1.05)
plt.grid(True, alpha=0.3)
plt.legend()
plt.title("Speed |θ̇| Membership Functions")
plt.show()


## 2. Fuzzification helper
`fuzzify()` converts numeric values (θ, |θ̇|) into membership degrees (μ).


In [None]:
def fuzzify(theta, theta_dot_abs):
    mu_theta = {
        "N": float(triangular_mf(theta, *theta_neg_params)),
        "Z": float(triangular_mf(theta, *theta_zero_params)),
        "P": float(triangular_mf(theta, *theta_pos_params)),
    }
    mu_speed = {
        "SLOW": float(trapezoidal_mf(theta_dot_abs, *slow_params)),
        "FAST": float(trapezoidal_mf(theta_dot_abs, *fast_params)),
    }
    return mu_theta, mu_speed
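
To make the mapping from numbers to membership degrees concrete, here is a small worked example. It is only a sketch: `tri` and `trap` are hypothetical scalar, pure-Python restatements of the membership functions (not the notebook's NumPy versions), the inputs θ = 0.05 and |θ̇| = 0.7 are illustrative rather than taken from a real episode, and `trap` assumes that a shoulder edge counts as full membership.

```python
def tri(x, a, b, c):
    """Scalar triangular MF (hypothetical helper); assumes a < b < c."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def trap(x, a, b, c, d):
    """Scalar trapezoidal MF (hypothetical helper); a == b or c == d is a shoulder."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:          # plateau, including shoulder edges
        return 1.0
    if x < b:                # rising edge, only reached when a < b
        return (x - a) / (b - a)
    return (d - x) / (d - c) # falling edge, only reached when c < d

theta, speed = 0.05, 0.7     # illustrative inputs (assumption)

# Same baseline parameters as in section 1.1
mu_theta_demo = {
    "N": tri(theta, -0.5, -0.2, 0.0),
    "Z": tri(theta, -0.15, 0.0, 0.15),
    "P": tri(theta, 0.0, 0.2, 0.5),
}
mu_speed_demo = {
    "SLOW": trap(speed, 0.0, 0.0, 0.4, 0.9),
    "FAST": trap(speed, 0.6, 1.2, 2.5, 2.5),
}

print({k: round(v, 3) for k, v in mu_theta_demo.items()})
print({k: round(v, 3) for k, v in mu_speed_demo.items()})
```

With these inputs the angle is mostly Zero (about 0.67) with some Positive (0.25), and the speed is partly Slow (0.4) and slightly Fast (about 0.17). One input can belong to several fuzzy sets at once, which is what the rules in the next section rely on.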


## 3. Controllers
We provide a baseline controller (Rule Set A) that works immediately.

### Rule Set A (baseline)
- If θ is Positive → Right
- If θ is Negative → Left
- If θ is Zero → Right (μ_Z is added to the Right score, so near-upright states favor Right)

### Rule Set B (assignment)
- Uses θ and |θ̇| with a 3×2 rule table to improve performance


### 3.1 Rule Set A (baseline) — PROVIDED


In [None]:
def compute_scores_ruleA(mu_theta):
    left_score = mu_theta["N"]
    right_score = mu_theta["P"] + mu_theta["Z"]
    return left_score, right_score
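
The following sketch traces Rule Set A on three hand-picked sets of membership degrees. The μ values are rounded illustrations computed from the baseline angle MFs for θ = 0.05, −0.1, and −0.05; the `samples` and `actions` names are hypothetical, and the two functions are copies of the notebook's own definitions so the block runs on its own.

```python
def compute_scores_ruleA(mu_theta):
    left_score = mu_theta["N"]
    right_score = mu_theta["P"] + mu_theta["Z"]
    return left_score, right_score

def choose_action(left_score, right_score):
    return 1 if right_score >= left_score else 0

# Rounded membership degrees for three illustrative angles (assumption)
samples = {
    "theta = +0.05": {"N": 0.0,  "Z": 0.667, "P": 0.25},
    "theta = -0.10": {"N": 0.5,  "Z": 0.333, "P": 0.0},
    "theta = -0.05": {"N": 0.25, "Z": 0.667, "P": 0.0},
}

actions = {}
for label, mu in samples.items():
    left, right = compute_scores_ruleA(mu)
    actions[label] = choose_action(left, right)
    print(label, "left =", round(left, 3), "right =", round(right, 3),
          "action =", actions[label])
```

Note the last case: at θ = −0.05 the pole leans slightly left, yet the Zero membership outweighs Negative and the controller pushes Right. This bias near upright is one plausible reason the baseline tends to fall early.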


### 3.2 Rule Set B (assignment) — TODO
Implement Rule Set B scoring using μ values only, and return `(left_score, right_score)`.
- AND = min(μ_angle, μ_speed)
- Accumulate scores for:
  (N,SLOW), (N,FAST), (Z,SLOW), (Z,FAST), (P,SLOW), (P,FAST)
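
As a hint for the AND step only, the sketch below computes the firing strength of each of the six (angle, speed) combinations with `min`. The μ values are illustrative (assumed, rounded), and the `firing` dictionary is a hypothetical name. Which action each combination should support, and how the strengths are accumulated into `left_score` and `right_score`, is deliberately left to you.

```python
# Illustrative membership degrees (assumption, not from a real episode)
mu_theta = {"N": 0.0, "Z": 0.667, "P": 0.25}
mu_speed = {"SLOW": 0.4, "FAST": 0.167}

# Fuzzy AND: a rule fires only as strongly as its weakest condition
firing = {
    (angle, speed): min(mu_theta[angle], mu_speed[speed])
    for angle in mu_theta
    for speed in mu_speed
}

for (angle, speed), strength in firing.items():
    print(f"({angle},{speed}) fires with strength {strength}")
```

Here both Negative rules have strength 0 because μ_N = 0, while (Z,SLOW) fires at 0.4 because the speed term is the weaker of the two conditions.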


In [None]:
def compute_scores_ruleB(mu_theta, mu_speed):
    # TODO: implement Rule Set B scoring using μ values
    raise NotImplementedError("TODO: implement compute_scores_ruleB")


### 3.3 Discrete action selection


In [None]:
def choose_action(left_score, right_score):
    return 1 if right_score >= left_score else 0


### 3.4 Choose which controller to run
Start with Rule Set A; switch to `"B"` once `compute_scores_ruleB` is implemented.


In [None]:
CONTROLLER = "A"  # "A" or "B"


## 4. One-step demo (runnable as-is with Rule Set A)


In [None]:
env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)  # seed the environment itself; np.random.seed does not affect it

theta = float(obs[2])
theta_dot = float(obs[3])
mu_theta, mu_speed = fuzzify(theta, abs(theta_dot))

print("theta =", theta, "theta_dot =", theta_dot)
print("mu_theta =", mu_theta)
print("mu_speed =", mu_speed)

if CONTROLLER == "A":
    left_score, right_score = compute_scores_ruleA(mu_theta)
else:
    left_score, right_score = compute_scores_ruleB(mu_theta, mu_speed)

action = choose_action(left_score, right_score)

print("left_score =", left_score, "right_score =", right_score)
print("chosen action =", action, "(0=Left, 1=Right)")

env.close()


## 5. Run an episode and record frames (MP4)


In [None]:
env = gym.make("CartPole-v1", render_mode="rgb_array")
obs, info = env.reset(seed=0)  # seed the environment itself; np.random.seed does not affect it

frames = []
steps = 0

for t in range(500):
    theta = float(obs[2])
    theta_dot = float(obs[3])
    mu_theta, mu_speed = fuzzify(theta, abs(theta_dot))

    if CONTROLLER == "A":
        left_score, right_score = compute_scores_ruleA(mu_theta)
    else:
        left_score, right_score = compute_scores_ruleB(mu_theta, mu_speed)

    action = choose_action(left_score, right_score)

    obs, reward, terminated, truncated, info = env.step(action)
    frames.append(env.render())
    steps += 1

    if terminated or truncated:
        break

env.close()
print(f"Controller {CONTROLLER}: Episode steps =", steps)

fig, ax = plt.subplots()
ax.axis("off")
im = ax.imshow(frames[0])

def init():
    im.set_data(frames[0])
    return (im,)

def animate(i):
    im.set_data(frames[i])
    return (im,)

anim = animation.FuncAnimation(fig, animate, init_func=init,
                               frames=len(frames), interval=50, blit=True)

HTML(anim.to_jshtml())


### 5.1 Save MP4 (for Moodle submission)


In [None]:
mp4_name = f"week3_cartpole_fuzzy_{CONTROLLER}.mp4"
anim.save(mp4_name, fps=20)
print("Saved:", mp4_name)


### 5.2 Download in Colab
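
No cell is provided for this step here. One possible sketch, assuming the notebook runs in Colab and using `files.download` from the `google.colab` module: the helper name `download_in_colab` is hypothetical, and the file name below is the one section 5.1 produces for `CONTROLLER = "A"`.

```python
import os

def download_in_colab(path):
    """Trigger a browser download in Colab; return True if the download was started."""
    if not os.path.exists(path):
        print("File not found:", path)
        return False
    try:
        from google.colab import files  # only importable inside Colab
    except ImportError:
        print("Not running in Colab; the file is at", os.path.abspath(path))
        return False
    files.download(path)
    return True

# In the notebook you would pass mp4_name from section 5.1 instead of a literal.
download_in_colab("week3_cartpole_fuzzy_A.mp4")
```

Outside Colab the helper only reports where the file is, so the cell does not fail when the notebook is run locally.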


## ✅ Homework (Week 3)
1. Confirm Rule Set A runs end-to-end.
2. Implement Rule Set B (`compute_scores_ruleB`).
3. Set `CONTROLLER = "B"` and rerun the notebook.
4. Save **week3_cartpole_fuzzy_B.mp4** and submit it to Moodle.

**Target:** 100+ steps with Rule Set B.
