# Lab 11 – Learning-Driven NAO Behaviors Starter Notebook

## Overview
This lab connects reinforcement learning (RL) policies to NAO robot behaviors. You will deploy policies with small state spaces to the simulator (and to a physical robot if one is available), then iterate on the control loops that run them.

## Objectives
- Adapt previously trained policies for NAO-friendly state and action spaces.
- Integrate sensor feedback loops into NAO scripts.
- Evaluate performance in simulation vs. hardware, documenting discrepancies.

## Pre-Lab Review
- Revisit [`old content/NAO_LAB_REBIRTH.pdf`](../../old%20content/NAO_LAB_REBIRTH.pdf) sections on behavior design and safety.
- Review relevant control examples from [`old content/ALL_WEEKS_V5 - Student.ipynb`](../../old%20content/ALL_WEEKS_V5%20-%20Student.ipynb).

## In-Lab Exercises
1. Define a manageable NAO task (e.g., line following, posture stabilization, simple gesture sequencing).
2. Map RL state features to NAO sensor readings; discretize or normalize as required.
3. Deploy the policy to the simulator, collect logs, and adjust reward structures if needed.
4. Test on a real NAO robot when available, noting latency, safety constraints, and differences from simulation.
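
The state mapping in step 2 can be sketched as below. This is a minimal example, assuming a two-feature state built from the head joint angles. The helper names (`normalize`, `discretize`, `state_index`), the ranges, and the bin count are illustrative assumptions rather than part of the lab materials, so substitute the sensors and limits your task actually uses.

```python
def normalize(value, low, high):
    """Clip value to [low, high] and scale it to [0.0, 1.0]."""
    clipped = max(low, min(high, value))
    return (clipped - low) / float(high - low)


def discretize(value, low, high, n_bins):
    """Map a continuous reading to an integer bin in [0, n_bins - 1]."""
    scaled = normalize(value, low, high)
    # min() keeps the upper edge (scaled == 1.0) inside the last bin.
    return min(int(scaled * n_bins), n_bins - 1)


def state_index(bins, n_bins):
    """Flatten a tuple of per-feature bins into one tabular state index."""
    index = 0
    for b in bins:
        index = index * n_bins + b
    return index


# Assumed ranges in radians, for illustration only; check your robot's limits.
HEAD_YAW_RANGE = (-2.0, 2.0)
HEAD_PITCH_RANGE = (-0.6, 0.5)
N_BINS = 5

reading = [0.3, -0.1]  # e.g. [HeadYaw, HeadPitch] from a sensor query
bins = (
    discretize(reading[0], HEAD_YAW_RANGE[0], HEAD_YAW_RANGE[1], N_BINS),
    discretize(reading[1], HEAD_PITCH_RANGE[0], HEAD_PITCH_RANGE[1], N_BINS),
)
state = state_index(bins, N_BINS)  # row index into a Q-table with N_BINS**2 rows
```

Clipping before binning means an out-of-range reading lands in an edge bin instead of producing an invalid state, which matters when hardware readings drift beyond the ranges seen in simulation.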

## Deliverables
- Code package containing policy implementation, NAO interface scripts, and configuration files.
- Lab report summarizing deployment process, observations, and open issues for hardware trials.

## Resources
- [`old content/howto.txt`](../../old%20content/howto.txt) for troubleshooting connection and environment issues.
- Instructor guidance on NAO safety procedures and lab access protocols.

### NAOqi Setup Checklist
The following steps are summarised from `old content/howto.txt`. Complete them before running the interaction cells.

1. Install **Python 2.7 (32-bit)** and ensure `pip` is available.
2. Create a dedicated virtual environment for NAO development.
3. Download the NAOqi SDK (e.g., `pynaoqi-python2.7-2.5.5.5-win32-vs2013.zip`).
4. Update your activation scripts (`activate.bat`/`activate.ps1`) to append the SDK's `lib` directory to `PYTHONPATH`.
5. Activate the environment and verify that `from naoqi import ALProxy` succeeds.
6. For simulator work, launch Choregraphe or the NAOqi virtual robot before attempting to connect.
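
The checks in steps 1 and 5 can be bundled into a quick self-test. The sketch below is only a convenience; `check_nao_environment` is a hypothetical helper name, not something shipped with the SDK. It reports the interpreter version, whether the interpreter is 32-bit or 64-bit, and whether `naoqi` can be imported.

```python
import struct
import sys


def check_nao_environment():
    """Return a dict describing whether this interpreter matches the checklist."""
    report = {
        "python_version": "%d.%d" % (sys.version_info[0], sys.version_info[1]),
        "is_python_27": tuple(sys.version_info[:2]) == (2, 7),
        # Pointer size in bits: 32 on a 32-bit interpreter, 64 on a 64-bit one.
        "pointer_bits": struct.calcsize("P") * 8,
        "naoqi_importable": False,
    }
    try:
        from naoqi import ALProxy  # noqa: F401 (the import itself is the check)
        report["naoqi_importable"] = True
    except ImportError:
        # Often means the SDK's lib directory is missing from PYTHONPATH.
        pass
    return report


report = check_nao_environment()
for key in sorted(report):
    print("%s: %s" % (key, report[key]))
```

If `naoqi_importable` is `False` while the version and bit width look right, revisit step 4 before debugging anything else.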


### Policy Deployment Notes
Use this cell to log configuration details from your simulator or physical NAO sessions.

In [None]:
deployment_log = {
    "session": "",
    "policy_source": "Describe the RL policy or heuristic you are deploying.",
    "environment": "Simulator or real robot.",
    "observations": [],
    "issues": [],
}

deployment_log
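
To keep logs comparable across sessions, it may help to append timestamped entries and serialize the log to JSON. The sketch below is one possible approach under that assumption; `new_log`, `add_observation`, and `log_to_json` are hypothetical helper names, and the session label and notes are placeholders, not recorded results.

```python
import json
import time


def new_log(session, policy_source, environment):
    """Create a log with the same keys as the deployment_log cell."""
    return {
        "session": session,
        "policy_source": policy_source,
        "environment": environment,
        "observations": [],
        "issues": [],
    }


def add_observation(log, note, timestamp=None):
    """Append a note with a Unix timestamp (defaults to the current time)."""
    stamp = time.time() if timestamp is None else timestamp
    log["observations"].append({"time": stamp, "note": note})


def log_to_json(log):
    """Serialize the log with sorted keys so diffs between runs stay readable."""
    return json.dumps(log, indent=2, sort_keys=True)


example_log = new_log("example-session", "placeholder policy", "simulator")
add_observation(example_log, "Placeholder note: replace with what you observed.")
print(log_to_json(example_log))
```

Writing the returned string to one file per session makes it straightforward to compare simulator and hardware runs side by side in the lab report.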


### Closed-Loop Control Placeholder
Integrate your learned policy with NAO motion primitives. The helper below expects a policy function that takes a sensor snapshot and returns a dictionary mapping joint names to target angles in radians.

In [None]:
import time

try:
    from naoqi import ALProxy
except ImportError:
    print("Ensure the NAOqi SDK is installed before running control loops.")
else:
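    # Defaults assume a NAOqi instance running locally. Use the robot's IP for
    # hardware; a simulator may listen on a different port, so check its settings.
    ROBOT_IP = "127.0.0.1"
    ROBOT_PORT = 9559

    # Raises RuntimeError if no NAOqi instance is reachable at this address.
    motion = ALProxy("ALMotion", ROBOT_IP, ROBOT_PORT)

    def run_policy_step(policy_fn, sensor_snapshot, speed_fraction=0.2, step_s=0.1):
        """Query the policy once and send the resulting joint targets."""
        # policy_fn must return a dict of joint name -> target angle in radians.
        joint_targets = policy_fn(sensor_snapshot)
        names = list(joint_targets.keys())
        angles = [joint_targets[name] for name in names]
        # setAngles is non-blocking; one batched call moves the joints together.
        motion.setAngles(names, angles, speed_fraction)
        # Give the joints time to move before the next control step.
        time.sleep(step_s)

    def placeholder_policy(sensor_data):
        # Ignores the sensor data and centers the head.
        return {"HeadYaw": 0.0, "HeadPitch": 0.0}

    # On a real robot the joints only move once stiffness is non-zero.
    # motion.setStiffnesses("Head", 1.0)
    # sensor_data = motion.getAngles("Head", True)  # [HeadYaw, HeadPitch], radians
    # run_policy_step(placeholder_policy, sensor_data)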
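
A full control loop repeats the sense, decide, act cycle and should keep every command inside safe joint ranges. The sketch below illustrates that idea without a robot. `FakeMotion` is a hypothetical stand-in that imitates only the `getAngles` and `setAngles` calls used here, and the limits in `JOINT_LIMITS` are illustrative values, not the robot's documented ranges. On hardware, pass the real `ALMotion` proxy instead and take the limits from the NAO documentation.

```python
import time


class FakeMotion(object):
    """Hypothetical stand-in for the two ALMotion calls used in this sketch."""

    def __init__(self):
        self.angles = {"HeadYaw": 0.0, "HeadPitch": 0.0}

    def getAngles(self, names, use_sensors):
        return [self.angles[name] for name in names]

    def setAngles(self, names, angles, fraction_max_speed):
        # The fake jumps straight to the target; a real joint moves gradually.
        for name, angle in zip(names, angles):
            self.angles[name] = angle


# Illustrative limits in radians; replace with your robot's documented ranges.
JOINT_LIMITS = {"HeadYaw": (-2.0, 2.0), "HeadPitch": (-0.6, 0.5)}


def clamp_targets(joint_targets, limits):
    """Clip each target into its (low, high) range; drop joints with no limits."""
    safe = {}
    for name, target in joint_targets.items():
        if name in limits:
            low, high = limits[name]
            safe[name] = max(low, min(high, target))
    return safe


def run_episode(motion, policy_fn, joint_names, n_steps, step_s=0.1):
    """Sense, decide, and act for n_steps; return the targets that were sent."""
    history = []
    for _ in range(n_steps):
        snapshot = motion.getAngles(joint_names, True)
        targets = clamp_targets(policy_fn(snapshot), JOINT_LIMITS)
        names = sorted(targets)
        motion.setAngles(names, [targets[n] for n in names], 0.2)
        history.append(targets)
        time.sleep(step_s)
    return history


def sweep_policy(snapshot):
    # Toy policy: push the yaw 1.5 rad further each step so clamping is visible.
    return {"HeadYaw": snapshot[0] + 1.5, "HeadPitch": 0.0}


history = run_episode(
    FakeMotion(), sweep_policy, ["HeadYaw", "HeadPitch"], 3, step_s=0.0
)
```

Clamping between the policy and the motion call means a badly trained or mis-scaled policy cannot command an out-of-range angle, and the returned `history` gives you a record to compare against the simulator run.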
