## Hypersonic_HeyGen (Avatar Educational Demonstration of HeyGen API)

This notebook demonstrates a minimal, **classroom-friendly** workflow to launch a HeyGen avatar **inside Jupyter Notebook**.
It uses only a **HeyGen API key**, keeps the UI inline, and annotates each step with **short descriptions** and links to the **relevant HeyGen docs** used by this code.


### 0) What you'll do
1. Provide your **HeyGen API key** (via `getpass` or environment variable).
2. Create a **streaming session** and **session token** (required to authorize actions).
3. Prepare `viewer.html`, fill placeholders, and display it **inline**.
4. Send one **text task** to the avatar.
5. **Close** the session (to stop billing).

> This notebook intentionally avoids any extra features; the focus is on a transparent, step-by-step flow you can teach.


### References used in this notebook (HeyGen Docs)
- **Create New Session**: https://docs.heygen.com/reference/new-session  
  *What/Why:* Requests a new streaming session and returns an SDP offer + ICE servers (needed to render the avatar).

- **Create Session Token**: https://docs.heygen.com/reference/create-session-token  
  *What/Why:* Issues a short-lived token used by the client/browser to interact with the session securely.

- **Send Task**: https://docs.heygen.com/reference/send-task  
  *What/Why:* Sends a task (here we use a simple *repeat* text-to-speech) to the active session.

- **Close Session**: https://docs.heygen.com/reference/close-session  
  *What/Why:* Cleanly ends the session so **billing stops**.


In [1]:
# 1) Setup
import json, os, requests
from pathlib import Path
from typing import Optional
from getpass import getpass
from IPython.display import IFrame, display

def debug(*a):
    print("[DEBUG]", *a)

### 1a) API key
We only need the **HeyGen API key**. If set as an environment variable (`HEYGEN_API_KEY`), you won't be prompted.
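
The lookup order described above (environment variable first, prompt as a fallback) can be sketched as a small function. `resolve_api_key`, `mask`, and the `env`/`prompt` parameters are hypothetical names introduced here so the logic can be exercised without a real key; they are not part of the notebook or of any HeyGen library.

```python
import os
from getpass import getpass
from typing import Callable, Mapping


def resolve_api_key(
    env: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass,
    name: str = "HEYGEN_API_KEY",
) -> str:
    """Return the key from `env` if set, otherwise ask for it via `prompt`."""
    # Strip whitespace so a pasted key with a trailing newline still works.
    key = (env.get(name) or "").strip()
    if not key:
        key = prompt(f"Enter {name}: ").strip()
    if not key:
        raise RuntimeError(f"{name} is required to run this notebook.")
    return key


def mask(secret: str, visible: int = 4) -> str:
    """Show only the last few characters, for safer debug output."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Passing `env` and `prompt` explicitly keeps the function testable; in the notebook you would call `resolve_api_key()` with the defaults and print `mask(key)` rather than the key itself.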


In [2]:
# 2) Secrets: prompt for HeyGen API key only
HEYGEN_API_KEY = os.getenv("HEYGEN_API_KEY") or getpass("Enter HEYGEN_API_KEY: ")

if not HEYGEN_API_KEY:
    raise RuntimeError("HEYGEN_API_KEY is required to run this notebook.")

debug("HeyGen key accepted.")

[DEBUG] HeyGen key accepted.


### 2) Endpoints & headers
The following endpoints correspond to the docs listed above (streaming API base path + methods).
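
Requests to these endpoints are authenticated in one of two ways in this notebook: the account API key (for creating the session and its token) or a Bearer session token (for sending tasks and closing the session). A minimal sketch of that difference, with `build_headers` as a hypothetical helper name and no network access:

```python
from typing import Dict


def build_headers(credential: str, scheme: str = "x-api-key") -> Dict[str, str]:
    """Build JSON request headers for either auth style used in this notebook.

    scheme="x-api-key": account API key (create session, create token).
    scheme="bearer":    session token (send task, close session).
    """
    headers = {"accept": "application/json", "Content-Type": "application/json"}
    if scheme == "x-api-key":
        headers["x-api-key"] = credential
    elif scheme == "bearer":
        headers["Authorization"] = f"Bearer {credential}"
    else:
        raise ValueError(f"unknown auth scheme: {scheme!r}")
    return headers
```

Keeping both styles in one function makes it harder to send the long-lived API key where the short-lived session token is expected.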


In [3]:
# 3) Endpoints & headers
BASE = "https://api.heygen.com/v1"
API_STREAM_NEW     = f"{BASE}/streaming.new"        # Docs: Create New Session
API_CREATE_TOKEN   = f"{BASE}/streaming.create_token"  # Docs: Create Session Token
API_STREAM_TASK    = f"{BASE}/streaming.task"       # Docs: Send Task
API_STREAM_STOP    = f"{BASE}/streaming.stop"       # Docs: Close Session

HEADERS_XAPI = {
    "accept": "application/json",
    "x-api-key": HEYGEN_API_KEY,
    "Content-Type": "application/json",
}

def _headers_bearer(tok: str):
    return {
        "accept": "application/json",
        "Authorization": f"Bearer {tok}",
        "Content-Type": "application/json",
    }

def _post_xapi(url, payload=None):
    r = requests.post(url, headers=HEADERS_XAPI, data=json.dumps(payload or {}), timeout=60)
    try:
        body = r.json()
    except Exception:
        body = {"_raw": r.text}
    debug(f"[POST x-api] {url} -> {r.status_code}")
    if r.status_code >= 400:
        debug(r.text)
        r.raise_for_status()
    return body

def _post_bearer(url, token, payload=None):
    r = requests.post(url, headers=_headers_bearer(token), data=json.dumps(payload or {}), timeout=60)
    try:
        body = r.json()
    except Exception:
        body = {"_raw": r.text}
    debug(f"[POST bearer] {url} -> {r.status_code}")
    if r.status_code >= 400:
        debug(r.text)
        r.raise_for_status()
    return body

### (Note) Finding avatar and voice IDs
**List of Avatars**: https://docs.heygen.com/reference/list-avatars-v2  
*Purpose:* To browse available avatar IDs you can use in your session.

> In this notebook, we **prompt** you for the `avatar_id`, `voice_id`, and `pose_name` so the demo works with any user account.  
> Known-good test values are kept as code comments for quick reference.


### 3) Helper functions
We will now define helper functions to:
- **Create a session** (*Create New Session*)
- **Create a session token** (*Create Session Token*)
- **Send a text task** (*Send Task*)
- **Stop the session** (*Close Session*)
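
As one illustration of the helpers listed above, the body of a *Send Task* request can be assembled and validated before anything is sent. `build_task_payload` is a hypothetical helper; the field names mirror the ones this notebook's own code sends, and the empty-input checks are an added assumption rather than an API requirement.

```python
from typing import Dict


def build_task_payload(session_id: str, text: str, task_type: str = "repeat") -> Dict[str, str]:
    """Return the JSON body for a Send Task request, rejecting empty input."""
    text = text.strip()
    if not session_id:
        raise ValueError("session_id is required")
    if not text:
        raise ValueError("text must not be empty")
    return {
        "session_id": session_id,
        "task_type": task_type,  # this notebook only uses "repeat"
        "task_mode": "sync",
        "text": text,
    }
```

Validating locally avoids spending a request (and session time) on input that cannot produce speech.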


In [6]:
# 4) HeyGen helpers, with user inputs for avatar/voice/pose
# For ease of testing, your known-good values are kept as comments beside the inputs.
# In public/production, users can copy-paste IDs from the HeyGen dashboard or List Avatars API.

# Enter values (known-good examples shown as comments); strip stray whitespace from pasted IDs:
avatar_id = input("Enter Avatar ID name: ").strip()   # e.g. "June_HR_public"
voice_id  = input("Enter Voice ID string: ").strip()  # e.g. "68dedac41a9f46a6a4271a95c733823c"
pose_name = input("Enter Pose Name: ").strip()        # e.g. "June HR"

def new_session(avatar_id: str, voice_id: Optional[str] = None):
    payload = {"avatar_id": avatar_id}
    if voice_id:
        payload["voice_id"] = voice_id
    # Docs: Create New Session - returns session_id, SDP offer, and ICE servers
    body = _post_xapi(API_STREAM_NEW, payload)
    data = body.get("data") or {}
    sid = data.get("session_id")
    offer_sdp = (data.get("offer") or data.get("sdp") or {}).get("sdp")
    ice2 = data.get("ice_servers2")
    ice1 = data.get("ice_servers")
    if isinstance(ice2, list) and ice2:
        rtc_config = {"iceServers": ice2}
    elif isinstance(ice1, list) and ice1:
        rtc_config = {"iceServers": ice1}
    else:
        rtc_config = {"iceServers": [{"urls": ["stun:stun.l.google.com:19302"]}]}
    if not sid or not offer_sdp:
        raise RuntimeError(f"Missing session_id or offer in response: {body}")
    return {"session_id": sid, "offer_sdp": offer_sdp, "rtc_config": rtc_config}

def create_session_token(session_id: str) -> str:
    # Docs: Create Session Token - used by the client/browser to authenticate to this session
    body = _post_xapi(API_CREATE_TOKEN, {"session_id": session_id})
    tok = (body.get("data") or {}).get("token") or (body.get("data") or {}).get("access_token")
    if not tok:
        raise RuntimeError(f"Missing token in response: {body}")
    return tok

def send_text_to_avatar(session_id: str, session_token: str, text: str):
    # Docs: Send Task - here we use a simple "repeat" task to speak the provided text
    _post_bearer(
        API_STREAM_TASK,
        session_token,
        {
            "session_id": session_id,
            "task_type": "repeat",
            "task_mode": "sync",
            "text": text,
        },
    )

def stop_session(session_id: str, session_token: str):
    # Docs: Close Session - very important to call so that billing stops
    _post_bearer(API_STREAM_STOP, session_token, {"session_id": session_id})

Enter Avatar ID name:  June_HR_public
Enter Voice ID string:  68dedac41a9f46a6a4271a95c733823c
Enter Pose Name:  June HR


### 4) Start a session
We now create a new session (returns `session_id`, `offer_sdp`, and ICE servers) and then request a **session token**.
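
The ICE server selection that `new_session()` performs can be shown in isolation. This sketch assumes the same response keys the helper reads (`ice_servers2`, then `ice_servers`); `pick_rtc_config` and the sample dictionary are hypothetical and only illustrate the fallback order.

```python
from typing import Any, Dict

# Fallback used when the response carries no ICE servers (public Google STUN).
DEFAULT_ICE = [{"urls": ["stun:stun.l.google.com:19302"]}]


def pick_rtc_config(data: Dict[str, Any]) -> Dict[str, Any]:
    """Prefer `ice_servers2`, then `ice_servers`, then a public STUN fallback."""
    for key in ("ice_servers2", "ice_servers"):
        servers = data.get(key)
        if isinstance(servers, list) and servers:
            return {"iceServers": servers}
    return {"iceServers": DEFAULT_ICE}


# Hypothetical response fragment, shaped like what new_session() reads.
sample = {"session_id": "abc", "ice_servers2": [{"urls": ["turn:example.invalid:3478"]}]}
rtc = pick_rtc_config(sample)
```

The result has the shape that the browser's `RTCPeerConnection` constructor expects, which is why the notebook passes it through to the viewer unchanged.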


In [7]:
created = new_session(avatar_id, voice_id)
SESSION_ID   = created["session_id"]
OFFER_SDP    = created["offer_sdp"]
RTC_CONFIG   = created["rtc_config"]
debug("Session created:", SESSION_ID[:10], "...")

SESSION_TOKEN = create_session_token(SESSION_ID)
debug("Session token created (len):", len(SESSION_TOKEN))

print("Ready. Next: fill & launch viewer.html inline.")

[DEBUG] [POST x-api] https://api.heygen.com/v1/streaming.new -> 200
[DEBUG] Session created: 7721a192-b ...
[DEBUG] [POST x-api] https://api.heygen.com/v1/streaming.create_token -> 200
[DEBUG] Session token created (len): 132
Ready. Next: fill & launch viewer.html inline.


### 5) Prepare `viewer.html`
If `viewer.html` is not present next to this notebook, we write a copy of the template. We then fill its placeholders and save the result as `viewer_filled.html`.
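
Because the placeholders sit inside inline JavaScript, the values need escaping before substitution. The following is a minimal sketch of one way to do that, not the notebook's exact code: `js_string_body` and `fill_template` are hypothetical helpers, and the `__NAME__` placeholder convention is taken from the template used here.

```python
import json
from typing import Any, Mapping


def js_string_body(value: str) -> str:
    """Escape `value` for use between double quotes in inline JavaScript."""
    escaped = json.dumps(value)[1:-1]     # handles quotes, backslashes, newlines
    return escaped.replace("</", "<\\/")  # keep "</script>" from ending the tag


def fill_template(template: str, strings: Mapping[str, str], raw_json: Mapping[str, Any]) -> str:
    """Replace __NAME__ placeholders: `strings` are escaped, `raw_json` is dumped as JSON."""
    out = template
    for name, value in strings.items():
        out = out.replace(f"__{name}__", js_string_body(value))
    for name, value in raw_json.items():
        out = out.replace(f"__{name}__", json.dumps(value).replace("</", "<\\/"))
    return out


demo = 'const NAME = "__AVATAR_NAME__"; const CFG = __RTC_CONFIG__ || {};'
filled_demo = fill_template(demo, {"AVATAR_NAME": 'June "HR"\n'}, {"RTC_CONFIG": {"iceServers": []}})
```

String placeholders are the ones already wrapped in quotes in the template (token, name, session ID, SDP); `RTC_CONFIG` is substituted as a bare JSON object literal.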


In [8]:
TEMPLATE = r"""<!doctype html>
<html>
  <head>
    <meta charset="utf-8"/>
    <meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=1"/>
    <title>AI Avatar Viewer (WebRTC)</title>
    <style>
      html,body{margin:0;padding:0;height:100%;width:100%;background:#0b0b0c;color:#eaeaea;font-family:system-ui,-apple-system,Segoe UI,Roboto,Helvetica,Arial,sans-serif}
      .wrap{display:flex;flex-direction:column;gap:8px;width:100%;height:100%;align-items:center;justify-content:flex-start}
      .title{font-size:15px;text-align:center;margin-top:6px;opacity:.9}
      .stage{
        position:relative;width:100%;max-width:420px;
        aspect-ratio:16/9; /* default 16/9; updated to the real stream AR once metadata loads */
        background:#111;border-radius:18px;overflow:hidden;border:1px solid #222;
        display:grid;place-items:center
      }
      video{width:100%;height:100%;object-fit:contain;display:block;background:#0b0b0c}
      .status{font-size:12px;opacity:.85}
      .overlay{position:absolute;inset:0;display:flex;align-items:center;justify-content:center;background:rgba(0,0,0,0.25)}
      .overlay button{border:none;border-radius:999px;padding:12px 16px;font-size:14px;cursor:pointer;box-shadow:0 8px 22px rgba(0,0,0,.35)}
    </style>
  </head>
  <body>
    <div class="wrap">
      <div class="title">Avatar: <span id="aname"></span></div>
      <div class="stage" id="stage">
        <video id="video" playsinline autoplay muted></video>
        <div class="overlay" id="audioGate" style="display:flex">
          <button id="enableBtn">🔊 Tap to enable sound</button>
        </div>
      </div>
      <div class="status" id="status">initializing…</div>
    </div>

    <audio id="audio" autoplay></audio>

    <script>
      // Injected by the notebook (placeholders are filled in Python before display)
      const SESSION_TOKEN = "__SESSION_TOKEN__";
      const AVATAR_NAME   = "__AVATAR_NAME__";
      const SESSION_ID    = "__SESSION_ID__";
      const OFFER_SDP     = "__OFFER_SDP__";
      const RTC_CONFIG    = __RTC_CONFIG__ || {};

      // UI helpers
      const stage   = document.getElementById("stage");
      const video   = document.getElementById("video");
      const audio   = document.getElementById("audio");
      const gate    = document.getElementById("audioGate");
      const btn     = document.getElementById("enableBtn");
      const statusEl= document.getElementById("status");
      document.getElementById("aname").textContent = AVATAR_NAME;
      const setStatus = (t)=> statusEl.textContent = t;

      async function ensureAudio() {
        try { audio.muted = false; audio.volume = 1.0; await audio.play(); gate.style.display = "none"; }
        catch { gate.style.display = "flex"; }
      }
      btn.addEventListener('click', ensureAudio);

      // Adaptive aspect ratio once we know the stream dimensions
      function applyAdaptiveAR() {
        const w = video.videoWidth || 0, h = video.videoHeight || 0;
        if (w > 0 && h > 0) {
          // clamp to sane limits (e.g., avoid extreme tall ratios)
          const ratio = w / h;
          const clamped = Math.max(0.9, Math.min(2.1, ratio)); // ~from 9:10 to ~21:10
          stage.style.aspectRatio = `${clamped} / 1`;
        }
      }
      video.addEventListener("loadedmetadata", applyAdaptiveAR);

      // ---- WebRTC wiring with auto-reconnect ----
      let pc = null;
      let reconnectAttempts = 0;
      const MAX_RETRIES = 3;

      async function startOnce() {
        // Build a fresh RTCPeerConnection
        if (pc) { try { pc.close(); } catch {} pc = null; }
        pc = new RTCPeerConnection(RTC_CONFIG);

        // Ensure we request both tracks
        try {
          pc.addTransceiver("audio", { direction: "recvonly" });
          pc.addTransceiver("video", { direction: "recvonly" });
        } catch {}

        pc.ontrack = (ev) => {
          const [stream] = ev.streams;
          if (!stream) return;
          if (ev.track.kind === "video") {
            video.srcObject = stream;
            video.muted = true;
            video.play().catch(()=>{});
            // If metadata already known, apply AR immediately
            if (video.videoWidth && video.videoHeight) applyAdaptiveAR();
          } else if (ev.track.kind === "audio") {
            audio.srcObject = stream;
            setTimeout(ensureAudio, 100);
          }
        };

        pc.oniceconnectionstatechange = () => {
          const s = pc.iceConnectionState;
          if (s === "connected" || s === "completed") {
            setStatus("connected");
            reconnectAttempts = 0; // reset backoff after a good connection
          } else if (s === "disconnected") {
            setStatus("ice disconnected");
            // soft wait; transient disconnects often recover without action
            setTimeout(() => {
              if (pc && (pc.iceConnectionState === "disconnected" || pc.iceConnectionState === "failed")) {
                triggerReconnect();
              }
            }, 1500);
          } else if (s === "failed") {
            setStatus("ice failed");
            triggerReconnect();
          }
        };

        setStatus("applying offer…");
        await pc.setRemoteDescription({ type: "offer", sdp: OFFER_SDP });

        setStatus("creating answer…");
        const answer = await pc.createAnswer();
        await pc.setLocalDescription(answer);

        // Wait for ICE gathering to finish or timeout
        await new Promise((resolve) => {
          if (pc.iceGatheringState === "complete") return resolve();
          const check = () => {
            if (pc.iceGatheringState === "complete") {
              pc.removeEventListener("icegatheringstatechange", check);
              resolve();
            }
          };
          pc.addEventListener("icegatheringstatechange", check);
          setTimeout(resolve, 1500);
        });

        setStatus("starting session…");
        await fetch("https://api.heygen.com/v1/streaming.start", {
          method: "POST",
          headers: {
            "Authorization": `Bearer ${SESSION_TOKEN}`,
            "Content-Type": "application/json",
            "accept": "application/json"
          },
          body: JSON.stringify({
            session_id: SESSION_ID,
            sdp: { type: "answer", sdp: pc.localDescription.sdp }
          })
        });

        setStatus("waiting for media…");
        gate.style.display = "flex";
      }

      async function triggerReconnect() {
        if (reconnectAttempts >= MAX_RETRIES) {
          setStatus("reconnect limit reached - use Start/Restart");
          return;
        }
        reconnectAttempts++;
        const delay = 500 * reconnectAttempts; // simple backoff
        setStatus(`reconnecting (${reconnectAttempts}/${MAX_RETRIES})…`);
        try {
          // small pause helps on mobile networks
          await new Promise(r => setTimeout(r, delay));
          await startOnce();
        } catch (e) {
          console.error("reconnect error", e);
          // try again soon if we still have attempts
          if (reconnectAttempts < MAX_RETRIES) {
            setTimeout(triggerReconnect, 800);
          } else {
            setStatus("reconnect failed - use Start/Restart");
          }
        }
      }

      // Kickoff
      (async () => {
        try { await startOnce(); }
        catch (err) { setStatus("init error"); console.error(err); }
      })();
    </script>
  </body>
</html>
"""

html_path = Path("viewer.html")
if not html_path.exists():
    html_path.write_text(TEMPLATE, encoding="utf-8")
    debug("Wrote viewer.html.")

filled = (
    html_path.read_text(encoding="utf-8")
    .replace("__SESSION_TOKEN__", SESSION_TOKEN)
    .replace("__AVATAR_NAME__", json.dumps(pose_name or avatar_id)[1:-1])  # escape quotes for the JS string
    .replace("__SESSION_ID__", SESSION_ID)
    .replace("__OFFER_SDP__", json.dumps(OFFER_SDP)[1:-1])
    .replace("__RTC_CONFIG__", json.dumps(RTC_CONFIG or {}))
)

filled_path = Path("viewer_filled.html")
filled_path.write_text(filled, encoding="utf-8")
print("Filled viewer path:", filled_path.resolve())

Filled viewer path: C:\Unni\AiAi\Hypersonic_AVATAR\viewer_filled.html


### 6) Launch viewer (inline-only)
We render the filled HTML directly inside the notebook using an IFrame.
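
If the file-based IFrame does not load in a particular Jupyter front end, one alternative worth trying is to embed the page through the iframe's `srcdoc` attribute. This is a sketch under assumptions: `iframe_srcdoc` is a hypothetical helper, and whether WebRTC media and the sound button behave identically inside a `srcdoc` frame depends on the browser and front end and has not been verified here.

```python
import html


def iframe_srcdoc(document: str, height: int = 380) -> str:
    """Return <iframe> markup that carries `document` inline via srcdoc."""
    # html.escape(quote=True) escapes &, <, > and both quote characters,
    # so the whole page fits safely inside a double-quoted attribute.
    escaped = html.escape(document, quote=True)
    return (
        f'<iframe srcdoc="{escaped}" width="100%" height="{height}" '
        'style="border:0" allow="autoplay"></iframe>'
    )


markup = iframe_srcdoc('<p class="x">hi & bye</p>', height=200)
```

In the notebook this would be displayed with `from IPython.display import HTML` followed by `display(HTML(iframe_srcdoc(filled)))`, where `filled` is the filled template string.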


In [9]:
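p = Path("viewer_filled.html").resolve()
# Jupyter serves files relative to the notebook's folder, so pass the relative
# file name as the iframe URL rather than an absolute filesystem path.
display(IFrame(src=p.name, width="100%", height=380))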

# Inline-only demo (no external browser pop-up). To open one instead:
# import webbrowser
# url = p.as_uri()
# ok = webbrowser.open(url, new=2)
# print("External browser launched:", ok, "→", url)

### 7) Send a sample text task
Uses **Send Task** to make the avatar speak a demo phrase. You can change the text below.


In [11]:
try:
    send_text_to_avatar(SESSION_ID, SESSION_TOKEN, "Hello! I am your HeyGen avatar inside Jupyter Notebook.")
    print("Sent test line to avatar.")
except Exception as e:
    print("Could not send test line:", e)

[DEBUG] [POST bearer] https://api.heygen.com/v1/streaming.task -> 200
Sent test line to avatar.


### 8) Close the session
Always end the session when your demo is complete so **billing stops**.
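
One way to make sure the close call happens even when a cell fails midway is a context manager. The sketch below uses stand-in callables instead of real HeyGen requests; `streaming_session`, `start`, and `stop` are hypothetical names, not part of the notebook's helpers.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def streaming_session(start: Callable[[], str], stop: Callable[[str], None]) -> Iterator[str]:
    """Yield a session id from `start()` and always call `stop(id)` afterwards."""
    session_id = start()
    try:
        yield session_id
    finally:
        # Runs on normal exit and on exceptions, so the session is not left open.
        stop(session_id)


# Stand-in callables so the sketch runs without contacting HeyGen.
events = []
try:
    with streaming_session(lambda: "demo-session", lambda sid: events.append(("stopped", sid))) as sid:
        events.append(("using", sid))
        raise ValueError("simulated failure mid-demo")
except ValueError:
    pass
```

With the notebook's helpers, `start` could wrap `new_session(...)` and `stop` could wrap `stop_session(session_id, SESSION_TOKEN)`. In a multi-cell notebook the explicit stop cell is still needed, since a `with` block only covers a single cell.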


In [12]:
try:
    stop_session(SESSION_ID, SESSION_TOKEN)
    print("Session stopped.")
except Exception as e:
    print("Could not stop session:", e)

[DEBUG] [POST bearer] https://api.heygen.com/v1/streaming.stop -> 200
Session stopped.


---
**Done.** You ran the full inline avatar demo with minimal steps and clear doc references.  
For production, you can parameterize more fields or integrate with a UI layer.
