# Practical guide to building data-intensive apps with the Realtime API

This cookbook serves as a practical guide to help AI Engineers maximize the effectiveness of OpenAI's Realtime API, specifically when dealing with data-intensive function calls. We'll focus on scenarios common in speech-to-speech agents, where vast amounts of data must be handled smoothly and efficiently.

This post won't cover the basics of setting up a Realtime API solution. For that, refer to the official documentation and sample repositories to get started on your journey.

By following this guide, you'll gain clear insights and actionable strategies to enhance the performance and reliability of your real-time conversational agents. It addresses specific challenges unique to handling large-scale data in real-time conversational contexts.

### What is the Realtime API?

Before we dive in, let’s quickly recap the API for those who are new. The OpenAI Realtime API is a relatively recent offering that supports low-latency, multimodal interactions—such as speech-to-speech conversations and live transcription. Picture scenarios like real-time voice-based customer support or live movie transcriptions. 

### Setting the stage

Customers often face complex challenges that demand access to large volumes of data via function calls. As engineers, our role is to design the architecture so the model can retrieve this data smoothly—delivering a truly seamless conversational experience. 

Our example is an NBA Scouting Agent that calls various functions to deliver in-depth analysis of upcoming draft prospects. To illustrate practical guidelines for robust speech-to-speech interactions, we showcase large, realistic payloads inspired by draft prospects and team stats. Below, you’ll see a monolithic `searchDraftProspects` function supplied to the model in a Realtime API session.

```json
// "Hey, pull up point guards projected in the top 10 in the 2025 draft"
{
  "type": "session.update",
  "session": {
    "tools": [
      {
        "type": "function",
        "name": "searchDraftProspects",
        "description": "Search draft prospects by position and draft year, e.g., Point Guard in 2025",
        "parameters": {
          "type": "object",
          "properties": {
            "position": {
              "type": "string",
              "description": "The player position",
              "enum": [
                "Point Guard",
                "Shooting Guard",
                "Small Forward",
                "Power Forward",
                "Center",
                "Any"
              ]
            },
            "year": { "type": "number", "description": "Draft year e.g., 2025" },
            "mockDraftRanking": { "type": "number", "description": "Predicted Draft Ranking" }
          },
          "required": ["position", "year"]
        }
      }
    ],
    "tool_choice": "auto"
  }
}
```

As expected, the `searchDraftProspects` function call returns a hefty payload. Existing API endpoints are often the low-hanging fruit when wiring up tools, but they are rarely optimized for agent use cases and usually need refinement. The structure and size of the example payload are based on real-world scenarios we've encountered.

```json
// Example Payload
{
  "status": {
    "code": 200,
    "message": "SUCCESS"
  },
  "found": 4274,
  "offset": 0,
  "limit": 10,
  "data": [
    {
      "prospectId": 10001,
      "data": {
        "ProspectInfo": {
          "league": "NCAA",
          "collegeId": 301,
          "isDraftEligible": true,
          "Player": {
            "personalDetails": {
              "firstName": "Jalen",
              "lastName": "Storm",
              "dateOfBirth": "2003-01-15",
              "nationality": "USA"
            },
            "physicalAttributes": {
              "position": "PG",
              "height": {
                "feet": 6,
                "inches": 4
              },
              "weightPounds": 205
            },
            "hometown": {
              "city": "Springfield",
              "state": "IL"
            }
          },
          "TeamInfo": {
            "collegeTeam": "Springfield Tigers",
            "conference": "Big West",
            "teamRanking": 12,
            "coach": {
              "coachId": 987,
              "coachName": "Marcus Reed",
              "experienceYears": 10
            }
          }
        },
        "Stats": {
          "season": "2025",
          "gamesPlayed": 32,
          "minutesPerGame": 34.5,
          "shooting": {
            "FieldGoalPercentage": 47.2,
            "ThreePointPercentage": 39.1,
            "FreeThrowPercentage": 85.6
          },
          "averages": {
            "points": 21.3,
            "rebounds": 4.1,
            "assists": 6.8,
            "steals": 1.7,
            "blocks": 0.3
          }
        },
        "Scouting": {
          "evaluations": {
            "strengths": ["Court vision", "Clutch shooting"],
            "areasForImprovement": ["Defensive consistency"]
          },
          "scouts": [
            {
              "scoutId": 501,
              "name": "Greg Hamilton",
              "organization": "National Scouting Bureau"
            }
          ]
        },
        "DraftProjection": {
          "mockDraftRanking": 5,
          "lotteryPickProbability": 88,
          "historicalComparisons": [
            {
              "player": "Chris Paul",
              "similarityPercentage": 85
            }
          ]
        },
        "Media": {
          "highlightReelUrl": "https://example.com/highlights/jalen-storm",
          "socialMedia": {
            "twitter": "@jstorm23",
            "instagram": "@jstorm23_ig"
          }
        },
        "Agent": {
          "agentName": "Rick Allen",
          "agency": "Elite Sports Management",
          "contact": {
            "email": "rallen@elitesports.com",
            "phone": "555-123-4567"
          }
        }
      }
    },
    // ... Many thousands of tokens later.
  ]
}
```

## Guiding principles

### 1. Function Call 101

It almost goes without saying—when building function calls, your top priority is to design clear, well-defined functions. This makes it easy to trim response sizes and avoid overwhelming the model. Each function call should be straightforward to explain, sharply scoped, and return only the information needed for its purpose. Overlapping responsibilities between functions inevitably invite hallucinations. Still, you’ll sometimes face data-heavy responses.

Furthermore, you may encounter long-running dependencies in your function calls. Breaking these calls into smaller, focused pieces might significantly improve latency and eliminate the need for mitigations like “just a moment” prompts—which we’ll discuss later in the post.

For example, we can limit the `searchDraftProspects` function call to return only general information and player stats for each prospect—vastly reducing the response size. Expanded details are handled by the `getProspectDetails` function call instead. Remember there’s no one-size-fits-all solution; it’s trial and error.

```json
{
  "tools": [
    {
      "type": "function",
      "name": "searchDraftProspects",
      "description": "Search NBA draft prospects by position, draft year, and projected ranking, returning only general statistics to optimize response size.",
      "parameters": {
        "type": "object",
        "properties": {
          "position": {
            "type": "string",
            "description": "The player's basketball position.",
            "enum": [
              "Point Guard",
              "Shooting Guard",
              "Small Forward",
              "Power Forward",
              "Center",
              "Any"
            ]
          },
          "year": {
            "type": "number",
            "description": "Draft year, e.g., 2025"
          },
          "maxMockDraftRanking": {
            "type": "number",
            "description": "Maximum predicted draft ranking (e.g., top 10)"
          }
        },
        "required": ["position", "year"]
      }
    },
    {
      "type": "function",
      "name": "getProspectDetails",
      "description": "Fetch detailed information for a specific NBA prospect, including comprehensive stats, agent details, and scouting reports.",
      "parameters": {
        "type": "object",
        "properties": {
          "playerName": {
            "type": "string",
            "description": "Full name of the prospect (e.g., Jalen Storm)"
          },
          "year": {
            "type": "number",
            "description": "Draft year, e.g., 2025"
          },
          "includeAgentInfo": {
            "type": "boolean",
            "description": "Include agent information"
          },
          "includeStats": {
            "type": "boolean",
            "description": "Include detailed player statistics"
          },
          "includeScoutingReport": {
            "type": "boolean",
            "description": "Include scouting report details"
          }
        },
        "required": ["playerName", "year"]
      }
    }
  ],
  "tool_choice": "auto"
}

```

**Tip:** Break down large, unwieldy functions into smaller, focused ones with clear roles and responsibilities.
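
On the application side, one way to keep each tool sharply scoped is to route every function call to its own small handler. The sketch below is illustrative only: `toolHandlers` and `dispatchToolCall` are hypothetical names, and the handler bodies are stubs standing in for real data access.

```javascript
// Hypothetical registry: one narrow handler per tool defined in the session.
// The bodies are stubs; real handlers would query your own data sources.
const toolHandlers = new Map([
  [
    "searchDraftProspects",
    async (args) => ({ tool: "searchDraftProspects", args, data: [] }),
  ],
  [
    "getProspectDetails",
    async (args) => ({ tool: "getProspectDetails", args, data: {} }),
  ],
]);

// Route a function call (tool name plus JSON-encoded arguments) to its handler.
// Malformed JSON arguments cause the returned promise to reject.
async function dispatchToolCall(name, argsJson) {
  const handler = toolHandlers.get(name);
  if (!handler) {
    return { error: `Unknown tool: ${name}` };
  }
  return handler(JSON.parse(argsJson));
}
```

Keeping the registry flat like this makes overlapping responsibilities easy to spot: if two handlers need the same arguments and return the same fields, they are candidates for merging or re-scoping.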

### 2. Be context aware

It’s important to note that the service truncates the conversation once the total tokens exceed roughly 14,000–16,000. In other words, the engine keeps only **the most recent ~16k tokens of chat history**; earlier messages—like **long system prompts** or previous turns—are **silently dropped or truncated** as the session continues and the context grows.
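
To get a feel for how quickly a payload eats into that window, a rough size check before sending a function-call output can help. The sketch below assumes about four characters per token, which is only a heuristic and not the model's real tokenizer; `estimateTokens` and `contextShare` are hypothetical helper names.

```javascript
// Rough heuristic (an assumption, not the real tokenizer): ~4 characters per token.
const CHARS_PER_TOKEN = 4;
// Upper end of the truncation range quoted above.
const CONTEXT_BUDGET_TOKENS = 16000;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Estimate how much of the context budget a function-call output would consume.
function contextShare(payload) {
  const text = typeof payload === "string" ? payload : JSON.stringify(payload);
  const tokens = estimateTokens(text);
  return { tokens, share: tokens / CONTEXT_BUDGET_TOKENS };
}
```

If a single payload regularly takes up a large share of the budget, that is a signal to apply the filtering and flattening techniques described later in this guide.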

**1. System Prompt Reminders**

Data-heavy payloads can quickly fill the entire context window. That’s why it’s essential to periodically remind the model of its system instructions, ensuring it maintains its roles and responsibilities.

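```jsx
// Re-send the system prompt so it stays within the model's recent context
const event = {
  type: "session.update",
  session: {
    instructions: systemPrompt,
  },
};

dataChannel.send(JSON.stringify(event));
```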

**Tip:** As conversations grow longer, periodically reinject the system prompt to keep the model on track.
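
One possible way to automate the reminder is to count completed turns and re-send the instructions at a fixed interval. This is a minimal sketch under assumptions: `createPromptReminder` is a hypothetical helper, the interval of five turns is arbitrary, and `send` is whatever function transmits a JSON string to the session (for example `dataChannel.send.bind(dataChannel)`).

```javascript
// Hypothetical helper: re-send the system prompt every `everyNTurns` turns.
// Call the returned function once per completed conversation turn.
function createPromptReminder(send, systemPrompt, everyNTurns = 5) {
  let turns = 0;
  return function onTurnCompleted() {
    turns += 1;
    if (turns % everyNTurns !== 0) {
      return false; // not time for a reminder yet
    }
    send(
      JSON.stringify({
        type: "session.update",
        session: { instructions: systemPrompt },
      })
    );
    return true; // reminder sent on this turn
  };
}
```

A turn counter is the simplest trigger; a token estimate of the traffic since the last reminder would be a reasonable alternative when payload sizes vary widely.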

**2. Out-of-band Responses**

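In some scenarios, you might want to avoid adding situational instructions to the conversation history. Setting `conversation` to `"none"` on a response marks it as out of band, so it is generated without being written to the default conversation.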

```jsx
const prompt = `
A really big redundant prompt...
`;

const event = {
  type: "response.create",
  response: {
    // Setting to "none" indicates the response is out of band
    conversation: "none",
    instructions: prompt,
  },
};

// WebRTC data channel and WebSocket both have .send()
dataChannel.send(JSON.stringify(event));
```

**Tip:** Use out-of-band responses to prevent cluttering the conversation history with unnecessary instructions.

### 3. Data processing and optimization

**1. Data Filtering**

Data size directly impacts the effectiveness of real-time interactions. Generally, fewer tokens returned by function calls lead to better quality responses. Common pitfalls occur when function calls return excessively large payloads spanning thousands of tokens. Focus on applying filters in each function call, either at the data-level or function-level, to minimize response sizes.

```jsx
// Filtered response
{
  "status": {
    "code": 200,
    "message": "SUCCESS"
  },
  "found": 4274,
  "offset": 0,
  "limit": 5,
  "data": [
    {
      "zpid": 7972122,
      "data": {
        "PropertyInfo": {
          "houseNumber": "19661",
          "directionPrefix": "N ",
          "streetName": "Central",
          "streetSuffix": "Ave",
          "city": "Phoenix",
          "state": "AZ",
          "postalCode": "85024",
          "zipPlusFour": "1641",
          "bedroomCount": 2,
          "bathroomCount": 2,
          "storyCount": 1,
          "livingAreaSize": 1089,
          "livingAreaSizeUnits": "Square Feet",
          "yearBuilt": "1985"
        }
      }
    }
    // ...
  ]
}
```

As shown in the previous function call example, you can apply filters directly in the function definition, update the underlying database query, or use post-filters to reduce the response size.

**Tip:** Use pre- or post-filtering in your function calls to trim data-heavy responses down to only the essential fields needed to answer the question. 
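
As an illustration of post-filtering, the sketch below trims the raw prospect payload shown earlier down to a handful of fields. The function names and the choice of fields are assumptions for this example; the field paths follow the example payload in this guide.

```javascript
// Keep only general info and headline stats from one raw prospect record.
// Field paths follow the example payload shown earlier in this guide.
function filterProspect(record) {
  const info = record.data.ProspectInfo;
  return {
    prospectId: record.prospectId,
    firstName: info.Player.personalDetails.firstName,
    lastName: info.Player.personalDetails.lastName,
    position: info.Player.physicalAttributes.position,
    collegeTeam: info.TeamInfo.collegeTeam,
    averages: record.data.Stats.averages,
    mockDraftRanking: record.data.DraftProjection.mockDraftRanking,
  };
}

// Apply the per-record filter and cap the number of results returned.
function filterSearchResponse(response, limit = 5) {
  return {
    found: response.found,
    data: response.data.slice(0, limit).map(filterProspect),
  };
}
```

Everything dropped here (media links, agent contact details, scout IDs) stays available through a more specific call such as `getProspectDetails`.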

**2. Data Flattening**

You might find yourself dealing with hierarchical payloads from API calls. In testing, the model often struggled to interpret deeply nested structures. These hierarchical payloads frequently contain redundant information, which adds unnecessary noise to the model’s context. 

```json
// Flattened payload
{
  "status": {
    "code": 200,
    "message": "SUCCESS"
  },
  "found": 4274,
  "offset": 0,
  "limit": 2,
  "data": [
    {
      "prospectId": 10001,
      "league": "NCAA",
      "collegeId": 301,
      "isDraftEligible": true,
      "firstName": "Jalen",
      "lastName": "Storm",
      "position": "PG",
      "heightFeet": 6,
      "heightInches": 4,
      "weightPounds": 205,
      "hometown": "Springfield",
      "state": "IL",
      "collegeTeam": "Springfield Tigers",
      "conference": "Big West",
      "teamRanking": 12,
      "coachId": 987,
      "coachName": "Marcus Reed",
      "gamesPlayed": 32,
      "minutesPerGame": 34.5,
      "FieldGoalPercentage": 47.2,
      "ThreePointPercentage": 39.1,
      "FreeThrowPercentage": 85.6,
      "averagePoints": 21.3,
      "averageRebounds": 4.1,
      "averageAssists": 6.8,
      "stealsPerGame": 1.7,
      "blocksPerGame": 0.3,
      "strengths": ["Court vision", "Clutch shooting"],
      "areasForImprovement": ["Defensive consistency"],
      "mockDraftRanking": 5,
      "lotteryPickProbability": 88,
      "highlightReelUrl": "https://example.com/highlights/jalen-storm",
      "agentName": "Rick Allen",
      "agency": "Elite Sports Management",
      "contactEmail": "rallen@elitesports.com"
    },
    // ...
  ]
}
```

**Tip:** After your function call, use post-processing to flatten hierarchical payloads as much as possible—without losing key information. 
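
A generic flattening pass might look like the sketch below. It is one possible approach, not the exact transformation used for the payload above: `flatten` is a hypothetical helper that lifts nested fields to the top level and, when a leaf name is already taken, prefixes it with its parent key. In practice you may prefer hand-picked names such as `heightFeet`.

```javascript
// Recursively lift nested object fields to the top level.
// Arrays are kept as-is. A leaf keeps its own key unless that key is already
// taken, in which case the parent key is prepended (e.g. "agent_name").
// Note: a second collision on the same prefixed name would overwrite it.
function flatten(obj, parentKey = "", out = {}) {
  for (const [key, value] of Object.entries(obj)) {
    const isPlainObject =
      value !== null && typeof value === "object" && !Array.isArray(value);
    if (isPlainObject) {
      flatten(value, key, out);
    } else {
      const taken = Object.prototype.hasOwnProperty.call(out, key);
      const name = taken ? `${parentKey || "root"}_${key}` : key;
      out[name] = value;
    }
  }
  return out;
}
```

Run it per record (for example with `response.data.map((r) => flatten(r))`) rather than on the whole response, so that fields from different prospects never collide.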

**3. Data formats**

The format of your data plays a crucial role in how well the model interprets and summarizes returned payloads. In our experiments, structured formats with clear keys, such as YAML or JSON, outperformed tabular formats such as Markdown tables. Lengthy tables tripped up the model, reducing both interpretability and accuracy. YAML performed best in our testing, likely because its nesting structure is easier for the model to follow.

```yaml
status:
  code: 200
  message: "SUCCESS"
found: 4274
offset: 0
limit: 10
data:
  - prospectId: 10001
    data:
      ProspectInfo:
        league: "NCAA"
        collegeId: 301
        isDraftEligible: true
        Player:
          firstName: "Jalen"
          lastName: "Storm"
          position: "PG"
          heightFeet: 6
          heightInches: 4
          weightPounds: 205
          hometown: "Springfield"
          state: "IL"
        TeamInfo:
          collegeTeam: "Springfield Tigers"
          conference: "Big West"
          teamRanking: 12
          coachId: 987
          coachName: "Marcus Reed"
      Stats:
        gamesPlayed: 32
        minutesPerGame: 34.5
        FieldGoalPercentage: 47.2
        ThreePointPercentage: 39.1
        FreeThrowPercentage: 85.6
        averagePoints: 21.3
        averageRebounds: 4.1
        averageAssists: 6.8
        stealsPerGame: 1.7
        blocksPerGame: 0.3
      Scouting:
        strengths:
          - "Court vision"
          - "Clutch shooting"
        areasForImprovement:
          - "Defensive consistency"
      DraftProjection:
        mockDraftRanking: 5
        lotteryPickProbability: 88
      Media:
        highlightReelUrl: "https://example.com/highlights/jalen-storm"
      Agent:
        agentName: "Rick Allen"
        agency: "Elite Sports Management"
        contactEmail: "rallen@elitesports.com"
```

**Tip:** Favor structured formats like JSON or YAML with clear key-value pairs to make returned payloads easier for the model to interpret. Still, don't be afraid to experiment with other formats to strike the right balance.
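
If your tools already produce JSON, converting to YAML before returning the output is a small post-processing step. The sketch below is a minimal emitter written for illustration, not a full YAML library: `toYaml` is a hypothetical helper that assumes plain JSON-like data and simple keys without special characters. For production use, an established YAML package is likely the safer choice.

```javascript
// True for objects and arrays that have at least one entry.
const isNonEmptyContainer = (v) =>
  v !== null && typeof v === "object" && Object.keys(v).length > 0;

// Minimal YAML emitter for JSON-like data. Scalars are written with
// JSON.stringify, so strings are always double-quoted (valid YAML).
// Assumes keys are simple identifiers that need no quoting.
function toYaml(value, indent = 0) {
  const pad = "  ".repeat(indent);
  if (!isNonEmptyContainer(value)) {
    return pad + JSON.stringify(value); // scalars, null, [] and {}
  }
  if (Array.isArray(value)) {
    // Each item starts right after "- "; its remaining lines stay indented.
    return value
      .map((item) => `${pad}- ${toYaml(item, indent + 1).trimStart()}`)
      .join("\n");
  }
  return Object.entries(value)
    .map(([key, v]) =>
      isNonEmptyContainer(v)
        ? `${pad}${key}:\n${toYaml(v, indent + 1)}`
        : `${pad}${key}: ${JSON.stringify(v)}`
    )
    .join("\n");
}
```

The result can be passed as the `output` string of a `function_call_output` item in place of the JSON text.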

### 4. Function call hints

The model often stumbles when moving directly from data-heavy responses to a clear answer. By adding explicit prompt hints immediately after a function call, you help it interpret the returned data more effectively—especially when you can’t trim those hefty responses any further. This approach is especially powerful for clarifying abbreviations and other nuanced fields.

The following example shows a hint prompt that helps the model respond effectively after receiving a large payload.

```jsx
// Function call hint
let prospectSearchPrompt = `
Parse NBA prospect data and provide a concise, engaging response.

General Guidelines
- Act as an NBA scouting expert.
- Highlight key strengths and notable attributes.
- Use conversational language.
- Mention identical attributes once.
- Ignore IDs and URLs.

Player Details
- State height conversationally ("six-foot-eight").
- Round weights to nearest 5 lbs.

Stats & Draft Info
- Round stats to nearest whole number.
- Use general terms for draft ranking ("top-five pick").

Experience
- Refer to players as freshman, sophomore, etc., or mention professional experience.

Location & Team
- Mention hometown city and state/country.
- Describe teams conversationally.

Skip (unless asked explicitly)
- Exact birth dates
- IDs
- Agent/contact details
- URLs

Examples
- "Jalen Storm, a dynamic six-foot-four point guard from Springfield, Illinois, averages 21 points per game."
- "Known for his clutch shooting, he's projected as a top-five pick."

Important: Respond based strictly on provided data, without inventing details.
`;
```

In practice, we append the data as a conversation item before emitting a response event from the Realtime API. Voilà—the model gracefully handles all the information.

```jsx
// Add new conversation item for the model
const conversationItem = {
  type: 'conversation.item.create',
  previous_item_id: output.id,
  item: {
    call_id: output.call_id,
    type: 'function_call_output',
    output: `Draft Prospect Search Results: ${result}`
  }
};

dataChannel.send(JSON.stringify(conversationItem));

// Emit a response from the model including the hint prompt
const event = {
  type: 'response.create',
  response: {
    conversation: "none",
    instructions: prospectSearchPrompt // function call hint
  }
};

dataChannel.send(JSON.stringify(event));
```

**Tip:** No matter how large the payload, function call hints can dramatically boost the model’s coherence and accuracy when interpreting complex data. These hints—whether task definitions, guidelines, domain knowledge, or examples—sharpen the model’s focus. Without them, the model may drift or struggle to respond effectively. 

### 5. “Just a moment” prompts

Sometimes, you’ll encounter long-running function calls that simply can’t be optimized. In these cases, expect awkwardly long pauses—not exactly ideal for business. Try appending an additional follow-up to reassure the user that their request is being processed.

```jsx
// Function call executed ...

// "Just a moment" prompt
const responseEvent = {
  type: 'response.create',
  response: {
    // Out of band: keeps the filler message out of the conversation history
    conversation: "none",
    instructions: "Let the user know you're working on the request."
  }
};

// Function call hint prompt ...
```

**Tip:** Use “just a moment” prompts for long-running function calls to prevent awkward pauses and avoid confusing silence.
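
To avoid interrupting fast calls with a filler message, you could send the prompt only when a function call runs past a threshold. The sketch below is one way to do that under assumptions: `withJustAMoment` is a hypothetical helper, the 1500 ms default is arbitrary, `send` transmits a JSON string to the session, and `runTool` is any function that returns a promise for the tool result.

```javascript
// Run a slow tool call and send a "just a moment" response only if the call
// has not finished within `delayMs`.
async function withJustAMoment(send, runTool, delayMs = 1500) {
  const timer = setTimeout(() => {
    send(
      JSON.stringify({
        type: "response.create",
        response: {
          conversation: "none", // out of band, as in the earlier example
          instructions: "Let the user know you're working on the request.",
        },
      })
    );
  }, delayMs);
  try {
    return await runTool();
  } finally {
    clearTimeout(timer); // cancel the filler if the tool finished in time
  }
}
```

A call site might look like `const result = await withJustAMoment((m) => dataChannel.send(m), () => searchDraftProspects(args));`, where `searchDraftProspects` stands for your own handler.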

## Wrapping up

In closing, building effective data-intensive apps with the Realtime API is a continual journey of exploration and adaptation. The best results come from experimenting with data formats, response filtering, and prompt hints until you discover the right mix for your needs. Stay open to tinkering and learning—your ideal solution is always just a few iterations away. As always, we’ll keep you posted on new techniques and emerging best practices to help you get the most out of the API.