Efficient prompts reduce operational costs and latency while maintaining effectiveness. A typical target: 30-50% token reduction without information loss.
Eliminate: "please", "kindly", "in order to", "make sure to", "carefully", "thoroughly"
Before: "Please kindly transcribe the following audio recording carefully and thoroughly, making sure to capture every single word accurately."
After: "Transcribe audio accurately."
Savings: 18 tokens → 4 tokens (78% reduction)
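Filler removal like the example above can be automated with a simple pass over the prompt. This is a minimal sketch; the filler list mirrors the one above and can be extended:

```python
import re

# Filler phrases that add tokens without adding meaning.
FILLERS = [
    "please", "kindly", "in order to", "make sure to",
    "carefully", "thoroughly",
]

def strip_fillers(prompt: str) -> str:
    """Remove filler phrases, then tidy leftover whitespace and punctuation."""
    pattern = r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b"
    cleaned = re.sub(pattern, "", prompt, flags=re.IGNORECASE)
    cleaned = re.sub(r"\s+", " ", cleaned)          # collapse double spaces
    cleaned = re.sub(r"\s+([.,])", r"\1", cleaned)  # no space before . or ,
    return cleaned.strip()

print(strip_fillers("Please kindly transcribe the audio carefully."))
# -> "transcribe the audio."
```

A pass like this is best used as an audit aid during prompt review, not blindly in production, since some "filler" words can carry meaning in context.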
Replace verbose explanations with clear section markers. Use triple hashtags, dashes, or brackets to separate instruction components.
Before (32 tokens): "The context for this conversation is that the user is calling about a job application. The job title is Senior Software Engineer. The task you need to perform is to screen the candidate and collect their information."
After (19 tokens):
Job application call | Title: Senior Software Engineer
Screen candidate, collect: name, experience, availability, salary expectations
Savings: Clearer structure, 40% token reduction (32→19 tokens), improved model parsing
Why It Works:
- Section headers (###) provide explicit semantic boundaries
- Pipe separator (|) efficiently connects related facts
- Comma-separated lists replace verbose phrases
- Models generally parse structured data more reliably than free-form prose
- Visual clarity aids both human review and LLM comprehension
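Structured prompts like the one above can be assembled programmatically. A minimal sketch; the helper name, field names, and header labels are illustrative, not a fixed schema:

```python
def build_structured_prompt(context: dict, task: str, fields: list) -> str:
    """Assemble a compact prompt using ### section headers and | separators."""
    context_line = " | ".join(f"{k}: {v}" for k, v in context.items())
    return "\n".join([
        "### Context",
        context_line,
        "### Task",
        f"{task}, collect: {', '.join(fields)}",
    ])

prompt = build_structured_prompt(
    {"Call type": "Job application", "Title": "Senior Software Engineer"},
    "Screen candidate",
    ["name", "experience", "availability", "salary expectations"],
)
print(prompt)
```

Building prompts from fields also makes token budgets easier to enforce, since each component can be measured and trimmed independently.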
Apply abbreviations where meaning remains unambiguous in context.
Safe abbreviations:
- avg (average)
- sec (seconds)
- min (minutes)
- info (information)
- config (configuration)
- doc (document)
Before: "Calculate average call duration in seconds, number of speakers, and overall sentiment."
After: "Calculate: avg_duration_sec, speaker_count, sentiment"
Savings: 12 tokens → 7 tokens, preserves complete meaning
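The safe-abbreviation table above can be applied mechanically. A sketch using whole-word replacement so substrings (e.g., "in" inside "minutes") are never touched:

```python
import re

# Safe abbreviations from the list above.
ABBREVIATIONS = {
    "average": "avg",
    "seconds": "sec",
    "minutes": "min",
    "information": "info",
    "configuration": "config",
    "document": "doc",
}

def abbreviate(text: str) -> str:
    """Replace whole words with their abbreviations, case-insensitively."""
    pattern = r"\b(" + "|".join(ABBREVIATIONS) + r")\b"
    return re.sub(pattern, lambda m: ABBREVIATIONS[m.group(0).lower()],
                  text, flags=re.IGNORECASE)

print(abbreviate("Calculate average call duration in seconds"))
# -> "Calculate avg call duration in sec"
```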
Replace prose-based instructions with format specifications.
Before: "Please provide the call summary including the duration of the call, the number of speakers involved, and the overall sentiment detected during the conversation."
After: "Return JSON: {duration_sec: int, speakers: int, sentiment: positive|neutral|negative}"
Savings: More concise, machine-parseable, enforces format compliance
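A format specification also lets the caller validate replies. This is a minimal sketch of checking a model's reply against the compact spec above; the function name is illustrative:

```python
import json

ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}

def parse_summary(raw: str) -> dict:
    """Validate a reply against the spec:
    {duration_sec: int, speakers: int, sentiment: positive|neutral|negative}"""
    data = json.loads(raw)
    if not isinstance(data["duration_sec"], int):
        raise ValueError("duration_sec must be an int")
    if not isinstance(data["speakers"], int):
        raise ValueError("speakers must be an int")
    if data["sentiment"] not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {data['sentiment']}")
    return data

reply = '{"duration_sec": 312, "speakers": 2, "sentiment": "positive"}'
print(parse_summary(reply))
```

Failing fast on malformed replies is what makes the terse spec safe: the format is enforced downstream rather than re-explained in prose.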
Voicemail Detection Example:
Before (58 tokens): "You need to carefully analyze the audio input and look for various indicators that might suggest the bot has reached a voicemail system instead of connecting with a live person. Pay close attention to typical voicemail greetings, lack of interactive responses, beep sounds, or mentions of leaving a message."
After (~30 tokens): "Voicemail indicators: greeting messages, beeps, 'leave a message', no interaction. If detected: invoke end_call_global immediately. Do not leave message."
Savings: roughly 50% reduction while preserving critical decision logic
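The condensed instruction above compresses well because the decision logic is a simple indicator check. A keyword-heuristic sketch; the cue phrases are illustrative, and the print stands in for the platform's end_call_global tool:

```python
# Cue phrases typical of voicemail greetings (illustrative, not exhaustive).
VOICEMAIL_CUES = ("leave a message", "after the tone", "not available", "beep")

def is_voicemail(transcript: str) -> bool:
    """Keyword heuristic matching the condensed prompt's indicator list."""
    lower = transcript.lower()
    return any(cue in lower for cue in VOICEMAIL_CUES)

# In a real agent this decision would trigger the end-call tool;
# here we just print the action.
if is_voicemail("Hi, you've reached Sam. Please leave a message after the tone."):
    print("invoke end_call_global")
```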
For long contexts (2000+ words), replace a single massive prompt with a sequence of smaller prompts.
Pattern:
- Step 1: Summarize full transcript (output: 200 tokens max)
- Step 2: Extract action items from summary
- Step 3: Analyze sentiment from summary
Benefits: Lower per-call token usage, better accuracy on focused tasks, reduced context overflow risk
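The three-step pattern above can be sketched as sequential calls against any completion function. Here `llm` is a generic stand-in (prompt in, text out); the stub used for the demo just echoes the prompt's first line:

```python
def run_chain(transcript: str, llm) -> dict:
    """Summarize once, then run each focused task on the short summary
    instead of the full transcript."""
    summary = llm(f"Summarize in <=200 tokens:\n{transcript}")
    return {
        "summary": summary,
        "action_items": llm(f"List action items from:\n{summary}"),
        "sentiment": llm(f"Classify sentiment (positive|neutral|negative) of:\n{summary}"),
    }

# Stub LLM for illustration: echoes the first line of the prompt.
fake_llm = lambda prompt: prompt.splitlines()[0]
result = run_chain("Customer asked for a refund on order 4521...", fake_llm)
print(result["summary"])
```

Because steps 2 and 3 consume the summary rather than the transcript, per-call token usage stays flat no matter how long the original context is.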
Avoid restating common knowledge the model already possesses.
Before: "You are an expert call analyst with 10 years of experience in customer service, trained in active listening techniques, familiar with various industries, and skilled at identifying customer pain points and sentiment."
After: "You are a call analyst. Extract: topics, sentiment, action items."
Savings: 35 tokens → 10 tokens (71% reduction)
Identify and eliminate instructions stated multiple times throughout the prompt.
Audit process: Search for repeated phrases like "make sure to", "remember to", "don't forget to", "it's important that". Consolidate into single instruction section.
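The audit step above is easy to script. A sketch that counts how often each redundancy marker appears, so any marker with 2+ hits becomes a consolidation target:

```python
from collections import Counter

REDUNDANCY_MARKERS = [
    "make sure to", "remember to", "don't forget to", "it's important that",
]

def audit_redundancy(prompt: str) -> Counter:
    """Count occurrences of each redundancy marker in a prompt."""
    lower = prompt.lower()
    return Counter({m: lower.count(m) for m in REDUNDANCY_MARKERS if m in lower})

prompt = (
    "Make sure to greet the caller. Collect their name. "
    "Remember to greet the caller before asking questions. "
    "Make sure to confirm the spelling of their name."
)
print(audit_redundancy(prompt))
```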