# Prompt Decomposition Example
This notebook contains an example of prompt decomposition: taking one long prompt and breaking it down into smaller parts.  These smaller parts can then be run independently, often in parallel, and often on smaller models.  This can lead to significant speed improvements and cost savings, may increase quality, and can make your prompts much easier to maintain as the workload grows.  This is because each step can be maintained and tested independently, so a tweak to one step does not affect all the others (which may happen when they are all part of one huge prompt).

Here are three common times when Prompt Decomposition can be helpful:
  1) Preprocessing of RAG or context data.  For example, if context data is large, such as a long support document, consider a prompt that summarizes that content once, so that future queries retrieve the summary rather than the long document. 
  2)  Breaking multi-step prompts into a maintainable DAG.  This can be helpful when a prompt reads like code, with a large number of steps or if/then instructions.  Instead, consider breaking these out and generating a flow diagram, resulting in maintainable individual pieces.
  3)  Streamlining long linear prompts.  Even where prompts have only a single logical flow, if they are long, it can help to break the flow into small, sequential steps.  These steps combined may execute faster than the long original prompt, and they can also be maintained and tested independently. 
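
As a rough illustration of the second case, the sketch below runs a small set of prompts in dependency order. It is only a minimal sketch: `run_prompt_dag`, `fake_model`, the step names, and the prompt texts are all hypothetical, and the `{{NAME}}` placeholder style simply mirrors the `{{NOVEL}}` templates used later in this notebook. A stand-in function replaces the real model call so the example runs offline.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_prompt_dag(steps, deps, call_model):
    """Run each prompt only after the prompts it depends on.

    steps: dict of step name -> prompt template
    deps: dict of step name -> set of step names whose output it needs
    call_model: any function that takes a prompt string and returns text
    """
    # Every step gets an entry, even if it has no prerequisites.
    graph = {name: set(deps.get(name, ())) for name in steps}
    outputs = {}
    for name in TopologicalSorter(graph).static_order():
        prompt = steps[name]
        # Fill {{STEP_NAME}} placeholders with the outputs of earlier steps.
        for dep in graph[name]:
            prompt = prompt.replace("{{" + dep.upper() + "}}", outputs[dep])
        outputs[name] = call_model(prompt)
    return outputs

# Hypothetical stand-in for a real model call.
def fake_model(prompt):
    return "<" + prompt + ">"

steps = {
    "summary": "Summarize the support document.",
    "answer": "Using this summary: {{SUMMARY}} answer the customer.",
}
deps = {"answer": {"summary"}}
dag_outputs = run_prompt_dag(steps, deps, fake_model)
print(dag_outputs["answer"])
```

Because each step is a separate template, a single step can be edited or tested without touching the others.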

The notebook follows this structure:
  1) Set up the environment
  2) Examples of decomposition

## 1) Set up the environment

In [1]:
!pip install anthropic


In [2]:
#create the Anthropic client, but estimate token counts locally from the word count (no API call is made)
import os
from anthropic import Anthropic # type: ignore
client = Anthropic(api_key=os.environ.get('ANTHROPIC_API_KEY', 'dummy-key'))
def count_tokens(text):
    return len(text.split()) * 1.3  # Rough estimate: 1.3 tokens per word

In [None]:
!pip install boto3

In [3]:
#for connecting with Bedrock, use Boto3
import boto3, time, json # type: ignore
from botocore.config import Config # type: ignore

#increase the standard time out limits in boto3, because Bedrock may take a while to respond to large requests.
my_config = Config(
    connect_timeout=60*3,
    read_timeout=60*3,
)
bedrock = boto3.client(service_name='bedrock-runtime',config=my_config)
bedrock_service = boto3.client(service_name='bedrock',config=my_config)

In [4]:
#check that it's working:
models = bedrock_service.list_foundation_models()
for line in models["modelSummaries"]:
    #print (line["modelId"])
    pass
if "anthropic.claude-3" in str(models):
    print("Claude 3 found!")
else:
    print("Error, no model found.")

Claude 3 found!


In [5]:
MAX_ATTEMPTS = 1 #maximum number of attempts per request (1 means no retry if Claude is not working).
session_cache = {} #for this session, do not repeat the same query to Claude.
def ask_claude(messages,system="", DEBUG=False, model="haiku"):
    '''
    Send a prompt to Bedrock, and return [prompt_text, response_text].
    DEBUG prints exactly what is being sent to and received from Bedrock.
    messages can be a list of role/content dicts, or a string.
    '''
    raw_prompt_text = str(messages)
    #print ("Calling %s on Bedrock.  Prompt length (tokens):%s"%(model,count_tokens(raw_prompt_text)))
    if isinstance(messages, str):
        messages = [{"role": "user", "content": messages}]
    
    promt_json = {
        "system":system,
        "messages": messages,
        "max_tokens": 10000,
        "temperature": 0.7,
        "anthropic_version":"bedrock-2023-05-31", #Bedrock requires this value for Anthropic models
        "top_k": 250,
        "top_p": 0.7,
        "stop_sequences": ["\n\nHuman:"]
    }
    
    
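    #messages is a list of dicts at this point, so serialize it rather than joining it as strings.
    if DEBUG: print("sending:\nSystem:\n",system,"\nMessages:\n",json.dumps(messages, indent=2))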
    
    if model== "opus":
        #note: "opus" currently maps to the same model ID as "sonnet".
        modelId = 'us.anthropic.claude-3-5-sonnet-20241022-v2:0'
    elif model== "sonnet":
        modelId = 'us.anthropic.claude-3-5-sonnet-20241022-v2:0'
    elif model== "haiku":
        modelId = 'us.anthropic.claude-3-5-haiku-20241022-v1:0'
    else:
        print ("ERROR:  Bad model, must be opus, sonnet, or haiku.  Falling back to haiku.")
        modelId = 'us.anthropic.claude-3-5-haiku-20241022-v1:0'
    
    if raw_prompt_text in session_cache:
        return [raw_prompt_text,session_cache[raw_prompt_text]]
    attempt = 1
    while True:
        try:
            response = bedrock.invoke_model(body=json.dumps(promt_json), modelId=modelId, accept='application/json', contentType='application/json')
            response_body = json.loads(response.get('body').read())
            results = response_body.get("content")[0].get("text")
            if DEBUG:print("Received:",results)
            break
        except Exception as e:
            print("Error with calling Bedrock: "+str(e))
            attempt+=1
            if attempt>MAX_ATTEMPTS:
                print("Max attempts reached!")
                results = str(e)
                break
            else:#retry in 10 seconds
                time.sleep(10)
    session_cache[raw_prompt_text] = results
    return [raw_prompt_text,results]
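
Note that `session_cache` above is keyed on the prompt text alone, so the same prompt sent to a different model, or with a different system prompt, would return the earlier cached answer unless the cache is reset first (as the later cells do). One possible alternative, shown only as a sketch, is to build the key from the model, system prompt, and messages together. The helper name `make_cache_key` is hypothetical and is not used elsewhere in this notebook.

```python
import hashlib
import json

def make_cache_key(messages, system="", model="haiku"):
    """Build a cache key that changes when the model or system prompt changes."""
    payload = json.dumps(
        {"model": model, "system": system, "messages": messages},
        sort_keys=True,
    )
    # Hash the payload so very long prompts still give a short, fixed-size key.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

key_haiku = make_cache_key("Please say the number four.", model="haiku")
key_sonnet = make_cache_key("Please say the number four.", model="sonnet")
print(key_haiku != key_sonnet)
```

With a key like this, timing comparisons between models would not need the cache to be cleared by hand.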

In [6]:
%%time
#check that it's working:
session_cache = {} 
try:
    query = "Please say the number four."
    #query = [{"role": "user", "content": "Please say the number two."},{"role": "assistant", "content": "Two."},{"role": "user", "content": "Please say the number three."}]
    result = ask_claude(query)
    print(query)
    print(result[1])
except Exception as e:
    print("Error with calling Claude: "+str(e))

Please say the number four.
Four.
CPU times: total: 203 ms
Wall time: 1.85 s


In [7]:
from queue import Queue
from threading import Thread

# Threaded function for queue processing.
def thread_request(q, result):
    while not q.empty():
        work = q.get()                      #fetch new work from the Queue
        thread_start_time = time.time()
        try:
            data = ask_claude(work[1],model=work[2])
            result[work[0]] = data          #Store data back at correct index
        except Exception as e:
            print('Error with prompt!',str(e))
            result[work[0]] = [str(work[1]),str(e)]  #keep the [prompt, response] shape used by ask_claude
        #signal to the queue that task has been processed
        q.task_done()
    return True

def ask_claude_threaded(prompts,model="haiku",DEBUG=False):
    '''
    Call ask_claude, but multi-threaded.
    Returns a list of [prompt, response] pairs, in the same order as the prompts.
    '''
    q = Queue(maxsize=0)
    num_threads = min(50, len(prompts))
    
    #Populating Queue with tasks
    results = [{} for x in prompts]
    #load up the queue with the index, prompt, and model for each job (as a tuple):
    for i in range(len(prompts)):
        q.put((i,prompts[i],model))
        
    #Starting worker threads on queue processing
    for i in range(num_threads):
        #print('Starting thread ', i)
        worker = Thread(target=thread_request, args=(q,results))
        worker.daemon = True      #setting threads as "daemon" allows main program to 
                                  #exit eventually even if these don't finish 
                                  #correctly.  (setDaemon() is deprecated.)
        worker.start()

    #now we wait until the queue has been processed
    q.join()

    if DEBUG:print('All tasks completed.')
    return results
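
The same fan-out pattern can also be written with the standard library's `concurrent.futures`. The sketch below is an alternative, not the approach this notebook uses: `ask_threaded_with_pool` and `fake_ask` are hypothetical names, and `fake_ask` stands in for `ask_claude` so the example runs without Bedrock access. One difference to be aware of is that an exception raised inside a worker is re-raised when the results are collected, rather than being stored as a string.

```python
from concurrent.futures import ThreadPoolExecutor

def ask_threaded_with_pool(prompts, ask_fn, max_workers=50):
    """Call ask_fn on every prompt in parallel and keep the input order."""
    if not prompts:
        return []
    workers = min(max_workers, len(prompts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the inputs.
        return list(pool.map(ask_fn, prompts))

# Hypothetical stand-in for ask_claude: returns [prompt, response].
def fake_ask(prompt):
    return [prompt, prompt.upper()]

pool_results = ask_threaded_with_pool(["one", "two", "three"], fake_ask)
print(pool_results)
```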

In [8]:
%%time
#test if our threaded Claude calls are working
session_cache = {} 
#q1 = [{"role": "user", "content": "Please say the number one."}]
#q2 = [{"role": "user", "content": "Please say the number two."}]
#q3 = [{"role": "user", "content": "Please say the number three."}]
#print(ask_claude_threaded([q1,q2,q3]))
print(ask_claude_threaded(["Please say the number one.","Please say the number two.","Please say the number three.","Please say the number four.","Please say the number five."],model='sonnet'))




[['Please say the number one.', 'one'], ['Please say the number two.', '2'], ['Please say the number three.', 'three'], ['Please say the number four.', 'four'], ['Please say the number five.', 'five']]
CPU times: total: 1.42 s
Wall time: 2.17 s


## 2) Examples of decomposition

Here we'll consider an example use case of a user who would like to understand how many unique characters are in a novel, and a bit about the three most common ones.

### Start by downloading the novel.  Here we use Frankenstein by Mary Shelley, as it is in the public domain.

In [None]:
!pip install beautifulsoup4

In [9]:
import requests, re
from bs4 import BeautifulSoup  # type: ignore

In [10]:
#grab the text from Project Gutenberg, a collection of public domain works.
#We use Beautiful Soup to parse the HTML of the webpage.
url = "https://www.gutenberg.org/files/84/84-h/84-h.htm"
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
raw_full_text_webpage = soup.text

In [None]:
!pip install --upgrade anthropic

In [11]:
#Cut the top and bottom of the webpage so that we only have the text of the book.
raw_full_text = raw_full_text_webpage[raw_full_text_webpage.index("Letter 1\n\nTo Mrs. Saville, England."):raw_full_text_webpage.find("*** END OF THE PROJECT GUTENBERG EBOOK FRANKENSTEIN ***")].replace("\r\n"," ").replace("\n", " ")
#encode some misc unicode characters.
full_text = raw_full_text.encode('raw_unicode_escape').decode()
#show that we found the expected length
words_count = len(full_text.split())
pages_count = int(words_count/500)#quick estimate, real page count is dependent on page and font size.

token_count = count_tokens(full_text)
print ("Approximate word count:",words_count)
print ("Approximate page count:",pages_count)
print ("Approximate token count:",token_count)

Approximate word count: 74984
Approximate page count: 149
Approximate token count: 97479.2
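
The fractional token count above comes from the 1.3 tokens-per-word estimate in `count_tokens`. If a whole number is preferred, one option is to round the estimate up, as in this small sketch. `estimate_tokens` is a hypothetical variant and is not the function used in the rest of the notebook; the 1.3 ratio is the same rough assumption as above, not a real tokenizer.

```python
import math

def estimate_tokens(text, tokens_per_word=1.3):
    """Rough token estimate from the word count, rounded up to a whole number."""
    return math.ceil(len(text.split()) * tokens_per_word)

sample = "It was on a dreary night of November"
print(estimate_tokens(sample))
```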


### Now that we have our novel, let's try to find all the unique characters with a single prompt.

In [12]:
long_prompt_template = """Consider the following novel:
<novel>
{{NOVEL}}
</novel>

How many unique characters are there with at least one spoken line of dialog?  Please also provide a brief description of the top three most common characters in separate paragraphs. 
Only count characters that have at least one spoken line of dialog.
"""

long_prompt = long_prompt_template.replace("{{NOVEL}}",full_text)
print ("Approximate prompt token count:",count_tokens(long_prompt))

Approximate prompt token count: 97541.6


In [13]:
%%time
session_cache = {}#don't use cached info, since we want to time this.
long_responce = ask_claude(long_prompt, model="sonnet")[1]
print(long_responce)

Let me analyze the characters with spoken dialog in the novel:

Characters with spoken lines:
1. Victor Frankenstein
2. The Monster/Creature/Daemon
3. Elizabeth Lavenza
4. Henry Clerval
5. Alphonse Frankenstein (Victor's father)
6. M. Krempe
7. M. Waldman
8. Justine Moritz
9. Robert Walton
10. The Magistrate (Mr. Kirwin)
11. Felix De Lacey
12. Various sailors/villagers (minor characters with brief lines)

Top 3 most prominent speaking characters:

Victor Frankenstein is the primary protagonist and narrator for much of the novel. He is a brilliant but obsessed scientist who creates the monster that ultimately destroys his life. His dialog reveals his passionate nature, his guilt over creating the monster, and his descent from ambitious scientist to broken man seeking revenge.

The Monster/Creature is Victor's creation who becomes the antagonist, though his extensive dialogue reveals him to be articulate and complex rather than simply evil. His speeches show his evolution from an innocen

## Not bad!  Roughly 97K tokens processed (by our word-count estimate).  Let's see if we can make that faster and cheaper using prompt decomposition.
### We'll divide the novel into thirds, run each third in parallel, then write a fourth prompt to combine the results.

In [14]:
short_prompt_template = """Consider the following portion of a novel:
<novel>
{{NOVEL}}
</novel>

Please provide a list of unique characters, each in a character tag.  Inside the character tag should be a name tag with their name,
a count tag with an exact count of times they appear, and a description tag with a brief description of that character.
Only count characters that have at least one spoken line of dialog.
"""

#let's cut the novel into thirds.
third = int(len(full_text)/3)
short_prompt_1 = short_prompt_template.replace("{{NOVEL}}",full_text[:third])
short_prompt_2 = short_prompt_template.replace("{{NOVEL}}",full_text[third:third+third])
short_prompt_3 = short_prompt_template.replace("{{NOVEL}}",full_text[third+third:])
print ("Approximate prompt token count:",count_tokens(short_prompt_1))

Approximate prompt token count: 32208.800000000003
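
Slicing by character index, as above, can cut a word (or a line of dialog) in half at each boundary. A slightly more careful split moves each cut forward to the next space. The sketch below is one way to do that under simple assumptions: `split_on_whitespace` is a hypothetical helper, it only protects word boundaries (not sentences or dialog), and the pieces may be slightly uneven.

```python
def split_on_whitespace(text, parts=3):
    """Split text into `parts` pieces of similar length without cutting words."""
    target = len(text) // parts
    chunks, start = [], 0
    for _ in range(parts - 1):
        # Move the cut forward to the next space at or after the target size.
        cut = text.find(" ", start + target)
        if cut == -1:
            break
        chunks.append(text[start:cut])
        start = cut + 1
    chunks.append(text[start:])
    return chunks

sample_text = "alpha beta gamma delta epsilon zeta eta theta iota"
sample_chunks = split_on_whitespace(sample_text, parts=3)
print(sample_chunks)
```

Joining the pieces with single spaces gives back the original text, so no words are lost or duplicated.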


### Now let's run these three prompts in parallel

In [15]:
%%time
session_cache = {}#don't use cached info, since we want to time this.
short_responces = ask_claude_threaded([short_prompt_1,short_prompt_2,short_prompt_3],model='sonnet')
#show the reply from one of the three prompts
print(short_responces[0][1])



Here are the characters with spoken dialog from the provided text:

<character>
<name>Victor Frankenstein</name>
<count>3</count>
<description>The protagonist and narrator, a young scientist who creates a monster. He is well-educated, ambitious, and becomes consumed by guilt over his creation.</description>
</character>

<character>
<name>Henry Clerval</name>
<count>4</count>
<description>Victor's best friend who studies languages. He is kind, supportive, and helps nurse Victor back to health after his illness.</description>
</character>

<character>
<name>M. Krempe</name>
<count>3</count>
<description>A professor of natural philosophy at Ingolstadt. He is blunt and dismissive of Victor's early studies in alchemy.</description>
</character>

<character>
<name>M. Waldman</name>
<count>2</count>
<description>Another professor at Ingolstadt who encourages Victor's scientific pursuits. He is gentle and well-respected.</description>
</character>

<character>
<name>Elizabeth Lavenza</name>
<

## So far it's looking good!  We've processed the whole novel in around 17 seconds, down from 42.  Let's make a final call to get a final result that matches our original long prompt.

In [16]:
final_prompt_template = """Consider the following list of characters from a novel.  Each entry contains the character's name,
a count of the number of times they appeared, and a brief description of that character:
<characters>
{{CHARACTERS}}
</characters>
Some characters may be listed more than once.  Use the name and description to determine that two entries are the same, 
and if they are, sum their count to support your response.

How many unique characters are there?  Please also provide a brief description of the top three most common characters in separate paragraphs. 
"""

characters = short_responces[0][1]+short_responces[1][1]+short_responces[2][1]

final_prompt = final_prompt_template.replace("{{CHARACTERS}}",characters)
print ("Approximate prompt token count:",count_tokens(final_prompt))

Approximate prompt token count: 842.4
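
Because the short prompts ask for `character`, `name`, and `count` tags, the counts could also be summed in plain Python before (or instead of) the final model call. The sketch below is only an illustration: `merge_character_counts` is a hypothetical helper, the two sample replies are made-up data in the same tag format, and it merges exact name matches only, so it would not recognize "Elizabeth" and "Elizabeth Lavenza" as the same person the way the final prompt can. Non-numeric counts are treated as zero.

```python
import re
from collections import Counter

CHARACTER_RE = re.compile(
    r"<character>\s*<name>(.*?)</name>\s*<count>(.*?)</count>", re.DOTALL
)

def merge_character_counts(*replies):
    """Sum the <count> values per exact <name> across several model replies."""
    totals = Counter()
    for reply in replies:
        for name, count in CHARACTER_RE.findall(reply):
            count = count.strip()
            # Keep the character even when the count is not a plain number.
            totals[name.strip()] += int(count) if count.isdigit() else 0
    return totals

# Made-up sample replies in the tag format requested by the short prompt.
reply_a = ("<character>\n<name>Henry Clerval</name>\n<count>4</count>\n"
           "<description>Friend.</description>\n</character>")
reply_b = ("<character>\n<name>Henry Clerval</name>\n<count>2</count>\n"
           "<description>Friend.</description>\n</character>\n"
           "<character>\n<name>M. Krempe</name>\n<count>3</count>\n"
           "<description>Professor.</description>\n</character>")
totals = merge_character_counts(reply_a, reply_b)
print(totals.most_common(3))
```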


In [17]:
%%time
session_cache = {}#don't use cached info, since we want to time this.
final_responce = ask_claude(final_prompt, model="sonnet")[1]
print(final_responce)

Let me analyze the unique characters and combine the counts for duplicates.

There are 15 unique characters after combining duplicates:
Victor Frankenstein (193 total appearances), The Monster/Creature/Daemon/Fiend (106 appearances), Henry Clerval (16 appearances), Elizabeth Lavenza/Elizabeth (15 appearances), Felix (7 appearances), De Lacey (6 appearances), Mr. Kirwin (6 appearances), M. Krempe (3 appearances), Ernest Frankenstein (3 appearances), M. Waldman (2 appearances), Justine Moritz (2 appearances), Alphonse Frankenstein (2 appearances), William (2 appearances), Robert Walton (4 appearances), and The Old Woman/Nurse (2 appearances).

Here are descriptions of the top three most frequently appearing characters:

Victor Frankenstein is the protagonist and narrator of the story, appearing 193 times throughout the novel. He is a well-educated and ambitious young scientist who creates a monster and becomes consumed by guilt over his creation. As the story progresses, he suffers great

### This final prompt took about 5 seconds to run.  The original long prompt took 42 seconds to run, and our decomposed version took 17s + 5s, or 22 seconds total.  Almost twice as fast to do the same amount of work!
Note that in the run shown here the decomposed version listed 15 unique characters, while the single long prompt listed 12 entries.  It is fairly common for quality to improve slightly with smaller, more focused prompts, because the LLM has less text to attend to in each call.
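
The arithmetic behind that comparison can be checked in a few lines. The timings are the approximate wall-clock figures quoted above, and the token figures are the word-count estimates printed earlier; the decomposed token total assumes all three thirds are about the same size as the first one, which is an approximation.

```python
# Approximate wall-clock timings quoted in this notebook (seconds).
long_prompt_seconds = 42
parallel_seconds = 17
final_seconds = 5

decomposed_seconds = parallel_seconds + final_seconds
speedup = long_prompt_seconds / decomposed_seconds
print(f"Decomposed total: {decomposed_seconds}s, speedup: {speedup:.2f}x")

# Estimated input tokens, using the printed estimates from earlier cells.
long_prompt_tokens = 97541.6
decomposed_tokens = 3 * 32208.8 + 842.4  # assumes three equal-sized thirds
print(f"Input tokens: {long_prompt_tokens:.0f} vs {decomposed_tokens:.0f}")
```

The input token totals are nearly identical, which suggests that with the same model the gain here comes mainly from running the thirds in parallel rather than from sending less text.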

## For fun, let's repeat the exact same tests as above, but with the smaller, faster Haiku model.

In [18]:
%%time
session_cache = {}#don't use cached info, since we want to time this.
long_responce_haiku = ask_claude(long_prompt, model="haiku")[1]
print(long_responce_haiku)

Let me count the unique characters with spoken dialog:

1. Victor Frankenstein
2. The Monster/Creature
3. Elizabeth Lavenza
4. Robert Walton
5. Victor's Father (Alphonse Frankenstein)
6. Henry Clerval
7. M. Waldman
8. M. Krempe
9. Justine Moritz
10. De Lacey
11. Felix
12. Safie
13. A magistrate in Ireland
14. Mr. Kirwin
15. A sailor/ship's leader
16. A rustic who shoots at the monster

Total: 16 unique characters with spoken dialog

Top 3 Most Common Characters:

Victor Frankenstein: The protagonist and creator of the monster. A brilliant but deeply troubled scientist who becomes consumed by guilt and a desire for revenge after creating life and witnessing the destruction caused by his creation.

The Monster/Creature: A sentient being created by Victor Frankenstein, who is rejected by his creator and society. Initially seeking companionship and understanding, he becomes increasingly violent and vengeful after repeated experiences of rejection and isolation.

Robert Walton: The narrator

### Now the full prompt takes only 11s, down from about 40 with the larger model.

In [19]:
%%time
session_cache = {}#don't use cached info, since we want to time this.
short_responces_haiku = ask_claude_threaded([short_prompt_1,short_prompt_2,short_prompt_3],model='haiku')
characters_haiku = short_responces_haiku[0][1]+short_responces_haiku[1][1]+short_responces_haiku[2][1]
final_prompt_haiku = final_prompt_template.replace("{{CHARACTERS}}",characters_haiku)
final_responce_haiku = ask_claude(final_prompt_haiku, model="haiku")[1]
print(final_responce_haiku)



Let me help you consolidate the characters and count unique entries:

Unique Characters (17 total):
1. Victor Frankenstein (total count: 134 + Numerous + Numerous = multiple)
2. The Monster/Creature/Daemon
3. Henry Clerval (total count: 26 + 2 + Multiple)
4. Elizabeth Lavenza (total count: 22 + 14 + Multiple)
5. Alphonse Frankenstein (total count: 11 + 3 + Multiple)
6. Justine Moritz (total count: 10 + 5 + Not specified)
7. Robert Walton
8. Ernest Frankenstein
9. William Frankenstein
10. M. Waldman
11. M. Krempe
12. De Lacey
13. Felix
14. Agatha
15. Safie
16. Mr. Kirwin
17. Mr. Kirwin

Top Three Most Common Characters:

Victor Frankenstein is the protagonist and central figure of the novel. A brilliant but tormented scientist, he creates the monster and becomes consumed by guilt and a desire for revenge, ultimately pursuing his creation across vast distances.

The Monster/Creature is Frankenstein's sentient creation, rejected by society and his own creator. Seeking companionship and un

### Here we see that with such a fast model and small prompt, we don't get a performance improvement from decomposition.  Part of this is that with the shorter times involved, system operations and data transfer take a larger percentage of the time.