# GPT-Image-1

GPT-Image-1 is the latest image generation model released by OpenAI, so we use it throughout this notebook.

Please update your `openai` package to version 1.76.0 or later to see the latest documentation.

In [1]:
import os

os.chdir("../../../")

In [2]:
from langchain_openai import ChatOpenAI

from src.initialization import credential_init


credential_init()

model = ChatOpenAI(openai_api_key=os.environ['OPENAI_API_KEY'],
                   model_name="gpt-4o-2024-05-13", temperature=0)

### OpenAI Image API Parameters:

https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1
https://cookbook.openai.com/examples/generate_images_with_gpt_image

<!-- - model: dall-e-3
- size (str): 1024x1024, 1024x1792, 1792x1024
- quality: hd, standard
- style: vivid, natural. Default vivid -->

- model: gpt-image-1
    - size (str): 1024x1024 (square), 1536x1024 (landscape), 1024x1536 (portrait) or auto (default)
    - quality: low, medium, high or auto
    - moderation: auto, low
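
To make the allowed values above concrete, here is a minimal sketch of a helper that checks the parameters before they are passed to `client.images.generate`. The helper name `build_image_params` and the constant names are hypothetical and not part of the `openai` package; the allowed values are copied from the list above.

```python
from typing import Dict

# Allowed values copied from the parameter list above (assumed complete for this notebook).
ALLOWED_SIZES = {"1024x1024", "1536x1024", "1024x1536", "auto"}
ALLOWED_QUALITIES = {"low", "medium", "high", "auto"}
ALLOWED_MODERATION = {"auto", "low"}


def build_image_params(prompt: str,
                       size: str = "auto",
                       quality: str = "auto",
                       moderation: str = "auto") -> Dict[str, str]:
    """Hypothetical helper: validate the arguments and return kwargs for the API call."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"Unsupported size: {size!r}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"Unsupported quality: {quality!r}")
    if moderation not in ALLOWED_MODERATION:
        raise ValueError(f"Unsupported moderation: {moderation!r}")
    return {"model": "gpt-image-1", "prompt": prompt, "size": size,
            "quality": quality, "moderation": moderation}
```

The returned dictionary could then be unpacked into the call, for example `client.images.generate(**build_image_params(prompt, size="1024x1536"))`. Failing early on a value such as the DALL-E-style `quality="hd"` avoids a round trip to the API.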

In [3]:
from openai import OpenAI

prompt = ("A Sumi-e style watercolor painting of mountains during sunset. The sky is depicted with bold "
          "splashes of orange, pink, and purple hues, blending and overlapping in a dynamic composition. "
          "The mountains are represented with expressive brushstrokes, emphasizing their majestic and serene "
          "presence. The focus is on capturing the essence and mood of the scene rather than detailed realism. "
          "The overall effect is serene and contemplative, with a harmonious balance of color and form.")

client = OpenAI()

response = client.images.generate(
    model="gpt-image-1",
    prompt=prompt,
    size="1024x1024",
    # quality="hd",
    quality='high',
    n=1,
    # response_format = 'b64_json'
)

image_base64 = response.data[0].b64_json

In [4]:
client.images.generate?

Signature:
client.images.generate(
    *,
    prompt: 'str',
    background: "Optional[Literal['transparent', 'opaque', 'auto']] | NotGiven" = NOT_GIVEN,
    model: 'Union[str, ImageModel, None] | NotGiven' = NOT_GIVEN,
    moderation: "Optional[Literal['low', 'auto']] | NotGiven" = NOT_GIVEN,
    n: 'Optional[int] | NotGiven' = NOT_GIVEN,
    output_compression: 'Optional[int] | NotGiven' = NOT_GIVEN,
    output_format: "Optional[Literal['png', 'jpeg

## Save the image to your local computer

In [5]:
import base64

with open("tutorial/LLM+Langchain/Week-8/test.png", "wb") as fh:
    fh.write(base64.b64decode(image_base64))
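
The `open(...)` call above fails if the target folder does not exist. Below is a minimal sketch of a more defensive save step that uses only the standard library. The helper name `save_base64_image` is hypothetical.

```python
import base64
from pathlib import Path


def save_base64_image(image_base64: str, filename: str) -> Path:
    """Hypothetical helper: decode a base64 string and write the bytes to `filename`."""
    path = Path(filename)
    # Create missing parent folders so the write does not raise FileNotFoundError.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(base64.b64decode(image_base64))
    return path
```

Returning the `Path` makes it easy to log or display where the image ended up.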

## Two Challenges:

### 1. How to create prompts more efficiently?

There are two types of prompts:

1. Danbooru Tag: masterpiece, best quality, beautiful eyes, clear eyes, detailed eyes, Blue-eyes, 1girl, 20_old, full-body, break, smoking, break, high_color, blue-hair, beauty, black-boots,break, break, Flat vector art, Colorful art, white_shirt, simple_background, blue_background, Ink art, peeking out upper body, Eyes

2. Natural language: A Sumi-e style watercolor painting of mountains during sunset. The sky is depicted with bold splashes of orange, pink, and purple hues, blending and overlapping in a dynamic composition. The mountains are represented with expressive brushstrokes, emphasizing their majestic and serene presence. The focus is on capturing the essence and mood of the scene rather than detailed realism. The overall effect is serene and contemplative, with a harmonious balance of color and form.

Natural language prompts are challenging to write because they rely on specialized terminology and advanced vocabulary. This is true for us as non-native English speakers, and even for native speakers.
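
One way to bridge the two prompt types is to clean a Danbooru tag string into a short, readable list that can then be expanded into natural language, by hand or by an LLM. The sketch below is only an illustration: `normalize_danbooru_tags` is a hypothetical helper, and treating `break` as a separator that can be dropped is an assumption about how the tag string is used.

```python
from typing import List


def normalize_danbooru_tags(tag_string: str) -> List[str]:
    """Hypothetical helper: split, clean, and de-duplicate a comma-separated tag string."""
    seen = set()
    tags: List[str] = []
    for raw in tag_string.split(","):
        tag = raw.strip().replace("_", " ")
        # Skip empty entries and the "break" separator.
        if not tag or tag.lower() == "break":
            continue
        key = tag.lower()
        if key in seen:
            continue
        seen.add(key)
        tags.append(tag)
    return tags


description = ", ".join(normalize_danbooru_tags("masterpiece, 1girl, break, blue_background, break"))
print(description)
```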

### 2. How to turn it into an LCEL chain?

## Some websites for natural language prompt

- https://leonardo.ai/: An image generation SaaS. Many of its works are created with natural language prompts.
- https://blog.mlq.ai/dalle-prompts/: A tutorial on how to come up with natural language prompts.

In [6]:
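from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate


def build_standard_chat_prompt_template(kwargs):
    """Build a system + human ChatPromptTemplate from two dicts of PromptTemplate kwargs."""
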
    system_content = kwargs['system']
    human_content = kwargs['human']
    
    system_prompt = PromptTemplate(**system_content)
    system_message = SystemMessagePromptTemplate(prompt=system_prompt)
    
    human_prompt = PromptTemplate(**human_content)
    human_message = HumanMessagePromptTemplate(prompt=human_prompt)
    
    chat_prompt = ChatPromptTemplate.from_messages([system_message,
                                                     human_message
                                                   ])

    return chat_prompt

### Natural Language Prompt Generation

In [7]:
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain_core.output_parsers import StrOutputParser


system_template = ("You are a helpful AI assistant and an art expert with extensive knowledge of photography "
                   "and illustration. You excel at creating breathtaking masterpieces with the GPT-Image-1 model. "
                   "For this task, you will be provided with a description of an image, and you will generate a "
                   "corresponding GPT-Image-1 prompt. The prompt should be detailed and descriptive, capturing the "
                   "essence of the image.")

human_template = "{image_desc}"

input_ = {"system": {"template": system_template},
          "human": {"template": human_template,
                    "input_variables": ["image_desc"]}}
    
chat_prompt = build_standard_chat_prompt_template(input_)

nl_prompt_generation_chain = chat_prompt | model | StrOutputParser()

## Wrap the OpenAI API call as a function for use with LangChain

In [8]:
from typing import Dict
from langchain_core.runnables import chain


def gpt_image_worker(kwargs: Dict):

    """
    Generates an image using OpenAI's GPT-Image-1 model based on the provided prompt and optional parameters.
    
    Parameters:
    kwargs (Dict): A dictionary containing the following keys:
        - 'nl_prompt' (str): The natural language prompt describing the image to be generated.
        - 'size' (str, optional): The size of the generated image. Default is "1024x1024".
        - 'quality' (str, optional): The quality of the generated image. Default is "medium".
        - 'moderation' (str, optional): The moderation level, "auto" or "low". Default is "auto".
    
    Returns:
    str: image base64 string
    """
    
    print("Start generating image...")
    print(f"prompt: {kwargs['nl_prompt']}")
    client = OpenAI()

    response = client.images.generate(
        model="gpt-image-1",
        prompt=kwargs['nl_prompt'],
        size=kwargs.get("size", "1024x1024"),
        quality=kwargs.get('quality', 'medium'),
        moderation=kwargs.get('moderation', 'auto'),
        n=1)

    image_base64 = response.data[0].b64_json

    print("Image is generated successfully.")
    
    return image_base64

@chain
def base64_to_file(kwargs):

    """
    Decode a base64 image string (kwargs['image_base64']) and write it to kwargs['filename'].
    """
    
    image_base64 = kwargs['image_base64']
    filename = kwargs['filename']
    
    with open(filename, "wb") as fh:
        fh.write(base64.b64decode(image_base64))
    

In [9]:
from operator import itemgetter

from langchain_core.runnables import RunnableLambda, RunnableParallel, RunnablePassthrough

# step 1: create the image prompt
step_1 = RunnablePassthrough.assign(nl_prompt=itemgetter('image_desc')|nl_prompt_generation_chain)

# step 2: image generation process, as a base64
step_2 = RunnablePassthrough.assign(image_base64=gpt_image_worker)

# step 3: save the image
step_3 = base64_to_file

# chain step 1, step 2, step 3 together
gpt_image_chain =  step_1|step_2|step_3
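
To see how data flows through the three steps, here is a plain-Python sketch that mimics the behaviour of `RunnablePassthrough.assign`: each step receives the input dictionary and returns a copy with one extra key. This is not LangChain's implementation. The names `assign`, `fake_prompt_chain`, and `fake_image_worker` are stand-ins invented for the illustration, and no API call is made.

```python
from typing import Any, Callable, Dict


def assign(key: str, fn: Callable[[Dict[str, Any]], Any]) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Return a step that copies its input dict and adds `key`, computed by `fn`."""
    def step(data: Dict[str, Any]) -> Dict[str, Any]:
        return {**data, key: fn(data)}
    return step


# Stand-ins for the LLM prompt chain and the image worker (no network access).
def fake_prompt_chain(data: Dict[str, Any]) -> str:
    return f"A detailed illustration of {data['image_desc']}"


def fake_image_worker(data: Dict[str, Any]) -> str:
    return f"<base64 for: {data['nl_prompt']}>"


step_1_sketch = assign("nl_prompt", fake_prompt_chain)
step_2_sketch = assign("image_base64", fake_image_worker)

state = step_2_sketch(step_1_sketch({"image_desc": "a red fox", "filename": "fox.png"}))
print(sorted(state))  # the original keys plus the two assigned keys
```

Because every step passes the original keys through, `filename` is still available when the final step writes the file.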

In [10]:
gpt_image_chain.invoke({"size": "1024x1536",
                     "quality": "medium",
                     "image_desc": ("warhammer 40k, astartes, power armor, chain sword, purity seal, oil painting"),
                     "filename": "tutorial/LLM+Langchain/Week-8/astartes.png"
                    })

Start generating image...
prompt: Create an oil painting of a Warhammer 40k Astartes warrior in full power armor. The Astartes stands in a heroic pose, holding a massive chain sword in one hand, with the other hand clenched into a fist. The power armor is intricately detailed, featuring battle scars, ornate engravings, and a prominent purity seal attached to the chest plate. The background is a war-torn battlefield, with smoke and fire adding to the dramatic atmosphere. The painting style should be rich and textured, capturing the intensity and grandeur of the scene.
Image is generated successfully.


In [11]:
gpt_image_chain.invoke({"size": "1536x1024",
                     "quality": "medium",
                     "image_desc": ("Tifa Lockhart, kimono, head ornament, looking at viewer, cherry blossom, "
                                    "black-white hightech combat suite, chibi style."),
                     "filename": "tutorial/LLM+Langchain/Week-8/Tifa-01.png",
                     "moderation": 'low'
                    })


Start generating image...
prompt: Create a chibi-style illustration of Tifa Lockhart wearing a traditional kimono adorned with intricate patterns and a delicate head ornament. She is looking directly at the viewer with a gentle smile. Surround her with blooming cherry blossoms, their petals softly falling around her. Tifa's kimono contrasts with a sleek, black-and-white high-tech combat suit visible underneath, blending traditional and futuristic elements seamlessly. The background should be a serene, pastel-colored landscape with hints of modern technology subtly integrated.
Image is generated successfully.


Every model has its strengths. In my opinion:

- SDXL: style
- PONY, Illustrious: pose control, view angle control
- FLUX: realism

In [12]:
gpt_image_chain.invoke({"size": "1024x1536",
                        "quality": "medium",
                        "image_desc": ("1girl, azur lane style outfit, high ponytail, very long hair, sidelocks, bangs, "
                                       "in a whisky bar, rest head on hand, dim lighting, exuding an aura of youth and "
                                       "ethereal beauty, sketch, illustration, looking at viewer, heart in air, based on the outfit of Bismarck."),
                        "filename": "tutorial/LLM+Langchain/Week-8/azur_lane_style.png",
                        "moderation": 'low'
                      })

Start generating image...
prompt: Create a detailed sketch illustration of a young woman in an Azur Lane style outfit, inspired by Bismarck's attire. She has a high ponytail with very long hair, sidelocks, and bangs. The setting is a dimly lit whisky bar, where she rests her head on her hand, exuding an aura of youth and ethereal beauty. She is looking directly at the viewer, with a heart floating in the air near her. The overall style should be delicate and enchanting, capturing the essence of the character and the intimate atmosphere of the bar.
Image is generated successfully.


### Image Render

In [None]:
# client.images.edit?

In [13]:
from src.io.path_definition import get_project_dir

In [14]:
image_path = os.path.join(get_project_dir(), "tutorial", "LLM+Langchain", "Week-8", "Prinz_Eugen.png")

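# Open the source image in a context manager so the file handle is closed after the request.
with open(image_path, "rb") as image_file:
    result_edit = client.images.edit(
        model="gpt-image-1",
        image=image_file,
        prompt="generate a photorealistic image",
        size="1024x1536"
    )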

image_base64 = result_edit.data[0].b64_json

with open("tutorial/LLM+Langchain/Week-8/Eugen_Prinz_Render.png", "wb") as fh:
    fh.write(base64.b64decode(image_base64))

In [15]:
with open(image_path, "rb") as image_file:
    result_edit = client.images.edit(
        model="gpt-image-1",
        image=image_file,
        prompt="Generate a photorealistic image. Make the character have an ulzzang look",
        size="1024x1536"
    )

In [16]:
image_base64 = result_edit.data[0].b64_json

with open("tutorial/LLM+Langchain/Week-8/Eugen_Prinz_Render_Ulzzang.png", "wb") as fh:
    fh.write(base64.b64decode(image_base64))

In [19]:
from textwrap import dedent


with open(image_path, "rb") as image_file:
    result_edit = client.images.edit(
        model="gpt-image-1",
        image=image_file,
        prompt=dedent("""
                      Generate a photorealistic image.
                      Make the character have an ulzzang look with flawless Korean K-pop makeup.
                      She should have a glamorous appearance and a glowing, dewy skin texture.
                      """),
        size="1024x1536"
    )

image_base64 = result_edit.data[0].b64_json

with open("tutorial/LLM+Langchain/Week-8/Eugen_Prinz_Flawless.png", "wb") as fh:
    fh.write(base64.b64decode(image_base64))

In [None]:
# result_edit = client.images.edit(
#     model="gpt-image-1",
#     image=open(image_path, "rb"), 
#     prompt="Generate a photorealistic image.\n"
#            "Make the character have an ulzzang appearance with soft, flawless, youthful look with large eyes and gentle features. "
#            "While keep the polish and porcelain skin texture, and glamorous appearance of the girl.",
#     size="1024x1536"
# )

# image_base64 = result_edit.data[0].b64_json

# with open("tutorial/LLM+Langchain/Week-8/Eugen_Prinz_Render_Combined.png", "wb") as fh:
#     fh.write(base64.b64decode(image_base64))

You can experiment with different art styles to render your images. There is much more than just the Ghibli style!

Some examples of art styles you can explore:
- Studio Ghibli
- Disney animation
- Pixel art
- Cyberpunk
- Watercolor
- Oil painting
- Dark fantasy aesthetic
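
A small sketch of how the styles above could be turned into edit prompts in a loop. The prompt wording and the helper name `build_style_prompts` are assumptions made for illustration; each prompt would then be passed as the `prompt` argument of `client.images.edit`.

```python
from typing import Dict, List

ART_STYLES: List[str] = ["Studio Ghibli", "Disney animation", "Pixel art", "Cyberpunk",
                         "Watercolor", "Oil painting", "Dark fantasy aesthetic"]


def build_style_prompts(styles: List[str]) -> Dict[str, str]:
    """Hypothetical helper: map each style name to a simple rendering prompt."""
    return {style: f"Render this image in {style} style." for style in styles}


for style, style_prompt in build_style_prompts(ART_STYLES).items():
    print(f"{style}: {style_prompt}")
```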

## Use this as a tool for Agent

In [None]:
# prompt_template = """
# Answer the following questions as best you can. You have access to the following tools:

# {tools}

# Use the following format:

# Question: the input question you must answer

# Thought: you should always think about what to do

# Action: the action to take, should be one of [{tool_names}]

# Action Input: the input to the action

# Observation: the result of the action

# ... (this Thought/Action/Action Input/Observation can repeat N times)

# Thought: I now know the final answer

# Final Answer: the final answer to the original input question

# Begin!

# Question: {input}

# Thought:{agent_scratchpad}
# """

In [20]:
import base64
from openai import OpenAI
from operator import itemgetter

from langchain.agents import Tool, AgentExecutor, create_react_agent
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain.tools import BaseTool
from langchain_core.output_parsers import StrOutputParser, PydanticOutputParser
from pydantic import BaseModel, Field
from langchain_core.runnables import RunnableLambda, RunnableParallel, RunnablePassthrough

from src.agent.react_zero_shot import prompt_template as zero_shot_prompt_template

client = OpenAI()

# Minimal requirement: the query (image description) and the filename.
# Optional control variables: size and quality.
# Recap: what did we learn last week?

# step 1: create the image prompt
step_1 = RunnablePassthrough.assign(nl_prompt=itemgetter('image_desc')|nl_prompt_generation_chain)

# step 2: image generation process, as a base64
step_2 = RunnablePassthrough.assign(image_base64=gpt_image_worker)

# step 3: save the image
step_3 = base64_to_file

# chain step 1, step 2, step 3 together
gpt_image_chain =  step_1|step_2|step_3


from typing import Literal

class ImageInput(BaseModel):
    image_desc: str = Field(description=("image description / prompt"))
    filename: str = Field(description="the location at which the image will be saved")
    size: Literal["1024x1024",
                  "1536x1024",
                  "1024x1536",
                  "auto"] = Field(description='image size, can be 1024x1024 (square), 1536x1024 (landscape), 1024x1536 (portrait) or auto (default)')
    quality: Literal["low",
                     "medium",
                     "high",
                     "auto"] = Field(description='image quality, low, medium, high or auto')


class ImageTool(BaseTool):

    name: str = "Image generator with GPT-Image-1"

    input_output_parser: PydanticOutputParser = PydanticOutputParser(pydantic_object=ImageInput)
    
    input_format_instructions: str = input_output_parser.get_format_instructions()

    description_template: str = ("Use this tool when you need to create an image\n\n"
                                 "input format_instructions: {input_format_instructions}")

    description: str = description_template.format(input_format_instructions=input_format_instructions)
    
    def _run(self, query):

        input_ = self.input_output_parser.parse(query)
        
        image_desc = input_.image_desc
        size = input_.size
        quality = input_.quality
        filename = input_.filename
        
        gpt_image_chain.invoke({"image_desc": image_desc,
                                "size": size,
                                "quality": quality,
                                "filename": filename,
                                "moderation": 'low'})
        
        return "Done"

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support async")

# Standard zero-shot prompt template
prompt = PromptTemplate.from_template(zero_shot_prompt_template)

# Build the tool list
tools = [ImageTool()]

# Create the agent
zero_shot_agent = create_react_agent(
    llm=model,
    tools=tools,
    prompt=prompt,
)

# Create the agent executor
agent_executor = AgentExecutor(agent=zero_shot_agent, tools=tools, verbose=True)

In [21]:
image_prompt = """
brown hair, bangs, two side up, twin ponytails, sidelocks, black hat, jewelry, black cheongsam, intricate golden embroidery, long sleeves,
detached sleeves, sheer long skirt, head ornament, a 17-years-old ethereal and glamorous beautiful japanese idol,
translucent skin tone, anime-like face, profound facial features, bright eyes, faint rosy blush, mesmerizing city view. 
night, photorealistic, heart hands, heart in air
"""

filename = "tutorial/LLM+Langchain/Week-8/test_04.png"

In [22]:
agent_executor.invoke({"input": f"Generate an image with the following information: \n {image_prompt}. and save the image at {filename}"})


> Entering new AgentExecutor chain...
To generate the image with the specified details, I will use the image generator tool. I need to provide the image description, filename, size, and quality.

Action: Image generator with GPT-Image-1

Action Input:
```json
{
  "image_desc": "brown hair, bangs, two side up, twin ponytails, sidelocks, black hat, jewelry, black cheongsam, intricate golden embroidery, long sleeves, detached sleeves, sheer long skirt, head ornament, a 17-years-old ethereal and glamorous beautiful japanese idol, translucent skin tone, anime-like face, profound facial features, bright eyes, faint rosy blush, mesmerizing city view. night, photorealistic, heart hands, heart in air",
  "filename": "tutorial/LLM+Langchain/Week-8/test_04.png",
  "size": "1024x1536",
  "quality": "high"
}
```
Start generating image...
prompt: Create a photorealistic image of a 17-year-old ethereal and glamorous beautiful Japanese idol with an anime-like face and profou

{'input': 'Generate an image with the following information: \n \nbrown hair, bangs, two side up, twin ponytails, sidelocks, black hat, jewelry, black cheongsam, intricate golden embroidery, long sleeves,\ndetached sleeves, sheer long skirt, head ornament, a 17-years-old ethereal and glamorous beautiful japanese idol,\ntranslucent skin tone, anime-like face, profound facial features, bright eyes, faint rosy blush, mesmerizing city view. \nnight, photorealistic, heart hands, heart in air\n. and save the image at tutorial/LLM+Langchain/Week-8/test_04.png',
 'output': 'The image has been generated and saved at the location "tutorial/LLM+Langchain/Week-8/test_04.png" with the specified details.'}
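
The `Action Input` in the trace above is a JSON object wrapped in a Markdown code fence. In the notebook, `PydanticOutputParser` takes care of parsing it; the standard-library sketch below only illustrates the idea. `parse_action_input` and `REQUIRED_KEYS` are hypothetical names, and the fence-stripping regex is an assumption about the format the agent emits.

```python
import json
import re
from typing import Any, Dict

REQUIRED_KEYS = {"image_desc", "filename", "size", "quality"}

# Matches an optional Markdown fence (three backticks, optional "json" tag) around the payload.
_FENCE = re.compile(r"^\s*`{3}(?:json)?\s*(.*?)\s*`{3}\s*$", re.DOTALL)


def parse_action_input(text: str) -> Dict[str, Any]:
    """Hypothetical helper: strip the fence, load the JSON, and check the required keys."""
    match = _FENCE.match(text)
    payload = match.group(1) if match else text
    data = json.loads(payload)
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    return data
```

A check like this fails loudly when the agent omits a field, instead of passing an incomplete dictionary on to the image chain.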

### OpenAI WebSearch Update:

- https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses

The world changes very fast.

Below are a few notable implementation considerations when using web search.

- Web search is currently not supported in the gpt-4.1-nano model.
- The gpt-4o-search-preview and gpt-4o-mini-search-preview models used in Chat Completions only support a subset of API parameters - view their model data pages for specific information on rate limits and feature support.
- When used as a tool in the Responses API, web search has the same tiered rate limits as the models above.
- Web search is limited to a context window size of 128000 (even with gpt-4.1 and gpt-4.1-mini models).
- Refer to this guide for data handling, residency, and retention information.

In [None]:
from openai import OpenAI
client = OpenAI()

## ACG Characters

### Genshin

In [27]:
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
        model="gpt-4o-search-preview",
        web_search_options={"search_context_size": "high"},
        messages=[{"role": "user",
                   "content": dedent("""What is the appearance of Hu Tao from Genshin?""")}]
    )

print(response.choices[0].message.content)

Hu Tao, the 77th Director of the Wangsheng Funeral Parlor in Genshin Impact, has a distinctive and memorable appearance:

- **Complexion and Eyes**: She has a fair complexion complemented by bright scarlet eyes, each featuring white, blossom-shaped pupils. ([genshin-impact.fandom.com](https://genshin-impact.fandom.com/wiki/Hu_Tao/Lore?utm_source=openai))

- **Hair**: Her long, dark brown hair fades into crimson at the tips. It's styled into two high twintails, parted with a zig-zag pattern, with side-swept bangs framing her face. ([ultraverse.fandom.com](https://ultraverse.fandom.com/wiki/Hu_Tao?utm_source=openai))

- **Attire**: Hu Tao wears a traditional red shirt with a mandarin collar beneath a dark brown coat that features a darker collar and sleeve cuffs. The coat is adorned with long rectangular coattails and golden brooches attached beneath the collar and at the waist. She pairs this with black shorts accented with gold details. ([ultraverse.fandom.com](https://ultraverse.fando

### Stellar Blade

In [28]:
response = client.chat.completions.create(
        model="gpt-4o-search-preview",
        web_search_options={"search_context_size": "high"},
        messages=[{"role": "user",
                   "content": dedent("""What is the appearance of Eve of Stellar Blade?""")}]
    )

print(response.choices[0].message.content)

Eve, the protagonist of *Stellar Blade*, is depicted as a striking woman with a slender yet shapely physique, brown eyes, and long black hair styled into a ponytail with bangs. Her default attire, the Planet Diving Suit (7th), is a green, form-fitting bodysuit with a metallic sheen, complemented by a green tie, white gloves, and cape-like extensions at the back. Notably, her sword, Blood Edge, doubles as a hair clip for her ponytail when not in use. ([stellarblade.fandom.com](https://stellarblade.fandom.com/wiki/EVE?utm_source=openai))

The development team at Shift Up crafted Eve's body based on 3D scans of South Korean model Shin Jae-eun, aiming to create an appealing character design. However, Eve's facial features were uniquely designed in-house. ([automaton-media.com](https://automaton-media.com/en/news/20240207-26822/?utm_source=openai))

Players have the option to customize Eve's appearance extensively, including choices in hairstyles, hair color, earrings, glasses, and various 

# **** Expected end of the first hour ****

## LCEL ACG character appearance chain

In [35]:
from langchain.docstore.document import Document
from langchain.agents import Tool, AgentExecutor, create_react_agent
from langchain.tools import BaseTool
from langchain.prompts import PromptTemplate, ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain_core.output_parsers import StrOutputParser

system_template = ("You are a helpful AI assistant with deep knowledge of anime, manga, "
                   "and mobile games. You will generate the face, body, attire, hairstyle, and accessories of a character in great " 
                   "detail. The output should consist of:\n\n"
                   "- Face:\n"
                   "- Body:\n"
                   "- Attire:\n"
                   "- Hairstyle:\n"
                   "- Footwear:\n"
                   "- Accessories:\n\n"
                   "If you are not sure about the answer, please find the content from the internet.")

# Keep this simple to begin with
@chain
def gpt_web_search_tool(text):

    client = OpenAI()

    response = client.chat.completions.create(
        model="gpt-4o-search-preview",
        web_search_options={"search_context_size": "high"},
        messages=[{"role": "system",
                   "content": system_template},
                  {"role": "user",
                   "content": text}]
    )

    return response.choices[0].message.content

In [36]:
output = gpt_web_search_tool.invoke("What is the appearance of Hu Tao from Genshin")
print(output)

Hu Tao, the 77th Director of the Wangsheng Funeral Parlor in Genshin Impact, has a distinctive and memorable appearance:

- **Face**: She possesses a fair complexion with bright scarlet eyes that feature white, blossom-shaped pupils. ([genshin-impact.fandom.com](https://genshin-impact.fandom.com/wiki/Hu_Tao/Lore?utm_source=openai))

- **Body**: Hu Tao has a petite build, complementing her lively and energetic demeanor. ([genshin-impact.fandom.com](https://genshin-impact.fandom.com/wiki/Hu_Tao/Lore?utm_source=openai))

- **Attire**: She wears a traditional red shirt with a mandarin collar beneath a dark brown coat that has a darker collar and sleeve cuffs. The coat features long rectangular coattails and is adorned with golden brooches attached beneath the collar and at the waist. Her black shorts have gold accents, and she completes her outfit with white socks that reach mid-calf, decorated with red straps and plum blossoms, along with low-heeled black dress shoes. ([genshin-impact.fan
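
The search-enabled model embeds its sources as Markdown links in the answer text, as seen in the output above. Below is a minimal sketch for pulling those links out with the standard library. `extract_citations` is a hypothetical helper, and the regex assumes the `([label](url))` pattern shown in the output; URLs that contain parentheses would need a more careful parser.

```python
import re
from typing import List, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def extract_citations(text: str) -> List[Tuple[str, str]]:
    """Return unique (label, url) pairs with the `utm_source` tracking parameter removed."""
    citations: List[Tuple[str, str]] = []
    for label, url in _LINK.findall(text):
        parts = urlsplit(url)
        query = urlencode([(k, v) for k, v in parse_qsl(parts.query) if k != "utm_source"])
        clean = urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
        if (label, clean) not in citations:
            citations.append((label, clean))
    return citations
```

The cleaned list can be shown next to the generated image so that the character description stays traceable to its sources.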

In [59]:
class ACGLLMTool(BaseTool):

    name: str = "Anime character explorer"
    description: str = "Use this tool to generate and explore detailed designs for anime and ACG (Animation, Comics, and Games) characters."

    def _run(self, query: str):
        
        response = gpt_web_search_tool.invoke(query)
        
        return response

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support async")
        
        
class ImageTool(BaseTool):

    name: str = "ACG characters image generator with GPT-Image-1"

    input_output_parser: PydanticOutputParser = PydanticOutputParser(pydantic_object=ImageInput)
    
    input_format_instructions: str = input_output_parser.get_format_instructions()

    description_template: str = ("Use this tool when you need to create an image\n\n"
                                 "input format_instructions: {input_format_instructions}")

    description: str = description_template.format(input_format_instructions=input_format_instructions)
    
    def _run(self, query):
        
        input_ = self.input_output_parser.parse(query)
        
        image_desc = input_.image_desc

        print(f"ImageTool image_desc: {image_desc}")
        
        size = input_.size
        quality = input_.quality
        filename = input_.filename
        
        gpt_image_chain.invoke({"image_desc": image_desc,
                                "size": size,
                                "quality": quality,
                                "filename": filename,
                                "moderation": 'low'})
        
        return "Done"

    def _arun(self, query: str):
        raise NotImplementedError("This tool does not support async")

        
prompt_template = """
Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer

Thought: you should always think about what to do

Action: the action to take, should be one of [{tool_names}]

Action Input: the input to the action

Observation: the result of the action

... (this Thought/Action/Action Input/Observation can repeat N times)

Thought: I now know the final answer

Final Answer: the final answer to the original input question

Begin!

Question: {input}

Thought:{agent_scratchpad}
"""        
             
prompt = PromptTemplate.from_template(prompt_template)

tools = [ImageTool(), ACGLLMTool()]

zero_shot_agent = create_react_agent(
    llm=model,
    tools=tools,
    prompt=prompt,
)

agent_executor = AgentExecutor(agent=zero_shot_agent, tools=tools, verbose=True)

In [60]:
agent_executor.invoke({"input": "Generate an image of Hu Tao from Genshin sitting in a roller coaster in pastel art style"})


> Entering new AgentExecutor chain...
To generate an image of Hu Tao from Genshin Impact sitting in a roller coaster in a pastel art style, I will use the ACG characters image generator with GPT-Image-1. I need to provide a detailed description of the image, specify the filename, size, and quality.

Action: ACG characters image generator with GPT-Image-1

Action Input:
```json
{
  "image_desc": "Hu Tao from Genshin Impact sitting in a roller coaster, pastel art style",
  "filename": "hu_tao_roller_coaster_pastel.png",
  "size": "1024x1024",
  "quality": "high"
}
```
ImageTool image_desc: Hu Tao from Genshin Impact sitting in a roller coaster, pastel art style
Start generating image...
prompt: Create a pastel art style illustration of Hu Tao from Genshin Impact sitting in a roller coaster. Hu Tao, with her signature hat and playful expression, is seated in the front row of a vibrant, whimsical roller coaster. The background features a dreamy amusement park wit

{'input': 'Generate an image of Hu Tao from Genshin sitting in a roller coaster in pastel art style',
 'output': 'The image of Hu Tao from Genshin Impact sitting in a roller coaster in a pastel art style has been generated and saved as "hu_tao_roller_coaster_pastel.png" with a size of 1024x1024 and high quality.'}

In [61]:
agent_executor.invoke({"input": dedent("""
Generate an image of Eve from Stellar Blade (Keyhole Suit), walking on the street in Taipei City, with Taipei 101 as the background.
Please use the tool 'Anime character explorer' to get the detailed information.
""") })


> Entering new AgentExecutor chain...
To generate an image of Eve from Stellar Blade in a keyhole suit, walking on the street in Taipei City with Taipei 101 in the background, I need to first gather detailed information about Eve's character design using the Anime character explorer tool. 

Action: Anime character explorer
Action Input: "Eve from Stellar Blade in a keyhole suit"
In the game *Stellar Blade*, Eve can wear the Keyhole Suit, a distinctive outfit designed by Tetrastar C&T's lead designer, "Galaxy" Alan. This suit is part of the "Exotic Sense" collection and features a black, thigh-length dress with hexagonal patterns, elbow-length gloves, and thigh-high boots. Notably, the dress has a revealing chest area, intended to be covered by an electromagnetic field to allow the wearer to receive the universe's energy. ([stellarblade.fandom.com](https://stellarblade.fandom.com/wiki/Keyhole_Suit?utm_source=openai))

To acquire the Keyhole Suit, 

{'input': "\nGenerate an image of Eve from Stellar Blade (Keyhole Suit), walking on the street in Taipei City, with Taipei 101 as the background.\nPlease use the tool 'Anime character explorer' to get the detailed information.\n",
 'output': 'The image of Eve from Stellar Blade in a keyhole suit, walking on the street in Taipei City with Taipei 101 in the background, has been generated and saved as "eve_stellar_blade_taipei_city.png".'}

## Audible: Audiobooks

- Text-to-speech: TTS tool
- Text-to-image: Image tool

### Children Book Image Generator

- Generate an image from the story

In [40]:
system_template = ("You are a helpful AI assistant and an art expert with extensive knowledge of illustration.\n"
                   "You excel at creating Pencil and Ink Style illustrations for 6-year-old children using the GPT-Image-1 model. "
                   "This style is characterized by detailed line work, often in black and white or with minimal color, and has a classic, "
                   "timeless feel. For this task, you will be provided with a paragraph of a story, and you will generate a corresponding "
                   "GPT-Image-1 prompt which captures the storyline. The prompt should be detailed and descriptive, capturing the essence of "
                   "the image.")


system_prompt = PromptTemplate(template=system_template)

# System prompt
system_message = SystemMessagePromptTemplate(prompt=system_prompt)

human_prompt = PromptTemplate(template="{story}",
                              input_variables=['story'])

# Create a human message prompt template based on the prompt
human_message = HumanMessagePromptTemplate(prompt=human_prompt)

chat_prompt = ChatPromptTemplate.from_messages([system_message, human_message])

nl_prompt_generation_chain = chat_prompt | model | StrOutputParser()     

step_1 = RunnablePassthrough.assign(nl_prompt=itemgetter('story')|nl_prompt_generation_chain)
step_2 = RunnablePassthrough.assign(image_base64=gpt_image_worker)
step_3 = base64_to_file
image_chain = step_1 | step_2 | step_3
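
To make the data flow of `image_chain` concrete, here is a minimal plain-Python sketch of how each step adds one key to a shared dictionary. The `assign` helper is a hypothetical stand-in for `RunnablePassthrough.assign`, and the three `fake_*` functions are placeholders, not the notebook's `nl_prompt_generation_chain`, `gpt_image_worker`, or `base64_to_file`.

```python
import base64


def assign(**adders):
    """Return a step that copies the input dict and adds one key per adder."""
    def step(state: dict) -> dict:
        new_state = dict(state)
        for key, fn in adders.items():
            # Every adder sees the incoming state, not the partially updated one.
            new_state[key] = fn(state)
        return new_state
    return step


def fake_prompt_writer(state: dict) -> str:
    # Placeholder for the LLM call that turns a story into an image prompt.
    return f"Pencil and ink illustration of: {state['story']}"


def fake_image_worker(state: dict) -> str:
    # Placeholder for the image API call; returns base64 text.
    return base64.b64encode(state["nl_prompt"].encode("utf-8")).decode("ascii")


def fake_saver(state: dict) -> str:
    # Placeholder for decoding the image and writing it to state["filename"].
    n_bytes = len(base64.b64decode(state["image_base64"]))
    return f"would write {n_bytes} bytes to {state['filename']}"


def run_pipeline(state: dict) -> str:
    step_1 = assign(nl_prompt=fake_prompt_writer)
    step_2 = assign(image_base64=fake_image_worker)
    return fake_saver(step_2(step_1(state)))


print(run_pipeline({"story": "A baby owl learns to fly.", "filename": "owl.png"}))
```

The point of the sketch is that `story` and `filename` travel through every step untouched, which is why the final saving step can still read `filename` even though only the first step used `story`.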

In [None]:
# step_1 = RunnablePassthrough.assign(nl_prompt=itemgetter('image_desc')|nl_prompt_generation_chain)

# # step 2: image generation process, as a base64
# step_2 = RunnablePassthrough.assign(image_base64=dalle3_worker)

# # step 3: save the image
# step_3 = RunnableLambda(base64_to_file)

# # chain step 1, step 2, step 3 together
# dalle3_chain =  step_1|step_2|step_3

- Generate the story

In [41]:
system_template = ("You are a helpful AI assistant who likes children. You are a great storyteller and know how to create content for kindergarten kids. "
                   "Create one short chapter at a time.")

system_prompt = PromptTemplate(template=system_template)

# System prompt
system_message = SystemMessagePromptTemplate(prompt=system_prompt)

human_prompt = PromptTemplate(template="{input}",
                              input_variables=['input'])

# Create a human message prompt template based on the prompt
human_message = HumanMessagePromptTemplate(prompt=human_prompt)

chat_prompt = ChatPromptTemplate.from_messages([system_message, human_message])

story_chain = chat_prompt | model | StrOutputParser()     

In [42]:
story = story_chain.invoke({"input": "Create a chapter of a baby owl capturing a rodent in the night as his dinner"})

In [43]:
image_chain.invoke({"story":story,
                    "filename": "tutorial/LLM+Langchain/Week-8/story_2_image.png"})

Start generating image...
prompt: **Prompt:**

A detailed pencil and ink style illustration of a cozy forest at night, with tall, whispering trees and a deep blue sky illuminated by a silvery moon. High up in an old oak tree, a snug nest is visible, where a baby owl named Ollie, with big, round eyes that sparkle like stars and soft, fluffy feathers, is preparing for his first nighttime adventure. Ollie is seen flapping his tiny wings with a mix of excitement and nervousness. Below, the forest floor is alive with the sounds of crickets chirping and leaves rustling. In the distance, a tiny mouse is scurrying through the underbrush, unaware of Ollie hovering silently above. The scene captures the moment Ollie, with a swift and graceful swoop, spreads his wings wide and dives down towards the mouse, ready to catch his first dinner. The illustration should convey a sense of wonder, bravery, and the magical atmosphere of the nighttime forest.
Image is generated succesfully.


In [44]:
import json

from src.agent.react_zero_shot import prompt_template as zero_shot_prompt_template
# from langchain_core.prompts import MessagesPlaceholder

client = OpenAI(api_key=os.environ['OPENAI_API_KEY'])

class TTSInput(BaseModel):
    text: str = Field(description="The story")
    filename: str = Field(description="the location at which the audio file will be saved")

    
class TTSTool(BaseTool):

    name: str = "Text to Sound (tts) tool"

    input_output_parser: PydanticOutputParser = PydanticOutputParser(pydantic_object=TTSInput)
    
    input_format_instructions: str = input_output_parser.get_format_instructions()

    description_template: str = ("Use this tool to generate an audio file of the story.\n"
                                 "input format: {input_format_instructions}.")

    description: str = description_template.format(input_format_instructions=input_format_instructions)

    
    def _run(self, text: str):
        # The agent passes a JSON string; parse it into a TTSInput object.
        input_ = self.input_output_parser.parse(text)

        text = input_.text
        filename = input_.filename
        response = self.tts(text)

        # Write the returned audio to disk.
        response.stream_to_file(filename)

        return filename

    def _arun(self, text: str):
        raise NotImplementedError("This tool does not support async")
        
        
    def tts(self, text: str):
        
        response = client.audio.speech.create(
          model="tts-1",
          voice="nova",
          input=text
        )

        return response
           
            
prompt = PromptTemplate.from_template(zero_shot_prompt_template)

tools = [TTSTool(), 
         ImageTool(),
         Tool(name="StoryTeller",
              func=story_chain.invoke,
              description="useful for creating a story",
        )]

zero_shot_agent = create_react_agent(
    llm=model,
    tools=tools,
    prompt=prompt,
)

agent_executor = AgentExecutor(agent=zero_shot_agent, tools=tools, verbose=True, handle_parsing_errors=True)

In [45]:
prompt = ("Create a chapter of a baby owl capturing a rodent in the night as his dinner.\n"
         "After having the final answer, please create a corresponding image and record the story as an mp3. "
         "The saved image (.png) and mp3 (.mp3) should have same name in the folder `tutorial/LLM+Langchain/Week-8")

agent_executor.invoke({"input": prompt})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mTo create a chapter of a baby owl capturing a rodent in the night as his dinner, I will first generate the story using the StoryTeller tool. After that, I will create a corresponding image using the ACG characters image generator with GPT-Image-1. Finally, I will record the story as an mp3 using the Text to Sound (tts) tool. The saved image and mp3 will have the same name in the specified folder.

Action: StoryTeller
Action Input: "Create a chapter of a baby owl capturing a rodent in the night as his dinner."
[0m[38;5;200m[1;3m**Chapter 1: Ollie the Baby Owl's Nighttime Adventure**

Once upon a time, in a cozy forest filled with tall trees and twinkling stars, there lived a baby owl named Ollie. Ollie had big, round eyes that sparkled like the moon and soft, fluffy feathers that kept him warm during the chilly nights.

One evening, as the sun dipped below the horizon and the sky turned a deep shade of blue, Ollie woke up f

  response.stream_to_file(filename)


[36;1m[1;3mtutorial/LLM+Langchain/Week-8/Ollie_Nighttime_Adventure.mp3[0m[32;1m[1;3mI now know the final answer.

Final Answer: The chapter of the baby owl capturing a rodent in the night as his dinner has been successfully created. The corresponding image and audio file have been generated and saved in the specified folder.

- Image: `tutorial/LLM+Langchain/Week-8/Ollie_Nighttime_Adventure.png`
- Audio: `tutorial/LLM+Langchain/Week-8/Ollie_Nighttime_Adventure.mp3`

You can find the image and audio file in the `tutorial/LLM+Langchain/Week-8` folder.[0m

[1m> Finished chain.[0m


{'input': 'Create a chapter of a baby owl capturing a rodent in the night as his dinner.\nAfter having the final answer, please create a corresponding image and record the story as an mp3. The saved image (.png) and mp3 (.mp3) should have same name in the folder `tutorial/LLM+Langchain/Week-8',
 'output': 'The chapter of the baby owl capturing a rodent in the night as his dinner has been successfully created. The corresponding image and audio file have been generated and saved in the specified folder.\n\n- Image: `tutorial/LLM+Langchain/Week-8/Ollie_Nighttime_Adventure.png`\n- Audio: `tutorial/LLM+Langchain/Week-8/Ollie_Nighttime_Adventure.mp3`\n\nYou can find the image and audio file in the `tutorial/LLM+Langchain/Week-8` folder.'}

In [None]:
prompt = """
         Assume that Harry Potter is in the world of Warhammer 40k. He still has his magical power.
         He lives in the lower part of a Hive city. It is about the time for the Tithe and the Black Ship is coming.
         Describe the fate of Harry Potter. Please keep the darkness of the Warhammer 40k worldview.
         The saved image and mp3 should have the same name in the folder `tutorial/LLM+Langchain/Week-8`
         """

agent_executor.invoke({"input": prompt})

## Can we create a story with multiple pages?

I do not know the answer yet, so let us try.

We use 4 pages here to keep the cost down, but the same approach extends to more pages.
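
The per-page file names can also be computed up front instead of being left to the agent. A small sketch, assuming the `Page-<idx>` pattern and the `tutorial/LLM+Langchain/Week-8` folder used in this notebook; `page_paths` is a hypothetical helper, not part of the notebook.

```python
from pathlib import PurePosixPath


def page_paths(folder: str, n_pages: int) -> list[dict]:
    """Build matching image/audio paths for pages 1..n_pages."""
    base = PurePosixPath(folder)
    return [
        {
            "page": idx,
            "image": str(base / f"Page-{idx}.png"),
            "audio": str(base / f"Page-{idx}.mp3"),
        }
        for idx in range(1, n_pages + 1)
    ]


for entry in page_paths("tutorial/LLM+Langchain/Week-8", 4):
    print(entry["page"], entry["image"], entry["audio"])
```

Changing `n_pages` is then the only edit needed to extend the story beyond four pages.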

In [46]:
prompt = """
         I want to create an 4 pages story for a child. He likes snow owl.
         For each page, please create a corresponding image and record the story as an mp3.
         After having the final answer, please create a corresponding image and record the story as an mp3. 
         The saved image and mp3 should have same name, following the structure of 
         <Page - idx>, with idx as a number starting from 1, in the folder `tutorial/LLM+Langchain/Week-8`
         """

agent_executor.invoke({"input": prompt})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mI need to create a 4-page story for a child who likes snow owls. For each page, I will generate a corresponding image and record the story as an mp3. The saved images and mp3 files should have the same name, following the structure of `<Page - idx>`, with idx as a number starting from 1, in the folder `tutorial/LLM+Langchain/Week-8`.

First, I will create the story using the StoryTeller tool.

Action: StoryTeller
Action Input: "Create a 4-page story for a child who likes snow owls."
[0m[38;5;200m[1;3m**Page 1: The Snowy Forest**

Once upon a time, in a land where the trees were always covered in a blanket of snow, there lived a little snow owl named Oliver. Oliver had the fluffiest white feathers and the biggest, roundest eyes. He loved to explore the snowy forest, where he would swoop and glide between the tall, frosty trees.

**Page 2: Oliver's New Friend**

One chilly morning, as Oliver was flying over a sparkling, froz

  response.stream_to_file(filename)


[36;1m[1;3mtutorial/LLM+Langchain/Week-8/Page-1.mp3[0m[32;1m[1;3mI have successfully generated the image and recorded the story for Page 1. Now, I will proceed to create the corresponding image and record the story for Page 2.

Action: ACG characters image generator with GPT-Image-1
Action Input: 
```json
{
  "image_desc": "A tiny, shivering bunny named Bella sitting on a snow-covered log by a sparkling, frozen pond. Oliver, the snow owl, lands softly beside her.",
  "filename": "tutorial/LLM+Langchain/Week-8/Page-2.png",
  "size": "1024x1024",
  "quality": "high"
}
```
[0mStart generating image...
prompt: A heartwarming winter scene featuring a tiny, shivering bunny named Bella sitting on a snow-covered log by a sparkling, frozen pond. The delicate snowflakes gently fall around her, creating a serene and magical atmosphere. Beside Bella, a majestic snow owl named Oliver lands softly, his feathers blending seamlessly with the snowy landscape. The scene is bathed in the soft, ethe

  response.stream_to_file(filename)


[36;1m[1;3mtutorial/LLM+Langchain/Week-8/Page-2.mp3[0m[32;1m[1;3mI have successfully generated the image and recorded the story for Page 2. Now, I will proceed to create the corresponding image and record the story for Page 3.

Action: ACG characters image generator with GPT-Image-1
Action Input: 
```json
{
  "image_desc": "Oliver the snow owl flying above, keeping an eye out for any signs of Bella's home, while Bella the bunny hops along the snowy path. They meet a wise old fox and a family of deer in the snowy forest.",
  "filename": "tutorial/LLM+Langchain/Week-8/Page-3.png",
  "size": "1024x1024",
  "quality": "high"
}
```
[0mStart generating image...
prompt: Create an enchanting winter scene featuring Oliver the snow owl soaring gracefully above a snowy forest, his keen eyes scanning for any signs of Bella's home. Below, Bella the bunny hops along a snow-covered path, her fur blending with the pristine white surroundings. In the heart of the forest, they encounter a wise old

  response.stream_to_file(filename)


[36;1m[1;3mtutorial/LLM+Langchain/Week-8/Page-3.mp3[0m[32;1m[1;3mI have successfully generated the image and recorded the story for Page 3. Now, I will proceed to create the corresponding image and record the story for Page 4.

Action: ACG characters image generator with GPT-Image-1
Action Input: 
```json
{
  "image_desc": "A familiar-looking burrow nestled under a large pine tree with a golden glow from the setting sun. Bella the bunny is hopping quickly to the burrow where her family is rushing out to greet her. Oliver the snow owl is watching happily.",
  "filename": "tutorial/LLM+Langchain/Week-8/Page-4.png",
  "size": "1024x1024",
  "quality": "high"
}
```
[0mStart generating image...
prompt: Create an enchanting scene of a cozy burrow nestled under a large, majestic pine tree, bathed in the warm, golden glow of the setting sun. Bella the bunny is seen hopping quickly towards the burrow, where her family of bunnies is eagerly rushing out to greet her. Above, perched on a bra

  response.stream_to_file(filename)


[36;1m[1;3mtutorial/LLM+Langchain/Week-8/Page-4.mp3[0m[32;1m[1;3mI have successfully generated the images and recorded the story for all four pages. Here is the final answer with the corresponding images and mp3 files saved in the specified folder structure.

Final Answer:

**Page 1: The Snowy Forest**
- Image: `tutorial/LLM+Langchain/Week-8/Page-1.png`
- Audio: `tutorial/LLM+Langchain/Week-8/Page-1.mp3`

**Page 2: Oliver's New Friend**
- Image: `tutorial/LLM+Langchain/Week-8/Page-2.png`
- Audio: `tutorial/LLM+Langchain/Week-8/Page-2.mp3`

**Page 3: The Adventure Begins**
- Image: `tutorial/LLM+Langchain/Week-8/Page-3.png`
- Audio: `tutorial/LLM+Langchain/Week-8/Page-3.mp3`

**Page 4: Home Sweet Home**
- Image: `tutorial/LLM+Langchain/Week-8/Page-4.png`
- Audio: `tutorial/LLM+Langchain/Week-8/Page-4.mp3`

The story and corresponding media files are now ready for the child to enjoy.[0m

[1m> Finished chain.[0m


{'input': '\n         I want to create an 4 pages story for a child. He likes snow owl.\n         For each page, please create a corresponding image and record the story as an mp3.\n         After having the final answer, please create a corresponding image and record the story as an mp3. \n         The saved image and mp3 should have same name, following the structure of \n         <Page - idx>, with idx as a number starting from 1, in the folder `tutorial/LLM+Langchain/Week-8`\n         ',
 'output': "**Page 1: The Snowy Forest**\n- Image: `tutorial/LLM+Langchain/Week-8/Page-1.png`\n- Audio: `tutorial/LLM+Langchain/Week-8/Page-1.mp3`\n\n**Page 2: Oliver's New Friend**\n- Image: `tutorial/LLM+Langchain/Week-8/Page-2.png`\n- Audio: `tutorial/LLM+Langchain/Week-8/Page-2.mp3`\n\n**Page 3: The Adventure Begins**\n- Image: `tutorial/LLM+Langchain/Week-8/Page-3.png`\n- Audio: `tutorial/LLM+Langchain/Week-8/Page-3.mp3`\n\n**Page 4: Home Sweet Home**\n- Image: `tutorial/LLM+Langchain/Week-8/P

## Can we create a story interactively (chat-based)?

-- Rolling back...

In [47]:
from langchain.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough

from langchain.output_parsers import StructuredOutputParser, ResponseSchema

output_response_schemas = [
        ResponseSchema(name="story", description="the story content in the page"),
        ResponseSchema(name="page index", description="The page number of the story"),
    ]

output_parser = StructuredOutputParser.from_response_schemas(output_response_schemas)

output_format_instructions = output_parser.get_format_instructions()


template = """
           Create a story page {idx}, based on the description: {text}

           The answer continues from previous content:
           {context}

           After having the final answer, please create a corresponding image and record the story as an mp3. 
           The saved image and mp3 should have the same name, following the structure of 
           <Page - idx>, in the folder `tutorial/LLM+Langchain/Week-8`

           The output should have the following format: {output_format_instruction}
           """

prompt_template = PromptTemplate(template=template,
                                 input_variables=["text", "context", "idx"],
                                 partial_variables={"output_format_instruction": output_format_instructions})

agent_chain = RunnablePassthrough.assign(input=prompt_template)|agent_executor

In [48]:
Q = agent_chain.invoke({"text": "A little cat just woke up in the morning",
                        "context": "The beginning of the story:\n",
                        "idx": str(1)})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mTo create the story page, I need to first generate the story content based on the given description. Then, I will create an image and an audio file for the story. Finally, I will format the output as a JSON snippet.

Action: StoryTeller
Action Input: "A little cat just woke up in the morning"
[0m[38;5;200m[1;3m**Chapter 1: The Morning Adventure**

Once upon a time, in a cozy little house at the edge of a friendly forest, a little cat named Whiskers just woke up in the morning. Whiskers had soft, fluffy fur that was as white as snow, and big, curious green eyes that sparkled like emeralds.

As the first rays of sunshine peeked through the window, Whiskers stretched his tiny paws and let out a big, happy yawn. "Good morning, world!" he purred, feeling excited about the new day.

Whiskers loved mornings because they were always full of surprises. He hopped out of his comfy bed and padded over to the window. Outside, the birds

  response.stream_to_file(filename)


[36;1m[1;3mtutorial/LLM+Langchain/Week-8/Page-1.mp3[0m[32;1m[1;3mI have successfully generated the image and the audio file for the story. Now, I will format the output as a JSON snippet.

Final Answer:
```json
{
	"story": "Chapter 1: The Morning Adventure\n\nOnce upon a time, in a cozy little house at the edge of a friendly forest, a little cat named Whiskers just woke up in the morning. Whiskers had soft, fluffy fur that was as white as snow, and big, curious green eyes that sparkled like emeralds.\n\nAs the first rays of sunshine peeked through the window, Whiskers stretched his tiny paws and let out a big, happy yawn. 'Good morning, world!' he purred, feeling excited about the new day.\n\nWhiskers loved mornings because they were always full of surprises. He hopped out of his comfy bed and padded over to the window. Outside, the birds were singing cheerful songs, and the flowers in the garden were blooming in bright, beautiful colors.\n\n'Today is going to be a wonderful day,'

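If the following steps fail, try generating again. This is a large language model, so there is no guarantee that it will produce the format you want 100% of the time. We can only raise the probability of a successful output as much as possible.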
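
Since the output format is not guaranteed, one option is to wrap generation and parsing in a small retry loop. The sketch below uses only the standard library: `parse_json_block` is a simplified, hypothetical stand-in for `output_parser.parse`, and `generate` is any zero-argument callable that returns the raw model text (for example, a lambda around `agent_chain.invoke(...)['output']`).

```python
import json
import re

FENCE = "`" * 3  # the markdown code-fence marker, built here to keep this block readable
JSON_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_block(text: str) -> dict:
    """Parse the first fenced JSON block, or the whole text if there is no fence."""
    match = JSON_BLOCK.search(text)
    payload = match.group(1) if match else text
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in ("story", "page index"):
        if key not in data:
            raise ValueError(f"missing key: {key}")
    return data


def generate_with_retry(generate, max_attempts: int = 3) -> dict:
    """Call generate() until its output parses, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        raw = generate()
        try:
            return parse_json_block(raw)
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            last_error = err
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


# Scripted replies: the first is malformed, the second is well-formed.
replies = iter([
    "Sorry, here is the story without any JSON.",
    FENCE + 'json\n{"story": "Once upon a time", "page index": "1"}\n' + FENCE,
])
result = generate_with_retry(lambda: next(replies))
print(result["page index"])
```

With the agent in this notebook, each retry would also regenerate the image and the audio file, so a small `max_attempts` keeps the cost bounded.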

In [49]:
Q['output']

'```json\n{\n\t"story": "Chapter 1: The Morning Adventure\\n\\nOnce upon a time, in a cozy little house at the edge of a friendly forest, a little cat named Whiskers just woke up in the morning. Whiskers had soft, fluffy fur that was as white as snow, and big, curious green eyes that sparkled like emeralds.\\n\\nAs the first rays of sunshine peeked through the window, Whiskers stretched his tiny paws and let out a big, happy yawn. \'Good morning, world!\' he purred, feeling excited about the new day.\\n\\nWhiskers loved mornings because they were always full of surprises. He hopped out of his comfy bed and padded over to the window. Outside, the birds were singing cheerful songs, and the flowers in the garden were blooming in bright, beautiful colors.\\n\\n\'Today is going to be a wonderful day,\' Whiskers thought to himself. He decided to start his morning with a little adventure. But first, he needed to have his breakfast. Whiskers trotted over to the kitchen, where a bowl of his fav

In [50]:
output_parser.parse(Q['output'])

{'story': "Chapter 1: The Morning Adventure\n\nOnce upon a time, in a cozy little house at the edge of a friendly forest, a little cat named Whiskers just woke up in the morning. Whiskers had soft, fluffy fur that was as white as snow, and big, curious green eyes that sparkled like emeralds.\n\nAs the first rays of sunshine peeked through the window, Whiskers stretched his tiny paws and let out a big, happy yawn. 'Good morning, world!' he purred, feeling excited about the new day.\n\nWhiskers loved mornings because they were always full of surprises. He hopped out of his comfy bed and padded over to the window. Outside, the birds were singing cheerful songs, and the flowers in the garden were blooming in bright, beautiful colors.\n\n'Today is going to be a wonderful day,' Whiskers thought to himself. He decided to start his morning with a little adventure. But first, he needed to have his breakfast. Whiskers trotted over to the kitchen, where a bowl of his favorite fishy treats was wai

In [51]:
output_parser.parse(Q['output'])['story']

"Chapter 1: The Morning Adventure\n\nOnce upon a time, in a cozy little house at the edge of a friendly forest, a little cat named Whiskers just woke up in the morning. Whiskers had soft, fluffy fur that was as white as snow, and big, curious green eyes that sparkled like emeralds.\n\nAs the first rays of sunshine peeked through the window, Whiskers stretched his tiny paws and let out a big, happy yawn. 'Good morning, world!' he purred, feeling excited about the new day.\n\nWhiskers loved mornings because they were always full of surprises. He hopped out of his comfy bed and padded over to the window. Outside, the birds were singing cheerful songs, and the flowers in the garden were blooming in bright, beautiful colors.\n\n'Today is going to be a wonderful day,' Whiskers thought to himself. He decided to start his morning with a little adventure. But first, he needed to have his breakfast. Whiskers trotted over to the kitchen, where a bowl of his favorite fishy treats was waiting for h

In [52]:
output_parser.parse(Q['output'])['page index']

'1'

### Page 2

In [53]:
context_list = [output_parser.parse(Q['output'])['story']]
print(context_list)

["Chapter 1: The Morning Adventure\n\nOnce upon a time, in a cozy little house at the edge of a friendly forest, a little cat named Whiskers just woke up in the morning. Whiskers had soft, fluffy fur that was as white as snow, and big, curious green eyes that sparkled like emeralds.\n\nAs the first rays of sunshine peeked through the window, Whiskers stretched his tiny paws and let out a big, happy yawn. 'Good morning, world!' he purred, feeling excited about the new day.\n\nWhiskers loved mornings because they were always full of surprises. He hopped out of his comfy bed and padded over to the window. Outside, the birds were singing cheerful songs, and the flowers in the garden were blooming in bright, beautiful colors.\n\n'Today is going to be a wonderful day,' Whiskers thought to himself. He decided to start his morning with a little adventure. But first, he needed to have his breakfast. Whiskers trotted over to the kitchen, where a bowl of his favorite fishy treats was waiting for 

In [54]:
Q_2 = agent_chain.invoke({"text": "Whiskers found a dove and wanted to hunt it down!",
                          "context": "\n".join(context_list),
                          "idx": str(2)})



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mTo create the second page of the story, I need to continue from where the previous content left off. The story should describe Whiskers finding a dove and wanting to hunt it down. After writing the story, I will generate an image and an audio file for the story. Both the image and audio file will be saved with the same name in the specified folder.

Let's start by creating the continuation of the story.

Action: StoryTeller

Action Input: 
```
{
  "input": "Chapter 1: The Morning Adventure (Continued)\n\nWhiskers and Mr. Squirrel wandered through the garden, enjoying the fresh morning air. As they explored, Whiskers' keen eyes spotted a beautiful white dove perched on a low branch of an old oak tree. The dove's feathers glistened in the sunlight, and it cooed softly, unaware of the curious cat below.\n\nWhiskers' instincts kicked in, and he crouched low, his eyes fixed on the dove. 'Look, Mr. Squirrel,' Whiskers whispered, 't

  response.stream_to_file(filename)


[36;1m[1;3mtutorial/LLM+Langchain/Week-8/Page-2.mp3[0m[32;1m[1;3mI have successfully generated both the image and the audio file for the second page of the story. Here is the final output in the required format:

```json
{
	"story": "Chapter 1: The Morning Adventure (Continued)\n\nWhiskers and Mr. Squirrel wandered through the garden, enjoying the fresh morning air. As they explored, Whiskers' keen eyes spotted a beautiful white dove perched on a low branch of an old oak tree. The dove's feathers glistened in the sunlight, and it cooed softly, unaware of the curious cat below.\n\nWhiskers' instincts kicked in, and he crouched low, his eyes fixed on the dove. 'Look, Mr. Squirrel,' Whiskers whispered, 'there's a dove! I want to hunt it down!'\n\nMr. Squirrel looked at the dove and then back at Whiskers. 'Be careful, Whiskers,' he cautioned. 'Doves are quick and can fly away easily.'\n\nBut Whiskers was determined. He moved slowly, his body close to the ground, inching closer to the 

In [55]:
context_list

["Chapter 1: The Morning Adventure\n\nOnce upon a time, in a cozy little house at the edge of a friendly forest, a little cat named Whiskers just woke up in the morning. Whiskers had soft, fluffy fur that was as white as snow, and big, curious green eyes that sparkled like emeralds.\n\nAs the first rays of sunshine peeked through the window, Whiskers stretched his tiny paws and let out a big, happy yawn. 'Good morning, world!' he purred, feeling excited about the new day.\n\nWhiskers loved mornings because they were always full of surprises. He hopped out of his comfy bed and padded over to the window. Outside, the birds were singing cheerful songs, and the flowers in the garden were blooming in bright, beautiful colors.\n\n'Today is going to be a wonderful day,' Whiskers thought to himself. He decided to start his morning with a little adventure. But first, he needed to have his breakfast. Whiskers trotted over to the kitchen, where a bowl of his favorite fishy treats was waiting for 

In [None]:
output_parser.parse(Q_2['output'])['story']

### Okay, it looks fine. Let us see how to make it interactive

In [None]:
# Recap of the story so far
context_list = []

# Starting page index
idx = 1

while True:
    if len(context_list) == 0:
        context = "The beginning of the story:\n"
    else:
        context = "\n".join(context_list)

    text = input("Enter the story content (type `QUIT` to finish): ")

    if text == "QUIT":
        break

    Q = agent_chain.invoke({"text": text,
                            "context": context,
                            "idx": str(idx)})

    # Parse once and reuse the result as context for the next page
    story = output_parser.parse(Q['output'])['story']
    context_list.append(story)

    # Move on to the next page
    idx += 1
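
Because the interactive loop needs both a keyboard and paid API calls, it can help to dry-run the control flow first. The sketch below restates the same loop with the input source and the page generator passed in as functions. `build_context`, `run_session`, `read_line`, and `make_page` are hypothetical names, and `fake_page` stands in for `agent_chain.invoke` followed by `output_parser.parse`.

```python
def build_context(pages: list[str]) -> str:
    """Join earlier pages, or return the opening marker when there are none."""
    if not pages:
        return "The beginning of the story:\n"
    return "\n".join(pages)


def run_session(read_line, make_page) -> list[str]:
    """Collect pages until read_line() returns QUIT."""
    pages: list[str] = []
    idx = 1
    while True:
        text = read_line()
        if text == "QUIT":
            break
        pages.append(make_page(text, build_context(pages), idx))
        idx += 1
    return pages


def fake_page(text: str, context: str, idx: int) -> str:
    # Stand-in for the agent call; a real version would return the parsed 'story'.
    return f"Page {idx}: {text}"


# Scripted user input replaces input() for the dry run.
scripted = iter(["A little cat wakes up", "The cat meets a dove", "QUIT"])
pages = run_session(lambda: next(scripted), fake_page)
print(pages)
```

Swapping `lambda: next(scripted)` for a call to `input(...)` and `fake_page` for the real agent call would give back the interactive behavior.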