# DOCUMENTATION
___
## **LLM**

| **Model**                    | **Context Window** |**Limits**           |
|------------------------------|--------------------|---------------------|
| **Qwen 2.5 35B Code**        | Up to 128K tokens  |Fine Tuned for coding|
| **Cohere Command R+**        | Up to 128K tokens  | 1000 calls per month|
| **Mistral-7B-Instruct-v0.3** | 32,768 tokens      | Hallucinations      |

#### Using `Mistral-7B-Instruct-v0.3` because 
- reliable outputs
- 1000 calls per day
- works in huggingface(same as text-to-image)


## **Text-to-Image**


| **Category**               | **Stable Diffusion 3.5 Large Turbo**                                                                 | **FLUX.1-Dev FP16**                                                                       |
|----------------------------|-----------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|
| **API Endpoint**           | `stabilityai/stable-diffusion-3-5-large-turbo`                                                     | `black-forest-labs/FLUX.1-Dev-FP16`                                                     |
| **Key API Parameters**     | `guidance_scale=3`, `num_inference_steps=4`, `negative_prompt`                                     | `num_inference_steps=8`, `control_image` (Canny/Depth), `true_cfg=4.0`                  |
| **API Latency**            | 3.8–5.2 sec/image (cold start: 12–18 sec)                                                          | 6.1–8.9 sec/image (cold start: 15–22 sec)                                                |
| **Cost Efficiency**        | $0.0021/image (PRO tier)                                                                           | $0.0033/image (PRO tier)                                                                 |
| **Free Tier Limits**       | 500 requests/hour                                                                                  | 300 requests/hour                                                                        |
| **Max Resolution**         | 1024x1024 via single API call                                                                      | 2048x2048 (requires `high_res_fix=true` parameter)                                       |
| **Advanced Features**      | - 4-step inference <br> - Text-to-image only                                                       | - Unified ControlNet (Canny/Depth) <br> - Image-to-image <br> - Inpainting/Outpainting   |
| **NSFW Filtering**         | Enabled by default (`safety_checker=strict`)                                                       | Optional (`safety_checker=relaxed`)                                                      |
| **Rate Limits (PRO)**      | 5K requests/hour                                                                                   | 3K requests/hour                                                                         |
| **Use Case Focus**         | Rapid batch generation (social media, prototyping)                                                 | High-detail workflows (product design, architectural viz)                                |

#### using `SD-3.5-LT` because 
- latency is low
- More requests
- we want stialized images which is better on SD 



___
___
___

# Get **Summary**, **Characters** AND **Places** 

![Local Image](/home/prince/Documents/Project/BOOK/media/Sudo_summary.png "Smmary, characters and Places")

In [3]:
from reader import ebook,read_list,read_json
from typing import Optional,Dict,List
import requests
from dotenv import load_dotenv
import json
import os
from pydantic import BaseModel, conint
from dataclasses import dataclass

In [2]:
load_dotenv()
API = os.getenv("HF_API")

headers = {
    "Authorization": f"Bearer {API}",
    "Content-Type": "application/json",
}
url = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.3/v1/chat/completions"



In [3]:
sum_role='''(NOTE: Only output in JSON. Ensure the JSON format is valid, well-formed, and Ready to parse. nothing before or after the json file)
Input:  
1.Current Chapter Text: The current chapter to be analyzed.
2.Character List: A list of characters with their physical/visual descriptions till now (This chapter).
3.Places list: list of places and their visual description till now (This chapter).
4.Previous Chapters' Summary: Context from earlier chapters.

Rules:  
1.Narrative Summary: Summarize and explain the chapter in detail, integrating context and key developments from previous chapters and create a self containing summary and explaination. end with to be continued.
2.Character List: add new characters to the list based on this chapter and Update existing character's physical/visual descriptions. If no characters are mentioned, return the same list as given.
3.Places: Include an updated description of any significant locations mentioned in this chapter, focusing on environment, weather, vibe, and structure.
4.Output Format: Ensure the output is valid and well-structured JSON.

Output:  
Generate a JSON object in this format:
{
  "summary": "Detailed Summary and explination of the current chapter in context of previous chapters. Use previus chapter summary as context",
  "characters": {
      "Character Name": "Updated or new physical/visual description (age, looks, clothes, hair, body language) based on this chapter."
    },
  "places": {
      "Place Name": "Updated or new visual description (environment, weather, vibe, structure, etc.) based on this chapter."
  }
}
'''

#### **get_sum()** Function
![Local Image](/home/prince/Documents/Project/BOOK/media/Get_sum_logic.png "Smmary, characters and Places")


outputdir=
```{
    0:
        {
            "summary":"No previous Context Yet",
            "characters":{},
            "places": {}
        }
            }```


In [79]:


class SummarySchema(BaseModel):
    summary: str
    characters: Dict[str,str]
    places: Dict[str,str]

def sum_msg(text: str, context: str, characters: dict = {}, places: dict = {}) -> list:
    message = [
        {
            "role": "system",
            "content": sum_role
        },
        {
            "role": "user",
            "content": json.dumps({
                "past_context": context,
                "Current_Chapter": text,
                "character_list": characters,
                "places_list": places
            }),
        },
    ]
    return message

def get_summ(messages:Dict[str,str]) -> Optional[str]:
    data = {
        "messages": messages,
        "temperature": 0.7,
        "stream": False,
        "max_tokens":10000,
        "parameters": {
        "repetition_penalty": 1.3,
        "grammar": {
            "type": "json",
            "value": SummarySchema.model_json_schema()
                }
                    }
    }
    response = requests.post(url, headers=headers, json=data)

    if response.status_code == 200:
        response_data = response.json()
        assistant_message = response_data["choices"][0]["message"]["content"]
        return assistant_message
    else:
        print(f"Error: {response.status_code}, {response.text}")
        return None



In [6]:
# book=ebook("./books/Alchemist/Alchemist.epub")
book=ebook("./books/LP.epub")
title=book.get_metadata()['title']
_,text=zip(*book.get_chapters())
text=text[:10]
text[8]


'Then she had forced her cough to inflict remorse.\nThus the little prince, notwithstanding the goodwill of his love,\nsoon doubted her. He had taken unimportant words seriously, and\nhad become very unhappy.\n"I should not have listened to him," he confided to me one day, "you\nmust never listen to flowers. You have to look at them and breathe\nthem. Mine embalmed my planet, but I could not rejoice. This story\nof claws, which had so annoyed me, should have tempted me ... "\nHe again confided to me:\n"I did not understand anything! I should have judged her on the acts\nand not the words. She embarrassed me and enlightened me. I\nshould never have fled! I should have guessed his tenderness behind\nhis poor tricks. The flowers are so contradictory! But I was too\nyoung to know how to love him. "\nIX\nJe crois qu’il profita, pour son évasion, d’une migration d’oiseaux\nsauvages.\nI believe that he took advantage of a migration of wild birds for his\nescape. On the morning of the departur

In [81]:
output_dict={
    0:
        {
            "summary":"There is no previous context",
            "characters":{},
            "places": {}
        }
            }


In [82]:
output_dict

{0: {'summary': 'There is no previous context',
  'characters': {},
  'places': {}}}

In [83]:
for idx,i in enumerate(text):
 
    if len(i)<1200:
        output_dict[idx+1]=output_dict[idx]
    else:
        context=output_dict[idx]["summary"]
        characters=output_dict[idx]["characters"]
        places=output_dict[idx]["places"]
        
        mes=sum_msg(i,context,characters,places)
        
        summary_characters=get_summ(mes)
        output_json=read_json(summary_characters)
        output_dict[idx+1]=output_json
    print(f"chapter: {idx} done")
print(output_dict)

chapter: 0 done
chapter: 1 done
chapter: 2 done
{0: {'summary': 'There is no previous context', 'characters': {}, 'places': {}}, 1: {'summary': "The chapter delves into the author's childhood memories, specifically focusing on an incident where he drew a boa constrictor digesting an elephant. This drawing was met with confusion by grown-ups, leading him to abandon his dreams of becoming a painter. The chapter also touches upon his later life, including his career as a plane pilot and his encounters with adults. The story ends with a mysterious encounter in the Sahara desert, where a small voice asks him to draw a sheep.", 'characters': {'Author': 'A grown-up author who used to draw and dream of becoming a painter, but was discouraged by adults. Later, he became a plane pilot.'}, 'places': {'Sahara Desert': 'A vast, isolated desert where the author had a breakdown in his plane. The environment is harsh and unforgiving.'}}, 2: {'summary': 'In this chapter, the author recounts his life ex

In [84]:
for num, dic in output_dict.items():
    if num==0:
        continue
    print(f"CHAPTER {num} :\n")
    
    sum,char,places=dic.items()
    print("Characters:")
    if len(char)>0 :
        for name,i in char[1].items():
            print(f"{name}: {i}")
    
    print("\nPlaces:")
    if len(places)>0:
        for name,i in places[1].items():
            print(f"{name}: {i}")
    print(f"\nSummary:\n{sum[1]}")
    print("\n-------------------------------------------------------------------------------------------------------------------")

CHAPTER 1 :

Characters:
Author: A grown-up author who used to draw and dream of becoming a painter, but was discouraged by adults. Later, he became a plane pilot.

Places:
Sahara Desert: A vast, isolated desert where the author had a breakdown in his plane. The environment is harsh and unforgiving.

Summary:
The chapter delves into the author's childhood memories, specifically focusing on an incident where he drew a boa constrictor digesting an elephant. This drawing was met with confusion by grown-ups, leading him to abandon his dreams of becoming a painter. The chapter also touches upon his later life, including his career as a plane pilot and his encounters with adults. The story ends with a mysterious encounter in the Sahara desert, where a small voice asks him to draw a sheep.

-------------------------------------------------------------------------------------------------------------------
CHAPTER 2 :

Characters:
Author: A grown-up author who used to draw and dream of becoming

___
___
___


# Get **Scenes**
![Local Image](/home/prince/Documents/Project/BOOK/media/Sudo_summary.png )

In [85]:
class SceneSchema(BaseModel):
    scenes: Dict[str,str]
print(SceneSchema.model_json_schema())

{'properties': {'scenes': {'additionalProperties': {'type': 'string'}, 'title': 'Scenes', 'type': 'object'}}, 'required': ['scenes'], 'title': 'SceneSchema', 'type': 'object'}


In [86]:
def scene_msg(text: str) -> list:
    message = [
        {
            "role": "system",
            "content": ''' IMPORTANT-> ONLY OUTPUT IN JSON.
                You are a text-to-image prompt generator for for book visulizer.
                Your task is to analyze the provided input text and identify distinct scenes where there are changes in place or time. 
                For each identified scene, create a detailed and descriptive prompt suitable for generating an image.
                Only Consider a Scene change if there is a change in time place or characters.
                only include visual info. dont go into details.
                
                Characters: Refer to characters by their respective names as mentioned in the text.
                Places: Refer to places by their proper names as mentioned in the text.
                Ensure each prompt captures the scene's mood, setting, and key visual elements.
                
                1. **Input:**
                    - `text`: A block of narrative text.```
                2. **Output:**
                    - {
                        "scene1":"prompt1",
                        "scene2":"prompt2",
                        ...
                    }

                ### Instructions:
                - Identify key changes in location, characters, or significant actions to define separate scenes.
                - Use descriptive language to paint a vivid picture of each scene in the prompt.
                
                   ''',
        },
        
        {
            "role": "user",
            "content": f"""TEXT: {text}""",  # This should be your input text that describes the scenes
        },]
        
    return message

def get_scene(scene_msg: list) -> Optional[str]:
    data = {
        "messages": scene_msg,
        "max_tokens": 10000,  # Specify the maximum length of the response
        "temperature": 0,  # Control the randomness of the response
        "stream": False,
        "repetition_penalty": 1.3,
        "grammar": {
            "type": "json",
            "value": SceneSchema.model_json_schema()
                }
        }
    
    response = requests.post(url, headers=headers, json=data)

    # Check the response status code and process the output
    if response.status_code == 200:
        response_data = response.json()
        # Extract the assistant's message content
        assistant_message = response_data["choices"][0]["message"]["content"]
        return assistant_message
    else:
        print(f"Error: {response.status_code}, {response.text}")  # Print error details
        return None

In [87]:
scene_output_list=[]
print("--Genrating Scenes per chapter--")
for idx,i in enumerate(text):
    inputs=i.replace("\n"," ")
    scene_message = scene_msg(text)
    scene_output=get_scene(scene_message)
    scene_json_output=read_json(scene_output)
    scene_output_list.append(scene_json_output)
    print(f"chapter: {idx+1} Done")

--Genrating Scenes per chapter--
chapter: 1 Done
chapter: 2 Done
chapter: 3 Done


In [88]:
scene_output_list

[{'scene1': 'A young boy, around 6 years old, is drawing a boa constrictor digesting an elephant. The scene is set in his home, with a sense of isolation and confusion as the grown-ups do not understand his drawing. The mood is playful yet frustrated.',
  'scene2': 'The same boy, now a grown-up and a pilot, is stranded in the Sahara desert with a broken engine. He encounters a mysterious little man who requests a drawing of a sheep. The scene is set in the desert, with a sense of danger and loneliness.',
  'scene3': 'The little man, now referred to as the little prince, continues to ask questions and reveal his origins to the pilot. The scene is set in the desert, with a sense of mystery and intrigue.'},
 {'scene1': 'A young boy, around 6 years old, is drawing a boa constrictor digesting an elephant. The scene is set in his home, with a sense of isolation and confusion as the grown-ups do not understand his drawing. The mood is playful yet frustrated.',
  'scene2': 'The same boy, now a

___
___
___


# Get **Style**
![Local Image](/home/prince/Documents/Project/BOOK/media/Get_Style_logic.png )

In [4]:
def basic_llm_req(text: str) -> Optional[str]:
    messages = [
        {
            "role":"system",
            "content":"DO As Asked in The Input"
        },
        {
            "role":"user",
            "content":text
        }
    ]
    
    data = {
        "messages": messages,
        "max_tokens": 10000,  # Specify the maximum length of the response
        "temperature": 0,  # Control the randomness of the response
        "stream": False,
    }
    response = requests.post(url, headers=headers, json=data)

    # Check the response status code and process the output
    if response.status_code == 200:
        response_data = response.json()
        # Extract the assistant's message content
        assistant_message = response_data["choices"][0]["message"]["content"]
        return assistant_message
    else:
        print(f"Error: {response.status_code}, {response.text}")  # Print error details
        return None
 

In [90]:
combined_summary=""
for key,val in output_dict.items():
    val_content=val["summary"].replace("\n"," ")
    if key==0:
        continue
    output_string= f'''{combined_summary}... Chapter{key}: {val_content}'''

In [91]:
style_prompt='''Prompt:
Note - dont give more than asked for. 
"Analyze the following story and provide a list of image style tags that would best suit its themes, settings, and overall mood. 
The response should include the style, period, type of art, color palette etc. 1 tags per entry.
dont explain it just give tags.

**ONLY TAGS**

Response Format:

Style: Realism or Impressionism or Surrealism oretc
Type:Landscape, Portrait or Abstract or etc.
Color Palette:Warm tones or Cool colors or Monochromatic or black and white or etc
Mood:Serene or Dramatic or Melancholic or etc.

reuired:(Style,Type,Color_palette,Mood)
Story
'''

style=basic_llm_req(f'''{style_prompt}:{combined_summary} ''')

In [96]:
style

'Style: Realism\nType: Landscape\nColor Palette: Warm Tones\nMood: Melancholic'

___

# Get **Image**
![Local Image](/home/prince/Documents/Project/BOOK/media/Get_img_logic.png )

In [38]:
from huggingface_hub import InferenceClient
import time

client = InferenceClient(
    # model="strangerzonehf/Qs-Sketch",
    model="cagliostrolab/animagine-xl-3.1",
    token=API,
)

In [50]:
from PIL import Image

# def sd_get_images(text:str):
#     image = client.text_to_image(
#         f"Text: {text}",
#         model="stabilityai/stable-diffusion-3.5-large-turbo",
#         # negative_prompt="hand,feet,text,written,shinny,artificial,unnatural,plastic,words,letters",
#         guidance_scale=2,
#         num_inference_steps=10
        
#     )
#     image.show()

# def mj_get_images(text:str):
#     # output is a PIL.Image object
#     image = client.text_to_image(
# 	    f"Text: {text}",
# 	    model="strangerzonehf/Flux-Midjourney-Mix2-LoRA",
#         # negative_prompt="hand,feet,text,written,shinny,artificial,unnatural,plastic,words,letters",
#         guidance_scale=2,
#         num_inference_steps=10
#     )
#     image.show()
    
def img_gen(text: str,model:str):
    client = InferenceClient(
    model=model,
    token=API,
    )
    image = client.text_to_image(
    	f"Text:{text}",

        # guidance_scale=12,
        # num_inference_steps=30,
    )
    name=model.replace("/","_")
    image.save(f'''./model-test/{name}.png''')

model="digiplay/aurorafantasy_v1"
text="a middle aged man near a crashed plane. he talking to a little boy with yellow hair in a dessert. //Style - landscape "
img_gen(text,model)


In [54]:
model="digiplay/aurorafantasy_v1"
text="a middle aged man near a crashed plane. he talking to a little boy with yellow hair in a dessert. //Style - landscape "
img_gen(text,model)



KeyboardInterrupt: 

In [98]:
def get_images(key, text,tag,characters,places,style):
    image = client.text_to_image(
        f"Text: {text}// context-> characters: {characters}, places:{places} // style:{style} ",
        negative_prompt="hand,feet,text,written,shinny,artificial,unnatural,plastic,words,letters",
        height=1024,
        width= 1024,
        guidance_scale=2,
        num_inference_steps=10
        
    )
    image.save(f"./output/{title}_{key}_{tag}.png")

In [99]:
# outputs=[]
outputs=[[val for key,val in chapter_prompt.items()] for chapter_prompt in scene_output_list]
outputs

[['A young boy, around 6 years old, is drawing a boa constrictor digesting an elephant. The scene is set in his home, with a sense of isolation and confusion as the grown-ups do not understand his drawing. The mood is playful yet frustrated.',
  'The same boy, now a grown-up and a pilot, is stranded in the Sahara desert with a broken engine. He encounters a mysterious little man who requests a drawing of a sheep. The scene is set in the desert, with a sense of danger and loneliness.',
  'The little man, now referred to as the little prince, continues to ask questions and reveal his origins to the pilot. The scene is set in the desert, with a sense of mystery and intrigue.'],
 ['A young boy, around 6 years old, is drawing a boa constrictor digesting an elephant. The scene is set in his home, with a sense of isolation and confusion as the grown-ups do not understand his drawing. The mood is playful yet frustrated.',
  'The same boy, now a grown-up and a pilot, is stranded in the Sahara d

### Loop Logic
![Local Image](/home/prince/Documents/Project/BOOK/media/Image_Loop.png )

In [100]:

for idx,i in enumerate(outputs):
    chars=output_dict[idx+1]["characters"]
    places=output_dict[idx+1]["places"]
    for jdx,j in enumerate(i):
        ##try except for internal server error
        try:
            key=f"C{idx}S{jdx+1}"
            text=j
            tag=""
            get_images(key=key,text=text,tag=tag,characters=chars,places=places,style=style)
            print(f"{key} saved. tag : {tag} ")
            print(f"    prompt:{text} \n    characters:{chars}\n    places:{places}")
        except Exception as e:
            print(f"Error while calling the API :{e} \n waiting for 10 seconds.. ")
            time.sleep(10)
            get_images(key=key,text=text,tag=tag,characters=chars,places=places,style=style)
            print(f"{key} saved. tag : {tag} ")
            print(f"    prompt:{text}")

        time.sleep(18)

C0S1 saved. tag :  
    prompt:A young boy, around 6 years old, is drawing a boa constrictor digesting an elephant. The scene is set in his home, with a sense of isolation and confusion as the grown-ups do not understand his drawing. The mood is playful yet frustrated. 
    characters:{'Author': 'A grown-up author who used to draw and dream of becoming a painter, but was discouraged by adults. Later, he became a plane pilot.'}
    places:{'Sahara Desert': 'A vast, isolated desert where the author had a breakdown in his plane. The environment is harsh and unforgiving.'}
C0S2 saved. tag :  
    prompt:The same boy, now a grown-up and a pilot, is stranded in the Sahara desert with a broken engine. He encounters a mysterious little man who requests a drawing of a sheep. The scene is set in the desert, with a sense of danger and loneliness. 
    characters:{'Author': 'A grown-up author who used to draw and dream of becoming a painter, but was discouraged by adults. Later, he became a plane 