# Step 1: Generate Samples

First, we will take some books from the public domain and convert them into our representation of a Story object so that we have some solid references to build on.

## Data Model

To keep building up the context for the LLM with more and more detail as we get it, we will define a data model in [model.py](./model.py) so that all of the notebooks can share it. We will build up a Story object, iterate over it to refine it, save it to disk as a checkpoint, and then reload it in the next step.

Because the model uses pydantic types, it will be easy to plug it into other tooling later, such as a FastAPI service.
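
The build, checkpoint, and reload cycle described above can be sketched without any dependencies. This is only an illustration: the `Character` and `Story` fields and the `save_checkpoint` / `load_checkpoint` helpers are hypothetical, and standard-library dataclasses stand in for the pydantic models that actually live in `model.py`.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Character:
    # Hypothetical fields; the real model in model.py may differ.
    name: str
    description: str = ""


@dataclass
class Story:
    title: str
    summary: str = ""
    characters: list[Character] = field(default_factory=list)


def save_checkpoint(story: Story, path: Path) -> None:
    """Write the story to disk as JSON."""
    path.write_text(json.dumps(asdict(story), indent=2), encoding="utf-8")


def load_checkpoint(path: Path) -> Story:
    """Rebuild a Story, including its nested characters, from a JSON checkpoint."""
    data = json.loads(path.read_text(encoding="utf-8"))
    characters = [Character(**c) for c in data.pop("characters", [])]
    return Story(characters=characters, **data)


# Build up a story, refine it, then round-trip it through a checkpoint file.
story = Story(title="The Gift of the Magi")
story.characters.append(Character(name="Della"))
story.summary = "A couple each give up a prized possession to buy the other a gift."

with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "story.json"
    save_checkpoint(story, checkpoint)
    reloaded = load_checkpoint(checkpoint)
```

With pydantic the manual reconstruction in `load_checkpoint` would not be needed, since nested models are validated and rebuilt automatically.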

## Install Requirements

In [None]:
%pip install -U ipywidgets
%pip install torch
%pip install llama-index
%pip install llama-index-llms-ollama
%pip install llama-index-embeddings-ollama
%pip install tiktoken

# Optional extras, uncomment as needed
# %pip install llama-index-llms-huggingface
# %pip install llama-index-embeddings-huggingface
# %pip install llama-index-embeddings-huggingface-api
# %pip install llama-index-extractors-entity
# %pip install --upgrade protobuf
# %pip install -U bitsandbytes
# %pip install llama-parse
# !pip install replicate

In [None]:
# Enable the use of asyncio in Jupyter notebooks
import nest_asyncio
nest_asyncio.apply()

## Make sure the LLM works

In [None]:
import model_text
model_text.llm.complete("Hello, my name is Matt", temperature=0)

In [None]:
# Super verbose debugging
import llama_index_log_handler
import logging
callback_manager = llama_index_log_handler.callback_manager

## Load Some Documents

### Short Story 

"Gift of the Magi"

Let's start with a short one.

In [None]:
# Load the book
from llama_index.core import SimpleDirectoryReader

gift_of_the_magi_path = "training/books/gift_of_the_magi.txt"
gift_of_the_magi_docs = SimpleDirectoryReader(input_files=[gift_of_the_magi_path], filename_as_id=True).load_data()
gift_of_the_magi_docs

## Run extractors over all of the nodes of the story

We've written a couple of custom LlamaIndex extractors that use the LLM to parse out Character and Scene details.

Check [llama_index_extractors.py](./llama_index_extractors.py) and the [LlamaIndex Metadata Extraction Docs](https://docs.llamaindex.ai/en/stable/module_guides/indexing/metadata_extraction/) to see how they work.

Let's extract the characters, scenes, and summary from each node in the story.
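
As a rough picture of what an extractor does: it receives the chunks of a document and returns one metadata dictionary per chunk, which is then attached to the corresponding node. The sketch below shows only that shape. `extract_characters` and `fake_llm` are hypothetical names, the prompt is made up, and the real extractors in `llama_index_extractors.py` call the actual LLM and are built on LlamaIndex's extractor classes.

```python
from typing import Callable


def extract_characters(chunks: list[str], complete: Callable[[str], str]) -> list[dict]:
    """Return one metadata dict per chunk, in the same order as the chunks."""
    metadata = []
    for chunk in chunks:
        prompt = f"List the characters that appear in this passage:\n\n{chunk}"
        metadata.append({"characters": complete(prompt)})
    return metadata


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: report which known names occur in the prompt."""
    known = ["Della", "Jim"]
    return "\n".join(name for name in known if name in prompt)


chunks = ["Della counted her money.", "Jim came home late."]
chunk_metadata = extract_characters(chunks, fake_llm)
```

The `characters` key mirrors the `node.metadata['characters']` lookup used later in this notebook.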


In [None]:
import model_text
from llama_index_extractors import CharacterExtractor, SceneExtractor

from llama_index.core.extractors import TitleExtractor, SummaryExtractor
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.ingestion import IngestionPipeline, IngestionCache

# TODO: Vector store

llama_index_log_handler.enable()
# llama_index_log_handler.disable()

# Split chunks of text by sentences
sentence_splitter = SentenceSplitter(
    chunk_size=1024, 
    chunk_overlap=128,
    callback_manager=callback_manager,
)

# Our custom CharacterExtractor
character_extractor = CharacterExtractor(
    llm=model_text.llm, 
    callback_manager=callback_manager,
)

# Our custom SceneExtractor
scene_extractor = SceneExtractor(
    llm=model_text.llm, 
    callback_manager=callback_manager,
)

# Generic TitleExtractor
title_extractor = TitleExtractor(
    nodes=5, 
    llm=model_text.llm, 
    callback_manager=callback_manager
)

# Generic SummaryExtractor
summary_extractor = SummaryExtractor(
    nodes=5, 
    llm=model_text.llm, 
    callback_manager=callback_manager,
)

# Setup all the extractors in the pipeline
pipeline = IngestionPipeline(
    transformations=[
        sentence_splitter,
        title_extractor, 
        # qa_extractor,
        summary_extractor,
        scene_extractor,
        character_extractor,
        model_text.embedding,
        # entity_extractor,
    ]
)

# Run the pipeline
magi_nodes = pipeline.run(
    documents=gift_of_the_magi_docs,
    in_place=True,
    show_progress=True,
    callback_manager=callback_manager,
    num_workers=1,
)

## Take a look at the first node

In [None]:
from IPython.display import Markdown, display
node = magi_nodes[0]
characters = node.metadata['characters']
characters_md = "## Characters\n\n" + characters.replace('\n', '\n\n')
display(Markdown(characters_md))

# scenes = node.metadata['scenes']
# scenes_md = "## Scenes\n\n" + scenes.replace('\n', '\n\n')
# display(Markdown(scenes_md))

## Synthesize Nodes into a Story

Let's summarize all of the data gathered by the nodes.

For background, see the [LlamaIndex Query Engine docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/query_engine/).

For the different response modes, refer to the [LlamaIndex Response Synthesizer docs](https://docs.llamaindex.ai/en/stable/module_guides/querying/response_synthesizers/#configuring-the-response-mode).

**tree_summarize**: Query the LLM with the `summary_template` prompt as many times as needed to cover all of the concatenated chunks. The resulting answers are then treated as chunks themselves and fed recursively into another `tree_summarize` call, and so on, until only one chunk, and thus one final answer, remains.
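
The recursion can be sketched in a few lines of plain Python. This is a simplified illustration and not LlamaIndex's implementation: it groups a fixed number of chunks per call (the hypothetical `fan_in` parameter) instead of packing chunks to fit the context window, and `fake_llm` is a stand-in for a real completion call.

```python
from typing import Callable

PROMPT = (
    "Context information from multiple sources is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Query: {query_str}\n"
    "Answer: "
)


def tree_summarize(
    chunks: list[str],
    query: str,
    complete: Callable[[str], str],
    fan_in: int = 2,
) -> str:
    """Answer the query per group of chunks, then recurse on the answers."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    # Concatenate every `fan_in` neighbouring chunks into one context string.
    groups = ["\n\n".join(chunks[i : i + fan_in]) for i in range(0, len(chunks), fan_in)]
    answers = [complete(PROMPT.format(context_str=g, query_str=query)) for g in groups]
    if len(answers) == 1:
        return answers[0]
    # The answers become the chunks of the next level of the tree.
    return tree_summarize(answers, query, complete, fan_in)


calls: list[str] = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: list the known names found in the prompt."""
    calls.append(prompt)
    return ", ".join(name for name in ("Della", "Jim") if name in prompt)


chunks = [
    "Della sells her hair.",
    "Jim sells his watch.",
    "Della and Jim exchange gifts.",
]
answer = tree_summarize(chunks, "Who are the characters?", fake_llm)
```

Three chunks with a group size of two take two calls at the first level and one at the second, so the stub is called three times in total.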


In [None]:
import model_text

from llama_index.core import get_response_synthesizer
from llama_index.core.response_synthesizers.type import ResponseMode
from llama_index.core.prompts.base import PromptTemplate
from llama_index.core.prompts.prompt_type import PromptType

llama_index_log_handler.enable()
# llama_index_log_handler.disable()

# tree_summarize_prompt = (
#     "Context information from multiple sources is below.\n"
#     "---------------------\n"
#     "{context_str}\n"
#     "---------------------\n"
#     "Given the information from multiple sources and not prior knowledge, "
#     "answer the query.\n"
#     "Query: {query_str}\n"
#     "Answer: "
# )
# tree_summarize_prompt_tpl = PromptTemplate(tree_summarize_prompt, prompt_type=PromptType.SUMMARY)
tree_summarize_prompt = (
    "Here are multiple Story objects that were generated from different parts of the same story.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the information from multiple sources about the same story and not prior knowledge, "
    "answer the query.\n"
    "Query: {query_str}\n"
    "Answer: "
)
tree_summarize_prompt_tpl = PromptTemplate(tree_summarize_prompt, prompt_type=PromptType.SUMMARY)

# refine_prompt = (
#     "The original query is as follows: {query_str}\n"
#     "We have provided an existing answer: {existing_answer}\n"
#     "We have the opportunity to refine the existing answer "
#     "(only if needed) with some more context below.\n"
#     "------------\n"
#     "{context_msg}\n"
#     "------------\n"
#     "Given the new context, refine the original answer to better "
#     "answer the query. "
#     "If the context isn't useful, return the original answer.\n"
#     "Refined Answer: "
# )
# refine_prompt_tpl = PromptTemplate(refine_prompt, prompt_type=PromptType.REFINE)

# Loop over all the nodes and generate a combined response
response_synthesizer = get_response_synthesizer(
    # structured_answer_filtering=False, # Improve the quality of the structured answer, but needs tool calling
    response_mode=ResponseMode.TREE_SUMMARIZE, 
    # response_mode=ResponseMode.REFINE,
    use_async=True, 
    llm=model_text.llm,
    callback_manager=llama_index_log_handler.callback_manager,
    # summary_template=tree_summarize_prompt_tpl,
)



response_str = response_synthesizer.get_response(
    "Combine all JSON story objects into one single collective JSON object with the same format", 
    [f"Summary:\n\n{node.metadata['section_summary']}\n\n\nCharacters:\n\n{node.metadata['characters']}\n\n\nScenes:\n\n{node.metadata['scenes']}" for node in magi_nodes],
)
response_str



In [None]:
from llama_index.core import VectorStoreIndex
from llama_index.core import Settings

index = VectorStoreIndex(nodes=magi_nodes, embed_model=model_text.embedding)
query_engine = index.as_query_engine(llm=model_text.llm)
response = query_engine.query("Combine all JSON story objects into one single collective JSON object with the same format")
print(response)


## JSON Extraction

Smaller models sometimes fail to return the JSON we need. One way around that is to retry with increasing temperature (randomness) until we get a usable result. [story_extractor.py](./story_extractor.py) has a function `extract_story_json` that does this for us. Other libraries such as `Guidance` can do this too, but they won't retry on models they don't support, and the `llama-index` pydantic extraction has the same problem. We will roll our own to keep it flexible.
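
The retry idea can be sketched as follows. This is not the code in `story_extractor.py`: `extract_json_with_retry`, the temperature schedule, and `fake_llm` are all hypothetical, and the check here is only that the reply contains a parseable JSON object.

```python
import json
from typing import Callable, Optional


def extract_json_with_retry(
    complete: Callable[[str, float], str],
    prompt: str,
    temperatures: tuple[float, ...] = (0.0, 0.3, 0.6, 0.9),
) -> Optional[dict]:
    """Call the model at increasing temperatures until the reply parses as a JSON object."""
    for temperature in temperatures:
        reply = complete(prompt, temperature)
        # Models often wrap JSON in prose, so keep only the outermost {...} span.
        start, end = reply.find("{"), reply.rfind("}")
        if start == -1 or end <= start:
            continue
        try:
            parsed = json.loads(reply[start : end + 1])
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    # Every temperature failed; the caller decides what to do next.
    return None


tried: list[float] = []


def fake_llm(prompt: str, temperature: float) -> str:
    """Stand-in for a real model that only produces JSON at higher temperatures."""
    tried.append(temperature)
    if temperature < 0.6:
        return "Sure! Here is the story you asked for."
    return 'Here you go: {"title": "The Gift of the Magi"} Hope that helps.'


result = extract_json_with_retry(fake_llm, "Return the story as JSON.")
```

Starting at temperature 0 keeps the first attempt deterministic, and randomness is only added when the deterministic reply is unusable.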

In [None]:
from story_extractor import extract_story_json

story_json = extract_story_json(model_text.llm, response_str)
story_json.display()

In [None]:
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core.data_structs import Node
from llama_index.core.schema import NodeWithScore


In [None]:
from llama_index.core import DocumentSummaryIndex
# from llama_index.core.node_parser import SentenceSplitter

# splitter = SentenceSplitter(chunk_size=1024)

print(type(magi_nodes))
print(magi_nodes)

doc_summary_index = DocumentSummaryIndex.from_documents(
    gift_of_the_magi_docs,
    llm=model_text.llm,
    transformations=[
        sentence_splitter,
        title_extractor, 
        # qa_extractor,
        summary_extractor,
        scene_extractor,
        character_extractor,
        model_text.embedding,
        # entity_extractor,
    ],
    response_synthesizer=response_synthesizer,
    show_progress=True,
    embed_model=model_text.embedding,
    embed_summaries=True,
    callback_manager=callback_manager,
)

doc_summary_index

In [None]:
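# With filename_as_id=True the document ID is derived from the file path and may
# carry a suffix, so read it from the loaded document instead of reusing the path.
summary = doc_summary_index.get_document_summary(gift_of_the_magi_docs[0].doc_id)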


# "combine all Story JSON objects into one single collective JSON object with the same format"


In [None]:
# from IPython.display import Markdown, display

# # Summarize the book
# plot_summary = doc_summary_index.get_document_summary(DOC_ID)

# # Display the summary as markdown
# display(Markdown(f"## Plot Summary\n\n{plot_summary}"))