# Llama Debug Handler Demo

Here we showcase how our LlamaDebugHandler logs events as we build an index and run queries
within LlamaIndex.

**NOTE**: This is a beta feature. The usage within different classes and the API interface
for the CallbackManager and LlamaDebugHandler may change!

In [1]:
from llama_index.callbacks import CallbackManager, LlamaDebugHandler, CBEventType



In [2]:
from llama_index import GPTListIndex, ServiceContext, SimpleDirectoryReader, GPTVectorStoreIndex

In [3]:
docs = SimpleDirectoryReader("../data/paul_graham/").load_data()

In [4]:
from llama_index import ServiceContext, LLMPredictor
from langchain.chat_models import ChatOpenAI

# Use gpt-3.5-turbo with temperature=0 for more deterministic responses
llm_predictor = LLMPredictor(llm=ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

## Callback Manager Setup

In [5]:
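# print_trace_on_end=True prints the event trace after each run (see the outputs below)
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])
# Rebuild the service context so that it carries the callback manager
service_context = ServiceContext.from_defaults(callback_manager=callback_manager, llm_predictor=llm_predictor)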

## Trigger the callback with a query

In [6]:
index = GPTVectorStoreIndex.from_documents(docs, service_context=service_context)
query_engine = index.as_query_engine()

**********
Run: index_construction
    |_CBEventType.NODE_PARSING ->  0.141473 seconds
      |_CBEventType.CHUNKING ->  0.14093 seconds
    |_CBEventType.EMBEDDING ->  0.670048 seconds
    |_CBEventType.EMBEDDING ->  0.580676 seconds
    |_CBEventType.EMBEDDING ->  1.178181 seconds
**********


In [7]:
response = query_engine.query("What did the author do growing up?")

**********
Run: query
    |_CBEventType.QUERY ->  6.521125 seconds
      |_CBEventType.RETRIEVE ->  0.37282 seconds
        |_CBEventType.EMBEDDING ->  0.363772 seconds
      |_CBEventType.SYNTHESIZE ->  6.148209 seconds
        |_CBEventType.LLM ->  6.125813 seconds
**********
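
The indentation in the trace above reflects how events nest: the embedding call happens inside retrieval, and the LLM call inside synthesis. As a rough illustration only, the sketch below renders a similar tree from a plain parent-to-children mapping. The `children` and `durations` dictionaries and the `render_trace` helper are hypothetical names invented for this example, with durations copied from the output above; this is not how LlamaDebugHandler is implemented internally.

```python
from typing import Dict, List

# Hypothetical data for illustration: parent -> ordered children.
# "root" is an assumed placeholder for the top of the run.
children: Dict[str, List[str]] = {
    "root": ["QUERY"],
    "QUERY": ["RETRIEVE", "SYNTHESIZE"],
    "RETRIEVE": ["EMBEDDING"],
    "SYNTHESIZE": ["LLM"],
}

# Durations in seconds, copied from the trace output above.
durations: Dict[str, float] = {
    "QUERY": 6.521125,
    "RETRIEVE": 0.37282,
    "EMBEDDING": 0.363772,
    "SYNTHESIZE": 6.148209,
    "LLM": 6.125813,
}


def render_trace(node: str, depth: int = 0) -> List[str]:
    """Return one line per descendant of `node`, indented by nesting depth."""
    lines: List[str] = []
    for child in children.get(node, []):
        indent = "  " * depth
        lines.append(f"{indent}|_{child} -> {durations[child]} seconds")
        # Recurse so that grandchildren appear directly under their parent.
        lines.extend(render_trace(child, depth + 1))
    return lines


trace_lines = render_trace("root")
print("\n".join(trace_lines))
```

Reading the tree this way also makes the timing easy to interpret: almost all of the time spent in `QUERY` is spent in `SYNTHESIZE`, which in turn is dominated by the single `LLM` call.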


## Explore the Debug Information

The callback manager logs start and end events for the following event types:
- CBEventType.LLM
- CBEventType.EMBEDDING
- CBEventType.CHUNKING
- CBEventType.NODE_PARSING
- CBEventType.RETRIEVE
- CBEventType.SYNTHESIZE 
- CBEventType.TREE
- CBEventType.QUERY

The LlamaDebugHandler provides a few basic methods for exploring information about these events:

In [8]:
# Print timing info on the LLM calls made during the vector index query
print(llama_debug.get_event_time_info(CBEventType.LLM))

EventStats(total_secs=6.125813, average_secs=6.125813, total_count=1)
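
The three fields reported above are related in a simple way: with a single LLM call, the total and the average are the same. The following standard-library sketch shows one way such a summary could be computed from (start, end) timestamps. `TimingStats` and `summarize` are hypothetical stand-ins written for this example and are not part of LlamaIndex.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TimingStats:
    """Hypothetical stand-in that mirrors the fields printed by EventStats."""

    total_secs: float
    average_secs: float
    total_count: int


def summarize(intervals: List[Tuple[float, float]]) -> TimingStats:
    """Aggregate (start, end) timestamps, given in seconds, into summary stats."""
    elapsed = [end - start for start, end in intervals]
    total = sum(elapsed)
    count = len(elapsed)
    # Guard against division by zero when no events were recorded.
    average = total / count if count else 0.0
    return TimingStats(total_secs=total, average_secs=average, total_count=count)


# Two made-up calls: one lasting 2 seconds and one lasting 1 second.
stats = summarize([(0.0, 2.0), (5.0, 6.0)])
print(stats)
# TimingStats(total_secs=3.0, average_secs=1.5, total_count=2)
```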


In [9]:
# Print info on LLM inputs/outputs; returns a (start, end) event pair for each LLM call
event_pairs = llama_debug.get_llm_inputs_outputs()
print(event_pairs[0][0])
print(event_pairs[0][1].payload.keys())
print(event_pairs[0][1].payload['response'])

CBEvent(event_type=<CBEventType.LLM: 'llm'>, payload={'context_str': 'What I Worked On\n\nFebruary 2021\n\nBefore college the two main things I worked on, outside of school, were writing and programming. I didn\'t write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.\n\nThe first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district\'s 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain\'s lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.\n\nThe language we used was an early ve

In [10]:
# Get info on any event type
event_pairs = llama_debug.get_event_pairs(CBEventType.CHUNKING)
print(event_pairs[0][0].payload.keys())  # get first chunking start event
print(event_pairs[0][1].payload.keys())  # get first chunking end event

dict_keys(['text'])
dict_keys(['chunks'])
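
Each pair holds a start event followed by its matching end event, which is why index `[0]` exposes the input `text` and index `[1]` the resulting `chunks`. One plausible way to build such pairs, assuming every start and end event carries a shared identifier, is sketched below. The `Event` tuple, its `id_` field, and `pair_events` are assumptions made for illustration and may differ from the library's actual data model.

```python
from collections import defaultdict
from typing import Any, Dict, List, NamedTuple


class Event(NamedTuple):
    """Hypothetical event record; field names are assumptions for this sketch."""

    event_type: str
    id_: str
    payload: Dict[str, Any]


def pair_events(events: List[Event]) -> List[List[Event]]:
    """Group events by shared id and keep only complete (start, end) pairs."""
    by_id: Dict[str, List[Event]] = defaultdict(list)
    for event in events:
        by_id[event.id_].append(event)
    # Dicts preserve insertion order, so pairs come back in first-seen order.
    return [group for group in by_id.values() if len(group) == 2]


# Made-up log: event "a" has started and ended, event "b" has only started.
log = [
    Event("chunking", "a", {"text": "some document text"}),
    Event("chunking", "b", {"text": "another document"}),
    Event("chunking", "a", {"chunks": ["some document", "text"]}),
]

pairs = pair_events(log)
print(list(pairs[0][0].payload.keys()))  # keys of the start event payload
print(list(pairs[0][1].payload.keys()))  # keys of the end event payload
```

In this sketch an event that never received its end counterpart (event `"b"`) is simply left out of the result, which is one possible policy; a real handler might choose to report unmatched events instead.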


In [11]:
# Clear the currently cached events
llama_debug.flush_event_logs()