In [1]:
!pip install llmlingua llama-index llama-index-postprocessor-longllmlingua

Collecting llmlingua
  Downloading llmlingua-0.2.2-py3-none-any.whl.metadata (17 kB)
Collecting llama-index
  Downloading llama_index-0.10.59-py3-none-any.whl.metadata (11 kB)
Collecting llama-index-postprocessor-longllmlingua
  Downloading llama_index_postprocessor_longllmlingua-0.1.2-py3-none-any.whl.metadata (684 bytes)
Collecting tiktoken (from llmlingua)
  Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Collecting llama-index-agent-openai<0.3.0,>=0.1.4 (from llama-index)
  Downloading llama_index_agent_openai-0.2.9-py3-none-any.whl.metadata (729 bytes)
Collecting llama-index-cli<0.2.0,>=0.1.2 (from llama-index)
  Downloading llama_index_cli-0.1.13-py3-none-any.whl.metadata (1.5 kB)
Collecting llama-index-core==0.10.59 (from llama-index)
  Downloading llama_index_core-0.10.59-py3-none-any.whl.metadata (2.4 kB)
Collecting llama-index-embeddings-openai<0.2.0,>=0.1.5 (from llama-index)
  Downloading llama_index_embeddings_openai-

In [2]:
# Set the OpenAI API key
import os
os.environ["OPENAI_API_KEY"] = ''

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.response.pprint_utils import pprint_response
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small"
)
Settings.llm = OpenAI(model="gpt-4o-mini")

In [3]:
# load documents
documents = SimpleDirectoryReader("/content/pg").load_data()

# build index
index = VectorStoreIndex.from_documents(documents=documents)
retriever = index.as_retriever(similarity_top_k=10)

In [4]:
question = "Where did the author go for art school?"

In [5]:
contexts = retriever.retrieve(question)

In [6]:
context_list = [n.get_content() for n in contexts]
len(context_list)

10

In [7]:
context_list

['I didn\'t want to drop out of grad school, but how else was I going to get out? I remember when my friend Robert Morris got kicked out of Cornell for writing the internet worm of 1988, I was envious that he\'d found such a spectacular way to get out of grad school.\n\nThen one day in April 1990 a crack appeared in the wall. I ran into professor Cheatham and he asked if I was far enough along to graduate that June. I didn\'t have a word of my dissertation written, but in what must have been the quickest bit of thinking in my life, I decided to take a shot at writing one in the 5 weeks or so that remained before the deadline, reusing parts of On Lisp where I could, and I was able to respond, with no perceptible delay "Yes, I think so. I\'ll give you something to read in a few days."\n\nI picked applications of continuations as the topic. In retrospect I should have written about macros and embedded languages. There\'s a whole world there that\'s barely been explored. But all I wanted w

In [8]:
llm = OpenAI(model="gpt-4o-mini")
prompt = "\n\n".join(context_list + [question])

response = llm.complete(prompt)
print(str(response))

The author applied to two art schools: the Rhode Island School of Design (RISD) in the United States and the Accademia di Belle Arti in Florence, Italy. Ultimately, RISD accepted the author, and they went to Providence to attend that school.
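
The cell above is the uncompressed baseline: every retrieved chunk is joined with the question into a single prompt. A minimal sketch of that assembly step, using placeholder strings instead of the real chunks. `build_prompt` and `rough_token_count` are hypothetical helpers (not llama-index functions), and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
def build_prompt(context_list, question):
    """Join the retrieved chunks and append the question, separated by blank lines."""
    return "\n\n".join(context_list + [question])


def rough_token_count(text):
    """Crude size estimate: whitespace-separated words, not real model tokens."""
    return len(text.split())


# Placeholder chunks (hypothetical data, not the retrieved essay text).
toy_contexts = ["First retrieved chunk.", "Second retrieved chunk."]
toy_question = "Where did the author go for art school?"

toy_prompt = build_prompt(toy_contexts, toy_question)
print(toy_prompt)
print("Approximate size in words:", rough_token_count(toy_prompt))
```

With ten full-size chunks, this prompt grows quickly, which is the cost the compression step later in the notebook is meant to reduce.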


In [9]:
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import CompactAndRefine
from llama_index.postprocessor.longllmlingua import LongLLMLinguaPostprocessor
# Requires: pip install llama-index-postprocessor-longllmlingua (installed in the first cell)
node_postprocessor = LongLLMLinguaPostprocessor(
    instruction_str="Given the context, please answer the final question",
    target_token=300,
    rank_method="longllmlingua",
    additional_compress_kwargs={
        "condition_compare": True,
        "condition_in_question": "after",
        "context_budget": "+100",
        "reorder_context": "sort",  # enable document reordering
        "dynamic_context_compression_ratio": 0.3,
    },
)
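
The `additional_compress_kwargs` above configure a coarse, document-level pass that runs before token-level compression: contexts are ranked against the question, kept until a token budget is used up, and reordered. The sketch below illustrates that idea only and is not LLMLingua's implementation. It assumes the budget is `target_token` plus the `context_budget` offset (here `+100`) and that `"sort"` means most relevant first. The scores are made-up numbers, word counts stand in for tokens, and `select_contexts` is a hypothetical helper.

```python
def select_contexts(contexts, scores, target_token, budget_bonus=100):
    """Keep the highest-scoring contexts that fit into the word budget.

    Returns the kept contexts ordered from most to least relevant.
    At least one context is always kept, even if it exceeds the budget.
    """
    budget = target_token + budget_bonus
    # Rank context indices by descending relevance score.
    ranked = sorted(range(len(contexts)), key=lambda i: scores[i], reverse=True)
    kept, used = [], 0
    for i in ranked:
        size = len(contexts[i].split())  # word count as a token stand-in
        if kept and used + size > budget:
            break
        kept.append(contexts[i])
        used += size
    return kept


# Toy data: three tiny "contexts" with invented relevance scores.
toy_contexts = ["a b c", "d e f g", "h i"]
toy_scores = [0.2, 0.9, 0.5]

kept = select_contexts(toy_contexts, toy_scores, target_token=3, budget_bonus=3)
print(kept)
```

In the real postprocessor the relevance scores come from the compression model itself (`rank_method="longllmlingua"`), and the surviving contexts are then compressed token by token.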


In [10]:
retrieved_nodes = retriever.retrieve(question)
synthesizer = CompactAndRefine()

In [12]:
from llama_index.core.schema import QueryBundle

# outline steps in RetrieverQueryEngine for clarity:
# postprocess (compress), synthesize
new_retrieved_nodes = node_postprocessor.postprocess_nodes(
    retrieved_nodes, query_bundle=QueryBundle(query_str=question)
)

We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)


In [13]:
original_contexts = "\n\n".join([n.get_content() for n in retrieved_nodes])
compressed_contexts = "\n\n".join([n.get_content() for n in new_retrieved_nodes])

original_tokens = node_postprocessor._llm_lingua.get_token_length(original_contexts)
compressed_tokens = node_postprocessor._llm_lingua.get_token_length(compressed_contexts)

print(compressed_contexts)
print()
print("Original Tokens:", original_tokens)
print("Compressed Tokens:", compressed_tokens)
print("Compression Ratio:", f"{original_tokens/(compressed_tokens + 1e-5):.2f}x")

Our teacher, professor Ulivi, was a nice guy. He could see I worked hard, and gave me a good grade, which he wrote down in a sort of passport each student had. But the Accademia wasn't teaching me anything except Italian, and my money was running out, so at the end of the first year I went back to the US.

 do next?tm' advice hadn't anything about that. wanted to something completely different, so I I'd. I wanted to see good I could get if really focused on it day after I working onC, I started painting I rusty and it took a while to get back into shape, but it was at least completely engaging. [8]
 wanted back to RISD, broke RISD very so decided to get a for and RIS the next fall. I one Interleaf, software. You mean Microsoft Word? Exactly That was how that end software tends to high software. Inter had a years yetI learned in the class I took atD but was basically myself I that in  I then friend Nancy Par favor. A herant. much than my, and was be where artists were So yes wanted it! 
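
The ratio computed in the cell above divides the original token count by the compressed count, with a small epsilon guarding against division by zero. A minimal sketch of the same arithmetic as a reusable function. `compression_ratio` is a hypothetical helper, and the numbers are illustrative rather than this notebook's actual token counts.

```python
def compression_ratio(original_tokens, compressed_tokens, eps=1e-5):
    """Return how many times smaller the compressed text is than the original."""
    return original_tokens / (compressed_tokens + eps)


# Illustrative counts only (not measured from the notebook).
ratio = compression_ratio(1000, 250)
print(f"{ratio:.2f}x")
```

A ratio of 4x means the compressed context uses roughly a quarter of the original tokens.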

In [19]:
new_retrieved_nodes

[NodeWithScore(node=TextNode(id_='3a03140f-b66c-4cb5-a190-098d3e77402f', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text="Our teacher, professor Ulivi, was a nice guy. He could see I worked hard, and gave me a good grade, which he wrote down in a sort of passport each student had. But the Accademia wasn't teaching me anything except Italian, and my money was running out, so at the end of the first year I went back to the US.", mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'), score=None),
 NodeWithScore(node=TextNode(id_='2d58fd97-f015-4759-b8ef-a6b7efbedd0a', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text=" do next?tm' advice hadn't anything about that. wanted to something completely different, so I I'd. I wanted to see good I could

In [14]:
response = synthesizer.synthesize(question, new_retrieved_nodes)

In [15]:
print(str(response))

The author went to RISD (Rhode Island School of Design) for art school.


In [16]:
retriever_query_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[node_postprocessor]
)

In [17]:
response = retriever_query_engine.query(question)

In [18]:
print(str(response))

The author went to RISD, which stands for the Rhode Island School of Design.
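
Both runs arrive at the same answer because the query engine chains the same stages that were called by hand earlier: retrieve, postprocess (compress), then synthesize. A minimal sketch of that control flow with plain callables. `run_pipeline` and the `toy_*` functions are hypothetical stand-ins, not llama-index classes.

```python
def run_pipeline(question, retrieve, postprocessors, synthesize):
    """Retrieve nodes, pass them through each postprocessor, then synthesize."""
    nodes = retrieve(question)
    for postprocess in postprocessors:
        nodes = postprocess(nodes, question)
    return synthesize(question, nodes)


def toy_retrieve(question):
    # Stand-in for retriever.retrieve(question).
    return ["alpha beta gamma delta", "epsilon zeta eta theta"]


def toy_compress(nodes, question):
    # Stand-in for the compressing postprocessor: keep two words per chunk.
    return [" ".join(n.split()[:2]) for n in nodes]


def toy_synthesize(question, nodes):
    # Stand-in for the response synthesizer.
    return f"{question} -> {' | '.join(nodes)}"


answer = run_pipeline("q?", toy_retrieve, [toy_compress], toy_synthesize)
print(answer)
```

Passing the postprocessor via `node_postprocessors` keeps the compression step inside the engine, so every query is compressed without extra calls.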
