<a href="https://colab.research.google.com/github/thegallier/configs/blob/main/Mistral_7b_instruct_feature_test.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Note: Responses from local models can be quite slow, especially with 8-bit quantization.

With 4-bit quantization, `mistralai/Mistral-7B-Instruct-v0.1` uses about 12GB of VRAM and 8.5GB of RAM. I used a T4 High-RAM instance for this notebook.

In [1]:
#!pip install edgartools
from edgar import *

In [2]:
!pip install git+https://github.com/run-llama/llama_index

Collecting git+https://github.com/run-llama/llama_index
  Cloning https://github.com/run-llama/llama_index to /tmp/pip-req-build-49cpmlfo
  Running command git clone --filter=blob:none --quiet https://github.com/run-llama/llama_index /tmp/pip-req-build-49cpmlfo
  Resolved https://github.com/run-llama/llama_index to commit 6c6f586322b088bcae9005e0a704e9bc4d205055
  Running command git submodule update --init --recursive -q
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [3]:
!pip install transformers accelerate bitsandbytes



## Setup

### Data

In [4]:
from llama_index.readers import BeautifulSoupWebReader

url = "https://www.theverge.com/2023/9/29/23895675/ai-bot-social-network-openai-meta-chatbots"

documents = BeautifulSoupWebReader().load_data([url])

In [5]:
documents

[Document(id_='3c54bf12-8b73-40ae-8ebc-15bfb4500aca', embedding=None, metadata={'URL': 'https://www.theverge.com/2023/9/29/23895675/ai-bot-social-network-openai-meta-chatbots'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, hash='8bd1ac6935d2b15aeb539b7d5502efa0116c547f02899a478795a82705825838', text="The synthetic social network is coming - The VergeSkip to main contentThe VergeThe Verge logo.The Verge homepageThe Verge homepageThe VergeThe Verge logo./Tech/Reviews/Science/Entertainment/MoreMenuExpandThe VergeThe Verge logo.MenuExpandPlatformer/Artificial Intelligence/TechThe synthetic social network is comingThe synthetic social network is coming / Between ChatGPT’s surprisingly human voice and Meta’s AI characters, our feeds may be about to change foreverBy  Casey Newton, a contributing editor who has been writing about tech for over 10 years. He founded Platformer, a newsletter about Big Tech and democracy. Sep 29, 2023, 1:30 PM UTC|CommentsShare

### LLM

This should run on a T4 instance on the free tier.

In [6]:
import torch
from transformers import BitsAndBytesConfig
from llama_index.prompts import PromptTemplate
from llama_index.llms import HuggingFaceLLM

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)


llm = HuggingFaceLLM(
    model_name="mistralai/Mistral-7B-Instruct-v0.1",
    tokenizer_name="mistralai/Mistral-7B-Instruct-v0.1",
    query_wrapper_prompt=PromptTemplate("<s>[INST] {query_str} [/INST] </s>\n"),
    context_window=3900,
    max_new_tokens=256,
    model_kwargs={"quantization_config": quantization_config},
    # tokenizer_kwargs={},
    generate_kwargs={"temperature": 0.2, "top_k": 5, "top_p": 0.95},
    device_map="auto",
)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
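
For reference, the Mistral instruct model card describes the prompt layout as `<s>[INST] instruction [/INST] answer</s>[INST] follow-up [/INST]`: the `</s>` token closes a completed answer rather than following `[/INST]` directly. The sketch below builds such a prompt from plain strings. The helper name `build_mistral_prompt` and the `include_bos` flag are illustrative and not part of any library; the flag exists because a tokenizer usually adds the beginning-of-sequence token itself, so a literal `<s>` in a template may end up duplicated.

```python
from typing import List, Tuple


def build_mistral_prompt(
    turns: List[Tuple[str, str]],
    next_user_message: str,
    include_bos: bool = True,
) -> str:
    """Format completed (user, assistant) turns plus one new user message."""
    prompt = "<s>" if include_bos else ""
    for user_message, assistant_reply in turns:
        # Each finished exchange is closed with the end-of-sequence token.
        prompt += f"[INST] {user_message} [/INST] {assistant_reply}</s>"
    # The final instruction is left open so the model generates the answer.
    prompt += f"[INST] {next_user_message} [/INST]"
    return prompt


print(build_mistral_prompt([], "What was the revenue in 2020?"))
```

If responses come back empty or truncated, comparing the `query_wrapper_prompt` against this layout is a reasonable first check.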

In [7]:
!pip install llama_index



In [8]:
from llama_index import ServiceContext

#service_context = ServiceContext.from_defaults(llm=llm, embed_model="local:BAAI/bge-small-en-v1.5")
service_context = ServiceContext.from_defaults(llm=llm, embed_model="local")

### Index Setup

In [14]:
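import os

from edgar import Company

# SEC EDGAR expects a name and email address identifying the requester
os.environ["EDGAR_IDENTITY"] = "peter decrem pdecrem@hotmail.com"
aapl = Company("aapl")
# All of Apple's annual reports (Form 10-K)
filings = aapl.get_filings(form="10-K")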


In [15]:
aapl_html=filings.latest(1).html()

In [16]:
with open("aaplhtml","w") as f:
  f.write(aapl_html)

In [17]:
from llama_index.readers.file.flat_reader import FlatReader
from pathlib import Path

reader = FlatReader()
documents= reader.load_data(Path("aaplhtml"))


In [18]:
from llama_index import VectorStoreIndex

vector_index = VectorStoreIndex.from_documents(documents, service_context=service_context)

In [19]:
from llama_index import SummaryIndex

summary_index = SummaryIndex.from_documents(documents, service_context=service_context)

In [26]:
!pip install langchain sentence_transformers

Collecting sentence_transformers
  Downloading sentence-transformers-2.2.2.tar.gz (85 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 86.0/86.0 kB 2.2 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting sentencepiece (from sentence_transformers)
  Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 23.1 MB/s eta 0:00:00
Building wheels for collected packages: sentence_transformers
  Building wheel for sentence_transformers (setup.py) ... done
  Created wheel for sentence_transformers: filename=sentence_transformers-2.2.2-py3-none-any.whl size=125923 sha256=e668a1790cf931b4ea490840b592d644fd396097fbb4b5fc2148e443bf02e066
  Stored in directory: /root/.cache/pip/wheels/62/f2/10/1e606fd5f02395388f74e7462910fe851042f97238cbbd902f
Successfully built sentence_tr

In [27]:
from langchain.embeddings.huggingface import HuggingFaceBgeEmbeddings
from llama_index import ServiceContext

embed_model = HuggingFaceBgeEmbeddings(model_name="BAAI/bge-base-en")

#service_context = ServiceContext.from_defaults(embed_model=embed_model,llm=llm)
service_context = ServiceContext.from_defaults(embed_model="local",llm=llm)


In [42]:

from llama_index import set_global_service_context

set_global_service_context(service_context)

In [43]:
from llama_index.node_parser import (
    UnstructuredElementNodeParser,
)

node_parser = UnstructuredElementNodeParser(llm=llm)

# old
https://medium.com/@jerryjliu98/how-unstructured-and-llamaindex-can-help-bring-the-power-of-llms-to-your-own-data-3657d063e30d

In [44]:
raw_nodes_2021 = node_parser.get_nodes_from_documents(documents,llm=llm,embed_model="local")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
__root__
  Invalid \escape: line 5 column 5 (char 151) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"summary": "This table is a list of notes issued by the Nasdaq Stock Market LLC with various maturity dates and interest rates.",
"columns": [
{
"col\_name": "Trading symbol(s)",
"col\_type": "string"
},
{
"col\_name": "Name of each exchange on which registered",
"col\_type": "string"
},
{
"col\_name": "Common Stock, $0.00001 par value per share",
"col\_type": "string"
},
{
"col\_name": "AAPL",
"col\_type": "string"
},
{
"col\_name": "The Nasdaq Stock Market LLC",
"col\_type": "string"
},
{
"col\_name": "1.375% Notes due 2024",
"col\_type": "string"
},
{
"col\_name": "The Nasdaq Stock Market LLC",
"col\_type": "string"
}; pos=151; lineno=5; colno=5)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 539, in parse_raw
    obj = load_str_bytes(
  File "

Validation error on structured response: 1 validation error for TableOutput
__root__
  Invalid \escape: line 5 column 5 (char 151) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"summary": "This table is a list of notes issued by the Nasdaq Stock Market LLC with various maturity dates and interest rates.",
"columns": [
{
"col\_name": "Trading symbol(s)",
"col\_type": "string"
},
{
"col\_name": "Name of each exchange on which registered",
"col\_type": "string"
},
{
"col\_name": "Common Stock, $0.00001 par value per share",
"col\_type": "string"
},
{
"col\_name": "AAPL",
"col\_type": "string"
},
{
"col\_name": "The Nasdaq Stock Market LLC",
"col\_type": "string"
},
{
"col\_name": "1.375% Notes due 2024",
"col\_type": "string"
},
{
"col\_name": "The Nasdaq Stock Market LLC",
"col\_type": "string"
}; pos=151; lineno=5; colno=5)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 539, in parse_raw
    obj = load_str_bytes(


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
 25%|██▌       | 1/4 [00:34<01:44, 34.83s/it]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
__root__
  Invalid \escape: line 5 column 5 (char 376) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"summary": "This table shows the potential impact of a hypothetical interest rate increase on an investment portfolio. It compares the fair value and annual interest expense of the portfolio in 2022 and 2023, assuming a 100 basis point increase in interest rates for all tenors. The table also includes the impact on term debt and investment portfolio.",
"columns": [
{
"col\_name": "Interest Rate Sensitive Instrument",
"col\_type": "string",
"summary": "The instrument that is sensitive to interest rate changes."
},
{
"col\_name": "Hypothetical Interest Rate Increase",
"col\_type": "string",
"summary": "The potential interest rate increase."
},
{
"col\_name": "Potential Impact 2023",
"col\_type": "str

Validation error on structured response: 1 validation error for TableOutput
__root__
  Invalid \escape: line 5 column 5 (char 376) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"summary": "This table shows the potential impact of a hypothetical interest rate increase on an investment portfolio. It compares the fair value and annual interest expense of the portfolio in 2022 and 2023, assuming a 100 basis point increase in interest rates for all tenors. The table also includes the impact on term debt and investment portfolio.",
"columns": [
{
"col\_name": "Interest Rate Sensitive Instrument",
"col\_type": "string",
"summary": "The instrument that is sensitive to interest rate changes."
},
{
"col\_name": "Hypothetical Interest Rate Increase",
"col\_type": "string",
"summary": "The potential interest rate increase."
},
{
"col\_name": "Potential Impact 2023",
"col\_type": "string",
"summary": "The potential impact of the interest rate increase on the investment portfolio in 2023.

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
 50%|█████     | 2/4 [01:04<01:03, 31.88s/it]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
__root__
  Invalid \escape: line 5 column 5 (char 386) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"summary": "This table provides consolidated financial statements for a company for the years ended September 30, 2023, September 24, 2022, and September 25, 2021. It includes statements of operations, comprehensive income, balance sheets, shareholders' equity, and cash flows, as well as notes and a report from an independent registered public accounting firm.",
"columns": [
{
"col\_name": "Statement",
"col\_type": "string",
"summary": "The type of financial statement provided, such as statements of operations or balance sheets."
},
{
"col\_name": "Year",
"col\_type": "string",
"summary": "The year the financial statement is for, such as 2023 or 2022."
},
{
"col\_name": "Description",
"col\_type": 

Validation error on structured response: 1 validation error for TableOutput
__root__
  Invalid \escape: line 5 column 5 (char 386) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"summary": "This table provides consolidated financial statements for a company for the years ended September 30, 2023, September 24, 2022, and September 25, 2021. It includes statements of operations, comprehensive income, balance sheets, shareholders' equity, and cash flows, as well as notes and a report from an independent registered public accounting firm.",
"columns": [
{
"col\_name": "Statement",
"col\_type": "string",
"summary": "The type of financial statement provided, such as statements of operations or balance sheets."
},
{
"col\_name": "Year",
"col\_type": "string",
"summary": "The year the financial statement is for, such as 2023 or 2022."
},
{
"col\_name": "Description",
"col\_type": "string",
"summary": "A brief description of the financial statement, such as 'Consolidated Statements of

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
 75%|███████▌  | 3/4 [01:33<00:30, 30.27s/it]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
__root__
  Invalid \escape: line 5 column 5 (char 386) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"summary": "This table provides consolidated financial statements for a company for the years ended September 30, 2023, September 24, 2022, and September 25, 2021. It includes statements of operations, comprehensive income, balance sheets, shareholders' equity, and cash flows, as well as notes and a report from an independent registered public accounting firm.",
"columns": [
{
"col\_name": "Statement",
"col\_type": "string",
"summary": "The type of financial statement provided, such as statements of operations or balance sheets."
},
{
"col\_name": "Year",
"col\_type": "string",
"summary": "The year the financial statement is for, such as 2023 or 2022."
},
{
"col\_name": "Description",
"col\_type": 

Validation error on structured response: 1 validation error for TableOutput
__root__
  Invalid \escape: line 5 column 5 (char 386) (type=value_error.jsondecode; msg=Invalid \escape; doc={
"summary": "This table provides consolidated financial statements for a company for the years ended September 30, 2023, September 24, 2022, and September 25, 2021. It includes statements of operations, comprehensive income, balance sheets, shareholders' equity, and cash flows, as well as notes and a report from an independent registered public accounting firm.",
"columns": [
{
"col\_name": "Statement",
"col\_type": "string",
"summary": "The type of financial statement provided, such as statements of operations or balance sheets."
},
{
"col\_name": "Year",
"col\_type": "string",
"summary": "The year the financial statement is for, such as 2023 or 2022."
},
{
"col\_name": "Description",
"col\_type": "string",
"summary": "A brief description of the financial statement, such as 'Consolidated Statements of

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
100%|██████████| 4/4 [02:01<00:00, 30.40s/it]
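
The validation errors above all point at line 5, column 5 of the model's JSON, which is the backslash in `"col\_name"`: the model escaped underscores Markdown-style, and `\_` is not a valid JSON escape. A minimal sketch of one possible workaround is to undo that escaping before parsing. The helper `parse_llm_json` is hypothetical and not part of LlamaIndex, and the sample string is a shortened version of the output shown above.

```python
import json


def parse_llm_json(raw: str) -> dict:
    """Parse JSON emitted by an LLM after undoing Markdown-escaped underscores."""
    # "\_" is not a legal JSON escape sequence, so turn it back into "_".
    cleaned = raw.replace("\\_", "_")
    return json.loads(cleaned)


# Shortened sample in the shape of the failing output above.
raw = '{"col\\_name": "Trading symbol(s)", "col\\_type": "string"}'
parsed = parse_llm_json(raw)
print(parsed)
```

This only addresses the escaping problem; output that is cut off by `max_new_tokens`, as several of the dumps above appear to be, would still fail to parse.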


In [45]:
base_nodes_2021, node_mappings_2021 = node_parser.get_base_nodes_and_mappings(
    raw_nodes_2021
)

In [46]:
from llama_index.retrievers import RecursiveRetriever
from llama_index.query_engine import RetrieverQueryEngine
from llama_index import VectorStoreIndex

In [47]:
# construct top-level vector index + query engine
vector_index = VectorStoreIndex(base_nodes_2021)
vector_retriever = vector_index.as_retriever(similarity_top_k=1)
vector_query_engine = vector_index.as_query_engine(similarity_top_k=1)

In [48]:
from llama_index.retrievers import RecursiveRetriever

recursive_retriever = RecursiveRetriever(
    "vector",
    retriever_dict={"vector": vector_retriever},
    node_dict=node_mappings_2021,
    verbose=True,
)
query_engine = RetrieverQueryEngine.from_args(recursive_retriever)

In [49]:
response = query_engine.query("What was the revenue in 2020?")
print(str(response))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Retrieving with query id None: What was the revenue in 2020?
Retrieving text node: (2)

Services net sales include amortization of the deferred value of services bundled in the sales price of certain products.

Total net sales include $8.2 billion of revenue recognized in 2023 that was included in deferred revenue as of September 24, 2022, $7.5 billion of revenue recognized in 2022 that was included in deferred revenue as of September 25, 2021, and $6.7 billion of revenue recognized in 2021 that was included in deferred revenue as of September 26, 2020.

The Company’s proportion of net sales by disaggregated revenue source was generally consistent for each reportable segment in Note 13, “Segment Information and Geographic Data” for 2023, 2022 and 2021, except in Greater China, where iPhone revenue represented a moderately higher proportion of net sales.

Note 3 – Earnings Per Share

The following table shows the computation of basic and diluted earnings per 

In [50]:
response = query_engine.query("How many treasuries did apple hold?")
print(str(response))

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Retrieving with query id None: How many treasuries did apple hold?
Retrieving text node: Apple Inc. | 2023 Form 10-K | 31

Apple Inc.

CONSOLIDATED STATEMENTS OF CASH FLOWS

(In millions)

Years ended September 30, 2023 September 24, 2022 September 25, 2021 Cash, cash equivalents and restricted cash, beginning balances $ 24,977   $ 35,929   $ 39,789    Operating activities: Net income 96,995   99,803   94,680   Adjustments to reconcile net income to cash generated by operating activities: Depreciation and amortization 11,519   11,104   11,284   Share-based compensation expense 10,833   9,038   7,906    Other ( 2,227 ) 1,006   ( 4,921 ) Changes in operating assets and liabilities: Accounts receivable, net ( 1,688 ) ( 1,823 ) ( 10,125 ) Vendor non-trade receivables 1,271   ( 7,520 ) ( 3,903 ) Inventories ( 1,618 ) 1,484   ( 2,642 ) Other current and non-current assets ( 5,684 ) ( 6,499 ) ( 8,042 ) Accounts payable ( 1,889 ) 9,448   12,326   Other current and n

In [51]:
response

Response(response='[/', source_nodes=[NodeWithScore(node=TextNode(id_='20f1222a-a49d-48aa-a9f0-3770dbd3508d', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='b36e46cc-4121-4ddd-92ad-cd27925399e9', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='dcd4d39a57d46207779b5736fb219986cbebc22c6aa6b397b325cc2477c427b3'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='61ebad6f-b6ce-49ab-a2cb-91a4b8ac17c5', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='c2989eb96fd5f51ce58d26e0d0d2bc57d8e9f9f76861a076830d2f83a456e328'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='5b4a3dd0-c7d8-48de-b633-4838f7ee4aef', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='9d3a00881b0683d1cec91e35fa25479ff8ad83e7d3ef14f746d1850ab29c848c')}, hash='50c415621b2336e302fc3c9b03fc9a5e42db72aa3484966f73caf6fe9d86734b', text='Apple Inc. | 2023 Form 10-K | 31\n\nApple Inc.\

In [66]:
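# Wrap the retrieved text in a closed code fence so the model can tell
# where the instruction ends and the source text begins
llm.complete(
    "Extract Apple's cash position from the following text:\n```\n"
    + response.source_nodes[0].node.text
    + "\n```"
)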

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


CompletionResponse(text='\nApple Inc. | 2023 Form 10-K | 31\n\nApple Inc.\n\nCONSOLIDATED STATEMENTS OF CASH FLOWS\n\n(In millions)\n\nYears ended September 30, 2023 September 24, 2022 September 25, 2021 Cash, cash equivalents and restricted cash, beginning balances $ 24,977 $ 35,929 $ 39,789 \n\nOperating activities: Net income 96,995 99,803 94,680 \n\nAdjustments to reconcile net income to cash generated by operating activities: Depreciation and amortization 11,519 11,104 11,284 \n\nShare-based compensation expense 10,833 9,038 7,906 \n\nOther ( 2,227 ) 1,006 1,006 \n\nChanges in operating assets and li', additional_kwargs={}, raw={'model_output': tensor([[    1,     1,   733,  ..., 12858,   304,   635]], device='cuda:0')}, delta=None)

In [65]:
response.source_nodes[0].node.text

'Apple Inc. | 2023 Form 10-K | 31\n\nApple Inc.\n\nCONSOLIDATED STATEMENTS OF CASH FLOWS\n\n(In millions)\n\nYears ended September 30, 2023 September 24, 2022 September 25, 2021 Cash, cash equivalents and restricted cash, beginning balances $ 24,977 \xa0 $ 35,929 \xa0 $ 39,789 \xa0  Operating activities: Net income 96,995 \xa0 99,803 \xa0 94,680 \xa0 Adjustments to reconcile net income to cash generated by operating activities: Depreciation and amortization 11,519 \xa0 11,104 \xa0 11,284 \xa0 Share-based compensation expense 10,833 \xa0 9,038 \xa0 7,906 \xa0  Other ( 2,227 ) 1,006 \xa0 ( 4,921 ) Changes in operating assets and liabilities: Accounts receivable, net ( 1,688 ) ( 1,823 ) ( 10,125 ) Vendor non-trade receivables 1,271 \xa0 ( 7,520 ) ( 3,903 ) Inventories ( 1,618 ) 1,484 \xa0 ( 2,642 ) Other current and non-current assets ( 5,684 ) ( 6,499 ) ( 8,042 ) Accounts payable ( 1,889 ) 9,448 \xa0 12,326 \xa0 Other current and non-current liabilities 3,031 \xa0 6,110 \xa0 7,475 \xa0 C

In [67]:
from typing import Any

from pydantic import BaseModel
from unstructured.partition.html import partition_html


In [31]:
!pip install InstructorEmbedding

Collecting InstructorEmbedding
  Downloading InstructorEmbedding-1.0.1-py2.py3-none-any.whl (19 kB)
Installing collected packages: InstructorEmbedding
Successfully installed InstructorEmbedding-1.0.1


In [None]:
!pip install sentence_transformers

In [68]:
from typing import Any, List
from InstructorEmbedding import INSTRUCTOR
from llama_index.embeddings.base import BaseEmbedding


class InstructorEmbeddings(BaseEmbedding):
    def __init__(
        self,
        instructor_model_name: str = "hkunlp/instructor-large",
        instruction: str = "Represent the Computer Science documentation or question:",
        **kwargs: Any,
    ) -> None:
        self._model = INSTRUCTOR(instructor_model_name)
        self._instruction = instruction
        super().__init__(**kwargs)

    # These must be methods of the class, not functions nested inside __init__
    def _get_query_embedding(self, query: str) -> List[float]:
        # INSTRUCTOR encodes [instruction, text] pairs
        embeddings = self._model.encode([[self._instruction, query]])
        return embeddings[0]

    def _get_text_embedding(self, text: str) -> List[float]:
        embeddings = self._model.encode([[self._instruction, text]])
        return embeddings[0]

    def _get_text_embeddings(self, texts: List[str]) -> List[List[float]]:
        embeddings = self._model.encode(
            [[self._instruction, text] for text in texts]
        )
        return embeddings

In [None]:
!pip install langchain

In [69]:
from langchain.embeddings.huggingface import HuggingFaceBgeEmbeddings
from llama_index import ServiceContext

embed_model = HuggingFaceBgeEmbeddings(model_name="BAAI/bge-base-en")

#service_context = ServiceContext.from_defaults(embed_model=embed_model,llm=llm)
service_context = ServiceContext.from_defaults(embed_model="local",llm=llm)

In [72]:
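# HuggingFaceBgeEmbeddings is a LangChain embedding class: it exposes
# embed_query / embed_documents rather than LlamaIndex's get_text_embedding,
# and embed_documents expects a list of strings.
query_embedding = embed_model.embed_query("It is raining cats and dogs here!")
doc_embeddings = embed_model.embed_documents(["It's raining and the cats are out"])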

In [70]:
# The LangChain class has no get_text_embedding method; use embed_query
embeddings = embed_model.embed_query(
    "It is raining cats and dogs here!"
)

In [73]:
from llama_index import ServiceContext, set_global_service_context
from llama_index.llms import OpenAI
from llama_index.embeddings import OpenAIEmbedding, HuggingFaceEmbedding
from llama_index.node_parser import (
    SentenceWindowNodeParser,
)
from llama_index.text_splitter import SentenceSplitter

# create the sentence window node parser w/ default settings
node_parser = SentenceWindowNodeParser.from_defaults(
    window_size=3,
    window_metadata_key="window",
    original_text_metadata_key="original_text",
)

# base node parser is a sentence splitter
text_splitter = SentenceSplitter()

#llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-mpnet-base-v2", max_length=512
)
ctx = ServiceContext.from_defaults(
    llm=llm,
    embed_model=embed_model,
    # node_parser=node_parser,
)
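
To make the `window_size=3` setting above more concrete: a sentence-window parser keeps each sentence as its own node and stores the neighbouring sentences on either side as metadata. The following is a rough stdlib-only sketch of that idea, not the LlamaIndex implementation. The function `sentence_windows` and its regex-based splitting are assumptions made for illustration; the dictionary keys simply mirror the `window` and `original_text` metadata keys configured above.

```python
import re
from typing import Dict, List


def sentence_windows(text: str, window_size: int = 3) -> List[Dict[str, str]]:
    """Return one record per sentence together with its surrounding window."""
    # Naive split on sentence-ending punctuation; a library splitter will differ.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    records = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        records.append(
            {"original_text": sentence, "window": " ".join(sentences[start:end])}
        )
    return records


records = sentence_windows("One. Two. Three. Four. Five.", window_size=1)
print(records[2])
```

The single sentence is what gets embedded and matched, while the wider window is what can be handed to the LLM as context.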

In [35]:

set_global_service_context(service_context)

In [36]:
raw_nodes_2021 = node_parser.get_nodes_from_documents(documents,llm=llm,embed_model="local")

In [37]:
len(raw_nodes_2021)

625

In [38]:
import os
import pickle

# Cache the parsed nodes on disk so the slow parsing step only runs once
if not os.path.exists("2021_nodes.pkl"):
    # raw_nodes_2021 = node_parser.get_nodes_from_documents(documents,llm=llm,embed_model="local:BAAI/bge-small-en-v1.5")
    raw_nodes_2021 = node_parser.get_nodes_from_documents(documents,llm=llm,embed_model="local")
    with open("2021_nodes.pkl", "wb") as f:
        pickle.dump(raw_nodes_2021, f)
else:
    with open("2021_nodes.pkl", "rb") as f:
        raw_nodes_2021 = pickle.load(f)

In [None]:
print(raw_nodes_2021)

### Helpful Imports / Logging

In [39]:
from llama_index.response.notebook_utils import display_response

In [74]:
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

## Basic Query Engine

### Compact (default)

In [75]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("What are Apple's financial risks with respect to interest rates, inflation and foreign exchange?")

display_response(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


**`Final Response:`** [/

In [76]:
response

Response(response='[/', source_nodes=[NodeWithScore(node=TextNode(id_='82726765-43d7-492c-98ef-8d4030c3afb3', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='96761280-1add-4e51-b1f3-d1af28c43f4f', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='458f4f616ef88556bd5bdcfaf788bbfba1aca4762edc538de07853990ef633a0'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='bdbc6346-4dac-4ec8-95e9-05d2b633e36b', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='91d8b35cb6e67f0746b52fb8fbb6db457278d71b152d36a58e8856252eeb45d9'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='4357cf46-3a05-4b3f-a937-250f6f750b8f', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='7fd6ce578bd10222fa4598814e601072116f045a246deb89e8b47042ee560cb8')}, hash='7ddfdf856d367dbb4709dfdf9bb765a02de44cec8af362f4e4c722d4fba49034', text='Additionally, strengthening of foreign currenci

### Refine

In [77]:
query_engine = vector_index.as_query_engine(response_mode="refine")

response = query_engine.query("How do OpenAI and Meta differ on AI tools?")

display_response(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


**`Final Response:`** [/

In [None]:
response

### Tree Summarize

In [78]:
query_engine2 = vector_index.as_query_engine(response_mode="tree_summarize")

response = query_engine2.query("Does Apple have exposure to foreign exchange changes?")

display_response(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


**`Final Response:`** [/

In [83]:
response

Response(response='[/', source_nodes=[NodeWithScore(node=TextNode(id_='82726765-43d7-492c-98ef-8d4030c3afb3', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='96761280-1add-4e51-b1f3-d1af28c43f4f', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='458f4f616ef88556bd5bdcfaf788bbfba1aca4762edc538de07853990ef633a0'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='bdbc6346-4dac-4ec8-95e9-05d2b633e36b', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='91d8b35cb6e67f0746b52fb8fbb6db457278d71b152d36a58e8856252eeb45d9'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='4357cf46-3a05-4b3f-a937-250f6f750b8f', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='7fd6ce578bd10222fa4598814e601072116f045a246deb89e8b47042ee560cb8')}, hash='7ddfdf856d367dbb4709dfdf9bb765a02de44cec8af362f4e4c722d4fba49034', text='Additionally, strengthening of foreign currenci

In [80]:
# response is a Response object, so convert it to text before concatenating
llm.complete("Summarize the following: " + str(response))

## Router Query Engine

In [84]:
from llama_index.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    vector_index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts."
    )
)

summary_tool = QueryEngineTool(
    summary_index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document."
    )
)

### Single Selector

In [85]:
from llama_index.query_engine import RouterQueryEngine

query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    service_context=service_context,
    select_multi=False
)

response = query_engine.query("What was mentioned about apple and treasury investments?")

display_response(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


**`Final Response:`** [/

In [86]:
response

Response(response='[/', source_nodes=[NodeWithScore(node=TextNode(id_='aa16c2c8-9f22-44e3-82e2-31caae8971c3', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='96761280-1add-4e51-b1f3-d1af28c43f4f', node_type=<ObjectType.DOCUMENT: '4'>, metadata={}, hash='458f4f616ef88556bd5bdcfaf788bbfba1aca4762edc538de07853990ef633a0'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='caf04502-931c-48cc-840c-15032780dae5', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='4725fc4547114a0bdcbfebbb0812993671ed60425103353e2003c2efd0ad539e'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='681cad68-2869-4626-b807-7ddf99d5c238', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='5b2c04a1513aa524ea6d81f41ee06bc1c54bf91c1ef37192b8f2504b7604386c')}, hash='f8d321b3d4530f9ebbb5edadccdf2a6ea439565cc2ef081dd7b791d55f331127', text='(2)\n\nIn August 2023, the Company entered into

In [None]:
response

### Multi Selector

In [None]:
from llama_index.query_engine import RouterQueryEngine

query_engine = RouterQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    service_context=service_context,
    select_multi=True,
)

response = query_engine.query("Summarize Apple's interest rate, foreign exchange and inflation risk and the hedges they have in 3 bullet points")

display_response(response)

## SubQuestion Query Engine

In [None]:
from llama_index.tools import QueryEngineTool, ToolMetadata

vector_tool = QueryEngineTool(
    vector_index.as_query_engine(),
    metadata=ToolMetadata(
        name="vector_search",
        description="Useful for searching for specific facts."
    )
)

summary_tool = QueryEngineTool(
    summary_index.as_query_engine(response_mode="tree_summarize"),
    metadata=ToolMetadata(
        name="summary",
        description="Useful for summarizing an entire document."
    )
)

In [None]:
import nest_asyncio
nest_asyncio.apply()

In [None]:
from llama_index.query_engine import SubQuestionQueryEngine

query_engine = SubQuestionQueryEngine.from_defaults(
    [vector_tool, summary_tool],
    service_context=service_context,
    verbose=True,
)

response = query_engine.query("What was mentioned about Meta? How Does it differ from how OpenAI is talked about?")

display_response(response)

## SQL Query Engine

Here we download a sample SQLite database (Chinook) with 11 tables holding information about music, playlists, and customers. We limit the query engine to a few of those tables for this test.

In [None]:
import locale
locale.getpreferredencoding = lambda: "UTF-8"

In [None]:
!curl -L -o /content/chinook.zip https://www.sqlitetutorial.net/wp-content/uploads/2018/03/chinook.zip
!unzip /content/chinook.zip -d /content

In [None]:
from sqlalchemy import create_engine, MetaData, Table, Column, String, Integer, select, column

engine = create_engine("sqlite:////content/chinook.db")

In [None]:
from llama_index import SQLDatabase

sql_database = SQLDatabase(engine)

In [None]:
from llama_index.indices.struct_store import NLSQLTableQueryEngine

query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=["albums", "tracks", "artists"],
    service_context=service_context
)

In [None]:
response = query_engine.query("What are some albums? Limit it to 5.")

display_response(response)

In [None]:
response

In [None]:
response = query_engine.query("What are some artists? Limit it to 5.")

display_response(response)

This last query should require a more complex join.

In [None]:
response = query_engine.query("What are some tracks from the artist AC/DC? Limit it to 3")

display_response(response)

In [None]:
response

In [None]:
print(response.metadata['sql_query'])
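
For comparison with whatever `sql_query` the engine produces, the AC/DC question needs a join across all three tables: `tracks` to `albums` to `artists`. The sketch below runs such a query against a tiny in-memory database. The column names (`ArtistId`, `AlbumId`, `TrackId`, `Name`, `Title`) are assumed to match the Chinook schema, and the rows are made-up placeholders rather than real Chinook data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artists (ArtistId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE albums (AlbumId INTEGER PRIMARY KEY, Title TEXT, ArtistId INTEGER);
    CREATE TABLE tracks (TrackId INTEGER PRIMARY KEY, Name TEXT, AlbumId INTEGER);
    -- Placeholder rows for illustration only
    INSERT INTO artists VALUES (1, 'AC/DC'), (2, 'Accept');
    INSERT INTO albums VALUES (1, 'Album A', 1), (2, 'Album B', 2);
    INSERT INTO tracks VALUES
        (1, 'Track One', 1), (2, 'Track Two', 2), (3, 'Track Three', 1),
        (4, 'Track Four', 1), (5, 'Track Five', 1);
    """
)

# tracks -> albums -> artists, filtered by artist name
query = """
    SELECT tracks.Name
    FROM tracks
    JOIN albums ON tracks.AlbumId = albums.AlbumId
    JOIN artists ON albums.ArtistId = artists.ArtistId
    WHERE artists.Name = ?
    ORDER BY tracks.TrackId
    LIMIT 3
"""
rows = [name for (name,) in conn.execute(query, ("AC/DC",))]
print(rows)
```

A generated query that only touches `tracks`, or that filters on a column that does not exist, is a sign the model did not pick up the table relationships.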

## Programs

Depending on the LLM, you will have to test with either `OpenAIPydanticProgram` or `LLMTextCompletionProgram`.

In [None]:
from typing import List
from pydantic import BaseModel

from llama_index.program import OpenAIPydanticProgram, LLMTextCompletionProgram

class Song(BaseModel):
    """Data model for a song."""

    title: str
    length_seconds: int


class Album(BaseModel):
    """Data model for an album."""

    name: str
    artist: str
    songs: List[Song]

In [None]:
from llama_index.output_parsers import PydanticOutputParser

prompt_template_str = """\
Generate an example album, with an artist and a list of songs. \
Using the movie {movie_name} as inspiration.\
"""
program = LLMTextCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(Album),
    prompt_template_str=prompt_template_str,
    llm=llm,
    verbose=True,
)

In [None]:
output = program(movie_name="The Shining")

In [None]:
print(output)
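
A text-completion program only works if the model's free-form text can be turned back into the `Album` schema. The sketch below illustrates that step in isolation: it pulls the outermost JSON object out of a completion and builds typed objects from it. It uses stdlib dataclasses as a stand-in for the Pydantic models above, and both the `parse_album` helper and the sample completion are hypothetical, written by hand for illustration.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Song:
    title: str
    length_seconds: int


@dataclass
class Album:
    name: str
    artist: str
    songs: List[Song]


def parse_album(completion: str) -> Album:
    """Extract the outermost JSON object from LLM text and build an Album."""
    start, end = completion.find("{"), completion.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in completion")
    data = json.loads(completion[start : end + 1])
    songs = [Song(**song) for song in data["songs"]]
    return Album(name=data["name"], artist=data["artist"], songs=songs)


# Hand-written completion text, not real model output.
completion = """Sure! Here is the album:
{"name": "Overlook", "artist": "Room 237",
 "songs": [{"title": "Redrum", "length_seconds": 210}]}
"""
album = parse_album(completion)
print(album.songs[0].title)
```

Any extra prose inside the braces, a missing field, or a truncated object makes this step fail, which is why smaller models tend to need more retries here.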

## Data Agent

Similar to programs, OpenAI LLMs will use `OpenAIAgent`, while other LLMs will use `ReActAgent`.

In [None]:
from llama_index.agent import OpenAIAgent, ReActAgent

agent = ReActAgent.from_tools(
    [vector_tool, summary_tool],
    llm=llm,
    verbose=True
)

It seems tool usage is pretty flaky.

In [None]:
response = agent.chat("Hello!")
print(response)

In [None]:
response = agent.chat("What was mentioned about Meta? How Does it differ from how OpenAI is talked about?")
print(response)
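
One way to see why tool usage can be fragile with a small model: a ReAct-style agent depends on the LLM emitting a strictly formatted `Action` / `Action Input` block that the framework then parses. The sketch below is a simplified stand-in for that parsing step, not LlamaIndex's actual parser. The regular expression, the `parse_action` helper, and both sample outputs are assumptions made for illustration; only the tool name `vector_search` comes from this notebook.

```python
import json
import re
from typing import Optional, Tuple

# Expected layout: "Action: <tool name>" followed by "Action Input: <JSON object>"
ACTION_PATTERN = re.compile(
    r"Action:\s*(?P<tool>[\w-]+)\s*\nAction Input:\s*(?P<args>\{.*\})", re.DOTALL
)


def parse_action(llm_output: str) -> Optional[Tuple[str, dict]]:
    """Return (tool_name, arguments) if the output follows the expected layout."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        return None
    try:
        return match.group("tool"), json.loads(match.group("args"))
    except json.JSONDecodeError:
        return None


well_formed = (
    "Thought: I need to look this up.\n"
    "Action: vector_search\n"
    'Action Input: {"input": "treasury holdings"}'
)
malformed = "Thought: I should use vector\\_search with input treasury holdings."

print(parse_action(well_formed))
print(parse_action(malformed))
```

Only the first output can be turned into a tool call; the second, which resembles the escaped-underscore style seen earlier in this notebook, yields nothing to execute.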