

## Step 1: Set up Colab and install the required packages (if needed)

In [None]:
# Mount Google Drive in Colab
from google.colab import drive
drive.mount("/content/drive")
%cd '/content/drive/My Drive/LlamaIndex/vector_storage_example'

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/My Drive/LlamaIndex/vector_storage_example


In [None]:
!ls

data  data_2  data_3  llama_index  LlamaIndex.ipynb  neat_text.py  __pycache__


In [None]:
# !git clone https://github.com/jerryjliu/llama_index.git
# %cd llama_index
# !git pull
# !pip install llama_index
# !pip install --upgrade llama_index

In [None]:
!pip install llama_index
!pip install pypdf
!pip install openai
!pip install transformers
!pip install accelerate
!pip install sentence_transformers
!pip install chromadb
!pip install -U openai-whisper
!pip install pydub
!pip install einops



In [None]:
import chromadb
import openai
import torch
import transformers
from transformers import set_seed

from llama_index import (
    GPTListIndex,
    GPTTreeIndex,
    GPTVectorStoreIndex,
    ListIndex,
    LLMPredictor,
    ServiceContext,
    SimpleDirectoryReader,
    VectorStoreIndex,
)
from llama_index.composability.joint_qa_summary import QASummaryQueryEngineBuilder
from llama_index.indices.composability import ComposableGraph
from llama_index.indices.query.query_transform.base import StepDecomposeQueryTransform
from llama_index.llms import HuggingFaceLLM, OpenAI
from llama_index.query_engine import SubQuestionQueryEngine
from llama_index.query_engine.multistep_query_engine import MultiStepQueryEngine
from llama_index.storage.storage_context import StorageContext
from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.vector_stores import ChromaVectorStore

# Local helper module (neat_text.py in the working directory)
from neat_text import neat_text


set_seed(42)

## Step 2: Load the documents

In [None]:
# Note: the OpenAI model referred to here is GPT-3 text-davinci-003.
# NOTE: Even if you use a different model (e.g. a Hugging Face model), you still need to set an OpenAI API key; otherwise errors may still be thrown.
openai.api_key = "YOUR_OPENAI_API_KEY"  # replace with your OpenAI API key
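Hardcoding a key in a shared notebook is easy to leak. As a minimal sketch, assuming the key has been exported in an environment variable named `OPENAI_API_KEY` (the helper name `load_api_key` is hypothetical and not part of any library), the key could be read at runtime instead:

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running this notebook.")
    return key


# Usage (commented out so the sketch runs without a real key):
# import openai
# openai.api_key = load_api_key()
```

This keeps the secret out of the saved notebook and fails early with a readable message when the variable is not set.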

In [None]:
documents = SimpleDirectoryReader("data").load_data()

## Step 3: Define the ServiceContext and StorageContext

### Step 3(a) Defining the ServiceContext (i.e. the LLM) if you wish to use something other than the default

In [None]:
llm = HuggingFaceLLM(
    # context_window=3000,
    # max_new_tokens=256, #100
    # With do_sample=False decoding is greedy, so the temperature value has no effect.
    generate_kwargs={"temperature": 0.2, "do_sample": False},
    tokenizer_name="mosaicml/mpt-7b",
    model_name="mosaicml/mpt-7b",
    # device_map="auto",
    # stopping_ids=[50278, 50279, 50277, 1, 0],
    # tokenizer_kwargs={"max_length": 4096, "padding": True, "truncation": True, "return_tensors": "pt"},
    # float16 reduces memory usage on CUDA; remove this line if you are not using a GPU
    model_kwargs={"torch_dtype": torch.float16}
)

Loading mosaicml/mpt-7b requires to execute some code in that repo, you can inspect the content of the repository at https://hf.co/mosaicml/mpt-7b. You can dismiss this prompt by passing `trust_remote_code=True`.
Do you accept? [y/N] y
Instantiating an MPTForCausalLM model from /root/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b/72e5f594ce36f9cabfa2a9fd8f58b491eb467ee7/modeling_mpt.py
You are using config.init_device='cpu', but you can also use config.init_device="meta" with Composer + FSDP for fast initialization.




Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

### Step 3(b) Defining the StorageContext (i.e. the vector database) if you wish to use something other than the default

In [None]:
# Creating a Chroma client
# By default, Chroma will operate purely in-memory.
chroma_client = chromadb.Client()
chroma_collection = chroma_client.create_collection("data")
# set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)


In [None]:
service_context = ServiceContext.from_defaults(llm=llm, chunk_size=2000, embed_model="local")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, service_context=service_context, storage_context=storage_context)
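The `chunk_size` argument controls how large each piece of a document is before it is embedded and stored. As a rough illustration only (LlamaIndex counts tokens and uses its own, more careful splitter; the word-based `chunk_words` function below is a hypothetical simplification), fixed-size chunking with optional overlap can be sketched as:

```python
def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split `text` into chunks of at most `chunk_size` words, sharing `overlap` words between neighbours."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


print(chunk_words("a b c d e f g", chunk_size=3, overlap=1))
```

Larger chunks give the LLM more context per retrieved node but make retrieval coarser; smaller chunks do the opposite.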


## Step 4: Query the data

### Example 1: Semantic Search

In [None]:
# Query the data
query_engine = index.as_query_engine()
response = query_engine.query("What are potential factors that could worsen food security due to SNAP enrollment?")
print(neat_text(response))



1.
Self-selection by more food-needy households into the program.
2.
The program’s limited duration.
3.
The program’s limited amount of food assistance.
4.
The program’s limited eligibility.
5.
The program’s limited access to food.
6.
The program’s limited availability of food.
7.
The program’s limited distribution of food.
8.
The program’s limited access to food.
9.
The program’s limited distribution of food.
10.
The program’s limited access to food.
11.
The program’s limited distribution of food.
12.
The program’s limited access to food.
13.
The program’s limited distribution of food.
14.
The program’s limited access to food.
15.
The program’s limited distribution of food.
16.
The program’s limited access to food.
17.
The program’s limited distribution of food.
18.
The program’s limited access to food.
19.
The program’s limited distribution of food.
20.
The program’s limited access to food.
21.
The program’s limited distribution of food.
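Semantic search retrieves the stored chunks whose embedding vectors lie closest to the embedding of the query. The toy sketch below assumes cosine similarity as the closeness measure and uses made-up three-dimensional vectors; the metric the vector store actually applies may differ, and real embeddings have hundreds of dimensions. The names `cosine_similarity` and `top_k` are illustrative, not LlamaIndex or Chroma APIs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Made-up embeddings purely for illustration.
doc_vecs = {
    "snap_food_security": [0.9, 0.1, 0.0],
    "snap_economy": [0.1, 0.9, 0.0],
    "unrelated": [0.0, 0.0, 1.0],
}
print(top_k([1.0, 0.2, 0.0], doc_vecs, k=2))
```

The retrieved chunks are then passed to the LLM as context, which is why the quality of the final answer depends on both retrieval and generation.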


### Example 2: Summarization

In [None]:
query_engine = index.as_query_engine(response_mode="simple_summarize")
response = query_engine.query("Summarize the effect of SNAP benefits towards food insecurity.")

In [None]:
print(neat_text(response))

SNAP reduces food insecurity and diminishes other financial hardships.
Source: Shaefer and Gutierrez 2013.
Note: Sample includes low-income households with children.
Medical hardship is measured as whether the interviewee reported that in the past 12 months someone in the household chose not to see a doctor or go to the hospital when needed because of cost.
Food insecurity Medical hardship Housing UtilitiesRisk of falling behind on expenses including:Percentage point reduction-16-12-8-40In addition to reducing food insecurity, SNAP participation may also reduce households’ risk of suffering financial hardships.
Shaefer and Gutierrez (2013) use variation in state-level policies that affect SNAP access to study the impact of SNAP participation on a variety of outcomes.
They find that receiving SNAP reduces the likelihood of food insecurity by 13 percentage points.
SNAP also has spillover impacts on other aspects of families’ financial well-being.
Households have more resources available 

### Example 3: Synthesis over Heterogeneous Data

In [None]:
documents_2 = SimpleDirectoryReader("data_2").load_data()
index2 = VectorStoreIndex.from_documents(documents_2, service_context=service_context, storage_context=storage_context)
graph = ComposableGraph.from_indices(
    ListIndex,
    [index, index2],
    index_summaries=["summary1", "summary2"],
    service_context=service_context,
    storage_context=storage_context,
)
query_engine = graph.as_query_engine(response_mode="simple_summarize")
response = query_engine.query("Summarize the effect of SNAP benefits towards food insecurity.")

In [None]:
print(neat_text(response))

SNAP reduces food insecurity and diminishes other financial hardships.
Source: Shaefer and Gutierrez 2013.
Note: Sample includes low-income households with children.
Medical hardship is measured as whether the interviewee reported that in the past 12 months someone in the household chose not to see a doctor or go to the hospital when needed because of cost.
Food insecurity Medical hardship Housing UtilitiesRisk of falling behind on expenses including:Percentage point reduction-16-12-8-40In addition to reducing food insecurity, SNAP participation may also reduce households’ risk of suffering financial hardships.
Shaefer and Gutierrez (2013) use variation in state-level policies that affect SNAP access to study the impact of SNAP participation on a variety of outcomes.
They find that receiving SNAP reduces the likelihood of food insecurity by 13 percentage points.
SNAP also has spillover impacts on other aspects of families’ financial well-being.
Households have more resources available 

### Example 4: Sub Question Query Engine

In [None]:
# nest_asyncio allows nested event loops inside the notebook; see
# https://github.com/jerryjliu/llama_index/issues/6607
import nest_asyncio
from llama_index.callbacks import CallbackManager, LlamaDebugHandler

nest_asyncio.apply()

# load data
wiki_snap = SimpleDirectoryReader(input_dir="data_3").load_data()

# print a timing trace at the end of each operation
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])
service_context = ServiceContext.from_defaults(callback_manager=callback_manager)

# build index and query engine
vector_query_engine = VectorStoreIndex.from_documents(wiki_snap, service_context=service_context).as_query_engine()

# set up the base query engine as a tool
query_engine_tools = [
    QueryEngineTool(
        query_engine=vector_query_engine,
        metadata=ToolMetadata(name="wiki_snap", description="SNAP wiki description"),
    ),
]

query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    service_context=service_context,
)

**********
Trace: index_construction
    |_node_parsing ->  0.052384 seconds
      |_chunking ->  0.000728 seconds
      |_chunking ->  0.000741 seconds
      |_chunking ->  0.021837 seconds
      |_chunking ->  0.017479 seconds
      |_chunking ->  0.001407 seconds
      |_chunking ->  0.001065 seconds
      |_chunking ->  0.00462 seconds
      |_chunking ->  0.00077 seconds
    |_embedding ->  1.122332 seconds
    |_embedding ->  0.906346 seconds
**********
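A sub-question query engine breaks a compound question into simpler ones, sends each to a named tool, and then combines the answers. The schematic below shows only that control flow. Every function in it is a hypothetical stub: in the real engine an LLM generates the sub-questions and synthesizes the final answer, and each tool is a query engine rather than a plain function.

```python
def answer_with_sub_questions(question, decompose, tools, synthesize):
    """Decompose `question`, route each sub-question to its tool, then combine the answers."""
    sub_questions = decompose(question)  # list of (tool_name, sub_question) pairs
    qa_pairs = [(sub_q, tools[tool_name](sub_q)) for tool_name, sub_q in sub_questions]
    return synthesize(question, qa_pairs)


def stub_decompose(question):
    # Stand-in for the LLM call that proposes sub-questions.
    aspects = ("nutritional", "economic", "health")
    return [("wiki_snap", f"What is the {aspect} impact of SNAP?") for aspect in aspects]


def stub_tool(sub_question):
    # Stand-in for a query engine over the indexed documents.
    return f"(answer to: {sub_question})"


def stub_synthesize(question, qa_pairs):
    # Stand-in for the LLM call that merges the partial answers.
    return "\n".join(f"{q} -> {a}" for q, a in qa_pairs)


result = answer_with_sub_questions(
    "How does SNAP impact one's nutritional, economic and health well-being?",
    stub_decompose,
    {"wiki_snap": stub_tool},
    stub_synthesize,
)
print(result)
```

The `name` and `description` given in `ToolMetadata` matter because they are what the engine uses to decide which tool should receive each sub-question.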


In [None]:
response = query_engine.query("How does SNAP impact ones nutritional, economic and health well-being?")

Generated 3 sub questions.
[wiki_snap] Q: What is the nutritional impact of SNAP?
[wiki_snap] Q: What is the economic impact of SNAP?
[wiki_snap] Q: What is the health impact of SNAP?
[wiki_snap] A: Access to SNAP has been found to have positive health impacts. Studies have shown that toddlers and preschoolers in households with access to food stamps have better health outcomes compared to similar children without access to food stamps. Additionally, increasing SNAP participation has been associated with lower overall and male suicide rates. Furthermore, a recent study found that users of the program aged 50 and above had slower memory loss compared to non-users. These findings suggest that SNAP plays a role in improving health outcomes for its participants.
[wiki_snap] A: The nutritional impact of SNAP is inconclusive. Studies have shown that SNAP participants score slightly lower on the Healthy E

In [None]:
print(neat_text(response))

SNAP has an inconclusive impact on nutritional well-being.
While studies have shown that SNAP participants may have slightly lower scores on the Healthy Eating Index compared to low-income nonparticipants, SNAP increases the likelihood of consuming whole fruit.
However, it decreases the intake of dark green and orange vegetables by a modest amount.
It is important to note that SNAP does not have nutritional standards for purchases.
In terms of economic well-being, SNAP has a positive impact.
It is considered a counter-cyclical government assistance program, providing assistance to more low-income households during economic downturns.
The rise in SNAP participation during these times stimulates the economy by increasing SNAP expenditures.
Studies have shown that every dollar of SNAP benefits generates economic activity, with estimates ranging from $1.
73 to $1.
84.
SNAP benefits also have a multiplier effect on GDP.
Regarding health well-being, access to SNAP has been found to have posi
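The `neat_text` helper comes from a local `neat_text.py` that is not shown in this notebook. The output above splits `$1.73` and `$1.84` across lines, which suggests the helper breaks on every period. A hypothetical stand-in (named `neat_text_sketch` to make clear it is not the original) that only breaks where sentence-ending punctuation is followed by whitespace might look like this:

```python
import re


def neat_text_sketch(text) -> str:
    """Put each sentence on its own line without splitting decimals such as 1.73."""
    # str() lets the function accept response objects as well as plain strings.
    sentences = re.split(r"(?<=[.!?])\s+", str(text).strip())
    return "\n".join(s for s in sentences if s)


print(neat_text_sketch("Estimates range from $1.73 to $1.84. SNAP benefits also have a multiplier effect."))
```

This simple rule still breaks after abbreviations such as "e.g." when they are followed by a space, so it is a sketch rather than a robust sentence splitter.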

### Example 5: Joint QA Summary Query Engine

In [None]:
query_engine_builder = QASummaryQueryEngineBuilder()
query_engine = query_engine_builder.build_from_documents(wiki_snap)

In [None]:
response = query_engine.query(
    "Can you give me a summary of the impacts of SNAP?",
)
print(response)

SNAP, or the Supplemental Nutrition Assistance Program, has several impacts. Firstly, it helps alleviate hunger and improve food security by providing food assistance to low-income individuals and families. Secondly, SNAP benefits stimulate the economy by supporting local businesses such as grocery stores and farmers markets. Thirdly, SNAP improves nutrition by allowing participants to purchase a variety of nutritious foods. Fourthly, it supports vulnerable populations, including children, the elderly, individuals with disabilities, and college students, ensuring they have access to the food they need. Lastly, SNAP has been found to have positive effects on health outcomes and crime rates. Overall, SNAP plays a crucial role in providing food assistance, improving nutrition, and supporting the well-being of low-income individuals and families.


In [None]:
print(neat_text(response))

SNAP, or the Supplemental Nutrition Assistance Program, has several impacts.
Firstly, it helps alleviate hunger and improve food security by providing food assistance to low-income individuals and families.
Secondly, SNAP benefits stimulate the economy by supporting local businesses such as grocery stores and farmers markets.
Thirdly, SNAP improves nutrition by allowing participants to purchase a variety of nutritious foods.
Fourthly, it supports vulnerable populations, including children, the elderly, individuals with disabilities, and college students, ensuring they have access to the food they need.
Lastly, SNAP has been found to have positive effects on health outcomes and crime rates.
Overall, SNAP plays a crucial role in providing food assistance, improving nutrition, and supporting the well-being of low-income individuals and families.


In [None]:
response = query_engine.query(
    "What are some proposals to restrict the purchase of junk food through SNAP benefits?",
)


In [None]:
print(neat_text(response))

There have been periodic proposals to restrict the purchase of junk food through SNAP benefits.
However, these proposals have been rejected by Congress and the Department of Agriculture on grounds of administrative burden and personal freedom.
The USDA has noted that there are no federal standards to determine which foods should be considered "healthy" or not.
Some experts suggest incentivizing the purchase of healthy items through a credit or rebate program to encourage healthy eating.


### Example 6: Multi-Step Query Engine

In [None]:
# Original question: What problems led to the proposed 1977 Food Stamp Act, how did it affect SNAP benefits, and how did it influence future legislation and its implications for SNAP?

In [None]:
# Since the question is sequential in nature, we can break it down into 3 sequential components:
# 1. What existing problems led to the proposal of the 1977 Food Stamp Act?
# 2. How did the 1977 Food Stamp Act affect SNAP benefits?
# 3. Did the 1977 Food Stamp Act have an impact on future legislation and implications for SNAP?
gpt3 = OpenAI(temperature=0, model="text-davinci-003")
service_context = ServiceContext.from_defaults(llm=gpt3)
index = VectorStoreIndex.from_documents(documents)
step_decompose_transform = StepDecomposeQueryTransform(
    LLMPredictor(llm=gpt3), verbose=True
)
index_summary = "Used to answer questions about role of 1977 on immediate and future implications of SNAP"
# set Logging to DEBUG for more detailed outputs


query_engine = index.as_query_engine(service_context=service_context)
query_engine = MultiStepQueryEngine(
    query_engine=query_engine,
    query_transform=step_decompose_transform,
    index_summary=index_summary,
)



> Current query: How did the 1977 food stamp act affect SNAP benefits?
> New query: What were the immediate and future implications of the 1977 Food Stamp Act on SNAP benefits?
> Current query: How did the 1977 food stamp act affect SNAP benefits?
> New query: What were the specific changes to the Food Stamp Program made by the 1977 Food Stamp Act?
> Current query: How did the 1977 food stamp act affect SNAP benefits?
> New query: What were the specific changes to the Food Stamp Program made by the 1977 Food Stamp Act?

In [None]:
response = query_engine.query(
    "What existent problems in SNAP lead to the proposal of the 1977 food stamp act?",
)


> Current query: What existent problems in SNAP lead to the proposal of the 1977 food stamp act?
> New query: What were the immediate and future implications of the 1977 food stamp act on SNAP?
> Current query: What existent problems in SNAP lead to the proposal of the 1977 food stamp act?
> New query: What existent problems in SNAP led to the increased accessibility of the program for low-income households?
> Current query: What existent problems in SNAP lead to the proposal of the 1977 food stamp act?
> New query: What factors led to greater food insecurity among SNAP participants than among similarly low-income nonparticipants?

In [None]:
print(neat_text(response))

The proposal of the 1977 Food Stamp Act was likely influenced by the consistent prevalence of greater food insecurity among SNAP participants compared to similarly low-income nonparticipants.
This disparity in food security may have been attributed to factors such as the demographic characteristics of SNAP participants, including being younger, minority, less educated, female-headed, and having more children and a disabled member.
Additionally, the higher likelihood of poorer individuals leaving the sample prior to the end of the panel could have contributed to the need for changes in the SNAP program to address these existing problems.


In [None]:
response = query_engine.query(
    "How did the 1977 food stamp act affect SNAP benefits?",
)


In [None]:
print(neat_text(response))

The 1977 Food Stamp Act had a significant impact on SNAP benefits.
It brought about several changes to the Food Stamp Program, which is now known as SNAP.
These changes included increasing the maximum benefit amount, expanding eligibility criteria, and providing more flexibility in how benefits could be used.
As a result, the immediate implication of the 1977 Food Stamp Act was increased access to food and nutrition for low-income households.
In the long-term, research has shown that access to SNAP benefits during early life leads to positive health and economic outcomes, such as lower incidence of metabolic syndrome, improved health, higher economic self-sufficiency, and increased high school graduation rates.


In [None]:
response = query_engine.query(
    "Did the 1977 food stamp act have an impact on future legislation or implications on SNAP?",
)



> Current query: Did the 1977 food stamp act have an impact on future legislation or implications on SNAP?
> New query: What were the immediate and future implications of the 1977 Food Stamp Act on SNAP?
> Current query: Did the 1977 food stamp act have an impact on future legislation or implications on SNAP?
> New query: What are the long-term health and economic outcomes associated with access to SNAP during early life?
> Current query: Did the 1977 food stamp act have an impact on future legislation or implications on SNAP?
> New query: What are the immediate implications of the 1977 Food Stamp Act on SNAP?

In [None]:
print(neat_text(response))

Yes, the 1977 Food Stamp Act had an impact on future legislation and implications on SNAP.
The Act expanded the eligibility criteria and benefits available under the program, which set a precedent for future changes and expansions to SNAP.
This legislation demonstrated a recognition of the importance of providing access to food and a nutritious diet for low-income households, which influenced subsequent policies and discussions surrounding SNAP.
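The traces in this example show three "Current query" / "New query" pairs per question: at each step the original question is rewritten into a narrower query, informed by what earlier steps returned. The loop below is a schematic of that pattern only. The functions `multi_step_query`, `stub_transform` and `stub_query` are hypothetical stand-ins, and the real engine uses an LLM for the rewriting and the index for answering.

```python
def multi_step_query(question, transform, query, num_steps=3):
    """Run up to `num_steps` rounds of rewrite-then-query, accumulating (query, answer) pairs."""
    reasoning = []
    for _ in range(num_steps):
        step_query = transform(question, reasoning)
        if step_query is None:
            break  # the transform signals that nothing more needs to be asked
        reasoning.append((step_query, query(step_query)))
    return reasoning


def stub_transform(question, reasoning):
    # Stand-in for the LLM-based step decomposition: ask a fixed sequence of follow-ups.
    follow_ups = [
        "What changes did the 1977 Food Stamp Act make?",
        "What were the immediate implications of those changes?",
    ]
    return follow_ups[len(reasoning)] if len(reasoning) < len(follow_ups) else None


def stub_query(step_query):
    # Stand-in for querying the vector index.
    return f"(answer to: {step_query})"


steps = multi_step_query("How did the 1977 food stamp act affect SNAP benefits?", stub_transform, stub_query)
for step_query, answer in steps:
    print(step_query, "->", answer)
```

Because each rewrite depends on the accumulated answers, a weak early step can steer later steps off topic, which is worth keeping in mind when reading the traces.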
