### Indexing and Embedding

In [3]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM

# local embedding model
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# local LLM
Settings.llm = HuggingFaceLLM(
    model_name="microsoft/phi-2",  # a small (2.7B-parameter) model that can run locally
    tokenizer_name="microsoft/phi-2",
    context_window=2048,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.7, "do_sample": True},
    device_map="auto",
)


documents = SimpleDirectoryReader("../../data").load_data()

index = VectorStoreIndex.from_documents(documents, show_progress=True)

Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00,  3.51s/it]
Parsing nodes: 100%|██████████| 40/40 [00:00<00:00, 501.74it/s]
Generating embeddings: 100%|██████████| 83/83 [00:03<00:00, 25.28it/s]
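
To make the indexing step above more concrete, here is a minimal sketch of what a vector index does conceptually: each chunk is embedded once and stored, and a query is answered by ranking the stored vectors by cosine similarity. This is not LlamaIndex's implementation. `toy_embed`, `build_toy_index`, and `top_k` are hypothetical names, and the hashed bag-of-words vector is only a stand-in for a real embedding model such as `BAAI/bge-small-en-v1.5`.

```python
import math
import zlib


def toy_embed(text, dim=64):
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * dim
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_toy_index(chunks):
    """Embed every chunk once and keep (chunk, vector) pairs in memory."""
    return [(chunk, toy_embed(chunk)) for chunk in chunks]


def top_k(toy_index, query, k=2):
    """Return (score, chunk) pairs for the k chunks most similar to the query."""
    q = toy_embed(query)
    # vectors are unit length, so the dot product equals the cosine similarity
    scored = [(sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in toy_index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


chunks = [
    "GPT models generate poems and riddles",
    "Vector indexes store one embedding per chunk",
    "Tokenizers split text into tokens",
]
toy_index = build_toy_index(chunks)
for score, chunk in top_k(toy_index, "store one embedding per chunk"):
    print(f"{score:.3f}  {chunk}")
```

A real embedding model places semantically related text close together even without shared words, which this toy version cannot do.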


You can also build the index directly from a list of Node objects, for example nodes produced by an ingestion pipeline:

In [7]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import TokenTextSplitter


# load the documents
documents = SimpleDirectoryReader("../../data").load_data()

# pipeline with a single transformation: a token-based text splitter
pipeline = IngestionPipeline(transformations=[TokenTextSplitter()])

# run the pipeline to turn the documents into nodes
nodes = pipeline.run(documents=documents)
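
The splitter step in the pipeline cuts each document into overlapping chunks before they are embedded. The sketch below illustrates that idea only, under the assumption that whitespace-separated words are an acceptable stand-in for tokens; the real `TokenTextSplitter` counts tokens with a tokenizer. `split_with_overlap` is a hypothetical helper, and its `chunk_size` and `chunk_overlap` values are picked for illustration.

```python
def split_with_overlap(text, chunk_size=8, chunk_overlap=2):
    """Split whitespace tokens into windows of chunk_size that share chunk_overlap tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    tokens = text.split()
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"t{i}" for i in range(20))
for chunk in split_with_overlap(sample):
    print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of embedding some tokens twice.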

In [8]:
index = VectorStoreIndex(nodes, show_progress=True)

Generating embeddings: 100%|██████████| 80/80 [00:03<00:00, 25.11it/s]


In [None]:
query_engine = index.as_query_engine()
response = query_engine.query("What is the document about?")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
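
The query engine retrieves the most relevant nodes and passes them to the LLM together with the question. The sketch below shows one way the retrieved text could be packed into a prompt while leaving room for the answer, using the `context_window=2048` and `max_new_tokens=256` values configured earlier. It is an illustration under assumptions, not LlamaIndex's actual prompt logic: `build_prompt` and the template are hypothetical, and token counts are approximated by whitespace-separated words.

```python
def build_prompt(question, retrieved_chunks, context_window=2048, max_new_tokens=256):
    """Pack retrieved chunks into a prompt while reserving space for the answer.

    Token counts are approximated by whitespace-separated words (an assumption;
    a real tokenizer would give different counts).
    """
    template = "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    # words used by the template and the question themselves
    overhead = len(template.format(context="", question=question).split())
    budget = context_window - max_new_tokens - overhead
    kept, used = [], 0
    for chunk in retrieved_chunks:
        cost = len(chunk.split()) + 1  # +1 for the "---" separator
        if used + cost > budget:
            break  # stop before the prompt would overflow the window
        kept.append(chunk)
        used += cost
    return template.format(context="\n---\n".join(kept), question=question)


prompt = build_prompt(
    "What is the document about?",
    ["GPT models are used in entertainment.", "They can write poems and riddles."],
)
print(prompt)
```

With a small window such as phi-2's, only a few chunks fit, so chunk size and the number of retrieved nodes directly limit how much context the model sees.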


In [None]:
response.response


'\n------------\nGPT models are being used to revolutionize the entertainment industry by providing endless entertainment. Since its evolution, GPT models have been adopted as an entertainer crosschecking their ability to produce content on funny and illogical questions. GPTs entertain people in many ways, and of course, using GPT itself as an entertainer, reducing the burden of overthinking by providing immediate feedback to queries in seconds. The results are amazing, and they have been utilized for many purposes today. Some of the impacts of GPTs on Entertainment applications are given below:\n• Solitude with GPT: As the GPT itself is an entertainer,\none can feel better alone with the GPT, which helps to\ncome out of loneliness by exploring its savors [147]. GPTs\nassist in providing soothing poems, mental healing\nquotes, and funny riddles. People with loneliness may feel\nanxiety, especially with older ones at home. In this case,\nGPT-4 helps people with its V oice Technology fea