In [5]:
from getpass import getpass
import os

# Prompt the user to enter their OpenAI API key securely
OPENAI_API_KEY = getpass("Enter your OpenAI API Key: ")

# Set the environment variable for the API key
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
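
When the key is already exported in the shell, the prompt can be skipped. A minimal sketch, assuming a hypothetical helper named `ensure_api_key`; the `prompt_fn` parameter is an illustrative addition so the helper can be exercised without an interactive terminal.

```python
import os
from getpass import getpass


def ensure_api_key(var_name="OPENAI_API_KEY", prompt_fn=getpass):
    """Return the key from the environment, prompting only when it is missing."""
    key = os.environ.get(var_name)
    if not key:
        # Nothing set yet: ask once and store it for the rest of the session.
        key = prompt_fn(f"Enter your {var_name}: ")
        os.environ[var_name] = key
    return key
```

Calling `ensure_api_key()` at the top of the notebook would then behave like the cell above on first run and stay silent on later runs.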


In [7]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("How do LLMs work on language processing?")
print(response)

 LLMs demonstrate a deep understanding of natural language text to extract relevant data for various tasks like translation or summarization. They also possess capabilities in generating human-like text when prompted with specific cues and exhibit contextual awareness by considering factors such as domain expertise, which enhances their effectiveness across diverse industries ranging from healthcare to education. Furthermore, LLMs excel at problem solving and decision making using the information found within passages in tasks like information retrieval or question answering systems.


## Chunking the data

In [8]:
from llama_index.core import Settings
from llama_index.core.node_parser import SentenceSplitter

# Global settings: applies to every index built afterwards
Settings.chunk_size = 512

# Local settings: applies to this index only
index = VectorStoreIndex.from_documents(
    documents, transformations=[SentenceSplitter(chunk_size=512)]
)
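
To make the effect of `chunk_size` concrete, here is a rough, dependency-free sketch of sentence-aware chunking. It is not the `SentenceSplitter` implementation: it counts whitespace-separated words rather than tokens, has no overlap, and the function name `split_into_chunks` is hypothetical.

```python
import re


def split_into_chunks(text, chunk_size=512):
    """Greedily pack whole sentences into chunks of at most chunk_size words.

    A single sentence longer than chunk_size is kept as its own oversized chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n_words > chunk_size:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = (
    "LLMs read text. They split it into pieces. "
    "Each piece is embedded. Queries retrieve pieces."
)
for chunk in split_into_chunks(sample, chunk_size=8):
    print(chunk)
```

Smaller chunks give more precise retrieval hits but less context per hit; larger chunks do the opposite.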

In [6]:
!pip install llama-index-vector-stores-chroma

Collecting llama-index-vector-stores-chroma
  Downloading llama_index_vector_stores_chroma-0.1.10-py3-none-any.whl.metadata (705 bytes)
Downloading llama_index_vector_stores_chroma-0.1.10-py3-none-any.whl (5.0 kB)
Installing collected packages: llama-index-vector-stores-chroma
Successfully installed llama-index-vector-stores-chroma-0.1.10


In [10]:
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext

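chroma_client = chromadb.PersistentClient()
# get_or_create_collection lets this cell be re-run without failing on an existing collection
chroma_collection = chroma_client.get_or_create_collection("llama_store_v1")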
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

In [11]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
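# Pass the storage context so the embeddings are written to the Chroma collection
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)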
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("What does LLM mean? Explain me widely and also its prons and cons.")
print(response)

 In this document's examination of language models related to security and privacy concerns, Large Language Models (LLM) are discussed as a significant advancement in the field of computational linguistics with substantial implications for various applications including cybersecurity-related tasks. An LLM is characterized by its extensive training on vast text corpora using advanced pre-training techniques which have contributed to notable progresses, particularly evident within Natural Language Processing (NLP) domains and beyond into areas like programming support vulnerability detection as well as medical text analysis among others where they play a multifaceted role.

Pronunciation: 
The acronym "LLM" stands for Large Language Models, which are typically pronounced [lɪm]. However, in the context of technology and computing terminology like this one's document on LLM impact studies concerning security-related areas, it is often just spoken as its abbreviation without a specific addi
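
`similarity_top_k=5` asks the retriever for the five chunks whose embeddings are closest to the query embedding. A toy sketch of that ranking step, assuming cosine similarity and hand-made 2-D vectors (real embeddings have hundreds of dimensions, and `top_k_chunks` is a hypothetical helper, not a LlamaIndex function):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunk_vecs, k=5):
    """Return the ids of the k chunks most similar to the query, best first."""
    scored = sorted(
        chunk_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in scored[:k]]


toy_vectors = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.9, 0.1],
    "chunk_c": [0.0, 1.0],
}
print(top_k_chunks([1.0, 0.0], toy_vectors, k=2))
```

A larger `k` hands the LLM more context to answer from, at the cost of a longer prompt.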

## With a local LLM

In [12]:
# Global settings (needs the llama-index-llms-ollama package and a running Ollama server)
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama

Settings.llm = Ollama(model="phi3:mini", request_timeout=60.0)

# Local settings
index.as_query_engine(llm=Ollama(model="phi3:mini", request_timeout=60.0))

<llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x7f7aea449850>

In [16]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("What is embedding in LLM?")
print(response)

 Embedding refers to a technique that enables converting text into numerical representations, often called embedds or vectors, which can capture semantic meanings of words and phrases within language models such as LLMs. These numeric vector encodings are then used for various natural language processing tasks by large language models (LLMs).
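
`response_mode="tree_summarize"` builds the answer bottom-up: the retrieved chunks are summarized in groups, and the resulting summaries are summarized again until a single text remains. A schematic sketch with a stand-in summarizer (`fake_summarize` only joins and truncates where a real engine would call the LLM, and the grouping here is by a fixed count, which is an assumption made for illustration):

```python
def fake_summarize(texts, max_chars=60):
    """Stand-in for an LLM call: join the inputs and truncate."""
    return " ".join(texts)[:max_chars]


def tree_summarize(chunks, summarize=fake_summarize, fan_in=2):
    """Repeatedly summarize groups of fan_in texts until one text is left."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    if not chunks:
        return ""
    level = list(chunks)
    while len(level) > 1:
        # Each pass replaces every group of fan_in texts with one summary.
        level = [
            summarize(level[i : i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


print(tree_summarize(["Embeddings map text to vectors.",
                      "Vectors capture meaning.",
                      "LLMs use them for retrieval."]))
```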


In [13]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True)  # stream the response back
response = query_engine.query("What does LLM mean? Explain me widely and also its prons and cons.")
response.print_response_stream()

 A Large Language Model (LLM) is a type of artificial intelligence that understands natural language by analyzing vast amounts of text data through pretraining tasks like masked language modeling and autoregressive prediction, thereby gaining comprehension about the contextualized semantics in human languages.

Pronunciation: Large Language Model (LLM)

Pros/Advantages - 
1. LLM has an extensive understanding of natural language due to its pretraining on large text data sets and tasks, which aids it greatly at comprehending contextualized semantics in human languages. It also exhibits the ability to generate high-quality, almost human-like text outputs after being fine-tuned further for specific purposes or domains - this is often referred as 'text generation' capability of LLMs
2. The capacity of these models extends into knowledge-intensive areas where they can maintain a contextual awareness and understanding similar to the depth humans possess on various topics, thereby making them
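
With `streaming=True` the response is consumed piece by piece instead of waiting for the full text. The same pattern can be sketched with a plain Python generator, where `fake_token_stream` is a hypothetical stand-in for the model and `print_stream` is an illustrative helper:

```python
import sys


def fake_token_stream(text):
    """Yield the text one word at a time, like tokens arriving from a model."""
    for word in text.split():
        yield word + " "


def print_stream(tokens, out=None):
    """Write each piece as soon as it arrives and return the full text."""
    out = out or sys.stdout
    pieces = []
    for token in tokens:
        out.write(token)
        out.flush()  # show the piece immediately instead of buffering
        pieces.append(token)
    out.write("\n")
    return "".join(pieces)


full_text = print_stream(
    fake_token_stream("Large language models generate text token by token")
)
```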

## Chatbot instead of Q&A

In [14]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
chat_engine = index.as_chat_engine()
response = chat_engine.chat("What did the author do growing up?")
print(response)

# The chat engine keeps the conversation history, so a follow-up can refer back
response = chat_engine.chat("Oh interesting, tell me more.")
print(response)

To learn what the author did growing up, it would be best to look into their personal writings like memoirs or autobiographies if available, read biographical studies written by scholars in literary fields, explore interviews where they might share details of their past experiences with fans and interviewers. Online resources such as dedicated fan pages for the author's works can also provide insights into personal histories that are relevant to understanding an artist or writer better.

If you have a specific name in mind, I could try using 'query_engine_tool' by inputting natural language questions about their biography and childhood experiences if such data is available within my knowledge base prior to the cutoff date of 2023-10. However, for detailed information not covered before that time or personal anecdotes shared afterward in interviews or social media posts up until now (beyond October 2023), I would suggest searching through credible biographies and reliable literary sourc
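
What separates a chat engine from a query engine is memory: each call to `chat` sees the earlier turns, which is why a follow-up like "tell me more" can work. A minimal sketch of that bookkeeping, with a hypothetical `SimpleChat` class and a stand-in reply function in place of a real LLM:

```python
class SimpleChat:
    """Keeps (role, text) history and passes it to a reply function each turn."""

    def __init__(self, reply_fn):
        self.reply_fn = reply_fn
        self.history = []

    def chat(self, message):
        self.history.append(("user", message))
        # The reply function receives the whole conversation, not just this message.
        answer = self.reply_fn(self.history)
        self.history.append(("assistant", answer))
        return answer

    def reset(self):
        """Forget the conversation so far."""
        self.history.clear()


def count_turns(history):
    """Stand-in for an LLM: report how many user turns it has seen."""
    user_turns = sum(1 for role, _ in history if role == "user")
    return f"This is user turn {user_turns}."


bot = SimpleChat(count_turns)
print(bot.chat("What did the author do growing up?"))
print(bot.chat("Oh interesting, tell me more."))
```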