## Customization
### Definition


LlamaIndex exposes configuration options for each stage of the RAG process, so developers can customize the whole pipeline. Common configuration scenarios include:

- Custom document chunk size
- Custom vector store (for example Chroma or Pinecone)
- Custom retrieval (most commonly, setting how many similar chunks are returned when matching a query)
- Specify the LLM (defaults to OpenAI, but is not limited to OpenAI)
- Specify the response mode (these notes only cover how to set the mode)
- Specify a streaming response (streaming gives users a better reading experience)

Note that customization is mainly done through the **ServiceContext class** provided by LlamaIndex, although not every option goes through `ServiceContext`.

### Samples
Basic example:


In [5]:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key

In [6]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

The author worked on writing and programming outside of school before college. They wrote short stories and tried writing programs on an IBM 1401 computer using an early version of Fortran. They later got a microcomputer and started programming on it, writing simple games and a word processor. They also expressed an interest in studying philosophy in college but eventually switched to AI.



The code samples below demonstrate LlamaIndex's support for various configuration scenarios.

##### “I want to parse my documents into smaller chunks”

In [None]:
from llama_index import ServiceContext
service_context = ServiceContext.from_defaults(chunk_size=500)

A well-chosen chunk size helps the model produce better embeddings, so that retrieval returns content more closely related to the user's query.
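
To make the effect of `chunk_size` concrete, here is a minimal sketch of fixed-size chunking. It is an illustration only and not LlamaIndex's splitter: it counts whitespace-separated words rather than tokens, and the `chunk_words` helper and its `overlap` parameter are hypothetical names introduced for this example.

```python
from typing import List


def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))  # ten toy "words": w0 .. w9
small = chunk_words(sample, chunk_size=4, overlap=1)
large = chunk_words(sample, chunk_size=8, overlap=1)
print(len(small), len(large))
```

Smaller chunks give more, narrower pieces to embed and match against; larger chunks give fewer pieces that each carry more context.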

##### “I want to use a different vector store”
_I have not fully understood this part yet._

In [None]:
import chromadb  # use Chroma as the vector store
from llama_index.vector_stores import ChromaVectorStore
from llama_index import StorageContext

chroma_client = chromadb.PersistentClient()
chroma_collection = chroma_client.create_collection("quickstart")  # create a Chroma collection named "quickstart"
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

##### “I want to retrieve more context when I query”

In [None]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=5)
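
As a rough picture of what `similarity_top_k` controls, the sketch below ranks a few toy chunk vectors by cosine similarity to a query vector and keeps the best `k`. It is a plain-Python illustration under stated assumptions, not LlamaIndex's retriever: the two-dimensional vectors, the chunk names, and the `cosine` and `top_k` helpers are all made up for this example.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: List[float], chunks: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Return the k chunks most similar to the query, best first."""
    scored = [(name, cosine(query, vec)) for name, vec in chunks.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy embeddings; real embeddings have hundreds or thousands of dimensions.
toy_chunks = {
    "chunk_a": [1.0, 0.0],
    "chunk_b": [0.8, 0.6],
    "chunk_c": [0.0, 1.0],
    "chunk_d": [-1.0, 0.0],
}
query_vec = [1.0, 0.1]

top2 = [name for name, _ in top_k(query_vec, toy_chunks, k=2)]
top3 = [name for name, _ in top_k(query_vec, toy_chunks, k=3)]
print(top2)
print(top3)
```

Raising `k` passes more chunks to the LLM as context, which can help recall at the cost of a longer prompt and possibly less relevant material.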

##### “I want to use a different LLM”
The default is OpenAI, but other LLMs can be used as well.

In [None]:
from llama_index import ServiceContext
from llama_index.llms import PaLM
service_context = ServiceContext.from_defaults(llm=PaLM())

##### “I want to use a different response mode”

In [2]:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key

In [3]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(response_mode='tree_summarize')
response = query_engine.query("What did the author do growing up?")
print(response)

The author worked on writing and programming outside of school before college. They wrote short stories and tried writing programs on an IBM 1401 computer using an early version of Fortran. They later got a microcomputer, a TRS-80, and started programming more extensively, writing simple games and a word processor. They initially planned to study philosophy in college but switched to AI. They also started publishing essays online and eventually wrote a book called "Hackers & Painters."


`tree_summarize`: summarizes the retrieved documents hierarchically, as a tree, and returns the top-level summary as the answer.

**tree_summarize:**\
The author worked on writing and programming outside of school before college. They wrote short stories and tried writing programs on an IBM 1401 computer using an early version of Fortran. They later got a microcomputer, a TRS-80, and started programming more extensively, writing simple games and a word processor. They initially planned to study philosophy in college but switched to AI. They also started publishing essays online and eventually wrote a book called "Hackers & Painters."

Compared with the default response mode (no `response_mode` set):

**default:**\
The author worked on writing and programming outside of school before college. They wrote short stories and tried writing programs on an IBM 1401 computer using an early version of Fortran. They later got a microcomputer and started programming on it, writing simple games and a word processor. They also expressed an interest in studying philosophy in college but eventually switched to AI.
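
The idea of summarizing as a tree can be sketched in a few lines. This is a simplified illustration, not LlamaIndex's implementation: the `tree_summarize` function here, its `fan_in` parameter, and the `bracket` stand-in (which replaces the LLM call and only shows how chunks get grouped) are hypothetical.

```python
from typing import Callable, List


def tree_summarize(chunks: List[str], summarize: Callable[[List[str]], str], fan_in: int = 2) -> str:
    """Merge chunks bottom-up: summarize groups of `fan_in`, then summarize the summaries."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    if not chunks:
        return ""
    level = list(chunks)
    # Each pass shrinks the level; a single input chunk is returned unchanged.
    while len(level) > 1:
        level = [summarize(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0]


def bracket(texts: List[str]) -> str:
    """Stand-in for an LLM summary call; it just records which texts were merged."""
    return "(" + " + ".join(texts) + ")"


result = tree_summarize(["A", "B", "C", "D", "E"], bracket, fan_in=2)
print(result)
```

The nesting in the printed string shows the tree: leaf chunks are merged in pairs, and the resulting summaries are merged again until one root remains.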

##### “I want to stream the response back”
Streaming the response is a more user-friendly way to return the answer.

In [4]:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key

In [5]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("What did the author do growing up?")
response.print_response_stream()

The author worked on writing and programming outside of school before college. They wrote short stories and tried writing programs on an IBM 1401 computer using an early version of Fortran. They later got a microcomputer and started programming on it, writing simple games and a word processor. They also mentioned their interest in philosophy and AI.
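
To show why streaming improves the reading experience, the sketch below consumes a token generator and writes each piece as soon as it arrives, rather than waiting for the full answer. It assumes nothing about LlamaIndex internals: `fake_token_stream` is a hypothetical stand-in for a streaming LLM, and `print_stream` is an illustrative helper.

```python
import io
from typing import Iterable, Iterator, TextIO


def fake_token_stream(text: str) -> Iterator[str]:
    """Stand-in for a streaming LLM: yield the answer one word at a time."""
    words = text.split(" ")
    for i, word in enumerate(words):
        # Re-attach the separating space to every word except the last.
        yield word if i == len(words) - 1 else word + " "


def print_stream(tokens: Iterable[str], out: TextIO) -> str:
    """Write each token as it arrives and return the assembled text."""
    parts = []
    for token in tokens:
        out.write(token)
        out.flush()  # show partial output immediately
        parts.append(token)
    return "".join(parts)


buffer = io.StringIO()  # use sys.stdout instead to watch the text appear
full = print_stream(fake_token_stream("The author wrote short stories."), buffer)
print(full)
```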

##### “I want a chatbot instead of Q&A”

In [6]:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key

In [7]:
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_chat_engine()
response = query_engine.chat("What did the author do growing up?")
print(response)

I'm sorry, but I don't have access to personal information about the author.


With a chat engine, the conversation continues across turns:

In [8]:
response = query_engine.chat("Oh interesting, tell me more.")
print(response)

I apologize for the confusion. As an AI language model, I don't have real-time access to personal information about individuals unless it has been shared with me in the course of our conversation. I can provide general information and answer questions based on my training, but I don't have specific details about the author's personal life or experiences. Is there anything else I can help you with?
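
What separates a chat engine from a one-shot query engine is that earlier turns are kept and passed along with each new message. The sketch below shows only that bookkeeping. It is a toy, not LlamaIndex's chat engine: `ToyChatSession` and the `echo_turn_count` stand-in for the model are hypothetical names.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text)


class ToyChatSession:
    """Keeps the running conversation and hands it to the model on every turn."""

    def __init__(self, respond: Callable[[List[Message]], str]) -> None:
        self._respond = respond
        self.history: List[Message] = []

    def chat(self, user_text: str) -> str:
        self.history.append(("user", user_text))
        reply = self._respond(self.history)  # the model sees all previous turns
        self.history.append(("assistant", reply))
        return reply


def echo_turn_count(history: List[Message]) -> str:
    """Stand-in for an LLM: report how many user turns it has seen so far."""
    user_turns = sum(1 for role, _ in history if role == "user")
    return f"reply to user turn {user_turns}"


session = ToyChatSession(echo_turn_count)
first = session.chat("What did the author do growing up?")
second = session.chat("Oh interesting, tell me more.")
print(first)
print(second)
```

Because the second call receives the first exchange as well, a follow-up such as "tell me more" can be interpreted in context.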


### Specific example

In [9]:
%pip install chromadb


In [16]:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # placeholder; never commit a real key

import chromadb  # Chroma stores the vectors
from llama_index import VectorStoreIndex, SimpleDirectoryReader  # VectorStoreIndex builds the vector index; SimpleDirectoryReader loads documents from a folder
from llama_index import ServiceContext  # configures the chunk size, the LLM, and so on
from llama_index.vector_stores import ChromaVectorStore
from llama_index import StorageContext  # vector storage configuration
from llama_index.llms import OpenAI  # selects the LLM to use


In [17]:
# Part 1: custom chunking
service_context = ServiceContext.from_defaults(chunk_size=500, llm=OpenAI())

In [18]:
# Custom vector storage via StorageContext
chroma_client = chromadb.PersistentClient()
chroma_collection_recent = chroma_client.create_collection("quickstart")
vector_store_recent = ChromaVectorStore(chroma_collection=chroma_collection_recent)
storage_context = StorageContext.from_defaults(vector_store=vector_store_recent)

In [19]:
# Index the documents
documents = SimpleDirectoryReader('data').load_data()
index = VectorStoreIndex.from_documents(documents, service_context=service_context, storage_context=storage_context)
# Pass in both the custom chunking (service_context) and the vector store (storage_context)

In [23]:
# Run the query
query_engine = index.as_query_engine(response_mode="tree_summarize", streaming=True)
response = query_engine.query("What did the author do?")
response.print_response_stream()

The author published essays online and wrote a book on Lisp. Additionally, the author worked as a painter and became a studio assistant for a painter named Idelle Weber.