# General LLM App Patterns

```python
chain = {"element": lambda x: x} | prompt | model | StrOutputParser()
chain.invoke({'input1': input1})
```

- Prompt
    - `template = ... {schema} ... {question} ... {query} ...`
        - Define a prompt with changeable inputs denoted by `{}`
    - ```python
        prompt_response = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "Given an input question and SQL response, convert it to a natural language answer. No pre-amble.",
                ),
                ("human", template),
            ]
        )
        ```
        - Define a `ChatPromptTemplate` that takes in the defined template string
        - You can either assign the variables in `{}` within the chain (e.g. `RunnablePassthrough.assign(schema=get_schema)`) or pass them in when you invoke the chain
    - If you need a temporarily nullable object in your chain, use `MessagesPlaceholder(variable_name="history", optional=True)`. With `optional=True`, an empty list is used for `history` if it does not exist yet

    - For long-running contexts, initialise a `ConversationBufferMemory()` to store messages
        - There are many memory types. If memory needs to be shared between chains without being modified, consider `ReadOnlySharedMemory`

    - Choose the most appropriate chain
        - For example: `SQLDatabaseChain` for natural language to SQL
        - For example: `SmartLLMChain` for hard questions

    - Customise chain from `Chain` base class
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/sales_agent_with_context.ipynb

    - Customise prompt using base class `StringPromptTemplate`
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/sales_agent_with_context.ipynb

    - Customise parser using base class `AgentOutputParser`
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/sales_agent_with_context.ipynb
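
The two ways of filling `{}` variables mentioned above (assigning inside the chain vs. passing at invocation) can be sketched in plain Python. This is only an analogue of the idea, not LangChain code: `assign`, `render`, and the sample schema string are hypothetical stand-ins.

```python
template = "Schema: {schema}\nQuestion: {question}"


def assign(**fns):
    """Return a step that adds computed keys to the input dict (hypothetical helper)."""
    def step(inputs):
        out = dict(inputs)
        for key, fn in fns.items():
            out[key] = fn(inputs)
        return out
    return step


def get_schema(_):
    # Stand-in for a real database lookup
    return "CREATE TABLE users (id INTEGER, name TEXT)"


def render(inputs):
    # Fill every {variable} in the template from the input dict
    return template.format(**inputs)


# "question" is supplied at invocation, "schema" is assigned inside the pipeline
filled = render(assign(schema=get_schema)({"question": "How many users?"}))
print(filled)
```

The caller only has to know about `question`; everything else is resolved inside the pipeline.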

- Index to Database
    - To store texts, it is often necessary to chunk them first
        ```python
            text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
                chunk_size=4000, chunk_overlap=0
            )
            texts_4k_token = text_splitter.split_text(joined_texts)
        ``` 
    
    - Store chunked texts
        ```python
            # Create chroma
            vectorstore = Chroma(
                collection_name="mm_rag_clip_photos", embedding_function=OpenCLIPEmbeddings()
            )
            # Add images
            vectorstore.add_images(uris=image_uris)

            # Add documents
            vectorstore.add_texts(texts=texts)

            # Make retriever
            retriever = vectorstore.as_retriever()

            # Get docs
            docs = retriever.invoke("Woman with children", k=10)
        ```

    - You can index via the usual text embedding, but there are also multimodal embeddings for non-text data (e.g. `langchain_experimental.open_clip.OpenCLIPEmbeddings`)

    - For cases where you have non-text data (e.g. PDFs), you can try the `unstructured` package to extract content first (e.g. `from unstructured.partition.pdf import partition_pdf`)

    - When there are many documents, retrieval can benefit from recursive clustering and indexing (i.e. you keep clustering the documents, getting GPT to summarise each cluster, and adding the summary doc to the DB)
        - This is known as RAPTOR
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/RAPTOR.ipynb

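
The recursive cluster-and-summarise loop described for RAPTOR can be sketched roughly as follows. Clustering and summarising are injected as functions; `pair_cluster` and `join_summary` are hypothetical stand-ins for embedding-based clustering and an LLM summary call, and the cookbook's actual implementation differs.

```python
def build_summary_levels(texts, cluster, summarise, max_levels=3):
    """Return everything to index: the leaf texts plus one summary per cluster per level."""
    all_docs = list(texts)
    current = list(texts)
    for _ in range(max_levels):
        if len(current) <= 1:
            break  # nothing left to cluster
        groups = cluster(current)
        current = [summarise(group) for group in groups]
        all_docs.extend(current)
    return all_docs


def pair_cluster(items):
    # Hypothetical stand-in: group neighbours in pairs instead of clustering embeddings
    return [items[i:i + 2] for i in range(0, len(items), 2)]


def join_summary(group):
    # Hypothetical stand-in for an LLM summarisation call
    return "summary(" + " + ".join(group) + ")"


docs = build_summary_levels(["a", "b", "c", "d"], pair_cluster, join_summary)
print(docs)
```

All of `docs` (leaves and summaries) would then go into the vector store, so a query can match either a detailed chunk or a higher-level summary.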


- Retrieve information 
    - Make retriever/loader
        - https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/

    - Retrieve from regular SQL database
        - `def get_schema(_): return db.get_table_info()`
            - To get database info (the unused argument is the chain input that `RunnablePassthrough.assign` passes in)
        - `RunnablePassthrough.assign(schema=get_schema) | prompt`
            - To pass schema data into the prompt

        - When retrieving from a database, it may also be useful to describe the metadata fields with `AttributeInfo` (`from langchain.chains.query_constructor.base import AttributeInfo`)
            - By doing this, you let the LLM decide which fields to filter on when it builds the query
            - This is also known as "self-query"


    - Retrieve from regular docsearch database
        - ```python
            retrieval_qa_chat_prompt = hub.pull("langchain-ai/retrieval-qa-chat")
            llm = ChatOpenAI()
            retriever = ...
            combine_docs_chain = create_stuff_documents_chain(
                llm, retrieval_qa_chat_prompt
            )
            retrieval_chain = create_retrieval_chain(retriever, combine_docs_chain)
            ```

    - Retrieval can be multivector (i.e. multiple types of documents)
        - ```python
            retriever = MultiVectorRetriever(
                vectorstore=vectorstore,
                docstore=store,
                id_key=id_key,
            )

            # The cookbook wraps this setup in a helper: create_multi_vector_retriever(...)
            ```
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/Multi_modal_RAG.ipynb

    - You can customise retrievers if you want, inherit from `BaseRetriever`
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/forward_looking_retrieval_augmented_generation.ipynb

    - HyDE (Hypothetical Document Embeddings)
        - Instead of embedding the query directly, have the LLM generate one or more "synthetic" documents that answer it, then average their embeddings and use the result for retrieval
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/hypothetical_document_embeddings.ipynb

    - Embedding quantization
        - You can quantize the embeddings to reduce the memory footprint
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/rag_with_quantized_embeddings.ipynb

    - Use models to generate embeddings `from langchain_community.embeddings import QuantizedBiEncoderEmbeddings`
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/rag_with_quantized_embeddings.ipynb

    - Incorporating a `rewriter` chain into a regular chain can be useful. Think of it as query enhancement as part of your flow
        - ```python
            rewrite_retrieve_read_chain = (
                {
                    "context": {"x": RunnablePassthrough()} | rewriter | retriever,
                    "question": RunnablePassthrough(),
                }
                | prompt
                | model
                | StrOutputParser()
            )
            ```
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/rewrite.ipynb

    - Sometimes, when the answer is scattered across multiple retrieved documents, the LLM will get confused
        - It helps to first rephrase the question into a more general one (i.e. take a step back) before answering the original question
        - Use "step back prompting" --> https://github.com/langchain-ai/langchain/blob/master/cookbook/stepback-qa.ipynb
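
The HyDE idea from the list above (embed generated answers instead of the raw query, then average) can be sketched in plain Python. `fake_generate` and `fake_embed` are hypothetical stubs for an LLM call and an embedding model; the averaging step is the part being illustrated.

```python
def average_embedding(vectors):
    """Element-wise mean of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def hyde_query_embedding(question, generate_docs, embed):
    # 1. Ask the LLM for hypothetical documents that would answer the question
    hypothetical_docs = generate_docs(question)
    # 2. Embed each one and average, then search the vector store with the result
    return average_embedding([embed(doc) for doc in hypothetical_docs])


def fake_generate(question):
    # Hypothetical stub for an LLM generating two candidate answers
    return ["cats purr", "cats nap"]


def fake_embed(text):
    # Toy 2-d "embedding": [character count, word count]
    return [float(len(text)), float(len(text.split()))]


query_vector = hyde_query_embedding("What do cats do?", fake_generate, fake_embed)
print(query_vector)
```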

- Add tools for agent to use
    - ```python
        tools = [
            Tool(
                name="State of Union QA System",
                func=state_of_union.run,
                description="useful for when you need to answer questions about the most recent state of the union address. Input should be a fully formed question.",
            ),
            Tool(
                name="Ruff QA System",
                func=ruff.run,
                description="useful for when you need to answer questions about ruff (a python linter). Input should be a fully formed question.",
            ),
        ]

        agent = initialize_agent(
            tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
        )
        ```
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/agent_vectorstore.ipynb

    - Tool for searching the web
        - `search = SerpAPIWrapper()`

    - Run another chain as a tool
        - ```python
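            todo_prompt = PromptTemplate.from_template("You are a planner who is an expert at coming up with a todo list for a given objective. Come up with a todo list for this objective: {objective}")
            todo_chain = LLMChain(llm=OpenAI(temperature=0), prompt=todo_prompt)
            # Expose the chain to the agent by passing its `run` method as the tool's func
            todo_tool = Tool(name="TODO", func=todo_chain.run, description="useful for when you need to come up with todo lists")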
            ```

    - You can turn anything into a tool via the `func` parameter, as long as what you pass is a callable
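
To illustrate the "any callable can be a tool" point, here is a plain-Python sketch of the name / func / description pattern. `SimpleTool`, `registry`, and `run_tool` are hypothetical names for this note only; they are not LangChain's `Tool` class or agent loop.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimpleTool:
    name: str
    func: Callable[[str], str]  # any callable taking and returning a string
    description: str            # what the LLM reads when deciding which tool to use


def word_count(text: str) -> str:
    return str(len(text.split()))


tools = [
    SimpleTool("Word Counter", word_count, "useful for counting words in a sentence"),
    SimpleTool("Shouter", str.upper, "useful for upper-casing text"),
]
registry = {tool.name: tool for tool in tools}


def run_tool(name: str, tool_input: str) -> str:
    # An agent would pick `name` based on the descriptions; here it is passed in directly
    return registry[name].func(tool_input)


print(run_tool("Word Counter", "a b c"))
print(run_tool("Shouter", "hi"))
```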

- Add info
    - `... | llm.bind(stop=["\nSQLResult:"]) | `
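        - Bind runtime arguments to the LLM call. Here it is a stop sequence, so generation halts before the model makes up its own `SQLResult`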
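
A rough sketch of what a stop sequence achieves, assuming the effect is equivalent to cutting the text at the first stop string. In practice the provider stops generating rather than truncating afterwards; `apply_stop` is a hypothetical helper.

```python
def apply_stop(text, stop):
    """Cut `text` at the earliest occurrence of any stop string."""
    cut = len(text)
    for marker in stop:
        index = text.find(marker)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


raw = "SELECT COUNT(*) FROM users;\nSQLResult: 42\nAnswer: 42"
print(apply_stop(raw, ["\nSQLResult:"]))
```

Only the SQL query survives, which is what you want when the real result should come from the database.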


- Basic Chain / Add decision nodes
    - See all chain types
        - https://python.langchain.com/v0.1/docs/modules/chains/
    - `PALChain` for generating code as an intermediate reasoning step (program-aided language models)
        - Does not give you a REPL to run the code, only generates

- Models
    - Download from `huggingface_hub` if needed
        ```python
            from huggingface_hub import hf_hub_download
            hf_hub_download("TheBloke/Llama-2-7b-Chat-GGUF", model_name, local_dir="state")
        ```

        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/apache_kafka_message_handling.ipynb


- Output parsers
    - ` ... | StrOutputParser()`
        - Extract the model's reply as a plain string

    - Force output to pydantic format
        ```python
            class code(BaseModel):
                """Code output"""

                prefix: str = Field(description="Description of the problem and approach")
                imports: str = Field(description="Code block import statements")
                code: str = Field(description="Code block not including import statements")

            # LLM
            llm = ChatAnthropic(
                model="claude-3-opus-20240229",
                default_headers={"anthropic-beta": "tools-2024-04-04"},
            )

            # Structured output, including raw will capture raw output and parser errors
            structured_llm = llm.with_structured_output(code, include_raw=True)
            code_output = structured_llm.invoke(
                "Write a python program that prints the string 'hello world' and tell me how it works in a sentence"
            )
        ```
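
The idea behind structured output with `include_raw=True` (keep the raw reply and capture parser errors instead of raising) can be sketched with the standard library only. The dataclass stands in for the pydantic model, and the `raw` / `parsed` / `parsing_error` keys are an assumption modelled on the comment in the snippet above, not a guaranteed API contract.

```python
import json
from dataclasses import dataclass


@dataclass
class Code:
    prefix: str
    imports: str
    code: str


def parse_with_raw(raw: str) -> dict:
    """Try to parse the model's reply; keep the raw text and any error."""
    try:
        parsed = Code(**json.loads(raw))
        return {"raw": raw, "parsed": parsed, "parsing_error": None}
    except (json.JSONDecodeError, TypeError) as exc:
        # Invalid JSON, or JSON whose fields do not match the schema
        return {"raw": raw, "parsed": None, "parsing_error": exc}


good = parse_with_raw('{"prefix": "Prints a greeting", "imports": "", "code": "print(1)"}')
bad = parse_with_raw("Sure! Here is the code you asked for")
print(good["parsed"])
print(type(bad["parsing_error"]).__name__)
```

Downstream steps can then branch on `parsing_error`, for example to retry with the error message added to the prompt.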

- Invocation
    - You can run a chain over multiple inputs at once using `.batch()`
        - `chain.batch(texts, {"max_concurrency": 5})`
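
A minimal sketch of what a concurrency cap like `max_concurrency` amounts to, assuming a thread pool is an acceptable model for it. `batch` and `fake_chain` are hypothetical; this is not how `.batch()` is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor


def batch(fn, inputs, max_concurrency=5):
    """Run `fn` over all inputs with at most `max_concurrency` calls in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map returns results in input order, regardless of completion order
        return list(pool.map(fn, inputs))


def fake_chain(text):
    # Hypothetical stand-in for chain.invoke
    return text.upper()


results = batch(fake_chain, ["a", "b", "c"], max_concurrency=2)
print(results)
```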
    


- Evaluation
    
    - See https://github.com/langchain-ai/langchain/blob/master/cookbook/advanced_rag_eval.ipynb

- Extraction
    - OpenAI supports extraction with results cast to pydantic types
        - You can coerce to pydantic type using `PydanticToolsParser` if you don't want to use the prebuilt chain
            - `chain = prompt | model | PydanticToolsParser(tools=pydantic_schemas)`
        - ```python
            # Make sure to use a recent model that supports tools
            model = ChatOpenAI(model="gpt-3.5-turbo-1106")

            # Pydantic is an easy way to define a schema
            class Person(BaseModel):
                """Information about people to extract."""

                name: str
                age: Optional[int] = None

            chain = create_extraction_chain_pydantic(Person, model)
            ```
        - See https://github.com/langchain-ai/langchain/blob/master/cookbook/extraction_openai_tools.ipynb

- RLHF
    - `HumanApprovalCallbackHandler`
        - https://github.com/langchain-ai/langchain/blob/master/cookbook/human_approval.ipynb
    - Another way to handle feedback to an LLM is meta prompting. Given a prompt and a response, use `input()` to let a human give feedback, then have the LLM turn that feedback into a "critique" of its own instructions. The critique is used in subsequent loops so the LLM generates better answers
        - https://github.com/langchain-ai/langchain/blob/master/cookbook/meta_prompt.ipynb
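
The meta-prompting loop described above can be sketched as follows. The LLM calls and the human `input()` are replaced by injected functions so the loop runs unattended; `generate`, `critique`, and the canned feedback are hypothetical stubs, and the cookbook's version differs in detail.

```python
def meta_prompt_loop(task, generate, critique, get_feedback, max_rounds=3):
    """Answer, collect feedback, and rewrite the instructions until feedback is 'ok'."""
    instructions = "Be helpful."
    history = []
    for _ in range(max_rounds):
        answer = generate(instructions, task)
        feedback = get_feedback(answer)  # would be input() in an interactive session
        history.append((instructions, answer, feedback))
        if feedback == "ok":
            break
        # The critique becomes the revised instructions for the next round
        instructions = critique(instructions, answer, feedback)
    return instructions, history


def generate(instructions, task):
    # Hypothetical stub for the answering LLM call
    return f"[{instructions}] {task}"


def critique(instructions, answer, feedback):
    # Hypothetical stub for the self-critique LLM call
    return instructions + " " + feedback


canned_feedback = iter(["Be concise.", "ok"])
final_instructions, history = meta_prompt_loop(
    "Summarise the report", generate, critique, lambda answer: next(canned_feedback)
)
print(final_instructions)
```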
    

- LangGraph
    - Define edges, nodes, tools
    - Define tools using the `@tool` decorator
        - https://github.com/langchain-ai/langchain/blob/master/cookbook/tool_call_messages.ipynb

    - You can use tools with an agent or with a regular chat model
        - Tools can be called via a custom node function
            ```python
                from langchain_core.messages import ToolMessage
                from langgraph.prebuilt import ToolExecutor, ToolInvocation

                search_tool = TavilySearchResults(max_results=1, args_schema=SearchTool)
                tools = [search_tool]
                tool_executor = ToolExecutor(tools)

                def call_tool(state):
                    # The last message is the model reply that requested a tool call
                    messages = state["messages"]
                    last_message = messages[-1]
                    tool_call = last_message.tool_calls[0]
                    action = ToolInvocation(
                        tool=tool_call["name"],
                        tool_input=tool_call["args"],
                    )
                    response = tool_executor.invoke(action)
                    # Wrap the tool output so the model can match it to its request
                    function_message = ToolMessage(
                        content=str(response), name=action.tool, tool_call_id=tool_call["id"]
                    )
                    return {"messages": [function_message]}

                workflow.add_node("action", call_tool)
            ```
        - Tools can also be run directly with the prebuilt `ToolNode` (`from langgraph.prebuilt import ToolNode`)
            ```python
                tool_node = ToolNode(tools)
                workflow.add_node("action", tool_node)
            ```
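
To make the nodes / edges / tools picture concrete, here is a plain-Python sketch of the agent-to-action loop. It does not use LangGraph; the `agent` stub, the `TOOLS` table, and the dict-based messages are hypothetical, and only the control flow (a conditional edge after `agent`, a fixed edge from `action` back to `agent`) mirrors the pattern.

```python
TOOLS = {"add": lambda a, b: a + b}


def agent(state):
    # Hypothetical model stub: request a tool once, then answer from the tool output
    if not any(m["role"] == "tool" for m in state["messages"]):
        call = {"name": "add", "args": {"a": 1, "b": 2}}
        return {"messages": [{"role": "ai", "tool_call": call}]}
    result = state["messages"][-1]["content"]
    return {"messages": [{"role": "ai", "content": f"The answer is {result}"}]}


def action(state):
    # Execute the tool requested in the last message
    call = state["messages"][-1]["tool_call"]
    output = TOOLS[call["name"]](**call["args"])
    return {"messages": [{"role": "tool", "content": str(output)}]}


def should_continue(state):
    # Conditional edge: go to "action" only if the model asked for a tool
    return "action" if "tool_call" in state["messages"][-1] else "end"


def run(state):
    nodes = {"agent": agent, "action": action}
    node = "agent"
    while node != "end":
        update = nodes[node](state)
        # Node outputs are appended to the shared message list
        state = {"messages": state["messages"] + update["messages"]}
        node = should_continue(state) if node == "agent" else "agent"
    return state


final = run({"messages": [{"role": "human", "content": "What is 1 + 2?"}]})
print(final["messages"][-1]["content"])
```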