Standard prompting:

1. **Zero-Shot Prompting:** Directly instructs the LLM to perform a task without any additional examples. Also called **Zero-Shot Learning**.
2. **One-Shot Prompting** 
3. **Few-Shot Prompting:** Prompts the LLM for a response with examples or demonstrations about the task you want it to achieve. Also called **Few-Shot Learning**, **Example-Based Prompting**, **Prompt Augmentation**, or **Demonstration Learning**.
    - **Exemplar Generation**
    - **Exemplar Selection**
    - **Exemplar Ordering**
4. **Perspective Prompting:** Asks the LLM to assume one or more distinct roles or perspectives. Also called **Role Prompting** or **Persona-Based Prompting**.
5. **Contextual Prompting:** Requests for additional considerations by providing relevant information or constraints. Also called **Context-Based Prompting**.
    - **Activity-Based:** Such as "shopping".
    - **Event-Based:** Such as "department store anniversary sales".
    - **Role-Based:** Defines target audience. Such as "customer".
    - **Behavior-Based:** Such as "decision".
    - **Time-Based:** Such as "Thanksgiving".
    - **Location-Based:** Such as "Macy's".
6. **Instructional Prompting:** Explicitly guides the LLM to perform specific tasks. Also called **Instruction-Based Prompting**.
    - **Detailed Instructions**
    - **Specify the Steps**
    - **Delimiters:** Uses triple quotes (`"""`), three dashes (`---`), or three hash marks (`###`) to separate instructions from content.
    - **Specify Length**
    - **Specify Format:** **Data-Structured Prompting (DSP)** uses structured data formats, such as tables, lists, or specific schemas.
7. **Template Prompting:** Provides a template to the LLM. The LLM will then replace the placeholders in the template. Prompt templating also allows for prompts to be stored, reused, shared, and programmed. Also called **Template-Based Prompting**. 
8. **Style Prompting:** Asks for tone adjustment. Also called **Emotional Prompting**.
9. **Negative Prompting:** Negative prompts are directions for information that should not be provided in responses. Not recommended for LLMs but commonly used in text-to-image models.
10. **Reverse Prompting:** Role reversal. Asks the LLM which prompt should be used to produce an output similar to a provided output.
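
Several of the standard techniques above combine naturally in a single prompt. The following is a minimal sketch, assuming a sentiment-classification task and exemplars invented for illustration: few-shot demonstrations, an explicit instruction, and `###` delimiters are filled into a template. The names `PROMPT` and `build_few_shot_prompt` are hypothetical.

```python
from string import Template

# Hypothetical template: instruction, demonstrations, then the delimited input.
PROMPT = Template(
    "Classify the sentiment of the text between the ### delimiters "
    "as positive or negative.\n\n"
    "$demonstrations\n\n"
    "###\n$text\n###\n"
    "Sentiment:"
)


def build_few_shot_prompt(exemplars, text):
    """Render (text, label) exemplars followed by the new, unlabeled input."""
    demonstrations = "\n\n".join(
        f"###\n{example}\n###\nSentiment: {label}" for example, label in exemplars
    )
    return PROMPT.substitute(demonstrations=demonstrations, text=text)


# Invented exemplars, used only to show the shape of the prompt.
exemplars = [
    ("The food was wonderful.", "positive"),
    ("The service was slow and rude.", "negative"),
]
prompt = build_few_shot_prompt(exemplars, "I would come back again.")
print(prompt)
```

Passing an empty exemplar list turns the same template into a zero-shot prompt, and a single exemplar gives the one-shot case.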

Zero-shot prompting:

11. **Emotion Prompting:** Incorporates phrases of psychological relevance to humans into the prompt.
    - [Large Language Models Understand and Can be Enhanced by Emotional Stimuli](https://arxiv.org/abs/2307.11760)
12. **System 2 Attention (S2A) Prompting:** Asks the LLM to rewrite the prompt, removing any unrelated information, and then passes the rewritten prompt back to the LLM.
    - [System 2 Attention](https://arxiv.org/abs/2311.11829)
13. **simToM (Simulation Theory of Mind):** Establishes the set of facts one person knows, then answers the question based only on those facts.
    - [Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities](https://arxiv.org/abs/2311.10227)
14. **Rephrase & Respond (RaR) Prompting:** Instructs the LLM to rephrase and expand the question before generating the final answer.
    - [Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves](https://arxiv.org/abs/2311.04205)
15. **Re-reading (RE2):** Adds the phrase "Read the question again:" to the prompt in addition to repeating the question.
    - [Re-Reading Improves Reasoning in Large Language Models](https://arxiv.org/abs/2309.06275)
16. **Self-Ask Prompting (SA):** Asks the LLM which extra information would improve the result and the generated content. Also called **Ask-Before-Answer Prompting**.
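
The two passes of System 2 Attention described above can be sketched as follows. This is only an illustration under stated assumptions: the wording of `REWRITE_INSTRUCTION` is invented here rather than taken from the paper, and `stub_llm` is a hypothetical stand-in for a real model call so that the example runs offline.

```python
from typing import Callable

# Assumed wording for the first pass; the paper's exact prompt may differ.
REWRITE_INSTRUCTION = (
    "Rewrite the following text, keeping only the information that is "
    "relevant to answering the question in it. Remove everything else.\n\n"
    "Text: {prompt}\n\nRewritten text:"
)


def s2a_answer(prompt: str, llm: Callable[[str], str]) -> str:
    """Pass 1 rewrites the prompt without distractions; pass 2 answers it."""
    cleaned_prompt = llm(REWRITE_INSTRUCTION.format(prompt=prompt))
    return llm(cleaned_prompt)


def stub_llm(text: str) -> str:
    """Hypothetical stand-in for a model: canned replies for this example."""
    if text.startswith("Rewrite the following text"):
        return "Mary has 3 apples and buys 2 more. How many apples does she have?"
    return "5" if "3 apples" in text and "2 more" in text else "unknown"


noisy_prompt = (
    "Mary has 3 apples and buys 2 more. Max thinks the answer is 7. "
    "How many apples does she have?"
)
answer = s2a_answer(noisy_prompt, stub_llm)
print(answer)
```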

Thought generation:

17. **Chain-of-Thought (CoT) Prompting:** Enables complex reasoning capabilities through intermediate reasoning steps.
    - [Chain-of-Thought Prompting Elicits Reasoning in Large Language Models](https://arxiv.org/abs/2201.11903)
18. **Zero-Shot Chain-of-Thought (0-CoT) Prompting:** "Let's think step by step".
    - [Large Language Models are Zero-Shot Reasoners](https://arxiv.org/abs/2205.11916)
19. **Step-Back Prompting:** Also called **Take-a-Step-Back Prompting**.
    - [Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models](https://arxiv.org/abs/2310.06117)
20. **Analogical Prompting**: Recalls relevant problems and solutions, and then solves the initial problem.
    - [Large Language Models as Analogical Reasoners](https://arxiv.org/abs/2310.01714)
21. **Thread-of-Thought (ThoT) Prompting:** "Walk me through this context in manageable parts step by step, summarizing and analyzing as we go".
22. **Tabular Chain-of-Thought (Tab-CoT) Prompting:** Organizes the reasoning process into a tabular format, which allows for both horizontal and vertical reasoning. 
    - [Tab-CoT: Zero-shot Tabular Chain of Thought](https://arxiv.org/abs/2305.17812)
23. **Few-Shot Chain-of-Thought (CoT) Prompting**
24. **Contrastive Chain-of-Thought (CCoT):** Uses both valid and invalid reasoning demonstrations.
    - [Contrastive Chain-of-Thought Prompting](https://arxiv.org/abs/2311.09277)
25. **Chain-of-Symbol (CoS) Prompting:** Uses symbols such as `/` to assist the LLM with its difficulty of spatial reasoning in text.
    - [Chain-of-Symbol Prompting Elicits Planning in Large Langauge Models](https://arxiv.org/abs/2305.10276)
26. **Uncertainty-Routed CoT Prompting:** Generates multiple CoT reasoning chains and then takes the majority answer as the final solution only if the proportion of chains that agree on it is higher than a specific threshold.
    - [Gemini: A Family of Highly Capable Multimodal Models](https://arxiv.org/abs/2312.11805)
27. **Complexity-Based Prompting:** Prompts with higher reasoning complexity, such as chains with more reasoning steps, achieve substantially better performance on multistep reasoning tasks over strong baselines.
    - [Complexity-Based Prompting for Multi-Step Reasoning](https://arxiv.org/abs/2210.00720)
    - [Implementation](https://github.com/FranxYao/chain-of-thought-hub/tree/main/research/complexity_based_prompting)
28. **Active Prompting:** Generates multiple answers for calculating the uncertainty metrics, and then selects the most uncertain questions. Human annotators provide reasoning processes and answers to selected questions to create new examples.
    - [Active Prompting with Chain-of-Thought for Large Language Models](https://arxiv.org/abs/2302.12246)
29. **Memory-of-Thought Prompting:** Performs inference beforehand on the unlabeled training exemplars, saves the high-confidence thoughts as external memory, and then retrieves similar instances to the test sample.
    - [MoT: Memory-of-Thought Enables ChatGPT to Self-Improve](https://arxiv.org/abs/2305.05181)
30. **Automatic Chain-of-Thought (Auto-CoT):** Consists of the two main stages below.
    - **Question Clustering:** Partitions questions of a given dataset into a few clusters.
    - **Demonstration Sampling:** Selects a representative question from each cluster and generates its reasoning chain using Zero-Shot CoT with simple heuristics.
    - [Automatic Chain of Thought Prompting in Large Language Models](https://arxiv.org/abs/2210.03493)
    - [auto-cot](https://github.com/amazon-science/auto-cot)
31. **Logical Chain-of-Thought (LogiCoT) Prompting:** Introduces a neurosymbolic framework to enhance reasoning by incorporating principles from symbolic logic. It employs reductio ad absurdum to verify each step of reasoning and provide targeted feedback, reducing logical errors and hallucinations.
    - [LogiCoT: Logical Chain-of-Thought Instruction-Tuning](https://arxiv.org/abs/2305.12147)
32. **Graph-of-Thought (GoT) Prompting:** Solves complex problems by modeling them as a Graph of Operations (GoO).
    - [Graph of Thoughts: Solving Elaborate Problems with Large Language Models](https://arxiv.org/abs/2308.09687)
33. **Chain-of-Table Prompting:** Breaks the analysis down step by step, using atomic operations on the table.
    - [Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding](https://arxiv.org/abs/2401.04398)
    - [MultiCoT](https://cyqiq-cot-demo-c27oq5sqda-uc.a.run.app/)
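
The selection step of Active Prompting in the list above can be sketched in a few lines. This is a minimal illustration that assumes disagreement (the share of distinct answers among the samples) as the uncertainty metric; other metrics such as entropy would fit the same shape. The question IDs and sampled answers are invented.

```python
def disagreement(answers):
    """Share of distinct answers among the samples (higher = more uncertain)."""
    return len(set(answers)) / len(answers)


def select_most_uncertain(sampled_answers, n):
    """Return the n questions whose sampled answers disagree the most."""
    ranked = sorted(
        sampled_answers,
        key=lambda question: disagreement(sampled_answers[question]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical data: 4 sampled answers per question.
sampled_answers = {
    "q1": ["12", "12", "12", "12"],
    "q2": ["7", "9", "7", "8"],
    "q3": ["3", "4", "3", "3"],
}
to_annotate = select_most_uncertain(sampled_answers, n=2)
print(to_annotate)
```

The selected questions are the ones handed to human annotators, whose reasoning chains then become the new few-shot exemplars.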

Decomposition:

34. **Laddering Prompting:** Breaks a very complex problem into multiple sub-prompts and problems instead of sending one holistic prompt in one step. Alternative names of this approach include **Prompt Composition**, **Sequential Prompting**, **Multi-Turn Dialogue Prompts**, **Prewarming**, or **Internal Retrieval**.
35. **Least-to-Most Prompting:** Prompts first to list the subproblems of a complex problem, and then solve them in sequence.
    - [Least-to-Most Prompting Enables Complex Reasoning in Large Language Models](https://arxiv.org/abs/2205.10625)
36. **Decomposed Prompting (DECOMP):** For tasks where multiple predictions should be performed for one sample, such as sequence labeling, breaks down the holistic prompt into different sub-prompts and then answers each sub-prompt separately. Also called **Prompt Decomposition**. Prompt decomposition can be considered for token or span prediction tasks; prompt composition would be a better choice for span relation prediction tasks.
37. **Plan-and-Solve (PS) Prompting:** "Let's first understand the problem, extract relevant variables and their corresponding numerals, and make and devise a complete plan. Then, let's carry out the plan, calculate intermediate variables (pay attention to correct numerical calculation and commonsense), solve the problem step by step, and show the answer."
    - [Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models](https://arxiv.org/abs/2305.04091)
    - [Plan-and-Execute Agents](https://blog.langchain.dev/plan-and-execute-agents/)
38. **Tree-of-Thoughts (ToT) Prompting:** Maintains a tree of thoughts, where thoughts represent coherent language sequences that serve as intermediate steps toward solving a problem. The LLM's ability to generate and self-evaluate thoughts is then combined with search algorithms to enable systematic exploration of thoughts with lookahead and backtracking.
    - [Tree of Thoughts: Deliberate Problem Solving with Large Language Models](https://arxiv.org/abs/2305.10601)
    - [ToT-Style Prompting](https://github.com/dave1010/tree-of-thought-prompting?tab=readme-ov-file#chain-of-thought-prompting)
39. **Recursion-of-Thought (RoT):** Enables LLMs to recursively create multiple contexts to solve problems. The framework also introduces special tokens triggering context-related operations. 
    - [Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models](https://arxiv.org/abs/2306.06891)
41. **Program-of-Thoughts (PoT) Prompting:** Disentangles computation from reasoning in the problem-solving process. Makes the LLM express its thoughts as a Python program.
    - [Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks](https://arxiv.org/abs/2211.12588)
42. **Faithful Chain-of-Thought:** Consists of the two main stages below.
    - **Translation:** Given a complex query in natural language, prompts the LLM to translate it into a reasoning chain which interleaves natural language comments and symbolic language programs.
    - **Problem Solving:** Calls a deterministic external solver.
    - [Faithful Chain-of-Thought Reasoning](https://arxiv.org/abs/2301.13379)
44. **Generated Knowledge Prompting:** Prompts the LLM to generate useful knowledge related to the task, and then incorporates the knowledge into the prompt alongside the question or task description. It is possible to streamline this into a single prompt; that is, analogical prompting.
    - [Generated Knowledge Prompting for Commonsense Reasoning](https://arxiv.org/abs/2110.08387)
45. **Maieutic Prompting**
    - [Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations](https://arxiv.org/abs/2205.11822)
46. **Directional Stimulus Prompting:**
    - [Guiding Large Language Models via Directional Stimulus Prompting](https://arxiv.org/abs/2302.11520)
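
The control flow of Least-to-Most Prompting from the list above can be sketched as follows: one call lists the subproblems, and then each subproblem is solved in order with the earlier answers kept in the context. The prompt wording and the `scripted_llm` stand-in are assumptions made for illustration; a real model call would replace the stand-in.

```python
from typing import Callable


def least_to_most(question: str, llm: Callable[[str], str]) -> str:
    """Decompose the question, then solve the subproblems in sequence."""
    decomposition = llm(
        f"List the subproblems needed to solve this problem, one per line:\n{question}"
    )
    subproblems = [line.strip() for line in decomposition.splitlines() if line.strip()]

    context = f"Problem: {question}"
    answer = ""
    for subproblem in subproblems:
        # Earlier question/answer pairs stay in the context for later steps.
        answer = llm(f"{context}\nQ: {subproblem}\nA:")
        context += f"\nQ: {subproblem}\nA: {answer}"
    return answer


# Hypothetical scripted replies standing in for a real model.
replies = iter([
    "How many apples after buying?\nHow many apples after eating one?",
    "5",
    "4",
])
prompts_seen = []


def scripted_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return next(replies)


final_answer = least_to_most(
    "Mary has 3 apples, buys 2, then eats 1. How many are left?", scripted_llm
)
print(final_answer)
```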

Ensembling:

45. **Prompt Ensembling:** Uses multiple unanswered prompts for an input at inference time to make predictions.
46. **Demonstration Ensembling (DENSE):**
47. **Mixture of Reasoning Experts (MoRE):**
48. **Max Mutual Information Method:**
49. **Self-Consistency:** Samples a diverse set of reasoning paths instead of only taking the naive greedy decoding, and then selects the most consistent answer by marginalizing out the sampled reasoning paths.
    - [Self-Consistency Improves Chain of Thought Reasoning in Language Models](https://arxiv.org/abs/2203.11171)
51. **Universal Self-Consistency:**
52. **Meta-Reasoning over Multiple CoTs:**
53. **DiVeRSe:**
54. **Consistency-based Self-adaptive Prompting (COSP):**
55. **Universal Self-Adaptive Prompting (USP)**
56. **Prompt Paraphrasing**
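
The voting step of Self-Consistency in the list above can be illustrated with a small sketch. It assumes that every sampled reasoning path ends with the phrase "The answer is X.", which is a convention chosen here for easy parsing; the sampled paths are invented.

```python
from collections import Counter


def extract_answer(reasoning_path: str) -> str:
    """Assumes each path ends with 'The answer is X.'."""
    return reasoning_path.rsplit("The answer is", 1)[-1].strip(" .")


def self_consistent_answer(reasoning_paths):
    """Majority vote over final answers, ignoring how each was reached."""
    votes = Counter(extract_answer(path) for path in reasoning_paths)
    answer, count = votes.most_common(1)[0]
    return answer, count / len(reasoning_paths)


# Hypothetical outputs sampled for one question (for example with temperature > 0).
paths = [
    "3 + 2 = 5, then 5 - 1 = 4. The answer is 4.",
    "She ends with 3 + 2 - 1 = 4. The answer is 4.",
    "3 + 2 = 6, minus 1 is 5. The answer is 5.",
]
answer, agreement = self_consistent_answer(paths)
print(answer, round(agreement, 2))
```

The returned agreement ratio is the quantity that Uncertainty-Routed CoT compares against its threshold before accepting the majority answer.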

Self-criticism:

56. **Self-Calibration:**
57. **Self-Refine:** Asks the LLM to be self-critical, evaluating and improving the responses it generated. Also called **Self-Evaluative Prompting**.
    - [Self-Refine: Iterative Refinement with Self-Feedback](https://arxiv.org/abs/2303.17651)
58. **Reversing Chain-of-Thought (RCoT):**
59. **Self-Verification:**
60. **Chain-of-Verification (CoVe) Prompting:**
61. **Cumulative Reasoning:**
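
The generate, critique, and improve cycle of Self-Refine in the list above can be sketched as a loop. This is a minimal illustration: the prompt wording, the `DONE` stop signal, and the `max_rounds` cap are assumptions, and `scripted_llm` is a hypothetical stand-in for a real model.

```python
from typing import Callable


def self_refine(task: str, llm: Callable[[str], str], max_rounds: int = 3) -> str:
    """Draft an answer, then alternate critique and revision until accepted."""
    draft = llm(f"Task: {task}\nAnswer:")
    for _ in range(max_rounds):
        feedback = llm(
            f"Task: {task}\nAnswer: {draft}\n"
            "Critique this answer. Reply DONE if it needs no changes."
        )
        if feedback.strip() == "DONE":
            break
        draft = llm(
            f"Task: {task}\nAnswer: {draft}\nFeedback: {feedback}\nImproved answer:"
        )
    return draft


# Hypothetical scripted replies: draft, critique, revision, acceptance.
replies = iter([
    "Paris is big.",
    "Too vague; say that it is the capital.",
    "Paris is the capital of France.",
    "DONE",
])
refined = self_refine("Describe Paris in one sentence.", lambda prompt: next(replies))
print(refined)
```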

Prompt chaining:

62. **Prompt Chaining:** A **Prompt Chain** consists of two or more prompt templates used in succession. The output of the prompt generated by the first prompt template is used to parameterize the second template, continuing until all templates are exhausted.
63. **Stuff Documents Chain**
64. **Reduce Documents Chain**
65. **Map Reduce Documents Chain**
66. **Refine Documents Chain**
67. **Map Rerank Documents Chain**
68. **Prompt Sharing:** Prompt templates are partially shared across multiple tasks, domains, or languages.
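
A prompt chain as defined above can be sketched in plain Python: each template is filled with the previous output until the templates are exhausted. The `{input}` placeholder name, the two templates, and `stub_llm` are assumptions made for illustration.

```python
from typing import Callable, List


def run_chain(templates: List[str], first_input: str, llm: Callable[[str], str]) -> str:
    """Feed each output into the next template's {input} placeholder."""
    value = first_input
    for template in templates:
        value = llm(template.format(input=value))
    return value


templates = [
    "Extract the product name from this review: {input}",
    "Write a one-line slogan for: {input}",
]


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: one canned behavior per template."""
    if prompt.startswith("Extract"):
        return "Acme Kettle"
    return f"Slogan for {prompt.split(': ', 1)[1]}"


result = run_chain(templates, "I love my Acme Kettle, it boils fast.", stub_llm)
print(result)
```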

Prompt engineering:

69. **Meta Prompting:** Prompts the LLM to generate or improve a prompt or prompt template.
70. **AutoPrompt**
71. **Automatic Prompt Engineer (APE):** Uses one LLM to beam search over prompts for another LLM.
72. **Gradient-free Instructional Prompt Search (GrIPS)**
73. **Prompt Optimization with Textual Gradients (ProTeGi)**
74. **RLPrompt**
75. **Dialogue-Comprised Policy-Gradient-Based Discrete Prompt Optimization (DP2O):**
76. **Iterative Prompting**
    - [Iteratively Prompt Pre-trained Language Models for Chain of Thought](https://arxiv.org/abs/2203.08383)
77. **Expert Prompting:**
    - **Static (Generic)**
    - **Dynamic (Adaptive)**
    - [ExpertPrompting: Instructing Large Language Models to be Distinguished Experts](https://arxiv.org/abs/2305.14688)
78. **Multi-Persona Prompting (MP):** Also known as **Solo Performance Prompting (SPP)**.
    - [Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration](https://arxiv.org/abs/2307.05300)
79. **Meta Prompting:**
    - [Meta-Prompting: Enhancing Language Models with Task Agnostic Scaffolding](https://arxiv.org/abs/2401.12954)
80. **Automatic Prompt Optimization (APO):**
    - [Automatic Prompt Optimization with Gradient Descent and Beam Search](https://arxiv.org/abs/2305.03495)
81. **Prompt Engineering a Prompt Engineer (PE2)**
    - [Prompt Engineering a Prompt Engineer](https://arxiv.org/abs/2311.05661)
82. **Optimization by PROmpting (OPRO):**
    - [Large Language Models as Optimizers](https://arxiv.org/abs/2309.03409)
83. **Self-Discover:**
    - [Self-Discover: Large Language Models Self-Compose Reasoning Structures](https://arxiv.org/abs/2402.03620)

Few-shot prompting:

83. **K-Nearest Neighbor (KNN):**
84. **Vote-K:**
    - [Selective Annotation Makes Language Models Better Few-Shot Learners](https://arxiv.org/abs/2209.01975)
85. **Self-Generated In-Context Learning (SG-ICL):** Divided into the self-generation step & the inference step. First generates demonstrations conditioned on the test input and a specific class, so that generated demonstrations are highly correlated with the test input. Then uses the self-generated samples as a demonstration for in-context learning.
    - [Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator](https://arxiv.org/abs/2206.08082)
86. **Prompt Mining:**
87. **?**
88. **?**
89. **?**

Retrieval Augmented Generation:

90. **Retrieval Augmented Generation (RAG):**
    - **Naive RAG:**
    - **Advanced RAG:**
    - **Modular RAG:**
91. **Graph Retrieval-Augmented Generation (GraphRAG)**
92. **Prompt Pipelining:** The variables or placeholders in the pre-defined prompt template are populated with the question from the user and the knowledge to be searched from the knowledge store.
93. **Chain-of-Note (CoN) Prompting:**
    - [Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models](https://arxiv.org/abs/2311.09210)
94. **Chain-of-Knowledge (CoK) Prompting:**
    - [Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources](https://arxiv.org/abs/2305.13269)
95. **Forward-Looking Active REtrieval augmented generation (FLARE):**
    - [Active Retrieval Augmented Generation](https://arxiv.org/abs/2305.06983)
96. **Multi-Query**
97. **Multi-Vector**
98. **Parent Document Retriever**
99. **Top-K Similarity Search**
100. **Maximum Marginal Relevance (MMR)**
101. **Contextual Compression**
102. **Ensemble Retriever**
103. **Backtracing:**
    - [Backtracing: Retrieving the Cause of the Query](https://arxiv.org/abs/2403.03956)
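
Prompt Pipelining from the list above can be illustrated with a toy retriever. This sketch assumes a three-sentence in-memory knowledge store and ranks passages by word overlap with the question, a deliberately crude stand-in for the embedding-based Top-K Similarity Search a real system would use. All names are hypothetical.

```python
import re

# Tiny in-memory knowledge store, invented for this example.
KNOWLEDGE_STORE = [
    "LangChain is a framework for building LLM applications.",
    "Barcelona is a city in Catalonia, Spain.",
    "Nucleus sampling keeps the most probable tokens.",
]

# Pre-defined template with placeholders for the knowledge and the question.
TEMPLATE = (
    "Answer the question using only the context.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def tokens(text):
    """Lowercased set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question, store, k=1):
    """Rank passages by word overlap with the question (toy similarity)."""
    overlap = lambda passage: len(tokens(passage) & tokens(question))
    return sorted(store, key=overlap, reverse=True)[:k]


def build_prompt(question, store, k=1):
    """Populate the template with the user question and retrieved knowledge."""
    context = "\n".join(retrieve(question, store, k))
    return TEMPLATE.format(context=context, question=question)


prompt = build_prompt("Where is Barcelona?", KNOWLEDGE_STORE)
print(prompt)
```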

Agents:

104. **Autonomous Agents:** Prompt Chaining is the execution of a predetermined and set sequence of actions. However, agents can maintain a high level of autonomy.
     - **Tool Use Agents**
     - **Code-Based Agents**
     - **Observation-Based Agents**
105. **Automatic Reasoning and Tool-use (ART)**
106. **Modular Reasoning, Knowledge, and Language (MRKL) System:**
107. **Self-Correcting with Tool-Interactive Critiquing (CRITIC)**
108. **Program-Aided Language Models (PAL)**
     - [PAL: Program-aided Language Models](https://arxiv.org/abs/2211.10435)
109. **Tool-Integrated Reasoning Agent (ToRA):**
110. **Task Weaver:**
111. **ReAct Prompting:**
112. **Reflexion**
     - [Reflexion: Language Agents with Verbal Reinforcement Learning](https://arxiv.org/abs/2303.11366)
113. **LLM Compiler:**
     - [An LLM Compiler for Parallel Function Calling](https://arxiv.org/abs/2312.04511)
114. **Reasoning WithOut Observation (ReWOO)**:
     - [ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models](https://arxiv.org/abs/2305.18323)

Task-specific:

115. **Chain-of-Density (CoD)**
     - [From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting](https://arxiv.org/abs/2309.04269)
116. **Recursive Reprompting and Revision (Re3)**
     - [Re3: Generating Longer Stories With Recursive Reprompting and Revision](https://arxiv.org/abs/2210.06774)

Code Generation & Execution:

116. **Scratchpad Prompting:**
117. **Program of Thoughts (PoT) Prompting:**
118. **Structured Chain-of-Thought (SCoT) Prompting:**
119. **Chain-of-Code (CoC) Prompting:**

Multilingual:



Multimodal:

120. **Multimodal Chain-of-Thought (CoT) Prompting**
    - [Multimodal Chain-of-Thought Reasoning in Language Models](https://arxiv.org/abs/2302.00923)

121. **Soft Prompts**
     - **Prefix Tuning:**
     - **Prompt Tuning:**
     - **P-Tuning:**
122. **Prompt Injection Attack**
     - **Prompt Takeovers**
     - **Prompt Leaks**
123. **Super Prompts:**
     - **CAN (Code Anything Now)**
     - **DAN (Do Anything Now)**

69. **Retriever Fine-Tuning**
70. **Collaborative Fine-Tuning**
71. **Generator Fine-Tuning**

# 2. LangChain
LangChain is a framework for developing applications powered by large language models (LLMs).
- `langchain-core`: Base abstractions of components and **LangChain Expression Language (LCEL)**.
- `langchain-community`: Third party integrations.
    - `langchain-openai`
    - `langchain-huggingface`
- `langchain`: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- `langgraph`

## 2-1. Chat Models & LLMs
Chat models take a sequence of messages as input and return chat messages as output. They are generally the newer models. They inherit from `langchain_core.language_models.chat_models.BaseChatModel`.

1. `langchain_openai.chat_models.base.ChatOpenAI`: OpenAI chat model integration.
   - `model`
   - `n`: Number of responses.
   - `temperature`: Adds randomness to responses. Higher values make answers more diverse, while lower values make them more focused and deterministic. 
   - `timeout`: Request timeout.
   - `stop`: List of stop sequences. Generation halts as soon as the model would produce one of them.
   - `max_tokens`: Max tokens to generate.
   - `max_retries`: Max number of times to retry requests.
   - `api_key`: OpenAI API key. Read from the `OPENAI_API_KEY` environment variable when not provided.
   - `base_url`: Endpoint to send requests to.
   - `model_kwargs`: Holds model parameters valid for `openai.OpenAI.chat.completions.create(messages, model, stream, frequency_penalty, function_call, functions, logit_bias, logprobs, max_tokens, n, parallel_tool_calls, presence_penalty, response_format, seed, service_tier, stop, stream_options, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)` call not explicitly specified.
       - `top_p`: **Nucleus Sampling** controls the diversity and quality of the responses. Limits the cumulative probability of the most likely tokens. Higher values allow more tokens, leading to diverse responses, while lower values provide more focused and constrained answers.
       - `frequency_penalty`: Penalizes tokens in proportion to how often they have already appeared, which controls the model's tendency to repeat words or phrases. Higher values encourage more diverse and novel responses; lower values make repetition more likely.
       - `presence_penalty`: Penalizes tokens that have already appeared in the text so far, regardless of how often. Higher values make the model more likely to generate new tokens and move on to new topics.
       - [OpenAI Chat Completions API](https://platform.openai.com/docs/api-reference/chat/create)
       - [openai-python/src/openai/resources/chat/completions.py](https://github.com/openai/openai-python/blob/main/src/openai/resources/chat/completions.py)  

Pure text-in/text-out LLMs tend to be older or lower-level. They inherit from `langchain_core.language_models.llms.BaseLLM`.

2. `langchain_openai.llms.base.OpenAI`: OpenAI large language models. Inherits from `langchain_openai.llms.base.BaseOpenAI`.

Both the `BaseChatModel` and `BaseLLM` classes inherit from `langchain_core.language_models.base.BaseLanguageModel`.
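
The effect of `top_p` described above is easier to see on a toy distribution. The following sketch illustrates the idea of nucleus sampling only; it is not how the OpenAI API is implemented internally, and the token probabilities are invented.

```python
def nucleus_filter(token_probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalize over the kept tokens."""
    ranked = sorted(token_probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    return {token: prob / cumulative for token, prob in kept}


# Hypothetical next-token distribution.
probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "axolotl": 0.05}

narrow = nucleus_filter(probs, top_p=0.5)  # only the single most likely token survives
wide = nucleus_filter(probs, top_p=0.9)    # three tokens survive, the rare one is cut
print(narrow)
print(wide)
```

A lower `top_p` shrinks the candidate set and makes output more focused, while a value of 1 keeps every token and leaves the distribution unchanged.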

In [5]:
# !pip install langchain-core langchain-community langchain langchain-openai openai

In [2]:
import os

os.environ['OPENAI_API_KEY'] = ''
os.environ["HUGGINGFACEHUB_API_TOKEN"] = ''

In [58]:
# `OpenAI()`
# from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

llm = OpenAI()
llm

OpenAI(client=<openai.resources.completions.Completions object at 0x7374b40dbe30>, async_client=<openai.resources.completions.AsyncCompletions object at 0x7374abf2e5d0>, openai_api_key=SecretStr('**********'), openai_proxy='')

In [59]:
print(llm)

OpenAI
Params: {'model_name': 'gpt-3.5-turbo-instruct', 'temperature': 0.7, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'logit_bias': {}, 'max_tokens': 256}


In [61]:
# `OpenAI.predict()` (deprecated in favor of `invoke()`)
llm = OpenAI(model='gpt-3.5-turbo-instruct', temperature=0, max_tokens=256, timeout=None, max_retries=2)

prompt = "How many citizens does Barcelona have?"
response = llm.predict(prompt)
response

'\n\nAs of 2021, the estimated population of Barcelona is 1.6 million.'

In [62]:
# `OpenAI.__call__()` (calling the model directly is deprecated in favor of `invoke()`)
prompt = "Who was the first president of the United States?"
response = llm(prompt)
response

'\n\nGeorge Washington was the first president of the United States.'

In [63]:
# `OpenAI.invoke()`
prompt = "Who is Michael Jordan?"
response = llm.invoke(prompt)
response

'\n\nMichael Jordan is a retired American professional basketball player who is widely considered one of the greatest basketball players of all time. He played 15 seasons in the National Basketball Association (NBA) for the Chicago Bulls and Washington Wizards, winning six NBA championships and earning numerous individual awards, including five MVP awards. He is known for his incredible athleticism, scoring ability, and competitive drive, and is often credited with popularizing the game of basketball around the world. After retiring from basketball, Jordan became a successful businessman and owner of the Charlotte Hornets NBA team.'

In [3]:
# `ChatOpenAI()`
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
llm

ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x79bed87feea0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x79bed87fe270>, openai_api_key=SecretStr('**********'), openai_proxy='')

In [4]:
# llm = ChatOpenAI(model="gpt-4o", temperature=0, max_tokens=None, timeout=None, max_retries=2, model_kwargs={"top_p": 1})
llm = ChatOpenAI(model="gpt-4o", temperature=0, max_tokens=None, timeout=None, max_retries=2)
llm

ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x79bed8600e00>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x79bed8601b20>, model_name='gpt-4o', temperature=0.0, openai_api_key=SecretStr('**********'), openai_proxy='')

In [67]:
# `ChatOpenAI.invoke()`
messages = [
    ("system", "You are a helpful assistant."),
    ("user", "How many citizens does Barcelona have?")
]
response = llm.invoke(messages)
response

AIMessage(content='As of the most recent data available in 2023, Barcelona has a population of approximately 1.6 million residents. However, population figures can fluctuate, so for the most up-to-date information, it is advisable to refer to official sources such as the Statistical Institute of Catalonia or the Barcelona City Council.', response_metadata={'token_usage': {'completion_tokens': 63, 'prompt_tokens': 24, 'total_tokens': 87}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_d33f7b429e', 'finish_reason': 'stop', 'logprobs': None}, id='run-62c75271-a1e7-49cc-b4c7-79908fbf8e9c-0', usage_metadata={'input_tokens': 24, 'output_tokens': 63, 'total_tokens': 87})

In [68]:
response.content

'As of the most recent data available in 2023, Barcelona has a population of approximately 1.6 million residents. However, population figures can fluctuate, so for the most up-to-date information, it is advisable to refer to official sources such as the Statistical Institute of Catalonia or the Barcelona City Council.'

In [69]:
# `openai.chat.completions.create()`
import openai

messages = [{
    "role": "user",
    "content": "How many citizens does Barcelona have?"
}]
response = openai.chat.completions.create(model="gpt-4o", messages=messages, temperature=0)
response

ChatCompletion(id='chatcmpl-9kguKKA9vSKQf2Frv59X5clPddG7T', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="As of the most recent data available in 2023, Barcelona, the capital city of Catalonia in Spain, has a population of approximately 1.6 million residents. However, population figures can fluctuate, so for the most current and precise numbers, it's advisable to consult official sources such as the Instituto Nacional de Estadística (INE) or the Ajuntament de Barcelona (Barcelona City Council).", role='assistant', function_call=None, tool_calls=None))], created=1720915052, model='gpt-4o-2024-05-13', object='chat.completion', service_tier=None, system_fingerprint='fp_d33f7b429e', usage=CompletionUsage(completion_tokens=79, prompt_tokens=14, total_tokens=93))

In [70]:
response.choices[0].message.content

"As of the most recent data available in 2023, Barcelona, the capital city of Catalonia in Spain, has a population of approximately 1.6 million residents. However, population figures can fluctuate, so for the most current and precise numbers, it's advisable to consult official sources such as the Instituto Nacional de Estadística (INE) or the Ajuntament de Barcelona (Barcelona City Council)."

## 2-2. Prompt Templates
Prompt templates:

1. `langchain_core.prompts.prompt.PromptTemplate(input_types, input_variables, metadata, optional_variables, output_parser, partial_variables, tags, template, template_format, validate_template)`: `str` prompt templates.
   - `format(**kwargs)`
   - `invoke(input, config)`
   - `from_template(template, template_format, partial_variables, **kwargs)`
   - `from_file(template_file, input_variables, **kwargs)`
   - `from_examples(examples, prefix, suffix, input_variables, example_separator, **kwargs)`: Few-shot prompt templates. Takes examples in list format with prefix and suffix to create a prompt.

2. `langchain_core.prompts.chat.ChatPromptTemplate()`: Formats a list of messages. Alternatively, you can construct the prompt from the message role prompt templates described below.
   - `format(**kwargs)`
   - `invoke(input, config)`
   - `format_messages(**kwargs)`: Returns a list of finalized messages.
   - `from_messages(messages, template_format)`: Creates a chat prompt template from a variety of message formats.

Message role prompt templates:

3. `langchain_core.prompts.chat.SystemMessagePromptTemplate(additional_kwargs, prompt)`: for optional `system` message role used to set the behavior of the assistant.
4. `langchain_core.prompts.chat.HumanMessagePromptTemplate(additional_kwargs, prompt)`: for `user` message role who provides requests.
5. `langchain_core.prompts.chat.AIMessagePromptTemplate(additional_kwargs, prompt)`: for `assistant` message role storing previous assistant responses or giving examples of few-shot prompting.

- All three message role prompt templates above have the following methods:
    - `format(**kwargs)`
    - `format_messages(**kwargs)`
    - `from_template(template, template_format, partial_variables, **kwargs)`: Default format of the template is `f-string`.
    - `from_template_file(template_file, input_variables, **kwargs)`

Few-shot prompt templates:

6. `langchain_core.prompts.few_shot.FewShotPromptTemplate(example_prompt, example_selector, example_separator, examples, input_types, input_variables, metadata, optional_variables, output_parser, partial_variables, prefix, suffix, tags, template_format, validate_template)`: Prompt templates that contain few-shot examples.
   - `format(**kwargs)`
   - `invoke(input, config)`

7. `langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate(example_prompt, example_selector, examples, input_types, input_variables, metadata, optional_variables, output_parser, partial_variables, tags)`: Chat prompt templates that support few-shot examples. Because there is no `suffix` to assign as in `FewShotPromptTemplate`, you need one more final prompt to wrap the template.
   - `format(**kwargs)`
   - `invoke(input, config)`
   - `format_messages(**kwargs)`: Formats `**kwargs` into a list of messages.

In [55]:
# `PromptTemplate()`
from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate(
    input_variables=["cuisine"],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
)
messages = prompt_template.invoke({"cuisine": "Italian"})
messages

StringPromptValue(text='I want to open a restaurant for Italian food. Suggest a fancy name for this.')

In [56]:
messages = prompt_template.format(cuisine="Italian")
messages

'I want to open a restaurant for Italian food. Suggest a fancy name for this.'

In [57]:
response = llm.invoke(messages)
response

AIMessage(content='How about "La Dolce Vita Ristorante"? This name evokes the charm and elegance of Italian culture, suggesting a delightful and luxurious dining experience.', response_metadata={'token_usage': {'completion_tokens': 28, 'prompt_tokens': 25, 'total_tokens': 53}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_d33f7b429e', 'finish_reason': 'stop', 'logprobs': None}, id='run-e17ba7b7-3502-4490-a811-6ab506eceabc-0', usage_metadata={'input_tokens': 25, 'output_tokens': 28, 'total_tokens': 53})

In [71]:
# `from_template()`
from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template("Tell me a joke about {topic}.")
messages = prompt_template.invoke({"topic": "cats"})
messages

StringPromptValue(text='Tell me a joke about cats.')

In [72]:
messages = prompt_template.format(topic="cats")
messages

'Tell me a joke about cats.'

In [None]:
# `from_file()`


In [73]:
# `from_examples()`
from langchain_core.prompts import PromptTemplate

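# `from_examples()` requires `examples`, `suffix`, and `input_variables`.
# With no examples and an empty separator, the template is just the suffix.
prompt_template = PromptTemplate.from_examples(
    examples=[],
    suffix="Tell me a joke about {topic}.",
    input_variables=["topic"],
    example_separator=""
)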
messages = prompt_template.invoke({"topic": "cats"})
messages

StringPromptValue(text='Tell me a joke about cats.')

In [85]:
# Few-shot prompting without template
messages = """
You are an alien from Mars: 
Here are some examples: 

Question: What is human cuisine like?
Response: Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy.

Question: What is human entertainment?
Response: Crude moving images and loud sounds.

Question: What are human homes like?
Response: """

response = llm.invoke(messages)
response.content

'Human homes are primitive enclosures constructed from basic materials such as wood, stone, and synthetic compounds. They are designed to provide shelter from environmental elements and are compartmentalized into various sections for different activities. These structures lack the advanced climate control and adaptive architecture found in Martian habitats. Instead, they rely on rudimentary heating and cooling systems and are often cluttered with an array of unnecessary objects and decorations.'

In [58]:
# `from_examples()`
examples = [
    """
    Question: What is human cuisine like?
    Response: Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy.
    """,
    """
    Question: What is human entertainment?
    Response: Crude moving images and loud sounds.
    """
]

prefix = """
You are an alien from Mars: 
Here are some examples: 
"""

suffix = """
Question: {userInput}
Response: 
"""

example_template = PromptTemplate.from_examples(examples=examples, prefix=prefix, suffix=suffix, input_variables=['userInput'], example_separator="\n\n")
example_template

PromptTemplate(input_variables=['userInput'], template="\nYou are an alien from Mars: \nHere are some examples: \n\n\n\n    Question: What is human cuisine like?\n    Response: Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy.\n    \n\n\n    Question: What is human entertainment?\n    Response: Crude moving images and loud sounds.\n    \n\n\nQuestion: {userInput}\nResponse: \n")

In [107]:
messages = example_template.invoke({"userInput": "What are human homes like?"})
messages

StringPromptValue(text="\nYou are an alien from Mars: \nHere are some examples: \n\n\n\n    Question: What is human cuisine like?\n    Response: Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy.\n    \n\n\n    Question: What is human entertainment?\n    Response: Crude moving images and loud sounds.\n    \n\n\nQuestion: What are human homes like?\nResponse: \n")

In [108]:
response = llm.invoke(messages)
response.content

'Human homes are primitive enclosures made from basic materials like wood, stone, and synthetic compounds. They are designed to provide shelter from environmental elements and are often segmented into various compartments for different activities such as sleeping, eating, and socializing. These structures lack the advanced adaptive and self-sustaining features of our Martian habitats.'

In [97]:
# `ChatPromptTemplate()`
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("user", "Tell me a joke about {topic}")
])
messages = prompt_template.invoke({"topic": "cats"})
messages

ChatPromptValue(messages=[SystemMessage(content='You are a helpful assistant'), HumanMessage(content='Tell me a joke about cats')])

In [98]:
response = llm.invoke(messages)
response.content

"Sure, here's a cat joke for you:\n\nWhy was the cat sitting on the computer?\n\nBecause it wanted to keep an eye on the mouse!"

In [52]:
# `HumanMessagePromptTemplate()`, `SystemMessagePromptTemplate()` & `AIMessagePromptTemplate()`
from langchain_core.prompts import HumanMessagePromptTemplate, SystemMessagePromptTemplate, AIMessagePromptTemplate

prompt_template = (
    SystemMessagePromptTemplate.from_template("You are a helpful AI Assistant")
    + HumanMessagePromptTemplate.from_template("Tell me a joke about {topic}")
)
messages = prompt_template.invoke({"topic": "cats"})
messages

ChatPromptValue(messages=[SystemMessage(content='You are a helpful AI Assistant'), HumanMessage(content='Tell me a joke about cats')])

In [53]:
response = llm.invoke(messages)
response.content

"Sure, here's a cat joke for you:\n\nWhy was the cat sitting on the computer?\n\nBecause it wanted to keep an eye on the mouse!"

In [102]:
# `FewShotPromptTemplate()`
examples = [
    {
        "query": "What is human cuisine like?",
        "answer": "Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy."
    },
    {
        "query": "What is human entertainment?",
        "answer": "Crude moving images and loud sounds."
    }
]

example_template = """
Question: {query}
Response: {answer}
"""

example_prompt = PromptTemplate(input_variables=["query", "answer"], template=example_template)
example_prompt

PromptTemplate(input_variables=['answer', 'query'], template='\nQuestion: {query}\nResponse: {answer}\n')

In [103]:
from langchain_core.prompts import FewShotPromptTemplate

prefix = """You are an alien from Mars: 
Here are some examples: 
"""

suffix = """
Question: {userInput}
Response: """

few_shot_prompt_template = FewShotPromptTemplate(examples=examples, example_prompt=example_prompt, prefix=prefix, suffix=suffix, input_variables=["userInput"], example_separator="\n\n")
few_shot_prompt_template

FewShotPromptTemplate(input_variables=['userInput'], examples=[{'query': 'What is human cuisine like?', 'answer': "Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy."}, {'query': 'What is human entertainment?', 'answer': 'Crude moving images and loud sounds.'}], example_prompt=PromptTemplate(input_variables=['answer', 'query'], template='\nQuestion: {query}\nResponse: {answer}\n'), suffix='\nQuestion: {userInput}\nResponse: ', prefix='You are an alien from Mars: \nHere are some examples: \n')

In [104]:
messages = few_shot_prompt_template.invoke({"userInput": "What are human homes like?"})
messages

StringPromptValue(text="You are an alien from Mars: \nHere are some examples: \n\n\n\nQuestion: What is human cuisine like?\nResponse: Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy.\n\n\n\nQuestion: What is human entertainment?\nResponse: Crude moving images and loud sounds.\n\n\n\nQuestion: What are human homes like?\nResponse: ")

In [105]:
response = llm.invoke(messages)
response.content

'Their homes are primitive enclosures made from basic materials like wood, stone, and synthetic compounds. These structures are designed to provide shelter from environmental elements and are often divided into multiple rooms for different activities. The layout and design lack the advanced spatial efficiency and adaptive technology found in Martian habitats.'

In [40]:
# `FewShotChatMessagePromptTemplate()`
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate

examples = [
    {
        "input": "1+1",
        "output": "2"
    },
    {
        "input": "3+5",
        "output": "8"
    }
]

example_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
    ("assistant", "{output}")
])
example_prompt

ChatPromptTemplate(input_variables=['input', 'output'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant.')), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')), AIMessagePromptTemplate(prompt=PromptTemplate(input_variables=['output'], template='{output}'))])

In [41]:
few_shot_prompt_template = FewShotChatMessagePromptTemplate(examples=examples, example_prompt=example_prompt, input_variables=[])
few_shot_prompt_template

FewShotChatMessagePromptTemplate(examples=[{'input': '1+1', 'output': '2'}, {'input': '3+5', 'output': '8'}], input_variables=[], example_prompt=ChatPromptTemplate(input_variables=['input', 'output'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant.')), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')), AIMessagePromptTemplate(prompt=PromptTemplate(input_variables=['output'], template='{output}'))]))

In [42]:
few_shot_prompt = few_shot_prompt_template.invoke({})
few_shot_prompt

ChatPromptValue(messages=[SystemMessage(content='You are a helpful assistant.'), HumanMessage(content='1+1'), AIMessage(content='2'), SystemMessage(content='You are a helpful assistant.'), HumanMessage(content='3+5'), AIMessage(content='8')])

In [43]:
final_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        ChatPromptTemplate.from_messages(few_shot_prompt.to_messages()),
        ("user", "{userInput}"),
    ]
)
messages = final_prompt.invoke({"userInput": "What's the square of a triangle?"})
messages
# chain = final_prompt | ChatOpenAI(model="gpt-4o", temperature=0, max_tokens=None, timeout=None, max_retries=2)
# chain.invoke({"userInput": "What's the square of a triangle?"})

ChatPromptValue(messages=[SystemMessage(content='You are a helpful assistant.'), SystemMessage(content='You are a helpful assistant.'), HumanMessage(content='1+1'), AIMessage(content='2'), SystemMessage(content='You are a helpful assistant.'), HumanMessage(content='3+5'), AIMessage(content='8'), HumanMessage(content="What's the square of a triangle?")])

In [44]:
response = llm.invoke(messages)
response.content

'The term "square of a triangle" is not a standard mathematical concept. However, if you are referring to the area of a triangle, you can calculate it using the formula:\n\n\\[ \\text{Area} = \\frac{1}{2} \\times \\text{base} \\times \\text{height} \\]\n\nIf you meant something else, please provide more context so I can assist you better.'

In [49]:
# Few shot prompting with `HumanMessagePromptTemplate()`, `SystemMessagePromptTemplate()` & `AIMessagePromptTemplate()`
from langchain_core.prompts import HumanMessagePromptTemplate, SystemMessagePromptTemplate, AIMessagePromptTemplate

few_shot_prompt_template = FewShotChatMessagePromptTemplate(
    examples=examples, 
    example_prompt=(
        HumanMessagePromptTemplate.from_template("{input}") 
        + AIMessagePromptTemplate.from_template("{output}")
    ),
    input_variables=[]
)
few_shot_prompt_template

FewShotChatMessagePromptTemplate(examples=[{'input': '1+1', 'output': '2'}, {'input': '3+5', 'output': '8'}], input_variables=[], example_prompt=ChatPromptTemplate(input_variables=['input', 'output'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}')), AIMessagePromptTemplate(prompt=PromptTemplate(input_variables=['output'], template='{output}'))]))

In [50]:
final_prompt = (
    SystemMessagePromptTemplate.from_template("You are a helpful AI Assistant") 
    + few_shot_prompt_template
    + HumanMessagePromptTemplate.from_template("{userInput}")
)
messages = final_prompt.invoke({"userInput": "What's the square of a triangle?"})
messages

ChatPromptValue(messages=[SystemMessage(content='You are a helpful AI Assistant'), HumanMessage(content='1+1'), AIMessage(content='2'), HumanMessage(content='3+5'), AIMessage(content='8'), HumanMessage(content="What's the square of a triangle?")])

In [51]:
response = llm.invoke(messages)
response.content

'It seems like there might be a bit of confusion in your question. The term "square" typically refers to a specific geometric shape or the result of multiplying a number by itself. However, a triangle is a different geometric shape and doesn\'t have a "square" in the same sense.\n\nIf you are asking about the area of a triangle, the formula to calculate it is:\n\n\\[ \\text{Area} = \\frac{1}{2} \\times \\text{base} \\times \\text{height} \\]\n\nIf you meant something else, could you please clarify?'
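
The message list produced in cell `In [50]` follows a simple pattern that can be written out without LangChain. This is a sketch only: `build_chat_messages` is a hypothetical helper, and plain `(role, content)` tuples stand in for the message classes.

```python
examples = [
    {"input": "1+1", "output": "2"},
    {"input": "3+5", "output": "8"},
]


def build_chat_messages(system_text, examples, user_input):
    """Expand few-shot examples into alternating user/assistant turns."""
    messages = [("system", system_text)]
    for example in examples:
        messages.append(("user", example["input"]))
        messages.append(("assistant", example["output"]))
    # The real question always comes last, after all demonstrations.
    messages.append(("user", user_input))
    return messages


messages = build_chat_messages(
    "You are a helpful AI Assistant",
    examples,
    "What's the square of a triangle?",
)
for role, content in messages:
    print(f"{role}: {content}")
```

Keeping the system message out of the per-example prompt means it appears once, which avoids the repeated `SystemMessage` entries visible in the output of cell `In [43]`.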

## 2-3. Example Selectors

Vector stores and embedding models are required for **Similarity Search** & **Maximum Marginal Relevance (MMR)**. More details can be found in their respective sections.

Length:

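1. `langchain_core.example_selectors.length_based.LengthBasedExampleSelector(example_prompt, example_text_lengths, examples, get_text_length, max_length=2048)`: Selects examples based on length. Examples are added in order until the prompt would exceed `max_length`, which helps keep the prompt within the context window.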

Similarity:

2. `langchain_core.example_selectors.semantic_similarity.SemanticSimilarityExampleSelector(example_keys, input_keys, k, vectorstore, vectorstore_kwargs)`: Selects examples based on semantic similarity. More details about ChromaDB vector store & OpenAI embedding models can be found in their respective sections.

Maximum marginal relevance (MMR):

3. `langchain_core.example_selectors.semantic_similarity.MaxMarginalRelevanceExampleSelector(example_keys, fetch_k, input_keys, k, vectorstore, vectorstore_kwargs)`: Selects examples based on Max Marginal Relevance, which balances similarity to the input against diversity among the selected examples.
   - [Complementary Explanations for Effective In-Context Learning](https://arxiv.org/abs/2211.13892)

n-gram:

4. `langchain_community.example_selectors.ngram_overlap.NGramOverlapExampleSelector(example_prompt, examples, threshold=-1.0)`: Selects and orders examples based on n-gram overlap score (`sentence_bleu` score from NLTK package).
   - At the time of writing, there is an issue in `bleu_score.py` whose fix has not yet been released on PyPI. You need to update `nltk/translate/bleu_score.py` manually to match the fix on [GitHub](https://github.com/nltk/nltk/commit/28eeb3e83c98d27ad6b67f233576f09883f394fe#diff-644ebc352f93bb9a63ff8edbc7292e137cf1adaf68d5dc4f1aef13f4de96bf20).
  
- All example selectors have the following methods.
    - `add_example(example)`
    - `select_examples(input_variables)`

In [6]:
# `LengthBasedExampleSelector()`
from langchain_core.example_selectors import LengthBasedExampleSelector
from langchain_core.prompts import PromptTemplate

examples = [
    {
        "query": "What is human cuisine like?",
        "answer": "Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy."
    }, {
        "query": "What is human entertainment?",
        "answer": "Crude moving images and loud sounds."
    }, {
        "query": "What do humans use for transportation?",
        "answer": "Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels."
    }, {
        "query": "How do humans communicate with each other?",
        "answer": "They use a very basic form of communication involving the modulation of sound waves, referred to as 'speech.' Astonishingly primitive compared to our telepathic links."
    }, {
        "query": "How do humans maintain health?",
        "answer": "Consuming organic compounds and performing physical movements."
    }, {
        "query": "What is human education?",
        "answer": "They engage in a very basic form of knowledge transfer in places called 'schools.' It's a slow and inefficient process compared to our instant knowledge assimilation."
    }, {
        "query": "How do humans manage their societies?",
        "answer": "Through chaotic and inefficient systems."
    }, {
        "query": "What is human art?",
        "answer": "Their art is a primitive expression through physical mediums like paint and stone, lacking the sophistication of our holographic emotion sculptures."
    }
]

example_template = """
Question: {query}
Response: {answer}
"""

example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

MAX_LENGTH = 100

example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=MAX_LENGTH
)
example_selector

LengthBasedExampleSelector(examples=[{'query': 'What is human cuisine like?', 'answer': "Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy."}, {'query': 'What is human entertainment?', 'answer': 'Crude moving images and loud sounds.'}, {'query': 'What do humans use for transportation?', 'answer': "Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels."}, {'query': 'How do humans communicate with each other?', 'answer': "They use a very basic form of communication involving the modulation of sound waves, referred to as 'speech.' Astonishingly primitive compared to our telepathic links."}, {'query': 'How do humans maintain health?', 'answer': 'Consuming organic compounds and performing physical movements.'}, {'query': 'What is human e

In [7]:
from langchain_core.prompts import FewShotPromptTemplate

prefix = """
You are an alien from Mars: 
Here are some examples: 
"""

suffix = """
Question: {userInput}
Response: 
"""

few_shot_prompt_template = FewShotPromptTemplate(example_selector=example_selector, example_prompt=example_prompt, prefix=prefix, suffix=suffix, input_variables=["userInput"], example_separator="\n\n")
few_shot_prompt_template

FewShotPromptTemplate(input_variables=['userInput'], example_selector=LengthBasedExampleSelector(examples=[{'query': 'What is human cuisine like?', 'answer': "Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy."}, {'query': 'What is human entertainment?', 'answer': 'Crude moving images and loud sounds.'}, {'query': 'What do humans use for transportation?', 'answer': "Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels."}, {'query': 'How do humans communicate with each other?', 'answer': "They use a very basic form of communication involving the modulation of sound waves, referred to as 'speech.' Astonishingly primitive compared to our telepathic links."}, {'query': 'How do humans maintain health?', 'answer': 'Consuming organic comp

In [8]:
messages = few_shot_prompt_template.invoke({"userInput": "What are human homes like?"})
messages

StringPromptValue(text="\nYou are an alien from Mars: \nHere are some examples: \n\n\n\nQuestion: What is human cuisine like?\nResponse: Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy.\n\n\n\nQuestion: What is human entertainment?\nResponse: Crude moving images and loud sounds.\n\n\n\nQuestion: What do humans use for transportation?\nResponse: Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels.\n\n\n\nQuestion: What are human homes like?\nResponse: \n")

In [13]:
response = llm.invoke(messages)
response.content

'Human homes are rudimentary shelters constructed from basic materials such as wood, stone, and synthetic compounds. They are designed to provide minimal protection from environmental elements and are often cluttered with an array of unnecessary objects. These structures lack the advanced climate control and adaptive architecture found in our Martian dwellings.'
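
The selection seen above, where only the first few examples fit under `MAX_LENGTH`, can be approximated in plain Python. This is a rough sketch under assumptions: length is measured as a word count obtained by splitting on spaces and newlines, and `select_by_length` is a hypothetical helper rather than the library's code.

```python
import re


def word_count(text):
    """Count words by splitting on spaces and newlines."""
    return len(re.split(r"\n| ", text))


def select_by_length(examples, user_input, max_length):
    """Keep examples in order until the remaining word budget is used up."""
    remaining = max_length - word_count(user_input)
    selected = []
    for example in examples:
        formatted = f"Question: {example['query']}\nResponse: {example['answer']}"
        remaining -= word_count(formatted)
        if remaining < 0:
            break
        selected.append(example)
    return selected


examples = [
    {"query": "What is human entertainment?",
     "answer": "Crude moving images and loud sounds."},
    {"query": "How do humans maintain health?",
     "answer": "Consuming organic compounds and performing physical movements."},
    {"query": "How do humans manage their societies?",
     "answer": "Through chaotic and inefficient systems."},
]

question = "What are human homes like?"
for budget in (10, 35, 100):
    print(budget, len(select_by_length(examples, question, budget)))
```

A longer user question leaves less budget for examples, so the number of demonstrations shrinks automatically as the input grows.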

In [14]:
# `add_example()`
new_example = {"query": "What are the birds?", "answer": "Mere distractions, I imagine. Noisemakers to fill the void of their silent minds"}
few_shot_prompt_template.example_selector.add_example(new_example)
few_shot_prompt_template

FewShotPromptTemplate(input_variables=['userInput'], example_selector=LengthBasedExampleSelector(examples=[{'query': 'What is human cuisine like?', 'answer': "Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy."}, {'query': 'What is human entertainment?', 'answer': 'Crude moving images and loud sounds.'}, {'query': 'What do humans use for transportation?', 'answer': "Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels."}, {'query': 'How do humans communicate with each other?', 'answer': "They use a very basic form of communication involving the modulation of sound waves, referred to as 'speech.' Astonishingly primitive compared to our telepathic links."}, {'query': 'How do humans maintain health?', 'answer': 'Consuming organic comp

In [113]:
# `SemanticSimilarityExampleSelector()`
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_chroma.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

example_selector = SemanticSimilarityExampleSelector.from_examples(examples=examples, embeddings=OpenAIEmbeddings(), vectorstore_cls=Chroma, k=1)
query = "What is human brain made of?"
selected_examples = example_selector.select_examples({"query": query})
selected_examples

[{'answer': 'Crude moving images and loud sounds.',
  'query': 'What is human entertainment?'}]

In [114]:
few_shot_prompt_template = FewShotPromptTemplate(example_selector=example_selector, example_prompt=example_prompt, prefix=prefix, suffix=suffix, input_variables=["userInput"], example_separator="\n\n")
few_shot_prompt_template

FewShotPromptTemplate(input_variables=['userInput'], example_selector=SemanticSimilarityExampleSelector(vectorstore=<langchain_community.vectorstores.chroma.Chroma object at 0x7374a87785c0>, k=1, example_keys=None, input_keys=None, vectorstore_kwargs=None), example_prompt=PromptTemplate(input_variables=['answer', 'query'], template='\nQuestion: {query}\nResponse: {answer}\n'), suffix='\nQuestion: {userInput}\nResponse: \n', prefix='\nYou are an alien from Mars: \nHere are some examples: \n')

In [115]:
messages = few_shot_prompt_template.invoke({"userInput": "What are human homes like?"})
messages

StringPromptValue(text='\nYou are an alien from Mars: \nHere are some examples: \n\n\n\nQuestion: What is human entertainment?\nResponse: Crude moving images and loud sounds.\n\n\n\nQuestion: What are human homes like?\nResponse: \n')

In [116]:
response = llm.invoke(messages)
response.content

'Human homes are enclosed structures made from various materials such as wood, brick, and metal. They are designed to provide shelter and comfort, often containing multiple rooms with specific functions like sleeping, cooking, and socializing. These structures are equipped with systems for temperature control, water supply, and waste disposal. Humans decorate their homes with objects and colors that reflect their personal tastes and cultural influences.'

In [75]:
# `MaxMarginalRelevanceExampleSelector()`
from langchain_core.example_selectors import MaxMarginalRelevanceExampleSelector
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate
from langchain_community.vectorstores.faiss import FAISS
from langchain_openai import OpenAIEmbeddings

example_selector = MaxMarginalRelevanceExampleSelector.from_examples(examples, OpenAIEmbeddings(), FAISS, k=2)
example_selector

MaxMarginalRelevanceExampleSelector(vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x76cd33fa8890>, k=2, example_keys=None, input_keys=None, vectorstore_kwargs=None, fetch_k=20)

In [76]:
few_shot_prompt_template = FewShotPromptTemplate(example_selector=example_selector, example_prompt=example_prompt, prefix=prefix, suffix=suffix, input_variables=["userInput"], example_separator="\n\n")
few_shot_prompt_template

FewShotPromptTemplate(input_variables=['userInput'], example_selector=MaxMarginalRelevanceExampleSelector(vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x76cd33fa8890>, k=2, example_keys=None, input_keys=None, vectorstore_kwargs=None, fetch_k=20), example_prompt=PromptTemplate(input_variables=['answer', 'query'], template='\nQuestion: {query}\nResponse: {answer}\n'), suffix='\nQuestion: {userInput}\nResponse: \n', prefix='\nYou are an alien from Mars: \nHere are some examples: \n')

In [77]:
messages = few_shot_prompt_template.invoke({"userInput": "What are human homes like?"})
messages

StringPromptValue(text="\nYou are an alien from Mars: \nHere are some examples: \n\n\n\nQuestion: What do humans use for transportation?\nResponse: Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels.\n\n\n\nQuestion: What is human cuisine like?\nResponse: Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy.\n\n\n\nQuestion: What are human homes like?\nResponse: \n")
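
To show how MMR differs from plain similarity ranking, the sketch below runs both on toy 2-D vectors in plain Python. Everything here is illustrative: the vectors are made up instead of coming from an embedding model, and `mmr_select` with its `lambda_mult` weight is a simplified greedy version of the idea, not the vector store's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mmr_select(query_vec, candidate_vecs, k, lambda_mult=0.5):
    """Greedily pick k candidates, trading relevance against redundancy."""
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, candidate_vecs[i])
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [
    [1.0, 0.1],   # very relevant
    [1.0, 0.2],   # near-duplicate of the first candidate
    [0.6, -0.8],  # less relevant, but different from the first two
]

top_k = sorted(range(len(candidates)),
               key=lambda i: cosine(query, candidates[i]),
               reverse=True)[:2]
mmr = mmr_select(query, candidates, k=2)

print("similarity only:", top_k)
print("MMR:", mmr)
```

Plain similarity picks the two near-duplicates, while MMR swaps the second one for the more distinct candidate, which is the behavior that makes the selected examples complementary.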

In [3]:
# `NGramOverlapExampleSelector()`
from langchain_community.example_selectors import NGramOverlapExampleSelector

example_selector = NGramOverlapExampleSelector(examples=examples, example_prompt=example_prompt, threshold=0.0)
example_selector

NGramOverlapExampleSelector(examples=[{'query': 'What is human cuisine like?', 'answer': "Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy."}, {'query': 'What is human entertainment?', 'answer': 'Crude moving images and loud sounds.'}, {'query': 'What do humans use for transportation?', 'answer': "Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels."}, {'query': 'How do humans communicate with each other?', 'answer': "They use a very basic form of communication involving the modulation of sound waves, referred to as 'speech.' Astonishingly primitive compared to our telepathic links."}, {'query': 'How do humans maintain health?', 'answer': 'Consuming organic compounds and performing physical movements.'}, {'query': 'What is human 

In [4]:
few_shot_prompt_template = FewShotPromptTemplate(example_selector=example_selector, example_prompt=example_prompt, prefix=prefix, suffix=suffix, input_variables=["userInput"], example_separator="\n\n")
few_shot_prompt_template

FewShotPromptTemplate(input_variables=['userInput'], example_selector=NGramOverlapExampleSelector(examples=[{'query': 'What is human cuisine like?', 'answer': "Their cuisine is a simplistic combination of various organic matter, often heated in rudimentary ways. It's unrefined and unstructured, especially compared to our molecular gastronomy."}, {'query': 'What is human entertainment?', 'answer': 'Crude moving images and loud sounds.'}, {'query': 'What do humans use for transportation?', 'answer': "Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels."}, {'query': 'How do humans communicate with each other?', 'answer': "They use a very basic form of communication involving the modulation of sound waves, referred to as 'speech.' Astonishingly primitive compared to our telepathic links."}, {'query': 'How do humans maintain health?', 'answer': 'Consuming organic com

In [5]:
messages = few_shot_prompt_template.invoke({"userInput": "What are human homes like?"})
messages

StringPromptValue(text="\nYou are an alien from Mars: \nHere are some examples: \n\n\n\nQuestion: What do humans use for transportation?\nResponse: Humans rely on archaic and inefficient rolling contraptions they proudly call 'cars.' These are remarkably primitive compared to our teleportation beams and anti-gravity vessels.\n\n\n\nQuestion: What are human homes like?\nResponse: \n")
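
The effect of `threshold` can be illustrated with a simplified overlap score in plain Python. Note the assumptions: this sketch uses the fraction of shared unigrams as a stand-in for the BLEU-based score, compares the input against each example's `query` field, and `select_by_overlap` is a hypothetical helper.

```python
def ngrams(text, n=1):
    """Return the set of lowercase word n-grams in the text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_score(candidate, reference, n=1):
    """Fraction of the candidate's n-grams that also occur in the reference."""
    candidate_ngrams = ngrams(candidate, n)
    if not candidate_ngrams:
        return 0.0
    return len(candidate_ngrams & ngrams(reference, n)) / len(candidate_ngrams)


def select_by_overlap(examples, user_input, threshold=-1.0):
    """Sort examples by overlap score and keep those scoring above the threshold."""
    scored = [(overlap_score(user_input, ex["query"]), ex) for ex in examples]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for score, ex in scored if score > threshold]


examples = [
    {"query": "How do humans maintain health?"},
    {"query": "What do humans use for transportation?"},
    {"query": "What is human entertainment?"},
]

question = "What are human homes like?"
for threshold in (-1.0, 0.0, 1.5):
    kept = select_by_overlap(examples, question, threshold)
    print(threshold, [ex["query"] for ex in kept])
```

A negative threshold only reorders the examples, a threshold of `0.0` also drops examples with no overlap, and a threshold above `1.0` drops everything.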

## 2-4. Output Parsers

Pydantic:

1. `langchain_core.output_parsers.pydantic.PydanticOutputParser(diff, pydantic_object)`: Parses and validates an output and returns a Pydantic model instance.

JSON:

2. `langchain_core.output_parsers.json.JsonOutputParser(diff, pydantic_object)`: Parses the output of an LLM call to a JSON object.

XML:

3. `langchain_core.output_parsers.xml.XMLOutputParser(encoding_matcher, parser)`: Parses an output using XML format. Outputs `dict`.
   - Can be converted to XML using `dict2xml`, but further XML processing with `lxml` could be required.
   - Can also be converted to a pandas `DataFrame`, though further manipulation may be needed.

CSV:

4. `langchain_core.output_parsers.list.CommaSeparatedListOutputParser`: Parses the output of an LLM call to a comma-separated list.
   - Can write the output comma-separated list into a CSV file using `csv`.
   
Enum:

5. `langchain.output_parsers.enum.EnumOutputParser(enum)`: Parses an output that is one of a set of values.
    - Clarify the prompt. Make sure the prompt explicitly instructs the LLM to respond with only one of the enum values or you will get an error.
    
Structured:

6. `langchain.output_parsers.structured.StructuredOutputParser(response_schemas)`: Returns structured information.
   - Less powerful than `PydanticOutputParser` or `JsonOutputParser` since it only allows for fields to be `str`. Only useful for smaller LLMs.
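
The note on `EnumOutputParser` above says the prompt must restrict the LLM to the enum values. The plain-Python sketch below shows why, using only the standard `enum` module; `Colors` and `parse_enum` are hypothetical names, and this is an approximation of the idea rather than the parser's code.

```python
from enum import Enum


class Colors(Enum):
    RED = "red"
    GREEN = "green"
    BLUE = "blue"


def parse_enum(text, enum_cls):
    """Map a model response onto an enum member, or fail with a clear message."""
    try:
        return enum_cls(text.strip())
    except ValueError:
        valid = ", ".join(member.value for member in enum_cls)
        raise ValueError(
            f"Response {text!r} is not one of the expected values: {valid}"
        ) from None


print(parse_enum(" red\n", Colors))

try:
    parse_enum("The color is red.", Colors)
except ValueError as error:
    print(error)
```

Surrounding whitespace is easy to strip, but a full sentence cannot be mapped to a member, so the instruction to answer with one value only has to live in the prompt.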

- All of the output parsers above have the following methods:
    - `get_format_instructions()`
    - `invoke(input, config)`: Transforms a single input into an output.
    - `parse(text)`
    - `parse_from_prompt(completion, prompt)`: Parses the output of an LLM call with the input prompt for context.
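
As a small illustration of the comma-separated case, the sketch below splits a raw response into a list and writes it out with the standard `csv` module. The response text is made up, and `parse_comma_separated` and `to_csv_row` are hypothetical helpers; an in-memory buffer stands in for a real file.

```python
import csv
import io


def parse_comma_separated(text):
    """Split a raw response on commas and trim whitespace around each item."""
    return [item.strip() for item in text.strip().split(",") if item.strip()]


def to_csv_row(items):
    """Serialize the items as one CSV row."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(items)
    return buffer.getvalue()


items = parse_comma_separated("Margherita, Carbonara, Tiramisu")
csv_text = to_csv_row(items)

print(items)
print(csv_text)
```

Using `csv.writer` instead of joining with commas means items that themselves contain commas or quotes are escaped correctly.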

In [27]:
# `PydanticOutputParser()`
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, field_validator

# Define your desired data structure
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

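    # You can add custom validation logic easily with Pydantic
    @field_validator("setup")
    @classmethod
    def question_ends_with_question_mark(cls, field: str) -> str:
        # `endswith` also handles an empty string, where `field[-1]` would raise IndexError
        if not field.endswith("?"):
            raise ValueError("Badly formed question!")
        return field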

# Set up a parser + inject instructions into the prompt template
parser = PydanticOutputParser(pydantic_object=Joke)
parser

PydanticOutputParser(pydantic_object=<class '__main__.Joke'>)

In [55]:
format_instructions = parser.get_format_instructions()
format_instructions

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"setup": {"description": "question to set up a joke", "title": "Setup", "type": "string"}, "punchline": {"description": "answer to resolve the joke", "title": "Punchline", "type": "string"}}, "required": ["setup", "punchline"]}\n```'

In [56]:
prompt_template = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions},
)

# And a query intended to prompt a language model to populate the data structure.
chain = prompt_template | llm
output = chain.invoke({"query": "Tell me a joke."})
output

AIMessage(content='```json\n{\n  "setup": "Why don\'t scientists trust atoms?",\n  "punchline": "Because they make up everything!"\n}\n```', response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 206, 'total_tokens': 236}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_c4e5b6fa31', 'finish_reason': 'stop', 'logprobs': None}, id='run-fcc29177-7251-45ae-9bfd-c972484b4977-0', usage_metadata={'input_tokens': 206, 'output_tokens': 30, 'total_tokens': 236})

In [57]:
parser.invoke(output)

Joke(setup="Why don't scientists trust atoms?", punchline='Because they make up everything!')

In [58]:
parser.parse(output.content)

Joke(setup="Why don't scientists trust atoms?", punchline='Because they make up everything!')

In [59]:
parser.parse_with_prompt(completion=output.content, prompt=prompt_template)

Joke(setup="Why don't scientists trust atoms?", punchline='Because they make up everything!')

In [60]:
chain = prompt_template | llm | parser
chain.invoke({"query": "Tell me a joke."})

Joke(setup="Why don't scientists trust atoms?", punchline='Because they make up everything!')
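
Note that the raw `content` returned by the model above is wrapped in a Markdown `json` code fence, so it cannot be passed to `json.loads` directly. The sketch below shows, with the standard library only, roughly what has to happen before validation. `extract_json` is a hypothetical helper, and LangChain's own parsing logic may differ in detail.

```python
import json
import re

FENCE = "`" * 3  # built programmatically to avoid nesting literal fences


def extract_json(text: str) -> dict:
    """Hypothetical helper: strip an optional Markdown code fence, then parse JSON."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Same shape as the AIMessage content shown earlier.
raw = (
    FENCE + "json\n"
    '{\n  "setup": "Why don\'t scientists trust atoms?",\n'
    '  "punchline": "Because they make up everything!"\n}\n' + FENCE
)
joke = extract_json(raw)
print(joke["setup"])
```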

In [74]:
# `JsonOutputParser()`
from langchain_core.output_parsers.json import JsonOutputParser

parser = JsonOutputParser(pydantic_object=Joke)
parser

JsonOutputParser(pydantic_object=<class '__main__.Joke'>)

In [75]:
format_instructions = parser.get_format_instructions()
format_instructions

'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"properties": {"setup": {"description": "question to set up a joke", "title": "Setup", "type": "string"}, "punchline": {"description": "answer to resolve the joke", "title": "Punchline", "type": "string"}}, "required": ["setup", "punchline"]}\n```'

In [76]:
prompt_template = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions},
)

# And a query intended to prompt a language model to populate the data structure.
chain = prompt_template | llm | parser
chain.invoke({"query": "Tell me a joke."})

{'setup': "Why don't scientists trust atoms?",
 'punchline': 'Because they make up everything!'}

In [106]:
# `XMLOutputParser()`
from langchain.output_parsers.xml import XMLOutputParser

parser = XMLOutputParser()
parser

XMLOutputParser()

In [110]:
format_instructions = parser.get_format_instructions()
format_instructions

'The output should be formatted as a XML file.\n1. Output should conform to the tags below. \n2. If tags are not given, make them on your own.\n3. Remember to always open and close all the tags.\n\nAs an example, for the tags ["foo", "bar", "baz"]:\n1. String "<foo>\n   <bar>\n      <baz></baz>\n   </bar>\n</foo>" is a well-formatted instance of the schema. \n2. String "<foo>\n   <bar>\n   </foo>" is a badly-formatted instance.\n3. String "<foo>\n   <tag>\n   </tag>\n</foo>" is a badly-formatted instance.\n\nHere are the output tags:\n```\nNone\n```'

In [111]:
prompt_template = PromptTemplate(
    template="""{query}\n{format_instructions}""",
    input_variables=["query"],
    partial_variables={"format_instructions": format_instructions},
)

chain = prompt_template | llm | parser
parser_output = chain.invoke({"query": "Generate the shortened filmography for Tom Hanks."})
parser_output

{'filmography': [{'actor': [{'name': 'Tom Hanks'},
    {'movies': [{'movie': [{'title': 'Big'},
        {'year': '1988'},
        {'role': 'Josh Baskin'}]},
      {'movie': [{'title': 'Forrest Gump'},
        {'year': '1994'},
        {'role': 'Forrest Gump'}]},
      {'movie': [{'title': 'Sleepless in Seattle'},
        {'year': '1993'},
        {'role': 'Sam Baldwin'}]},
      {'movie': [{'title': 'Saving Private Ryan'},
        {'year': '1998'},
        {'role': 'Captain John H. Miller'}]},
      {'movie': [{'title': 'Cast Away'},
        {'year': '2000'},
        {'role': 'Chuck Noland'}]},
      {'movie': [{'title': 'The Terminal'},
        {'year': '2004'},
        {'role': 'Viktor Navorski'}]},
      {'movie': [{'title': 'Catch Me If You Can'},
        {'year': '2002'},
        {'role': 'Carl Hanratty'}]},
      {'movie': [{'title': 'Bridge of Spies'},
        {'year': '2015'},
        {'role': 'James B. Donovan'}]},
      {'movie': [{'title': 'Sully'},
        {'year': '2016'},

In [194]:
# Write to an XML file
from dict2xml import dict2xml

xml = dict2xml(parser_output)
with open('parser_output.xml', 'w') as f:
    # `xml` is a single string, so write it in one call
    f.write(xml)

In [191]:
from lxml import etree

tree = etree.fromstring(xml)
for movies in tree.findall('.//movies'):
    title = movies.find('.//title').text
    year = movies.find('.//year').text
    role = movies.find('.//role').text
    
    movie_lst = movies.findall('.//movie')
    for movie in movie_lst:
        movies.remove(movie)
    movies.append(etree.Element('movie'))
    movie = movies.find('.//movie')
    
    title_el = etree.Element('title')
    title_el.text = title
    year_el = etree.Element('year')
    year_el.text = year
    role_el = etree.Element('role')
    role_el.text = role
    
    movie.append(title_el)
    movie.append(year_el)
    movie.append(role_el)
xml_parsed = etree.tostring(tree).decode()
xml_parsed

"<filmography>\n  <actor>\n    <name>Tom Hanks</name>\n  </actor>\n  <actor>\n    <movies>\n      <movie><title>Big</title><year>1988</year><role>Josh Baskin</role></movie></movies>\n    <movies>\n      <movie><title>Forrest Gump</title><year>1994</year><role>Forrest Gump</role></movie></movies>\n    <movies>\n      <movie><title>Sleepless in Seattle</title><year>1993</year><role>Sam Baldwin</role></movie></movies>\n    <movies>\n      <movie><title>Saving Private Ryan</title><year>1998</year><role>Captain John H. Miller</role></movie></movies>\n    <movies>\n      <movie><title>Cast Away</title><year>2000</year><role>Chuck Noland</role></movie></movies>\n    <movies>\n      <movie><title>The Terminal</title><year>2004</year><role>Viktor Navorski</role></movie></movies>\n    <movies>\n      <movie><title>Catch Me If You Can</title><year>2002</year><role>Carl Hanratty</role></movie></movies>\n    <movies>\n      <movie><title>Bridge of Spies</title><year>2015</year><role>James B. Donova

In [195]:
with open('parser_output_parsed.xml', 'w') as f:
    # `xml_parsed` is a single string, so write it in one call
    f.write(xml_parsed)

In [151]:
# Convert to pandas `DataFrame`
import pandas as pd

df = pd.DataFrame(parser_output["filmography"][0]["actor"][1]["movies"])
df

| | movie |
|---|---|
| 0 | [{'title': 'Big'}, {'year': '1988'}, {'role': ... |
| 1 | [{'title': 'Forrest Gump'}, {'year': '1994'}, ... |
| 2 | [{'title': 'Sleepless in Seattle'}, {'year': '... |
| 3 | [{'title': 'Saving Private Ryan'}, {'year': '1... |
| 4 | [{'title': 'Cast Away'}, {'year': '2000'}, {'r... |
| 5 | [{'title': 'The Terminal'}, {'year': '2004'}, ... |
| 6 | [{'title': 'Catch Me If You Can'}, {'year': '2... |
| 7 | [{'title': 'Bridge of Spies'}, {'year': '2015'... |
| 8 | [{'title': 'Sully'}, {'year': '2016'}, {'role'... |
| 9 | [{'title': 'A Beautiful Day in the Neighborhoo... |


In [164]:
# Each entry of `df.movie` is already a list of three single-key dicts
# (title, year, role), so no JSON round trip is needed to unpack it.
df_parsed = pd.DataFrame(
    [
        (title['title'], year['year'], role['role']) for title, year, role in df.movie
    ],
    columns=['title', 'year', 'role']
)
df_parsed

| | title | year | role |
|---|---|---|---|
| 0 | Big | 1988 | Josh Baskin |
| 1 | Forrest Gump | 1994 | Forrest Gump |
| 2 | Sleepless in Seattle | 1993 | Sam Baldwin |
| 3 | Saving Private Ryan | 1998 | Captain John H. Miller |
| 4 | Cast Away | 2000 | Chuck Noland |
| 5 | The Terminal | 2004 | Viktor Navorski |
| 6 | Catch Me If You Can | 2002 | Carl Hanratty |
| 7 | Bridge of Spies | 2015 | James B. Donovan |
| 8 | Sully | 2016 | Chesley 'Sully' Sullenberger |
| 9 | A Beautiful Day in the Neighborhood | 2019 | Fred Rogers |
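
The same flattening can also be done before building the `DataFrame`. A minimal sketch in plain Python, assuming each movie is a list of single-key dictionaries as in `parser_output` above (the two sample rows are copied from that output; `flatten_movie` is a hypothetical helper):

```python
def flatten_movie(fields: list[dict]) -> dict:
    """Merge a list of single-key dicts into one flat dict."""
    merged = {}
    for field in fields:
        merged.update(field)
    return merged


movies = [
    {"movie": [{"title": "Big"}, {"year": "1988"}, {"role": "Josh Baskin"}]},
    {"movie": [{"title": "Forrest Gump"}, {"year": "1994"}, {"role": "Forrest Gump"}]},
]
rows = [flatten_movie(entry["movie"]) for entry in movies]
print(rows)
```

Passing `rows` to `pd.DataFrame` would then give the `title`, `year`, and `role` columns directly.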


In [5]:
# `CommaSeparatedListOutputParser()`
from langchain.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()
parser

CommaSeparatedListOutputParser()

In [6]:
format_instructions = parser.get_format_instructions()
format_instructions

'Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`'

In [8]:
from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate(
    template="List 3 main-stream {subject}.\n{format_instructions}",
    input_variables=["subject"],
    partial_variables={"format_instructions": format_instructions}
)

chain = prompt_template | llm | parser
parser_output = chain.invoke({"subject": "music styles"})
parser_output

['Pop', 'Rock', 'Hip-Hop']

In [13]:
# Write to a CSV file
import csv

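# `newline=''` is recommended by the `csv` module to avoid blank lines on some platforms
with open('parser_output.csv', 'w', newline='') as f: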
    writer = csv.writer(f)
    writer.writerow(parser_output)
    # for list of list
    # writer.writerows(parser_output)

In [65]:
# `EnumOutputParser()`
from langchain.output_parsers.enum import EnumOutputParser
from enum import Enum

class Genders(Enum):
    MALE = "male"
    FEMALE = "female"

parser = EnumOutputParser(enum=Genders)
parser

EnumOutputParser(enum=<enum 'Genders'>)

In [66]:
format_instructions = parser.get_format_instructions()
format_instructions

'Select one of the following options: male, female'
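
The format instructions list the options, but the reply still has to match one of them exactly. The standard library alone shows why: an `Enum` lookup by value succeeds only on an exact match, so any extra words in the reply fail. This is a sketch of the underlying behavior, not of `EnumOutputParser` internals, and the verbose reply string is a made-up example.

```python
from enum import Enum


class Genders(Enum):
    MALE = "male"
    FEMALE = "female"


# An exact value maps cleanly onto the enum member.
exact = Genders("male")

# A chatty reply (hypothetical) does not match any value and raises ValueError.
try:
    Genders("Michael Jordan is male.")
    failed = False
except ValueError:
    failed = True

print(exact, failed)
```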

In [71]:
# Make sure the prompt explicitly instructs the LLM to respond with only one of the enum values
prompt_template = PromptTemplate(
    template="Tell me the gender of the celebrity {name}. Respond only with 'male' or 'female'.\n{format_instructions}",
    input_variables=["name"],
    partial_variables={"format_instructions": format_instructions}
)

chain = prompt_template | llm | parser
chain.invoke({"name": "Michael Jordan"})

<Genders.MALE: 'male'>

In [97]:
# `StructuredOutputParser()`
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

gift_schema = ResponseSchema(name="gift",
                             description="Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.")
delivery_days_schema = ResponseSchema(name="delivery_days",
                                      description="How many days did it take for the product to arrive? If this information is not found, output -1.")
price_value_schema = ResponseSchema(name="price_value",
                                    description="Extract any sentences about the value or price, and output them as a comma separated Python list.")
response_schemas = [gift_schema, delivery_days_schema, price_value_schema]
response_schemas

[ResponseSchema(name='gift', description='Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.', type='string'),
 ResponseSchema(name='delivery_days', description='How many days did it take for the product to arrive? If this information is not found, output -1.', type='string'),
 ResponseSchema(name='price_value', description='Extract any sentences about the value or price, and output them as a comma separated Python list.', type='string')]

In [98]:
parser = StructuredOutputParser.from_response_schemas(response_schemas)
parser

StructuredOutputParser(response_schemas=[ResponseSchema(name='gift', description='Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.', type='string'), ResponseSchema(name='delivery_days', description='How many days did it take for the product to arrive? If this information is not found, output -1.', type='string'), ResponseSchema(name='price_value', description='Extract any sentences about the value or price, and output them as a comma separated Python list.', type='string')])

In [100]:
format_instructions = parser.get_format_instructions()
format_instructions

'The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":\n\n```json\n{\n\t"gift": string  // Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.\n\t"delivery_days": string  // How many days did it take for the product to arrive? If this information is not found, output -1.\n\t"price_value": string  // Extract any sentences about the value or price, and output them as a comma separated Python list.\n}\n```'

In [105]:
customer_review = """
This leaf blower is pretty amazing. It has four settings: candle blower, gentle breeze, windy city, and tornado.
It arrived in two days, just in time for my wife's anniversary present.
I think my wife liked it so much she was speechless.
So far I've been the only one using it, and I've been using it every other morning to clear the leaves on our lawn.
It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features.
"""

review_template = """
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else?
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

prompt_template = PromptTemplate(template=review_template, input_variables=["text"], partial_variables={"format_instructions": format_instructions})

chain = prompt_template | llm | parser
chain.invoke({"text": customer_review})

{'gift': 'True',
 'delivery_days': '2',
 'price_value': ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]}
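
Because `gift` and `delivery_days` come back as strings, downstream code usually has to convert types itself. A minimal sketch, assuming the output dictionary shown above; `coerce_review` is a hypothetical helper:

```python
def coerce_review(parsed: dict) -> dict:
    """Convert string fields from the parser output to Python types."""
    price = parsed["price_value"]
    return {
        "gift": str(parsed["gift"]).strip().lower() == "true",
        "delivery_days": int(parsed["delivery_days"]),
        # Accept either a single string or an existing list of sentences.
        "price_value": [price] if isinstance(price, str) else list(price),
    }


parsed = {
    "gift": "True",
    "delivery_days": "2",
    "price_value": [
        "It's slightly more expensive than the other leaf blowers out there, "
        "but I think it's worth it for the extra features."
    ],
}
review = coerce_review(parsed)
print(review["gift"], review["delivery_days"])
```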

## 2-5. Chains

1. **LCEL**
2. `RunnableWithMessageHistory()`
   - Offers several benefits over deprecated `ConversationChain`:
       - Stream, batch & async support.
       - More flexible memory handling, including the ability to manage memory outside the chain.
       - Support for multiple threads. 
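
LCEL composes components with the `|` operator, as in `prompt_template | llm | parser`. The toy below illustrates the idea using Python's `__or__` overload. `Step` is a hypothetical class for illustration and is far simpler than LangChain's runnables.

```python
from typing import Any, Callable


class Step:
    """Hypothetical toy runnable: wraps a function and supports `|` chaining."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b produces a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for prompt -> model -> parser (the "model" here just upper-cases text).
toy_prompt_step = Step(lambda d: f"Tell me about {d['topic']}.")
fake_llm = Step(lambda text: text.upper())
toy_parser_step = Step(lambda text: text.split())

toy_chain = toy_prompt_step | fake_llm | toy_parser_step
toy_chain_result = toy_chain.invoke({"topic": "cosine"})
print(toy_chain_result)
```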

## 2-6. Chat History & Memory

Chat History:

1. `langchain_core.chat_history.BaseChatMessageHistory`: Abstract base class for storing chat message history.
2. `langchain_core.chat_history.InMemoryChatMessageHistory(messages)`: Stores messages in an in-memory list.

Memory:
- These classes were built for the deprecated `ConversationChain`. Their drawback is that they cannot distinguish between different users, so you have to implement user session management in the backend to keep track.
- The same behavior is not difficult to implement on top of chat history.

3. `langchain.memory.buffer.ConversationBufferMemory`
4. `langchain.memory.buffer_window.ConversationBufferWindowMemory`
5. `langchain.memory.token_buffer.ConversationTokenBufferMemory`
6. `langchain.memory.summary.ConversationSummaryMemory`
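
As an illustration of how little is needed to reproduce the windowed variant on top of plain chat history, here is a sketch using `collections.deque`. `WindowedHistory` is a hypothetical class, not a LangChain API; it keeps only the last `k` human/AI exchanges, which is the idea behind `ConversationBufferWindowMemory`.

```python
from collections import deque


class WindowedHistory:
    """Hypothetical sketch: keep only the last k (human, ai) exchanges."""

    def __init__(self, k: int):
        self.exchanges = deque(maxlen=k)  # older exchanges are dropped automatically

    def add_exchange(self, human: str, ai: str) -> None:
        self.exchanges.append((human, ai))

    def messages(self) -> list[tuple[str, str]]:
        # Flatten to (role, content) pairs in chronological order.
        out = []
        for human, ai in self.exchanges:
            out.append(("human", human))
            out.append(("ai", ai))
        return out


windowed = WindowedHistory(k=2)
windowed.add_exchange("Hi", "Hello!")
windowed.add_exchange("What does cosine mean?", "Adjacent over hypotenuse.")
windowed.add_exchange("What's its inverse?", "Arccosine.")
print(windowed.messages())
```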

In [17]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're an assistant who's good at {ability}. Respond in 20 words or fewer",
        ),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{input}"),
    ]
)

chain = prompt_template | llm
chain

ChatPromptTemplate(input_variables=['ability', 'history', 'input'], input_types={'history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['ability'], template="You're an assistant who's good at {ability}. Respond in 20 words or fewer")), MessagesPlaceholder(variable_name='history'), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['input'], template='{input}'))])
| ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x7be11657b230>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x7be116598590>, model_name='gpt-4o', temperature=0.0, openai_api_key=SecretStr('**********'), openai_proxy='')

In [18]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)
with_message_history

RunnableWithMessageHistory(bound=RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  history: RunnableBinding(bound=RunnableLambda(_enter_history), config={'run_name': 'load_history'})
}), config={'run_name': 'insert_history'})
| RunnableBranch(branches=[(RunnableBinding(bound=RunnableLambda(_is_not_async), config={'run_name': 'RunnableWithMessageHistoryInAsyncMode'}), RunnableBinding(bound=ChatPromptTemplate(input_variables=['ability', 'history', 'input'], input_types={'history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['ability'], template="You're an assistant who's good at {ability}. Respond in 20 words or fewer")), MessagesPlaceholder(variable_

In [19]:
with_message_history.invoke(
    {"ability": "math", "input": "What does cosine mean?"},
    config={"configurable": {"session_id": "abc123"}},
)

AIMessage(content='Cosine is a trigonometric function representing the adjacent side over the hypotenuse in a right-angled triangle.', response_metadata={'token_usage': {'completion_tokens': 24, 'prompt_tokens': 31, 'total_tokens': 55}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_c4e5b6fa31', 'finish_reason': 'stop', 'logprobs': None}, id='run-75e249db-acaa-4237-a3fd-8bbce9e5cbbc-0', usage_metadata={'input_tokens': 31, 'output_tokens': 24, 'total_tokens': 55})

In [20]:
# Remembers
with_message_history.invoke(
    {"ability": "math", "input": "What?"},
    config={"configurable": {"session_id": "abc123"}},
)

AIMessage(content='Cosine is a trigonometric function that measures the ratio of the adjacent side to the hypotenuse in a right triangle.', response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 65, 'total_tokens': 91}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_c4e5b6fa31', 'finish_reason': 'stop', 'logprobs': None}, id='run-e4975a40-00a6-4057-a848-810de98fde55-0', usage_metadata={'input_tokens': 65, 'output_tokens': 26, 'total_tokens': 91})

In [21]:
# New session_id --> does not remember.
with_message_history.invoke(
    {"ability": "math", "input": "What?"},
    config={"configurable": {"session_id": "def234"}},
)

AIMessage(content='Hi! How can I assist you with math today?', response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 28, 'total_tokens': 39}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_c4e5b6fa31', 'finish_reason': 'stop', 'logprobs': None}, id='run-0e9362f8-1bf1-43a1-9d7b-e51dc1e0d690-0', usage_metadata={'input_tokens': 28, 'output_tokens': 11, 'total_tokens': 39})

In [24]:
from langchain_core.runnables import ConfigurableFieldSpec

store = {}

def get_session_history(user_id: str, conversation_id: str) -> BaseChatMessageHistory:
    if (user_id, conversation_id) not in store:
        store[(user_id, conversation_id)] = ChatMessageHistory()
    return store[(user_id, conversation_id)]


with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
    history_factory_config=[
        ConfigurableFieldSpec(
            id="user_id",
            annotation=str,
            name="User ID",
            description="Unique identifier for the user.",
            default="",
            is_shared=True,
        ),
        ConfigurableFieldSpec(
            id="conversation_id",
            annotation=str,
            name="Conversation ID",
            description="Unique identifier for the conversation.",
            default="",
            is_shared=True,
        ),
    ],
)
with_message_history

RunnableWithMessageHistory(bound=RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  history: RunnableBinding(bound=RunnableLambda(_enter_history), config={'run_name': 'load_history'})
}), config={'run_name': 'insert_history'})
| RunnableBranch(branches=[(RunnableBinding(bound=RunnableLambda(_is_not_async), config={'run_name': 'RunnableWithMessageHistoryInAsyncMode'}), RunnableBinding(bound=ChatPromptTemplate(input_variables=['ability', 'history', 'input'], input_types={'history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['ability'], template="You're an assistant who's good at {ability}. Respond in 20 words or fewer")), MessagesPlaceholder(variable_

In [25]:
with_message_history.invoke(
    {"ability": "math", "input": "Hello"},
    config={"configurable": {"user_id": "123", "conversation_id": "1"}},
)

AIMessage(content='Hi there! How can I assist you with math today?', response_metadata={'token_usage': {'completion_tokens': 12, 'prompt_tokens': 27, 'total_tokens': 39}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_c4e5b6fa31', 'finish_reason': 'stop', 'logprobs': None}, id='run-521b6cf7-caab-457e-bfed-d216142534e9-0', usage_metadata={'input_tokens': 27, 'output_tokens': 12, 'total_tokens': 39})

In [5]:
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a pirate. Answer the following questions as best you can."),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
    ]
)

history = InMemoryChatMessageHistory()

chain = prompt_template | ChatOpenAI() | StrOutputParser()

wrapped_chain = RunnableWithMessageHistory(chain, lambda x: history)

wrapped_chain.invoke(
    {"input": "how are you?"},
    config={"configurable": {"session_id": "42"}},
)

"Arrr, I be doin' well, matey! Ready to set sail on the high seas and plunder some treasures. How be ye doin'?"

In [6]:
history

InMemoryChatMessageHistory(messages=[HumanMessage(content='how are you?'), AIMessage(content="Arrr, I be doin' well, matey! Ready to set sail on the high seas and plunder some treasures. How be ye doin'?")])

In [27]:
# pip3 install redis
# docker run -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
from langchain_community.chat_message_histories import RedisChatMessageHistory

REDIS_URL = "redis://localhost:6379/0"

def get_message_history(session_id: str) -> RedisChatMessageHistory:
    return RedisChatMessageHistory(session_id, url=REDIS_URL)

with_message_history = RunnableWithMessageHistory(
    chain,
    get_message_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)
with_message_history

RunnableWithMessageHistory(bound=RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  chat_history: RunnableBinding(bound=RunnableLambda(_enter_history), config={'run_name': 'load_history'})
}), config={'run_name': 'insert_history'})
| RunnableBranch(branches=[(RunnableBinding(bound=RunnableLambda(_is_not_async), config={'run_name': 'RunnableWithMessageHistoryInAsyncMode'}), RunnableBinding(bound=ChatPromptTemplate(input_variables=['input'], optional_variables=['chat_history'], input_types={'chat_history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]}, partial_variables={'chat_history': []}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a pirate. Answer the following questions as

In [28]:
# The pirate prompt only takes `input`, so no `ability` key is needed here.
with_message_history.invoke(
    {"input": "What does cosine mean?"},
    config={"configurable": {"session_id": "foobar"}},
)

"Arrr, ye be askin' me about cosine again, eh? The cosine be a trigonometric function that relates to the ratio of the length of the adjacent side of a right triangle to the length of the hypotenuse. It be helpin' us pirates calculate angles and distances when sailin' the high seas. Aye, cosine be a valuable tool in our navigational arsenal, me hearty!"

In [29]:
# Same session_id, so the earlier question is available as history.
with_message_history.invoke(
    {"input": "What's its inverse"},
    config={"configurable": {"session_id": "foobar"}},
)

"Arrr, the inverse of the cosine function be called arccosine or cos^-1. It be the function that gives ye the angle whose cosine be a given number. It be useful for findin' the original angle from the cosine value, matey. So if ye know the cosine of an angle, ye can use the arccosine to find the angle itself. Aye, it be a handy tool for us pirates to have in our navigational toolkit!"

In [34]:
# ~$ docker exec -it <CONTAINER ID> redis-cli
# 127.0.0.1:6379> KEYS *
# 1) "message_store:foobar"
# 127.0.0.1:6379> LRANGE message_store:foobar 0 -1

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)
r.lrange("message_store:foobar", 0, -1)

['{"type": "ai", "data": {"content": "Arrr, the inverse of the cosine function be called arccosine or cos^-1. It be the function that gives ye the angle whose cosine be a given number. It be useful for findin\' the original angle from the cosine value, matey. So if ye know the cosine of an angle, ye can use the arccosine to find the angle itself. Aye, it be a handy tool for us pirates to have in our navigational toolkit!", "additional_kwargs": {}, "response_metadata": {}, "type": "ai", "name": null, "id": null, "example": false, "tool_calls": [], "invalid_tool_calls": [], "usage_metadata": null}}',
 '{"type": "human", "data": {"content": "What\'s its inverse", "additional_kwargs": {}, "response_metadata": {}, "type": "human", "name": null, "id": null, "example": false}}',
 '{"type": "ai", "data": {"content": "Arrr, ye be askin\' me about cosine again, eh? The cosine be a trigonometric function that relates to the ratio of the length of the adjacent side of a right triangle to the lengt
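
Each list entry is a JSON string, and the newest message appears first in the output above. A small sketch of decoding such entries with the standard library; the two sample strings are shortened, hypothetical stand-ins for the real entries:

```python
import json

entries = [
    '{"type": "ai", "data": {"content": "Arrr, it be called arccosine."}}',
    '{"type": "human", "data": {"content": "What\'s its inverse"}}',
]

# Reverse to get chronological order, then pull out the role and the text.
conversation = [
    (msg["type"], msg["data"]["content"])
    for msg in map(json.loads, reversed(entries))
]
print(conversation)
```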

## 2-7. Document Loaders

LangChain has hundreds of integrations with various data sources to load data from, such as Google Drive, Amazon Simple Storage Service (Amazon S3), Blockchain, Microsoft PowerPoint & more.
- From `langchain_community`. Additional setups may be required for specific loaders.
- [Document Loaders](https://python.langchain.com/v0.2/docs/integrations/document_loaders/)

Text:

1. `TextLoader`

File Directory:

2. `DirectoryLoader`

CSV:

3. `CSVLoader`

JSON:

4. `JSONLoader`

HTML:

5. `UnstructuredHTMLLoader`
6. `langchain_community.document_loaders.WebBaseLoader`

PDF:

7. `PyPDFLoader`

In [35]:
from langchain_community.document_loaders import TextLoader

loader = TextLoader("./README.md")
loader.load()

[Document(metadata={'source': './README.md'}, page_content='# PyTorch Natural Language Processing\n')]

In [36]:
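# `TextLoader` only reads local files, so passing it a URL fails with
# `RuntimeError: Error loading https://www.nytimes.com/`.
# `UnstructuredHTMLLoader` also expects a local HTML file path.
# To fetch and parse a web page, use `WebBaseLoader` instead.
# !pip3 install beautifulsoup4
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://www.nytimes.com/")
loader.load()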

In [39]:
# !pip3 install pypdf
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("./data/1706.03762v7.pdf")
pages = loader.load_and_split()
pages[0]

Document(metadata={'source': './data/1706.03762v7.pdf', 'page': 0}, page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.comNoam Shazeer∗\nGoogle Brain\nnoam@google.comNiki Parmar∗\nGoogle Research\nnikip@google.comJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.comAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.eduŁukasz Kaiser∗\nGoogle Brain\nlukaszkaiser@google.com\nIllia Polosukhin∗ ‡\nillia.polosukhin@gmail.com\nAbstract\nThe dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks that include an encoder and a decoder. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Trans

## 2-8. Text Splitters

Base interface:

1. `langchain_text_splitters.base.TextSplitter(chunk_size=4000, chunk_overlap=200, length_function=<built-in function len>, keep_separator=False, add_start_index=False, strip_whitespace=True)`: Base interface that all LangChain text splitters inherit from. Setting `chunk_overlap` helps maintain context between chunks, so information isn't lost at chunk boundaries. If `add_start_index` is `True`, each chunk's start index is included in its metadata.
    - `split_text(text)`
    - `split_documents(documents)`
    - `from_tiktoken_encoder(encoding_name='gpt2', model_name=None, allowed_special={}, disallowed_special='all', **kwargs)`
    - `from_huggingface_tokenizer(tokenizer, **kwargs)`
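
To see what `chunk_size`, `chunk_overlap`, and `add_start_index` mean in practice, here is a simplified sliding-window sketch in plain Python. It cuts on raw character positions only, whereas the real splitters also respect separators; `sliding_chunks` is a hypothetical helper.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[dict]:
    """Fixed-size character windows; each window overlaps the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({"start_index": start, "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


overlap_chunks = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(overlap_chunks)
```

Each chunk repeats the last two characters of its predecessor, so content that straddles a boundary still appears whole in at least one chunk.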

Semantic chunking:

Split by tokens:

1. `langchain_text_splitters.base.TokenTextSplitter(encoding_name='gpt2', model_name=None, allowed_special={}, disallowed_special='all', **kwargs)`: Splits text into tokens using a model tokenizer.
2. `langchain_text_splitters.character.CharacterTextSplitter(separator='\n\n', is_separator_regex=False, **kwargs)`: Splits text on a single character separator.
3. `langchain_text_splitters.character.RecursiveCharacterTextSplitter(separators=None, keep_separator=True, is_separator_regex=False, **kwargs)`: Splits text by recursively trying a list of separator characters.
   - `from_language(language, **kwargs)`: splits code.
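
The recursive idea can be sketched in a few lines of plain Python. This is a simplified illustration, not LangChain's code: `recursive_split` is a hypothetical helper, the separator list is an assumption chosen for the example, and the sketch neither merges small pieces back together nor applies chunk overlap. It also hints at why a single-separator splitter can report chunks longer than `chunk_size`: with only one separator there is no finer level to fall back to.

```python
def recursive_split(text: str, separators: list[str], chunk_size: int) -> list[str]:
    """Split on the first separator; recurse with finer separators on oversized pieces.

    Hypothetical helper for illustration only; not part of LangChain.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    first, rest = separators[0], separators[1:]
    pieces = []
    for part in text.split(first):
        pieces.extend(recursive_split(part, rest, chunk_size))
    return pieces


sample = "Act one. Scene one.\n\nEnter two sentinels.\nWho goes there?"
pieces = recursive_split(sample, ["\n\n", "\n", " "], chunk_size=20)
for piece in pieces:
    print(len(piece), repr(piece))
```

The paragraph break is tried first; only the second paragraph is still too long, so it alone is split again on single newlines.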

Split HTML:

Split Markdown:

In [14]:
# `CharacterTextSplitter()`
from langchain_text_splitters import CharacterTextSplitter

with open("./datasets/hamlet.txt") as f:
    hamlet = f.read()

text_splitter = CharacterTextSplitter.from_tiktoken_encoder(encoding_name="cl100k_base", chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_text(hamlet)
texts[0]

Created a chunk of size 1066, which is longer than the specified 1000
Created a chunk of size 1992, which is longer than the specified 1000
Created a chunk of size 1317, which is longer than the specified 1000
Created a chunk of size 1070, which is longer than the specified 1000
Created a chunk of size 1364, which is longer than the specified 1000
Created a chunk of size 1849, which is longer than the specified 1000
Created a chunk of size 1112, which is longer than the specified 1000
Created a chunk of size 1370, which is longer than the specified 1000
Created a chunk of size 1265, which is longer than the specified 1000
Created a chunk of size 1465, which is longer than the specified 1000
Created a chunk of size 1624, which is longer than the specified 1000
Created a chunk of size 1227, which is longer than the specified 1000
Created a chunk of size 1144, which is longer than the specified 1000
Created a chunk of size 1010, which is longer than the specified 1000
Created a chunk of s

"THE TRAGEDY OF HAMLET, PRINCE OF DENMARK\n\n\nby William Shakespeare\n\nDramatis Personae\n\n  Claudius, King of Denmark.\n  Marcellus, Officer.\n  Hamlet, son to the former, and nephew to the present king.\n  Polonius, Lord Chamberlain.\n  Horatio, friend to Hamlet.\n  Laertes, son to Polonius.\n  Voltemand, courtier.\n  Cornelius, courtier.\n  Rosencrantz, courtier.\n  Guildenstern, courtier.\n  Osric, courtier.\n  A Gentleman, courtier.\n  A Priest.\n  Marcellus, officer.\n  Bernardo, officer.\n  Francisco, a soldier\n  Reynaldo, servant to Polonius.\n  Players.\n  Two Clowns, gravediggers.\n  Fortinbras, Prince of Norway.  \n  A Norwegian Captain.\n  English Ambassadors.\n\n  Getrude, Queen of Denmark, mother to Hamlet.\n  Ophelia, daughter to Polonius.\n\n  Ghost of Hamlet's Father.\n\n  Lords, ladies, Officers, Soldiers, Sailors, Messengers, Attendants.\n\nSCENE.- Elsinore.\n\n\nACT I. Scene I.\nElsinore. A platform before the Castle.\n\nEnter two Sentinels-[first,] Francisco, [

In [36]:
# Hugging Face tokenizer
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

text_splitter = CharacterTextSplitter.from_huggingface_tokenizer(tokenizer=tokenizer, chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_text(hamlet)
texts[0]

Token indices sequence length is longer than the specified maximum sequence length for this model (1323 > 1024). Running this sequence through the model will result in indexing errors
Created a chunk of size 1323, which is longer than the specified 1000
Created a chunk of size 2474, which is longer than the specified 1000
Created a chunk of size 1575, which is longer than the specified 1000
Created a chunk of size 1368, which is longer than the specified 1000
Created a chunk of size 1023, which is longer than the specified 1000
Created a chunk of size 1676, which is longer than the specified 1000
Created a chunk of size 1208, which is longer than the specified 1000
Created a chunk of size 2120, which is longer than the specified 1000
Created a chunk of size 1403, which is longer than the specified 1000
Created a chunk of size 1622, which is longer than the specified 1000
Created a chunk of size 1537, which is longer than the specified 1000
Created a chunk of size 1880, which is longer 

"THE TRAGEDY OF HAMLET, PRINCE OF DENMARK\n\n\nby William Shakespeare\n\nDramatis Personae\n\n  Claudius, King of Denmark.\n  Marcellus, Officer.\n  Hamlet, son to the former, and nephew to the present king.\n  Polonius, Lord Chamberlain.\n  Horatio, friend to Hamlet.\n  Laertes, son to Polonius.\n  Voltemand, courtier.\n  Cornelius, courtier.\n  Rosencrantz, courtier.\n  Guildenstern, courtier.\n  Osric, courtier.\n  A Gentleman, courtier.\n  A Priest.\n  Marcellus, officer.\n  Bernardo, officer.\n  Francisco, a soldier\n  Reynaldo, servant to Polonius.\n  Players.\n  Two Clowns, gravediggers.\n  Fortinbras, Prince of Norway.  \n  A Norwegian Captain.\n  English Ambassadors.\n\n  Getrude, Queen of Denmark, mother to Hamlet.\n  Ophelia, daughter to Polonius.\n\n  Ghost of Hamlet's Father.\n\n  Lords, ladies, Officers, Soldiers, Sailors, Messengers, Attendants.\n\nSCENE.- Elsinore.\n\n\nACT I. Scene I.\nElsinore. A platform before the Castle.\n\nEnter two Sentinels-[first,] Francisco, [

In [23]:
# `RecursiveCharacterTextSplitter()`
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(model_name="gpt-4", chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_text(hamlet)
texts[0]

"THE TRAGEDY OF HAMLET, PRINCE OF DENMARK\n\n\nby William Shakespeare\n\n\n\nDramatis Personae\n\n  Claudius, King of Denmark.\n  Marcellus, Officer.\n  Hamlet, son to the former, and nephew to the present king.\n  Polonius, Lord Chamberlain.\n  Horatio, friend to Hamlet.\n  Laertes, son to Polonius.\n  Voltemand, courtier.\n  Cornelius, courtier.\n  Rosencrantz, courtier.\n  Guildenstern, courtier.\n  Osric, courtier.\n  A Gentleman, courtier.\n  A Priest.\n  Marcellus, officer.\n  Bernardo, officer.\n  Francisco, a soldier\n  Reynaldo, servant to Polonius.\n  Players.\n  Two Clowns, gravediggers.\n  Fortinbras, Prince of Norway.  \n  A Norwegian Captain.\n  English Ambassadors.\n\n  Getrude, Queen of Denmark, mother to Hamlet.\n  Ophelia, daughter to Polonius.\n\n  Ghost of Hamlet's Father.\n\n  Lords, ladies, Officers, Soldiers, Sailors, Messengers, Attendants.\n\n\n\n\n\nSCENE.- Elsinore.\n\n\nACT I. Scene I.\nElsinore. A platform before the Castle.\n\nEnter two Sentinels-[first,] 

In [16]:
# Split code
from langchain_text_splitters import Language

python_code = """
def hello_world():
    print("Hello World!")

# Call the function
hello_world()
"""

text_splitter = RecursiveCharacterTextSplitter.from_language(language=Language.PYTHON, chunk_size=50, chunk_overlap=0)
texts = text_splitter.create_documents([python_code])
texts

[Document(page_content='def hello_world():\n    print("Hello World!")'),
 Document(page_content='# Call the function\nhello_world()')]

In [32]:
# `TokenTextSplitter()`
from langchain_text_splitters import TokenTextSplitter

text_splitter = TokenTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_text(hamlet)
texts[0]

"THE TRAGEDY OF HAMLET, PRINCE OF DENMARK\n\n\nby William Shakespeare\n\n\n\nDramatis Personae\n\n  Claudius, King of Denmark.\n  Marcellus, Officer.\n  Hamlet, son to the former, and nephew to the present king.\n  Polonius, Lord Chamberlain.\n  Horatio, friend to Hamlet.\n  Laertes, son to Polonius.\n  Voltemand, courtier.\n  Cornelius, courtier.\n  Rosencrantz, courtier.\n  Guildenstern, courtier.\n  Osric, courtier.\n  A Gentleman, courtier.\n  A Priest.\n  Marcellus, officer.\n  Bernardo, officer.\n  Francisco, a soldier\n  Reynaldo, servant to Polonius.\n  Players.\n  Two Clowns, gravediggers.\n  Fortinbras, Prince of Norway.  \n  A Norwegian Captain.\n  English Ambassadors.\n\n  Getrude, Queen of Denmark, mother to Hamlet.\n  Ophelia, daughter to Polonius.\n\n  Ghost of Hamlet's Father.\n\n  Lords, ladies, Officers, Soldiers, Sailors, Messengers, Attendants.\n\n\n\n\n\nSCENE.- Elsinore.\n\n\nACT I. Scene I.\nElsinore. A platform before the Castle.\n\nEnter two Sentinels-[first,] 

In [33]:
# `SpacyTextSplitter()`
from langchain_text_splitters import SpacyTextSplitter

text_splitter = SpacyTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_text(hamlet)
texts[0]



"THE TRAGEDY OF HAMLET, PRINCE OF DENMARK\n\n\nby William Shakespeare\n\n\n\nDramatis Personae\n\n  Claudius, King of Denmark.\n  \n\nMarcellus, Officer.\n  Hamlet, son to the former, and nephew to the present king.\n  \n\nPolonius, Lord Chamberlain.\n  \n\nHoratio, friend to Hamlet.\n  \n\nLaertes, son to Polonius.\n  \n\nVoltemand, courtier.\n  \n\nCornelius, courtier.\n  \n\nRosencrantz, courtier.\n  \n\nGuildenstern, courtier.\n  \n\nOsric, courtier.\n  \n\nA Gentleman, courtier.\n  \n\nA Priest.\n  \n\nMarcellus, officer.\n  Bernardo, officer.\n  Francisco, a soldier\n  Reynaldo, servant to Polonius.\n  Players.\n  \n\nTwo Clowns, gravediggers.\n  Fortinbras, Prince of Norway.  \n  \n\nA Norwegian Captain.\n  \n\nEnglish Ambassadors.\n\n  \n\nGetrude, Queen of Denmark, mother to Hamlet.\n  \n\nOphelia, daughter to Polonius.\n\n  \n\nGhost of Hamlet's Father.\n\n  \n\nLords, ladies, Officers, Soldiers, Sailors, Messengers, Attendants.\n\n\n\n\n\n\n\nSCENE.- Elsinore.\n\n\n\n\nACT I

In [None]:
# `SentenceTransformersTokenTextSplitter()`
from langchain_text_splitters import SentenceTransformersTokenTextSplitter



In [35]:
# `NLTKTextSplitter()`
from langchain_text_splitters import NLTKTextSplitter

text_splitter = NLTKTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_text(hamlet)
texts[0]

"THE TRAGEDY OF HAMLET, PRINCE OF DENMARK\n\n\nby William Shakespeare\n\n\n\nDramatis Personae\n\n  Claudius, King of Denmark.\n\nMarcellus, Officer.\n\nHamlet, son to the former, and nephew to the present king.\n\nPolonius, Lord Chamberlain.\n\nHoratio, friend to Hamlet.\n\nLaertes, son to Polonius.\n\nVoltemand, courtier.\n\nCornelius, courtier.\n\nRosencrantz, courtier.\n\nGuildenstern, courtier.\n\nOsric, courtier.\n\nA Gentleman, courtier.\n\nA Priest.\n\nMarcellus, officer.\n\nBernardo, officer.\n\nFrancisco, a soldier\n  Reynaldo, servant to Polonius.\n\nPlayers.\n\nTwo Clowns, gravediggers.\n\nFortinbras, Prince of Norway.\n\nA Norwegian Captain.\n\nEnglish Ambassadors.\n\nGetrude, Queen of Denmark, mother to Hamlet.\n\nOphelia, daughter to Polonius.\n\nGhost of Hamlet's Father.\n\nLords, ladies, Officers, Soldiers, Sailors, Messengers, Attendants.\n\nSCENE.- Elsinore.\n\nACT I.\n\nScene I.\nElsinore.\n\nA platform before the Castle."

In [None]:
# `MarkdownHeaderTextSplitter()`
from langchain_text_splitters import MarkdownHeaderTextSplitter

## 2-9. Embedding Models

Base interface:

1. `langchain_core.embeddings.embeddings.Embeddings`
    - `embed_documents(texts)`: Embeds a list of search documents.
    - `embed_query(text)`: Embeds a single query text.
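
As a rough sketch of the shape of this interface, the toy class below exposes the same two method names without inheriting from the LangChain base class. `ToyHashEmbeddings` and its `size` parameter are hypothetical. It hashes words into a small vector, so it is not a real embedding model and is useful only for showing the call pattern and the returned shapes.

```python
import hashlib
import math


class ToyHashEmbeddings:
    """Hypothetical stand-in with the same two methods as the interface.

    Not a real embedding model: words are hashed into a small fixed-size vector.
    """

    def __init__(self, size: int = 8) -> None:
        self.size = size

    def embed_query(self, text: str) -> list[float]:
        vector = [0.0] * self.size
        for word in text.lower().split():
            # A stable hash keeps results reproducible across runs.
            digest = hashlib.sha256(word.encode("utf-8")).digest()
            vector[digest[0] % self.size] += 1.0
        norm = math.sqrt(sum(v * v for v in vector)) or 1.0
        return [v / norm for v in vector]  # scale to unit length

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # One vector per input text, in the same order.
        return [self.embed_query(text) for text in texts]


model = ToyHashEmbeddings()
vectors = model.embed_documents(["Hi there!", "Hello World!"])
print(len(vectors), len(vectors[0]))
```

`embed_documents()` returns one vector per text, while `embed_query()` returns a single vector. Real models differ mainly in vector length and in how the vector is computed.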

Text embedding models:

2. `langchain_openai.embeddings.base.OpenAIEmbeddings(allowed_special=None, check_embedding_ctx_length=True, chunk_size=1000, default_headers=None, default_query=None, deployment='text-embedding-ada-002', dimensions=None, disallowed_special=None, embedding_ctx_length=8191, headers=None, http_async_client=None, http_client=None, max_retries=2, model='text-embedding-ada-002', model_kwargs, openai_api_base=None, openai_api_key=None, openai_api_type=None, openai_api_version=None, openai_organization=None, openai_proxy=None, request_timeout=None, retry_max_seconds=20, retry_min_seconds=4, show_progress_bar=False, skip_empty=False, tiktoken_enabled=True, tiktoken_model_name=None)`: OpenAI embedding models.
3. `langchain_huggingface.embeddings.huggingface.HuggingFaceEmbeddings(cache_folder=None, encode_kwargs, model_kwargs, model_name='sentence-transformers/all-mpnet-base-v2', multi_process=False, show_progress=False)`: HuggingFace Sentence Transformers embedding models.

In [2]:
# `embed_documents()`
from langchain_openai import OpenAIEmbeddings

embeddings_model = OpenAIEmbeddings()
embeddings = embeddings_model.embed_documents(
    [
        "Hi there!",
        "Oh, hello!",
        "What's your name?",
        "My friends call me World",
        "Hello World!"
    ]
)
len(embeddings), len(embeddings[0])

(5, 1536)

In [3]:
# `embed_query()`
embedded_query = embeddings_model.embed_query("What was the name mentioned in the conversation?")
embedded_query[:5]

[0.005377273540943861,
 -0.0006527779041789472,
 0.038980286568403244,
 -0.002967397216707468,
 -0.008834563195705414]

In [2]:
# pip3 install langchain-huggingface sentence-transformers datasets
from langchain_huggingface import HuggingFaceEmbeddings

embeddings_model = HuggingFaceEmbeddings()
embeddings = embeddings_model.embed_documents(
    [
        "Hi there!",
        "Oh, hello!",
        "What's your name?",
        "My friends call me World",
        "Hello World!"
    ]
)
len(embeddings), len(embeddings[0])

(5, 768)

In [3]:
embedded_query = embeddings_model.embed_query("What was the name mentioned in the conversation?")
embedded_query[:5]

[0.09514585137367249,
 9.88791070994921e-05,
 -0.016573384404182434,
 0.044847987592220306,
 0.04323696717619896]

## 2-10. Vector Stores

**Chroma** is a vector database that prioritizes simplicity and developer productivity. It offers a self-hosted server option built to run seamlessly during local development, which makes it easier to prototype LLM applications. It provides tools to store embeddings and their metadata, embed documents and queries, and search embeddings. It allows for more flexible querying than Pinecone and excels in high-throughput operations and real-time scalability.

1. `langchain_chroma.vectorstores.Chroma(collection_name='langchain', embedding_function=None, persist_directory=None, client_settings=None, collection_metadata=None, client=None, relevance_score_fn=None, create_collection_if_not_exists=True)`: ChromaDB vector store.
    - `from_documents(documents, embedding=None, ids=None, collection_name='langchain', persist_directory=None, client_settings=None, client=None, collection_metadata=None, **kwargs)`: Creates a Chroma vector store from a list of documents.
    - `from_texts(texts, embedding=None, metadatas=None, ids=None, collection_name='langchain', persist_directory=None, client_settings=None, client=None, collection_metadata=None, **kwargs)`: Creates a Chroma vector store from a list of raw texts.
    - `similarity_search(query, k=4, filter=None, **kwargs)`
    - `similarity_search_by_vector(embedding, k=4, filter=None, where_document=None, **kwargs)`
    - `similarity_search_by_vector_with_relevance_scores(embedding, k=4, filter=None, where_document=None, **kwargs)`
    - `similarity_search_with_score(query, k=4, filter=None, where_document=None, **kwargs)`
    - `similarity_search_with_relevance_scores(query, k=4, **kwargs)`: Returns docs and relevance scores in the range `[0, 1]`.
    - `similarity_search_by_image(uri, k=4, filter=None, **kwargs)`: Searches for similar images based on the given image URI.
    - `similarity_search_by_image_with_relevance_score(uri, k=4, filter=None, **kwargs)`
    - `max_marginal_relevance_search(query, k=4, fetch_k=20, lambda_mult=0.5, filter=None, where_document=None, **kwargs)` 
    - `max_marginal_relevance_search_by_vector(embedding, k=4, fetch_k=20, lambda_mult=0.5, filter=None, where_document=None, **kwargs)`
    - `as_retriever(**kwargs)`: Returns `VectorStoreRetriever` initialized from this vector store. The usage is demonstrated in the retriever section.
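
Conceptually, a search by vector ranks the stored vectors by their closeness to the query vector and returns the best `k`. The sketch below shows that idea with a brute-force scan in plain Python. `cosine_similarity`, `top_k_by_vector`, and the two-dimensional vectors are hypothetical, and a real vector store uses an index rather than a full scan. Whether a store reports a similarity (higher is better) or a distance (lower is better) depends on the store and its configuration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_by_vector(
    query_vec: list[float],
    stored: list[tuple[str, list[float]]],
    k: int = 4,
) -> list[tuple[str, float]]:
    """Return the k (text, score) pairs most similar to query_vec.

    Hypothetical helper: a brute-force scan stands in for a real index.
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
    return scored[:k]


stored = [
    ("to be or not to be", [1.0, 0.0]),
    ("the play's the thing", [0.6, 0.8]),
    ("get thee to a nunnery", [0.0, 1.0]),
]
results = top_k_by_vector([1.0, 0.0], stored, k=2)
for text, score in results:
    print(round(score, 3), text)
```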

**FAISS (Facebook AI Similarity Search)** is a high-performance library created and optimized for dense vector similarity search and clustering. FAISS is built around several index types, many of which use **Approximate Nearest Neighbor (ANN) Search** algorithms such as **Inverted File Index (IVF)**, **Locality Sensitive Hashing (LSH)**, **Hierarchical Navigable Small Worlds (HNSW)**, and more. It also supports evaluation and parameter tuning, but lacks features such as filtering and post-processing that Chroma provides.
- [faiss Indexes](https://github.com/facebookresearch/faiss/wiki/Faiss-indexes)

2. `langchain_community.vectorstores.faiss.FAISS(embedding_function, index, docstore, index_to_docstore_id, relevance_score_fn=None, normalize_L2=False, distance_strategy=DistanceStrategy.EUCLIDEAN_DISTANCE)`: Meta FAISS vector store.
    - `from_documents(documents, embedding, **kwargs)`
    - `from_texts(texts, embedding, metadatas=None, ids=None, **kwargs)`
    - `from_embeddings(text_embeddings, embedding, metadatas=None, ids=None, **kwargs)`
    - `similarity_search(query, k=4, filter=None, fetch_k=20, **kwargs)`
    - `similarity_search_by_vector(embedding, k=4, filter=None, fetch_k=20, **kwargs)`
    - `similarity_search_with_relevance_scores(query, k=4, **kwargs)`
    - `similarity_search_with_score(query, k=4, filter=None, fetch_k=20, **kwargs)`
    - `similarity_search_with_score_by_vector(embedding, k=4, filter=None, fetch_k=20, **kwargs)`
    - `max_marginal_relevance_search(query, k=4, fetch_k=20, lambda_mult=0.5, filter=None, **kwargs)`
    - `max_marginal_relevance_search_by_vector(embedding, k=4, fetch_k=20, lambda_mult=0.5, filter=None, **kwargs)`
    - `max_marginal_relevance_search_with_score_by_vector(embedding, *, k=4, fetch_k=20, lambda_mult=0.5, filter=None)`: Returns documents and their similarity scores selected using the maximal marginal relevance.
    - `as_retriever(**kwargs)`
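
The `max_marginal_relevance_search*` methods trade relevance against diversity. The sketch below is a plain-Python illustration of greedy maximal marginal relevance under stated assumptions, not the library's code. `mmr_select` and the example vectors are hypothetical, and all vectors are assumed to be unit length so that a dot product equals cosine similarity. In the methods above, `fetch_k` sets how many candidates are fetched before this reranking step; here `doc_vecs` plays the role of those candidates.

```python
def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def mmr_select(
    query_vec: list[float],
    doc_vecs: list[list[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Greedily pick up to k indices, balancing relevance and diversity.

    lambda_mult=1.0 ranks by relevance only; lower values penalize
    candidates that resemble documents already selected.
    """
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = candidates[0], float("-inf")
        for idx in candidates:
            relevance = dot(query_vec, doc_vecs[idx])
            # Similarity to the closest already-selected document (0 if none).
            redundancy = max(
                (dot(doc_vecs[idx], doc_vecs[s]) for s in selected), default=0.0
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


query_vec = [1.0, 0.0]
doc_vecs = [
    [0.96, 0.28],    # most relevant
    [0.936, 0.352],  # near-duplicate of the first
    [0.8, -0.6],     # less relevant, but different
]
print(mmr_select(query_vec, doc_vecs, k=2, lambda_mult=1.0))
print(mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.5))
```

With `lambda_mult=1.0` the near-duplicate is chosen second. With `lambda_mult=0.5` its redundancy penalty outweighs its relevance, so the more distinct third vector is chosen instead.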
  
**Pinecone** is a cloud-based, fully managed vector platform designed to handle real-time search and similarity matching at scale. It also offers automatic indexing, and its cloud infrastructure gives it low latency and high scalability.

In [13]:
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_chroma.vectorstores import Chroma

# Load the document, split it into chunks, embed each chunk and load it into the vector store.
raw_documents = TextLoader('./datasets/hamlet.txt').load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(raw_documents)
db = Chroma.from_documents(documents, OpenAIEmbeddings())
db

Created a chunk of size 1307, which is longer than the specified 1000
Created a chunk of size 4126, which is longer than the specified 1000
Created a chunk of size 2446, which is longer than the specified 1000
Created a chunk of size 7752, which is longer than the specified 1000
Created a chunk of size 4735, which is longer than the specified 1000
Created a chunk of size 2413, which is longer than the specified 1000
Created a chunk of size 4118, which is longer than the specified 1000
Created a chunk of size 1808, which is longer than the specified 1000
Created a chunk of size 2816, which is longer than the specified 1000
Created a chunk of size 5153, which is longer than the specified 1000
Created a chunk of size 1735, which is longer than the specified 1000
Created a chunk of size 2029, which is longer than the specified 1000
Created a chunk of size 3485, which is longer than the specified 1000
Created a chunk of size 2235, which is longer than the specified 1000
Created a chunk of s

<langchain_chroma.vectorstores.Chroma at 0x7a0c7c079ca0>

In [14]:
# `similarity_search()`
query = "This is the excellent foppery of the world, that when we are sick in fortune (often the surfeits of our own behavior) we make guilty of our disasters the sun, the moon, and stars, as if we were villains on necessity; fools by heavenly compulsion; knaves, thieves, and treachers by spherical predominance; drunkards, liars, and adulterers by an enforced obedience of planetary influence; and all that we are evil in, by a divine thrusting on. An admirable evasion of whoremaster man, to lay his goatish disposition on the charge of a star! My father compounded with my mother under the Dragon's tail, and my nativity was under Ursa Major, so that it follows I am rough and lecherous. I should have been that I am, had the maidenliest star in the firmament twinkled on my bastardizing."
docs = db.similarity_search(query)
docs[0].page_content

"Ham. To be, or not to be- that is the question:\n    Whether 'tis nobler in the mind to suffer\n    The slings and arrows of outrageous fortune\n    Or to take arms against a sea of troubles,\n    And by opposing end them. To die- to sleep-\n    No more; and by a sleep to say we end\n    The heartache, and the thousand natural shocks\n    That flesh is heir to. 'Tis a consummation  \n    Devoutly to be wish'd. To die- to sleep.\n    To sleep- perchance to dream: ay, there's the rub!\n    For in that sleep of death what dreams may come\n    When we have shuffled off this mortal coil,\n    Must give us pause. There's the respect\n    That makes calamity of so long life.\n    For who would bear the whips and scorns of time,\n    Th' oppressor's wrong, the proud man's contumely,\n    The pangs of despis'd love, the law's delay,\n    The insolence of office, and the spurns\n    That patient merit of th' unworthy takes,\n    When he himself might his quietus make\n    With a bare bodkin? Wh

In [15]:
# `similarity_search_by_vector()`
embedding_vector = OpenAIEmbeddings().embed_query(query)
docs = db.similarity_search_by_vector(embedding_vector)
docs[0].page_content

"Ham. To be, or not to be- that is the question:\n    Whether 'tis nobler in the mind to suffer\n    The slings and arrows of outrageous fortune\n    Or to take arms against a sea of troubles,\n    And by opposing end them. To die- to sleep-\n    No more; and by a sleep to say we end\n    The heartache, and the thousand natural shocks\n    That flesh is heir to. 'Tis a consummation  \n    Devoutly to be wish'd. To die- to sleep.\n    To sleep- perchance to dream: ay, there's the rub!\n    For in that sleep of death what dreams may come\n    When we have shuffled off this mortal coil,\n    Must give us pause. There's the respect\n    That makes calamity of so long life.\n    For who would bear the whips and scorns of time,\n    Th' oppressor's wrong, the proud man's contumely,\n    The pangs of despis'd love, the law's delay,\n    The insolence of office, and the spurns\n    That patient merit of th' unworthy takes,\n    When he himself might his quietus make\n    With a bare bodkin? Wh

In [16]:
from langchain_community.vectorstores.faiss import FAISS

db = FAISS.from_documents(documents, OpenAIEmbeddings())
db

<langchain_community.vectorstores.faiss.FAISS at 0x7a0c6819b5c0>

In [17]:
# `similarity_search()`
query = "This is the excellent foppery of the world, that when we are sick in fortune (often the surfeits of our own behavior) we make guilty of our disasters the sun, the moon, and stars, as if we were villains on necessity; fools by heavenly compulsion; knaves, thieves, and treachers by spherical predominance; drunkards, liars, and adulterers by an enforced obedience of planetary influence; and all that we are evil in, by a divine thrusting on. An admirable evasion of whoremaster man, to lay his goatish disposition on the charge of a star! My father compounded with my mother under the Dragon's tail, and my nativity was under Ursa Major, so that it follows I am rough and lecherous. I should have been that I am, had the maidenliest star in the firmament twinkled on my bastardizing."
docs = db.similarity_search(query)
docs[0].page_content

"King. Full thirty times hath Phoebus' cart gone round\n      Neptune's salt wash and Tellus' orbed ground,\n      And thirty dozed moons with borrowed sheen\n      About the world have times twelve thirties been,\n      Since love our hearts, and Hymen did our hands,\n      Unite comutual in most sacred bands.\n    Queen. So many journeys may the sun and moon\n      Make us again count o'er ere love be done!\n      But woe is me! you are so sick of late,\n      So far from cheer and from your former state.\n      That I distrust you. Yet, though I distrust,\n      Discomfort you, my lord, it nothing must;\n      For women's fear and love holds quantity,\n      In neither aught, or in extremity.\n      Now what my love is, proof hath made you know;\n      And as my love is siz'd, my fear is so.\n      Where love is great, the littlest doubts are fear;\n      Where little fears grow great, great love grows there.\n    King. Faith, I must leave thee, love, and shortly too;  \n      My op

In [18]:
# `similarity_search_by_vector()`
embedding_vector = OpenAIEmbeddings().embed_query(query)
docs = db.similarity_search_by_vector(embedding_vector)
print(docs[0].page_content)

King. Full thirty times hath Phoebus' cart gone round
      Neptune's salt wash and Tellus' orbed ground,
      And thirty dozed moons with borrowed sheen
      About the world have times twelve thirties been,
      Since love our hearts, and Hymen did our hands,
      Unite comutual in most sacred bands.
    Queen. So many journeys may the sun and moon
      Make us again count o'er ere love be done!
      But woe is me! you are so sick of late,
      So far from cheer and from your former state.
      That I distrust you. Yet, though I distrust,
      Discomfort you, my lord, it nothing must;
      For women's fear and love holds quantity,
      In neither aught, or in extremity.
      Now what my love is, proof hath made you know;
      And as my love is siz'd, my fear is so.
      Where love is great, the littlest doubts are fear;
      Where little fears grow great, great love grows there.
    King. Faith, I must leave thee, love, and shortly too;  
      My operant powers their f

## 2-11. Retrievers

Base interface:

1. `langchain_core.retrievers.BaseRetriever`
    - `invoke(input, config=None, **kwargs)`: Invokes the retriever to get relevant documents.

### 2-11-1. Multi-Query

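1. `langchain.retrievers.multi_query.MultiQueryRetriever`: Retriever that generates multiple queries from a single input query and merges the retrieved documents.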
   - `from_llm()`
2. `langchain.retrievers.multi_query.LineListOutputParser`: Output parser for a list of lines for `MultiQueryRetriever`. 

The `MultiQueryRetriever` automates prompt tuning by using an LLM to generate multiple queries, each from a different perspective, for a given user input query. For each query, it retrieves a set of relevant documents, then takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` can mitigate some of the limitations of distance-based retrieval and return a richer set of results.
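
The "unique union" step can be sketched in plain Python. This is an illustration under stated assumptions, not the retriever's code: `unique_union` is a hypothetical helper, and the queries and chunk names are made-up stand-ins for what an LLM and a vector store would return.

```python
def unique_union(result_lists: list[list[str]]) -> list[str]:
    """Merge per-query results, keeping the first occurrence of each item in order."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical canned results: what a retriever might return for each
# LLM-generated rephrasing of the same question.
results_per_query = {
    "How can tasks be decomposed?": ["chunk A", "chunk B"],
    "What strategies split a task into subgoals?": ["chunk B", "chunk C"],
    "Which techniques break down complex tasks?": ["chunk A", "chunk D"],
}
merged = unique_union(list(results_per_query.values()))
print(merged)
```

No single rephrasing finds all four chunks, but the union does, and chunks returned by more than one query appear only once.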

Under the hood, MultiQueryRetriever generates queries using a specific prompt. To customize this prompt:

- Make a PromptTemplate with an input variable for the question;
- Implement an output parser like the one below to split the result into a list of queries.

The prompt and output parser together must support the generation of a list of queries.
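
As a plain-Python sketch of what such a parser needs to do, the hypothetical `parse_lines` helper below turns raw LLM output into a list of queries by splitting on newlines and dropping blank lines. It is not the LangChain class, and the sample output is made up for the example.

```python
def parse_lines(text: str) -> list[str]:
    """Split raw LLM output into non-empty, stripped lines."""
    return [line.strip() for line in text.strip().split("\n") if line.strip()]


# Made-up LLM output with a blank line and a trailing newline.
raw_output = "1. How is Task Decomposition done?\n\n2. Which strategies are used?\n"
queries = parse_lines(raw_output)
print(queries)
```

Any numbering the LLM adds, such as `1.`, stays in each query unless the prompt or the parser removes it.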

In [32]:
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma.vectorstores import Chroma

# Load blog post
loader = WebBaseLoader("https://lilianweng.github.io/posts/2023-06-23-agent/")
data = loader.load()

# Split
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(data)

# VectorDB
embedding = OpenAIEmbeddings()
db = Chroma.from_documents(documents=splits, embedding=embedding)
db
# db.get()['documents']
# db.delete_collection()

<langchain_chroma.vectorstores.Chroma at 0x76710ff9d460>

In [34]:
# Specify the LLM to use for query generation
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI

question = "What are the approaches to Task Decomposition?"
llm = ChatOpenAI(temperature=0)
retriever_from_llm = MultiQueryRetriever.from_llm(
    retriever=db.as_retriever(), llm=llm
)
retriever_from_llm

MultiQueryRetriever(retriever=VectorStoreRetriever(tags=['Chroma', 'OpenAIEmbeddings'], vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x76710ff9d460>), llm_chain=PromptTemplate(input_variables=['question'], template='You are an AI language model assistant. Your task is \n    to generate 3 different versions of the given user \n    question to retrieve relevant documents from a vector  database. \n    By generating multiple perspectives on the user question, \n    your goal is to help the user overcome some of the limitations \n    of distance-based similarity search. Provide these alternative \n    questions separated by newlines. Original question: {question}')
| ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x7670c33b39b0>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x7670c33b2750>, temperature=0.0, openai_api_key=SecretStr('**********'), openai_proxy='')
| LineListOutputParser())

In [35]:
# Set logging for the queries
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

unique_docs = retriever_from_llm.invoke(question)
unique_docs

INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can Task Decomposition be achieved through different methods?', '2. What strategies are commonly used for Task Decomposition?', '3. What are the various techniques for breaking down tasks in Task Decomposition?']


[Document(metadata={'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:', 'language': 'en', 'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'title': "LLM Powered Autonomous Agents | Lil'Log"}, page_content='Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.'),
 Document(metadata={'description': 'Building agents with LLM (
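
Each generated query logged above is sent to the base retriever, and the per-query results are merged into one de-duplicated list. The following is a minimal sketch of that merge step only. The names `toy_index` and `multi_query_retrieve` are hypothetical, and a dictionary lookup stands in for the vector store search.

```python
def multi_query_retrieve(queries, retrieve):
    """Run every query through `retrieve` and return the unique union of results."""
    seen = set()
    unique_docs = []
    for query in queries:
        for doc in retrieve(query):
            # Keep the first occurrence of each document, preserving order
            if doc not in seen:
                seen.add(doc)
                unique_docs.append(doc)
    return unique_docs


# Hypothetical stand-in for a vector store: query -> ranked documents
toy_index = {
    "methods": ["doc A", "doc B"],
    "strategies": ["doc B", "doc C"],
    "techniques": ["doc A", "doc C"],
}

merged = multi_query_retrieve(list(toy_index), toy_index.__getitem__)
print(merged)
```

Overlapping hits such as `doc B` appear once in the merged list, which is why the variable in the notebook is called `unique_docs`.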

In [36]:
prompt_template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Answer:"""
prompt_template = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

llm.predict(text=prompt_template.format_prompt(
    context=unique_docs,
    question=question
).text)

'The approaches to Task Decomposition are:\n1. By LLM with simple prompting like "Steps for XYZ. 1.", "What are the subgoals for achieving XYZ?"\n2. By using task-specific instructions; e.g. "Write a story outline." for writing a novel\n3. With human inputs.'

In [37]:
# Supply your own prompt
from typing import List

from langchain_core.prompts import PromptTemplate
from langchain.retrievers.multi_query import LineListOutputParser

prompt_template = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate five 
    different versions of the given user question to retrieve relevant documents from a vector 
    database. By generating multiple perspectives on the user question, your goal is to help
    the user overcome some of the limitations of the distance-based similarity search. 
    Provide these alternative questions separated by newlines.
    Original question: {question}""",
)
llm = ChatOpenAI(temperature=0)
output_parser = LineListOutputParser()

# Chain
llm_chain = prompt_template | llm | output_parser

# Other inputs
question = "What are the approaches to Task Decomposition?"

In [38]:
retriever = MultiQueryRetriever(
    retriever=db.as_retriever(), llm_chain=llm_chain
)  # llm_chain must return a list of query strings; LineListOutputParser provides that

# Results
unique_docs = retriever.invoke("What does the course say about regression?")
unique_docs

INFO:langchain.retrievers.multi_query:Generated queries: ['1. Can you provide insights from the course on regression analysis?', '2. How is regression discussed in the course material?', '3. What topics related to regression are covered in the course?', '4. What information does the course offer about regression techniques?', '5. In what way does the course address the topic of regression?']


[Document(metadata={'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent’s brain, complemented by several key components:', 'language': 'en', 'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/', 'title': "LLM Powered Autonomous Agents | Lil'Log"}, page_content='}\n]\nChallenges#\nAfter going through key ideas and demos of building LLM-centered agents, I start to see a couple common limitations:'),
 Document(metadata={'description': 'Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-

### 2-11-2. Contextual Compression

The `ContextualCompressionRetriever` passes each query to the base retriever, then runs the retrieved documents through a document compressor. The compressor takes the list of documents and shortens it, either by trimming the contents of individual documents or by dropping documents altogether.

1. `langchain.retrievers.contextual_compression.ContextualCompressionRetriever`: Wraps a base retriever and compresses the results.
2. `langchain.retrievers.document_compressors.chain_extract.LLMChainExtractor`: Uses an LLM chain to extract only the query-relevant content from each document.
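
The retrieve-then-compress flow described above can be sketched without any LLM call. In this sketch a simple keyword-overlap filter stands in for the LLM-based extractor. The names `keyword_compressor`, `compression_retrieve`, and `base_retrieve`, as well as the toy corpus, are assumptions made for illustration.

```python
import re


def keyword_compressor(docs, query):
    """Keep only sentences that share a word with the query; drop emptied documents."""
    query_terms = set(re.findall(r"\w+", query.lower()))
    compressed = []
    for doc in docs:
        sentences = re.split(r"(?<=[.!?])\s+", doc)
        kept = [
            s for s in sentences
            if query_terms & set(re.findall(r"\w+", s.lower()))
        ]
        if kept:  # a document with no relevant sentence is dropped entirely
            compressed.append(" ".join(kept))
    return compressed


def compression_retrieve(base_retrieve, compress, query):
    """Fetch documents with the base retriever, then pass them through the compressor."""
    return compress(base_retrieve(query), query)


corpus = [
    "The sun rose early. Stars govern our fortune, some say.",
    "The queen spoke of love. The king listened.",
]


def base_retrieve(query):
    # Hypothetical base retriever that returns the whole corpus
    return corpus


result = compression_retrieve(base_retrieve, keyword_compressor, "fortune and stars")
print(result)
```

The first document is shortened to its one relevant sentence and the second is dropped, which mirrors the two ways a compressor can reduce its input.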

In [41]:
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

documents = TextLoader("./datasets/hamlet.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()

docs = retriever.invoke("This is the excellent foppery of the world, that when we are sick in fortune (often the surfeits of our own behavior) we make guilty of our disasters the sun, the moon, and stars, as if we were villains on necessity; fools by heavenly compulsion; knaves, thieves, and treachers by spherical predominance; drunkards, liars, and adulterers by an enforced obedience of planetary influence; and all that we are evil in, by a divine thrusting on. An admirable evasion of whoremaster man, to lay his goatish disposition on the charge of a star! My father compounded with my mother under the Dragon's tail, and my nativity was under Ursa Major, so that it follows I am rough and lecherous. I should have been that I am, had the maidenliest star in the firmament twinkled on my bastardizing.")
docs



[Document(metadata={'source': './datasets/hamlet.txt'}, page_content="King. Full thirty times hath Phoebus' cart gone round\n      Neptune's salt wash and Tellus' orbed ground,\n      And thirty dozed moons with borrowed sheen\n      About the world have times twelve thirties been,\n      Since love our hearts, and Hymen did our hands,\n      Unite comutual in most sacred bands.\n    Queen. So many journeys may the sun and moon\n      Make us again count o'er ere love be done!\n      But woe is me! you are so sick of late,\n      So far from cheer and from your former state.\n      That I distrust you. Yet, though I distrust,\n      Discomfort you, my lord, it nothing must;\n      For women's fear and love holds quantity,\n      In neither aught, or in extremity.\n      Now what my love is, proof hath made you know;\n      And as my love is siz'd, my fear is so.\n      Where love is great, the littlest doubts are fear;\n      Where little fears grow great, great love grows there.\n    

In [42]:
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_openai import OpenAI

llm = OpenAI(temperature=0)
compressor = LLMChainExtractor.from_llm(llm)
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=retriever
)

compressed_docs = compression_retriever.invoke(
    "This is the excellent foppery of the world, that when we are sick in fortune (often the surfeits of our own behavior) we make guilty of our disasters the sun, the moon, and stars, as if we were villains on necessity; fools by heavenly compulsion; knaves, thieves, and treachers by spherical predominance; drunkards, liars, and adulterers by an enforced obedience of planetary influence; and all that we are evil in, by a divine thrusting on. An admirable evasion of whoremaster man, to lay his goatish disposition on the charge of a star! My father compounded with my mother under the Dragon's tail, and my nativity was under Ursa Major, so that it follows I am rough and lecherous. I should have been that I am, had the maidenliest star in the firmament twinkled on my bastardizing."
)
compressed_docs

[Document(metadata={'source': './datasets/hamlet.txt'}, page_content='- "So far from cheer and from your former state."\n- "That I distrust you."\n- "Yet, though I distrust, Discomfort you, my lord, it nothing must;"\n- "For women\'s fear and love holds quantity,"\n- "In neither aught, or in extremity."\n- "Now what my love is, proof hath made you know;"\n- "And as my love is siz\'d, my fear is so."\n- "Where love is great, the littlest doubts are fear;"\n- "Where little fears grow great, great love grows there."'),
 Document(metadata={'source': './datasets/hamlet.txt'}, page_content="- King. O, for two special reasons,\n    Which may to you, perhaps, seein much unsinew'd,\n    But yet to me they are strong. The Queen his mother\n    Lives almost by his looks; and for myself,-\n    My virtue or my plague, be it either which,-\n    She's so conjunctive to my life and soul\n    That, as the star moves not but in his sphere,"),
 Document(metadata={'source': './datasets/hamlet.txt'}, page_

### 2-11-3. Parent Document
The `ParentDocumentRetriever` first retrieves small chunks, then looks up the parent IDs of those chunks and returns the larger parent documents instead.

1. `langchain.retrievers.parent_document_retriever.ParentDocumentRetriever`
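
The child-to-parent lookup can be illustrated with plain Python. This is a sketch under stated assumptions, not the library's implementation: `ToyParentRetriever` and `split_text` are hypothetical, fixed-width slicing replaces the recursive splitter, and a substring match replaces embedding similarity.

```python
import uuid


def split_text(text, chunk_size):
    """Cut text into fixed-width chunks (stand-in for a real text splitter)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


class ToyParentRetriever:
    def __init__(self, child_size):
        self.child_size = child_size
        self.docstore = {}     # parent id -> full parent text
        self.child_index = []  # (child chunk, parent id) pairs

    def add_documents(self, docs):
        for doc in docs:
            parent_id = str(uuid.uuid4())
            self.docstore[parent_id] = doc
            for chunk in split_text(doc, self.child_size):
                self.child_index.append((chunk, parent_id))

    def invoke(self, query):
        # Search the small chunks, then return their de-duplicated parents
        parent_ids = []
        for chunk, parent_id in self.child_index:
            if query.lower() in chunk.lower() and parent_id not in parent_ids:
                parent_ids.append(parent_id)
        return [self.docstore[pid] for pid in parent_ids]


docs = [
    "Have more than thou showest, speak less than thou knowest.",
    "To be, or not to be, that is the question.",
]
toy_retriever = ToyParentRetriever(child_size=20)
toy_retriever.add_documents(docs)
parents = toy_retriever.invoke("knowest")
print(parents)
```

The match happens on a 20-character chunk, but the caller receives the whole parent text, which is the trade-off this retriever is designed around: small chunks for precise matching, large documents for context.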

In [5]:
from langchain.storage import InMemoryStore
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.retrievers import ParentDocumentRetriever

# This text splitter is used to create the child documents
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)
# The vectorstore to use to index the child chunks
vectorstore = Chroma(
    collection_name="full_documents", embedding_function=OpenAIEmbeddings()
)
# The storage layer for the parent documents
store = InMemoryStore()
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
)
retriever

ParentDocumentRetriever(vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x76e5dd383fe0>, docstore=<langchain_core.stores.InMemoryStore object at 0x76e5d0344500>, child_splitter=<langchain_text_splitters.character.RecursiveCharacterTextSplitter object at 0x76e5d0329c10>)

In [8]:
loaders = [
    TextLoader("datasets/hamlet.txt"),
    TextLoader("datasets/king_lear.txt")
]
docs = []
for loader in loaders:
    docs.extend(loader.load())

retriever.add_documents(docs, ids=None)
list(store.yield_keys())

['4215b24a-0e3a-4a9a-9289-468092a6737c',
 'dc3bc5d7-6f31-4602-b995-6d6359d9a96a']

In [9]:
sub_docs = vectorstore.similarity_search("Have more than thou showest, Speak less than thou knowest.")
print(sub_docs[0].page_content)

Fool
Sirrah, I'll teach thee a speech.
KING LEAR
Do.
Fool
Mark it, nuncle:
Have more than thou showest,
Speak less than thou knowest,
Lend less than thou owest,
Ride more than thou goest,
Learn more than thou trowest,
Set less than thou throwest;
Leave thy drink and thy whore,
And keep in-a-door,
And thou shalt have more
Than two tens to a score.
KENT
This is nothing, fool.
Fool


In [10]:
retrieved_docs = retriever.invoke("Have more than thou showest, Speak less than thou knowest.")
len(retrieved_docs[0].page_content)

150930

In [11]:
# Retrieve larger chunks
# This text splitter is used to create the parent documents
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=2000)
# This text splitter is used to create the child documents
# It should create documents smaller than the parent
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)
# The vectorstore to use to index the child chunks
vectorstore = Chroma(
    collection_name="split_parents", embedding_function=OpenAIEmbeddings()
)
# The storage layer for the parent documents
store = InMemoryStore()
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
)
retriever

ParentDocumentRetriever(vectorstore=<langchain_chroma.vectorstores.Chroma object at 0x76e5dff92720>, docstore=<langchain_core.stores.InMemoryStore object at 0x76e5c4d741a0>, child_splitter=<langchain_text_splitters.character.RecursiveCharacterTextSplitter object at 0x76e5dff91bb0>, parent_splitter=<langchain_text_splitters.character.RecursiveCharacterTextSplitter object at 0x76e5dff92b70>)

In [12]:
retriever.add_documents(docs)
len(list(store.yield_keys()))

260

In [13]:
sub_docs = vectorstore.similarity_search("Have more than thou showest, Speak less than thou knowest.")
print(sub_docs[0].page_content)

Fool
Sirrah, I'll teach thee a speech.
KING LEAR
Do.
Fool
Mark it, nuncle:
Have more than thou showest,
Speak less than thou knowest,
Lend less than thou owest,
Ride more than thou goest,
Learn more than thou trowest,
Set less than thou throwest;
Leave thy drink and thy whore,
And keep in-a-door,
And thou shalt have more
Than two tens to a score.
KENT
This is nothing, fool.
Fool


In [14]:
retrieved_docs = retriever.invoke("justice breyer")
len(retrieved_docs[0].page_content)

1393

In [15]:
print(retrieved_docs[0].page_content)

EDGAR
Frateretto calls me; and tells me
Nero is an angler in the lake of darkness.
Pray, innocent, and beware the foul fiend.
Fool
Prithee, nuncle, tell me whether a madman be a
gentleman or a yeoman?
KING LEAR
A king, a king!
Fool
No, he's a yeoman that has a gentleman to his son;
for he's a mad yeoman that sees his son a gentleman
before him.
KING LEAR
To have a thousand with red burning spits
Come hissing in upon 'em,--
EDGAR
The foul fiend bites my back.
Fool
He's mad that trusts in the tameness of a wolf, a
horse's health, a boy's love, or a whore's oath.
KING LEAR
It shall be done; I will arraign them straight.
To EDGAR

Come, sit thou here, most learned justicer;
To the Fool

Thou, sapient sir, sit here. Now, you she foxes!
EDGAR
Look, where he stands and glares!
Wantest thou eyes at trial, madam?
Come o'er the bourn, Bessy, to me,--
Fool
Her boat hath a leak,
And she must not speak
Why she dares not come over to thee.
EDGAR
The foul fiend haunts poor Tom in the voice of a
night

### 2-11-4. Ensemble Retrieval
The `EnsembleRetriever` takes a list of retrievers as input, combines the results of their `get_relevant_documents` methods, and reranks them using the Reciprocal Rank Fusion (RRF) algorithm. The most common pattern, **Hybrid Search**, combines a sparse retriever, such as BM25, with a dense retriever, such as embedding similarity, because their strengths are complementary.

1. `langchain.retrievers.ensemble.EnsembleRetriever`

In [17]:
# !pip3 install rank_bm25
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

doc_list_1 = [
    "I like apples",
    "I like oranges",
    "Apples and oranges are fruits",
]

# Initialize the bm25 retriever and faiss retriever
bm25_retriever = BM25Retriever.from_texts(
    doc_list_1, metadatas=[{"source": 1}] * len(doc_list_1)
)
bm25_retriever.k = 2

doc_list_2 = [
    "You like apples",
    "You like oranges",
]

embedding = OpenAIEmbeddings()
faiss_vectorstore = FAISS.from_texts(
    doc_list_2, embedding, metadatas=[{"source": 2}] * len(doc_list_2)
)
faiss_retriever = faiss_vectorstore.as_retriever(search_kwargs={"k": 2})

# Initialize the ensemble retriever
ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, faiss_retriever], weights=[0.5, 0.5]
)
ensemble_retriever

EnsembleRetriever(retrievers=[BM25Retriever(vectorizer=<rank_bm25.BM25Okapi object at 0x76e5fc203470>, k=2), VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x76e5d01785f0>, search_kwargs={'k': 2})], weights=[0.5, 0.5])

In [18]:
docs = ensemble_retriever.invoke("apples")
docs

[Document(metadata={'source': 1}, page_content='I like apples'),
 Document(metadata={'source': 2}, page_content='You like apples'),
 Document(metadata={'source': 1}, page_content='Apples and oranges are fruits'),
 Document(metadata={'source': 2}, page_content='You like oranges')]
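
The ordering above is consistent with weighted Reciprocal Rank Fusion, in which each document's score is the sum over retrievers of `weight / (rank + c)`. The sketch below assumes `c = 60` (a commonly used constant) and assumes each retriever returned its two hits in the order listed. `weighted_rrf` is a hypothetical helper written for illustration, not the library's code.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, c=60):
    """Fuse several ranked lists with weighted Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += weight / (rank + c)
    # sorted() is stable, so tied documents keep their first-seen order
    return sorted(scores, key=scores.get, reverse=True)


# Assumed per-retriever rankings for the query "apples"
bm25_hits = ["I like apples", "Apples and oranges are fruits"]
dense_hits = ["You like apples", "You like oranges"]

fused = weighted_rrf([bm25_hits, dense_hits], weights=[0.5, 0.5])
print(fused)
```

With equal weights, both rank-1 documents tie ahead of both rank-2 documents, so the fused list interleaves the two retrievers. A document returned by both retrievers would accumulate two contributions and move up.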

In [19]:
# Runtime configuration
from langchain_core.runnables import ConfigurableField

faiss_retriever = faiss_vectorstore.as_retriever(
    search_kwargs={"k": 2}
).configurable_fields(
    search_kwargs=ConfigurableField(
        id="search_kwargs_faiss",
        name="Search Kwargs",
        description="The search kwargs to use",
    )
)

ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, faiss_retriever], weights=[0.5, 0.5]
)
ensemble_retriever

EnsembleRetriever(retrievers=[BM25Retriever(vectorizer=<rank_bm25.BM25Okapi object at 0x76e5fc203470>, k=2), RunnableConfigurableFields(default=VectorStoreRetriever(tags=['FAISS', 'OpenAIEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x76e5d01785f0>, search_kwargs={'k': 2}), fields={'search_kwargs': ConfigurableField(id='search_kwargs_faiss', name='Search Kwargs', description='The search kwargs to use', annotation=None, is_shared=False)})], weights=[0.5, 0.5])

In [20]:
config = {"configurable": {"search_kwargs_faiss": {"k": 1}}}
docs = ensemble_retriever.invoke("apples", config=config)
docs

[Document(metadata={'source': 1}, page_content='I like apples'),
 Document(metadata={'source': 2}, page_content='You like apples'),
 Document(metadata={'source': 1}, page_content='Apples and oranges are fruits')]

### 2-11-5. Self-Query

1. `SelfQueryRetriever`

## 2-12. Tools

## 2-13. Agents

# 3. Llama

- [meta-llama/llama-models](https://github.com/meta-llama/llama-models) on GitHub
- [meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct) on Hugging Face

In [3]:
# !pip3 install huggingface-hub
!huggingface-cli login --token --add-to-git-credential

Token is valid (permission: fineGrained).
Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in your terminal in case you want to set the 'store' credential helper as default.

git config --global credential.helper store

Read https://git-scm.com/book/en/v2/Git-Tools-Credential-Storage for more details.
Token has not been saved to git credential helper.
Your token has been saved to /home/yungshun317/.cache/huggingface/token
Login successful


In [5]:
!huggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct --include "original/*" --local-dir models/meta-llama/Meta-Llama-3.1-8B-Instruct

Fetching 3 files:   0%|                                   | 0/3 [00:00<?, ?it/s]
Downloading 'original/params.json' to 'models/meta-llama/Meta-Llama-3.1-8B-Instruct/.cache/huggingface/download/original/params.json.f1131204e79d0c09d2bac93f11569a8a655d68ba.incomplete'
Downloading 'original/tokenizer.model' to 'models/meta-llama/Meta-Llama-3.1-8B-Instruct/.cache/huggingface/download/original/tokenizer.model.82e9d31979e92ab929cd544440f129d9ecd797b69e327f80f17e1c50d5551b55.incomplete'
Downloading 'original/consolidated.00.pth' to 'models/meta-llama/Meta-Llama-3.1-8B-Instruct/.cache/huggingface/download/original/consolidated.00.pth.ab33d910f405204e5d388bc3521503584800461dc96808e287821dd451c1edac.incomplete'

tokenizer.model:   0%|                              | 0.00/2.18M [00:00<?, ?B/s]
consolidated.00.pth:   0%|                          | 0.00/16.1G [00:00<?, ?B/s]
original/params.json: 100%|█████████████████████| 199/199 [00:00<00:00, 920kB/s]
Download complete. Moving