docs: Update LangChain Expression Language documentation (#395)

davidmigloz committed Apr 30, 2024
1 parent 8bb2b8e commit 6ce75e5
Showing 26 changed files with 2,018 additions and 664 deletions.
14 changes: 11 additions & 3 deletions docs/_sidebar.md
@@ -1,14 +1,22 @@
-- [Get started](/)
+- [Get started](README.md)
- [Installation](/get_started/installation.md)
- [Quickstart](/get_started/quickstart.md)
- [Security](/get_started/security.md)
- [LangChain Expression Language](/expression_language/expression_language.md)
- [Get started](/expression_language/get_started.md)
-- [Interface](/expression_language/interface.md)
+- [Runnable interface](/expression_language/interface.md)
+- [Primitives](/expression_language/primitives.md)
+- [Sequence: Chaining runnables](/expression_language/primitives/sequence.md)
+- [Map: Formatting inputs & concurrency](/expression_language/primitives/map.md)
+- [Passthrough: Passing inputs through](/expression_language/primitives/passthrough.md)
+- [Mapper: Mapping inputs](/expression_language/primitives/mapper.md)
+- [Function: Run custom logic](/expression_language/primitives/function.md)
+- [Binding: Configuring runnables](/expression_language/primitives/binding.md)
+- [Router: Routing inputs](/expression_language/primitives/router.md)
+- [Streaming](/expression_language/streaming.md)
- Cookbook
- [Prompt + LLM](/expression_language/cookbook/prompt_llm_parser.md)
- [Multiple chains](/expression_language/cookbook/multiple_chains.md)
- [Route logic based on input](/expression_language/cookbook/routing.md)
- [Adding memory](/expression_language/cookbook/adding_memory.md)
- [Retrieval](/expression_language/cookbook/retrieval.md)
- [Using Tools](/expression_language/cookbook/tools.md)
4 changes: 2 additions & 2 deletions docs/expression_language/expression_language.md
@@ -2,7 +2,7 @@

LangChain Expression Language, or LCEL, is a declarative way to easily compose chains together. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains (we’ve seen folks successfully run LCEL chains with 100s of steps in production). To highlight a few of the reasons you might want to use LCEL:

-- **Streaming support:** When you build your chains with LCEL you get the best possible time-to-first-token (time elapsed until the first chunk of output comes out). For some chains this means eg. we stream tokens straight from an LLM to a streaming output parser, and you get back parsed, incremental chunks of output at the same rate as the LLM provider outputs the raw tokens.
-- **Optimized parallel execution:** Whenever your LCEL chains have steps that can be executed in parallel (eg if you fetch documents from multiple retrievers) we automatically do it for the smallest possible latency.
+- **First-class streaming support:** When you build your chains with LCEL you get the best possible time-to-first-token (the time elapsed until the first chunk of output is produced). For some chains this means that, e.g., we stream tokens straight from an LLM to a streaming output parser, and you get back parsed, incremental chunks of output at the same rate as the LLM provider emits the raw tokens (see the sketch after this list).
+- **Optimized concurrent execution:** Whenever your LCEL chains have steps that can be executed concurrently (e.g. if you fetch documents from multiple retrievers), we automatically do it for the smallest possible latency.
- **Retries and fallbacks:** Configure retries and fallbacks for any part of your LCEL chain. This is a great way to make your chains more reliable at scale.
- **Access intermediate results:** For more complex chains it’s often very useful to access the results of intermediate steps even before the final output is produced. This can be used to let end-users know something is happening, or even just to debug your chain.
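
To illustrate the streaming point, here is a minimal sketch (not part of the commit) of consuming a chain's output incrementally. It assumes the prompt/model/parser components used throughout these docs and the `stream` method that `Runnable` exposes alongside `invoke`:

```dart
import 'dart:io';

import 'package:langchain/langchain.dart';
import 'package:langchain_openai/langchain_openai.dart';

Future<void> main() async {
  final openaiApiKey = Platform.environment['OPENAI_API_KEY'];

  // A simple "prompt + model + output parser" chain.
  final promptTemplate = ChatPromptTemplate.fromTemplate(
    'Tell me a joke about {topic}',
  );
  final chain = promptTemplate
      .pipe(ChatOpenAI(apiKey: openaiApiKey))
      .pipe(const StringOutputParser<ChatResult>());

  // `stream` yields output chunks as soon as they are available,
  // instead of waiting for the full response like `invoke` does.
  await for (final chunk in chain.stream({'topic': 'ice cream'})) {
    stdout.write(chunk);
  }
}
```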
118 changes: 94 additions & 24 deletions docs/expression_language/get_started.md
@@ -96,15 +96,50 @@ To follow the steps along:
3. The `model` component takes the generated prompt and passes it into the OpenAI chat model for evaluation. The generated output from the model is a `ChatMessage` object (specifically an `AIChatMessage`).
4. Finally, the `outputParser` component takes in a `ChatMessage` and transforms this into a `String`, which is returned from the `invoke` method.

![Pipeline](img/pipeline.png)

Note that if you’re curious about the output of any of the components, you can always test out a smaller version of the chain, such as `promptTemplate` or `promptTemplate.pipe(model)`, to see the intermediate results.

```dart
final input = {'topic': 'ice cream'};
final res1 = await promptTemplate.invoke(input);
print(res1.toChatMessages());
// [HumanChatMessage{
//   content: ChatMessageContentText{
//     text: Tell me a joke about ice cream,
//   },
// }]
final res2 = await promptTemplate.pipe(model).invoke(input);
print(res2);
// ChatResult{
//   id: chatcmpl-9J37Tnjm1dGUXqXBF98k7jfexATZW,
//   output: AIChatMessage{
//     content: Why did the ice cream cone go to therapy? Because it had too many sprinkles of emotional issues!,
//   },
//   finishReason: FinishReason.stop,
//   metadata: {
//     model: gpt-3.5-turbo-0125,
//     created: 1714327251,
//     system_fingerprint: fp_3b956da36b
//   },
//   usage: LanguageModelUsage{
//     promptTokens: 14,
//     promptBillableCharacters: null,
//     responseTokens: 21,
//     responseBillableCharacters: null,
//     totalTokens: 35
//   },
//   streaming: false
// }
```
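
Extending the same idea, piping the output parser onto this subchain returns the parsed `String` directly. This is a small sketch assuming the `outputParser` (`StringOutputParser<ChatResult>`) that the full chain above was built with:

```dart
final res3 = await promptTemplate.pipe(model).pipe(outputParser).invoke(input);
print(res3);
// Output will vary, e.g.:
// Why did the ice cream cone go to therapy?
// Because it had too many sprinkles of emotional issues!
```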

## RAG Search Example

For our next example, we want to run a retrieval-augmented generation chain to add some context when responding to questions.

```dart
final openaiApiKey = Platform.environment['OPENAI_API_KEY'];
// 1. Create a vector store and add documents to it
final vectorStore = MemoryVectorStore(
embeddings: OpenAIEmbeddings(apiKey: openaiApiKey),
@@ -116,44 +151,79 @@ await vectorStore.addDocuments(
],
);
-// 2. Construct a RAG prompt template
+// 2. Define the retrieval chain
+final retriever = vectorStore.asRetriever();
+final setupAndRetrieval = Runnable.fromMap<String>({
+  'context': retriever.pipe(
+    Runnable.mapInput((docs) => docs.map((d) => d.pageContent).join('\n')),
+  ),
+  'question': Runnable.passthrough(),
+});
+// 3. Construct a RAG prompt template
final promptTemplate = ChatPromptTemplate.fromTemplates([
  (ChatMessageType.system, 'Answer the question based on only the following context:\n{context}'),
  (ChatMessageType.human, '{question}'),
]);
-// 3. Create a Runnable that combines the retrieved documents into a single string
-final docCombiner = Runnable.fromFunction<List<Document>, String>((docs, _) {
-  return docs.map((final d) => d.pageContent).join('\n');
-});
-// 4. Define the RAG pipeline
-final chain = Runnable.fromMap<String>({
-  'context': vectorStore.asRetriever().pipe(docCombiner),
-  'question': Runnable.passthrough(),
-})
+// 4. Define the final chain
+final model = ChatOpenAI(apiKey: openaiApiKey);
+const outputParser = StringOutputParser<ChatResult>();
+final chain = setupAndRetrieval
    .pipe(promptTemplate)
-    .pipe(ChatOpenAI(apiKey: openaiApiKey))
-    .pipe(StringOutputParser());
+    .pipe(model)
+    .pipe(outputParser);
// 5. Run the pipeline
final res = await chain.invoke('Who created LangChain.dart?');
print(res);
// David created LangChain.dart
```

-In this chain we add some extra logic around retrieving context from a vector store.
+In this case, the composed chain is:

```dart
final chain = setupAndRetrieval
    .pipe(promptTemplate)
    .pipe(model)
    .pipe(outputParser);
```

To explain this, we can first see that the prompt template above takes in `context` and `question` as values to be substituted in the prompt. Before building the prompt template, we want to retrieve the documents relevant to the question and include them as part of the context.
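
As a quick illustration (a sketch added here for clarity, not part of the original example), you can invoke the prompt template on its own with hand-written values; the `context` string below is made up for demonstration:

```dart
final promptValue = await promptTemplate.invoke({
  'context': 'LangChain was created by Harrison',
  'question': 'Who created LangChain?',
});
print(promptValue.toChatMessages());
// A SystemChatMessage carrying the instructions and the context,
// followed by a HumanChatMessage carrying the question.
```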

As a preliminary step, we’ve set up the retriever using an in-memory store, which can retrieve documents based on a query. This is a runnable component as well, which can be chained together with other components, but you can also run it separately:

-We first instantiate our vector store and add some documents to it. Then we define our prompt, which takes in two input variables:
```dart
final res1 = await retriever.invoke('Who created LangChain.dart?');
print(res1);
// [Document{pageContent: David ported LangChain to Dart in LangChain.dart},
// Document{pageContent: LangChain was created by Harrison, metadata: {}}]
```

We then use a `RunnableMap` to prepare the expected inputs for the prompt: the `retriever` performs the document search, a `RunnableMapInput` combines the retrieved documents into a single string, and a `RunnablePassthrough` forwards the user's original question:

-- `context` -> this is a string which is returned from our vector store based on a semantic search from the input.
-- `question` -> this is the question we want to ask.
```dart
final setupAndRetrieval = Runnable.fromMap<String>({
  'context': retriever.pipe(
    Runnable.mapInput((docs) => docs.map((d) => d.pageContent).join('\n')),
  ),
  'question': Runnable.passthrough(),
});
```

-In our `chain`, we use a `RunnableMap` which is special type of runnable that takes an object of runnables and executes them all in parallel. It then returns an object with the same keys as the input object, but with the values replaced with the output of the runnables.
+To review, the complete chain is:

-In our case, it has two sub-chains to get the data required by our prompt:
```dart
final chain = setupAndRetrieval
    .pipe(promptTemplate)
    .pipe(model)
    .pipe(outputParser);
```

-- `context` -> this is a `RunnableFunction` which takes the input from the `.invoke()` call, makes a request to our vector store, and returns the retrieved documents combined in a single String.
-- `question` -> this uses a `RunnablePassthrough` which simply passes whatever the input was through to the next step, and in our case it returns it to the key in the object we defined.
+With the flow being:
1. The first step creates a `RunnableMap` object with two entries. The first entry, `context`, will include the combined document results fetched by the retriever. The second entry, `question`, will contain the user’s original question. To pass on the `question`, we use `RunnablePassthrough` to copy this entry.
2. Feed the map from the step above to the `promptTemplate` component. It then takes the user input (`question`) as well as the retrieved documents (`context`) to construct a prompt and output a `PromptValue`.
3. The `model` component takes the generated prompt and passes it into the OpenAI chat model for evaluation. The generated output from the model is a `ChatResult` object.
4. Finally, the `outputParser` component takes in the `ChatResult` and transforms this into a Dart `String`, which is returned from the `invoke` method.
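
Since the composed chain is itself a `Runnable`, the other `Runnable` methods apply to it as well. As a small sketch (not part of the original example, and assuming the `chain` defined above), `batch` runs the chain over several inputs concurrently:

```dart
// `batch` accepts a list of inputs and returns the outputs in the same order.
final answers = await chain.batch([
  'Who created LangChain.dart?',
  'Who created LangChain?',
]);
print(answers);
// e.g. [David created LangChain.dart, LangChain was created by Harrison]
```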

-Finally, we chain together the prompt, model, and output parser as before.
+![RAG Pipeline](img/rag_pipeline.png)
Binary file added docs/expression_language/img/pipeline.png
Binary file added docs/expression_language/img/rag_pipeline.png
