Merged
2 changes: 1 addition & 1 deletion VERSION.txt
@@ -1 +1 @@
-2.29.0-rc0
+2.30.0-rc0
4 changes: 2 additions & 2 deletions docs-website/docusaurus.config.js
@@ -79,7 +79,7 @@ j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
beforeDefaultRemarkPlugins: [require('./src/remark/versionedReferenceLinks')],
versions: {
current: {
-label: '2.29-unstable',
+label: '2.30-unstable',
path: 'next',
banner: 'unreleased',
},
@@ -132,7 +132,7 @@ j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
exclude: ['**/_templates/**'],
versions: {
current: {
-label: '2.29-unstable',
+label: '2.30-unstable',
path: 'next',
banner: 'unreleased',
},

@@ -0,0 +1,181 @@
---
title: "ChatMessage Store"
id: experimental-chatmessage-store-api
description: "Storage for the chat messages."
slug: "/experimental-chatmessage-store-api"
---

<a id="haystack_experimental.chat_message_stores.in_memory"></a>

## Module haystack\_experimental.chat\_message\_stores.in\_memory

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore"></a>

### InMemoryChatMessageStore

Stores chat messages in-memory.

The `chat_history_id` parameter is used as a unique identifier for each conversation or chat session.
It acts as a namespace that isolates messages from different sessions. Each `chat_history_id` value corresponds to a
separate list of `ChatMessage` objects stored in memory.

Typical usage involves providing a unique `chat_history_id` (for example, a session ID or conversation ID)
whenever you write, read, or delete messages. This ensures that chat messages from different
conversations do not overlap.

Usage example:
```python
from haystack.dataclasses import ChatMessage
from haystack_experimental.chat_message_stores.in_memory import InMemoryChatMessageStore

message_store = InMemoryChatMessageStore()

messages = [
    ChatMessage.from_assistant("Hello, how can I help you?"),
    ChatMessage.from_user("Hi, I have a question about Python. What is a Protocol?"),
]
message_store.write_messages(chat_history_id="user_456_session_123", messages=messages)
retrieved_messages = message_store.retrieve_messages(chat_history_id="user_456_session_123")

print(retrieved_messages)
```
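
The namespacing described above can be pictured as a plain dictionary keyed by `chat_history_id`. The following is a minimal sketch of that idea, not the library's implementation: `ToyStore` and its internals are hypothetical, and messages are plain strings rather than `ChatMessage` objects.

```python
from collections import defaultdict


class ToyStore:
    """Hypothetical stand-in that illustrates per-session isolation only."""

    def __init__(self) -> None:
        # One independent message list per chat_history_id.
        self._histories = defaultdict(list)

    def write_messages(self, chat_history_id: str, messages: list) -> int:
        self._histories[chat_history_id].extend(messages)
        return len(messages)

    def retrieve_messages(self, chat_history_id: str) -> list:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._histories.get(chat_history_id, []))

    def delete_messages(self, chat_history_id: str) -> None:
        # Removing one session leaves every other session untouched.
        self._histories.pop(chat_history_id, None)


store = ToyStore()
store.write_messages("session_a", ["Hello", "How can I help?"])
store.write_messages("session_b", ["Unrelated conversation"])
store.delete_messages("session_a")

session_a = store.retrieve_messages("session_a")
session_b = store.retrieve_messages("session_b")
print(session_a, session_b)
```

Deleting `session_a` leaves `session_b` intact, which is the property that a unique id per conversation is meant to provide.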

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore.__init__"></a>

#### InMemoryChatMessageStore.\_\_init\_\_

```python
def __init__(skip_system_messages: bool = True,
             last_k: int | None = 10) -> None
```

Create an InMemoryChatMessageStore.

**Arguments**:

- `skip_system_messages`: Whether to skip storing system messages. Defaults to True.
- `last_k`: The number of last messages to retrieve. Defaults to 10 messages if not specified.
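
As a rough illustration of what `skip_system_messages` implies, the sketch below filters out system messages before they would be stored. It assumes messages are simple `(role, text)` pairs; `drop_system_messages` is a hypothetical helper and says nothing about how the store applies the flag internally.

```python
def drop_system_messages(messages: list, skip_system_messages: bool = True) -> list:
    """Hypothetical filter over (role, text) pairs."""
    if not skip_system_messages:
        return list(messages)
    # Keep every message whose role is not "system".
    return [(role, text) for role, text in messages if role != "system"]


incoming = [
    ("system", "You are a helpful assistant."),
    ("user", "What is a Protocol?"),
    ("assistant", "A structural typing construct."),
]
stored = drop_system_messages(incoming)
print(stored)
```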

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore.to_dict"></a>

#### InMemoryChatMessageStore.to\_dict

```python
def to_dict() -> dict[str, Any]
```

Serializes the component to a dictionary.

**Returns**:

Dictionary with serialized data.

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore.from_dict"></a>

#### InMemoryChatMessageStore.from\_dict

```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "InMemoryChatMessageStore"
```

Deserializes the component from a dictionary.

**Arguments**:

- `data`: The dictionary to deserialize from.

**Returns**:

The deserialized component.
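
The `to_dict` and `from_dict` pair is meant to round-trip a component's configuration. The sketch below shows that pattern on a hypothetical class; the dictionary layout (a `type` string plus `init_parameters`) is an assumption made for illustration, not a statement of what `InMemoryChatMessageStore.to_dict` returns.

```python
from typing import Any, Optional


class ToySerializableStore:
    """Hypothetical class; only the round-trip pattern is the point."""

    def __init__(self, skip_system_messages: bool = True, last_k: Optional[int] = 10) -> None:
        self.skip_system_messages = skip_system_messages
        self.last_k = last_k

    def to_dict(self) -> dict:
        # Assumed layout: a type name plus the constructor arguments.
        return {
            "type": f"{type(self).__module__}.{type(self).__name__}",
            "init_parameters": {
                "skip_system_messages": self.skip_system_messages,
                "last_k": self.last_k,
            },
        }

    @classmethod
    def from_dict(cls, data: dict) -> "ToySerializableStore":
        # Rebuild the object from the stored constructor arguments.
        return cls(**data["init_parameters"])


original = ToySerializableStore(skip_system_messages=False, last_k=5)
restored = ToySerializableStore.from_dict(original.to_dict())
print(restored.to_dict() == original.to_dict())
```

Only configuration is serialized in this sketch; stored messages are deliberately left out.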

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore.count_messages"></a>

#### InMemoryChatMessageStore.count\_messages

```python
def count_messages(chat_history_id: str) -> int
```

Returns the number of chat messages stored under the given chat history id.

**Arguments**:

- `chat_history_id`: The chat history id for which to count messages.

**Returns**:

The number of messages.

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore.write_messages"></a>

#### InMemoryChatMessageStore.write\_messages

```python
def write_messages(chat_history_id: str, messages: list[ChatMessage]) -> int
```

Writes chat messages to the ChatMessageStore.

**Arguments**:

- `chat_history_id`: The chat history id under which to store the messages.
- `messages`: A list of ChatMessages to write.

**Raises**:

- `ValueError`: If messages is not a list of ChatMessages.

**Returns**:

The number of messages written.

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore.retrieve_messages"></a>

#### InMemoryChatMessageStore.retrieve\_messages

```python
def retrieve_messages(chat_history_id: str,
                      last_k: int | None = None) -> list[ChatMessage]
```

Retrieves the stored chat messages for the given chat history id, limited to the last `last_k` messages when a limit applies.

**Arguments**:

- `chat_history_id`: The chat history id from which to retrieve messages.
- `last_k`: The number of last messages to retrieve. If unspecified, the last_k parameter passed
to the constructor will be used.

**Raises**:

- `ValueError`: If last_k is not None and is less than 0.

**Returns**:

A list of chat messages.
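
The interaction between the per-call `last_k` and the constructor default can be sketched as follows. `select_last_k` is a hypothetical helper operating on plain strings. Returning an empty list for `last_k=0` and validating the effective value rather than only the per-call argument are assumptions of this sketch.

```python
from typing import Optional


def select_last_k(messages: list, last_k: Optional[int], default_last_k: Optional[int]) -> list:
    """Hypothetical helper: the per-call last_k wins, else the constructor default."""
    effective = last_k if last_k is not None else default_last_k
    if effective is None:
        return list(messages)  # no limit configured anywhere
    if effective < 0:
        raise ValueError("last_k must be greater than or equal to 0")
    # messages[-0:] would return the whole list, so handle 0 explicitly.
    return messages[-effective:] if effective > 0 else []


history = [f"m{i}" for i in range(1, 13)]  # 12 stored messages
default_view = select_last_k(history, None, 10)  # falls back to the default of 10
override_view = select_last_k(history, 2, 10)  # per-call value takes precedence
print(len(default_view), override_view)
```

The explicit branch for zero matters because Python treats `-0` as `0` in a slice, so a naive `messages[-last_k:]` would return every message instead of none.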

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore.delete_messages"></a>

#### InMemoryChatMessageStore.delete\_messages

```python
def delete_messages(chat_history_id: str) -> None
```

Deletes all chat messages stored under the given chat history id.

**Arguments**:

- `chat_history_id`: The chat history id from which to delete messages.

<a id="haystack_experimental.chat_message_stores.in_memory.InMemoryChatMessageStore.delete_all_messages"></a>

#### InMemoryChatMessageStore.delete\_all\_messages

```python
def delete_all_messages() -> None
```

Deletes all stored chat messages from all chat history ids.

@@ -0,0 +1,152 @@
---
title: "Generators"
id: experimental-generators-api
description: "Enables text generation using LLMs."
slug: "/experimental-generators-api"
---

<a id="haystack_experimental.components.generators.chat.openai"></a>

## Module haystack\_experimental.components.generators.chat.openai

<a id="haystack_experimental.components.generators.chat.openai.OpenAIChatGenerator"></a>

### OpenAIChatGenerator

An OpenAI chat-based text generator component that supports hallucination risk scoring.

This is based on the paper
[LLMs are Bayesian, in Expectation, not in Realization](https://arxiv.org/abs/2507.11768).

## Usage Example:

```python
from haystack.dataclasses import ChatMessage

from haystack_experimental.utils.hallucination_risk_calculator.dataclasses import HallucinationScoreConfig
from haystack_experimental.components.generators.chat.openai import OpenAIChatGenerator

# Evidence-based Example
llm = OpenAIChatGenerator(model="gpt-4o")
rag_result = llm.run(
    messages=[
        ChatMessage.from_user(
            text="Task: Answer strictly based on the evidence provided below.\n"
            "Question: Who won the Nobel Prize in Physics in 2019?\n"
            "Evidence:\n"
            "- Nobel Prize press release (2019): James Peebles (1/2); Michel Mayor & Didier Queloz (1/2).\n"
            "Constraints: If evidence is insufficient or conflicting, refuse."
        )
    ],
    hallucination_score_config=HallucinationScoreConfig(skeleton_policy="evidence_erase"),
)
print(f"Decision: {rag_result['replies'][0].meta['hallucination_decision']}")
print(f"Risk bound: {rag_result['replies'][0].meta['hallucination_risk']:.3f}")
print(f"Rationale: {rag_result['replies'][0].meta['hallucination_rationale']}")
print(f"Answer:\n{rag_result['replies'][0].text}")
print("---")
```

<a id="haystack_experimental.components.generators.chat.openai.OpenAIChatGenerator.run"></a>

#### OpenAIChatGenerator.run

```python
@component.output_types(replies=list[ChatMessage])
def run(
    messages: list[ChatMessage],
    streaming_callback: StreamingCallbackT | None = None,
    generation_kwargs: dict[str, Any] | None = None,
    *,
    tools: ToolsType | None = None,
    tools_strict: bool | None = None,
    hallucination_score_config: HallucinationScoreConfig | None = None
) -> dict[str, list[ChatMessage]]
```

Invokes chat completion based on the provided messages and generation parameters.

**Arguments**:

- `messages`: A list of ChatMessage instances representing the input messages.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters will
override the parameters passed during component initialization.
For details on OpenAI API parameters, see [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat/create).
- `tools`: A list of Tool and/or Toolset objects, or a single Toolset for which the model can prepare calls.
If set, it will override the `tools` parameter provided during initialization.
- `tools_strict`: Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
If set, it will override the `tools_strict` parameter set during component initialization.
- `hallucination_score_config`: If provided, the generator will evaluate the hallucination risk of its responses using
the OpenAIPlanner and annotate each response with hallucination metrics.
This involves generating multiple samples and analyzing their consistency, which may increase
latency and cost. Use this option when you need to assess the reliability of the generated content
in scenarios where accuracy is critical.
For details, see the [research paper](https://arxiv.org/abs/2507.11768).

**Returns**:

A dictionary with the following key:
- `replies`: A list containing the generated responses as ChatMessage instances. If hallucination
scoring is enabled, each message will include additional metadata:
    - `hallucination_decision`: "ANSWER" if the model decided to answer, "REFUSE" if it abstained.
    - `hallucination_risk`: The EDFL hallucination risk bound.
    - `hallucination_rationale`: The rationale behind the hallucination decision.
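
Downstream code will often want to gate on the metadata keys listed above. The sketch below shows one way to do that with plain dictionaries standing in for a reply's `meta`. The helper `should_use_reply`, the `max_risk` threshold of 0.05, and the sample values are all hypothetical and are not real model output.

```python
from typing import Any, Dict


def should_use_reply(meta: Dict[str, Any], max_risk: float = 0.05) -> bool:
    """Hypothetical gate: accept only answered replies under a chosen risk bound."""
    if meta.get("hallucination_decision") != "ANSWER":
        return False
    risk = meta.get("hallucination_risk")
    # A missing risk value means scoring was not enabled; treat it as not accepted.
    return risk is not None and risk <= max_risk


# Illustrative metadata only; the numbers are made up for the example.
answered = {
    "hallucination_decision": "ANSWER",
    "hallucination_risk": 0.01,
    "hallucination_rationale": "example rationale",
}
refused = {
    "hallucination_decision": "REFUSE",
    "hallucination_risk": 0.40,
    "hallucination_rationale": "example rationale",
}
print(should_use_reply(answered), should_use_reply(refused))
```

Treating replies without scoring metadata as rejected is a conservative design choice for this sketch; an application could equally pass them through.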

<a id="haystack_experimental.components.generators.chat.openai.OpenAIChatGenerator.run_async"></a>

#### OpenAIChatGenerator.run\_async

```python
@component.output_types(replies=list[ChatMessage])
async def run_async(
    messages: list[ChatMessage],
    streaming_callback: StreamingCallbackT | None = None,
    generation_kwargs: dict[str, Any] | None = None,
    *,
    tools: ToolsType | None = None,
    tools_strict: bool | None = None,
    hallucination_score_config: HallucinationScoreConfig | None = None
) -> dict[str, list[ChatMessage]]
```

Asynchronously invokes chat completion based on the provided messages and generation parameters.

This is the asynchronous version of the `run` method. It has the same parameters and return values
but can be used with `await` in async code.

**Arguments**:

- `messages`: A list of ChatMessage instances representing the input messages.
- `streaming_callback`: A callback function that is called when a new token is received from the stream.
Must be a coroutine.
- `generation_kwargs`: Additional keyword arguments for text generation. These parameters will
override the parameters passed during component initialization.
For details on OpenAI API parameters, see [OpenAI documentation](https://platform.openai.com/docs/api-reference/chat/create).
- `tools`: A list of Tool and/or Toolset objects, or a single Toolset for which the model can prepare calls.
If set, it will override the `tools` parameter provided during initialization.
- `tools_strict`: Whether to enable strict schema adherence for tool calls. If set to `True`, the model will follow exactly
the schema provided in the `parameters` field of the tool definition, but this may increase latency.
If set, it will override the `tools_strict` parameter set during component initialization.
- `hallucination_score_config`: If provided, the generator will evaluate the hallucination risk of its responses using
the OpenAIPlanner and annotate each response with hallucination metrics.
This involves generating multiple samples and analyzing their consistency, which may increase
latency and cost. Use this option when you need to assess the reliability of the generated content
in scenarios where accuracy is critical.
For details, see the [research paper](https://arxiv.org/abs/2507.11768).

**Returns**:

A dictionary with the following key:
- `replies`: A list containing the generated responses as ChatMessage instances. If hallucination
scoring is enabled, each message will include additional metadata:
    - `hallucination_decision`: "ANSWER" if the model decided to answer, "REFUSE" if it abstained.
    - `hallucination_risk`: The EDFL hallucination risk bound.
    - `hallucination_rationale`: The rationale behind the hallucination decision.
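
The requirement that the streaming callback be a coroutine in the async path can be illustrated without calling any API. In the sketch below, `fake_stream` is a hypothetical stand-in for a streaming run, and chunks are plain strings rather than the chunk objects the real callback would receive.

```python
import asyncio
import inspect
from typing import Awaitable, Callable, List

received: List[str] = []


async def on_chunk(chunk: str) -> None:
    """Coroutine callback: it can be awaited from async code."""
    received.append(chunk)


async def fake_stream(callback: Callable[[str], Awaitable[None]]) -> str:
    """Hypothetical stand-in for a streaming run; it makes no API calls."""
    if not inspect.iscoroutinefunction(callback):
        raise TypeError("streaming callback must be a coroutine function")
    chunks = ["Hel", "lo"]
    for chunk in chunks:
        # Await the callback once per chunk, as an async streaming loop would.
        await callback(chunk)
    return "".join(chunks)


result = asyncio.run(fake_stream(on_chunk))
print(result, received)
```

A regular `def` callback would fail the `inspect.iscoroutinefunction` check here, which mirrors why the synchronous and asynchronous methods expect different callback kinds.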
