Merged
12 changes: 11 additions & 1 deletion Changelog.md
@@ -1,4 +1,14 @@
### [0.0.39] - unreleased
### [0.0.40] - unreleased

Added:

- Use a single YOLO profile merge instead of multiple per-profile merges, reducing token cost by ~30%

Fixed:

- Profiles occasionally being generated in Chinese

### [0.0.39] - 2025/8/9

**Added**

12 changes: 9 additions & 3 deletions readme.md
@@ -50,14 +50,20 @@



Memobase is a **user profile-based memory system** designed to bring long-term user memory to your Generative AI (GenAI) applications. Whether you're building virtual companions, educational tools, or personalized assistants, Memobase empowers your AI to **remember**, **understand**, and **evolve** with your users.
Memobase is a **user profile-based memory system** designed to bring long-term user memory to your LLM applications. Whether you're building virtual companions, educational tools, or personalized assistants, Memobase empowers your AI to **remember**, **understand**, and **evolve** with your users.



Memobase can provide you structured profiles of users, check out the [result](./docs/experiments/900-chats/readme.md) (compared with [mem0](https://github.com/mem0ai/mem0)) from a 900-turns real-world chatting:
Memobase strikes the right balance for your product among the many memory solutions out there. We focus on three key metrics simultaneously (a minimal usage sketch follows the list):

- **Performance**: Although Memobase is not specifically designed for RAG/search tasks, it still achieves top-tier search performance on the LOCOMO benchmark.
- **LLM Cost**: Memobase keeps a built-in buffer for each user and batch-processes their chats, amortizing LLM overhead across messages. We also design our prompts and workflows carefully, so there are no "agents" in the system that could drive up costs.
- **Latency**: Memobase works much like the memory system behind ChatGPT: every user always has a ready-to-read profile and event timeline. You can fetch a user's most important memories with only a few SQL operations and no pre-processing, keeping online latency under 100ms.

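Here is a minimal sketch of that access pattern using the Python SDK. The names below (`MemoBaseClient`, `add_user`, `get_user`, `insert`, `flush`, `profile`) follow the current docs, but treat the exact signatures as illustrative and check the SDK reference before relying on them:

```python
from memobase import MemoBaseClient, ChatBlob

# Placeholder endpoint and key; point these at your own Memobase project.
client = MemoBaseClient("http://localhost:8019", api_key="secret")

uid = client.add_user()
u = client.get_user(uid)

# Chats land in the user's buffer and are batch-processed later,
# which is how the LLM cost gets amortized.
u.insert(ChatBlob(messages=[
    {"role": "user", "content": "I just moved to Berlin for a new job."},
    {"role": "assistant", "content": "Congrats! How are you settling in?"},
]))
u.flush()  # optionally force the buffer to process right away

# Reads hit the precomputed profile/event state, so they stay fast.
for p in u.profile():
    print(p)
```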


Check out the profile [result](./docs/experiments/900-chats/readme.md) (compared with [mem0](https://github.com/mem0ai/mem0)) from a 900-turn real-world chat:

<details>
<summary>Partial Profile Output</summary>

@@ -90,7 +96,7 @@
</details>

## 🎉 Recent Updates
- `0.0.38`: we updated the workflows in Memobase, reducing the insert cost by 30%
- `0.0.40`: we updated the internal workflows in Memobase, reducing the LLM calls in a single run from roughly 3-10 down to a fixed 3, which cuts token costs by roughly 40-50% (consider updating your Memobase version!). A sketch of the idea follows this list.
- `0.0.37`: we added fine-grained event gists, enabling detailed search over users' timelines. [Re-ran the LOCOMO benchmark](./docs/experiments/locomo-benchmark) and we're SOTA!
- `0.0.36`: we updated the search of the `context` api, so a search takes 500~1000ms (depending on the embedding API you're using). You can also [pass a prompt template](https://docs.memobase.io/api-reference/prompt/get_context#parameter-customize-context-prompt) to the `context` api to pack memories directly into your prompt.

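The `0.0.40` change is easiest to see as a sketch. Everything here is illustrative: `llm_call`, `merge_per_slot`, and `merge_yolo` are hypothetical stand-ins for the internal merge workflow, not the actual Memobase API:

```python
import asyncio

async def llm_call(prompt: str) -> str:
    """Hypothetical stand-in for one LLM request."""
    await asyncio.sleep(0.01)
    return f"merged<{prompt[:24]}...>"

# Before (~0.0.39): one merge prompt per touched profile slot,
# so a single run could cost anywhere from 3 to 10 LLM calls.
async def merge_per_slot(slots: dict[str, str], facts: dict[str, str]) -> list[str]:
    return [await llm_call(f"{k}: {slots[k]} + {facts[k]}") for k in slots]

# After (0.0.40, "YOLO" merge): every touched slot is packed into one
# prompt and merged in a single LLM call, fixing the call count.
async def merge_yolo(slots: dict[str, str], facts: dict[str, str]) -> str:
    packed = "\n".join(f"{k}: {slots[k]} + {facts[k]}" for k in slots)
    return await llm_call(packed)

if __name__ == "__main__":
    slots = {"work": "software engineer", "home": "Berlin"}
    facts = {"work": "promoted to manager", "home": "moving to Munich"}
    print(asyncio.run(merge_yolo(slots, facts)))  # one call, not len(slots)
```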
@@ -5,11 +5,14 @@
from ....utils import get_blob_str, get_encoded_tokens
from ....models.blob import Blob
from ....models.utils import Promise, CODE
from ....models.response import IdsData, ChatModalResponse
from ....models.response import IdsData, ChatModalResponse, UserProfilesData
from ...profile import add_update_delete_user_profiles
from ...event import append_user_event
from ...profile import get_user_profiles
from .extract import extract_topics
from .merge import merge_or_valid_new_memos

# from .merge import merge_or_valid_new_memos
from .merge_yolo import merge_or_valid_new_memos
from .summary import re_summary
from .organize import organize_profiles
from .types import MergeAddResult
@@ -47,7 +50,14 @@ async def process_blobs(
return p
project_profiles = p.data()

p = await entry_chat_summary(user_id, project_id, blobs, project_profiles)
p = await get_user_profiles(user_id, project_id)
if not p.ok():
return p
current_user_profiles = p.data()

p = await entry_chat_summary(
user_id, project_id, blobs, project_profiles, current_user_profiles
)
if not p.ok():
return p
user_memo_str = p.data().strip()
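Throughout these controllers, fallible steps return a `Promise` that is checked with `p.ok()`, unwrapped with `p.data()`, and propagated by returning the failed `Promise` as-is. A minimal sketch of that result type, assuming an interface consistent with the calls in this diff (the real `models.utils.Promise` also carries a `CODE` and may differ in detail):

```python
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

class Promise(Generic[T]):
    """Sketch of a result wrapper: holds either a value or an error."""

    def __init__(self, value: Optional[T] = None, error: Optional[str] = None):
        self._value, self._error = value, error

    def ok(self) -> bool:
        return self._error is None

    def data(self) -> Optional[T]:
        return self._value

    @classmethod
    def resolve(cls, value: T) -> "Promise[T]":
        return cls(value=value)

    @classmethod
    def reject(cls, error: str) -> "Promise[T]":
        return cls(error=error)
```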
@@ -63,8 +73,12 @@ async def process_blobs(
)

processing_results = await asyncio.gather(
process_profile_res(user_id, project_id, user_memo_str, project_profiles),
process_event_res(user_id, project_id, user_memo_str, project_profiles),
process_profile_res(
user_id, project_id, user_memo_str, project_profiles, current_user_profiles
),
process_event_res(
user_id, project_id, user_memo_str, project_profiles, current_user_profiles
),
)

profile_results: Promise = processing_results[0]
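The shape of this refactor: `extract_topics` used to fetch the user's profiles itself, while now `process_blobs` fetches them once and threads `current_user_profiles` into both branches, which run concurrently. A condensed, self-contained sketch of that fetch-once/fan-out pattern (all helper names here are hypothetical):

```python
import asyncio

async def fetch_profiles(user_id: str) -> list[str]:
    return ["work: engineer"]  # stand-in for one DB read

async def profile_branch(memo: str, profiles: list[str]) -> str:
    return f"profiles updated against {len(profiles)} existing"

async def event_branch(memo: str, profiles: list[str]) -> str:
    return "event appended"

async def process(user_id: str, memo: str) -> list[str]:
    profiles = await fetch_profiles(user_id)  # fetch once, up front
    # Fan out: both branches reuse the same snapshot concurrently,
    # instead of each branch re-reading profiles from the database.
    return await asyncio.gather(
        profile_branch(memo, profiles),
        event_branch(memo, profiles),
    )

print(asyncio.run(process("u1", "moved to Berlin")))
```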
@@ -109,9 +123,12 @@ async def process_profile_res(
project_id: str,
user_memo_str: str,
project_profiles: ProfileConfig,
current_user_profiles: UserProfilesData,
) -> Promise[tuple[MergeAddResult, list[dict]]]:

p = await extract_topics(user_id, project_id, user_memo_str, project_profiles)
p = await extract_topics(
user_id, project_id, user_memo_str, project_profiles, current_user_profiles
)
if not p.ok():
return p
extracted_data = p.data()
@@ -170,6 +187,7 @@ async def process_event_res(
project_id: str,
memo_str: str,
config: ProfileConfig,
current_user_profiles: UserProfilesData,
) -> Promise[list | None]:
p = await tag_event(project_id, config, memo_str)
if not p.ok():
@@ -8,16 +8,25 @@
from ....prompts.profile_init_utils import read_out_event_tags
from ....prompts.utils import tag_chat_blobs_in_order_xml
from .types import FactResponse, PROMPTS
from ....models.response import UserProfilesData
from .utils import pack_current_user_profiles


async def entry_chat_summary(
user_id: str, project_id: str, blobs: list[Blob], project_profiles: ProfileConfig
user_id: str,
project_id: str,
blobs: list[Blob],
project_profiles: ProfileConfig,
current_user_profiles: UserProfilesData,
) -> Promise[str]:
assert all(b.type == BlobType.chat for b in blobs), "All blobs must be chat blobs"
USE_LANGUAGE = project_profiles.language or CONFIG.language
project_profiles_slots = read_out_profile_config(
project_profiles, PROMPTS[USE_LANGUAGE]["profile"].CANDIDATE_PROFILE_TOPICS
CURRENT_PROFILE_INFO = pack_current_user_profiles(
current_user_profiles, project_profiles
)

USE_LANGUAGE = CURRENT_PROFILE_INFO["use_language"]
project_profiles_slots = CURRENT_PROFILE_INFO["project_profile_slots"]

prompt = PROMPTS[USE_LANGUAGE]["entry_summary"]
event_summary_theme = (
project_profiles.event_theme_requirement or CONFIG.event_theme_requirement
@@ -33,7 +42,7 @@ async def entry_chat_summary(
blob_strs = tag_chat_blobs_in_order_xml(blobs)
r = await llm_complete(
project_id,
prompt.pack_input(blob_strs),
prompt.pack_input(CURRENT_PROFILE_INFO["already_topics_prompt"], blob_strs),
system_prompt=prompt.get_prompt(
profile_topics_str,
event_attriubtes_str,
@@ -43,4 +52,9 @@
model=CONFIG.summary_llm_model,
**prompt.get_kwargs(),
)

# print(
# prompt.pack_input(CURRENT_PROFILE_INFO["already_topics_prompt"], blob_strs),
# r.data(),
# )
return r
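Judging from how its result is indexed here and in `extract.py` (`use_language`, `project_profile_slots`, `already_topics_prompt`, `strict_mode`, `allowed_topic_subtopics`), the new `pack_current_user_profiles` helper appears to centralize the per-request setup that `extract_topics` previously did inline. A speculative sketch of its return contract, not the actual implementation:

```python
from typing import Optional, TypedDict

class PackedProfiles(TypedDict):
    use_language: str            # project language, falling back to the global default
    strict_mode: bool            # whether extraction is limited to configured slots
    project_profile_slots: list  # candidate topic/sub-topic definitions
    already_topics_prompt: str   # existing "topic / sub_topic / value" lines for the prompt
    allowed_topic_subtopics: Optional[set[tuple[str, str]]]  # None = no filtering
```

Centralizing this in one helper means `entry_summary` and `extract` can no longer drift apart in how they resolve language, strict mode, and the already-known topics.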
88 changes: 17 additions & 71 deletions src/server/api/memobase_server/controllers/modal/chat/extract.py
@@ -1,22 +1,15 @@
import asyncio
from ....env import CONFIG, ContanstTable, TRACE_LOG
from ....utils import truncate_string
from ....models.utils import Promise
from ....models.blob import Blob, BlobType
from ....models.response import AIUserProfiles, CODE
from ....models.response import AIUserProfiles, CODE, UserProfilesData
from ....llms import llm_complete
from ....prompts.utils import (
tag_chat_blobs_in_order_xml,
attribute_unify,
parse_string_into_profiles,
parse_string_into_merge_action,
)
from ....prompts.profile_init_utils import read_out_profile_config, UserProfileTopic
from ...profile import get_user_profiles
from ...project import ProfileConfig

# from ...project impor
from .types import FactResponse, PROMPTS
from .utils import pack_current_user_profiles


def merge_by_topic_sub_topics(new_facts: list[FactResponse]):
@@ -31,73 +24,26 @@ def merge_by_topic_sub_topics(new_facts: list[FactResponse]):


async def extract_topics(
user_id: str, project_id: str, user_memo: str, project_profiles: ProfileConfig
user_id: str,
project_id: str,
user_memo: str,
project_profiles: ProfileConfig,
current_user_profiles: UserProfilesData,
) -> Promise[dict]:
p = await get_user_profiles(user_id, project_id)
if not p.ok():
return p
profiles = p.data().profiles
USE_LANGUAGE = project_profiles.language or CONFIG.language
STRICT_MODE = (
project_profiles.profile_strict_mode
if project_profiles.profile_strict_mode is not None
else CONFIG.profile_strict_mode
)

project_profiles_slots = read_out_profile_config(
project_profiles, PROMPTS[USE_LANGUAGE]["profile"].CANDIDATE_PROFILE_TOPICS
profiles = current_user_profiles.profiles
CURRENT_PROFILE_INFO = pack_current_user_profiles(
current_user_profiles, project_profiles
)
if STRICT_MODE:
allowed_topic_subtopics = set()
for p in project_profiles_slots:
for st in p.sub_topics:
allowed_topic_subtopics.add(
(attribute_unify(p.topic), attribute_unify(st["name"]))
)
USE_LANGUAGE = CURRENT_PROFILE_INFO["use_language"]
STRICT_MODE = CURRENT_PROFILE_INFO["strict_mode"]

if len(profiles):
already_topics_subtopics = set(
[
(
attribute_unify(p.attributes[ContanstTable.topic]),
attribute_unify(p.attributes[ContanstTable.sub_topic]),
)
for p in profiles
]
)
already_topic_subtopics_values = {
(
attribute_unify(p.attributes[ContanstTable.topic]),
attribute_unify(p.attributes[ContanstTable.sub_topic]),
): p.content
for p in profiles
}
if STRICT_MODE:
already_topics_subtopics = already_topics_subtopics.intersection(
allowed_topic_subtopics
)
already_topic_subtopics_values = {
k: already_topic_subtopics_values[k] for k in already_topics_subtopics
}
already_topics_subtopics = sorted(already_topics_subtopics)
already_topics_prompt = "\n".join(
[
f"- {topic}{CONFIG.llm_tab_separator}{sub_topic}{CONFIG.llm_tab_separator}{truncate_string(already_topic_subtopics_values[(topic, sub_topic)], 5)}"
for topic, sub_topic in already_topics_subtopics
]
)
TRACE_LOG.info(
project_id,
user_id,
f"Already have {len(profiles)} profiles, {len(already_topics_subtopics)} topics",
)
else:
already_topics_prompt = ""
project_profiles_slots = CURRENT_PROFILE_INFO["project_profile_slots"]

p = await llm_complete(
project_id,
PROMPTS[USE_LANGUAGE]["extract"].pack_input(
already_topics_prompt,
CURRENT_PROFILE_INFO["already_topics_prompt"],
user_memo,
strict_mode=STRICT_MODE,
),
@@ -112,7 +58,7 @@ async def extract_topics(
results = p.data()
# print(
# PROMPTS[USE_LANGUAGE]["extract"].pack_input(
# already_topics_prompt,
# CURRENT_PROFILE_INFO["already_topics_prompt"],
# user_memo,
# strict_mode=STRICT_MODE,
# )
@@ -145,11 +91,11 @@
fact_attributes = []

for nf in new_facts:
if STRICT_MODE:
if CURRENT_PROFILE_INFO["allowed_topic_subtopics"] is not None:
if (
nf[ContanstTable.topic],
nf[ContanstTable.sub_topic],
) not in allowed_topic_subtopics:
) not in CURRENT_PROFILE_INFO["allowed_topic_subtopics"]:
continue
fact_contents.append(nf["memo"])
fact_attributes.append(
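The filtering loop above (truncated by the diff view) now keys off `allowed_topic_subtopics` directly, with `None` meaning strict mode is off. Extracted as a standalone sketch with hypothetical data, the rule is:

```python
# Hypothetical (topic, sub_topic, memo) facts produced by extraction.
new_facts = [
    ("basic_info", "location", "moved to Berlin"),
    ("gossip", "celebrities", "likes tabloids"),
]

# None disables filtering; a set restricts facts to configured slots.
allowed: set[tuple[str, str]] | None = {("basic_info", "location")}

kept = [f for f in new_facts if allowed is None or (f[0], f[1]) in allowed]
print(kept)  # only configured (topic, sub_topic) pairs survive
```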
@@ -182,17 +182,18 @@ async def handle_profile_merge_or_valid(
}
)
elif update_response["action"] == "ABORT":
oneline_response = r.data().replace("\n", " ")
if runtime_profile is None:
TRACE_LOG.info(
project_id,
user_id,
f"Invalid profile: {KEY}::{profile_content}, abort it\n<raw_response>\n{r.data()}\n</raw_response>",
f"Invalid profile: {KEY}::{profile_content}. <raw_response> {oneline_response} </raw_response>",
)
else:
TRACE_LOG.info(
project_id,
user_id,
f"Invalid merge: {runtime_profile.attributes}, {profile_content}, abort it\n<raw_response>\n{r.data()}\n</raw_response>",
f"Invalid merge: {runtime_profile.attributes}, {profile_content}. <raw_response> {oneline_response} </raw_response>",
)
# session_merge_validate_results["delete"].append(runtime_profile.id)
return Promise.resolve(None)