Merged
12 changes: 11 additions & 1 deletion Changelog.md
@@ -1,4 +1,14 @@
### [0.0.39] - unreleased
### [0.0.40] - unreleased

Added:

- Use a single YOLO profile merge instead of multiple per-profile merges, reducing token cost by ~30%

Fixed:

- Profiles occasionally being generated in Chinese

### [0.0.39] - 2025/8/9

**Added**

12 changes: 9 additions & 3 deletions readme.md
@@ -50,14 +50,20 @@



Memobase is a **user profile-based memory system** designed to bring long-term user memory to your Generative AI (GenAI) applications. Whether you're building virtual companions, educational tools, or personalized assistants, Memobase empowers your AI to **remember**, **understand**, and **evolve** with your users.
Memobase is a **user profile-based memory system** designed to bring long-term user memory to your LLM applications. Whether you're building virtual companions, educational tools, or personalized assistants, Memobase empowers your AI to **remember**, **understand**, and **evolve** with your users.



Memobase can provide you structured profiles of users, check out the [result](./docs/experiments/900-chats/readme.md) (compared with [mem0](https://github.com/mem0ai/mem0)) from a 900-turns real-world chatting:
Memobase strikes the right balance for your product among the many memory solutions out there. We focus on three key metrics simultaneously (a minimal usage sketch follows the list):

- **Performance**: Although Memobase is not specifically designed for RAG/search tasks, it still achieves top-tier search performance on the LOCOMO benchmark.
- **LLM Cost**: Memobase keeps a built-in buffer for each user and batch-processes their chats, amortizing LLM overhead across messages. We also design our prompts and workflows carefully, so there are no "agents" in the system that could drive up costs.
- **Latency**: Memobase works much like the memory system behind ChatGPT: every user always has a ready-to-read profile and event timeline. You can fetch a user's most important memories with only a few SQL operations and no pre-processing, keeping online latency under 100ms.

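Here is a minimal sketch of that access pattern using the Python SDK. The names below (`MemoBaseClient`, `add_user`, `get_user`, `insert`, `flush`, `profile`) follow the current docs, but treat the exact signatures as illustrative and check the SDK reference before relying on them:

```python
from memobase import MemoBaseClient, ChatBlob

# Placeholder endpoint and key; point these at your own Memobase project.
client = MemoBaseClient("http://localhost:8019", api_key="secret")

uid = client.add_user()
u = client.get_user(uid)

# Chats land in the user's buffer and are batch-processed later,
# which is how the LLM cost gets amortized.
u.insert(ChatBlob(messages=[
    {"role": "user", "content": "I just moved to Berlin for a new job."},
    {"role": "assistant", "content": "Congrats! How are you settling in?"},
]))
u.flush()  # optionally force the buffer to process right away

# Reads hit the precomputed profile/event state, so they stay fast.
for p in u.profile():
    print(p)
```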


Check out the profile [result](./docs/experiments/900-chats/readme.md) (compared with [mem0](https://github.com/mem0ai/mem0)) from a 900-turn real-world chat:

<details>
<summary>Partial Profile Output</summary>

@@ -90,7 +96,7 @@
</details>

## 🎉 Recent Updates
- `0.0.38`: we updated the workflows in Memobase, reducing the insert cost by 30%
- `0.0.40`: we updated the internal workflows in Memobase, reducing the LLM calls in a single run from roughly 3-10 down to a fixed 3, which cuts token costs by roughly 40-50% (consider updating your Memobase version!). A sketch of the idea follows this list.
- `0.0.37`: we added fine-grained event gists, enabling detailed search over users' timelines. [Re-ran the LOCOMO benchmark](./docs/experiments/locomo-benchmark) and we're SOTA!
- `0.0.36`: we updated the search of the `context` api, so a search takes 500~1000ms (depending on the embedding API you're using). You can also [pass a prompt template](https://docs.memobase.io/api-reference/prompt/get_context#parameter-customize-context-prompt) to the `context` api to pack memories directly into your prompt.

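The `0.0.40` change is easiest to see as a sketch. Everything here is illustrative: `llm_call`, `merge_per_slot`, and `merge_yolo` are hypothetical stand-ins for the internal merge workflow, not the actual Memobase API:

```python
import asyncio

async def llm_call(prompt: str) -> str:
    """Hypothetical stand-in for one LLM request."""
    await asyncio.sleep(0.01)
    return f"merged<{prompt[:24]}...>"

# Before (~0.0.39): one merge prompt per touched profile slot,
# so a single run could cost anywhere from 3 to 10 LLM calls.
async def merge_per_slot(slots: dict[str, str], facts: dict[str, str]) -> list[str]:
    return [await llm_call(f"{k}: {slots[k]} + {facts[k]}") for k in slots]

# After (0.0.40, "YOLO" merge): every touched slot is packed into one
# prompt and merged in a single LLM call, fixing the call count.
async def merge_yolo(slots: dict[str, str], facts: dict[str, str]) -> str:
    packed = "\n".join(f"{k}: {slots[k]} + {facts[k]}" for k in slots)
    return await llm_call(packed)

if __name__ == "__main__":
    slots = {"work": "software engineer", "home": "Berlin"}
    facts = {"work": "promoted to manager", "home": "moving to Munich"}
    print(asyncio.run(merge_yolo(slots, facts)))  # one call, not len(slots)
```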
@@ -5,11 +5,14 @@
from ....utils import get_blob_str, get_encoded_tokens
from ....models.blob import Blob
from ....models.utils import Promise, CODE
from ....models.response import IdsData, ChatModalResponse
from ....models.response import IdsData, ChatModalResponse, UserProfilesData
from ...profile import add_update_delete_user_profiles
from ...event import append_user_event
from ...profile import get_user_profiles
from .extract import extract_topics
from .merge import merge_or_valid_new_memos

# from .merge import merge_or_valid_new_memos
from .merge_yolo import merge_or_valid_new_memos
from .summary import re_summary
from .organize import organize_profiles
from .types import MergeAddResult
@@ -47,7 +50,14 @@ async def process_blobs(
return p
project_profiles = p.data()

p = await entry_chat_summary(user_id, project_id, blobs, project_profiles)
p = await get_user_profiles(user_id, project_id)
if not p.ok():
return p
current_user_profiles = p.data()

p = await entry_chat_summary(
user_id, project_id, blobs, project_profiles, current_user_profiles
)
if not p.ok():
return p
user_memo_str = p.data().strip()
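Throughout these controllers, fallible steps return a `Promise` that is checked with `p.ok()`, unwrapped with `p.data()`, and propagated by returning the failed `Promise` as-is. A minimal sketch of that result type, assuming an interface consistent with the calls in this diff (the real `models.utils.Promise` also carries a `CODE` and may differ in detail):

```python
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

class Promise(Generic[T]):
    """Sketch of a result wrapper: holds either a value or an error."""

    def __init__(self, value: Optional[T] = None, error: Optional[str] = None):
        self._value, self._error = value, error

    def ok(self) -> bool:
        return self._error is None

    def data(self) -> Optional[T]:
        return self._value

    @classmethod
    def resolve(cls, value: T) -> "Promise[T]":
        return cls(value=value)

    @classmethod
    def reject(cls, error: str) -> "Promise[T]":
        return cls(error=error)
```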
@@ -63,8 +73,12 @@ async def process_blobs(
)

processing_results = await asyncio.gather(
process_profile_res(user_id, project_id, user_memo_str, project_profiles),
process_event_res(user_id, project_id, user_memo_str, project_profiles),
process_profile_res(
user_id, project_id, user_memo_str, project_profiles, current_user_profiles
),
process_event_res(
user_id, project_id, user_memo_str, project_profiles, current_user_profiles
),
)

profile_results: Promise = processing_results[0]
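The shape of this refactor: `extract_topics` used to fetch the user's profiles itself, while now `process_blobs` fetches them once and threads `current_user_profiles` into both branches, which run concurrently. A condensed, self-contained sketch of that fetch-once/fan-out pattern (all helper names here are hypothetical):

```python
import asyncio

async def fetch_profiles(user_id: str) -> list[str]:
    return ["work: engineer"]  # stand-in for one DB read

async def profile_branch(memo: str, profiles: list[str]) -> str:
    return f"profiles updated against {len(profiles)} existing"

async def event_branch(memo: str, profiles: list[str]) -> str:
    return "event appended"

async def process(user_id: str, memo: str) -> list[str]:
    profiles = await fetch_profiles(user_id)  # fetch once, up front
    # Fan out: both branches reuse the same snapshot concurrently,
    # instead of each branch re-reading profiles from the database.
    return await asyncio.gather(
        profile_branch(memo, profiles),
        event_branch(memo, profiles),
    )

print(asyncio.run(process("u1", "moved to Berlin")))
```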
@@ -109,9 +123,12 @@ async def process_profile_res(
project_id: str,
user_memo_str: str,
project_profiles: ProfileConfig,
current_user_profiles: UserProfilesData,
) -> Promise[tuple[MergeAddResult, list[dict]]]:

p = await extract_topics(user_id, project_id, user_memo_str, project_profiles)
p = await extract_topics(
user_id, project_id, user_memo_str, project_profiles, current_user_profiles
)
if not p.ok():
return p
extracted_data = p.data()
@@ -170,6 +187,7 @@ async def process_event_res(
project_id: str,
memo_str: str,
config: ProfileConfig,
current_user_profiles: UserProfilesData,
) -> Promise[list | None]:
p = await tag_event(project_id, config, memo_str)
if not p.ok():
@@ -8,16 +8,25 @@
from ....prompts.profile_init_utils import read_out_event_tags
from ....prompts.utils import tag_chat_blobs_in_order_xml
from .types import FactResponse, PROMPTS
from ....models.response import UserProfilesData
from .utils import pack_current_user_profiles


async def entry_chat_summary(
user_id: str, project_id: str, blobs: list[Blob], project_profiles: ProfileConfig
user_id: str,
project_id: str,
blobs: list[Blob],
project_profiles: ProfileConfig,
current_user_profiles: UserProfilesData,
) -> Promise[str]:
assert all(b.type == BlobType.chat for b in blobs), "All blobs must be chat blobs"
USE_LANGUAGE = project_profiles.language or CONFIG.language
project_profiles_slots = read_out_profile_config(
project_profiles, PROMPTS[USE_LANGUAGE]["profile"].CANDIDATE_PROFILE_TOPICS
CURRENT_PROFILE_INFO = pack_current_user_profiles(
current_user_profiles, project_profiles
)

USE_LANGUAGE = CURRENT_PROFILE_INFO["use_language"]
project_profiles_slots = CURRENT_PROFILE_INFO["project_profile_slots"]

prompt = PROMPTS[USE_LANGUAGE]["entry_summary"]
event_summary_theme = (
project_profiles.event_theme_requirement or CONFIG.event_theme_requirement
@@ -33,7 +42,7 @@ async def entry_chat_summary(
blob_strs = tag_chat_blobs_in_order_xml(blobs)
r = await llm_complete(
project_id,
prompt.pack_input(blob_strs),
prompt.pack_input(CURRENT_PROFILE_INFO["already_topics_prompt"], blob_strs),
system_prompt=prompt.get_prompt(
profile_topics_str,
event_attriubtes_str,
@@ -43,4 +52,9 @@
model=CONFIG.summary_llm_model,
**prompt.get_kwargs(),
)

# print(
# prompt.pack_input(CURRENT_PROFILE_INFO["already_topics_prompt"], blob_strs),
# r.data(),
# )
return r
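Judging from how its result is indexed here and in `extract.py` (`use_language`, `project_profile_slots`, `already_topics_prompt`, `strict_mode`, `allowed_topic_subtopics`), the new `pack_current_user_profiles` helper appears to centralize the per-request setup that `extract_topics` previously did inline. A speculative sketch of its return contract, not the actual implementation:

```python
from typing import Optional, TypedDict

class PackedProfiles(TypedDict):
    use_language: str            # project language, falling back to the global default
    strict_mode: bool            # whether extraction is limited to configured slots
    project_profile_slots: list  # candidate topic/sub-topic definitions
    already_topics_prompt: str   # existing "topic / sub_topic / value" lines for the prompt
    allowed_topic_subtopics: Optional[set[tuple[str, str]]]  # None = no filtering
```

Centralizing this in one helper means `entry_summary` and `extract` can no longer drift apart in how they resolve language, strict mode, and the already-known topics.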
88 changes: 17 additions & 71 deletions src/server/api/memobase_server/controllers/modal/chat/extract.py
@@ -1,22 +1,15 @@
import asyncio
from ....env import CONFIG, ContanstTable, TRACE_LOG
from ....utils import truncate_string
from ....models.utils import Promise
from ....models.blob import Blob, BlobType
from ....models.response import AIUserProfiles, CODE
from ....models.response import AIUserProfiles, CODE, UserProfilesData
from ....llms import llm_complete
from ....prompts.utils import (
tag_chat_blobs_in_order_xml,
attribute_unify,
parse_string_into_profiles,
parse_string_into_merge_action,
)
from ....prompts.profile_init_utils import read_out_profile_config, UserProfileTopic
from ...profile import get_user_profiles
from ...project import ProfileConfig

# from ...project impor
from .types import FactResponse, PROMPTS
from .utils import pack_current_user_profiles


def merge_by_topic_sub_topics(new_facts: list[FactResponse]):
@@ -31,73 +24,26 @@ def merge_by_topic_sub_topics(new_facts: list[FactResponse]):


async def extract_topics(
user_id: str, project_id: str, user_memo: str, project_profiles: ProfileConfig
user_id: str,
project_id: str,
user_memo: str,
project_profiles: ProfileConfig,
current_user_profiles: UserProfilesData,
) -> Promise[dict]:
p = await get_user_profiles(user_id, project_id)
if not p.ok():
return p
profiles = p.data().profiles
USE_LANGUAGE = project_profiles.language or CONFIG.language
STRICT_MODE = (
project_profiles.profile_strict_mode
if project_profiles.profile_strict_mode is not None
else CONFIG.profile_strict_mode
)

project_profiles_slots = read_out_profile_config(
project_profiles, PROMPTS[USE_LANGUAGE]["profile"].CANDIDATE_PROFILE_TOPICS
profiles = current_user_profiles.profiles
CURRENT_PROFILE_INFO = pack_current_user_profiles(
current_user_profiles, project_profiles
)
if STRICT_MODE:
allowed_topic_subtopics = set()
for p in project_profiles_slots:
for st in p.sub_topics:
allowed_topic_subtopics.add(
(attribute_unify(p.topic), attribute_unify(st["name"]))
)
USE_LANGUAGE = CURRENT_PROFILE_INFO["use_language"]
STRICT_MODE = CURRENT_PROFILE_INFO["strict_mode"]

if len(profiles):
already_topics_subtopics = set(
[
(
attribute_unify(p.attributes[ContanstTable.topic]),
attribute_unify(p.attributes[ContanstTable.sub_topic]),
)
for p in profiles
]
)
already_topic_subtopics_values = {
(
attribute_unify(p.attributes[ContanstTable.topic]),
attribute_unify(p.attributes[ContanstTable.sub_topic]),
): p.content
for p in profiles
}
if STRICT_MODE:
already_topics_subtopics = already_topics_subtopics.intersection(
allowed_topic_subtopics
)
already_topic_subtopics_values = {
k: already_topic_subtopics_values[k] for k in already_topics_subtopics
}
already_topics_subtopics = sorted(already_topics_subtopics)
already_topics_prompt = "\n".join(
[
f"- {topic}{CONFIG.llm_tab_separator}{sub_topic}{CONFIG.llm_tab_separator}{truncate_string(already_topic_subtopics_values[(topic, sub_topic)], 5)}"
for topic, sub_topic in already_topics_subtopics
]
)
TRACE_LOG.info(
project_id,
user_id,
f"Already have {len(profiles)} profiles, {len(already_topics_subtopics)} topics",
)
else:
already_topics_prompt = ""
project_profiles_slots = CURRENT_PROFILE_INFO["project_profile_slots"]

p = await llm_complete(
project_id,
PROMPTS[USE_LANGUAGE]["extract"].pack_input(
already_topics_prompt,
CURRENT_PROFILE_INFO["already_topics_prompt"],
user_memo,
strict_mode=STRICT_MODE,
),
@@ -112,7 +58,7 @@ async def extract_topics(
results = p.data()
# print(
# PROMPTS[USE_LANGUAGE]["extract"].pack_input(
# already_topics_prompt,
# CURRENT_PROFILE_INFO["already_topics_prompt"],
# user_memo,
# strict_mode=STRICT_MODE,
# )
@@ -145,11 +91,11 @@
fact_attributes = []

for nf in new_facts:
if STRICT_MODE:
if CURRENT_PROFILE_INFO["allowed_topic_subtopics"] is not None:
if (
nf[ContanstTable.topic],
nf[ContanstTable.sub_topic],
) not in allowed_topic_subtopics:
) not in CURRENT_PROFILE_INFO["allowed_topic_subtopics"]:
continue
fact_contents.append(nf["memo"])
fact_attributes.append(
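The filtering loop above (truncated by the diff view) now keys off `allowed_topic_subtopics` directly, with `None` meaning strict mode is off. Extracted as a standalone sketch with hypothetical data, the rule is:

```python
# Hypothetical (topic, sub_topic, memo) facts produced by extraction.
new_facts = [
    ("basic_info", "location", "moved to Berlin"),
    ("gossip", "celebrities", "likes tabloids"),
]

# None disables filtering; a set restricts facts to configured slots.
allowed: set[tuple[str, str]] | None = {("basic_info", "location")}

kept = [f for f in new_facts if allowed is None or (f[0], f[1]) in allowed]
print(kept)  # only configured (topic, sub_topic) pairs survive
```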
@@ -182,17 +182,18 @@ async def handle_profile_merge_or_valid(
}
)
elif update_response["action"] == "ABORT":
oneline_response = r.data().replace("\n", " ")
if runtime_profile is None:
TRACE_LOG.info(
project_id,
user_id,
f"Invalid profile: {KEY}::{profile_content}, abort it\n<raw_response>\n{r.data()}\n</raw_response>",
f"Invalid profile: {KEY}::{profile_content}. <raw_response> {oneline_response} </raw_response>",
)
else:
TRACE_LOG.info(
project_id,
user_id,
f"Invalid merge: {runtime_profile.attributes}, {profile_content}, abort it\n<raw_response>\n{r.data()}\n</raw_response>",
f"Invalid merge: {runtime_profile.attributes}, {profile_content}. <raw_response> {oneline_response} </raw_response>",
)
# session_merge_validate_results["delete"].append(runtime_profile.id)
return Promise.resolve(None)