[Bug] timed_with_status decorator silently swallows exceptions and returns None #1523

@Ptah-CT

Description

Summary

timed_with_status in src/memos/utils.py catches every exception, but when no fallback callable is configured the wrapper falls through to an implicit return None. Decorated functions therefore return None on any failure instead of raising, which masks the real error.

Where

src/memos/utils.py, lines 43-56 (current main @ cddc252).

try:
    result = fn(*args, **kwargs)
    success_flag = True
    return result
except Exception as e:
    exc_type = type(e)
    stack_info = "".join(traceback.format_stack()[:-1])
    exc_message = f"{stack_info}{traceback.format_exc()}"
    success_flag = False

    if fallback is not None and callable(fallback):
        result = fallback(e, *args, **kwargs)
        return result
    # ← no `raise` here; wrapper falls through to implicit return None
finally:
    ...
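To make the control flow easy to reproduce outside the codebase, here is a simplified, self-contained sketch of the decorator (not the actual memos implementation; the timing/logging in the `finally` block is elided) that exhibits the same implicit-None behavior:

```python
import functools


def timed_with_status(fallback=None):
    """Simplified sketch of the decorator's control flow,
    illustrating the implicit-None bug."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as e:
                if fallback is not None and callable(fallback):
                    return fallback(e, *args, **kwargs)
                # No `raise` here: control falls off the end of the
                # except block and the wrapper returns None.
            finally:
                pass  # timing/status logging elided
        return wrapper
    return decorator


@timed_with_status()
def generate():
    raise ValueError("upstream 400")


print(generate())  # → None; the ValueError is swallowed
```

Running this prints `None` rather than raising, which is exactly what the decorated `OpenAILLM.generate` does on an upstream error.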

Impact

OpenAILLM.generate is decorated with @timed_with_status(...). When the upstream LLM returns a 4xx/5xx (e.g. MiniMax's 400 "chat content is empty (2013)" for a system-only message), the BadRequestError is caught, logged as status: FAILED, and then swallowed. generate() returns None to its caller.

Downstream, that None flows into clean_json_response(response) (src/memos/mem_os/utils/format_utils.py:1403) and crashes with:

AttributeError: 'NoneType' object has no attribute 'replace'

Two consequences:

  1. The user sees a confusing AttributeError instead of the real 400 from the LLM. Diagnosis is hard because nothing in the traceback names the LLM call.
  2. The failure is silent with respect to the API contract: callers cannot tell whether generate() succeeded with empty output or failed with an exception, because the same None represents both.

Reproduction

  1. Set MOS_CHAT_MODEL=MiniMax-M2.7, OPENAI_API_BASE=https://api.minimax.io/v1, valid OPENAI_API_KEY.
  2. Start memos server, ensure default cube exists.
  3. POST /product/suggestions with a mem_cube_id whose recent memories are empty (so the suggestion prompt has only a system message).
  4. Observe HTTP 500 'NoneType' object has no attribute 'replace' in the response, and [TIMER_WITH_STATUS] OpenAI LLM took 5051 ms, status: FAILED, error_type: BadRequestError, error_message: ... immediately above it in the log.

Proposed fix

Add an explicit raise after the fallback branch:

                if fallback is not None and callable(fallback):
                    result = fallback(e, *args, **kwargs)
                    return result
                raise
            finally:
                ...

This preserves existing fallback semantics and makes the no-fallback path fail-fast.
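Applied to the simplified sketch above (again, assumed names, not the real memos code), the fixed decorator behaves as intended: the fallback path is unchanged, and the no-fallback path re-raises the original exception instead of returning None:

```python
import functools


def timed_with_status(fallback=None):
    """Sketch of the proposed behavior: keep fallback semantics,
    re-raise when no fallback is configured."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as e:
                if fallback is not None and callable(fallback):
                    return fallback(e, *args, **kwargs)
                raise  # no fallback: propagate the original exception
            finally:
                pass  # timing/status logging elided
        return wrapper
    return decorator


@timed_with_status()
def fails():
    raise ValueError("upstream 400")


@timed_with_status(fallback=lambda e, *a, **k: "fallback value")
def fails_with_fallback():
    raise ValueError("upstream 400")


print(fails_with_fallback())  # → fallback value
try:
    fails()
except ValueError as e:
    print(f"propagated: {e}")  # → propagated: upstream 400
```

With this change, the caller of generate() would see the original BadRequestError in its traceback instead of a downstream AttributeError on None.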

I will open a PR with this change against main.

Related

The same swallow likely contributes to other reports where downstream code receives unexpected None/empty values from LLM helpers (e.g. #1324 memory_search always returns no results with reasoning-enabled models — different root cause, but the same pattern of LLM-call failure being invisible to the caller).
