Conversation

@aitsvet (Contributor) commented Oct 13, 2025

Purpose

Introduces a feature flag VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT to control the verbosity of API server request logging. When enabled, detailed prompt information (prompt, prompt_token_ids, prompt_embeds) is logged at DEBUG level instead of INFO level, reducing noise in standard INFO logs.
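
For context, here is a minimal sketch of how such a flag could be registered in vllm/envs.py, following vLLM's usual pattern of lazily evaluated environment variables; the parsing shown is illustrative, and the exact wiring in this PR may differ:

import os

# Sketch only: vLLM resolves environment variables through a dict of
# callables, so each value is read on first access rather than at import.
environment_variables = {
    # Off by default: prompt details stay in the INFO line unless the
    # operator opts in to the split DEBUG logging.
    "VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT": lambda: os.getenv(
        "VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT", "0"
    ).lower() in ("1", "true"),
}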

Test Plan

  1. Start the vLLM API server without VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT set (or set to False).
  2. Send a request to the API server (see the example request after this list).
  3. Observe the logs: the full prompt details should be visible at INFO level.
  4. Start the vLLM API server with VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT=True and VLLM_LOG_LEVEL=debug.
  5. Send a request to the API server.
  6. Observe the logs: the basic request info should be at INFO level, and the detailed prompt information should only appear at DEBUG level.
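
Steps 2 and 5 can be exercised with a request like the following. This is a sketch assuming a server on the default port 8000 and the OpenAI-compatible completions endpoint; the model name is a placeholder, and note that request logging also requires --enable-log-requests, as pointed out in the review below:

import requests

# Send one completion request so the server emits a "Received request" log.
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "facebook/opt-125m",  # placeholder; use your served model
        "prompt": "Hello, world!",
        "max_tokens": 8,
    },
    timeout=30,
)
print(resp.json())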

Test Result

  • Without the flag (or False): INFO logs contain full prompt details.
  • With the flag (True): INFO logs contain only basic request info; full prompt details are moved to DEBUG logs. This behavior was verified locally.


Signed-off-by: Aleksei Tsvetkov <aitsvet@ya.ru>
@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a feature flag VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT to move verbose prompt logging to the DEBUG level, reducing log noise. The implementation is correct, but there is significant code duplication in vllm/entrypoints/logger.py which poses a maintainability risk. I've suggested a refactoring to address this by using dictionary-based string formatting, which eliminates redundancy and improves clarity.

Comment on lines +38 to 68
if not envs.VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT:
    # Original logging behavior
    logger.info(
        "Received request %s: prompt: %r, "
        "params: %s, prompt_token_ids: %s, "
        "prompt_embeds shape: %s, "
        "lora_request: %s.",
        request_id,
        prompt,
        params,
        prompt_token_ids,
        prompt_embeds.shape if prompt_embeds is not None else None,
        lora_request,
    )
    return

# Split logging: basic info at INFO level, prompt details at DEBUG level
logger.info(
    "Received request %s: params: %s, lora_request: %s.",
    request_id,
    params,
    lora_request,
)
logger.debug(
    "Request %s prompt details: prompt: %r, prompt_token_ids: %s, "
    "prompt_embeds shape: %s",
    request_id,
    prompt,
    prompt_token_ids,
    prompt_embeds.shape if prompt_embeds is not None else None,
)
Severity: high

The current implementation duplicates the logging logic. The original logger.info call is copied into the if not envs.VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT: block. This code duplication can lead to maintenance issues, as future changes to the log message might not be applied in both places.

To improve maintainability and avoid redundancy, I suggest refactoring this to use dictionary-based string formatting. This approach centralizes the log data and constructs the log messages conditionally, making the code cleaner and easier to maintain. I've also slightly modified prompt_embeds shape to prompt_embeds_shape in the log message to make it a valid identifier for dictionary-based formatting.

        log_data = {
            "request_id": request_id,
            "prompt": prompt,
            "params": params,
            "prompt_token_ids": prompt_token_ids,
            "prompt_embeds_shape": prompt_embeds.shape if prompt_embeds is not None else None,
            "lora_request": lora_request,
        }

        if not envs.VLLM_DEBUG_LOG_API_SERVER_REQUEST_PROMPT:
            # Original logging behavior
            logger.info(
                "Received request %(request_id)s: prompt: %(prompt)r, "
                "params: %(params)s, prompt_token_ids: %(prompt_token_ids)s, "
                "prompt_embeds_shape: %(prompt_embeds_shape)s, "
                "lora_request: %(lora_request)s.",
                log_data,
            )
            return

        # Split logging: basic info at INFO level, prompt details at DEBUG level
        logger.info(
            "Received request %(request_id)s: params: %(params)s, lora_request: %(lora_request)s.",
            log_data,
        )
        logger.debug(
            "Request %(request_id)s prompt details: prompt: %(prompt)r, "
            "prompt_token_ids: %(prompt_token_ids)s, "
            "prompt_embeds_shape: %(prompt_embeds_shape)s",
            log_data,
        )
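
(One note on this approach: passing a single mapping as the argument for %(name)s-style placeholders is standard behavior of Python's logging module, so interpolation remains lazy and only happens when a record is actually emitted at an enabled level.)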

@markmc (Member) commented Oct 13, 2025

I think we should avoid a proliferation of magic environment variables like this

Unconditionally changing the default behaviour seems reasonable to me - basically, reducing the info level detail, and adding more detail at debug level

(Note for reviewers - this already requires --enable-log-requests, it's not on by default)

@aitsvet (Contributor, Author) commented Oct 13, 2025

gonna just split log lines and change levels as suggested

@aitsvet closed this Oct 13, 2025
@markmc (Member) commented Oct 13, 2025

No need to close this and file a new PR. Keeping it in this PR actually helps maintain the context of the discussion
