[3/n] catalog ray serve env vars #59647

Merged
abrarsheikh merged 4 commits into master from catalog-ray-serve-env-part-3 on Jan 7, 2026

@harshit-anyscale (Contributor) commented Dec 24, 2025

This PR adds documentation for several Ray Serve environment variables that were defined in constants.py but missing from the documentation, and also cleans up deprecated legacy environment variable names.

Changes Made

Documentation additions

doc/source/serve/production-guide/config.md (Proxy config section):

  • RAY_SERVE_ALWAYS_RUN_PROXY_ON_HEAD_NODE - Control whether to always run a proxy on the head node
  • RAY_SERVE_PROXY_HEALTH_CHECK_TIMEOUT_S - Proxy health check timeout
  • RAY_SERVE_PROXY_HEALTH_CHECK_PERIOD_S - Proxy health check period
  • RAY_SERVE_PROXY_READY_CHECK_TIMEOUT_S - Proxy ready check timeout
  • RAY_SERVE_PROXY_MIN_DRAINING_PERIOD_S - Minimum proxy draining period

doc/source/serve/production-guide/fault-tolerance.md (New "Replica constructor retries" section):

  • RAY_SERVE_MAX_PER_REPLICA_RETRY_COUNT - Max constructor retries per replica
  • RAY_SERVE_MAX_DEPLOYMENT_CONSTRUCTOR_RETRY_COUNT - Max constructor retries per deployment

doc/source/serve/advanced-guides/performance.md:

  • RAY_SERVE_PROXY_PREFER_LOCAL_NODE_ROUTING - Proxy node locality routing preference
  • RAY_SERVE_PROXY_PREFER_LOCAL_AZ_ROUTING - Proxy AZ locality routing preference
  • RAY_SERVE_MAX_CACHED_HANDLES - Max cached deployment handles (controller debugging section)

doc/source/serve/monitoring.md:

  • RAY_SERVE_HTTP_PROXY_CALLBACK_IMPORT_PATH - HTTP proxy initialization callback
  • SERVE_SLOW_STARTUP_WARNING_S - Slow startup warning threshold
  • SERVE_SLOW_STARTUP_WARNING_PERIOD_S - Slow startup warning interval

Code cleanup

python/ray/serve/_private/constants.py:

  • Removed legacy fallback for MAX_DEPLOYMENT_CONSTRUCTOR_RETRY_COUNT (now only RAY_SERVE_MAX_DEPLOYMENT_CONSTRUCTOR_RETRY_COUNT)
  • Removed legacy fallback for MAX_PER_REPLICA_RETRY_COUNT (now only RAY_SERVE_MAX_PER_REPLICA_RETRY_COUNT)
  • Removed legacy fallback for MAX_CACHED_HANDLES (now only RAY_SERVE_MAX_CACHED_HANDLES)

python/ray/serve/_private/constants_utils.py:

  • Removed MAX_DEPLOYMENT_CONSTRUCTOR_RETRY_COUNT and MAX_PER_REPLICA_RETRY_COUNT from the deprecated names whitelist
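
The effect of dropping a legacy fallback can be sketched as follows. This is a minimal illustration, not the code in `constants.py`: the helper names `read_int_with_legacy_fallback` and `read_int`, and the default value `7`, are hypothetical and chosen only for the example.

```python
import os
from typing import Mapping, Optional


def read_int_with_legacy_fallback(
    name: str,
    legacy_name: str,
    default: int,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Hypothetical old-style lookup: prefer `name`, else try `legacy_name`."""
    env = os.environ if env is None else env
    raw = env.get(name, env.get(legacy_name))
    return default if raw is None else int(raw)


def read_int(
    name: str, default: int, env: Optional[Mapping[str, str]] = None
) -> int:
    """Hypothetical new-style lookup: only the prefixed name is consulted."""
    env = os.environ if env is None else env
    raw = env.get(name)
    return default if raw is None else int(raw)


# A deployment environment that still sets only the legacy, unprefixed name.
legacy_only_env = {"MAX_CACHED_HANDLES": "50"}

before = read_int_with_legacy_fallback(
    "RAY_SERVE_MAX_CACHED_HANDLES", "MAX_CACHED_HANDLES", 7, legacy_only_env
)
after = read_int("RAY_SERVE_MAX_CACHED_HANDLES", 7, legacy_only_env)

print(before, after)  # 50 7
```

Under these assumptions, an environment that sets only the unprefixed name silently falls back to the default once the fallback is gone, so existing configurations would need to switch to the `RAY_SERVE_`-prefixed names.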

Signed-off-by: harshit <harshit@anyscale.com>
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request does a great job of cataloging and cleaning up Ray Serve environment variables, which improves both documentation and code consistency. The refactoring in constants.py to remove fallbacks for old, unprefixed environment variables is a welcome change. I have one suggestion to ensure all new environment variables follow the established naming convention.
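
The review does not spell out which convention or which variables it means. Assuming the convention is the `RAY_SERVE_` prefix, a quick check over the variable names listed in the PR description might look like the sketch below; `REQUIRED_PREFIX` and `missing_prefix` are hypothetical names used only for illustration.

```python
# Assumed convention: every Ray Serve environment variable starts with this prefix.
REQUIRED_PREFIX = "RAY_SERVE_"

# Variable names copied from the "Documentation additions" list in the PR description.
NEW_ENV_VARS = [
    "RAY_SERVE_ALWAYS_RUN_PROXY_ON_HEAD_NODE",
    "RAY_SERVE_PROXY_HEALTH_CHECK_TIMEOUT_S",
    "RAY_SERVE_PROXY_HEALTH_CHECK_PERIOD_S",
    "RAY_SERVE_PROXY_READY_CHECK_TIMEOUT_S",
    "RAY_SERVE_PROXY_MIN_DRAINING_PERIOD_S",
    "RAY_SERVE_MAX_PER_REPLICA_RETRY_COUNT",
    "RAY_SERVE_MAX_DEPLOYMENT_CONSTRUCTOR_RETRY_COUNT",
    "RAY_SERVE_PROXY_PREFER_LOCAL_NODE_ROUTING",
    "RAY_SERVE_PROXY_PREFER_LOCAL_AZ_ROUTING",
    "RAY_SERVE_MAX_CACHED_HANDLES",
    "RAY_SERVE_HTTP_PROXY_CALLBACK_IMPORT_PATH",
    "SERVE_SLOW_STARTUP_WARNING_S",
    "SERVE_SLOW_STARTUP_WARNING_PERIOD_S",
]


def missing_prefix(names, prefix=REQUIRED_PREFIX):
    """Return the names that do not start with the expected prefix."""
    return [name for name in names if not name.startswith(prefix)]


unprefixed = missing_prefix(NEW_ENV_VARS)
print(unprefixed)
# ['SERVE_SLOW_STARTUP_WARNING_S', 'SERVE_SLOW_STARTUP_WARNING_PERIOD_S']
```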

@harshit-anyscale marked this pull request as ready for review December 24, 2025 11:49
@harshit-anyscale requested review from a team as code owners December 24, 2025 11:49
@harshit-anyscale self-assigned this Dec 24, 2025
@harshit-anyscale added the "go" label (add ONLY when ready to merge, run all tests) Dec 24, 2025
@harshit-anyscale linked an issue Dec 24, 2025 that may be closed by this pull request
@ray-gardener bot added the "serve" (Ray Serve Related Issue) and "docs" (An issue or change related to documentation) labels Dec 24, 2025
Signed-off-by: harshit <harshit@anyscale.com>
Signed-off-by: harshit <harshit@anyscale.com>
Signed-off-by: harshit <harshit@anyscale.com>
@abrarsheikh (Contributor) left a comment

Left some follow-ups.

@abrarsheikh merged commit 4dff6c7 into master Jan 7, 2026
6 checks passed
@abrarsheikh deleted the catalog-ray-serve-env-part-3 branch January 7, 2026 00:30
AYou0207 pushed a commit to AYou0207/ray that referenced this pull request Jan 13, 2026
Signed-off-by: harshit <harshit@anyscale.com>
Signed-off-by: jasonwrwang <jasonwrwang@tencent.com>
lee1258561 pushed a commit to pinterest/ray that referenced this pull request Feb 3, 2026
Signed-off-by: harshit <harshit@anyscale.com>
ryanaoleary pushed a commit to ryanaoleary/ray that referenced this pull request Feb 3, 2026
Signed-off-by: harshit <harshit@anyscale.com>
Labels

docs (An issue or change related to documentation), go (add ONLY when ready to merge, run all tests), serve (Ray Serve Related Issue)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Serve] Catalog Ray Serve Configuration Variables

2 participants