@ok-scale ok-scale commented Apr 24, 2025

Why are these changes needed?

Creating a Serve deployment with a name containing a slash (such as TextGenerationModel.options(name="huawei-noah/TinyBERT_General_4L_312D")) leads to actor failures. The error occurs because the slash in the name is likely being interpreted as a path separator when the replica's log file path is built.

Error: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ray/session_2025-03-11_16-02-48_264712_45753/logs/serve/replica_default_huawei-noah/TinyBERT_General_4L_312D_eo3vqu7d.log'
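
The failure mode can be illustrated outside of Ray with a minimal sketch. The helper name and the file naming scheme below are hypothetical stand-ins modeled on the path in the error message, not the actual Serve code. A slash in the file name makes the OS look for a subdirectory that was never created:

```python
import os
import tempfile


def try_open_log(log_dir: str, deployment_name: str) -> str:
    """Try to create a log file named after the deployment; return the outcome."""
    # Hypothetical naming scheme, modeled on the path in the error above.
    log_path = os.path.join(log_dir, f"replica_default_{deployment_name}.log")
    try:
        with open(log_path, "w") as f:
            f.write("started\n")
        return "ok"
    except FileNotFoundError:
        # A "/" in the name is read as a directory boundary, and that
        # directory does not exist under log_dir.
        return "FileNotFoundError"


with tempfile.TemporaryDirectory() as log_dir:
    print(try_open_log(log_dir, "TinyBERT_General_4L_312D"))
    print(try_open_log(log_dir, "huawei-noah/TinyBERT_General_4L_312D"))
```

The first call succeeds, while the second fails in the same way as the reported error.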

More generally, other special characters in the name may also cause the logger to crash. This branch fixes the issue by replacing all special characters with _ when the name is used for logging purposes.
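
A minimal sketch of this kind of replacement is shown below. It is not the code from this branch: the function name is hypothetical, and the choice of which characters count as "special" (anything other than letters, digits, underscore, and hyphen) is an assumption.

```python
import re

# Assumption: "special" means anything outside letters, digits, "_" and "-".
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_for_log_filename(name: str) -> str:
    """Replace each character that is unsafe in a file name with "_"."""
    return _UNSAFE_CHARS.sub("_", name)


print(sanitize_for_log_filename("huawei-noah/TinyBERT_General_4L_312D"))
# huawei-noah_TinyBERT_General_4L_312D
```

One trade-off of this approach is that distinct names can map to the same sanitized name (for example, "a/b" and "a_b"), so anything that looks up a log file by the original name would need to apply the same mapping.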

Related issue number

https://anyscale1.atlassian.net/browse/SERVE-657

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@ok-scale ok-scale requested a review from abrarsheikh April 24, 2025 14:05
@ok-scale ok-scale added the go (add ONLY when ready to merge, run all tests) label Apr 24, 2025
Comment on lines +572 to +573
Contributor

@zcin, does this have any side effects on the service's ability to reference the correct app log on Anyscale?

Is component_id user-provided? If not, maybe we can avoid applying the sub-operation on it.

Contributor

Hmm, yeah, I don't know the full implications of this. I would be hesitant to do this unless the effects are fully scoped out.

@GeneDer might know more about the logging issue specifically.

Contributor Author

Another way is to just emit a useful log statement for users, at WARN level or similar.
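
A rough sketch of that alternative, using only the standard logging module, could look like the following. The logger name, function name, and the set of characters treated as unsafe are all assumptions for illustration, not part of Ray Serve:

```python
import logging
import re

logger = logging.getLogger("deployment_name_check")  # hypothetical logger name

# Assumption: "special" means anything outside letters, digits, "_" and "-".
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_-]")


def warn_if_unsafe_name(name: str) -> bool:
    """Log a WARNING if `name` has characters that may break log file paths."""
    unsafe = sorted(set(_UNSAFE_CHARS.findall(name)))
    if unsafe:
        logger.warning(
            "Deployment name %r contains characters %s that may not be "
            "usable in log file names.",
            name,
            unsafe,
        )
        return True
    return False


warn_if_unsafe_name("huawei-noah/TinyBERT_General_4L_312D")
```

This leaves the name untouched and only tells the user why the replica may fail, so it avoids changing any file names that other tooling might depend on.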

Contributor Author

@GeneDer thoughts?

Member

component_id is mostly an auto-generated ID, and most of the time we just use the PID of that process. I feel this would be a no-op, but also unnecessary :)
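
To illustrate the no-op point, here is a small sketch that assumes component_id is the process ID rendered as a string and that the substitution keeps letters, digits, underscore, and hyphen (both are assumptions, not taken from the diff):

```python
import os
import re

component_id = str(os.getpid())  # assumption: component_id is the process ID
sanitized = re.sub(r"[^A-Za-z0-9_-]", "_", component_id)

# A PID consists only of digits, so the substitution changes nothing.
print(sanitized == component_id)
```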

@ok-scale ok-scale force-pushed the SERVE-657-update-naming branch from 051a78a to d8c9481 Compare April 24, 2025 17:52
Signed-off-by: Omkar Kulkarni <omkar@omkar-JKJHCX74L6.local>
@ok-scale ok-scale force-pushed the SERVE-657-update-naming branch from 0ed794b to 5cda7d3 Compare April 24, 2025 18:06
ok-scale and others added 4 commits April 24, 2025 11:06
Signed-off-by: Omkar Kulkarni <omkar@omkar-JKJHCX74L6.local>
Signed-off-by: Omkar Kulkarni <omkar@omkar-JKJHCX74L6.local>
@hainesmichaelc hainesmichaelc added the community-contribution (Contributed by the community) label Apr 28, 2025
@mascharkh mascharkh added the serve (Ray Serve Related Issue) and stability labels Apr 28, 2025
@ok-scale ok-scale requested a review from abrarsheikh April 30, 2025 15:01
Signed-off-by: ok-scale <omkar@anyscale.com>
@ok-scale ok-scale closed this Apr 30, 2025
@ok-scale
Contributor Author

Closing wrt #52702

Labels

community-backlog; community-contribution (Contributed by the community); go (add ONLY when ready to merge, run all tests); serve (Ray Serve Related Issue); stability

7 participants