Add Torch Trace GPU profiling feature for PyTorch workers #60727

Closed
aryan-verma-ai-2 wants to merge 1 commit into ray-project:master from Applied-Shared:aryan/torch-trace

Conversation

@aryan-verma-ai-2

Thank you for contributing to Ray! 🚀
Please review the Ray Contribution Guide before opening a pull request.

⚠️ Remove these instructions before submitting your PR.

💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete.

Description

Briefly describe what this PR accomplishes and why it's needed.

Related issues

Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234".

Additional information

Optional: Add implementation details, API changes, usage examples, screenshots, etc.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable GPU profiling feature for PyTorch workers, accessible through the Ray Dashboard. The implementation is solid, adding new UI components and a command handler that correctly manages the profiling process, including state management and file download. I've identified two areas where robustness could be improved. The first is to refine the regex used for filename parsing so that it handles content-disposition headers that carry additional attributes. The second, more critical suggestion is to identify Python workers through the worker.language property, which is more reliable than checking the command line and keeps the feature consistent across the dashboard.

<CpuStackTraceLink pid={pid} nodeId={nodeId} type="" />
<br />
<MemoryProfilingButton pid={pid} nodeId={nodeId} />
{cmdline[0] === "python" && coreWorker?.ipAddress && (

high

The check cmdline[0] === "python" to determine if a worker is a Python worker is brittle. It will fail for commands like python3 or full paths like /usr/bin/python. The worker object has a language property which is more reliable for this check. You should destructure language from the worker object (around line 239) and use language === "PYTHON" here for a more robust check, which is also consistent with WorkerTable.tsx.

Suggested change
- {cmdline[0] === "python" && coreWorker?.ipAddress && (
+ {language === "PYTHON" && coreWorker?.ipAddress && (
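
To illustrate the difference between the two checks, here is a minimal sketch. The WorkerLike type and both helper functions are hypothetical and simplified for this example; the actual worker object in the dashboard has more fields, and the only details taken from the review are the cmdline[0] === "python" and language === "PYTHON" comparisons.

```typescript
// Hypothetical, simplified worker shape (an assumption for illustration only).
type WorkerLike = {
  cmdline: string[];
  language: string;
};

// Brittle check: matches only when the executable is literally "python".
function isPythonByCmdline(worker: WorkerLike): boolean {
  return worker.cmdline[0] === "python";
}

// Check suggested in the review: rely on the language reported for the worker.
function isPythonByLanguage(worker: WorkerLike): boolean {
  return worker.language === "PYTHON";
}

// Sample workers, made up for this sketch.
const workers: WorkerLike[] = [
  { cmdline: ["python", "worker.py"], language: "PYTHON" },
  { cmdline: ["python3", "worker.py"], language: "PYTHON" },
  { cmdline: ["/usr/bin/python", "worker.py"], language: "PYTHON" },
  { cmdline: ["java", "-jar", "worker.jar"], language: "JAVA" },
];

const byCmdline = workers.map(isPythonByCmdline);
const byLanguage = workers.map(isPythonByLanguage);

console.log(byCmdline); // [ true, false, false, false ]
console.log(byLanguage); // [ true, true, true, false ]
```

With the command-line check, the python3 and /usr/bin/python workers would not get the profiling button even though they are Python workers.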

let filename = "gputrace.json";
const contentDisposition = response.headers.get("content-disposition");
if (contentDisposition) {
const filenameMatch = contentDisposition.match(/filename="(.+)"/);

medium

The regex used to extract the filename from the content-disposition header is too greedy: .+ matches as many characters as possible, so if the header contains other quoted attributes after the filename, they are included in the matched filename. Restricting the capture group to non-quote characters makes the match stop at the filename's closing quote, which is more robust.

Suggested change
- const filenameMatch = contentDisposition.match(/filename="(.+)"/);
+ const filenameMatch = contentDisposition.match(/filename="([^"]+)"/);
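
A small sketch of how the two patterns behave on a header with a trailing quoted attribute. The extractFilename helper and the sample header value are assumptions made up for this example, not code from the PR; only the two regexes and the "gputrace.json" fallback come from the snippet above.

```typescript
// Hypothetical helper: returns the quoted filename from a content-disposition
// header value, or the fallback when the header is missing or has no match.
function extractFilename(contentDisposition: string | null, fallback: string): string {
  if (!contentDisposition) {
    return fallback;
  }
  // [^"]+ stops at the first closing quote after filename=".
  const match = contentDisposition.match(/filename="([^"]+)"/);
  return match?.[1] ?? fallback;
}

// Illustrative header with an extra quoted attribute after the filename.
const header = 'attachment; filename="gputrace.json"; note="example"';

// Original pattern: .+ runs to the last quote in the header.
const greedyResult = header.match(/filename="(.+)"/)?.[1];
const fixedResult = extractFilename(header, "gputrace.json");
const missingResult = extractFilename(null, "gputrace.json");

console.log(greedyResult); // gputrace.json"; note="example
console.log(fixedResult); // gputrace.json
console.log(missingResult); // gputrace.json
```

Note that the suggested pattern works through a negated character class rather than a lazy quantifier; either would stop at the first closing quote in this example.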

@github-actions

This pull request has been automatically marked as stale because it has not had
any activity for 14 days. It will be closed in another 14 days if no further activity occurs.
Thank you for your contributions.

You can always ask for help on our discussion forum or Ray's public Slack channel.

If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@github-actions github-actions bot added the stale label Feb 18, 2026
@github-actions

github-actions bot commented Mar 4, 2026

This pull request has been automatically closed because there has been no more activity in the 14 days
since being marked stale.

Please feel free to reopen or open a new pull request if you'd still like this to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public Slack channel.

Thanks again for your contribution!

@github-actions github-actions bot closed this Mar 4, 2026

Labels

stale: The issue is stale. It will be closed within 7 days unless there is further conversation.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Ray fails to serialize self-reference objects

1 participant