fix: redirect wandb .netrc to temp dir for read-only environments #33604

Open
ForeignKeyCN wants to merge 5 commits into langgenius:main from ForeignKeyCN:fix/weave-netrc-readonly

Conversation

@ForeignKeyCN

Fixes #33603

Summary

wandb.login() persists credentials to ~/.netrc, which fails when the home directory is read-only (e.g. Dify Cloud containers). This PR sets the NETRC env var to point to a temp directory before importing wandb, so authentication succeeds regardless of home directory permissions.
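
The redirect described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's actual diff: the helper name `redirect_netrc` is hypothetical, and it assumes, as the PR description does, that wandb honors the `NETRC` environment variable when it is set before import.

```python
import os
import tempfile


def redirect_netrc(filename: str = ".wandb_netrc") -> str:
    """Point NETRC at a file inside the system temp directory and return its path."""
    path = os.path.join(tempfile.gettempdir(), filename)
    os.environ["NETRC"] = path
    return path


# Run this before `import wandb` so the library picks up the redirected path
# (assumption taken from the PR description, not verified here).
netrc_path = redirect_netrc()
```

The temp directory is normally writable even when the home directory is not, which is why the write no longer fails.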

Related PRs: #14262 (initial Weave integration), #28289 (last Weave update)

Screenshots

| Before | After |
| --- | --- |
| image | image |

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran make lint and make type-check (backend) and cd web && npx lint-staged (frontend) to appease the lint gods

wandb.login() persists credentials to ~/.netrc, which fails when the
home directory is read-only (e.g. Dify Cloud containers). Set NETRC
env var to a temp directory before importing wandb to avoid the error.
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical authentication failure for wandb.login() in environments with read-only home directories, such as Dify Cloud containers. By programmatically setting the NETRC environment variable to point to a temporary, writable location before wandb is initialized, the system can now successfully store credentials, significantly improving compatibility and reliability in restricted environments.

Highlights

  • wandb authentication fix: Addressed an issue where wandb.login() failed in read-only environments by redirecting the .netrc file to a temporary directory, ensuring successful credential persistence.

Changelog
  • api/core/ops/weave_trace/weave_trace.py
    • Imported the tempfile module to manage temporary file paths.
    • Configured the NETRC environment variable to redirect wandb's credential storage to a temporary directory, resolving authentication issues in read-only environments.
@github-actions
Contributor

Pyrefly Diff

No changes detected.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request addresses a crash when wandb.login() tries to write to ~/.netrc in a read-only environment by redirecting it to a temporary file. This is a good fix for the immediate problem. However, the current implementation with a fixed filename (.wandb_netrc) introduces a race condition in multi-process and multi-threaded server environments, which can lead to credential leakage between tenants or authentication failures. I've added a specific comment with a suggestion to make the filename unique per-process, which mitigates the issue between different worker processes. A more subtle race condition can still exist within a single process handling concurrent requests for different tenants, which I've also detailed in the comment. Addressing the per-process issue is a critical improvement.
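
The reviewer's exact suggestion is not reproduced in this thread. One way to make the filename unique per process, shown here only as a sketch, is to embed the process ID in it; the helper name `per_process_netrc_path` is hypothetical.

```python
import os
import tempfile


def per_process_netrc_path(prefix: str = ".wandb_netrc") -> str:
    """Return a temp-dir path whose name embeds the current process ID."""
    return os.path.join(tempfile.gettempdir(), f"{prefix}_{os.getpid()}")


# Each worker process gets its own file, so workers no longer overwrite each other.
path = per_process_netrc_path()
os.environ["NETRC"] = path
```

As the review notes, this only separates worker processes. Concurrent requests for different tenants inside one process would still share the same file.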

Comment thread api/core/ops/weave_trace/weave_trace.py Outdated
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@github-actions
Contributor

Pyrefly Diff

No changes detected.

@QuantumGhost
Contributor

QuantumGhost commented Mar 18, 2026

Thank you for your contribution!

This still seems problematic for the following reasons:

  1. Dify Cloud does enforce tenant isolation at the API server level, so writing authentication credentials to disk poses a risk of credential leakage between tenants.
  2. According to the upstream documentation, the wandb.login method is the "programmatic counterpart to the wandb login CLI", and users "generally don't have to use this".

I'm not sure whether it's possible to initialize the weave client without writing credentials to the .netrc file. If it is, we should switch to that approach.

…rc writes

Replace wandb.login() with WANDB_API_KEY/WANDB_BASE_URL env vars before
weave.init(). This avoids writing credentials to .netrc, which fails in
read-only environments (Dify Cloud) and risks credential leakage between
tenants in multi-tenant deployments.

Per Weave docs, setting WANDB_API_KEY before weave.init() is sufficient
for authentication without any disk writes.

Fixes langgenius#33603
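
A minimal sketch of the environment-variable approach this commit describes. The helper name `configure_wandb_env` and the placeholder values are hypothetical; the variable names `WANDB_API_KEY` and `WANDB_BASE_URL` come from the commit message, and the `weave.init()` call is left as a comment so the snippet runs without the package installed.

```python
import os
from typing import Optional


def configure_wandb_env(api_key: str, host: Optional[str] = None) -> None:
    """Expose credentials through env vars instead of calling wandb.login()."""
    os.environ["WANDB_API_KEY"] = api_key
    if host:
        # Only override the base URL when a custom host is configured.
        os.environ["WANDB_BASE_URL"] = host


configure_wandb_env("example-api-key", "https://wandb.example.invalid")
# weave.init(project_name) would follow here; per the commit message,
# no credentials are written to disk on this path.
```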
@ForeignKeyCN ForeignKeyCN requested a review from laipz8200 as a code owner March 18, 2026 03:12
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 18, 2026
@github-actions
Contributor

Pyrefly Diff

No changes detected.

@ForeignKeyCN ForeignKeyCN removed the request for review from laipz8200 March 18, 2026 03:16
@ForeignKeyCN
Author

Thanks Lin!

I just looked through Weave's docs and found a possible way. There are no longer any disk writes; everything is stored in the trace_app_config table.

Should we continue with this PR, or should we start a new one?

@QuantumGhost
Contributor

QuantumGhost commented Mar 18, 2026

I suggest continuing with this PR and adjusting the PR title accordingly, @ForeignKeyCN.

Comment thread api/core/ops/weave_trace/weave_trace.py Outdated
# Login with API key first, including host if provided
# Authenticate via env var instead of wandb.login() to avoid
# writing credentials to .netrc (fails in read-only environments)
os.environ["WANDB_API_KEY"] = self.weave_api_key
Contributor

@QuantumGhost QuantumGhost Mar 19, 2026

The environment variables remain shared across the entire API server process, meaning the cross-tenant overwrite issue persists.

It is unclear whether the wandb SDK supports multiple client instances within a single process. If not, the weave_trace integration may be fundamentally unsuitable for robust multi-tenancy, limiting its applicability primarily to self-hosted deployments where a single entity maintains full control and strict tenant isolation is less critical.
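
The overwrite problem can be illustrated without any SDK. In this sketch the function names and tenant keys are hypothetical; it only shows that `os.environ` is a single mapping shared by every request handled in the same process.

```python
import os


def handle_request(tenant_api_key: str) -> None:
    """Simulate a request handler that authenticates via a process-wide env var."""
    os.environ["WANDB_API_KEY"] = tenant_api_key


def key_seen_by_sdk() -> str:
    """Simulate an SDK reading the key some time after the handler set it."""
    return os.environ["WANDB_API_KEY"]


handle_request("key-tenant-a")  # tenant A starts a trace
handle_request("key-tenant-b")  # tenant B's request arrives before A's SDK call
leaked = key_seen_by_sdk()      # tenant A's trace would now use tenant B's key
```

With interleaved requests, the key an SDK reads is whichever one was written last, regardless of which tenant triggered the read.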

Replace weave.init() with direct construction of RemoteHTTPTraceServer
and WeaveClient. This avoids:
1. Writing credentials to ~/.netrc (fails in read-only environments)
2. Using process-wide env vars (cross-tenant credential leakage)

Credentials are now held only in the client instance, matching the
Langfuse integration pattern. The endpoint from WeaveConfig is used
as the trace server URL.

Fixes langgenius#33603
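
The pattern this commit describes, credentials held only on a client instance, can be sketched as follows. `TenantTraceClient` is a hypothetical stand-in and not the Weave API: the constructor signatures of `RemoteHTTPTraceServer` and `WeaveClient` are not shown in this thread, and the bearer-token header is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantTraceClient:
    """Hypothetical per-tenant trace client; not the real Weave classes."""

    endpoint: str
    api_key: str
    project: str

    def auth_headers(self) -> dict:
        # Credentials come from this instance only, never from os.environ
        # or ~/.netrc. The header format is assumed for the example.
        return {"Authorization": f"Bearer {self.api_key}"}


client_a = TenantTraceClient("https://trace.example.invalid", "key-tenant-a", "proj-a")
client_b = TenantTraceClient("https://trace.example.invalid", "key-tenant-b", "proj-b")
```

Because each instance carries its own key, two tenants served by the same process cannot overwrite each other's credentials.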
@github-actions
Contributor

Pyrefly Diff

base → PR
--- /tmp/pyrefly_base.txt	2026-03-19 06:00:28.493406080 +0000
+++ /tmp/pyrefly_pr.txt	2026-03-19 06:00:19.602416482 +0000
@@ -3790,6 +3790,10 @@
    --> tests/unit_tests/core/ops/test_utils.py:107:41
 ERROR Argument `None` is not assignable to parameter `project` with type `str` in function `core.ops.utils.validate_project_name` [bad-argument-type]
    --> tests/unit_tests/core/ops/test_utils.py:136:40
+ERROR Object of class `WeaveDataTrace` has no attribute `weave_api_key` [missing-attribute]
+   --> tests/unit_tests/core/ops/weave_trace/test_weave_trace.py:234:16
+ERROR Object of class `WeaveDataTrace` has no attribute `host` [missing-attribute]
+   --> tests/unit_tests/core/ops/weave_trace/test_weave_trace.py:246:16
 ERROR This `yield` expression is unreachable [unreachable]
    --> tests/unit_tests/core/plugin/impl/test_model_client.py:168:13
 ERROR This `yield` expression is unreachable [unreachable]

@QuantumGhost
Contributor

LGTM now. However, the CI failed.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 19, 2026
@crazywoola crazywoola removed the lgtm This PR has been approved by a maintainer label Mar 20, 2026
@crazywoola crazywoola requested a review from QuantumGhost March 20, 2026 02:49
@ForeignKeyCN
Author

@QuantumGhost @crazywoola

Hi team, sorry it's been a week. The weave package is not very well maintained and does not ship stubs for its internal trace_server_bindings module, so the CI can't pass. Do you think we can ignore this, or should we add a weave-specific stub file?

Labels

size:L This PR changes 100-499 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Weave (W&B) monitoring integration fails on Dify Cloud due to .netrc write permission error

3 participants