Skip to content

Conversation

@zeke
Copy link
Member

@zeke zeke commented Aug 27, 2025

Problem

replicate.use() was creating a client immediately, causing errors when no API token was available at call time. This prevented usage in Cog pipelines where tokens are provided through cog.current_scope() at runtime.

Solution

Modified Function and AsyncFunction classes to support lazy client creation by always accepting client classes and instantiating them on demand.

Changes

  • _module_client.py: Pass Replicate and AsyncReplicate classes directly instead of factory functions
  • lib/_predictions_use.py: Simplified to always use Type[Client] approach:
    • Removed Union[Client, Type[Client]] in favor of just Type[Client]
    • Updated constructors to store _client_class instead of mixed types
    • Simplified _client property to always instantiate from class
    • Updated all use() overloads to accept class types only
    • Use issubclass() for async client detection
  • Added simple test demonstrating the fix

Result

replicate.use("model") works even when no token is initially available
✅ Client created lazily when model is actually called
✅ Token retrieved from current cog scope at call time
✅ Simplified, consistent approach using only class types
✅ Clean type inference without complex Union handling

Fixes the original error:

replicate.ReplicateError: The bearer_token client option must be set either by passing bearer_token to the client or by setting the REPLICATE_API_TOKEN environment variable

Fixes issue where replicate.use() would fail if no API token was available at call time,
even when token becomes available later (e.g., from cog.current_scope).

Changes:
- Modified Function/AsyncFunction classes to accept client factories
- Added _client property that creates client on demand
- Updated module client to pass factory functions instead of instances
- Token is now retrieved from current scope when model is called

This maintains full backward compatibility while enabling use in Cog pipelines
where tokens are provided through the execution context.
@zeke zeke requested a review from a team as a code owner August 27, 2025 17:34
- Remove unused *args parameter in test function
- Fix formatting issues from linter
@zeke zeke marked this pull request as draft August 27, 2025 17:37
zeke added 5 commits August 27, 2025 10:38
- Fix async detection to not call client factory prematurely
- Add use_async parameter to explicitly indicate async mode
- Update test to avoid creating client during verification
- Fix test mocking to use correct module path
Replace complex mocking test with simpler verification that:
- use() works without token initially
- Lazy client factory is properly configured
- Client can be created when needed

This avoids complex mocking while still verifying the core functionality.
@zeke zeke marked this pull request as ready for review August 27, 2025 18:55
@zeke zeke requested review from aron and dgellow August 27, 2025 18:55
Copy link
Contributor

@aron aron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine, but I think we can simplify it a little, I've left some suggestions inline.

Address PR feedback by removing Union types and using a single consistent approach:

- Change Function/AsyncFunction constructors to accept Type[Client] only
- Remove Union[Client, Type[Client]] in favor of just Type[Client]
- Simplify _client property logic by removing isinstance checks
- Update all use() overloads to accept class types only
- Use issubclass() for async client detection instead of complex logic
- Update tests to check for _client_class attribute

This maintains the same lazy client creation behavior while being
much simpler and more consistent.
@zeke zeke requested a review from aron August 27, 2025 20:41
Copy link
Contributor

@aron aron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks much cleaner, thanks. Left some comments about the unit tests, consider them suggestions.

zeke and others added 4 commits August 28, 2025 09:41
Co-authored-by: Aron Carroll <aron@aroncarroll.com>
- Remove verbose comments and print statements
- Focus on observable behavior rather than internal implementation
- Use proper mocking that matches actual cog integration
- Test that cog.current_scope() is called on client creation
- Address code review feedback from PR discussion
@zeke zeke changed the base branch from main to next August 28, 2025 17:02
@zeke zeke merged commit caf4c4e into next Aug 28, 2025
6 of 7 checks passed
@zeke zeke deleted the fix-lazy-client-creation branch August 28, 2025 17:03
Copy link
Contributor

@aron aron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, looks good.

Comment on lines +40 to +50
# Access the client property - this should trigger client creation and cog.current_scope call
_ = model._client

assert mock_cog.current_scope.call_count == 1

# Change the token and access client again - should trigger another call
mock_context.items.return_value = [("REPLICATE_API_TOKEN", "test-token-2")]

# Create a new model to trigger another client creation
model2 = replicate.use("test/model2") # type: ignore[misc]
_ = model2._client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor tweak, we want to assert that the same model refreshes the client between calls. If we're looking at the client we can look at the bearer_token to verify it has actually picked up the token correctly.

Suggested change
# Access the client property - this should trigger client creation and cog.current_scope call
_ = model._client
assert mock_cog.current_scope.call_count == 1
# Change the token and access client again - should trigger another call
mock_context.items.return_value = [("REPLICATE_API_TOKEN", "test-token-2")]
# Create a new model to trigger another client creation
model2 = replicate.use("test/model2") # type: ignore[misc]
_ = model2._client
# Access the client property - this should trigger client creation and cog.current_scope call
assert model._client.bearer_token == "test-token-1"
assert mock_cog.current_scope.call_count == 1
# Change the token and access client again - should trigger another call
mock_context.items.return_value = [("REPLICATE_API_TOKEN", "test-token-2")]
# Assert that the second time picks up the new token
assert model2._client.bearer_token == "test-token-2"

@stainless-app stainless-app bot mentioned this pull request Aug 28, 2025
zeke added a commit that referenced this pull request Aug 28, 2025
* fix: implement lazy client creation in replicate.use() (#57)

* fix: implement lazy client creation in replicate.use()

Fixes issue where replicate.use() would fail if no API token was available at call time,
even when token becomes available later (e.g., from cog.current_scope).

Changes:
- Modified Function/AsyncFunction classes to accept client factories
- Added _client property that creates client on demand
- Updated module client to pass factory functions instead of instances
- Token is now retrieved from current scope when model is called

This maintains full backward compatibility while enabling use in Cog pipelines
where tokens are provided through the execution context.

* style: fix linter issues

- Remove unused *args parameter in test function
- Fix formatting issues from linter

* fix: resolve async detection and test issues

- Fix async detection to not call client factory prematurely
- Add use_async parameter to explicitly indicate async mode
- Update test to avoid creating client during verification
- Fix test mocking to use correct module path

* test: simplify lazy client test

Replace complex mocking test with simpler verification that:
- use() works without token initially
- Lazy client factory is properly configured
- Client can be created when needed

This avoids complex mocking while still verifying the core functionality.

* lint

* fix: add type ignore for final linter warning

* fix: add arg-type ignore for type checker warnings

* refactor: simplify lazy client creation to use Type[Client] only

Address PR feedback by removing Union types and using a single consistent approach:

- Change Function/AsyncFunction constructors to accept Type[Client] only
- Remove Union[Client, Type[Client]] in favor of just Type[Client]
- Simplify _client property logic by removing isinstance checks
- Update all use() overloads to accept class types only
- Use issubclass() for async client detection instead of complex logic
- Update tests to check for _client_class attribute

This maintains the same lazy client creation behavior while being
much simpler and more consistent.

* Update tests/test_simple_lazy.py

Co-authored-by: Aron Carroll <aron@aroncarroll.com>

* test: improve lazy client test to follow project conventions

- Remove verbose comments and print statements
- Focus on observable behavior rather than internal implementation
- Use proper mocking that matches actual cog integration
- Test that cog.current_scope() is called on client creation
- Address code review feedback from PR discussion

* lint

* lint

---------

Co-authored-by: Aron Carroll <aron@aroncarroll.com>

* release: 2.0.0-alpha.22

---------

Co-authored-by: Zeke Sikelianos <zeke@sikelianos.com>
Co-authored-by: Aron Carroll <aron@aroncarroll.com>
Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants