Use production API for llm gateway and predictive API #90

Merged
atulikumwenayo merged 15 commits into main from predict on Apr 29, 2026

Conversation

@atulikumwenayo
Collaborator

No description provided.

Contributor

@joroscoSF left a comment


A few comments for inquiring minds.

    )

    from loguru import logger
    import requests
Contributor


Does this need to be added as a project dependency?


    def get_headers(self):
        if self.token_response is None:
            self.token_response = self._token_provider.get_token()
Contributor


Do we know how long tokens last? I'm not seeing any refresh logic for long-running jobs. If there's no refresh logic, then perhaps trap a failed-token error?

Collaborator Author


The client_credentials flow token lasts 2 hours. That should be plenty to test the SFAP API outside DC. Also, every time get_headers() is invoked a new token is retrieved.
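For illustration, a minimal sketch of the expiry-aware refresh the reviewer is asking about. The expires_in field, the _token_provider interface, and the 60-second skew are assumptions for the sketch, not the PR's actual API:

    import time

    from loguru import logger


    class TokenCache:
        """Hypothetical expiry-aware wrapper around a token provider."""

        def __init__(self, token_provider, skew_seconds: int = 60):
            self._token_provider = token_provider
            self._skew = skew_seconds  # refresh slightly before actual expiry
            self._token_response = None
            self._expires_at = 0.0

        def get_headers(self) -> dict:
            now = time.time()
            if self._token_response is None or now >= self._expires_at:
                logger.debug("Access token missing or expired; fetching a new one")
                self._token_response = self._token_provider.get_token()
                # Assumes the token response exposes expires_in (seconds).
                self._expires_at = now + self._token_response.expires_in - self._skew
            return {
                "Authorization": f"Bearer {self._token_response.access_token}",
                "Content-Type": "application/json",
            }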

payload["settings"] = request.settings

logger.debug(f"Making Einstein prediction request to: {api_url}")
try:
Collaborator


The requests.post call can live in EinsteinPlatformClient to remove the code duplication.

        sf_cli_org: Optional[str] = None,
        **kwargs,
    ):
        EinsteinPlatformClient.__init__(
Collaborator


Instead of initialising EinsteinPlatformClient here, we can inject it. This will help remove the duplicate code for prediction and llm_gateway.

        sf_cli_org: Optional[str] = None,
        **kwargs,
    ):
        EinsteinPlatformClient.__init__(
Collaborator


The EinsteinPlatformClient object should be injected into DefaultLLMGateway.__init__, as sketched below.
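For illustration, a sketch of that injection using composition instead of inheritance. DefaultLLMGateway, EinsteinPlatformClient, GenerateTextRequest/GenerateTextResponse, and GenerateTextResponseBuilder come from the diff; the constructor shape, request.to_payload(), and the post_request helper (discussed further down) are assumptions for the sketch:

    from __future__ import annotations


    class DefaultLLMGateway:
        def __init__(self, client: EinsteinPlatformClient):
            # The shared client owns auth and HTTP plumbing, so the
            # prediction and llm_gateway paths no longer duplicate it.
            self._client = client

        def generate_text(self, request: GenerateTextRequest) -> GenerateTextResponse:
            api_url = (
                f"{self._client.EINSTEIN_PLATFORM_URL}"
                f"/models/{request.model_name}/generations"
            )
            # to_payload() is a hypothetical serializer for this sketch.
            response_data = self._client.post_request(api_url, request.to_payload())
            return GenerateTextResponseBuilder.build(response_data)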


    def generate_text(self, request: GenerateTextRequest) -> GenerateTextResponse:
        api_url = (
            f"{self.EINSTEIN_PLATFORM_URL}/models/{request.model_name}/generations"
Collaborator


Should EINSTEIN_PLATFORM_URL also contain "/models/", since that segment seems to be common? See the sketch below.
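For illustration, one way to hoist the shared segment into a constant (the base URL value here is illustrative, not confirmed from the PR):

    # Sketch: keep the common "/models" segment next to the base constant.
    EINSTEIN_PLATFORM_URL = "https://api.salesforce.com/einstein/platform/v1"
    MODELS_URL = f"{EINSTEIN_PLATFORM_URL}/models"

    def generations_url(model_name: str) -> str:
        return f"{MODELS_URL}/{model_name}/generations"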

        raise RuntimeError(f"Generate text request failed: {e}") from e

    return GenerateTextResponseBuilder.build(response_data)
    return GenerateTextResponse(
Collaborator


Why not modify GenerateTextResponseBuilder.build instead?

    First configure an external client app before using these AI APIs
    https://developer.salesforce.com/docs/ai/agentforce/guide/agent-api-get-started.html#create-a-salesforce-app"
    """
    # generate_text(runtime)
Collaborator


Why comment out generate_text and make_einstein_prediction?

Collaborator Author

@atulikumwenayo Apr 28, 2026


I want the generated sample function code to work out of the box when running outside DC. Otherwise, the generated code will fail unless the user updates the two methods. The sample SFAP code is there, ready to be used once the customer has updated it to use their own models, etc.

Collaborator


ack

        self._validate_data_layer_history_does_not_contain(DataCloudObjectType.DLO)
        return self._writer.write_to_dmo(name, dataframe, write_mode, **kwargs) # type: ignore[no-any-return]

    def call_llm_gateway(self, LLM_MODEL_ID: str, prompt: str, maxTokens: int) -> str:
Collaborator


@markdlv-sf was mentioning this is still used?

"Authorization": f"Bearer {self.token_response.access_token}",
"Content-Type": "application/json",
"x-sfdc-app-context": "EinsteinGPT",
"x-client-feature-id": "ai-platform-models-connected-app",
Collaborator


trace-id is missing

Collaborator Author


Given that this code will be running on the customer's machine, how do you see the customer using the trace id?

Collaborator


Ah, my bad... it's an external SDK and a public API.

payload["tags"] = request.tags

logger.debug(f"Making Generate text request: {api_url}")
try:
Collaborator


    try:
        response = requests.post(
            api_url, json=payload, headers=self.get_headers(), timeout=180
        )
        if not response.ok and not response.text:
            error_msg = (
                f"Generate text request failed: {api_url} - "
                f"{response.status_code} {response.reason}"
            )
            logger.error(error_msg)
    except requests.exceptions.RequestException as e:
        logger.error(f"Generate text request failed: {api_url} {e}")
        raise RuntimeError(f"Generate text request failed: {e}") from e

Can this be moved to EinsteinPlatformClient.post_request()?
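For illustration, a minimal sketch of that shared helper as a method on EinsteinPlatformClient (this is also what the earlier duplication comments point at); raise_for_status() and the return-JSON contract are choices made for the sketch, not taken from the PR:

    def post_request(self, api_url: str, payload: dict, timeout: int = 180) -> dict:
        """Shared POST helper so generate_text and the prediction path
        don't each duplicate the request/error-handling boilerplate."""
        try:
            response = requests.post(
                api_url, json=payload, headers=self.get_headers(), timeout=timeout
            )
            # Surface non-2xx responses as exceptions instead of ad-hoc checks.
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            logger.error(f"Request failed: {api_url} {e}")
            raise RuntimeError(f"Request failed: {e}") from e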

@atulikumwenayo merged commit 900526d into main on Apr 29, 2026
3 checks passed