Use production API for llm gateway and predictive API #90
atulikumwenayo merged 15 commits into main from
Conversation
joroscoSF left a comment
A few comments for inquiring minds.
| )
|
| from loguru import logger
| import requests
Does this need to be added as a project dependency?
|
| def get_headers(self):
|     if self.token_response is None:
|         self.token_response = self._token_provider.get_token()
Do we know how long tokens last? I'm not seeing any refresh logic for long-running jobs. If there's no refresh logic, then perhaps trap a failed-token error?
The client_credentials flow token lasts 2 hours. That should be more than enough to test the SFAP API outside DC. Also, every time get_headers() is invoked a new token is retrieved.
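For reference, if refresh logic ever does become necessary for long-running jobs, a minimal sketch could cache the token alongside its expiry and re-fetch shortly before it lapses. Only _token_provider, token_response, and access_token come from the diff above; the expires_in field and everything else here are assumptions:

    import time

    class EinsteinPlatformClient:
        # Sketch only: refresh the cached client_credentials token shortly
        # before its ~2-hour expiry instead of fetching it once per job.
        REFRESH_MARGIN_SECONDS = 300

        def __init__(self, token_provider):
            self._token_provider = token_provider
            self.token_response = None
            self._token_expires_at = 0.0

        def get_headers(self):
            now = time.time()
            if self.token_response is None or now >= self._token_expires_at:
                self.token_response = self._token_provider.get_token()
                # `expires_in` (seconds) is an assumed field; fall back to 2 hours.
                expires_in = getattr(self.token_response, "expires_in", 7200)
                self._token_expires_at = now + expires_in - self.REFRESH_MARGIN_SECONDS
            return {
                "Authorization": f"Bearer {self.token_response.access_token}",
                "Content-Type": "application/json",
            }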
| payload["settings"] = request.settings | ||
|
|
||
| logger.debug(f"Making Einstein prediction request to: {api_url}") | ||
| try: |
The requests.post call can be moved into EinsteinPlatformClient to remove the duplicated code (a sketch appears under the last comment in this conversation).
| sf_cli_org: Optional[str] = None,
| **kwargs,
| ):
|     EinsteinPlatformClient.__init__(
Instead of initialising EinsteinPlatformClient here, we can inject it. This will help remove duplicate code between prediction and llm_gateway.
| sf_cli_org: Optional[str] = None,
| **kwargs,
| ):
|     EinsteinPlatformClient.__init__(
The EinsteinPlatformClient object should be injected into DefaultLLMGateway.__init__.
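To make the suggestion concrete, a hedged sketch of constructor injection follows. Only the two class names come from the diff; post_request and the request fields are assumptions:

    class DefaultLLMGateway:
        def __init__(self, platform_client):
            # Injected, already-configured EinsteinPlatformClient; the same
            # instance can back the prediction client, so the duplicated
            # initialisation code disappears from both call sites.
            self._client = platform_client

        def generate_text(self, request):
            # `post_request` is a hypothetical shared helper on the client.
            payload = {"prompt": request.prompt}
            return self._client.post_request(
                f"models/{request.model_name}/generations", payload
            )

    # Usage: build the client once, share it everywhere.
    # client = EinsteinPlatformClient(token_provider)
    # gateway = DefaultLLMGateway(client)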
|
| def generate_text(self, request: GenerateTextRequest) -> GenerateTextResponse:
|     api_url = (
|         f"{self.EINSTEIN_PLATFORM_URL}/models/{request.model_name}/generations"
Should EINSTEIN_PLATFORM_URL also contain "/models/", since that segment seems to be common?
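For illustration, folding the shared segment into one constant could look like this (the URL value and model id are placeholders, not real endpoints):

    # Placeholder base URL; the shared "/models" segment lives in one constant
    # so each endpoint only appends its model-specific suffix.
    EINSTEIN_PLATFORM_URL = "https://example.my.salesforce.com/einstein/platform/v1"
    EINSTEIN_MODELS_URL = f"{EINSTEIN_PLATFORM_URL}/models"

    model_name = "my-model"  # hypothetical model id
    api_url = f"{EINSTEIN_MODELS_URL}/{model_name}/generations"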
| raise RuntimeError(f"Generate text request failed: {e}") from e
|
| return GenerateTextResponseBuilder.build(response_data)
| return GenerateTextResponse(
Why not modify GenerateTextResponseBuilder.build instead?
| First configure an external client app before using these AI APIs
| https://developer.salesforce.com/docs/ai/agentforce/guide/agent-api-get-started.html#create-a-salesforce-app"
| """
| # generate_text(runtime)
Why comment out generate_text and make_einstein_prediction?
I want the generated sample function code to work out of the box when running outside DC. Otherwise, the generated code will fail unless the user updates the two methods. The sample SFAP code is there, ready to be used once the customer has updated it to use their own models, etc.
| self._validate_data_layer_history_does_not_contain(DataCloudObjectType.DLO)
| return self._writer.write_to_dmo(name, dataframe, write_mode, **kwargs)  # type: ignore[no-any-return]
|
| def call_llm_gateway(self, LLM_MODEL_ID: str, prompt: str, maxTokens: int) -> str:
@markdlv-sf was mentioning this is still used?
| "Authorization": f"Bearer {self.token_response.access_token}", | ||
| "Content-Type": "application/json", | ||
| "x-sfdc-app-context": "EinsteinGPT", | ||
| "x-client-feature-id": "ai-platform-models-connected-app", |
trace-id is missing
Given that this code will be running on the customer's machine, how do you see the customer using the trace id?
Ah, my bad. It's an external SDK and a public API.
| payload["tags"] = request.tags | ||
|
|
||
| logger.debug(f"Making Generate text request: {api_url}") | ||
| try: |
    try:
        response = requests.post(
            api_url, json=payload, headers=self.get_headers(), timeout=180
        )
        if not response.ok and not response.text:
            error_msg = (
                f"Generate text request failed: {api_url} - "
                f"{response.status_code} {response.reason}"
            )
            logger.error(error_msg)
    except requests.exceptions.RequestException as e:
        logger.error(f"Generate text request failed: {api_url} {e}")
        raise RuntimeError(f"Generate text request failed: {e}") from e

Can this be moved to EinsteinPlatformClient.post_request()?
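A minimal sketch of what that consolidation could look like, under stated assumptions: the method name post_request comes from the comment above, the description parameter is hypothetical, and the raise_for_status call is an addition here so callers see HTTP failures (the original snippet only logs them):

    import requests
    from loguru import logger

    class EinsteinPlatformClient:
        def get_headers(self) -> dict:
            # Assumed to exist per the diff above; stubbed for completeness.
            return {"Content-Type": "application/json"}

        def post_request(self, api_url: str, payload: dict, description: str) -> dict:
            # Shared POST helper so generate_text and make_einstein_prediction
            # stop duplicating the request/error-handling boilerplate.
            logger.debug(f"Making {description} request to: {api_url}")
            try:
                response = requests.post(
                    api_url, json=payload, headers=self.get_headers(), timeout=180
                )
                if not response.ok and not response.text:
                    logger.error(
                        f"{description} request failed: {api_url} - "
                        f"{response.status_code} {response.reason}"
                    )
                response.raise_for_status()
                return response.json()
            except requests.exceptions.RequestException as e:
                logger.error(f"{description} request failed: {api_url} {e}")
                raise RuntimeError(f"{description} request failed: {e}") from e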