The NVIDIA LLM API is a proxy AI inference engine offering a wide range of models from various providers.

Spring AI integrates with the NVIDIA LLM API by reusing the existing Spring AI OpenAI client.
To do so, set the base-url to `https://integrate.api.nvidia.com`, select one of the provided https://docs.api.nvidia.com/nim/reference/llm-apis#model[LLM models], and obtain an api-key for it.
NOTE: The NVIDIA LLM API requires the `max-tokens` parameter to be set explicitly, or a server error will be thrown.
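Assuming the property-based configuration shown earlier, the required token limit could be set as follows (the value 500 is an arbitrary example):

```properties
# Explicitly set max-tokens, as required by the NVIDIA LLM API
spring.ai.openai.chat.options.max-tokens=500
```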
Find more details in the Spring AI NVIDIA chat reference documentation: https://docs.spring.io/spring-ai/reference/api/chat/nvidia-chat.html