# Intro to LLMstudio - proxy

This tutorial introduces LLMstudio, a platform for interacting with large language models (LLMs) through a proxy server. The proxy sits between your client code and the LLM provider. The notebook walks through starting the server, configuring a client to use the proxy, and sending requests to the LLM.

LLMstudio also integrates tracking features for monitoring and logging interactions with LLMs. These are useful for analyzing performance metrics such as latency and token usage, which in turn helps when tuning how you call a model and understanding how it behaves in different scenarios.

You'll learn how to:
1. Start a local proxy server
2. Connect to any available provider (VertexAI, OpenAI, etc.)
3. Make sync and async calls, both with and without streaming
4. See the saved logs

First things first:
* run `pip install llmstudio[proxy]`
* update your .env file with `GOOGLE_API_KEY` or `OPENAI_API_KEY`
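
Before starting the server, it can help to confirm that at least one API key is visible to the Python process. The sketch below is illustrative only: `configured_keys` is a hypothetical helper, not part of LLMstudio, and it assumes the values from `.env` have already been loaded into the process environment.

```python
import os

# Key names taken from the setup step above; which one you need depends on the provider.
CANDIDATE_KEYS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")


def configured_keys(env=os.environ, names=CANDIDATE_KEYS):
    """Return the subset of `names` that are set to a non-empty value in `env`."""
    return [name for name in names if env.get(name)]


found = configured_keys()
print("API keys found:", found or "none (check your .env file)")
```

Only the key names are printed, never the values, so the cell output is safe to share.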


## Start proxy server


In [1]:
from llmstudio.server import start_servers

# Default port is 50001. Set the environment variables LLMSTUDIO_ENGINE_HOST and LLMSTUDIO_ENGINE_PORT to choose a different host and port.
start_servers(proxy=True, tracker=False)

Running LLMstudio Proxy on http://0.0.0.0:8001 
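
If a later cell fails to connect, a quick way to rule out the server is to check whether anything is listening on the reported port. This is a generic TCP check using only the standard library; `is_port_open` is a hypothetical helper, and the port 8001 is simply the one shown in the output above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 8001 is the port reported in the output above; adjust it if yours differs.
print(is_port_open("127.0.0.1", 8001))
```

A `True` result only shows that some process accepts connections on that port, not that it is the LLMstudio proxy.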


In [2]:
from llmstudio_proxy.provider import LLMProxyProvider as LLM
from llmstudio_proxy.provider import ProxyConfig

llm = LLM(provider="openai", 
          proxy_config=ProxyConfig(host="0.0.0.0", port="8001"))

Connected to LLMStudio Proxy @ 0.0.0.0:8001


In [3]:
result = llm.chat("olá", model="gpt-4o")


In [4]:
result.chat_output, result.metrics

('Olá! Como posso ajudá-lo hoje?',
 {'input_tokens': 2,
  'output_tokens': 11,
  'total_tokens': 13,
  'cost_usd': 0.000175,
  'latency_s': 0.719973087310791,
  'time_to_first_token_s': 0.547133207321167,
  'inter_token_latency_s': 0.018544991811116535,
  'tokens_per_second': 13.889408057392146})
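
The token and cost figures above can be checked by hand. The sketch below assumes gpt-4o prices of 5 USD per million input tokens and 15 USD per million output tokens; these values are inferred from the numbers in the output, not read from LLMstudio, and may not match current provider pricing. `estimate_cost_usd` is a hypothetical helper.

```python
import math

# Assumed USD prices per 1M tokens, inferred from the metrics shown above.
ASSUMED_PRICES_PER_MILLION = {
    "gpt-4o": {"input": 5.00, "output": 15.00},
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the request cost from token counts and the assumed price table."""
    price = ASSUMED_PRICES_PER_MILLION[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


# Values copied from the metrics dictionary above.
metrics = {"input_tokens": 2, "output_tokens": 11, "total_tokens": 13, "cost_usd": 0.000175}

# total_tokens is the sum of input and output tokens.
assert metrics["total_tokens"] == metrics["input_tokens"] + metrics["output_tokens"]

estimate = estimate_cost_usd("gpt-4o", metrics["input_tokens"], metrics["output_tokens"])
print(estimate, math.isclose(estimate, metrics["cost_usd"]))
```

Under these assumed prices, 2 input tokens and 11 output tokens give (2 × 5 + 11 × 15) / 1,000,000 = 0.000175 USD, which agrees with the reported `cost_usd`.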

### Use the LLMstudio SDK entry point


In [5]:
from llmstudio.providers import LLM
from pprint import pprint

In [6]:
# You can set OPENAI_API_KEY and ANTHROPIC_API_KEY in your .env file
from llmstudio_proxy.provider import ProxyConfig
proxy = ProxyConfig(host="0.0.0.0", port="8001")
print(proxy)

openai = LLM("openai", proxy_config=proxy)


host='0.0.0.0' port='8001' url=None username=None password=None
Connected to LLMStudio Proxy @ 0.0.0.0:8001


In [7]:
openai._provider


<llmstudio_proxy.provider.LLMProxyProvider at 0x1167c0530>

### Chat (non-stream)

In [8]:
openai.chat("What's your name", model="gpt-4o")

ChatCompletion(id='38cd37e1-6eec-43df-a75c-ab3445a78f1c', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="I'm an AI developed by OpenAI, and I don't have a personal name. You can call me Assistant if you'd like!", refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1729767674, model='gpt-4o', object='chat.completion', service_tier=None, system_fingerprint=None, usage=None, chat_input="What's your name", chat_output="I'm an AI developed by OpenAI, and I don't have a personal name. You can call me Assistant if you'd like!", chat_output_stream='', context=[{'role': 'user', 'content': "What's your name"}], provider='openai', deployment='gpt-4o-2024-08-06', timestamp=1729767675.315573, parameters={}, metrics={'input_tokens': 4, 'output_tokens': 28, 'total_tokens': 32, 'cost_usd': 0.00044, 'latency_s': 1.1188991069793701, 'time_to_first_token_s': 0.7827062606811523, 'inter_token_latency_s': 0.0129054

#### Async version

In [9]:
result = await openai.achat("What's your name", model="gpt-4o")
pprint(result)

ChatCompletion(id='cbe07ad8-04b6-4423-986d-ae2be63ef0a8', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='I’m called ChatGPT. How can I assist you today?', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1729767676, model='gpt-4o', object='chat.completion', service_tier=None, system_fingerprint=None, usage=None, chat_input="What's your name", chat_output='I’m called ChatGPT. How can I assist you today?', chat_output_stream='', context=[{'role': 'user', 'content': "What's your name"}], provider='openai', deployment='gpt-4o-2024-08-06', timestamp=1729767677.5680408, parameters={}, metrics={'input_tokens': 4, 'output_tokens': 14, 'total_tokens': 18, 'cost_usd': 0.00023, 'latency_s': 0.8708088397979736, 'time_to_first_token_s': 0.5257279872894287, 'inter_token_latency_s': 0.024605206080845425, 'tokens_per_second': 17.225364872823267})
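
The main benefit of the async API is that several requests can be in flight at once. The sketch below shows the pattern with `asyncio.gather`; `fake_achat` is a hypothetical stand-in that only sleeps, so the example runs without a proxy or API key. With a real client you would call `openai.achat(...)` in its place, assuming it behaves as in the cell above.

```python
import asyncio


async def fake_achat(prompt: str, model: str) -> str:
    """Hypothetical stand-in for `openai.achat`; it only simulates network latency."""
    await asyncio.sleep(0.1)
    return f"[{model}] reply to: {prompt}"


async def ask_all(prompts, model="gpt-4o"):
    """Send all prompts concurrently and return the replies in prompt order."""
    return await asyncio.gather(*(fake_achat(p, model) for p in prompts))


# In a notebook, write `replies = await ask_all([...])` instead of asyncio.run.
replies = asyncio.run(ask_all(["What's your name", "Write a paragraph about space"]))
for reply in replies:
    print(reply)
```

`asyncio.gather` returns results in the order the coroutines were passed in, regardless of which one finishes first.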


### Chat (stream)

In [10]:
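response = openai.chat("Write a paragraph about space", model="gpt-4o", is_stream=True)
for i, chunk in enumerate(response):
    # Insert a blank line every 20 chunks to keep the output readable
    if i % 20 == 0:
        print("\n")
    if not chunk.metrics:
        # Intermediate chunks carry the next piece of text
        print(chunk.chat_output_stream, end="", flush=True)
    else:
        # The chunk that carries metrics marks the end of the stream
        print("\n\n## Metrics:")
        pprint(chunk.metrics)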




Space, the vast expanse that stretches beyond the confines of our planet, is a realm of

 wonder and mystery that has captivated human imagination for centuries. It is a place where stars are born,

 galaxies collide, and black holes exert their powerful pull, shaping the cosmos in ways we are just beginning

 to understand. Space exploration has advanced dramatically since the mid-20th century, with missions that have

 taken us to the Moon, Mars, and beyond, providing glimpses of distant worlds and enhancing our

 understanding of the universe. This infinite frontier challenges our scientific knowledge and technological prowess, driving innovation and inspiring

 a sense of awe as we ponder the possibilities of life beyond Earth and the origins of the universe itself

. As we continue our journey into space, we are reminded of the boundless potential that lies beyond

 our sky, waiting to be discovered.

## Metrics:
{'cost_usd': 0.00256,
 'input_tokens': 8,
 'inter_token_latency_s'
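
If you need the complete text rather than a live printout, the same loop can accumulate the chunks. The sketch below uses `fake_stream`, a hypothetical generator that mimics the two attributes used above (`chat_output_stream` and `metrics`); whether the final chunk of a real stream also carries text is not shown in this notebook, so the stub leaves it empty.

```python
from types import SimpleNamespace


def fake_stream():
    """Hypothetical stand-in for `openai.chat(..., is_stream=True)`."""
    for piece in ["Space", ", the vast", " expanse."]:
        yield SimpleNamespace(chat_output_stream=piece, metrics=None)
    # The last chunk carries metrics, as in the loop above.
    yield SimpleNamespace(chat_output_stream="", metrics={"output_tokens": 3})


def collect(stream):
    """Join the text chunks and return (full_text, metrics)."""
    parts, metrics = [], None
    for chunk in stream:
        if chunk.metrics:
            metrics = chunk.metrics
        else:
            parts.append(chunk.chat_output_stream)
    return "".join(parts), metrics


text, metrics = collect(fake_stream())
print(text)
print(metrics)
```

Collecting into a list and joining once at the end avoids repeated string concatenation for long responses.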

#### Async version

In [11]:
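i = 0  # enumerate() does not work on async iterators, so count chunks manually
async for chunk in await openai.achat("Write a paragraph about space", model="gpt-4o-mini", is_stream=True):
    # Insert a blank line every 20 chunks to keep the output readable
    if i % 20 == 0:
        print("\n")
    if not chunk.metrics:
        print(chunk.chat_output_stream, end="", flush=True)
    else:
        print("\n\n## Metrics:")
        pprint(chunk.metrics)
    i += 1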




Space, the vast and enigmatic expanse that stretches beyond our atmosphere, is a realm of infinite

 wonder and discovery. It is home to billions of galaxies, each containing stars, planets, and potentially

 life, all governed by the fundamental laws of physics. The beauty of space captivates our imagination,

 from the shimmering glow of distant stars to the mesmerizing swirl of galaxies. It serves not only as a

 backdrop for our planet but also as the stage for humanity's quest for knowledge, as we launch missions

 to explore celestial bodies, probe the mysteries of black holes, and decode the origins of the universe.

 As we look up at the night sky, we are reminded of our place in the cosmos, spar

king curiosity and a deep desire to understand the forces that shape our existence.

## Metrics:
{'cost_usd': 9.54e-05,
 'input_tokens': 8,
 'inter_token_latency_s': 0.011587064496932491,
 'latency_s': 2.327955961227417,
 'output_tokens': 157,
 'time_to_first_token_s': 0.53140091896