## Running LLMs Locally with Azure AI Foundry Local

In [None]:
%pip install openai

In [1]:
%pip install foundry-local-sdk

Collecting foundry-local-sdk
  Downloading foundry_local_sdk-0.3.1-py3-none-any.whl.metadata (5.0 kB)
Downloading foundry_local_sdk-0.3.1-py3-none-any.whl (12 kB)
Installing collected packages: foundry-local-sdk
Successfully installed foundry-local-sdk-0.3.1
Note: you may need to restart the kernel to use updated packages.


### Use OpenAI SDK with Foundry Local

In [2]:
import openai
from foundry_local import FoundryLocalManager

# By using an alias, the most suitable model will be downloaded 
# to your end-user's device. 
alias = "phi-3.5-mini"

# Create a FoundryLocalManager instance. This will start the Foundry
# Local service if it is not already running and load the specified model.
manager = FoundryLocalManager(alias)

# The remaining code uses the OpenAI Python SDK to interact with the local model.
# Configure the client to use the local Foundry service
client = openai.OpenAI(
    base_url=manager.endpoint,
    api_key=manager.api_key  # API key is not required for local usage
)

# Set the model to use and generate a response
response = client.chat.completions.create(
    model=manager.get_model_info(alias).id,
    messages=[{"role": "user", "content": "What is the golden ratio?"}]
)
print(response.choices[0].message.content)

 The golden ratio, often denoted by the Greek letter phi (φ), is an irrational number approximately equal to 1.618033988749895. It arises when a line is divided into two parts such that the ratio of the whole line to the larger part is the same as the ratio of the larger part to the smaller part. Mathematically, if a line segment is divided into parts a and b with a being the longer part and b being the shorter part, then the golden ratio is achieved when (a+b)/a = a/b.


The golden ratio is found in various aspects of art, architecture, and nature, often associated with aesthetically pleasing proportions. It has been used by artists and architects for centuries, notably in the works of Leonardo da Vinci and the Parthenon in Athens.


The golden ratio can be expressed algebraically
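
The response printed above stops mid-sentence. One possible cause is that generation hit a token limit; in the OpenAI chat completions schema this is reported as `finish_reason == "length"` on the choice. The sketch below checks that field. `describe_finish` is a hypothetical helper name, and it assumes Foundry Local populates `finish_reason` the same way the OpenAI API does, which this notebook does not confirm. The stand-in response object exists only so the snippet runs without a model.

```python
from types import SimpleNamespace


def describe_finish(response):
    """Return a short note on why the first choice stopped generating.

    Assumes an OpenAI-style response: response.choices[0].finish_reason.
    """
    reason = getattr(response.choices[0], "finish_reason", None)
    if reason == "length":
        return "truncated: token limit reached"
    if reason == "stop":
        return "complete: model stopped on its own"
    return f"unknown finish_reason: {reason!r}"


# Stand-in object shaped like a chat completion, for illustration only.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content="The golden ratio can be"),
        )
    ]
)
print(describe_finish(fake_response))
```

With a live client, `describe_finish(response)` can be called right after `client.chat.completions.create(...)`. If it reports truncation, passing a larger `max_tokens` to `create` is one option to try.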


### Streaming Response 

In [3]:
import openai
from foundry_local import FoundryLocalManager

# By using an alias, the most suitable model will be downloaded 
# to your end-user's device.
alias = "phi-3.5-mini"

# Create a FoundryLocalManager instance. This will start the Foundry 
# Local service if it is not already running and load the specified model.
manager = FoundryLocalManager(alias)

# The remaining code uses the OpenAI Python SDK to interact with the local model.

# Configure the client to use the local Foundry service
client = openai.OpenAI(
    base_url=manager.endpoint,
    api_key=manager.api_key  # API key is not required for local usage
)

# Set the model to use and generate a streaming response
stream = client.chat.completions.create(
    model=manager.get_model_info(alias).id,
    messages=[{"role": "user", "content": "What is the golden ratio?"}],
    stream=True
)

# Print the streaming response
for chunk in stream:
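    # Skip chunks without choices or text (some servers send these at the start or end)
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)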

 The golden ratio, often denoted by the Greek letter phi (φ), is an irrational number approximately equal to 1.618033988749895. It arises when a line is divided into two parts such that the ratio of the whole line to the larger part is the same as the ratio of the larger part to the smaller part. Mathematically, if a line segment is divided into parts a and b with a being the longer part and b being the shorter part, then the golden ratio is achieved when (a+b)/a = a/b.


The golden ratio is found in various aspects of art, architecture, and nature, often associated with aesthetically pleasing proportions. It has been used by artists and architects for centuries, notably in the works of Leonardo da Vinci and the Parthenon in Athens.


The golden ratio can be expressed algebraically

### Using Requests Python Library with Foundry Local

In [None]:
# Install with: pip install requests
import requests
import json
from foundry_local import FoundryLocalManager

# By using an alias, the most suitable model will be downloaded 
# to your end-user's device. 
alias = "phi-3.5-mini"

# Create a FoundryLocalManager instance. This will start the Foundry
# Local service if it is not already running and load the specified model.
manager = FoundryLocalManager(alias)

url = manager.endpoint + "/chat/completions"

payload = {
    "model": manager.get_model_info(alias).id,
    "messages": [
        {"role": "user", "content": "What is the golden ratio?"}
    ]
}

headers = {
    "Content-Type": "application/json"
}

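response = requests.post(url, headers=headers, data=json.dumps(payload))
# Fail with a clear HTTP error instead of a KeyError if the request was rejected
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])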
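
The request above returns the whole answer at once. If `"stream": True` is added to the payload, OpenAI-compatible servers typically reply with server-sent events: lines of the form `data: {json}`, ending with `data: [DONE]`. The sketch below parses such lines. `parse_sse_line` is a hypothetical helper, and it assumes Foundry Local follows this convention, which should be verified against the running service. The sample lines are hand-written for illustration, not captured output.

```python
import json


def parse_sse_line(line):
    """Return the text delta carried by one SSE line, or None if there is none."""
    if isinstance(line, bytes):
        line = line.decode("utf-8")
    line = line.strip()
    # Ignore blank keep-alive lines and anything that is not a data field.
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    # "[DONE]" is the end-of-stream sentinel in the OpenAI streaming format.
    if data == "[DONE]":
        return None
    event = json.loads(data)
    choices = event.get("choices") or []
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")


# Hand-written sample lines shaped like an OpenAI-style stream.
sample_lines = [
    b'data: {"choices": [{"delta": {"content": "The golden "}}]}',
    b"",
    b'data: {"choices": [{"delta": {"content": "ratio"}}]}',
    b"data: [DONE]",
]
text = "".join(t for t in map(parse_sse_line, sample_lines) if t)
print(text)
```

With `requests`, the lines would come from `requests.post(url, headers=headers, data=json.dumps(payload), stream=True).iter_lines()`, with each line passed to `parse_sse_line` in turn.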