In [1]:
import dotenv
import os
from openai import OpenAI
from openai import OpenAIError

### Get your Hugging Face access token (nothing to install, but API calls are limited)

Go to: https://huggingface.co/docs/hub/en/oauth and sign in.

In the upper-right corner of the page, click your user icon, choose `Access tokens`, and click `Create new token`. Create a new token with write permissions and copy it to the clipboard.

Save the copied token as an environment variable, for example in a `.env` file that `dotenv.load_dotenv()` can read:

`HUGGINGFACE_ACCESS_TOKEN=<your-token>`

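**NOTE: do not share or expose this token, for example by committing it to git.**

You should also set a placeholder key. Its value does not matter, but the OpenAI client requires one: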

`OPENAI_API_KEY=anything-since-this-does-not-matter`

In [2]:
dotenv.load_dotenv()
hf_api_key = os.getenv("HUGGINGFACE_ACCESS_TOKEN")
ollama_api_key = os.getenv("OPENAI_API_KEY") # this is for compatibility with OpenAI API


if not hf_api_key:
    raise ValueError("HUGGINGFACE_ACCESS_TOKEN environment variable not set")

if not ollama_api_key:
    raise ValueError("OPENAI_API_KEY environment variable not set")
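The two checks above follow the same pattern, so they could be folded into one small helper. This is only a sketch: `require_env` and the `EXAMPLE_TOKEN` variable are hypothetical names introduced for illustration and are not part of the notebook.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise ValueError(f"{name} environment variable not set")
    return value


# Demonstration with a hypothetical variable set in-process,
# so the example runs without a real token.
os.environ["EXAMPLE_TOKEN"] = "abc123"
print(require_env("EXAMPLE_TOKEN"))
```

In the notebook this would be used as `hf_api_key = require_env("HUGGINGFACE_ACCESS_TOKEN")`, after `dotenv.load_dotenv()` has run.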

### (optional) Install Ollama -- unlimited use.

Install ollama: https://ollama.com/download

Pull a model:

`ollama pull llama3.2`

Check that Ollama is serving (it should reply that it is running):

`curl localhost:11434`
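The same check can be done from Python without leaving the notebook. The sketch below uses only the standard library; `ollama_is_running` is a hypothetical helper name, and it assumes the server answers a plain GET on its root URL with HTTP 200, as the `curl` check suggests.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on base_url succeeds with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


print(ollama_is_running())
```

If Ollama runs in Docker with the port mapping described in this notebook, pass `"http://localhost:11435"` instead.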

### (optional) Run Ollama from Docker (Docker does most things for you, but a few additional setup commands are needed)

Depending on how Docker is configured on your machine, you may need to prefix all docker commands with `sudo`. Alternatively, you can follow the instructions in the Docker docs to create a docker user group and add yourself to it.

`docker run -d -v ollama:/root/.ollama -p 11435:11434 --name ollama ollama/ollama`

* Enter the container with: `docker exec -it ollama bash`
* Inside the container, pull a model (this takes a while): `ollama pull <model_name>` (`llama3.2` for now)
* Run `ollama serve` to start the server (might not be needed if it is already running).
* Exit the container with: `exit`
* On the local machine, the server is reachable on port 11435, e.g. `curl http://localhost:11435` should print "Ollama is running"
* To stop the container, run: `docker stop ollama`
* To remove the container, run: `docker rm ollama`

#### Choose HF model

See the available models: https://huggingface.co/models

In [3]:
# some alternatives
models = [
    "mistralai/Mistral-7B-Instruct-v0.3",
    "microsoft/Phi-3-mini-4k-instruct",
    "llama3.2",  # for ollama
]

#### We are using the OpenAI API here

OpenAI no longer grants any free usage. However, many other providers, such as Ollama and HF, offer endpoints that comply with OpenAI's API. This matters because OpenAI is one of the most used commercial providers: you can develop LLM applications against free or cheaper endpoints and eventually deploy with OpenAI.

In [4]:
# 2. Create the LLM client using OpenAI API and huggingface
client = OpenAI(
    api_key=hf_api_key,
    base_url="https://api-inference.huggingface.co/v1/",
)

In [5]:
# 2. (alternative) Create the LLM client using OpenAI API and ollama
# Running this cell replaces the Hugging Face client created in the previous cell.
client = OpenAI(
    api_key=ollama_api_key,
    base_url="http://localhost:11434/v1/", # if you are using ollama locally
    #base_url="http://localhost:11435/v1/", # if you are using ollama from docker
)
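The two client cells above differ only in `api_key` and `base_url`, so the choice of backend could be reduced to a lookup. The following is a minimal sketch: `client_settings` and the backend labels are hypothetical names, while the URLs are the ones used in this notebook.

```python
def client_settings(backend: str, api_key: str) -> dict:
    """Return keyword arguments for OpenAI(...) for the chosen backend."""
    base_urls = {
        "huggingface": "https://api-inference.huggingface.co/v1/",
        "ollama": "http://localhost:11434/v1/",         # Ollama installed locally
        "ollama-docker": "http://localhost:11435/v1/",  # Ollama running in Docker
    }
    if backend not in base_urls:
        raise ValueError(f"unknown backend: {backend!r}")
    return {"api_key": api_key, "base_url": base_urls[backend]}


settings = client_settings("ollama", "anything-since-this-does-not-matter")
print(settings["base_url"])
```

With the `openai` package imported as in the first cell, the client would then be created with `client = OpenAI(**settings)`.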

In [6]:
response = client.chat.completions.create(
    model=models[2],
    messages=[
        {"role": "user", "content": "Hello!"},
    ],
)


In [7]:
dict(response)

{'id': 'chatcmpl-500',
 'choices': [Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="It's nice to meet you. Is there something I can help you with or would you like to chat?", refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None))],
 'created': 1745764537,
 'model': 'llama3.2',
 'object': 'chat.completion',
 'service_tier': None,
 'system_fingerprint': 'fp_ollama',
 'usage': CompletionUsage(completion_tokens=23, prompt_tokens=27, total_tokens=50, completion_tokens_details=None, prompt_tokens_details=None)}

In [8]:
dict(response.choices[0])

{'finish_reason': 'stop',
 'index': 0,
 'logprobs': None,
 'message': ChatCompletionMessage(content="It's nice to meet you. Is there something I can help you with or would you like to chat?", refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None)}

In [9]:
dict(response.choices[0].message)

{'content': "It's nice to meet you. Is there something I can help you with or would you like to chat?",
 'refusal': None,
 'role': 'assistant',
 'annotations': None,
 'audio': None,
 'function_call': None,
 'tool_calls': None}

In [10]:
response.choices[0].message.content

"It's nice to meet you. Is there something I can help you with or would you like to chat?"
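The cells above drill down from the full response to the reply text one attribute at a time. A small helper could wrap that path and tolerate an empty `choices` list or a missing `content`. This is a sketch only: `first_reply_text` is a hypothetical name, and the stand-in object below merely mimics the attribute layout shown in the outputs above so that the example runs without a server.

```python
from types import SimpleNamespace


def first_reply_text(response) -> str:
    """Return the text of the first choice, or "" if there is none."""
    choices = getattr(response, "choices", None) or []
    if not choices:
        return ""
    return choices[0].message.content or ""


# Stand-in with the same attribute path as a chat completion response.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Hi there!"))]
)
print(first_reply_text(fake_response))
```

In the notebook, `first_reply_text(response)` would give the same string as `response.choices[0].message.content`.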

In [17]:
chat_completion = client.chat.completions.create(
    model=models[2],
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why is open-source software important?"},
    ],
    stream=True,
    max_tokens=500
)

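# iterate over the stream and print each chunk as it arrives
for chunk in chat_completion:
    # delta.content can be None (e.g. on the final chunk), so fall back to ""
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")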

Open-source software (OSS) is an increasingly popular topic, and for good reason! Here are some reasons why OSS is important:

1. **Cost-effectiveness**: Open-source software is often free or low-cost to use, making it accessible to individuals, organizations, and governments with limited budgets.
2. **Customizability**: With open-source software, users have the freedom to modify and adapt the code to suit their needs, ensuring that the solution meets their specific requirements.
3. **Collaboration**: Open-source projects encourage collaboration among developers, which leads to better maintenance, bug fixing, and feature enhancement over time.
4. **Transparency**: Open-source software licenses provide visibility into how the project is maintained, updated, and distributed, giving users confidence in the software's quality.
5. **Community-driven development**: OSS relies on volunteer contributors who contribute to its development, which fosters a sense of community and encourages innova