# Routing between an SLM and an LLM by token count

## Imports

In [71]:
import ollama
import common
import tiktoken

client = common.get_openai_client(api_key=common.api_KEY,
        api_version=common.api_version,
        azure_endpoint=common.api_URI)

## Call Ollama and Azure GPT

In [None]:
def gpt_completion(prompt: str, model: str = 'gpt4o') -> str:
    # With Azure OpenAI, `model` names the deployment to call.
    completion = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "user",
                "content": prompt,
            }
        ],
    )
    return completion.choices[0].message.content


def ollama_completion(prompt: str, model: str = 'llava') -> str:
    # Requires a local Ollama server with the model already pulled.
    response = ollama.chat(model=model, messages=[
        {
            'role': 'user',
            'content': prompt,
        },
    ])
    return response['message']['content']

## Summarize with Ollama LLaVA or Azure OpenAI GPT-4o, chosen by token count

In [None]:
# o200k_base is the GPT-4o tokenizer; the count is only an estimate for LLaVA.
encoding = tiktoken.get_encoding("o200k_base")

def get_tokens(text: str) -> list[int]:
    return encoding.encode(text)

def summarize(text: str) -> str:
    template = f"Summarize the following text in one paragraph:\n{text}"
    token_count = len(get_tokens(template))
    print(f"Token count: {token_count}")
    if token_count > 4096:
        # GPT processing
        print("Processing with GPT")
        return gpt_completion(template)
    else:
        # Ollama processing
        print("Processing with Ollama")
        return ollama_completion(template)

def read_file(file_path: str) -> str:
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()
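
The routing rule in `summarize` can also be written as a small pure function, which makes the threshold easy to check without calling either model. The following is a minimal sketch, not the notebook's own code: `choose_backend`, `route`, `fake_count`, and `fake_backends` are hypothetical names, and the whitespace-based counter is only a stand-in for `tiktoken` so the example runs offline.

```python
from typing import Callable

TOKEN_LIMIT = 4096  # same threshold the notebook uses


def choose_backend(token_count: int, limit: int = TOKEN_LIMIT) -> str:
    """Return 'gpt' for prompts above the limit, otherwise 'ollama'."""
    return "gpt" if token_count > limit else "ollama"


def route(prompt: str,
          count_tokens: Callable[[str], int],
          backends: dict[str, Callable[[str], str]],
          limit: int = TOKEN_LIMIT) -> str:
    """Count tokens, pick a backend by name, and call it with the prompt."""
    name = choose_backend(count_tokens(prompt), limit)
    return backends[name](prompt)


# Stand-ins so the sketch runs offline; NOT the real tokenizer or models.
def fake_count(text: str) -> int:
    return len(text.split())


fake_backends = {
    "gpt": lambda p: "handled by gpt",
    "ollama": lambda p: "handled by ollama",
}

short_result = route("word " * 10, fake_count, fake_backends)
long_result = route("word " * 5000, fake_count, fake_backends)
print(short_result)
print(long_result)
```

In the notebook itself, `count_tokens` would presumably be `lambda t: len(get_tokens(t))` and the backends would map to `gpt_completion` and `ollama_completion`. A prompt of exactly 4096 tokens stays on Ollama, because the comparison is strictly greater than.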

## Read a short file and determine if it can be processed by Ollama locally or with Azure GPT

In [68]:
short_data = read_file("data/401k.txt")
print(summarize(short_data))

Token count: 463
Processing with Ollama
 The text discusses a 401(k) plan offered by Contoso to its eligible employees as part of its benefits package. The company matches employee contributions up to a certain percentage, based on years of service and salary level. A third-party provider administers the plan, allowing participants to choose from various investment options ranging from conservative to aggressive. Participants can adjust their contribution amount and investment allocation at any time. While there are rules and limitations regarding withdrawals, the company encourages employees to read the summary plan description (SPD) and participant disclosure notice (PDN) for more information about the 401(k) plan. Employees with questions can contact the provider's customer service center or consult a financial planner or tax advisor. 


## Read a long file and determine if it can be processed by Ollama locally or with Azure GPT

In [69]:
long_data = read_file("data/aks_core_concepts.txt")
print(summarize(long_data))


Token count: 5797
Processing with GPT
Kubernetes is an evolving platform for managing container-based applications, focusing on application workloads rather than infrastructure. It uses a declarative approach to deployments supported by robust APIs, facilitating modern, portable, microservices-based applications. Azure Kubernetes Service (AKS) simplifies Kubernetes management by handling the control plane, with users only paying for the nodes running applications. Kubernetes architecture splits into a control plane and nodes; the control plane manages core services and workload orchestration, while nodes execute application workloads. Components like kube-apiserver and etcd in the control plane and kubelet and kube-proxy on nodes play crucial roles. AKS nodes use Azure VMs, with configurable sizes and scaling to meet application demands. Current best practices include using deployments for stateless applications and StatefulSets or DaemonSets for stateful applications and specific task