## Key topics:

**Tokens**: Tokens are the units of text that Azure OpenAI models read and generate. A token can be a whole word or just a chunk of characters, and each token is represented by an integer ID. For English text, 1 token is approximately 4 characters or 0.75 words.

**Tokenization**: the process of splitting input and output text into tokens so that an LLM can process it.

**Vocabulary size**: the number of distinct tokens in a model's tokenizer, which varies between GPT models.
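As a rough illustration of the rule of thumb in the Tokens definition above, the sketch below estimates a token count from character and word counts. The function name `estimate_tokens` is hypothetical and the result is only a heuristic for English text; the exact count always comes from the tokenizer itself.

```python
def estimate_tokens(text: str) -> dict:
    """Estimate a token count with the ~4 characters/token and ~0.75 words/token heuristics."""
    n_chars = len(text)
    n_words = len(text.split())
    return {
        "by_chars": round(n_chars / 4),     # 1 token is roughly 4 characters
        "by_words": round(n_words / 0.75),  # 1 token is roughly 0.75 words
    }

# Illustrative input only; any English sentence works.
print(estimate_tokens("one two three four five six"))
```

The two heuristics rarely agree exactly, so they are best used for quick budgeting rather than for billing or limit checks.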

In [None]:
# pip install tiktoken  # tiktoken is open source and can be installed from PyPI

In [11]:
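import tiktoken

# Load the base BPE encoding used by the gpt-35-turbo and gpt-4 model families.
cl100k_base = tiktoken.get_encoding("cl100k_base")

# Build a new encoding that adds the chat-format special tokens to the base encoding.
# Note: _pat_str, _mergeable_ranks and _special_tokens are private attributes of tiktoken.Encoding.
enc = tiktoken.Encoding(
    name="gpt-35-turbo",
    pat_str=cl100k_base._pat_str,
    mergeable_ranks=cl100k_base._mergeable_ranks,
    special_tokens={
        **cl100k_base._special_tokens,
        "<|im_start|>": 100264,
        "<|im_end|>": 100265,
    },
)

# Encode a sample sentence into a list of integer token IDs.
tokens = enc.encode(
    "The Very Group announces long term strategic partnership with Carlyle and IMI and a robust Q2 performance."
)

print('Total number of tokens:', len(tokens))
# Decode each ID on its own to see the text chunk it stands for.
print('Tokens : ', [enc.decode([t]) for t in tokens])
print("Tokens' numerical values:", tokens)

# An interactive tokenizer is available at https://platform.openai.com/tokenizer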

Total number of tokens: 21
Tokens :  ['The', ' Very', ' Group', ' announces', ' long', ' term', ' strategic', ' partnership', ' with', ' Carly', 'le', ' and', ' IM', 'I', ' and', ' a', ' robust', ' Q', '2', ' performance', '.']
Tokens' numerical values: [791, 15668, 5856, 48782, 1317, 4751, 19092, 15664, 449, 79191, 273, 323, 6654, 40, 323, 264, 22514, 1229, 17, 5178, 13]
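The token strings printed above can be used to check the rule of thumb against this particular sentence. The sketch below copies them from the output, so it runs without tiktoken; it is only a sanity check on one example, not a general measurement.

```python
# Token strings copied from the output above.
token_strings = ['The', ' Very', ' Group', ' announces', ' long', ' term', ' strategic',
                 ' partnership', ' with', ' Carly', 'le', ' and', ' IM', 'I', ' and', ' a',
                 ' robust', ' Q', '2', ' performance', '.']

# Concatenating the decoded tokens reproduces the input sentence.
text = "".join(token_strings)

n_tokens = len(token_strings)
n_chars = len(text)
n_words = len(text.split())

print("Reconstructed text:", text)
print(f"{n_chars} characters / {n_tokens} tokens = {n_chars / n_tokens:.2f} characters per token")
print(f"{n_words} words / {n_tokens} tokens = {n_words / n_tokens:.2f} words per token")
```

For this sentence the ratio comes out at about 5 characters per token, slightly above the 4-character rule of thumb, while less common names such as "Carlyle" and "IMI" are split into more than one token.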


In [5]:
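from azure.identity import ManagedIdentityCredential
from openai import AzureOpenAI

# Authenticate with a user-assigned managed identity and request a
# Microsoft Entra ID (Azure AD) token for the Cognitive Services scope.
credential = ManagedIdentityCredential(client_id="f4980c43-9766-48d7-a925-c377a74605bb")
token = credential.get_token("https://cognitiveservices.azure.com/.default")
resource_endpoint = "https://openaiykus.openai.azure.com/"

# openai>=1.0 provides the azure_ad_token parameter for Entra ID tokens, so the
# legacy openai.api_type = "azure_ad" setting is not needed with this client.
# The token expires, so long-running sessions need to refresh it and re-create the client.
client = AzureOpenAI(
    azure_endpoint=resource_endpoint,
    azure_ad_token=token.token,
    api_version="2023-05-15",
)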

In [8]:
# The deployment name is the custom name you chose when you deployed the model.
deployment_name = 'gpt-35-turbo-instruct'

# Send a completion call to generate an answer, capped at 100 completion tokens.
print('Sending a test completion job')
start_phrase = 'Help with the cost of living. '
response = client.completions.create(
    model=deployment_name,
    prompt=start_phrase,
    max_tokens=100,
)
print(response.choices[0].text)

Sending a test completion job


1. Budgeting: The first step to tackle the cost of living is to create a budget. Write down all your monthly expenses, including rent, utilities, groceries, transportation, and other essential items. Then, compare it with your income. This will help you identify where you can make cuts and save money.

2. Cut unnecessary expenses: Identify areas where you can cut unnecessary expenses such as dining out, cable TV, gym membership, subscription services, etc. Cancel or reduce these expenses to


# Usage

In [9]:
response

Completion(id='cmpl-9AUIRvUKNxSJGYQe9DnaaJeE1n7Gb', choices=[CompletionChoice(finish_reason='length', index=0, logprobs=None, text='\n\n1. Budgeting: The first step to tackle the cost of living is to create a budget. Write down all your monthly expenses, including rent, utilities, groceries, transportation, and other essential items. Then, compare it with your income. This will help you identify where you can make cuts and save money.\n\n2. Cut unnecessary expenses: Identify areas where you can cut unnecessary expenses such as dining out, cable TV, gym membership, subscription services, etc. Cancel or reduce these expenses to')], created=1712286767, model='gpt-35-turbo-instruct', object='text_completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=100, prompt_tokens=8, total_tokens=108))

In [10]:
response.usage

CompletionUsage(completion_tokens=100, prompt_tokens=8, total_tokens=108)
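The usage fields above show that the total is the prompt tokens plus the completion tokens, and `finish_reason='length'` indicates that the completion stopped because it reached the `max_tokens=100` limit. The sketch below checks both using the numbers from the response; the `Usage` dataclass and `summarize_usage` helper are hypothetical stand-ins so the example runs without calling the service.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Stand-in for the SDK's CompletionUsage object (same three field names)."""
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int


def summarize_usage(usage, finish_reason: str, max_tokens: int) -> dict:
    """Report whether the totals add up and whether the output was cut off by max_tokens."""
    return {
        "total_matches": usage.total_tokens == usage.prompt_tokens + usage.completion_tokens,
        "hit_max_tokens": finish_reason == "length" and usage.completion_tokens >= max_tokens,
        "completion_share": usage.completion_tokens / usage.total_tokens,
    }


# Values copied from the response shown above.
usage = Usage(completion_tokens=100, prompt_tokens=8, total_tokens=108)
summary = summarize_usage(usage, finish_reason="length", max_tokens=100)
print(summary)
```

With a live call, the same helper could be given `response.usage` and `response.choices[0].finish_reason` directly, since it only reads those attribute names.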

Azure OpenAI uses a subword tokenization method called Byte-Pair Encoding (BPE) for its GPT-based models. **BPE starts from individual characters or bytes and repeatedly merges the most frequently occurring adjacent pair into a single new token**, until a target vocabulary size is reached. Because a rare or unseen word can always be broken into smaller known pieces, BPE lets the model handle it, while frequent words get compact and consistent representations. The model can also produce new words by generating a sequence of existing tokens.
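The merge loop described above can be sketched in a few lines. This is a toy, character-level version on a made-up corpus, with hypothetical function names (`merge_pair`, `train_bpe`, `apply_bpe`); the production tokenizers work on bytes, pre-split text with a regular expression, and learn a far larger vocabulary, so treat this only as an illustration of the idea.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def train_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words."""
    # Each distinct word is a tuple of symbols (initially characters) with a frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties go to the pair counted first.
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite the vocabulary with the new merged symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(symbols, best))] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Tokenize a word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


corpus = ["low", "low", "low", "lower", "lowest", "new", "newer"]
merges = train_bpe(corpus, num_merges=2)
print(merges)
print(apply_bpe("lowest", merges))
print(apply_bpe("slow", merges))
```

The word "slow" never appears in the toy corpus, yet it is still tokenized by reusing the learned "low" piece plus a single character, which is the property that lets BPE cope with unseen words.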

https://learn.microsoft.com/en-us/semantic-kernel/prompt-engineering/tokens