## Overview
This hands-on exercise helps you understand how token usage works with the Azure OpenAI API.

To learn what a token is and how tokens influence interaction with an LLM, see https://help.openai.com/en/articles/4936856-what-are-tokens-and-how-to-count-them

To tokenize text interactively, try the OpenAI Tokenizer at https://platform.openai.com/tokenizer or https://tiktokenizer.vercel.app/?model=gpt-4o

For more on the various GPT models and their token limits, see https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-4-and-gpt-4-turbo-models



### 1. Install the tiktoken Python library

In [None]:
# install tiktoken to count the number of tokens
%pip install tiktoken

### 2. Count the number of tokens in a given sentence

In [None]:
import os

import tiktoken
from dotenv import load_dotenv
from openai import AzureOpenAI

# Load the endpoint, key, and model names from a local .env file
load_dotenv()

# Create the AOAI client using the endpoint and key credentials
client = AzureOpenAI(
    azure_endpoint=os.getenv("OPENAI_API_ENDPOINT"),
    api_key=os.getenv("OPENAI_API_KEY"),
    api_version='2023-05-15',
)

# set the model deployment name
CHAT_COMPLETIONS_MODEL = os.getenv("GPT4_MODEL_NAME")

# Sample prompt text whose tokens we will count
prompt = "Azure OpenAI service is General Available now!, UKHO team you can ready to start use Azure OpenAI Models now"

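# encoding_for_model expects a model name that tiktoken recognises (for example "gpt-4");
# it raises a KeyError if the deployment name is not such a name
encoding = tiktoken.encoding_for_model(CHAT_COMPLETIONS_MODEL)
print(encoding)
tokens = encoding.encode(prompt)
print('Total number of tokens:', len(tokens))
print('Tokens :', tokens)
# Decode each token ID separately to see the piece of text it represents
print('Token text : ', [encoding.decode([t]) for t in tokens])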
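
The count above covers the prompt text only. A chat request also carries a few framing tokens per message, so the prompt token count reported by the service is usually slightly higher. The sketch below estimates that total. The overhead values (3 tokens per message and 3 tokens to prime the reply) follow the heuristic published in the OpenAI Cookbook for GPT-3.5 and GPT-4 chat models; they are assumptions and may differ for other models. The function name `estimate_chat_tokens` and the stand-in `whitespace_encode` are hypothetical, introduced only for illustration.

```python
from typing import Callable, Dict, List, Sequence


def estimate_chat_tokens(
    messages: List[Dict[str, str]],
    encode: Callable[[str], Sequence[int]],
    tokens_per_message: int = 3,  # assumed framing overhead per message
    reply_priming: int = 3,       # assumed overhead for priming the assistant reply
) -> int:
    """Estimate the prompt tokens consumed by a list of chat messages."""
    total = reply_priming
    for message in messages:
        total += tokens_per_message
        # Both the role and the content are encoded and counted
        for value in message.values():
            total += len(encode(value))
    return total


def whitespace_encode(text: str) -> List[int]:
    """Stand-in encoder (one 'token' per word) so the sketch runs without tiktoken."""
    return list(range(len(text.split())))


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Count my tokens please"},
]
estimated = estimate_chat_tokens(messages, whitespace_encode)
print("Estimated prompt tokens:", estimated)
```

In the notebook, pass `encoding.encode` instead of `whitespace_encode` to get an estimate based on the real tokenizer.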

### 3. Show two returned results

In [None]:
# Call the AOAI model with the prompt to get completions
response = client.chat.completions.create(
    model=CHAT_COMPLETIONS_MODEL,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ],
    max_tokens=60,  # maximum number of tokens to generate
    n=2,  # number of responses to return
)
print('='*30, 'ANSWER #1', '='*30)
print(response.choices[0].message.content)
print('='*30, 'ANSWER #2', '='*30)
print(response.choices[1].message.content)
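
With a small `max_tokens`, an answer can stop mid-sentence. Each returned choice carries a `finish_reason`, which is `"length"` when the token limit cut the answer short and `"stop"` when the model finished on its own. The sketch below loops over the choices instead of indexing them by hand and flags truncated ones. The helper names and the `SimpleNamespace` stand-in objects are hypothetical; they only mimic the shape of `response.choices` so the example runs without calling the service.

```python
from types import SimpleNamespace


def summarize_choices(choices):
    """Return (index, finish_reason, text) for every returned completion."""
    return [
        (index, choice.finish_reason, choice.message.content)
        for index, choice in enumerate(choices)
    ]


def was_truncated(choice) -> bool:
    """True when generation stopped because the token limit was reached."""
    return choice.finish_reason == "length"


# Stand-in objects with the same attribute names as response.choices
fake_choices = [
    SimpleNamespace(
        finish_reason="stop",
        message=SimpleNamespace(content="A short, complete answer."),
    ),
    SimpleNamespace(
        finish_reason="length",
        message=SimpleNamespace(content="A longer answer that was cut"),
    ),
]

for index, reason, text in summarize_choices(fake_choices):
    print('=' * 30, f'ANSWER #{index + 1} ({reason})', '=' * 30)
    print(text)

truncated = [was_truncated(choice) for choice in fake_choices]
print("Truncated:", truncated)
```

In the notebook, call `summarize_choices(response.choices)` on the real response.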

### Token usage

In [None]:
# Show the overall token usage for the request
response.usage
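
The usage object exposes `prompt_tokens`, `completion_tokens`, and `total_tokens`, where the total is the sum of the other two. When `n` is greater than 1, the completion count should cover all returned answers together, while the prompt is counted once. The sketch below checks that arithmetic and turns it into a rough cost estimate. The `Usage` dataclass is a stand-in with the same field names, and the prices are made-up placeholder values, not actual Azure OpenAI pricing.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Stand-in with the same field names as response.usage."""
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def estimate_cost(usage, prompt_price_per_1k: float, completion_price_per_1k: float) -> float:
    """Rough cost estimate from token counts and per-1,000-token prices."""
    prompt_cost = usage.prompt_tokens / 1000 * prompt_price_per_1k
    completion_cost = usage.completion_tokens / 1000 * completion_price_per_1k
    return prompt_cost + completion_cost


# Example numbers: a 40-token prompt and two 60-token answers (n=2)
usage = Usage(prompt_tokens=40, completion_tokens=120, total_tokens=160)
totals_match = usage.total_tokens == usage.prompt_tokens + usage.completion_tokens

# Hypothetical prices, for illustration only
cost = estimate_cost(usage, prompt_price_per_1k=0.03, completion_price_per_1k=0.06)
print("Totals match:", totals_match)
print(f"Estimated cost: {cost:.4f}")
```

In the notebook, pass `response.usage` directly to `estimate_cost` together with the prices that apply to your deployment.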

### Token usage, limits, and errors

In [None]:
prompt = "NHS England, formerly the NHS Commissioning Board, is an executive non-departmental public body of the Department of Health and Social Care. It oversees the budget, planning, delivery and day-to-day operation of the commissioning side of the National Health Service in England as set out in the Health and Social Care Act 2012.[3] It directly commissions NHS general practitioners, dentists, optometrists and some specialist services. The Secretary of State publishes annually a document known as the NHS mandate which specifies the objectives which the Board should seek to achieve. National Health Service (Mandate Requirements) Regulations are published each year to give legal force to the mandate. In 2018 it was announced that the organisation, while maintaining its statutory independence, would be merged with NHS Improvement, and seven 'single integrated regional teams' would be jointly established.[4] History Main article: History of the National Health Service (England) Amanda Pritchard NHS England is the operating name of the NHS Commissioning Board and, before that, the NHS Commissioning Board Authority.[5] It was set up as a special health authority of the NHS in October 2011 as the forerunner to becoming a non-departmental body on 1 April 2013.[3] It was renamed NHS England on 26 March 2013,[6] although its legal name remains the NHS Commissioning Board. Sir David Nicholson, who became Chief Executive at the establishment of the Board, retired at the end of March 2014 and was replaced by Simon Stevens. 
One of Stevens' first acts was to announce a restructure of the 27 area teams, in response to a requirement to reduce running costs, which would reduce staffing by around 500.[7] The 27 teams outside London were reduced to 12 in 2015.[8] Amanda Pritchard, the Chief Operating Officer of NHS England, was promoted to Chief Executive on 1 August 2021.[9] System management NHS England produced a planning document – the Five Year Forward View – in October 2014 which envisaged development of new models to suit local needs. In conjunction with the other central regulators, the organisation established what is called a 'success regime' in south and mid Essex, North Cumbria and north east and western Devon in June 2015. It was intended to tackle 'deep rooted and systemic issues that previous interventions have not tackled across [a] whole health and care economy'.[10] In 2016 it organised the geographical division of England into 44 sustainability and transformation plan areas, with populations between 300,000 and 3 million. These areas were locally agreed between NHS Trusts, local authorities and clinical commissioning groups (CCGs). A leader was appointed for each area, who is to be responsible for the implementation of the plans which are to be agreed by the component organisations. They will be 'working across organisational boundaries to help build a consensus for transformation and the practical steps to deliver it'.[11] In April 2017 it introduced a capped expenditure process, applied to NHS commissioners and providers in the 13 areas across England with the largest budget deficits. It is intended to reduce their spending by around £500 million, and health leaders were told to 'think the unthinkable'.[12] In 2022 there were seven regional teams and 10,640 full time equivalent staff.[13] Funding of clinical commissioning groups NHS England allocates funding (£69.5 billion in 2016/2017) to CCGs in accordance with a funding formula. 
Until 2016, progress towards the amount indicated by the formula from the historical allocation was slow, and CCGs which were above their allocation did not suffer a reduction. From April 2016, however, CCGs with more than 10 per cent above their fair share were to receive 'flat cash' – an effective reduction. This would also ensure than no CCG was more than 5 per cent below its target allocation in 2016/2017.[14] Operational pressures escalation levels framework In October 2016 it introduced the Operational Pressures Escalation Levels (OPEL) system for the management of operational difficulties in English hospitals, replacing the system of red and black alerts which was locally defined. OPELs range from 1 (normal) to 4 (a major crisis requiring external intervention either regionally or nationally).[15][16] This is intended, among other things, to enable comparisons of trends over time and between areas.[17] Response to 2020 pandemic In view of the coronavirus pandemic, the Secretary of State for Health and Social Care directed the NHS Commissioning Board to buy services from the private sector, thereby bypassing CCGs. The Exercise of Commissioning Functions by the NHS Commissioning Board (Coronavirus) Directions 2020 came into force on 20 March 2020, and were revised to widen the definition of independent providers on 27 March.[18] The directive also allows NHS England to exercise functions normally carried out by CCGs, as the Board deems appropriate. The direction will be in place until the end of 2020.[19] Primary care See also: General medical services Applications by GPs to reduce their catchment area are dealt with by NHS England.[20] In November 2014, Mr Justice Popplewell declared that NHS England 'has acted unlawfully by reason of its failure to make arrangements for the involvement of patients in primary care commissioning decisions as required by the National Health Service Act 2006'. 
The case involved the decision to scrap the minimum practice income guarantee. Richard Stein, a partner at Leigh Day, said the declaration could mean that patients would have to be involved in discussions on changes to the GP contract.[21] NHS England awarded a four-year contract to Capita to become sole provider of administrative services including payment administration, management of medical records, and eligibility lists for practitioners for GPs, opticians and dentists across the UK in June 2015.[22] Information technology The organisation was reported to be developing a strategy to support the use of personal health records in June 2015. This, it is hoped, could achieve up to £3.4 billion in annual efficiency savings by 2020.[23] In April 2016 it published an index of digital maturity, where each of the 239 NHS trusts assessed its own 'readiness', 'capabilities' and 'enabling infrastructure'.[24][25] In 2018 the NHS app was unveiled, with public backing from Matt Hancock, who presented it as the key to a radical overhaul of NHS technology.[26] The NHS Long Term Plan set a target for all secondary care providers to move to digital records by 2024, which 'will cover clinical and operational processes across all settings, locations and departments and be based on robust, modern IT infrastructure services for hosting, storage, networks and cyber security.'[27] In 2020 the NHS awarded an emergency contract to Palantir Technologies for the creation of a Covid Data Store system for a statutory fee of £1. Palantir then won a £23.5 million contract with the NHS to continue with its work in December 2020. Palantir's involvement with the NHS was subject to criticism from civil liberties groups, including Open Democracy. [28] In 2023 the NHS began a procurement process for a Federated Data Platform (FDP). 
[29] An FDP would enable every hospital trust and integrated care system (ICS) to have their own platform through which they could connect and share information between them where helpful. [30] The FDP would build on the work done with the Covid Data Store .The contract to provide the FDP would be worth up to £480 million. [31] Palantir are widely thought to be the frontrunners to win. [32] This has prompted further criticism from civil liberties groups. [33] Specialised commissioning Specialised services are those provided in relatively few hospitals and accessed by comparatively small numbers of patients, but with catchment populations of usually more than one million. These services tend to be located in specialised hospital trusts that can recruit a team of staff with the appropriate expertise. NHS England is responsible for commissioning £19.3 billion of specialised services in 2021-2 and for dealing with Individual Funding Requests in respect of the specialist services it commissions. There are proposals to move some of this commissioning to integrated care systems. The Shelford Group expressed concerns in May 2022 about services 'where the numbers and evidence base supports the planning and provision of care being done at a population size larger than a typical ICS footprint.'[34] In 2015 there was criticism of delays in deciding on a policy for the prescription of Everolimus in the treatment of tuberous sclerosis. Twenty doctors addressed a letter to the board in support of the charity Tuberous Sclerosis Association saying 'around 32 patients with critical need, whose doctors believe everolimus treatment is their best or only option, have no hope of access to funding. Most have been waiting many months. 
Approximately half of these patients are at imminent risk of a catastrophic event (renal bleed or kidney failure) with a high risk of preventable death.'[35] It authorises and pays for treatment of narcolepsy with sodium oxybate by means of individual funding requests on the basis of exceptional circumstances. In May 2016 the High Court ordered NHS England to provide funding to treat a teenager with severe narcolepsy. The judge criticised their 'thoroughly bad decision' and 'absurd' policy discriminating against the girl when hundreds of other NHS patients already receive the drug. The Department of Health is also paying for the treatment of people whose narcolepsy was caused by the swine flu vaccine Pandemrix in 2009–10 by means of private prescriptions outside the National Health Service.[36] In May 2022 it produced guidance that said 65 of the 154 services they commissioned were ready and suitable to be devolved to the integrated care systems. It is expected that most will be jointly commissioned with their neighbours, rather than on their own."


In [None]:
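# Switch to a model with a smaller context window (4k tokens)
MODEL = os.getenv("GPT35_MODEL_NAME")

# encoding_for_model expects a model name that tiktoken recognises;
# it raises a KeyError if the deployment name is not such a name
encoding = tiktoken.encoding_for_model(MODEL)
print(encoding)
tokens = encoding.encode(prompt)
print('Total number of tokens:', len(tokens))
print('Tokens :', tokens)
# Decode each token ID separately to see the piece of text it represents
print('Token text : ', [encoding.decode([t]) for t in tokens])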
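
Once the prompt's token count is known, it is possible to check whether a request can fit before sending it. For a model with a 4k context window, the prompt tokens and the tokens reserved for the answer share the same budget. The sketch below shows that check. The function names are hypothetical, and the limit of 4096 is an assumption based on the "4k" window mentioned above; use the documented limit of your own deployment.

```python
def fits_context(prompt_tokens: int, max_completion_tokens: int, context_window: int) -> bool:
    """True when prompt plus requested completion stay within the context window."""
    return prompt_tokens + max_completion_tokens <= context_window


def remaining_for_completion(prompt_tokens: int, context_window: int) -> int:
    """Tokens left for the answer after the prompt is accounted for (never negative)."""
    return max(context_window - prompt_tokens, 0)


CONTEXT_WINDOW = 4096  # assumed limit for a 4k model

# A short prompt leaves plenty of room for a 60-token answer
short_ok = fits_context(prompt_tokens=25, max_completion_tokens=60, context_window=CONTEXT_WINDOW)

# A prompt longer than the window cannot fit, whatever max_tokens is
long_ok = fits_context(prompt_tokens=4200, max_completion_tokens=60, context_window=CONTEXT_WINDOW)

print("Short prompt fits:", short_ok)
print("Long prompt fits:", long_ok)
print("Room left after a 4000-token prompt:", remaining_for_completion(4000, CONTEXT_WINDOW))
```

In the notebook, `len(tokens)` gives the prompt token count to pass in. The figure excludes the per-message framing tokens of a chat request, so leave a small safety margin.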

In [None]:
import openai

# Call AOAI with the long prompt to see what happens at the token limit
try:
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        # max_tokens=4096,
        n=1,
    )

    print(response.choices[0].message.content)
except openai.APIError as e:
    print(f"Error: {e}")
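
If the request fails because the prompt exceeds the context window, two common workarounds are to truncate the prompt to a token budget or to split it into chunks that are sent separately. The sketch below shows both on a plain list of token IDs. The helper names are hypothetical, and the sketch makes no attempt to cut at sentence boundaries.

```python
from typing import List, Sequence


def truncate_tokens(tokens: Sequence[int], limit: int) -> List[int]:
    """Keep only the first `limit` tokens."""
    if limit < 0:
        raise ValueError("limit must not be negative")
    return list(tokens[:limit])


def chunk_tokens(tokens: Sequence[int], chunk_size: int) -> List[List[int]]:
    """Split a token list into consecutive chunks of at most `chunk_size` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


# Stand-in token IDs so the sketch runs without tiktoken
sample_tokens = list(range(10))

shortened = truncate_tokens(sample_tokens, 4)
chunks = chunk_tokens(sample_tokens, 4)
print("Truncated:", shortened)
print("Chunks:", chunks)
```

In the notebook, `encoding.decode(truncate_tokens(tokens, limit))` turns the shortened token list back into text that can be sent as the prompt. A cut at an arbitrary token boundary can end mid-word, so check the decoded text before relying on it.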