# Writing the request-sending code and configuring the models

In [None]:
!pip install -q openai

In [None]:
import openai
openai.__version__

'1.38.0'

In [None]:
from openai import OpenAI
import time

from google.colab import userdata
API_KEY = userdata.get('OpenAI_API_key')

client = OpenAI(api_key=API_KEY)

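# One entry per role: the intent classifier plus the 'easy' and 'hard' answer models.
# price_input / price_output look like USD per 1M tokens (an assumption; the unit is not stated here).
config = {
    'intent': {
        'model': "gpt-3.5-turbo-0125",
        'system_prompt': """You are a helpful assistant. Classify the following prompt as a 'simple' or 'hard' task.
             The task is 'hard' only if it requires the background in computer science. Return just one lower-case word 'simple' or 'hard':""",
        'max_tokens': 5,  # the classifier only needs to emit a single word
        'price_input': 0.5,
        'price_output': 1.5,
    },
    'easy': {
        'model': "gpt-3.5-turbo-0125",
        'system_prompt': "You're a helpful assistant",
        'max_tokens': None,  # no explicit cap on the answer length
        'price_input': 0.5,
        'price_output': 1.5,
    },
    'hard': {
        'model': "gpt-4o",
        'system_prompt': "You're a helpful assistant",
        'max_tokens': None,
        'price_input': 5,
        'price_output': 15,
    },
}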

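def call_model(prompt, model_type):
    """Send `prompt` to the model configured under `model_type`.

    Returns (price, latency, intent); `intent` is the classifier's reply
    when model_type == 'intent' and None otherwise.
    """
    cfg = config[model_type]
    start_time = time.perf_counter()
    response = client.chat.completions.create(
        model=cfg['model'],
        messages=[
            {"role": "system", "content": cfg['system_prompt']},
            {"role": "user", "content": prompt}
        ],
        max_tokens=cfg['max_tokens']
    )
    latency = time.perf_counter() - start_time  # wall-clock seconds for the API call

    answer = response.choices[0].message.content
    # Only the classifier call produces an intent label.
    intent = answer if model_type == 'intent' else None

    tokens_input = response.usage.prompt_tokens
    tokens_output = response.usage.completion_tokens
    # Token counts are multiplied by the configured prices without rescaling,
    # so `price` is a raw cost score in the price table's units, not dollars.
    price = tokens_input*cfg['price_input'] + tokens_output*cfg['price_output']
    print(answer)
    return price, latency, intent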
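
The `price` value returned by `call_model` is the token count multiplied by the configured price, with no rescaling. If the prices in `config` are quoted per 1M tokens (an assumption; the notebook does not state the unit), the raw value is in millionths of a dollar. A minimal sketch of a conversion helper follows; `TOKENS_PER_PRICE_UNIT`, `price_to_usd` and `request_cost_usd` are hypothetical names, not part of the notebook.

```python
# Assumption: price_input / price_output are USD per 1,000,000 tokens.
TOKENS_PER_PRICE_UNIT = 1_000_000


def price_to_usd(raw_price):
    """Convert the raw cost score returned by call_model into US dollars."""
    return raw_price / TOKENS_PER_PRICE_UNIT


def request_cost_usd(tokens_input, tokens_output, price_input, price_output):
    """Cost of a single request in US dollars under the per-1M-token assumption."""
    raw = tokens_input * price_input + tokens_output * price_output
    return price_to_usd(raw)
```

Under that assumption, a raw score of `38.0` would correspond to roughly $0.000038.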

# Defining test queries and checking how the classifier handles them

In [None]:
EASY_QUERY = 'Who is the best actor who played spider-man?'
HARD_QUERY = 'What is the difference between Adam and AdamW optimizers?'

call_model(HARD_QUERY, 'intent')
call_model(EASY_QUERY, 'intent')

hard
simple


(38.0, 0.6099886894226074, 'simple')
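
The classifier is asked for a single lower-case word, but nothing forces the model to comply: it may add punctuation, capital letters or extra words. One possible defensive parser is sketched below. `normalize_intent` is a hypothetical helper, and falling back to `'hard'` for unrecognized replies is a design assumption (it favors answer quality over cost).

```python
import re


def normalize_intent(raw, default='hard'):
    """Map a free-form classifier reply to 'simple' or 'hard'."""
    if not raw:
        return default
    # Take the first recognized label among the lower-cased alphabetic words.
    for word in re.findall(r"[a-z]+", raw.lower()):
        if word in ('simple', 'hard'):
            return word
    return default
```

Passing `default='simple'` instead would send unrecognized replies to the cheaper model.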

# Using only the powerful model

In [None]:
price_1,latency_1, _ = call_model(HARD_QUERY, 'hard')
price_2,latency_2, _ = call_model(EASY_QUERY, 'hard')
tot_price = price_1 + price_2
tot_latency = latency_1 + latency_2
print('\n-------------------------------\nTOTAL PRICE IS {}, LATENCY {}'.format(tot_price, tot_latency))

Adam and AdamW are both gradient-based optimization algorithms used in training machine learning models, particularly deep learning models. Although they share similarities, there are key differences between the two. Here's a breakdown:

### Adam Optimizer:
1. **Algorithm**: Adam (Adaptive Moment Estimation) uses estimates of first and second moments of the gradients to adapt the learning rate for each parameter.
   
2. **Update Rule**: The update rule for Adam is:
   \[
   \theta_{t+1} = \theta_t - \frac{\alpha}{\sqrt{\hat{v}_t} + \epsilon} \hat{m}_t
   \]
   Where:
   - \(\theta_t\) are the parameters at step \(t\),
   - \(\alpha\) is the learning rate,
   - \(\hat{m}_t\) and \(\hat{v}_t\) are the bias-corrected first and second moment estimates respectively,
   - \(\epsilon\) is a small constant to prevent division by zero.

3. **Weight Decay**: Traditional Adam incorporates weight decay directly into the update step, which can introduce issues by coupling the weight decay rate with

# Using a multi-model setup

In [None]:
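def router(query):
    """Classify the query with the cheap model, then answer with the matching model."""
    price_intent, latency_intent, intent = call_model(query, 'intent')
    # Anything other than an explicit 'hard' label goes to the cheap model.
    if intent.strip().lower() == 'hard':
        price_answer, latency_answer, _ = call_model(query, 'hard')
    else:
        price_answer, latency_answer, _ = call_model(query, 'easy')

    # The classifier call adds its own cost and latency on top of the answer.
    total_price = price_intent + price_answer
    total_latency = latency_intent + latency_answer
    return total_price, total_latency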
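
Routing adds one classifier call to every query, so it only pays off when enough traffic ends up on the cheaper model. The sketch below estimates that trade-off in the same raw price units as `config`. The function names and any token counts passed in are illustrative assumptions, not measurements from this notebook, and the sketch simplifies by assuming both answer models consume and produce the same number of tokens.

```python
def raw_cost(tokens_in, tokens_out, price_in, price_out):
    """Raw cost score: token counts times the configured prices."""
    return tokens_in * price_in + tokens_out * price_out


def expected_cost_per_query(hard_fraction, tokens_in, tokens_out,
                            intent_tokens_in, intent_tokens_out=1):
    """Return (always_hard, routed) expected raw cost per query.

    hard_fraction is the assumed share of queries classified as 'hard'.
    Prices are copied from the notebook's config.
    """
    hard = raw_cost(tokens_in, tokens_out, 5, 15)
    easy = raw_cost(tokens_in, tokens_out, 0.5, 1.5)
    intent = raw_cost(intent_tokens_in, intent_tokens_out, 0.5, 1.5)
    routed = intent + hard_fraction * hard + (1 - hard_fraction) * easy
    return hard, routed


# Hypothetical workload: 20 prompt tokens, 300 answer tokens, 80-token classifier prompt.
always_hard, routed = expected_cost_per_query(0.5, 20, 300, 80)
```

With `hard_fraction=1.0` the routed cost exceeds the always-hard cost by exactly the price of the classifier call, which is the overhead routing has to earn back.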

In [None]:
price_1,latency_1 = router(HARD_QUERY)
price_2,latency_2 = router(EASY_QUERY)
tot_price = price_1 + price_2
tot_latency = latency_1 + latency_2
print('\n-------------------------------\nTOTAL PRICE IS {}, LATENCY {}'.format(tot_price, tot_latency))

hard
Adam and AdamW are both optimization algorithms widely used in training deep learning models, but they have some key differences, primarily revolving around how they handle weight decay (regularization).

### Adam Optimizer
Adam (short for Adaptive Moment Estimation) is an optimization algorithm that combines the ideas of Momentum and RMSprop. It adapts the learning rate for each parameter based on the first and second moments of the gradients.

Key features:
1. **Adaptive Learning Rates:** It computes adaptive learning rates for each parameter.
2. **Momentum:** Uses moving averages of the gradients (first moment) and the squared gradients (second moment).
3. **Bias Correction:** Includes bias correction terms to account for the initialization of the moving averages.

The update rule for Adam is:
\[ \theta_{t+1} = \theta_t - \eta \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} \]
where \(\eta\) is the learning rate, \(\hat{m}_t\) is the bias-corrected first moment estimate, and \(\h