# GitHub Marketplace Model Inference with Phi-4 Reasoning

This notebook demonstrates how to use models from the GitHub Marketplace for inference, specifically Microsoft's Phi-4-reasoning model.

## Setup and configuration

First, we configure the environment and install the required dependencies.


In [None]:
# Install required packages
!pip install requests python-dotenv
!pip install azure-ai-inference

### Setting up the local.env file

Before running this notebook, create a file named `local.env` in the same directory as the notebook, containing the following variables:

```
# GitHub Configuration
GITHUB_TOKEN=your_personal_access_token_here
GITHUB_INFERENCE_ENDPOINT=https://models.github.ai/inference
GITHUB_MODEL=microsoft/Phi-4-reasoning

# Azure OpenAI Configuration
AZURE_API_KEY=your_azure_api_key_here
AZURE_OPENAI_ENDPOINT=your_azure_endpoint_here
AZURE_OPENAI_MODEL=Phi-4-reasoning
```

**Instructions:**

1. Create a new file named `local.env` in the same folder as this notebook
2. Add the environment variables listed above
3. Replace `your_personal_access_token_here` with your GitHub personal access token
4. Optionally, change the model to `microsoft/Phi-4-mini-reasoning` to use a smaller model

**Note:** The GitHub token needs the appropriate permissions to access the AI models service.
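A quick sanity check on the file contents can catch a forgotten placeholder before any API call is made. The following is a minimal sketch using only the standard library; `parse_env_text` and `find_env_problems` are hypothetical helpers written for this illustration, and the parser is a simplification that handles only plain `KEY=VALUE` lines (not everything `python-dotenv` accepts).

```python
# Placeholder value copied from the template above; a real token must replace it.
PLACEHOLDER_TOKEN = "your_personal_access_token_here"


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_env_problems(values):
    """Return a list of human-readable problems with the GitHub settings."""
    problems = []
    token = values.get("GITHUB_TOKEN", "")
    if not token:
        problems.append("GITHUB_TOKEN is missing")
    elif token == PLACEHOLDER_TOKEN:
        problems.append("GITHUB_TOKEN still holds the placeholder value")
    endpoint = values.get("GITHUB_INFERENCE_ENDPOINT", "")
    if endpoint and not endpoint.startswith("https://"):
        problems.append("GITHUB_INFERENCE_ENDPOINT should be an https:// URL")
    return problems


# Sample text mirroring the GitHub part of the template above.
sample = """
# GitHub Configuration
GITHUB_TOKEN=your_personal_access_token_here
GITHUB_INFERENCE_ENDPOINT=https://models.github.ai/inference
GITHUB_MODEL=microsoft/Phi-4-reasoning
"""
print(find_env_problems(parse_env_text(sample)))
```

To check a real file, the text could be read with `open("local.env", encoding="utf-8").read()` and passed to the same two functions.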


## Loading environment variables

We load the environment variables from the `local.env` file, which holds the GitHub token and the model information.


In [None]:
import os
import requests
import json
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential
from dotenv import load_dotenv

# Load variables from local.env file
load_dotenv('local.env')

# Access the environment variables - using values from local.env file
endpoint = os.getenv("GITHUB_INFERENCE_ENDPOINT")
model = os.getenv("GITHUB_MODEL")
token = os.getenv("GITHUB_TOKEN")
# Azure variable names match the keys documented in local.env
azuretoken = os.getenv("AZURE_API_KEY")
azureendpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
azuremodel = os.getenv("AZURE_OPENAI_MODEL")
# Use fallback values if not found in local.env
if not endpoint:
    endpoint = "https://models.github.ai/inference"
    print("Warning: GITHUB_INFERENCE_ENDPOINT not found in local.env, using default value")

if not model:
    model = "microsoft/Phi-4-reasoning"
    print("Warning: GITHUB_MODEL not found in local.env, using default value")
    print("To change the model to Phi-4-mini-reasoning use \"microsoft/Phi-4-mini-reasoning\"")

if not token:
    raise ValueError("GITHUB_TOKEN not found in local.env file. Please add your GitHub token.")

print(f"Endpoint: {endpoint}")
print(f"Model: {model}")
print(f"azureendpoint: {azureendpoint}")
print(f"azuremodel: {azuremodel}")
print(f"azuretoken available: {'Yes' if azuretoken else 'No'}")
print(f"Token available: {'Yes' if token else 'No'}")

## Helper functions for model inference

Let's create helper functions for interacting with the GitHub inference API.


In [5]:
def generate_completion(prompt, model_id=model, temperature=0.7, max_tokens=10000):
    """Generate a completion using the GitHub inference API"""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    
    # The request body is built in chat-completions format inside the try block,
    # wrapping the prompt in a single user message.
    
    try:
        # GitHub Models API requires a different endpoint structure
        api_url = f"{endpoint}/v1/chat/completions"
        print(f"Calling API at: {api_url}")
        
        # Modify payload for chat completions format
        chat_payload = {
            "model": model_id,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": temperature,
            "max_tokens": max_tokens
        }
        
        response = requests.post(api_url, headers=headers, json=chat_payload)
        response.raise_for_status()  # Raise exception for 4XX/5XX errors
        result = response.json()
        
        if 'choices' in result and len(result['choices']) > 0:
            # Handle chat completions response format
            if 'message' in result['choices'][0] and 'content' in result['choices'][0]['message']:
                return result['choices'][0]['message']['content']
            # Fall back to the text field if available
            elif 'text' in result['choices'][0]:
                return result['choices'][0]['text']
            else:
                return f"Error: Could not extract content from response: {result['choices'][0]}"
        else:
            return f"Error: Unexpected response format: {result}"
    except Exception as e:
        print(f"Full error details: {str(e)}")
        return f"Error during API call: {str(e)}"

def format_conversation(messages):
    """Format a conversation for the model"""
    # For chat completion API, we'll just return the messages directly
    return messages

def generate_chat_completion(messages, model_id=model, temperature=0.7, max_tokens=1000):
    """Generate a completion using GitHub's chat completions API"""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json"
    }
    
    payload = {
        "model": model_id,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens
    }
    
    try:
        api_url = f"{endpoint}/v1/chat/completions"
        print(f"Calling API at: {api_url}")
        
        response = requests.post(api_url, headers=headers, json=payload)
        response.raise_for_status()
        result = response.json()
        
        if 'choices' in result and len(result['choices']) > 0:
            if 'message' in result['choices'][0]:
                return result['choices'][0]['message']['content']
            else:
                return f"Error: Unexpected response format: {result['choices'][0]}"
        else:
            return f"Error: Unexpected response format: {result}"
    except Exception as e:
        print(f"Full error details: {str(e)}")
        return f"Error during API call: {str(e)}"
        
# For backward compatibility with existing code
def format_prompt_legacy(messages):
    """Format a conversation in text format (legacy method)"""
    formatted_prompt = ""
    
    for msg in messages:
        role = msg.get("role", "")
        content = msg.get("content", "")
        
        if role == "user":
            formatted_prompt += f"User: {content}\n\n"
        elif role == "assistant":
            formatted_prompt += f"Assistant: {content}\n\n"
        elif role == "system":
            formatted_prompt += f"{content}\n\n"
    
    formatted_prompt += "Assistant: "
    return formatted_prompt
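The responses recorded later in this notebook begin with a `<think>` block that holds the model's reasoning. A small post-processing helper can separate that reasoning from the final answer. The sketch below is an illustration, not part of the original notebook: `split_reasoning` is a hypothetical name, and the closing tag `</think>` is an assumption, since the recorded outputs only show the opening tag.

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split a response into (reasoning, answer).

    If the closing tag is missing (for example because the response was cut
    off), everything after the opening tag is treated as reasoning and the
    answer is returned as an empty string.
    """
    start = text.find(open_tag)
    if start == -1:
        # No reasoning block at all: the whole text is the answer.
        return "", text.strip()
    body_start = start + len(open_tag)
    end = text.find(close_tag, body_start)
    if end == -1:
        return text[body_start:].strip(), ""
    reasoning = text[body_start:end].strip()
    answer = (text[:start] + text[end + len(close_tag):]).strip()
    return reasoning, answer


reasoning, answer = split_reasoning("<think>count the letters</think>Three.")
print("Reasoning:", reasoning)
print("Answer:", answer)
```

An empty answer is a useful signal that the response ended before the reasoning block closed, in which case a larger `max_tokens` value may be worth trying.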

## Example 1: How many strawberries for 9 r's?

Let's run our first inference example, which asks about strawberries and the letter r.


In [6]:
example1_messages = [
    {"role": "system", "content": "You are a helpful AI assistant that answers questions accurately and concisely."},
    {"role": "user", "content": "How many strawberries do I need to collect 9 r's?"}
]

print("Messages:")
for msg in example1_messages:
    print(f"{msg['role']}: {msg['content']}")
print("\nGenerating response...\n")

# Use the new chat completion function directly with the messages
response1 = generate_chat_completion(example1_messages)
print("Response:")
print(response1)

Messages:
system: You are a helpful AI assistant that answers questions accurately and concisely.
user: How many strawberries do I need to collect 9 r's?

Generating response...

Calling API at: https://models.github.ai/inference/v1/chat/completions
Response:
<think>User says: "How many strawberries do I need to collect 9 r's?" This appears to be some riddle. Possibly the phrase "9 r's" might be a pun or reference to "r" letters. For example, "How many strawberries do I need to collect 9 r's?" It might be a pun on the phrase "strawberry, nine, r's" or "I need 9 r's" might be a riddle. Alternatively, perhaps the question "How many strawberries do I need to collect 9 r's?" might be a riddle where the answer is something like "9 strawberries" are needed if "strawberry" contains one "r" or two? Let's think: "Strawberry" letter count: S T R A W B E R R Y. Counting letter "r": "strawberry" has two "r"s, one in "straw" and one in "berry" but wait, let's check: "strawberry" letters: S, T, R, A

### Analysis of example 1

In this example, the model must recognize that the word "strawberry" contains three "r" letters (s-t-r-a-w-b-e-r-r-y). Collecting 9 "r"s therefore takes exactly 3 strawberries.

The response above shows how the Phi-4-reasoning model begins working through this problem.
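The letter count is easy to verify directly. This short check is an addition for illustration; it counts the occurrences of "r" in the word and rounds up to whole strawberries.

```python
import math

word = "strawberry"
target = 9

r_per_word = word.count("r")             # occurrences of "r" in one strawberry
needed = math.ceil(target / r_per_word)  # whole strawberries required

print(f"{word!r} contains {r_per_word} r's; "
      f"{needed} strawberries give {needed * r_per_word} r's")
```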


## Example 2: Solving a riddle

Now let's try a more complex example: a pattern-recognition riddle with several examples.


In [7]:
example2_messages = [
    {"role": "system", "content": "You are a helpful AI assistant that solves riddles and finds patterns in sequences."},
    {"role": "user", "content": "I will give you a riddle to solve with a few examples, and something to complete at the end"},
    {"role": "user", "content": "nuno Δημήτρης evif Issis 4"},
    {"role": "user", "content": "ntres Inez neves Margot 4"},
    {"role": "user", "content": "ndrei Jordan evlewt Μαρία 9"},
    {"role": "user", "content": "nπέντε Kang-Yuk xis-ytnewt Nubia 21"},
    {"role": "user", "content": "nπέντε Κώστας eerht-ytnewt Μανώλης 18"}, 
    {"role": "user", "content": "nminus one-point-two Satya eno Bill X."},
    {"role": "user", "content": "What is a likely completion for X that is consistent with examples above?"}
]

print("Messages:")
for msg in example2_messages:
    print(f"{msg['role']}: {msg['content'][:50]}...")
print("\nGenerating response...\n")

response2 = generate_chat_completion(example2_messages, temperature=0.2, max_tokens=10000)
print("Response:")
print(response2)

Messages:
system: You are a helpful AI assistant that solves riddles...
user: I will give you a riddle to solve with a few examp...
user: nuno Δημήτρης evif Issis 4...
user: ntres Inez neves Margot 4...
user: ndrei Jordan evlewt Μαρία 9...
user: nπέντε Kang-Yuk xis-ytnewt Nubia 21...
user: nπέντε Κώστας eerht-ytnewt Μανώλης 18...
user: nminus one-point-two Satya eno Bill X....
user: What is a likely completion for X that is consiste...

Generating response...

Calling API at: https://models.github.ai/inference/v1/chat/completions
Response:
<think>We are given a riddle with examples. The riddle is: "I will give you a riddle to solve with a few examples, and something to complete at the end". The examples are:

1. "nuno Δημήτρης evif Issis 4"
2. "ntres Inez neves Margot 4"
3. "ndrei Jordan evlewt Μαρία 9"
4. "nπέντε Kang-Yuk xis-ytnewt Nubia 21"
5. "nπέντε Κώστας eerht-ytnewt Μανώλης 18"
6. "nminus one-point-two Satya eno Bill X."

We are asked: "What is a likely completion for X that is

### Analysis of example 2

This riddle requires recognizing complex patterns across several languages and number representations. Let's break down what the model has to understand:

1. Identify reversed spelling in words such as "evif" (five)
2. Recognize numbers in different languages (e.g. "uno" in Spanish, "πέντε" in Greek)
3. Find the relationship between those numbers and the final number

The pattern appears to involve mathematical operations between values represented in different languages and formats.
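One hypothesis that can be tested offline is that the final number equals the reversed English number minus the leading number. The sketch below checks that hypothesis against the five complete examples. It rests on several assumptions that are not stated in the notebook: the leading "n" on each prompt line is treated as a stray character, the personal names are ignored, and the lookup tables are hand-written for just the words that appear in the riddle.

```python
# Hand-written lookup tables for this sketch; only the words used in the riddle.
LEADING_NUMBERS = {"uno": 1, "tres": 3, "drei": 3, "πέντε": 5}
ENGLISH_NUMBERS = {
    "one": 1, "five": 5, "seven": 7, "twelve": 12,
    "twenty-three": 23, "twenty-six": 26,
}

# (leading number word, reversed English word, expected final number)
EXAMPLES = [
    ("uno", "evif", 4),
    ("tres", "neves", 4),
    ("drei", "evlewt", 9),
    ("πέντε", "xis-ytnewt", 21),
    ("πέντε", "eerht-ytnewt", 18),
]


def solve(leading_value, reversed_word):
    """Reverse the English word, look up its value, subtract the leading value."""
    return ENGLISH_NUMBERS[reversed_word[::-1]] - leading_value


for leading_word, reversed_word, expected in EXAMPLES:
    result = solve(LEADING_NUMBERS[leading_word], reversed_word)
    print(leading_word, reversed_word[::-1], result, result == expected)

# Last line of the riddle: "minus one-point-two ... eno ... X"
candidate_x = round(solve(-1.2, "eno"), 1)
print("Candidate X:", candidate_x)
```

Under this hypothesis all five examples are reproduced, which makes the resulting candidate for X a reasonable value to compare against the model's own answer.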


## Experimenting with different parameters

Let's run the second example again with a different temperature setting to see how it affects the model's response.


In [None]:
# Try with a higher temperature for more creative responses
response_creative = generate_chat_completion(example2_messages, temperature=0.9, max_tokens=20000)
print("Response with higher temperature (0.9):")
print(response_creative)
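To compare several temperature values side by side, the calls can be wrapped in a small loop. The following sketch is an illustration only: `compare_temperatures` and `fake_generate` are hypothetical names, and the offline stand-in is there so the example runs without a token or network access.

```python
def compare_temperatures(messages, temperatures, generate):
    """Call `generate` once per temperature and collect the responses."""
    results = {}
    for temperature in temperatures:
        results[temperature] = generate(messages, temperature=temperature)
    return results


def fake_generate(messages, temperature=0.7):
    """Offline stand-in for an API call; returns a label instead of model text."""
    return f"(response at temperature {temperature})"


demo_messages = [{"role": "user", "content": "Your custom prompt here"}]
responses = compare_temperatures(demo_messages, [0.2, 0.9], fake_generate)
for temp, text in responses.items():
    print(f"--- temperature={temp} ---")
    print(text)
```

In the notebook, `generate_chat_completion` could be passed in place of `fake_generate`, since it accepts `temperature` as a keyword argument.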

## Create your own example

You can create your own examples to test the model's reasoning abilities. Try modifying the prompts or writing entirely new scenarios below.


In [None]:
# Define your custom prompt here
custom_messages = [
    {"role": "system", "content": "You are a helpful AI assistant that can solve complex problems."},
    {"role": "user", "content": "Your custom prompt here"}
]

# Uncomment the lines below to run your custom prompt
# custom_response = generate_chat_completion(custom_messages)
# print("Response to custom prompt:")
# print(custom_response)

## Conclusion

This notebook showed how to use models from the GitHub Marketplace for inference, specifically the Phi-4-reasoning model for solving logic problems and riddles.

Key points:
1. Setting up authentication with GitHub tokens
2. Formatting prompts for inference
3. Managing model parameters such as temperature to control the variability of responses
4. Testing the model's logical reasoning abilities on different kinds of problems

Remember to keep your GitHub token secure and never share it in public repositories or notebooks.
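One practical habit that follows from this is to avoid printing the raw token in notebook output, since saved outputs travel with the notebook. The helper below is a minimal sketch; `mask_secret` is a hypothetical name and the sample value is made up for the illustration.

```python
def mask_secret(value, visible=4):
    """Return the value with all but the last `visible` characters replaced by '*'."""
    if not value:
        return "(not set)"
    if len(value) <= visible:
        # Too short to reveal anything safely: mask every character.
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Made-up sample value, not a real token.
print(mask_secret("example_token_1234"))
```

Keeping `local.env` out of version control (for example through an entry in `.gitignore`) complements this.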



---

**Disclaimer**:  
This document was translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, please be aware that automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We are not liable for any misunderstandings or misinterpretations arising from the use of this translation.
