# **Week 1: LLM APIs 101**
---
📝 ***Exercises***

* Call the OpenAI API and print the response.
* Call the Claude API and compare the input/output formats.
* Call the Gemini API with text input (if you have a key).

# **Basic knowledge about LLM API**
---
**LLM API**: a programmatic interface to AI systems that can process, understand, and generate human language.
> These APIs connect an LLM's underlying models to applications, so language-processing features can be integrated directly into software.

* **Authentication**: providers normally require an **API key** to verify each request.
* **Endpoint**: each provider (OpenAI, Anthropic, ...) exposes several endpoints (chat completions, embeddings, images, audio, ...).
* **Request body (parameters)**: a JSON object containing the input.
* **Response**: returned as JSON.
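
The four pieces above can be seen together in one HTTP request. The sketch below only builds the request object with the standard library and never sends it. It assumes OpenAI's chat completions endpoint and Bearer-token header; `build_chat_request` and the placeholder key are hypothetical names used for illustration.

```python
import json
import urllib.request


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Assemble (but do not send) a chat completion request."""
    # Request body (parameters): a JSON object describing the input.
    body = {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        # Endpoint: one of several URLs the provider exposes.
        url="https://api.openai.com/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Authentication: the API key travels in a header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("sk-placeholder", "Say hello")
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Sending the request (for example with `urllib.request.urlopen(req)`) would return the response as JSON. The SDKs used in the rest of this notebook wrap these same steps.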

# **Parameters**
---
* **temperature** (0 → 1+): controls how creative (random) the model's answers are.
    * 0 → very repetitive/straightforward, 0.7 → creative, 1.2 → very free-form.
    * Example: temperature = 0 for a standard, consistent answer; temperature = 0.8 for creative writing such as poetry.

* **top_p** (0-1): nucleus sampling, which controls randomness. Instead of considering every word, the model only considers the smallest set of most likely words whose cumulative probability reaches top_p.
    * top_p=0.1: only the highest-probability words that together make up 10% are considered. The output is very focused and has little randomness.
    * top_p=0.9: the model considers a much wider set of words.
* **max_tokens/max_output_tokens**: limits the length of the response. Used to control cost and keep the response concise and on point.
* **n**: the number of responses to generate.
* **stream=True**: streaming response. The server sends the answer chunk by chunk as soon as the model generates it (similar to watching someone type live on screen).
    * With stream=True, the API returns data as an event stream (each line is a JSON chunk).
    * Each chunk contains a small piece of the answer (delta, content).
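
To make temperature and top_p more concrete, here is a minimal sketch of how the two could be applied to a toy distribution. The four-word vocabulary and the logits are made up, and real providers implement sampling internally with their own details, so treat this as an illustration rather than any provider's actual algorithm. Temperature must be greater than 0 in this sketch; APIs typically treat 0 as "always pick the most likely token".

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalise so the kept probabilities sum to 1 again.
    return {i: probs[i] / total for i in kept}


vocab = ["cat", "dog", "bird", "fish"]  # hypothetical vocabulary
logits = [2.0, 1.0, 0.5, 0.1]           # hypothetical model scores

cold = softmax_with_temperature(logits, 0.2)
warm = softmax_with_temperature(logits, 1.2)
print("T=0.2:", [round(p, 3) for p in cold])
print("T=1.2:", [round(p, 3) for p in warm])

base = softmax_with_temperature(logits, 1.0)
narrow = top_p_filter(base, 0.1)
wide = top_p_filter(base, 0.9)
print("top_p=0.1 keeps:", [vocab[i] for i in narrow])
print("top_p=0.9 keeps:", [vocab[i] for i in wide])

# Sample one token from the filtered distribution.
rng = random.Random(0)
token = rng.choices([vocab[i] for i in wide], weights=list(wide.values()))[0]
print("sampled:", token)
```

At the low temperature almost all probability lands on the top word, while the higher temperature spreads it out. Likewise, a small top_p leaves a single candidate and a large top_p leaves several.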

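The streaming behaviour described above can be sketched without a network call. The lines below are a simulated event stream in the style of OpenAI's chat completions streaming (`data: {...}` lines ending with `data: [DONE]`); the payloads are abbreviated and invented for illustration, and `iter_deltas` is a hypothetical helper.

```python
import json

# Simulated event-stream lines; a real stream arrives over HTTP one line at a time.
simulated_stream = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", "}}]}',
    'data: {"choices": [{"delta": {"content": "world!"}}]}',
    "data: [DONE]",
]


def iter_deltas(lines):
    """Yield the text fragment carried by each chunk."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank or non-data lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        text = chunk["choices"][0]["delta"].get("content")
        if text:
            yield text


parts = []
for piece in iter_deltas(simulated_stream):
    print(piece, end="", flush=True)  # show text as it "arrives"
    parts.append(piece)
print()

full_text = "".join(parts)
```

Joining the fragments gives the same text a non-streaming call would return in one piece. The official SDKs do this parsing for you and expose each chunk as an object.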

# **OpenAI**
---

In [None]:
# installation
! pip install openai

In [None]:
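import os

from openai import OpenAI

# Read the key from the environment instead of hardcoding it in the notebook.
client = OpenAI(
  api_key=os.environ.get('OPENAI_API_KEY'),
)

response = client.responses.create(
  model='gpt-4o-mini',
  input='write a haiku about ai',
  store=True,  # keep the response server-side so it can be retrieved later
)

# output_text joins all text parts of the response into one string
print(response.output_text)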
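
As a rough sketch of how the parameters from the previous section map onto this call, the block below passes `temperature` and `max_output_tokens` to the Responses API. `build_request_kwargs` is a hypothetical helper, and the values are arbitrary. The API is only called when `OPENAI_API_KEY` is set, so the cell still runs without a key.

```python
import os


def build_request_kwargs(prompt, temperature=0.2, max_output_tokens=100):
    """Collect the request parameters in one place (hypothetical helper)."""
    return {
        "model": "gpt-4o-mini",
        "input": prompt,
        "temperature": temperature,              # lower = more deterministic
        "max_output_tokens": max_output_tokens,  # cap on response length
    }


kwargs = build_request_kwargs("write a haiku about ai")

if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.responses.create(**kwargs)
    print(response.output_text)
else:
    print("OPENAI_API_KEY not set; the request would be:", kwargs)
```

Re-running with a higher temperature (for example 0.8) is a quick way to see how much the haiku varies between calls.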

# **Claude - Anthropic**
---

In [7]:
# installation
! pip install anthropic

Defaulting to user installation because normal site-packages is not writeable
Collecting anthropic
  Downloading anthropic-0.67.0-py3-none-any.whl.metadata (27 kB)
Downloading anthropic-0.67.0-py3-none-any.whl (317 kB)
Installing collected packages: anthropic
Successfully installed anthropic-0.67.0


In [None]:
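import os

import anthropic

client = anthropic.Anthropic(
    # the SDK also reads ANTHROPIC_API_KEY by default when api_key is omitted
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
)
message = client.messages.create(
    model="claude-opus-4-1-20250805",
    max_tokens=1024,  # required by the Messages API
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
# message.content is a list of content blocks; print the text of the first one
print(message.content[0].text)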
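
For the exercise on comparing input/output formats, the sketch below puts the two request bodies side by side and extracts the text from abbreviated, hand-written sample responses. The sample responses are illustrative, not captured API output. They keep only the fields needed here, and all helper names are hypothetical.

```python
def openai_responses_body(prompt, system, max_tokens):
    """Request body in the shape of OpenAI's Responses API."""
    return {
        "model": "gpt-4o-mini",
        "instructions": system,        # system prompt
        "input": prompt,               # plain string input
        "max_output_tokens": max_tokens,
    }


def anthropic_messages_body(prompt, system, max_tokens):
    """Request body in the shape of Anthropic's Messages API."""
    return {
        "model": "claude-opus-4-1-20250805",
        "system": system,              # system prompt is a top-level field
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,      # required
    }


def openai_text(response_json):
    """Join the output_text parts of every message item."""
    return "".join(
        part["text"]
        for item in response_json["output"]
        if item["type"] == "message"
        for part in item["content"]
        if part["type"] == "output_text"
    )


def anthropic_text(response_json):
    """Join the text of every text content block."""
    return "".join(
        block["text"] for block in response_json["content"] if block["type"] == "text"
    )


# Abbreviated, illustrative response shapes (not real API output).
openai_sample = {
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Hi there!"}],
        }
    ]
}
anthropic_sample = {
    "role": "assistant",
    "content": [{"type": "text", "text": "Hi there!"}],
    "stop_reason": "end_turn",
}

a = openai_responses_body("Hello", "Be brief.", 256)
b = anthropic_messages_body("Hello", "Be brief.", 256)
print("OpenAI-only keys:   ", sorted(set(a) - set(b)))
print("Anthropic-only keys:", sorted(set(b) - set(a)))
print(openai_text(openai_sample), "|", anthropic_text(anthropic_sample))
```

Both responses wrap the text in a list of typed parts, which is why the SDK examples above use `response.output_text` for OpenAI and index into `message.content` for Claude.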

# **Gemini**
---
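Set the **GEMINI_API_KEY** environment variable before creating the client.
* **SECURE**: define GEMINI_API_KEY under "Environment Variables"; genai.Client() then picks it up automatically.
* **PUBLIC**: pass the key directly with client = genai.Client(api_key="YOUR_API_KEY"). The key is visible to anyone who can read the notebook, so avoid this in shared code.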

In [1]:
# installation
! pip install -q -U google-genai

In [1]:
from google import genai

In [None]:
# basic text generation

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="how can i configure parameters in gemini model",
)

print(response.text)

Configuring parameters in the Gemini model (or any large language model) allows you to control its behavior, output style, and safety settings. The exact method depends on how you're interacting with the Gemini API – whether through Google AI Studio, the client SDKs (Python, Node.js, etc.), or the REST API.

Here's a breakdown of common parameters and how to configure them:

## Common Parameters You Can Configure

Most LLMs, including Gemini, offer similar core parameters:

1.  **`temperature`**:
    *   **Range:** Typically 0.0 to 1.0 (though some interfaces might allow slightly higher).
    *   **Effect:** Controls the randomness of the output.
        *   **Lower values (e.g., 0.0 - 0.3):** Make the model more deterministic and focused, useful for tasks requiring factual, precise, or consistent answers.
        *   **Higher values (e.g., 0.7 - 1.0):** Make the model more creative, diverse, and prone to generating unexpected or surprising outputs, good for brainstorming or creative w
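
To apply the parameters that the response above describes, a minimal sketch with the google-genai SDK could look like the following. It passes `temperature`, `top_p`, and `max_output_tokens` through `types.GenerateContentConfig`. The values are arbitrary, and the API is only called when `GEMINI_API_KEY` is set.

```python
import os

# Arbitrary example values; tune them for your task.
generation_settings = {
    "temperature": 0.2,         # low randomness for factual answers
    "top_p": 0.9,               # nucleus sampling cutoff
    "max_output_tokens": 1024,  # cap on response length
}

if os.environ.get("GEMINI_API_KEY"):
    from google import genai
    from google.genai import types

    client = genai.Client()  # reads GEMINI_API_KEY from the environment
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents="Explain temperature in one sentence.",
        config=types.GenerateContentConfig(**generation_settings),
    )
    print(response.text)
else:
    print("GEMINI_API_KEY not set; the config would be:", generation_settings)
```

The same `config` argument also carries `system_instruction`, so sampling settings and a role can be set in one place.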

In [None]:
# set a role with a system instruction
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="what is lstm",
    config=types.GenerateContentConfig(
        system_instruction="You are a professional Data Scientist"  # give it a role
    )
)

print(response.text)

As a professional Data Scientist, when I explain **LSTM (Long Short-Term Memory)**, I typically start by situating it within the broader context of neural networks designed for sequential data.

Here's a breakdown:

---

### What is LSTM?

**LSTM (Long Short-Term Memory)** is a specialized type of **Recurrent Neural Network (RNN)** architecture specifically designed to overcome the limitations of traditional RNNs in learning long-term dependencies. In simpler terms, it's a neural network that is very good at remembering things for a long time, which is crucial when dealing with sequences of data like text, speech, or time series.

---

### The Problem LSTMs Solve: Long-Term Dependencies & Vanishing Gradients

Traditional RNNs suffer from two main issues when processing long sequences:

1.  **Vanishing Gradient Problem:** During backpropagation (the process of updating network weights), gradients (signals that indicate how much to change weights) can shrink exponentially as they propaga

In [8]:
# streaming response

client = genai.Client()

# Print chunks without a trailing newline so the text is not broken at chunk boundaries.
for chunk in client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="what is mediapipe use for in computer vision?"
):
    print(chunk.text or "", end="", flush=True)

MediaPipe is an open-source framework developed by Google that provides **highly optimized, pre-built, and customizable machine learning (ML) pipelines for various computer vision (and other AI) tasks**. Its primary goal is to make it easier for developers to integrate real-time, on-device ML solutions into their applications across different platforms (Android, iOS, web, desktop).

In computer vision, MediaPipe is used for a wide range of applications, primarily focusing on **understanding human actions and features**, but also extending to other object-related tasks.

Here's a breakdown of its key uses in computer vision:

1.  **Face Detection and Tracking:**
    *   **Use:** Identifying the presence and location of faces in images or video streams.
    *   **Applications:** Augmented reality (AR) filters, virtual try-ons, video conferencing (e.g., placing effects on faces), audience engagement analysis.

2.  **Facial Landmark Detection (Face Mesh):**
    *   **Use:** Pinpointing