## Building Blocks of LLM Engineering

Three dimensions (building blocks) of LLM engineering:

    1. Models: open-source and closed-source, multimodal, architecture, and model selection
    2. Tools: Hugging Face, LangChain, Gradio, Weights & Biases, Modal
    3. Techniques: APIs, multi-shot prompting, RAG, fine-tuning, and agentization
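
Of the techniques listed, multi-shot prompting is the quickest to show in code: worked examples go into the conversation ahead of the real question so the model can imitate them. Here is a minimal sketch, assuming the chat-style `messages` format used in the API calls later in this notebook. `build_multishot_messages` and the sample sentiment data are hypothetical, made up for illustration.

```python
def build_multishot_messages(examples, question, system_prompt=None):
    """Build a chat `messages` list with worked examples before the real question.

    `examples` is a list of (user_text, assistant_text) pairs.
    Hypothetical helper, written only for illustration.
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    for user_text, assistant_text in examples:
        # Each example is a staged earlier turn: an input and the ideal reply
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question always goes last
    messages.append({"role": "user", "content": question})
    return messages


# Made-up examples for a one-word sentiment task
examples = [
    ("Classify the sentiment: 'I love this library'", "positive"),
    ("Classify the sentiment: 'The docs are confusing'", "negative"),
]
messages = build_multishot_messages(
    examples,
    "Classify the sentiment: 'Setup took five minutes'",
    system_prompt="Answer with one word: positive or negative.",
)
```

The resulting list can be passed as the `messages` value in any of the chat requests in this notebook.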

# Closed-Source Models

    • GPT from OpenAI

    • Claude from Anthropic

    • Gemini from Google

    • Grok from xAI

# Open-Source Models
 
    • Llama from Meta
 
    • Mixtral from Mistral
 
    • Qwen from Alibaba Cloud
 
    • Gemma from Google
 
    • Phi from Microsoft
 
    • DeepSeek from DeepSeek AI
 
    • GPT-OSS from OpenAI

# Three ways to use models

1. Chat interface (like ChatGPT)

2. Cloud APIs: LLM APIs and frameworks like LangChain.
        Managed AI cloud services: Amazon Bedrock, Google Vertex AI, Azure ML

3. Direct inference: with the Hugging Face Transformers library, or with Ollama to run models locally

In [None]:
# Load environment variables from .env and sanity-check the OpenAI key
import os
from dotenv import load_dotenv

load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

if not api_key:
    print("No API key was found - please head over to the troubleshooting notebook in this folder to identify & fix!")
elif not api_key.startswith("sk-proj-"):
    print("An API key was found, but it doesn't start sk-proj-; please check you're using the right key - see troubleshooting notebook")
else:
    print("API key found and looks good so far!")
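
The same check is repeated for the Google key further down, so it can be folded into one small helper. This is a sketch only: `check_api_key` and its `environ` parameter are hypothetical names, and the prefixes are the ones this notebook already tests for (`sk-proj-` and `AIz`).

```python
import os


def check_api_key(env_var, expected_prefix, environ=None):
    """Return a short status message for the key stored in `env_var`.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to try out without touching real keys.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var)
    if not key:
        return f"No API key was found in {env_var}"
    if not key.startswith(expected_prefix):
        return f"{env_var} is set, but it doesn't start with {expected_prefix}"
    return f"{env_var} found and looks good so far!"


# Reports on whatever is currently in the environment
print(check_api_key("OPENAI_API_KEY", "sk-proj-"))
```

Calling `check_api_key("GOOGLE_API_KEY", "AIz")` covers the Gemini key in the same way.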

In [None]:
# Build the headers and JSON payload for a Chat Completions request
import requests

headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}

payload = {
    "model": "gpt-5-nano",
    "messages": [
        {"role": "user", "content": "Tell me a joke about programming."}
    ]
}

payload

In [None]:
# Send the request to the Chat Completions endpoint
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload
)

response.json()

In [None]:
response.json()['choices'][0]['message']['content']
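
The direct indexing above fails with a `KeyError` if the API sends back an error object instead of a completion (for example, when the key is invalid). A slightly more defensive version might look like the sketch below. `extract_reply` is a hypothetical helper, and the `sample` dict is hand-written to mirror the response shape used above, not real API output.

```python
def extract_reply(data):
    """Pull the assistant's text out of a parsed Chat Completions response.

    Raises a readable error instead of a bare KeyError when the
    response does not have the expected shape.
    """
    if "error" in data:
        # Assumed error shape: {"error": {"message": "..."}}
        message = data["error"].get("message", "unknown error")
        raise RuntimeError(f"API returned an error: {message}")
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("Response contained no choices")
    return choices[0]["message"]["content"]


# Hand-written sample, not real model output
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "(model reply text)"}}
    ]
}
print(extract_reply(sample))
```

With the request from the previous cells, the call would be `extract_reply(response.json())`.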

# OpenAI Package

• Python client library

• A wrapper around the same HTTP endpoint call made above

• Lets you work with clean Python objects instead of raw JSON

• Open source and lightweight
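
To make "wrapper" concrete, here is a rough sketch of the bookkeeping a client library takes off your hands: assembling the URL, headers, and JSON body. It is not how the `openai` package is actually implemented. `build_chat_request` is a hypothetical helper, and `sk-proj-example` is a placeholder, not a real key.

```python
def build_chat_request(api_key, model, messages, base_url="https://api.openai.com/v1"):
    """Assemble the pieces of a Chat Completions HTTP request.

    Returns (url, headers, payload); nothing is sent over the network.
    """
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {"model": model, "messages": messages}
    return url, headers, payload


url, headers, payload = build_chat_request(
    "sk-proj-example",  # placeholder key for illustration
    "gpt-5-nano",
    [{"role": "user", "content": "Tell me a joke about programming."}],
)
print(url)
```

Because only `base_url` and the key change between providers, the same client code can talk to any service that exposes an OpenAI-compatible endpoint, which is what the Gemini and Ollama cells below rely on.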

In [None]:
# The same endpoint call, made through the OpenAI client library instead of raw HTTP
from openai import OpenAI
openai = OpenAI()

response = openai.chat.completions.create(
    model="gpt-5-nano",
    messages=[
        {"role": "user", "content": "Tell me a joke about programming."}
    ]
)

response.choices[0].message.content

In [None]:
# Working with Gemini through its OpenAI-compatible endpoint
GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"

load_dotenv(override=True)

google_api_key = os.getenv("GOOGLE_API_KEY")

if not google_api_key:
    print("No API key was found - please be sure to add your key to the .env file, and save the file! Or you can skip the next 2 cells if you don't want to use Gemini")
elif not google_api_key.startswith("AIz"):
    print("An API key was found, but it doesn't start AIz")
else:
    print("API key found and looks good so far!")

In [None]:
gemini = OpenAI(base_url=GEMINI_BASE_URL, api_key=google_api_key)

response = gemini.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {"role": "user", "content": "Tell me a joke about programming."}
    ]
)

response.choices[0].message.content

In [None]:
# Working with Ollama: check that the local server is up
requests.get("http://localhost:11434").content
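
The request above raises a connection error if the Ollama server isn't running. A gentler check can be sketched with the standard library alone. `is_ollama_running` is a hypothetical helper, and it only confirms that something answers HTTP 200 at that address, not that it is actually Ollama.

```python
import urllib.request


def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP 200 comes back from base_url, else False."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # Refused connections, timeouts and HTTP errors are OSError subclasses;
        # ValueError covers a malformed URL
        return False


print("Ollama reachable:", is_ollama_running())
```

If this reports `False`, starting the server with `ollama serve` in a terminal and re-running the cell is the usual next step.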

## Download llama3.2:1b

In [None]:
!ollama pull llama3.2:1b

In [26]:
OLLAMA_BASE_URL = "http://localhost:11434/v1"

ollama = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')

In [None]:
response = ollama.chat.completions.create(model="llama3.2:1b",
    messages=[{"role": "user", "content": "Tell me a joke about programming."}]
)
response.choices[0].message.content
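
The OpenAI, Gemini, and Ollama calls above differ only in the client object and the model name, so they can share one helper. A minimal sketch follows. `ask` is a hypothetical name, and it assumes `client` is an `OpenAI` instance (or anything exposing the same `chat.completions.create` method).

```python
def ask(client, model, prompt):
    """Send a single user message and return the reply text.

    `client` is expected to behave like an openai.OpenAI instance.
    """
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

With the clients defined in this notebook, `ask(ollama, "llama3.2:1b", "Tell me a joke about programming.")` and `ask(gemini, "gemini-2.5-pro", "Tell me a joke about programming.")` run the same prompt against two different providers.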

In [None]:
!ollama pull deepseek-r1:1.5b

In [None]:
# Working with a smaller DeepSeek model (deepseek-r1:1.5b) through Ollama
response = ollama.chat.completions.create(model="deepseek-r1:1.5b",
    messages=[{"role": "user", "content": "Tell me a joke about programming."}]
)
response.choices[0].message.content
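
Reasoning models such as deepseek-r1 may return their chain of thought inline, wrapped in `<think>...</think>` tags ahead of the final answer. Whether that happens depends on the model and the Ollama version, so treat it as an assumption. If the reply does look like that, the two parts can be separated as sketched below. `split_reasoning` is a hypothetical helper, and `sample` is a hand-written string, not real model output.

```python
import re


def split_reasoning(text):
    """Split a reply into (reasoning, answer).

    Assumes reasoning, if present, sits inside one <think>...</think> block.
    Returns an empty reasoning string when no such block is found.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Keep everything outside the <think> block as the answer
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Hand-written sample for illustration
sample = "<think>\nThe user wants a short joke.\n</think>\n\nHere is a joke."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Applied to the cell above, that would be `split_reasoning(response.choices[0].message.content)`.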