# LLM Concepts: Companion Notebook

This notebook lets you explore key LLM concepts hands-on using Amazon Bedrock and Anthropic Claude models.

**Note:** You need access to Amazon Bedrock, with model access granted for the Anthropic Claude models and IAM permissions to invoke them. **Do not share your AWS credentials.**


In [None]:
# --- AWS Setup ---
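# Credentials are read from the standard AWS credential chain (environment
# variables, ~/.aws/credentials, or an IAM role). Never hard-code real keys
# in a notebook you might share.
import os
import boto3
from botocore.exceptions import NoCredentialsError, ClientError

# Option 1 (recommended): export AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and
# AWS_DEFAULT_REGION in your shell, or run `aws configure`, before starting Jupyter.
# Option 2 (local testing only): uncomment the lines below and fill in your values.
# os.environ['AWS_ACCESS_KEY_ID'] = 'YOUR_ACCESS_KEY'
# os.environ['AWS_SECRET_ACCESS_KEY'] = 'YOUR_SECRET_KEY'
# os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'  # or your region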


region = os.environ.get('AWS_DEFAULT_REGION', 'us-east-1')

# Check Bedrock access.
# 'bedrock' is the control-plane client that lists models;
# 'bedrock-runtime' is the client used to invoke them.
bedrock_control = boto3.client('bedrock', region_name=region)
bedrock = boto3.client('bedrock-runtime', region_name=region)
bedrock_available = True
claude_models = []  # stays empty if the listing call fails
try:
    # Try listing available models
    resp = bedrock_control.list_foundation_models()
    claude_models = [m for m in resp.get('modelSummaries', []) if 'anthropic' in m['modelId'].lower()]
    if not claude_models:
        print('Claude models not found in your Bedrock account.')
        bedrock_available = False
except (NoCredentialsError, ClientError) as e:
    print('Bedrock/Claude not available or credentials missing:', e)
    bedrock_available = False

if not bedrock_available:
    print('Some cells will be skipped. Please check your AWS setup.')


## 1. Introduction to LLMs (Claude via Bedrock)
A Large Language Model (LLM) is a neural network trained on large amounts of text to predict the next token, which lets it generate and interpret human language. Let's try a simple prompt with Claude!


In [None]:
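import json

if bedrock_available:
    prompt = 'Explain what a Large Language Model is in one sentence.'
    model_id = claude_models[0]['modelId'] if claude_models else 'anthropic.claude-3-sonnet-20240229-v1:0'
    # Claude 3 models on Bedrock require the Messages API request format;
    # the legacy "prompt"/"max_tokens_to_sample" body is rejected.
    # json.dumps also escapes quotes and newlines in the prompt safely.
    body = json.dumps({
        'anthropic_version': 'bedrock-2023-05-31',
        'max_tokens': 100,
        'messages': [{'role': 'user', 'content': prompt}],
    })
    response = bedrock.invoke_model(
        modelId=model_id,
        body=body,
        contentType='application/json'
    )
    result = json.loads(response['body'].read())
    # The reply text is in the first content block of the response.
    print('Claude:', result['content'][0]['text'])
else:
    print('Bedrock/Claude not available. Skipping this cell.')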


## 2. Tokenization Demo
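Tokenization splits text into tokens, the sub-word units the model actually processes. Claude's tokenizer is not exposed through Bedrock, so the cell below can only estimate how a sample sentence is tokenized.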


In [None]:
import json

sample_text = 'Machine learning is fascinating!'
if bedrock_available:
    # Bedrock does not expose Claude's tokenizer, but each response reports token usage.
    # input_tokens covers the whole request (sample text plus message framing),
    # so treat it as an upper-bound estimate for the sample text alone.
    body = json.dumps({
        'anthropic_version': 'bedrock-2023-05-31',
        'max_tokens': 1,
        'messages': [{'role': 'user', 'content': sample_text}],
    })
    response = bedrock.invoke_model(
        modelId=model_id,
        body=body,
        contentType='application/json'
    )
    result = json.loads(response['body'].read())
    usage = result.get('usage', {})
    print('Estimated input tokens:', usage.get('input_tokens', 'not reported'))
else:
    print('Bedrock/Claude not available. Skipping this cell.')
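
To make the idea of splitting text into tokens concrete, here is a minimal sketch of a toy tokenizer. It is not Claude's tokenizer, which relies on a learned sub-word vocabulary; the `toy_tokenize` function and its regular expression are assumptions made purely for illustration.

```python
import re

def toy_tokenize(text):
    """Split text into word and punctuation tokens (illustration only)."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)

toy_tokens = toy_tokenize('Machine learning is fascinating!')
print(toy_tokens)
print('Token count:', len(toy_tokens))
```

Real sub-word tokenizers often split rare words into several pieces, so their counts usually differ from this word-level approximation.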


## 3. Embeddings Visualization
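Embeddings are dense vector representations of text in which semantically similar sentences lie close together. Claude does not provide an embeddings endpoint, so the cell below only sets up the example sentences and plotting imports as a placeholder.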


In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

sentences = [
    'I love machine learning.',
    'Artificial intelligence is the future.',
    'The cat sat on the mat.',
    'Dogs are loyal pets.'
]
if bedrock_available:
    # Bedrock does not currently offer a Claude embeddings model; this is a placeholder in case one becomes available
    print('No Claude embeddings model is available. Skipping this cell.')
else:
    print('Bedrock/Claude not available. Skipping this cell.')
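
Until real embeddings are available, the notion of "closeness" between sentence vectors can still be illustrated. The sketch below compares hand-written vectors with cosine similarity. The three-dimensional vectors are invented for illustration and are not the output of any model, and `cosine_similarity` and `toy_embeddings` are hypothetical names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings", made up for this example only.
toy_embeddings = {
    'I love machine learning.': [0.9, 0.8, 0.1],
    'Artificial intelligence is the future.': [0.8, 0.9, 0.2],
    'The cat sat on the mat.': [0.1, 0.2, 0.9],
    'Dogs are loyal pets.': [0.2, 0.1, 0.8],
}

query = 'I love machine learning.'
for sentence, vector in toy_embeddings.items():
    score = cosine_similarity(toy_embeddings[query], vector)
    print(f'{score:.3f}  {sentence}')
```

With these made-up vectors, the two technology sentences score close to 1 against each other, while the sentences about animals score much lower.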


## 4. Softmax, Temperature, topP, topK Demo
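Softmax converts the model's raw scores (logits) into a probability distribution over tokens. Temperature rescales the logits before softmax, while top-k and top-p restrict sampling to the most likely tokens. Let's see how these parameters affect token selection.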
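
For reference, the temperature-scaled softmax computed in the cell below can be written as follows, where $z_i$ is the logit of token $i$ and $T > 0$ is the temperature (the symbols are notation chosen here, not taken from the notebook):

$$
p_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}
$$

With $T < 1$ the distribution sharpens around the largest logit; with $T > 1$ it flattens toward uniform.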


In [1]:
import numpy as np
import matplotlib.pyplot as plt

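def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature must be positive."""
    if temperature <= 0:
        raise ValueError('temperature must be greater than 0')
    scaled = np.array(logits, dtype=float) / temperature
    # Subtract the maximum before exponentiating for numerical stability.
    exps = np.exp(scaled - np.max(scaled))
    return exps / np.sum(exps)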

logits = [8.2, 4.6, 3.9, 3.2, -5.0]
tokens = ['Paris', 'Lyon', 'Nice', 'Marseille', 'banana']

def plot_probs(temperature):
    probs = softmax(logits, temperature)
    plt.figure(figsize=(7,3))
    plt.bar(tokens, probs, color='#3498db')
    plt.title(f'Softmax Probabilities (Temperature={temperature})')
    plt.ylabel('Probability')
    plt.ylim(0,1)
    plt.show()

plot_probs(1.0)
# Try plot_probs(0.5) or plot_probs(1.5) to see the effect
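
The cell above covers temperature only. As a rough sketch of the other two parameters, the code below applies top-k and top-p (nucleus) filtering to the same logits. It follows a common textbook formulation and is not a description of how Claude or Bedrock implement sampling. The helper names `softmax_list`, `top_k_filter` and `top_p_filter` are hypothetical, and the code uses only the standard library so it runs without NumPy.

```python
import math

def softmax_list(logits, temperature=1.0):
    """Temperature-scaled softmax over a plain list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize the rest to zero."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

logits = [8.2, 4.6, 3.9, 3.2, -5.0]
tokens = ['Paris', 'Lyon', 'Nice', 'Marseille', 'banana']
probs = softmax_list(logits)

for name, filtered in [('top_k=2', top_k_filter(probs, 2)),
                       ('top_p=0.97', top_p_filter(probs, 0.97))]:
    print(name, {t: round(q, 3) for t, q in zip(tokens, filtered)})
```

Both filters zero out the unlikely tail (here including `banana`) before a token is sampled; top-k fixes how many candidates survive, while top-p lets that number vary with how peaked the distribution is.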



---

**Requirements:**
- boto3
- numpy
- matplotlib
- scikit-learn (for PCA, if embeddings become available)

You can install them with:
```python
!pip install boto3 numpy matplotlib scikit-learn
```
