# Test Installation

In [1]:
# test_installation.py
import torch
import transformers
import datasets
from transformers import pipeline

print(f"PyTorch version: {torch.__version__}")
print(f"Transformers version: {transformers.__version__}")
print(f"Datasets version: {datasets.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"MPS (Metal Performance Shaders) available: {torch.backends.mps.is_available()}")

# Test a simple pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I love learning about AI!")
print(f"Test result: {result}")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


PyTorch version: 2.7.1
Transformers version: 4.53.2
Datasets version: 4.0.0
CUDA available: False
MPS (Metal Performance Shaders) available: True


Device set to use mps:0


Test result: [{'label': 'POSITIVE', 'score': 0.99968421459198}]
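The warning above recommends naming the model and revision explicitly. A minimal sketch of doing so, reusing the model ID and revision hash printed in that warning; the helper name `build_sentiment_pipeline` is hypothetical and exists only for illustration:

```python
# Model ID and revision copied from the warning printed by the default pipeline.
MODEL_ID = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
REVISION = "714eb0f"


def build_sentiment_pipeline():
    """Hypothetical helper: create the pipeline with a pinned model and revision."""
    from transformers import pipeline  # imported lazily so the constants are usable alone

    return pipeline("sentiment-analysis", model=MODEL_ID, revision=REVISION)


# Usage (downloads the model on first call):
# classifier = build_sentiment_pipeline()
# print(classifier("I love learning about AI!"))
```

Pinning both values means a later change to the default model should not silently alter results.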


In [2]:
import torch
import pandas as pd
import matplotlib.pyplot as plt
from transformers import (
    pipeline, 
    AutoTokenizer, 
    AutoModel, 
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments
)
from datasets import Dataset, load_dataset
import numpy as np

# Set device for Mac M1 optimization

In [3]:
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(f"Using device: {device}")

Using device: mps
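The cell above only distinguishes MPS from CPU. A small sketch of a more general fallback order (CUDA, then MPS, then CPU), written as a pure function so it can be checked without any hardware; the function name `choose_device_name` is an assumption made for this example:

```python
def choose_device_name(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, then fall back to the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# Usage with PyTorch (mirrors the checks printed by the installation test):
# import torch
# name = choose_device_name(torch.cuda.is_available(), torch.backends.mps.is_available())
# device = torch.device(name)
```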


# EXERCISE 1: PIPELINES (High-level API)
## 1. Sentiment Analysis

In [4]:
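# An integer device index selects an accelerator (on this machine it resolves
# to mps:0, as the log below shows); -1 selects the CPU.
sentiment_pipeline = pipeline("sentiment-analysis", device=0 if device.type == "mps" else -1)

texts = [
    "I love this movie!",
    "This is terrible.",
    "It's okay, not great but not bad either.",
]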

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use mps:0


In [5]:
results = sentiment_pipeline(texts)
for text, result in zip(texts, results):
    print(f"Text: '{text}' -> {result}")

Text: 'I love this movie!' -> {'label': 'POSITIVE', 'score': 0.9998775720596313}
Text: 'This is terrible.' -> {'label': 'NEGATIVE', 'score': 0.9996345043182373}
Text: 'It's okay, not great but not bad either.' -> {'label': 'POSITIVE', 'score': 0.9985427856445312}


## 2. Text Generation

In [6]:
generator = pipeline("text-generation", model="gpt2", device=0 if device.type == "mps" else -1)
    
prompt = "The future of artificial intelligence is"
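# Note: as the warning in the output shows, the pipeline's default
# max_new_tokens (256) takes precedence over max_length=50, so the text below
# runs well past 50 tokens. Pass max_new_tokens instead to cap the output.
result = generator(prompt, max_length=50, num_return_sequences=1, do_sample=True)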
print(f"Generated: {result[0]['generated_text']}")

Device set to use mps:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=50) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


Generated: The future of artificial intelligence is uncertain. At the moment, it's focused on working on deep learning algorithms -- that is, to learn what the future of AI will look like. But a lot of work still needs to be done, and this paper proposes to do just that. This is a really interesting paper. It's not about the AI problem. But it's also not about the future, and it's not even about the problem of artificial intelligence. It's about the problems with the future of intelligence.

One of the most interesting aspects of this paper is that it focuses on the problems of learning how to code, how to write, how to code and how to code. It's not about the data, it's about the data you're creating. It's about what you're going to do in the future. The problem with the data is that it's not necessarily about what you're going to do in the future, but about what's going to be done in the future. The way that people write code in the future is by writing more code. This is what I call
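
The warnings in this cell state that `max_new_tokens` overrides `max_length`, which is why the output is much longer than 50 tokens. A hedged sketch of the same call with the length cap expressed through `max_new_tokens`; the `generate` helper and the `GENERATION_KWARGS` name are hypothetical:

```python
# Keyword arguments for the text-generation call; max_length is deliberately absent.
GENERATION_KWARGS = {
    "max_new_tokens": 50,  # cap on newly generated tokens (the prompt is not counted)
    "num_return_sequences": 1,
    "do_sample": True,
}


def generate(prompt: str) -> str:
    """Hypothetical helper: generate a continuation with GPT-2 using the settings above."""
    from transformers import pipeline  # lazy import; the model is downloaded on first use

    generator = pipeline("text-generation", model="gpt2")
    return generator(prompt, **GENERATION_KWARGS)[0]["generated_text"]


# Usage:
# print(generate("The future of artificial intelligence is"))
```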

## 3. Question Answering

In [7]:
qa_pipeline = pipeline("question-answering", device=0 if device.type == "mps" else -1)
    
context = "Hugging Face is a company that develops tools for machine learning. They are known for their Transformers library."
question = "What is Hugging Face known for?"
    
answer = qa_pipeline(question=question, context=context)
print(f"Question: {question}")
print(f"Answer: {answer['answer']} (confidence: {answer['score']:.4f})")

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use mps:0


Question: What is Hugging Face known for?
Answer: Transformers library (confidence: 0.7477)


## 4. Named Entity Recognition

In [8]:
ner_pipeline = pipeline("ner", aggregation_strategy="simple", device=0 if device.type == "mps" else -1)
    
text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."
entities = ner_pipeline(text)
    
for entity in entities:
    print(f"Entity: '{entity['word']}' - Type: {entity['entity_group']} (confidence: {entity['score']:.4f})")

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Some weights of the model checkpoint at dbmdz/bert-large-cased-finetuned-conll03-english were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Device set to use mps:0


Entity: 'Apple Inc' - Type: ORG (confidence: 0.9996)
Entity: 'Steve Jobs' - Type: PER (confidence: 0.9892)
Entity: 'Cupertino' - Type: LOC (confidence: 0.9711)
Entity: 'California' - Type: LOC (confidence: 0.9989)
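
With `aggregation_strategy="simple"`, each result is a dictionary with `word`, `entity_group`, and `score` keys, as the loop above uses. A small sketch of grouping those results by type, using the values printed above as sample data; `group_entities` and its `min_score` parameter are hypothetical names for this example:

```python
from collections import defaultdict

# Sample taken from the output above, reduced to the fields used here.
entities = [
    {"word": "Apple Inc", "entity_group": "ORG", "score": 0.9996},
    {"word": "Steve Jobs", "entity_group": "PER", "score": 0.9892},
    {"word": "Cupertino", "entity_group": "LOC", "score": 0.9711},
    {"word": "California", "entity_group": "LOC", "score": 0.9989},
]


def group_entities(entities, min_score=0.0):
    """Map each entity type to the list of words found, dropping low-confidence hits."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


print(group_entities(entities))
# {'ORG': ['Apple Inc'], 'PER': ['Steve Jobs'], 'LOC': ['Cupertino', 'California']}
print(group_entities(entities, min_score=0.98))
# {'ORG': ['Apple Inc'], 'PER': ['Steve Jobs'], 'LOC': ['California']}
```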


## 5. NSFW Image Classification - Fine-Tuned Vision Transformer (ViT)

### Method 1: Pipeline (Easiest)

In [5]:
from transformers import pipeline

classifier = pipeline("image-classification", model="Falconsai/nsfw_image_detection")
result = classifier("/Users/tharindu/Downloads/1747220469461.jpeg")
print(f"Image classification result: {result}")

Device set to use mps:0


Image classification result: [{'label': 'normal', 'score': 0.9998319149017334}, {'label': 'nsfw', 'score': 0.0001680291461525485}]


### Method 2: Manual Control (More Flexible)

In [3]:
from transformers import AutoModelForImageClassification, ViTImageProcessor
from PIL import Image
import torch
import psutil

# Load the model and processor
model = AutoModelForImageClassification.from_pretrained("Falconsai/nsfw_image_detection")
processor = ViTImageProcessor.from_pretrained("Falconsai/nsfw_image_detection")

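# Note: virtual_memory().used is the RAM in use across the whole system,
# not the memory footprint of this process or of the model alone.
print(f"Model loaded. RAM used: {psutil.virtual_memory().used / (1024**2):.2f} MB")

# Load your image (replace the path below with the path to your own image)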
image = Image.open("/Users/tharindu/Downloads/1747220469461.jpeg").convert("RGB")

# Preprocess the image
inputs = processor(images=image, return_tensors="pt")

# Perform inference
with torch.no_grad():
    outputs = model(**inputs)

# Get predicted class
logits = outputs.logits
predicted_class_idx = logits.argmax(-1).item()
labels = model.config.id2label
print(f"Predicted class: {labels[predicted_class_idx]}")

Model loaded. RAM used: 8175.69 MB
Predicted class: normal
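
Method 2 prints only the top class, while the pipeline in Method 1 also reports a score per label. Those scores come from a softmax over the logits. A minimal pure-Python sketch of that step follows; the logit values are hypothetical, and with the tensors above `torch.softmax(outputs.logits, dim=-1)` computes the same quantity:

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1 (numerically stable form)."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the two classes; real values would come from
# outputs.logits[0].tolist() in the cell above.
example_logits = [4.0, -4.0]
probs = softmax(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index into model.config.id2label
print(best, [round(p, 4) for p in probs])
```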


# ONNX (Open Neural Network Exchange)

**ONNX (Open Neural Network Exchange)** is a universal format for AI models, enabling interoperability across different frameworks.

## Why ONNX?

- **Framework Agnostic:**  
    PyTorch models speak "PyTorch language", TensorFlow models speak "TensorFlow language"—ONNX acts as a "Google Translate" for AI models.

## Key Benefits

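- ✅ **Universal format:** Works across frameworks and platforms
- ✅ **Smaller file sizes:** A float32 export stores the same weights, so it is about the size of the original checkpoint; the 2–4x reductions come from converting to float16 or quantizing to int8
- ✅ **Faster inference:** Often faster with ONNX Runtime, especially on CPU (a 2–5x speedup is possible, but the gain depends on the model, hardware, and optimization level)
- ✅ **No framework dependency:** Run models with ONNX Runtime alone, without installing PyTorch or TensorFlow
- ✅ **Production-ready:** Easier deployment in production systems
- ✅ **Optimized:** Built-in graph optimizations
- ✅ **Hardware support:** Better compatibility with specialized hardware
- ✅ **Lower memory usage:** Can use less RAM during inference, since the runtime is lighter than a full training framework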

---

### Summary Table

| Feature          | Description                                                                      |
|------------------|----------------------------------------------------------------------------------|
| **Speed**        | Often faster inference, especially on CPU (up to 2–5x, workload-dependent)       |
| **Size**         | Similar at float32; 50–75% smaller after float16 conversion or int8 quantization |
| **Portability**  | Run on any platform supported by ONNX Runtime, without PyTorch                   |
| **Memory**       | Can use less RAM during inference                                                |
| **Deployment**   | Easier to deploy in production systems                                           |
| **Optimization** | Built-in graph optimizations                                                     |
| **Hardware**     | Better support for specialized hardware                                          |

## Model Formats: SafeTensors vs ONNX

### **SafeTensors** (Current Format)
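- **Type:** Hugging Face's safe tensor serialization format (a replacement for pickle-based checkpoints)
- **Contents:** Model weights only; the architecture comes from `config.json` and the framework's model code
- **Requirements:** Needs a framework such as PyTorch to build and run the model
- **Size:** ~330MB for NSFW model
- **Best for:** Development & research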

---

### **ONNX** (Open Neural Network Exchange)
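- **Type:** Universal format for any framework
- **Contents:** Model weights + computation graph, which the runtime can optimize
- **Requirements:** Runs with just ONNX Runtime
- **Size:** About the same as SafeTensors at float32; ~150–200MB for the same model after float16 conversion
- **Best for:** Production deployment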
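
The sizes quoted for the two formats follow mostly from parameter count times bytes per parameter. A back-of-the-envelope sketch, assuming roughly 86 million parameters (a typical figure for a ViT-Base model, not a value measured from this checkpoint); `estimated_size_mb` is a hypothetical helper:

```python
def estimated_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Approximate weight storage in MB (1 MB = 10**6 bytes), ignoring file metadata."""
    return num_params * bytes_per_param / 1e6


# Assumption: roughly 86 million parameters, a typical figure for ViT-Base.
NUM_PARAMS = 86_000_000

for name, width in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{name}: ~{estimated_size_mb(NUM_PARAMS, width):.0f} MB")
# float32: ~344 MB
# float16: ~172 MB
# int8: ~86 MB
```

Under this assumption the file format itself changes the size very little; the numeric precision of the stored weights is what matters.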

## Converting to ONNX

In [1]:
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained("Falconsai/nsfw_image_detection")
model.eval()

ViTForImageClassification(
  (vit): ViTModel(
    (embeddings): ViTEmbeddings(
      (patch_embeddings): ViTPatchEmbeddings(
        (projection): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16))
      )
      (dropout): Dropout(p=0.0, inplace=False)
    )
    (encoder): ViTEncoder(
      (layer): ModuleList(
        (0-11): 12 x ViTLayer(
          (attention): ViTAttention(
            (attention): ViTSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
            )
            (output): ViTSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.0, inplace=False)
            )
          )
          (intermediate): ViTIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermed

In [2]:
import torch

dummy_input = torch.randn(1, 3, 224, 224)  # batch_size=1, 3 channels, 224x224 image

In [3]:
torch.onnx.export(
    model,
    dummy_input,
    "falconsai_nsfw.onnx",          # output filename
    input_names=["pixel_values"],
    output_names=["logits"],
    dynamic_axes={"pixel_values": {0: "batch_size"}, "logits": {0: "batch_size"}},
    opset_version=14
)

  if num_channels != self.num_channels:
  if height != self.image_size[0] or width != self.image_size[1]:
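
Once the export has written `falconsai_nsfw.onnx`, the graph can be run without PyTorch. A hedged sketch using ONNX Runtime, assuming the `onnxruntime` and `numpy` packages are installed (neither appears in the cells above); the helper name `onnx_logits` is hypothetical, and the random batch only stands in for real preprocessed images:

```python
import os

ONNX_PATH = "falconsai_nsfw.onnx"  # filename used in the export call above


def onnx_logits(model_path, pixel_values):
    """Run the exported graph on a float32 array of shape (batch, 3, 224, 224)."""
    import onnxruntime as ort  # lazy import: only needed when inference is requested

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    # Input and output names match input_names/output_names from the export.
    (logits,) = session.run(["logits"], {"pixel_values": pixel_values})
    return logits


if os.path.exists(ONNX_PATH):
    import numpy as np

    batch = np.random.randn(2, 3, 224, 224).astype(np.float32)  # random stand-in images
    print(onnx_logits(ONNX_PATH, batch).shape)
```

A batch of two is used here because the export declared the batch dimension as dynamic; for real predictions, the array would come from the image processor rather than random noise.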
