<a href="https://colab.research.google.com/github/calmrocks/master-machine-learning-engineer/blob/main/Tools/DeepSeek.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Setup and Installation
First, we'll install the required packages: transformers and torch to run the model, accelerate for automatic device placement, and bitsandbytes for 4-bit quantization.

In [None]:
!pip install transformers accelerate torch
# 4-bit loading needs a recent bitsandbytes release, so upgrade it explicitly
!pip install -U bitsandbytes
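Since 4-bit loading needs a recent bitsandbytes release, it can help to print the installed versions before going further. The following is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name introduced here, not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the version of each package the notebook relies on
for name in ("transformers", "accelerate", "torch", "bitsandbytes"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

In Colab, a runtime restart may be needed after upgrading a package that was already imported.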

# Import Required Libraries
Import the necessary Python libraries and modules for working with the model.

In [1]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load Model and Tokenizer
We'll load the model with 4-bit quantization, provided by bitsandbytes, so that the weights take up far less memory than they would in 16-bit precision.
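To make the memory saving concrete, here is a rough back-of-the-envelope sketch. It assumes the "7B" in the model name means about 7 billion parameters and counts the weights only; `weight_memory_gb` is a hypothetical helper, and real usage will be higher because of quantization metadata, layers kept in higher precision, activations, and the KV cache.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


num_params = 7e9  # assumption: "7B" read as 7 billion parameters

# Compare full precision, half precision, and 4-bit storage
for label, bits in [("float32", 32), ("float16", 16), ("4-bit", 4)]:
    print(f"{label:>8}: {weight_memory_gb(num_params, bits):.1f} GB")
```

Under these assumptions the weights need about 28 GB in float32, 14 GB in float16, and 3.5 GB in 4-bit.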

In [4]:
model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load model in 4-bit. Passing load_in_4bit directly to from_pretrained is deprecated,
# so the setting goes in a BitsAndBytesConfig instead.
# Needs a recent bitsandbytes: pip install -U bitsandbytes
from transformers import BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True)
)

# Test the Model
Let's create a simple function to generate text and test it with a sample prompt.

In [None]:
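def generate_text(prompt, max_new_tokens=200):
    """Sample a completion for `prompt` and return the decoded text, prompt included."""
    # Tokenize the prompt and move the tensors to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # counts generated tokens only; max_length would also count the prompt
        num_return_sequences=1,
        temperature=0.7,
        do_sample=True,  # sampling must be on for temperature to have an effect
        pad_token_id=tokenizer.eos_token_id  # pad with the end-of-sequence token
    )

    # outputs[0] holds the prompt tokens followed by the generated tokens
    return tokenizer.decode(outputs[0], skip_special_tokens=True)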

# Test with a sample prompt
prompt = "Explain quantum computing in simple terms:"
response = generate_text(prompt)
print(response)
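DeepSeek-R1 style models commonly write their reasoning between `<think>` and `</think>` tags before the final answer. The sketch below separates the two parts using plain string handling. It assumes the tags survive decoding, which depends on the prompt format and tokenizer settings; `split_reasoning` is a hypothetical helper, and the sample string is made up for illustration.

```python
def split_reasoning(text, open_tag="<think>", close_tag="</think>"):
    """Split generated text into (reasoning, answer).

    If no closing tag is present, reasoning is "" and the whole text is the answer.
    """
    head, sep, tail = text.partition(close_tag)
    if not sep:
        return "", text.strip()
    # Drop anything before the opening tag (e.g. the echoed prompt), if there is one
    reasoning = head.split(open_tag, 1)[-1].strip()
    return reasoning, tail.strip()


# Made-up sample standing in for real model output
sample = "Q: 2+2? <think>Add the two numbers.</think> The answer is 4."
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

With real output, the call would be `reasoning, answer = split_reasoning(response)`. If the model stops before emitting `</think>`, for example because it ran out of its token budget, the helper returns the whole text as the answer.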