In [1]:
%pip install "sagemaker>=2.163.0"

Note: you may need to restart the kernel to use updated packages.


In [2]:
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri
import time

sagemaker_session = sagemaker.Session()
region = sagemaker_session.boto_region_name
role = sagemaker.get_execution_role()

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


In [3]:
# backend="huggingface" selects the Hugging Face LLM (text-generation-inference) container
image_uri = get_huggingface_llm_image_uri(
    backend="huggingface",
    region=region
)

In [4]:
model_name = "deepseek-14b-" + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())

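hub = {
    'HF_MODEL_ID': 'deepseek-ai/DeepSeek-R1-Distill-Qwen-14B',
    # The model is a generative LLM, so the task is text generation, not extractive QA
    'HF_TASK': 'text-generation',
    # Shard the model across the 4 GPUs of the ml.g5.12xlarge instance used below
    'SM_NUM_GPUS': '4'
}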

model = HuggingFaceModel(
    name=model_name,
    env=hub,
    role=role,
    image_uri=image_uri
)
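
The environment dictionary above hard-codes the GPU count, which has to agree with the instance type chosen at deploy time. A minimal sketch of keeping the two in sync is shown here. The helper `build_llm_env` and the `GPUS_PER_INSTANCE` table are hypothetical names introduced for illustration, and the sketch assumes the container reads the text-generation-inference settings `MAX_INPUT_LENGTH` and `MAX_TOTAL_TOKENS`; the token limits in the example call are illustrative values, not tuned recommendations.

```python
# Hypothetical lookup table: GPU counts for a few g5 instance sizes only.
GPUS_PER_INSTANCE = {
    "ml.g5.2xlarge": 1,
    "ml.g5.12xlarge": 4,
    "ml.g5.48xlarge": 8,
}


def build_llm_env(model_id, instance_type, max_input_length=None, max_total_tokens=None):
    """Build the container environment so SM_NUM_GPUS matches the instance type."""
    env = {
        "HF_MODEL_ID": model_id,
        # Container environment values must be strings.
        "SM_NUM_GPUS": str(GPUS_PER_INSTANCE[instance_type]),
    }
    # Optional token limits (assumed to be honored by the TGI container).
    if max_input_length is not None:
        env["MAX_INPUT_LENGTH"] = str(max_input_length)
    if max_total_tokens is not None:
        env["MAX_TOTAL_TOKENS"] = str(max_total_tokens)
    return env


hub_sketch = build_llm_env(
    "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
    "ml.g5.12xlarge",
    max_input_length=2048,   # illustrative value
    max_total_tokens=4096,   # illustrative value
)
```

The resulting dictionary could be passed as `env=` to `HuggingFaceModel` in place of `hub`.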

In [5]:
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
    endpoint_name=model_name
)

------------!

In [6]:
# Sampling parameters sent with each generation request
generation_params = {
    "do_sample": True,
    "top_p": 0.9,
    "temperature": 0.7,
    "max_new_tokens": 512
}

# Sample request
predictor.predict({
    "inputs": "Explain quantum computing in simple terms",
    "parameters": generation_params
})

[{'generated_text': 'Explain quantum computing in simple terms\n\n</think>\n\nQuantum computing is a type of computing that uses the principles of quantum mechanics to perform calculations. Unlike classical computers, which use bits (0s and 1s) to store and process information, quantum computers use quantum bits, or qubits.\n\n### Key Differences from Classical Computing:\n1. **Superposition**: Qubits can exist in multiple states at once, thanks to a property called superposition. This allows quantum computers to process a vast number of possibilities simultaneously.\n   \n2. **Entanglement**: Qubits can be entangled, meaning the state of one qubit is directly related to the state of another, no matter the distance between them. This property can help in creating stronger correlations between bits, leading to more efficient computations.\n\n3. **Quantum Parallelism**: Quantum computers can process a massive amount of calculations at the same time, thanks to superposition and entangleme
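
The response above echoes the prompt at the start of `generated_text`. A minimal sketch of a request builder that asks for the completion only is shown here, assuming the container honors the text-generation-inference `return_full_text` parameter. `build_payload` is a hypothetical helper, not part of the SageMaker SDK; its defaults mirror `generation_params` from the cell above.

```python
def build_payload(prompt, max_new_tokens=512, temperature=0.7, top_p=0.9,
                  return_full_text=False):
    """Assemble the JSON body for a generation request."""
    return {
        "inputs": prompt,
        "parameters": {
            "do_sample": True,
            "top_p": top_p,
            "temperature": temperature,
            "max_new_tokens": max_new_tokens,
            # Assumption: False asks the server to omit the echoed prompt.
            "return_full_text": return_full_text,
        },
    }


payload = build_payload("Explain quantum computing in simple terms")
# predictor.predict(payload)  # requires the live endpoint created in In [5]
```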

In [7]:
[{'generated_text': 'Explain quantum computing in simple terms.\n\n</think>\n\nQuantum computing is a type of computing that uses the principles of quantum mechanics to perform calculations. Unlike classical computers, which use bits (0s and 1s) to store and process information, quantum computers use quantum bits, or qubits.\n\nQubits are special because they can exist in a state called "superposition," where they are both 0 and 1 at the same time. This allows quantum computers to perform many calculations simultaneously. Additionally, qubits can be "entangled," meaning the state of one qubit is connected to the state of another, no matter how far apart they are. This entanglement allows quantum computers to solve certain problems much faster than classical computers.\n\nIn simple terms, quantum computing is like having a computer that can explore many possibilities at once, thanks to the unique properties of quantum mechanics. This could lead to breakthroughs in fields like cryptography, materials science, and drug discovery.'}]
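
Both responses contain the echoed prompt followed by a `</think>` marker before the actual answer. A small post-processing sketch that separates the two is shown here. `split_generated_text` is a hypothetical helper, and the `sample` string is an abbreviated stand-in for the output above rather than a new model response.

```python
def split_generated_text(generated_text, prompt, marker="</think>"):
    """Return (reasoning, answer) from a response that may echo the prompt."""
    text = generated_text
    # Drop the echoed prompt if the server returned the full text.
    if text.startswith(prompt):
        text = text[len(prompt):]
    reasoning, sep, answer = text.partition(marker)
    if not sep:
        # No marker found: treat the whole text as the answer.
        return "", text.strip()
    return reasoning.strip(), answer.strip()


prompt = "Explain quantum computing in simple terms."
sample = prompt + "\n\n</think>\n\nQuantum computing is a type of computing."
reasoning, answer = split_generated_text(sample, prompt)
```

With a live endpoint, the same call could be applied to `predictor.predict(...)[0]["generated_text"]`.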


In [8]:
# Delete the model and the endpoint so the ml.g5.12xlarge instance stops incurring charges
predictor.delete_model()
predictor.delete_endpoint()