# Local Development Environment Setup for NCF

This notebook guides you through setting up your local development environment for Neural Collaborative Filtering (NCF) using SageMaker local mode with GPU support.

## Prerequisites
- Windows 10/11 with an NVIDIA GPU, or a Mac with a GPU
- Python 3.10+
- Administrative access for Docker installation
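
The Python requirement above can also be checked in code before you run anything else. The snippet below is a minimal sketch; `meets_python_requirement` and `MIN_PYTHON` are hypothetical names introduced here for illustration and are not part of the notebook.

```python
import sys

# Minimum version taken from the prerequisites list (Python 3.10+)
MIN_PYTHON = (3, 10)


def meets_python_requirement(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor) so patch releases do not matter
    return tuple(version_info[:2]) >= tuple(minimum)


print("Python requirement met:", meets_python_requirement())
```

Passing an explicit tuple, for example `meets_python_requirement((3, 9, 18))`, lets you test the check without changing interpreters.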

## 1. Check System Requirements

First, let's verify that your system meets the necessary requirements.

In [1]:
import sys
import platform
import logging

# Configure logging to suppress file path warnings
logging.getLogger('py4j').setLevel(logging.ERROR)
logging.getLogger('urllib3.connectionpool').setLevel(logging.ERROR)
logging.getLogger('botocore.credentials').setLevel(logging.ERROR)

print(f"Python version: {sys.version}")
print(f"Operating System: {platform.system()} {platform.release()}")

# Check for GPU availability using nvidia-smi
import subprocess
try:
    nvidia_smi = subprocess.check_output(["nvidia-smi"])
    print("NVIDIA GPU detected:\n")
    print(nvidia_smi.decode())
except (OSError, subprocess.CalledProcessError):
    # OSError: nvidia-smi is not on PATH; CalledProcessError: it exited with an error
    print("NVIDIA GPU not detected or nvidia-smi not installed")

Python version: 3.12.9 (tags/v3.12.9:fdb8142, Feb  4 2025, 15:27:58) [MSC v.1942 64 bit (AMD64)]
Operating System: Windows 11
NVIDIA GPU detected:

Sun Feb 16 19:10:40 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 566.14                 Driver Version: 566.14         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 4060 ...  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   45C    P8              1W /   45W |       0MiB /   8188MiB |      0%      Default |
|                                         |                        |                
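
The full `nvidia-smi` table is convenient to read but awkward to use in code. As a sketch of an alternative, the query mode of `nvidia-smi` (`--query-gpu` with `--format=csv,noheader`) returns one comma-separated line per GPU, which is easy to parse. The helper names `parse_gpu_csv` and `query_gpus` are hypothetical and only serve this example.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi, in the order they appear in each CSV row
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_csv(text):
    """Turn the CSV output of an nvidia-smi query into a list of dicts."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if row:  # skip blank lines
            gpus.append(dict(zip(QUERY_FIELDS, (cell.strip() for cell in row))))
    return gpus


def query_gpus():
    """Run nvidia-smi in query mode; return [] if it is missing or fails."""
    cmd = [
        "nvidia-smi",
        f"--query-gpu={','.join(QUERY_FIELDS)}",
        "--format=csv,noheader",
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_csv(result.stdout)


gpus = query_gpus()
if gpus:
    for gpu in gpus:
        print(gpu)
else:
    print("No NVIDIA GPU reported by nvidia-smi")
```

An empty list means either that no GPU was found or that `nvidia-smi` is not installed, which matches the fallback message in the cell above.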

## 2. Install Required Python Packages

Install the necessary Python packages for development.

In [2]:
# Install uv for package management
# On Windows.
# powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# On macOS and Linux.
# curl -LsSf https://astral.sh/uv/install.sh | sh

In [3]:
# ! uv venv

In [4]:
# !uv sync

In [5]:
# Install required packages
!uv pip install --upgrade pip
!uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
!uv pip install sagemaker boto3 mlflow docker

Using Python 3.12.9 environment at: C:\Users\hohoy\OneDrive\Desktop\sagemaker-ncf-mlflow\.venv
Resolved 1 package in 145ms
Downloading pip (1.8MiB)
 Downloaded pip
Prepared 1 package in 426ms
Installed 1 package in 574ms
 + pip==25.0.1
Using Python 3.12.9 environment at: C:\Users\hohoy\OneDrive\Desktop\sagemaker-ncf-mlflow\.venv
Resolved 14 packages in 3.66s
Installed 9 packages in 8.34s
 + filelock==3.13.1
 + fsspec==2024.6.1
 + mpmath==1.3.0
 + networkx==3.3
 + setuptools==70.2.0
 + sympy==1.13.1
 + torch==2.6.0+cu118
 + torchaudio==2.6.0+cu118
 + torchvision

## 3. Verify PyTorch GPU Support

Check if PyTorch can detect and use your GPU.

In [6]:
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version: {torch.version.cuda}")
    print(f"GPU device: {torch.cuda.get_device_name(0)}")
    
# Simple GPU test
if torch.cuda.is_available():
    x = torch.rand(5, 3)
    print("\nTensor on CPU:")
    print(x)
    print("\nTensor on GPU:")
    print(x.cuda())

PyTorch version: 2.6.0+cu118
CUDA available: True
CUDA version: 11.8
GPU device: NVIDIA GeForce RTX 4060 Laptop GPU

Tensor on CPU:
tensor([[0.8017, 0.7165, 0.1276],
        [0.2206, 0.1958, 0.7876],
        [0.1882, 0.0364, 0.7397],
        [0.5407, 0.3389, 0.8835],
        [0.4147, 0.3015, 0.7244]])

Tensor on GPU:
tensor([[0.8017, 0.7165, 0.1276],
        [0.2206, 0.1958, 0.7876],
        [0.1882, 0.0364, 0.7397],
        [0.5407, 0.3389, 0.8835],
        [0.4147, 0.3015, 0.7244]], device='cuda:0')
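
Calling `.cuda()` directly fails on machines without CUDA, so later code is often written against a device chosen once at startup. The sketch below shows one way to do that; `pick_device` is a hypothetical helper, and the `mps` branch is an assumption added for the Mac case mentioned in the prerequisites.

```python
def pick_device(cuda_available, mps_available=False):
    """Choose a device string, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


try:
    import torch
except ImportError:  # keep the sketch importable even without PyTorch
    torch = None

if torch is not None:
    # torch.backends.mps only exists in newer PyTorch builds, so look it up defensively
    mps_backend = getattr(torch.backends, "mps", None)
    has_mps = mps_backend is not None and mps_backend.is_available()
    device = torch.device(pick_device(torch.cuda.is_available(), has_mps))

    # Allocate directly on the chosen device instead of calling .cuda()
    x = torch.rand(5, 3, device=device)
    print(f"Tensor allocated on: {x.device}")
```

Keeping the decision in one function means the rest of the code can use `device` without repeating availability checks.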


## 4. Configure Docker

Set up Docker with NVIDIA Container Toolkit support.

In [7]:
# Check Docker installation
import subprocess
try:
    docker_version = subprocess.check_output(["docker", "--version"])
    print(f"Docker installed: {docker_version.decode()}")
except (OSError, subprocess.CalledProcessError):
    # OSError: docker is not on PATH; CalledProcessError: the CLI exited with an error
    print("Docker not detected. Please install Docker Desktop for Windows")
    print("Visit: https://docs.docker.com/desktop/windows/install/")

Docker installed: Docker version 27.0.3, build 7d4bcd8
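
`docker --version` only confirms that the command-line client is installed; it does not tell you whether the Docker daemon is running. As a rough sketch, you can call `docker info`, which needs to reach the daemon, and inspect its exit code. The helper name `docker_daemon_running` is hypothetical, and the assumption that an unreachable daemon yields a non-zero exit code should be verified against your Docker version.

```python
import subprocess


def docker_daemon_running(timeout=10):
    """Return True if `docker info` succeeds, which requires a reachable daemon."""
    try:
        result = subprocess.run(
            ["docker", "info"],
            capture_output=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # OSError: docker CLI not found; TimeoutExpired: daemon did not answer in time
        return False
    return result.returncode == 0


print("Docker daemon reachable:", docker_daemon_running())
```

If this prints `False` while the version check succeeds, start Docker Desktop (or the Docker service) and run the cell again.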



### Install NVIDIA Container Toolkit

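Follow these steps manually. The commands assume a Debian- or Ubuntu-based Linux shell, since they rely on `apt-get` and `systemctl`: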

1. Install NVIDIA Container Toolkit:
```bash
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
```

2. Restart Docker service:
```bash
sudo systemctl restart docker
```
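
After the restart, one way to see whether Docker picked up the toolkit is to list the runtimes the daemon knows about. The sketch below assumes that `docker info --format "{{json .Runtimes}}"` prints a JSON object keyed by runtime name and that the toolkit registers a runtime called `nvidia`. Both helper names are hypothetical.

```python
import json
import subprocess


def has_nvidia_runtime(runtimes_json):
    """Return True if the JSON object of Docker runtimes has an 'nvidia' key."""
    try:
        runtimes = json.loads(runtimes_json)
    except json.JSONDecodeError:
        return False
    return isinstance(runtimes, dict) and "nvidia" in runtimes


def docker_runtimes_json(timeout=10):
    """Ask the Docker daemon for its runtimes as JSON; return '' on any failure."""
    cmd = ["docker", "info", "--format", "{{json .Runtimes}}"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


print("nvidia runtime registered:", has_nvidia_runtime(docker_runtimes_json()))
```

Treat a negative result as a hint rather than proof: some setups may expose GPUs to containers without listing a separate `nvidia` runtime, so the container test in section 7 remains the decisive check.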

## 5. Configure SageMaker Local Mode

In [8]:
import warnings
# Suppress warnings to keep the cell output readable
warnings.filterwarnings("ignore")

import sagemaker
import boto3

# Configure the SageMaker session; pass the boto3 session so its region is actually used
boto_session = boto3.Session(region_name='ap-southeast-2')
sagemaker_session = sagemaker.LocalSession(boto_session=boto_session)

# Set up role (not needed for local mode, but required for API compatibility)
role = 'arn:aws:iam::111111111111:role/service-role/AmazonSageMaker-ExecutionRole-20200101T000001'

print("SageMaker version:", sagemaker.__version__)


sagemaker.config INFO - Not applying SDK defaults from location: C:\ProgramData\sagemaker\sagemaker\config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: C:\Users\hohoy\AppData\Local\sagemaker\sagemaker\config.yaml


SageMaker version: 2.239.1
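
With a local session in place, local mode is selected through the estimator's `instance_type`: `local` runs the training container on CPU and `local_gpu` runs it with GPU access. The sketch below only assembles the estimator arguments so that it runs without a training script. The helper `local_estimator_kwargs`, the `train.py` entry point, the framework and Python versions, and the `file://./data` input path are all assumptions for illustration.

```python
def local_estimator_kwargs(role, use_gpu, entry_point="train.py"):
    """Build keyword arguments for a SageMaker PyTorch estimator in local mode."""
    return {
        "entry_point": entry_point,      # hypothetical training script name
        "role": role,                    # placeholder role; not used for local runs
        "instance_count": 1,
        "instance_type": "local_gpu" if use_gpu else "local",
        "framework_version": "2.0.1",    # assumption: choose a version you need
        "py_version": "py310",           # assumption: must match the framework version
    }


kwargs = local_estimator_kwargs(
    role="arn:aws:iam::111111111111:role/service-role/AmazonSageMaker-ExecutionRole-20200101T000001",
    use_gpu=True,
)
print("Instance type:", kwargs["instance_type"])

# With the SageMaker SDK installed, Docker running, and a real training script,
# the estimator could be created roughly like this:
#
#   from sagemaker.pytorch import PyTorch
#   estimator = PyTorch(sagemaker_session=sagemaker_session, **kwargs)
#   estimator.fit({"training": "file://./data"})  # hypothetical local data path
```

Switching `use_gpu` to `False` is a quick way to rule out GPU passthrough problems when a local training job fails to start.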


## 6. Set Up MLflow

Configure MLflow for experiment tracking.

Before running this cell, make sure to start the MLflow server in a separate terminal:

```bash
uv run mlflow server --port 5000
```

In [9]:
import mlflow
import os
import logging

# Disable warning logs that show file paths
logging.getLogger('py4j').setLevel(logging.ERROR)
logging.getLogger('urllib3.connectionpool').setLevel(logging.ERROR)
logging.getLogger('botocore.credentials').setLevel(logging.ERROR)

# Configure MLflow
os.environ['MLFLOW_TRACKING_URI'] = 'http://localhost:5000'
mlflow.set_tracking_uri('http://localhost:5000')

# Create experiment with error handling
experiment_name = "ncf-local-development"
try:
    # First check if experiment exists
    existing_exp = mlflow.get_experiment_by_name(experiment_name)
    if existing_exp:
        experiment_id = existing_exp.experiment_id
    else:
        experiment_id = mlflow.create_experiment(experiment_name)
        
    print(f"MLflow version: {mlflow.__version__}")
    print(f"MLflow tracking URI: {mlflow.get_tracking_uri()}")
    print(f"Experiment ID: {experiment_id}")
    
except Exception as e:
    # Print the actual error for debugging
    print(f"Error encountered: {str(e)}")
    print("\nUnable to connect to MLflow server. Possible causes:")
    print("- MLflow server not running. Run 'uv run mlflow server --port 5000'")
    print("- Old MLflow version. Upgrade using 'uv pip install --upgrade mlflow'")

MLflow version: 2.20.2
MLflow tracking URI: http://localhost:5000
Experiment ID: 212974742724506544
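
If the cell above fails, a quick way to separate "server not running" from other problems is to probe the server over HTTP before involving the MLflow client. This sketch assumes the tracking server answers on a `/health` endpoint; the helper name `mlflow_server_reachable` is hypothetical.

```python
import urllib.error
import urllib.request


def mlflow_server_reachable(tracking_uri="http://localhost:5000", timeout=3.0):
    """Return True if the tracking server answers its health endpoint with HTTP 200."""
    url = tracking_uri.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # URLError/OSError: connection refused or timed out; ValueError: malformed URL
        return False


print("MLflow server reachable:", mlflow_server_reachable())
```

A `False` result with the default URI usually means the server from the terminal command above is not running or is listening on a different port.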


## 7. Verify GPU Container Support

Test NVIDIA Container Toolkit with a simple PyTorch container.

In [10]:
# Pull PyTorch container
!docker pull pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

# Run container with GPU support (--rm removes the container once it exits)
!docker run --rm --gpus all pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime python -c \
    "import torch; print('CUDA available:', torch.cuda.is_available())"

2.0.1-cuda11.7-cudnn8-runtime: Pulling from pytorch/pytorch
Digest: sha256:82e0d379a5dedd6303c89eda57bcc434c40be11f249ddfadfd5673b84351e806
Status: Image is up to date for pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
docker.io/pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
CUDA available: True
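
The same check can be wrapped in a function so that a script, rather than a person reading the output, decides whether GPU passthrough works. This is a sketch under the assumption that the image above is used; `gpu_check_command` and `container_sees_gpu` are hypothetical helper names.

```python
import subprocess

# Image tag taken from the cell above
IMAGE = "pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime"


def gpu_check_command(image=IMAGE):
    """Build the docker command that prints torch.cuda.is_available() in a container."""
    return [
        "docker", "run", "--rm", "--gpus", "all", image,
        "python", "-c", "import torch; print(torch.cuda.is_available())",
    ]


def container_sees_gpu(image=IMAGE, timeout=600):
    """Run the check; return True only if the container prints 'True'."""
    try:
        result = subprocess.run(
            gpu_check_command(image),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and result.stdout.strip().endswith("True")


print(" ".join(gpu_check_command()))
# Uncomment to run the container (this pulls the image on first use):
# print("Container sees GPU:", container_sees_gpu())
```

The generous default timeout allows for the first image pull, which can take several minutes on a slow connection.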


## Environment Setup Complete

If all cells executed successfully, your local development environment is ready for NCF development with SageMaker local mode and GPU support.

Next steps:
1. Ensure the MLflow server is running: `uv run mlflow server --port 5000`
2. Proceed to Lesson 2: Data Preparation for NCF