# TODO: Title

This notebook lists all the steps that you need to complete this project. You will need to complete all the TODOs in this notebook, in the README, and in the two Python scripts included with the starter code.


**TODO**: Give a helpful introduction to what this notebook is for. Remember that comments, explanations and good documentation make your project informative and professional.

**Note:** This notebook contains code and markdown cells with TODOs that you have to complete. They are guidelines to help you finish your project while meeting the requirements in the project rubrics. Feel free to change the order of the TODOs and to use more than one code cell per TODO.

In [None]:
# TODO: Install any packages that you might need
# For instance, you will need the smdebug package
!pip install smdebug

In [None]:
# TODO: Import any packages that you might need
# For instance, you will need Boto3 and SageMaker
import sagemaker
from sagemaker.pytorch import PyTorch, PyTorchModel
# ProfilerConfig and FrameworkProfile are exported by sagemaker.debugger
from sagemaker.debugger import (
    DebuggerHookConfig, Rule, ProfilerRule, rule_configs,
    TensorBoardOutputConfig, ProfilerConfig, FrameworkProfile,
)
from PIL import Image
import numpy as np
import io
import boto3

## Dataset
TODO: Explain what dataset you are using for this project. Consider giving a short overview of the classes, class distributions, etc., to help anyone not familiar with the dataset get a better understanding of it.

In [1]:
#TODO: Fetch and upload the data to AWS S3

# Command to download and unzip data
!wget https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip
!unzip dogImages.zip

--2024-08-02 12:53:02--  https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip
Resolving s3-us-west-1.amazonaws.com (s3-us-west-1.amazonaws.com)... 52.219.120.192, 52.219.121.56, 52.219.220.104, ...
Connecting to s3-us-west-1.amazonaws.com (s3-us-west-1.amazonaws.com)|52.219.120.192|:443... connected.
ERROR: cannot verify s3-us-west-1.amazonaws.com's certificate, issued by ‘CN=Amazon RSA 2048 M01,O=Amazon,C=US’:
  Unable to locally verify the issuer's authority.
To connect to s3-us-west-1.amazonaws.com insecurely, use `--no-check-certificate'.
unzip:  cannot find or open dogImages.zip, dogImages.zip.zip or dogImages.zip.ZIP.
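
Once the archive has been downloaded and extracted successfully, a quick per-class count gives the class distribution the Dataset section asks for. The sketch below is only an illustration: it assumes the archive extracts to `dogImages/<split>/<class name>/<image file>` with `train`, `valid`, and `test` splits, and `class_distribution` is a hypothetical helper, not part of the starter code.

```python
from collections import Counter
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def class_distribution(split_dir):
    """Count image files per class sub-directory of one split (e.g. dogImages/train)."""
    counts = Counter()
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


# Assumed extraction directory and split names; splits that do not exist are skipped.
root = Path("dogImages")
for split in ("train", "valid", "test"):
    if (root / split).is_dir():
        dist = class_distribution(root / split)
        print(f"{split}: {len(dist)} classes, {sum(dist.values())} images")
```

Calling `class_distribution(root / "train").most_common(5)` would list the five largest classes, which is a quick way to spot imbalance.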


## Hyperparameter Tuning
**TODO:** This is the part where you will fine-tune a pretrained model with hyperparameter tuning. You have to tune a minimum of two hyperparameters, but you are encouraged to tune more. You are also encouraged to explain why you chose to tune those particular hyperparameters and their ranges.

**Note:** You will need to use the `hpo.py` script to perform hyperparameter tuning.

In [None]:
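#TODO: Declare your HP ranges, metrics etc.
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, ContinuousParameter

# num_classes is fixed by the dataset, so it is not a tunable range;
# pass it as a static hyperparameter on the estimator if hpo.py expects it.
hyperparameter_ranges = {
    'batch_size': IntegerParameter(16, 128),
    'epochs': IntegerParameter(5, 50),
    'lr': ContinuousParameter(0.0001, 0.1),
}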

objective_metric_name = 'validation:accuracy'
metric_definitions = [{'Name': 'validation:accuracy', 'Regex': 'Accuracy: ([0-9\\.]+)'}]
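
The tuner can only read the objective metric if `hpo.py` prints a line that the regex in `metric_definitions` matches. The sketch below checks the pattern locally before any job is launched. The log lines are made up for illustration, `extract_metric` is a hypothetical helper, and it assumes search-style matching anywhere in the line.

```python
import re

# Same pattern as in metric_definitions above.
ACCURACY_REGEX = r"Accuracy: ([0-9\.]+)"


def extract_metric(log_line, pattern=ACCURACY_REGEX):
    """Return the captured metric as a float, or None if the line does not match."""
    match = re.search(pattern, log_line)
    return float(match.group(1)) if match else None


# Hypothetical log lines: the first matches, the second does not.
print(extract_metric("Testing Accuracy: 0.8125, Testing Loss: 0.61"))
print(extract_metric("accuracy=0.8125"))
```

If the second form is what your script prints, either change the print statement or adjust the regex so the two agree.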

In [None]:
# TODO: Create estimators for your HPs
estimator = PyTorch(
    entry_point='hpo.py',
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type='ml.c4.xlarge',
    framework_version='1.8.1',
    py_version='py3',
    hyperparameters={
        'data_dir': '/opt/ml/input/data/training',
        'model_path': '/opt/ml/model',
    }
)

# Define the input data channels
inputs = {'training': 's3://your-bucket/path-to-training-data'}

# TODO: Your HP tuner here
tuner = HyperparameterTuner(
    estimator,
    objective_metric_name,
    hyperparameter_ranges,
    metric_definitions,
    max_jobs=20,
    max_parallel_jobs=3
)

In [None]:
# TODO: Fit your HP Tuner
tuner.fit(inputs)

In [None]:
# TODO: Get the best estimators and the best HPs
best_estimator = tuner.best_estimator()

# Get the hyperparameters of the best trained model
best_hyperparameters = best_estimator.hyperparameters()

# Print the best hyperparameters
print(f"Best hyperparameters: {best_hyperparameters}")
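
The dictionary returned for a tuned job typically holds string values, some of them JSON-quoted, alongside bookkeeping keys added by the SDK and the tuner. Passing it unchanged to a new estimator can therefore hand the training script values it cannot parse. The sketch below shows one way to clean it first. The key prefixes and the example dictionary are assumptions for illustration, and `clean_hyperparameters` is a hypothetical helper.

```python
import json


def clean_hyperparameters(raw, drop_prefixes=("sagemaker_", "_tuning")):
    """Decode JSON-encoded string values and drop SDK bookkeeping keys."""
    cleaned = {}
    for key, value in raw.items():
        if key.startswith(drop_prefixes):
            continue  # assumed bookkeeping entry, not a model hyperparameter
        if isinstance(value, str):
            try:
                value = json.loads(value)
            except json.JSONDecodeError:
                pass  # keep plain strings unchanged
        cleaned[key] = value
    return cleaned


# Hypothetical example of what a tuned job might report
example = {
    "batch_size": "64",
    "lr": "0.0031",
    "model_path": '"/opt/ml/model"',
    "sagemaker_program": '"hpo.py"',
    "_tuning_objective_metric": '"validation:accuracy"',
}
print(clean_hyperparameters(example))
```

Print the raw dictionary from your own tuning job before relying on this, since the exact keys and quoting depend on the SDK version.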

## Model Profiling and Debugging
TODO: Using the best hyperparameters, create and finetune a new model

**Note:** You will need to use the `train_model.py` script to perform model profiling and debugging.

In [None]:
# TODO: Set up debugging and profiling rules and hooks
debugger_rules = [
    Rule.sagemaker(rule_configs.loss_not_decreasing()),
    Rule.sagemaker(rule_configs.overtraining()),
]

profiler_rules = [
    ProfilerRule.sagemaker(rule_configs.ProfilerReport()),
]

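# The class exported by sagemaker.debugger is FrameworkProfile (not FrameworkProfiler)
from sagemaker.debugger import ProfilerConfig, FrameworkProfile

profiler_config = ProfilerConfig(
    framework_profile_params=FrameworkProfile(
        start_step=5,
        num_steps=10
    )
)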

debugger_hook_config = DebuggerHookConfig(
    s3_output_path=f's3://{sagemaker.Session().default_bucket()}/debugger-hook-output'
)
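
To build intuition for what a rule such as `loss_not_decreasing` looks for, the sketch below applies a simplified check to a list of recorded loss values. It is not the built-in rule's implementation: the `window` and `min_decrease_pct` parameters and their defaults are hypothetical, chosen only to illustrate the idea.

```python
def loss_not_decreasing(losses, window=10, min_decrease_pct=1.0):
    """Return True if the latest loss is not at least `min_decrease_pct` percent
    below the loss recorded `window` steps earlier."""
    if len(losses) <= window:
        return False  # not enough history to judge
    earlier, latest = losses[-window - 1], losses[-1]
    if earlier <= 0:
        return latest >= earlier
    decrease_pct = (earlier - latest) / earlier * 100.0
    return decrease_pct < min_decrease_pct


# Hypothetical loss curves: one still improving, one that has plateaued.
improving = [1.0 / (step + 1) for step in range(30)]
plateaued = [1.0] * 5 + [0.5] * 20
print(loss_not_decreasing(improving))
print(loss_not_decreasing(plateaued))
```

A plateau flagged this way often points to a learning rate that is too low or too high, which is a useful starting point when answering the debugging questions in this section.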

In [None]:
# Create and fit an estimator
estimator = PyTorch(
    entry_point='train_model.py',
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type='ml.c4.xlarge',
    framework_version='1.8.1',
    py_version='py3',
    hyperparameters=best_hyperparameters,
    debugger_hook_config=debugger_hook_config,
    profiler_config=profiler_config,
    rules=debugger_rules + profiler_rules
)

# Define the input data channels
inputs = {'training': 's3://your-bucket/path-to-training-data'}

# Fit the estimator
estimator.fit(inputs)

In [None]:
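# TODO: Plot a debugging output.
# fit() does not accept tensorboard_output_config (it is an estimator constructor argument),
# so read the tensors that the Debugger hook saved during the training job above instead.
from smdebug.trials import create_trial
import matplotlib.pyplot as plt

trial = create_trial(estimator.latest_job_debugger_artifacts_path())
# Assumes train_model.py registers its loss with the smdebug hook
loss_name = trial.tensor_names(collection='losses')[0]  # e.g. 'CrossEntropyLoss_output_0'
steps = trial.tensor(loss_name).steps()
plt.plot(steps, [trial.tensor(loss_name).value(s) for s in steps])
plt.xlabel('step'); plt.ylabel(loss_name)
plt.show()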

**TODO**: Is there some anomalous behaviour in your debugging output? If so, what is the error and how will you fix it?  
**TODO**: If not, suppose there was an error. What would that error look like and how would you have fixed it?

In [None]:
# TODO: Display the profiler output
# Rule outputs are written under the training job's output path
rule_output_path = f'{estimator.output_path}{estimator.latest_training_job.job_name}/rule-output'
# The report is at <rule_output_path>/<ProfilerReport rule folder>/profiler-output/profiler-report.html
print(f'Profiler rule output: {rule_output_path}')

## Model Deploying

In [None]:
# TODO: Deploy your model to an endpoint
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large',
    endpoint_name='dog-breed-classifier-endpoint'
)

# Function to preprocess the image
def preprocess_image(image_path):
    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

    image = Image.open(image_path).convert("RGB")
    image = preprocess(image)
    image = np.expand_dims(image.numpy(), axis=0)
    return image


In [None]:
# TODO: Run a prediction on the endpoint
image_path = 'path/to/your/dog_image.jpg'  # Path to your test image
image = preprocess_image(image_path)

# The default PyTorch predictor serializes NumPy arrays; keep float32 to match the model weights
payload = image.astype(np.float32)
response = predictor.predict(payload)

# Decode the prediction response
predicted_class = np.argmax(response)
print(f'Predicted class: {predicted_class}')
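
A single `argmax` hides how confident the model is. The sketch below turns one row of scores into probabilities with a softmax and returns the top few classes. It assumes the endpoint returns raw logits, one per class; the example logits are made up and `top_k_classes` is a hypothetical helper.

```python
import math


def top_k_classes(logits, k=3):
    """Return the k most likely (class_index, probability) pairs from raw logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Hypothetical logits for a 4-class model
for index, prob in top_k_classes([0.2, 2.5, -1.0, 1.1], k=2):
    print(f"class {index}: {prob:.3f}")
```

With a response of shape `(1, num_classes)`, something like `top_k_classes(np.asarray(response)[0].tolist())` would apply it to the endpoint output.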

In [None]:
# TODO: Remember to shutdown/delete your endpoint once your work is done
#predictor.delete_endpoint()