# Exercise: Parameter optimization
- Using the model and validation data from the exercise [Exercise: Classification with XGBoost](../module_4/4_03.ipynb), optimize the max_depth parameter of xgboost. Use HyperparameterTuner to do so.
- For the max_depth parameter, use sagemaker.parameter.CategoricalParameter([2, 3, 4, 5, 6, 7, 8]), for example.
- Use "validation:auc" as the HyperparameterTuner metric.
- Visualize the result.


In [None]:
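import sagemaker
# Explicit submodule imports so sagemaker.parameter and sagemaker.tuner always resolve below.
import sagemaker.parameter
import sagemaker.tuner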

role = sagemaker.get_execution_role()
sess = sagemaker.Session()
region = sess.boto_region_name

bucket = sess.default_bucket()
prefix = 'module_4/part_3'

print(role)
print(sess)
print(region)
print(bucket)
print(prefix)

In [None]:
image = sagemaker.image_uris.retrieve("xgboost", region, "1.5-1")
print(image)

In [None]:
s3_train_data = f's3://{bucket}/{prefix}/data/train.csv'
s3_validation_data = f's3://{bucket}/{prefix}/data/validation.csv'

print(s3_train_data)
print(s3_validation_data)


In [None]:
train_input = sagemaker.TrainingInput(
    s3_train_data, 
    content_type="text/csv",
)
validation_input = sagemaker.TrainingInput(
    s3_validation_data,
    content_type="text/csv",
)

data_channels = {
    'train': train_input, 
    'validation': validation_input
}


In [None]:
s3_output_location = f's3://{bucket}/{prefix}/output'

hyperparameters = {
    "max_depth": "5",
    "eta": "0.2",
    "gamma": "4",
    "min_child_weight": "6",
    "subsample": "0.7",
    "objective": "binary:logistic",
    "num_round": "50",
    "eval_metric": "auc",
}


estimator = sagemaker.estimator.Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    hyperparameters=hyperparameters,
    instance_type="ml.c4.xlarge",
    output_path=s3_output_location,
    sagemaker_session=sess,
)


In [None]:
# https://sagemaker.readthedocs.io/en/stable/api/training/parameter.html#sagemaker.parameter.ParameterRange
# https://sagemaker-examples.readthedocs.io/en/latest/hyperparameter_tuning/xgboost_random_log/hpo_xgboost_random_log.html
# The exercise suggests a CategoricalParameter for max_depth; an integer range from 2 to 10 is used here instead.
# alpha (L1) and lambda (L2) regularization are also tuned, searched on a logarithmic scale.
hyperparameter_ranges = {
    "max_depth": sagemaker.parameter.IntegerParameter(min_value=2, max_value=10),
    "alpha": sagemaker.parameter.ContinuousParameter(0.01, 10, scaling_type="Logarithmic"),
    "lambda": sagemaker.parameter.ContinuousParameter(0.01, 10, scaling_type="Logarithmic"),
}
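
To see why a logarithmic scale suits a range such as 0.01 to 10, the sketch below draws values whose logarithm is uniformly distributed. It only illustrates the idea of log-scaled sampling with the standard library; it is not SageMaker's internal implementation, and `sample_log_uniform` is a hypothetical helper.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_log_uniform(0.01, 10, rng) for _ in range(10_000)]

# With linear sampling roughly 10% of draws would fall below 1.0.
# With log sampling it is roughly two thirds, because two of the three
# decades in [0.01, 10] lie below 1.0.
share_below_one = sum(s < 1.0 for s in samples) / len(samples)
print(f"share of samples below 1.0: {share_below_one:.2f}")
```

Small regularization values therefore get explored about as often as large ones, which a linear scale would not achieve.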

In [None]:
# https://sagemaker.readthedocs.io/en/stable/api/training/tuner.html
# Random search that maximizes validation AUC: 20 training jobs in total, at most 10 running at once.
tuner = sagemaker.tuner.HyperparameterTuner(
    estimator,
    "validation:auc",
    hyperparameter_ranges,
    objective_type='Maximize',
    max_jobs=20,
    max_parallel_jobs=10,
    strategy="Random",
)

In [None]:
jobname = 'xgboost-quiebras-opt-5'
tuner.fit(
    inputs=data_channels,
    job_name=jobname,
)

- We can view the results with HyperparameterTuningJobAnalytics.
- We can also view them on the Experiments screen.

In [None]:
df = sagemaker.HyperparameterTuningJobAnalytics(
    tuner.latest_tuning_job.job_name
).dataframe()
df

In [None]:
# Sort the training jobs so the highest validation AUC comes first.
df.sort_values(by='FinalObjectiveValue', ascending=False)
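
Since the exercise focuses on max_depth, it may also help to reduce the results to the best score per depth. The sketch below does this in plain Python. `best_score_per_depth` is a hypothetical helper, the example records contain invented placeholder scores rather than real tuning results, and it assumes each record carries a `max_depth` field next to `FinalObjectiveValue`.

```python
def best_score_per_depth(rows):
    """Return {max_depth: best FinalObjectiveValue} for the given job records."""
    best = {}
    for row in rows:
        # Depths may arrive as floats or strings, so normalize them to int.
        depth = int(float(row["max_depth"]))
        score = row["FinalObjectiveValue"]
        if depth not in best or score > best[depth]:
            best[depth] = score
    return best


# Placeholder records with invented scores, NOT results from a real tuning job.
example_rows = [
    {"max_depth": 3.0, "FinalObjectiveValue": 0.80},
    {"max_depth": 5.0, "FinalObjectiveValue": 0.90},
    {"max_depth": 3.0, "FinalObjectiveValue": 0.85},
]

summary = best_score_per_depth(example_rows)
for depth in sorted(summary):
    print(depth, summary[depth])
```

With the real results, the same helper could be called as `best_score_per_depth(df.to_dict("records"))`, assuming the dataframe exposes a `max_depth` column.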