In [1]:
# run this to shorten the data import from the files
import os
cwd = os.path.dirname(os.getcwd())+'/'
path_data = os.path.join(os.path.dirname(os.getcwd()), 'datasets/')


# When performance estimation is off

Imagine you're a data scientist at a bank, working on a loan default use case. You receive labels to validate your model and performance estimation algorithm every month. During one particular month, you observe that many customers with well-paid jobs are defaulting more often due to a significant surge in inflation and a corresponding job crisis.

As you compare the estimated and realized performance, you notice a significant disparity between them.

Why might the performance estimation algorithm be less effective in this situation?

### Possible Answers


    The algorithm is not implemented correctly
    
    
    Concept drift {Answer}
    
    
    Covariate shift
    
    
    None of the above

**Perfect! The CBPE and DLE algorithms do not work well under concept drift. Now that you understand the possibilities and limitations of the algorithms, let's validate them on our tip prediction model and the US Census dataset!**
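
To see why concept drift is a blind spot, here is a minimal sketch of a confidence-based accuracy estimate in the spirit of CBPE. It is not NannyML's implementation: the data is simulated, and the names `estimated`, `realized_accuracy`, and the drift size of 0.3 are assumptions made for illustration. The estimate uses only the model's predicted probabilities, so it cannot react when the true relationship between features and labels changes.

```python
import random

random.seed(0)
n = 50_000

# Simulated model scores, assumed to be well calibrated before the drift
probs = [random.random() for _ in range(n)]
preds = [1 if p >= 0.5 else 0 for p in probs]

# Confidence-based estimate: expected accuracy computed from the scores alone
estimated = sum(p if yhat == 1 else 1 - p for p, yhat in zip(probs, preds)) / n


def realized_accuracy(true_probs):
    """Draw labels from the true default probabilities and score the predictions."""
    labels = [1 if random.random() < q else 0 for q in true_probs]
    return sum(y == yhat for y, yhat in zip(labels, preds)) / n


# No drift: labels still follow the probabilities the model reports
realized_no_drift = realized_accuracy(probs)

# Concept drift (hypothetical): every customer becomes 0.3 more likely to default,
# while the model keeps producing the same scores
realized_drift = realized_accuracy([min(p + 0.3, 1.0) for p in probs])

print(f"estimated:           {estimated:.3f}")
print(f"realized (no drift): {realized_no_drift:.3f}")
print(f"realized (drift):    {realized_drift:.3f}")
```

Without drift, the estimate and the realized accuracy agree closely. Under the simulated concept drift, the realized accuracy falls while the estimate stays exactly where it was, because nothing in the model's inputs or scores has changed.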

In [None]:
# exercise 01

"""
Comparing estimated and realized performance

Now that you have seen how performance calculation works, your task is to calculate the realized performance for our tip prediction model for the NYC green taxi dataset.

The reference and analysis sets are already loaded and saved in the reference and analysis variables.

In addition, results from the DLE algorithm for tip prediction are stored in the estimated_results variable.
"""

# Instructions

"""


    Specify problem type as regression in calculator initialization.

    Fit the calculator with reference data and calculate performance for the analysis set.

    Show comparison plot between realized_results and estimated_results using compare() method.

"""

# solution

# Initialize the calculator (nannyml is imported here in case it is not preloaded)
import nannyml

calculator = nannyml.PerformanceCalculator(
    y_true='tip_amount',
    y_pred='y_pred',
    chunk_period='d',
    metrics=['mae'],
    timestamp_column_name='lpep_pickup_datetime',
    problem_type='regression')

# Fit the calculator
calculator.fit(reference)
realized_results = calculator.calculate(analysis)

# Show comparison plot for realized and estimated performance
realized_results.compare(estimated_results).plot().show()

#----------------------------------#

# Conclusion

"""
Wonderful! See how the estimated performance usually matches the realized performance closely, with a few exceptions during the holiday periods, where the performance degradation is greater than estimated. Now, let's explore what else we can do with our results!
"""
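
As a rough illustration of what "realized performance per chunk" means for a daily `chunk_period`, the sketch below groups a few rows by pickup date and computes the mean absolute error for each day. The rows and the `daily_mae` helper are hypothetical; NannyML's `PerformanceCalculator` does this work for you and also handles thresholds and plotting.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: (pickup timestamp, actual tip, predicted tip)
rows = [
    ("2016-12-01 08:15", 2.0, 1.5),
    ("2016-12-01 19:40", 0.0, 1.0),
    ("2016-12-02 09:05", 3.0, 3.5),
    ("2016-12-02 23:30", 5.0, 2.0),
]


def daily_mae(records):
    """Return the mean absolute error for each calendar day."""
    errors = defaultdict(list)
    for timestamp, y_true, y_pred in records:
        day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").date()
        errors[day].append(abs(y_true - y_pred))
    return {day.isoformat(): sum(errs) / len(errs) for day, errs in errors.items()}


mae_by_day = daily_mae(rows)
for day, mae in mae_by_day.items():
    print(day, mae)
```

Each day becomes one data point in the monitoring plot, which is why the realized and estimated curves can be compared chunk by chunk.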


In [1]:
# exercise 02

"""
Different chunking methods

A chunk represents a single data point in the monitoring results. Recall that there are three methods for chunking your data: based on time, size, or the number of chunks.

In this exercise, you will chunk and visualize the results of the CBPE algorithm for the US Census dataset using size-based and number-based chunking methods.

The nannyml library is already imported.
"""

# Instructions

"""
Load the reference, analysis, and analysis label sets using the load_us_census_ma_employment_data() function and set the chunk size to 5000.
---
Add f1 metric to the monitored metrics and set chunk number to 8.
"""

# solution

reference, analysis, analysis_gt = nannyml.load_us_census_ma_employment_data()

# Initialize the CBPE algorithm
cbpe = nannyml.CBPE(
    y_pred_proba='predicted_probability',
    y_pred='prediction',
    y_true='employed',
    metrics=['roc_auc', 'accuracy'],
    problem_type='classification_binary',
    chunk_size=5000,
)

cbpe = cbpe.fit(reference)
estimated_results = cbpe.estimate(analysis)
estimated_results.plot().show()

#----------------------------------#

reference, analysis, analysis_gt = nannyml.load_us_census_ma_employment_data()

# Initialize the CBPE algorithm
cbpe = nannyml.CBPE(
    y_pred_proba='predicted_probability',
    y_pred='prediction',
    y_true='employed',
    metrics=['roc_auc', 'accuracy', 'f1'],
    problem_type='classification_binary',
    chunk_number=8,
)

cbpe = cbpe.fit(reference)
estimated_results = cbpe.estimate(analysis)
estimated_results.plot().show()

#----------------------------------#

# Conclusion

"""
Nice work! Notice how the graphs differ depending on the chunking method you use. A good rule of thumb is to make the chunk size about 10% of the reference data size for reliable results.
"""
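
The difference between size-based and number-based chunking can be sketched with plain row indices. The helpers below are hypothetical and only illustrate the idea; in particular, how NannyML treats a trailing incomplete chunk may differ from the simple approach used here.

```python
def chunk_by_size(n_rows, chunk_size):
    """Split row indices into consecutive chunks of at most chunk_size rows."""
    return [
        (start, min(start + chunk_size, n_rows))
        for start in range(0, n_rows, chunk_size)
    ]


def chunk_by_number(n_rows, chunk_number):
    """Split row indices into chunk_number chunks of nearly equal size."""
    base, extra = divmod(n_rows, chunk_number)
    bounds, start = [], 0
    for i in range(chunk_number):
        # The first `extra` chunks each take one additional row
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


# Hypothetical dataset with 12,000 rows
size_chunks = chunk_by_size(12_000, 5_000)
number_chunks = chunk_by_number(12_000, 8)
print(size_chunks)
print(number_chunks)
```

With a fixed size, the number of chunks depends on how much data you have; with a fixed number, the size of each chunk does. Smaller chunks give more data points but noisier metric values.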


In [2]:
# exercise 03

"""
Modifying the thresholds

In the video, you observed how NannyML calculates threshold values and learned how to customize them to suit your solution.

In this exercise, your task is to define a custom standard deviation threshold and a custom constant threshold, and then apply them to the CBPE algorithm for the US Census dataset.

The reference and analysis sets have been pre-loaded as reference and analysis, along with the nannyml library.
"""

# Instructions

"""


    Import ConstantThreshold and StandardDeviationThreshold from nannyml.thresholds.

    Initialize the standard deviation method and set std_lower_multiplier and std_upper_multiplier parameters to 2.

    Initialize the constant threshold method and set the lower parameter to 0.9 and upper to 0.98.

    Pass the constant threshold method for the f1 metric and the standard deviation method for accuracy to the CBPE algorithm.

"""

# solution

# Import custom thresholds
from nannyml.thresholds import ConstantThreshold, StandardDeviationThreshold

# Initialize custom thresholds
stdt = StandardDeviationThreshold(std_lower_multiplier=2, std_upper_multiplier=2)
ct = ConstantThreshold(lower=0.9, upper=0.98)

# Initialize the CBPE algorithm
estimator = nannyml.CBPE(
    problem_type='classification_binary',
    y_pred_proba='predicted_probability',
    y_pred='prediction',
    y_true='employed',
    metrics=['roc_auc', 'accuracy', 'f1'],
    thresholds={'f1': ct, 'accuracy': stdt})

#----------------------------------#

# Conclusion

"""
Great! Custom thresholds are a valuable tool, especially in situations where you need to be alerted when a specific value is exceeded.
"""
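
To make the two threshold types concrete, here is a small sketch of how such bounds could be derived and checked. It assumes the standard deviation threshold is the mean of the metric over the reference chunks plus or minus a multiple of its standard deviation; the use of the population standard deviation, the helper names, and the reference values are all assumptions for illustration, not NannyML's internals.

```python
from statistics import mean, pstdev


def std_thresholds(reference_values, lower_multiplier, upper_multiplier):
    """Bounds placed a multiple of the standard deviation around the reference mean."""
    center = mean(reference_values)
    spread = pstdev(reference_values)  # assumption: population standard deviation
    return center - lower_multiplier * spread, center + upper_multiplier * spread


def is_alert(value, lower, upper):
    """A chunk raises an alert when its metric value falls outside the bounds."""
    return value < lower or value > upper


# Hypothetical accuracy values for five reference chunks
reference_accuracy = [0.90, 0.92, 0.94, 0.92, 0.92]
acc_lower, acc_upper = std_thresholds(reference_accuracy, 2, 2)

# Constant bounds for f1, taken from the exercise
f1_lower, f1_upper = 0.9, 0.98

print(f"accuracy bounds: {acc_lower:.4f} to {acc_upper:.4f}")
print("accuracy 0.89 alerts:", is_alert(0.89, acc_lower, acc_upper))
print("f1 0.95 alerts:", is_alert(0.95, f1_lower, f1_upper))
```

The standard deviation bounds adapt to how much the metric naturally varies on the reference data, whereas constant bounds encode a fixed requirement that stays the same no matter what the reference period looked like.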


In [3]:
# exercise 04

"""
Interacting with results

In this exercise, you will filter, plot, and convert to a DataFrame the CBPE results obtained for the US Census dataset in the previous exercise. The display method is used here to show the plots and DataFrames that are produced in the middle of the code.

The results from the CBPE estimator are preloaded in the estimated_results variable.
"""

# Instructions

"""

    Interact with the estimated results based on the comments above each code snippet.

"""

# solution

# Filter estimated results for the roc_auc metric and convert them to a dataframe
display(estimated_results.filter(metrics=['roc_auc']).to_df())

# Filter estimated results for the reference period and convert them to a dataframe
display(estimated_results.filter(period='reference').to_df())

# Filter the estimated results for the accuracy metric
display(estimated_results.filter(metrics=['accuracy']).plot().show())

# Filter the estimated results for the analysis period, as well as for accuracy and roc_auc metrics
display(estimated_results.filter(period='analysis', metrics=['accuracy', 'roc_auc']).plot().show())

#----------------------------------#

# Conclusion

"""
Fantastic job! You can now adjust and get the results that suit your requirements. Now, let's explore calculating and estimating the business value of your model!
"""


# Business value calculation

Recall that you can determine your model's business value by either directly calculating it using a performance calculator or estimating it with the CBPE algorithm. In the last video, we also delved into how these algorithms work behind the scenes.

Now, using the provided confusion matrix and business value matrix shown in the image below, you need to calculate the monetary value of the model.

![image](images/Exercise_2_3_confusion_matrix.png)

### Possible Answers


    -$20
    
    
    $20 {Answer}
    
    
    $40
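
The calculation behind this kind of question can be sketched in a few lines: multiply each cell of the confusion matrix by the matching cell of the business value matrix and add up the results. The numbers below are hypothetical and are not the ones shown in the image; both matrices use the `[[TN, FP], [FN, TP]]` layout that the hotel booking exercise uses.

```python
def business_value(confusion_counts, value_matrix):
    """Sum of count * value over the matching cells of two 2x2 matrices."""
    return sum(
        count * value
        for count_row, value_row in zip(confusion_counts, value_matrix)
        for count, value in zip(count_row, value_row)
    )


# Hypothetical counts and values, both laid out as [[TN, FP], [FN, TP]]
counts = [[50, 5], [3, 10]]
values = [[0, -2], [-5, 10]]

total = business_value(counts, values)
print(total)  # 50*0 + 5*(-2) + 3*(-5) + 10*10 = 75
```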

# Drop in monetary value

Now, you are in the production environment, and your model is up and running.

Based on the given graph, identify the days on which the alerted drop in the F1 score overlaps with the alerted drop in the business value of the model.

![images](images/Exercise_2_3_performance_estimation_business.png)

### Possible Answers


    4th and 5th of July {Answer}
    
    
    4th, 5th and 10th of July
    
    
    From 3rd to 7th of July


**Well done! As you can observe, a drop in performance doesn't necessarily translate to a decline in the business value of your model. Now, let's see how this applies to our hotel booking dataset!**

In [4]:
# exercise 05

"""
Business calculation for hotel booking dataset

Previously, you were introduced to the challenge of predicting booking cancellations. Here, you will work with the actual Hotel Booking dataset, where a model predicts booking cancellations based on the customer's country of origin, time between booking and arrival, required parking spaces, and the chosen hotel.

The reference and analysis sets have already been loaded for you. Here are the first two rows:

  country  lead_time  parking_spaces       hotel  y_pred  y_pred_proba  is_canceled  timestamp
0  FRA     120        0               City Hotel  0       0.239983      0           2016-05-01
1  ITA     120        1               City Hotel  0       0.003965      0           2016-05-01

Your task is to check the model's monetary value and ROC AUC performance.
"""

# Instructions

"""


    Initialize a custom threshold with 0 as the lower value and 150,000 as the upper value.

    Specify the business value and roc_auc metric for monitoring.

    Set TN to 0, FP to -100, FN to -200, and TP to 1500 in business_value_matrix.

    Assign custom threshold to the business value metric.


"""

# solution

# Imports, in case these names are not already preloaded
from nannyml import PerformanceCalculator
from nannyml.thresholds import ConstantThreshold

# Custom business value thresholds
ct = ConstantThreshold(lower=0, upper=150000)

# Initialize the performance calculator
calc = PerformanceCalculator(
    problem_type='classification_binary',
    y_pred_proba='y_pred_proba',
    timestamp_column_name='timestamp',
    y_pred='y_pred',
    y_true='is_canceled',
    chunk_period='m',
    metrics=['business_value', 'roc_auc'],
    # Matrix layout: [[TN, FP], [FN, TP]]
    business_value_matrix=[[0, -100], [-200, 1500]],
    thresholds={'business_value': ct})
calc = calc.fit(reference)
calc_res = calc.calculate(analysis)
calc_res.filter(period='analysis').plot().show()

#----------------------------------#

# Conclusion

"""
Well done! You can see that our model brings losses in December and January, which also overlaps with a drop in ROC AUC performance during that period. In the next chapter, we will investigate why this happened!
"""
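
To show how a monthly loss ends up as an alert, the sketch below applies the exercise's business value matrix and constant threshold to two months of confusion counts. The counts are made up for illustration, and treating the bounds as inclusive is an assumption; the real values come from `calc.calculate(analysis)`.

```python
# From the exercise: values laid out as [[TN, FP], [FN, TP]] and constant bounds
VALUE_MATRIX = [[0, -100], [-200, 1500]]
LOWER, UPPER = 0, 150_000

# Hypothetical monthly confusion counts, also laid out as [[TN, FP], [FN, TP]]
monthly_counts = {
    "2016-11": [[900, 50], [30, 60]],
    "2016-12": [[800, 300], [250, 10]],
}


def chunk_value(counts):
    """Business value of one chunk: each count times its cell value, summed."""
    (tn, fp), (fn, tp) = counts
    (v_tn, v_fp), (v_fn, v_tp) = VALUE_MATRIX
    return tn * v_tn + fp * v_fp + fn * v_fn + tp * v_tp


report = {}
for month, counts in monthly_counts.items():
    value = chunk_value(counts)
    # Assumption: a value exactly on a bound does not raise an alert
    report[month] = {"value": value, "alert": not (LOWER <= value <= UPPER)}

for month, row in report.items():
    print(month, row)
```

In this made-up example, the second month has many false positives and false negatives and few true positives, so its value drops below the lower bound of 0 and is flagged.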
