Added MASE metric and y_train parameter to objectives #4221
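
For context, here is a minimal sketch of the metric named in the title, assuming the standard non-seasonal definition of MASE: the mean absolute forecast error divided by the in-sample mean absolute error of a one-step naive forecast on the training data. This is why the objectives need a `y_train` argument. The function name `mase`, the plain-list handling, and the error raised for a zero denominator are illustrative assumptions and are not taken from the implementation in this PR.

```python
def mase(y_true, y_predicted, y_train):
    """Mean absolute scaled error against a one-step naive forecast (sketch)."""
    if len(y_true) != len(y_predicted):
        raise ValueError("y_true and y_predicted must have the same length")
    if len(y_train) < 2:
        raise ValueError("y_train needs at least two observations")

    # Mean absolute error of the forecast on the evaluation window.
    forecast_mae = sum(abs(t - p) for t, p in zip(y_true, y_predicted)) / len(y_true)

    # In-sample error of the naive forecast y_hat[t] = y[t - 1] on the training data.
    naive_mae = sum(abs(b - a) for a, b in zip(y_train, y_train[1:])) / (len(y_train) - 1)

    # Assumption: treat a constant training series as an error rather than guess a value.
    if naive_mae == 0:
        raise ValueError("naive forecast error is zero; MASE is undefined")
    return forecast_mae / naive_mae


y_train = [10.0, 12.0, 11.0, 13.0]
y_true = [14.0, 15.0]
y_predicted = [13.0, 17.0]
score = mase(y_true, y_predicted, y_train)  # roughly 0.9: forecast MAE 1.5 over naive MAE 5/3
```

A score below 1 means the forecast beat the naive baseline on average; a score above 1 means it did worse.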

Merged on Jul 18, 2023
45 commits
3297b7e
add MASE metric
remyogasawara Jun 29, 2023
3b3e233
fix comments
remyogasawara Jun 30, 2023
4e9f8f7
add mase tests
remyogasawara Jun 30, 2023
58fc8a0
update release notes
remyogasawara Jul 1, 2023
fedc716
fix objective list
remyogasawara Jul 3, 2023
49b2939
reorder
remyogasawara Jul 3, 2023
c55c34a
redo mase changes
remyogasawara Jul 5, 2023
35aaaae
increase count and define y_train
remyogasawara Jul 5, 2023
e40f2b0
format and update matplotlib
remyogasawara Jul 5, 2023
c6e6dd8
rename var
remyogasawara Jul 5, 2023
838d510
remove *100 and fix comments
remyogasawara Jul 6, 2023
1939121
comments
remyogasawara Jul 6, 2023
9fed7ac
fix example comment
remyogasawara Jul 6, 2023
2f7862d
add MASE and SMAPE
remyogasawara Jul 6, 2023
c7ee85b
remove positive_only function and fix comments
remyogasawara Jul 6, 2023
b0c3949
remove pd->np and metrics from non neg tests
remyogasawara Jul 6, 2023
1fbbbf5
remove MAPE from non negative tests
remyogasawara Jul 6, 2023
ff228d8
add y_train
remyogasawara Jul 8, 2023
e254975
add y_train param
remyogasawara Jul 14, 2023
31c61f6
update PR name
remyogasawara Jul 14, 2023
01414bd
add y_train parameter
remyogasawara Jul 14, 2023
8967b22
add y_train parameter
remyogasawara Jul 14, 2023
a021f3f
add y_train parameter
remyogasawara Jul 14, 2023
27045d2
fix parameter order
remyogasawara Jul 14, 2023
a01c395
fix parameter names
remyogasawara Jul 14, 2023
6d01d32
param names
remyogasawara Jul 14, 2023
8ea017b
remove *100 and fix comments
remyogasawara Jul 6, 2023
ea14502
add y_train
remyogasawara Jul 8, 2023
381879b
rebase
remyogasawara Jul 14, 2023
005d9d7
remove *100 and fix comments
remyogasawara Jul 6, 2023
471a71d
remove positive_only function and fix comments
remyogasawara Jul 6, 2023
fd74870
add y_train
remyogasawara Jul 8, 2023
53d134d
add y_train parameter
remyogasawara Jul 14, 2023
f06cfc0
remove *100 and fix comments
remyogasawara Jul 6, 2023
12c977a
remove positive_only function and fix comments
remyogasawara Jul 6, 2023
c8588b0
add y_train
remyogasawara Jul 8, 2023
d484c78
remove *100 and fix comments
remyogasawara Jul 6, 2023
d84abd5
remove positive_only function and fix comments
remyogasawara Jul 6, 2023
56cdc00
add y_train
remyogasawara Jul 8, 2023
19eea7b
update mase tests
remyogasawara Jul 14, 2023
7488bc3
check 0 values
remyogasawara Jul 15, 2023
2813e78
check for df and series
remyogasawara Jul 15, 2023
688fe8b
spelling
remyogasawara Jul 17, 2023
b25d890
clean up comments and if statement
remyogasawara Jul 18, 2023
c6f52e4
swap np array to pd series
remyogasawara Jul 18, 2023
2 changes: 2 additions & 0 deletions docs/source/api_index.rst
@@ -409,7 +409,9 @@ Regression Objectives

     evalml.objectives.ExpVariance
     evalml.objectives.MAE
+    evalml.objectives.MASE
     evalml.objectives.MAPE
+    evalml.objectives.SMAPE
     evalml.objectives.MSE
     evalml.objectives.MeanSquaredLogError
     evalml.objectives.MedianAE
1 change: 1 addition & 0 deletions docs/source/release_notes.rst
@@ -19,6 +19,7 @@ Release Notes
     * Enhancements
         * Add run_feature_selection to AutoMLSearch and Default Algorithm :pr:`4210`
         * Added ``SMAPE`` to the standard metrics for time series problems :pr:`4220`
+        * Added ``MASE`` metric and ``y_train`` parameter to objectives :pr:`4221`
     * Fixes
         * `IDColumnsDataCheck` now works with Unknown data type :pr:`4203`
     * Changes
1 change: 1 addition & 0 deletions evalml/objectives/__init__.py
@@ -15,6 +15,7 @@
     AUC,
     F1,
     MAE,
+    MASE,
     MAPE,
     SMAPE,
     MSE,
10 changes: 9 additions & 1 deletion evalml/objectives/cost_benefit_matrix.py
@@ -35,12 +35,20 @@ def __init__(self, true_positive, true_negative, false_positive, false_negative)
         self.false_positive = false_positive
         self.false_negative = false_negative

-    def objective_function(self, y_true, y_predicted, X=None, sample_weight=None):
+    def objective_function(
+        self,
+        y_true,
+        y_predicted,
+        y_train=None,
+        X=None,
+        sample_weight=None,
+    ):
         """Calculates cost-benefit using the predicted and true values.

         Args:
             y_predicted (pd.Series): Predicted labels.
             y_true (pd.Series): True labels.
+            y_train (pd.Series): Ignored.
             X (pd.DataFrame): Ignored.
             sample_weight (pd.DataFrame): Ignored.

10 changes: 9 additions & 1 deletion evalml/objectives/fraud_cost.py
@@ -36,12 +36,20 @@ def __init__(
         self.fraud_payout_percentage = fraud_payout_percentage
         self.amount_col = amount_col

-    def objective_function(self, y_true, y_predicted, X, sample_weight=None):
+    def objective_function(
+        self,
+        y_true,
+        y_predicted,
+        X,
+        y_train=None,
+        sample_weight=None,
+    ):
         """Calculate amount lost to fraud per transaction given predictions, true values, and dataframe with transaction amount.

         Args:
             y_predicted (pd.Series): Predicted fraud labels.
             y_true (pd.Series): True fraud labels.
+            y_train (pd.Series): Ignored.
             X (pd.DataFrame): Data with transaction amounts.
             sample_weight (pd.DataFrame): Ignored.
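
Note that this signature keeps `X` as the required third positional parameter and places `y_train` after it, whereas the other objectives in this diff place `y_train` before `X`. The sketch below shows why callers may be safer passing both by keyword. The two stand-in functions are hypothetical and only mirror the parameter orders shown in the diff; they are not the evalml methods themselves.

```python
def base_order(y_true, y_predicted, y_train=None, X=None, sample_weight=None):
    """Hypothetical stand-in using the parameter order of the base objective."""
    return {"y_train": y_train, "X": X}


def fraud_order(y_true, y_predicted, X, y_train=None, sample_weight=None):
    """Hypothetical stand-in using the parameter order of the fraud cost objective."""
    return {"y_train": y_train, "X": X}


y_true, y_predicted = [0, 1], [0, 1]
train, features = "TRAIN", "FEATURES"  # placeholders for real data

# A positional third argument binds to different parameters in the two orders.
positional_base = base_order(y_true, y_predicted, train)
positional_fraud = fraud_order(y_true, y_predicted, train)

# Keyword arguments bind identically regardless of declaration order.
keyword_base = base_order(y_true, y_predicted, y_train=train, X=features)
keyword_fraud = fraud_order(y_true, y_predicted, y_train=train, X=features)
```

With positional calls, `train` lands in `y_train` for the first function and in `X` for the second; with keywords, both receive the same bindings.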

14 changes: 11 additions & 3 deletions evalml/objectives/lead_scoring.py
@@ -25,12 +25,20 @@ def __init__(self, true_positives=1, false_positives=-1):
         self.true_positives = true_positives
         self.false_positives = false_positives

-    def objective_function(self, y_true, y_predicted, X=None, sample_weight=None):
+    def objective_function(
+        self,
+        y_true,
+        y_predicted,
+        y_train=None,
+        X=None,
+        sample_weight=None,
+    ):
         """Calculate the profit per lead.

         Args:
-            y_predicted (pd.Series): Predicted labels
-            y_true (pd.Series): True labels
+            y_predicted (pd.Series): Predicted labels.
+            y_true (pd.Series): True labels.
+            y_train (pd.Series): Ignored.
             X (pd.DataFrame): Ignored.
             sample_weight (pd.DataFrame): Ignored.

16 changes: 14 additions & 2 deletions evalml/objectives/objective_base.py
@@ -61,12 +61,20 @@ def expected_range(cls):

     @classmethod
     @abstractmethod
-    def objective_function(cls, y_true, y_predicted, X=None, sample_weight=None):
+    def objective_function(
+        cls,
+        y_true,
+        y_predicted,
+        y_train=None,
+        X=None,
+        sample_weight=None,
+    ):
         """Computes the relative value of the provided predictions compared to the actual labels, according to a specified metric.

         Args:
             y_predicted (pd.Series): Predicted values of length [n_samples]
             y_true (pd.Series): Actual class labels of length [n_samples]
+            y_train (pd.Series): Observed training values of length [n_samples]
             X (pd.DataFrame or np.ndarray): Extra data of shape [n_samples, n_features] necessary to calculate score
             sample_weight (pd.DataFrame or np.ndarray): Sample weights used in computing objective value result

@@ -79,12 +87,13 @@ def positive_only(cls):
         """If True, this objective is only valid for positive data. Defaults to False."""
         return False

-    def score(self, y_true, y_predicted, X=None, sample_weight=None):
+    def score(self, y_true, y_predicted, y_train=None, X=None, sample_weight=None):
         """Returns a numerical score indicating performance based on the differences between the predicted and actual values.

         Args:
             y_predicted (pd.Series): Predicted values of length [n_samples]
             y_true (pd.Series): Actual class labels of length [n_samples]
+            y_train (pd.Series): Observed training values of length [n_samples]
             X (pd.DataFrame or np.ndarray): Extra data of shape [n_samples, n_features] necessary to calculate score
             sample_weight (pd.DataFrame or np.ndarray): Sample weights used in computing objective value result

@@ -93,12 +102,15 @@ def score(self, y_true, y_predicted, X=None, sample_weight=None):
         """
         if X is not None:
             X = self._standardize_input_type(X)
+        if y_train is not None:
+            y_train = self._standardize_input_type(y_train)
         y_true = self._standardize_input_type(y_true)
         y_predicted = self._standardize_input_type(y_predicted)
         self.validate_inputs(y_true, y_predicted)
         return self.objective_function(
             y_true,
             y_predicted,
+            y_train=y_train,
             X=X,
             sample_weight=sample_weight,
         )
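
The flow in this hunk, where `score` standardizes the optional `y_train` and forwards it by keyword to `objective_function`, can be sketched in isolation. Everything below is a simplified, hypothetical stand-in: `ObjectiveSketch`, `_standardize`, and both subclasses are illustrative names, the list-of-floats conversion replaces the real `_standardize_input_type`, and the length check is only a placeholder for `validate_inputs`.

```python
from abc import ABC, abstractmethod


class ObjectiveSketch(ABC):
    """Hypothetical stand-in for an objective base class."""

    @abstractmethod
    def objective_function(self, y_true, y_predicted, y_train=None, X=None, sample_weight=None):
        """Compute the metric from already-standardized inputs."""

    @staticmethod
    def _standardize(values):
        # Placeholder for the real input standardization: coerce to a list of floats.
        return [float(v) for v in values]

    def score(self, y_true, y_predicted, y_train=None, X=None, sample_weight=None):
        # y_train is optional, so it is only standardized when provided.
        if y_train is not None:
            y_train = self._standardize(y_train)
        y_true = self._standardize(y_true)
        y_predicted = self._standardize(y_predicted)
        if len(y_true) != len(y_predicted):
            raise ValueError("y_true and y_predicted must have the same length")
        return self.objective_function(
            y_true, y_predicted, y_train=y_train, X=X, sample_weight=sample_weight
        )


class MeanAbsoluteErrorSketch(ObjectiveSketch):
    """Accepts y_train for a uniform signature but ignores it."""

    def objective_function(self, y_true, y_predicted, y_train=None, X=None, sample_weight=None):
        return sum(abs(t - p) for t, p in zip(y_true, y_predicted)) / len(y_true)


class TrainRangeScaledErrorSketch(ObjectiveSketch):
    """Made-up metric that needs y_train: MAE divided by the training value range."""

    def objective_function(self, y_true, y_predicted, y_train=None, X=None, sample_weight=None):
        if y_train is None:
            raise ValueError("y_train is required for this objective")
        spread = max(y_train) - min(y_train)
        if spread == 0:
            raise ValueError("y_train is constant; the scaled error is undefined")
        mae = sum(abs(t - p) for t, p in zip(y_true, y_predicted)) / len(y_true)
        return mae / spread


y_true, y_predicted, y_train = (3, 5), (2, 7), (1, 4, 7)
mae_score = MeanAbsoluteErrorSketch().score(y_true, y_predicted, y_train=y_train)
scaled_score = TrainRangeScaledErrorSketch().score(y_true, y_predicted, y_train=y_train)
```

Because `y_train` defaults to `None` everywhere, objectives that do not need it keep working for existing callers, while objectives that do need it can raise a clear error when it is missing.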