# Modify `MyScaleReward` so that it accounts for the goodwill penalty

In the video lesson, we defined a custom reward scaling wrapper called `MyScaleReward`. I have included its code below.

The hard inventory management environment has a slightly different reward function: it includes an additional goodwill penalty term.

Your job is to edit the `MyScaleReward` wrapper so that it takes the goodwill penalty into account. To do this, you will need to re-estimate the average low scale (`avg_low_scale` in the code).

You are free to make your own reasoned assumptions about the average values of quantities such as `goodwill_penalty_per_unit` and the unmet demand. Since this is an estimation problem, there is no single right answer. The trick is to balance simplicity and correctness.

import gym
import numpy as np


# Edit the following wrapper so that it takes the goodwill penalty into account
class MyScaleReward(gym.RewardWrapper):
    def reward(self, reward):
        # Each average is estimated as half of the corresponding maximum
        avg_unit_selling_price = self.env.max_unit_selling_price / 2
        avg_num_items_bought_per_day = avg_num_items_sold_per_day = self.env.max_mean_daily_demand / 2
        # The buying price is estimated as half of the average selling price
        avg_unit_buying_price = self.env.max_unit_selling_price / 4
        avg_daily_holding_cost_per_unit = self.env.max_daily_holding_cost_per_unit / 2
        avg_num_items_held_per_day = self.env.max_mean_daily_demand / 2
        # High scale: typical daily revenue
        avg_high_scale = avg_unit_selling_price * avg_num_items_sold_per_day
        # Low scale: typical daily costs (buying plus holding), as a negative reward
        avg_low_scale = - (avg_unit_buying_price * avg_num_items_bought_per_day +
                           avg_daily_holding_cost_per_unit * avg_num_items_held_per_day
                           )
        # Map [avg_low_scale, avg_high_scale] linearly onto [-1, 1]
        mid = (avg_high_scale + avg_low_scale) / 2
        linearly_mapped_reward = 2 * (reward - mid) / (avg_high_scale - avg_low_scale)
        # Squash with arctan; dividing by arctan(1) keeps -1 and 1 fixed
        return np.arctan(linearly_mapped_reward) / np.arctan(1)
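
One possible way to re-estimate the low scale is sketched below. This is not the reference solution, only an illustration under stated assumptions. It pulls the arithmetic out of the wrapper into two plain functions so that it can be run without an environment. The parameter `max_goodwill_penalty_per_unit` is a hypothetical name (check what the hard environment actually exposes), and both the average penalty per unit (half of its maximum) and the average unmet demand (half of `max_mean_daily_demand`) are my own guesses.

```python
import numpy as np


def estimate_scales(max_unit_selling_price, max_mean_daily_demand,
                    max_daily_holding_cost_per_unit,
                    max_goodwill_penalty_per_unit):
    """Return (avg_high_scale, avg_low_scale), with a goodwill term in the low scale."""
    # Same estimates as in the original wrapper
    avg_unit_selling_price = max_unit_selling_price / 2
    avg_num_items_bought_per_day = avg_num_items_sold_per_day = max_mean_daily_demand / 2
    avg_unit_buying_price = max_unit_selling_price / 4
    avg_daily_holding_cost_per_unit = max_daily_holding_cost_per_unit / 2
    avg_num_items_held_per_day = max_mean_daily_demand / 2

    # Assumption: the penalty per unit averages half of its (hypothetical) maximum
    avg_goodwill_penalty_per_unit = max_goodwill_penalty_per_unit / 2
    # Assumption: on a bad day, the whole average daily demand goes unmet
    avg_unmet_demand_per_day = max_mean_daily_demand / 2

    avg_high_scale = avg_unit_selling_price * avg_num_items_sold_per_day
    avg_low_scale = -(avg_unit_buying_price * avg_num_items_bought_per_day +
                      avg_daily_holding_cost_per_unit * avg_num_items_held_per_day +
                      avg_goodwill_penalty_per_unit * avg_unmet_demand_per_day)
    return avg_high_scale, avg_low_scale


def scale_reward(reward, avg_high_scale, avg_low_scale):
    """Map [avg_low_scale, avg_high_scale] onto [-1, 1], then squash with arctan."""
    mid = (avg_high_scale + avg_low_scale) / 2
    linearly_mapped_reward = 2 * (reward - mid) / (avg_high_scale - avg_low_scale)
    return np.arctan(linearly_mapped_reward) / np.arctan(1)


# Illustrative numbers only; they are not taken from the environment
high, low = estimate_scales(100, 200, 5, 10)
print(high, low, scale_reward(0.0, high, low))
```

Inside `MyScaleReward.reward`, the same arithmetic would read the maxima from `self.env` instead of taking them as arguments. Adding the goodwill term only makes `avg_low_scale` more negative, so the rest of the mapping stays unchanged.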