While reading through your code, I found a function, `cal_return_to_go`, which requires a config dictionary of the high/low reward values for each env.
What is its purpose, and what should we do in real-world problems where we cannot determine the high/low rewards of the environment?