Sagemaker rlestimator - deployment endpoint giving different prediction than evaluation #2638

@troybvo

I ran into a problem using a reinforcement learning dueling double deep Q-network on SageMaker. The deployed endpoint almost always predicts the value 1, while evaluation on the same dataset (same model checkpoint) gives predictions spread evenly among the classes. For the endpoint's output node I used the operation main_level/agent/main/online/network_0/dueling_q_values_head_0/output, which is the last operation in a dueling Q-network.

Is this a bug in the checkpoint deployment, or is my output operation wrong? Any help would be appreciated.
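For reference, here is a minimal sketch of the comparison described above. It assumes the output node returns one Q-value per action for each observation, so the predicted class is the argmax over that vector. The arrays `endpoint_q` and `eval_actions` are made-up placeholder numbers, not real results from the endpoint or the evaluation run, and the helper names are hypothetical.

```python
import numpy as np


def greedy_actions(q_values):
    """Return the greedy action (argmax over Q-values) for each row."""
    q = np.asarray(q_values, dtype=float)
    if q.ndim == 1:
        # A single observation: treat it as a batch of one.
        q = q[np.newaxis, :]
    return q.argmax(axis=1)


def action_histogram(actions, num_actions):
    """Fraction of predictions that fall on each action."""
    counts = np.bincount(actions, minlength=num_actions)
    return counts / len(actions)


# Placeholder Q-values standing in for what the endpoint returned
# (3 observations, 3 actions). Illustrative numbers only.
endpoint_q = np.array([
    [0.1, 0.9, 0.3],
    [0.2, 0.8, 0.1],
    [0.7, 0.6, 0.2],
])

# Placeholder actions standing in for what evaluation chose on the
# same 3 observations. Illustrative numbers only.
eval_actions = np.array([1, 0, 0])

endpoint_actions = greedy_actions(endpoint_q)

print("endpoint distribution:", action_histogram(endpoint_actions, 3))
print("evaluation distribution:", action_histogram(eval_actions, 3))

# Indices of observations where the two paths disagree.
mismatch = np.flatnonzero(endpoint_actions != eval_actions)
print("mismatching rows:", mismatch)
```

Comparing row by row rather than only the overall distribution would show whether the two paths disagree on specific observations, which might point to a difference in input preprocessing rather than in the output operation.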
