When using experience replay, why don't you update Q_target? #81

@Zixin-Tang

# Recompute prediction value and label for replay buffer
if sample_primitive_action == 'push':
    trainer.predicted_value_log[sample_iteration] = [np.max(sample_push_predictions)]
    # trainer.label_value_log[sample_iteration] = [new_sample_label_value]
elif sample_primitive_action == 'grasp':
    trainer.predicted_value_log[sample_iteration] = [np.max(sample_grasp_predictions)]
    # trainer.label_value_log[sample_iteration] = [new_sample_label_value]
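
For reference, here is a minimal sketch of what refreshing the label for a replayed sample could look like, assuming the label is the usual one-step Q-learning target (reward plus the discounted maximum of the next-state predictions). The function name, the `reward` and `next_*_predictions` arguments, and the 0.5 discount are illustrative assumptions, not taken from the repository; only `new_sample_label_value` appears in the commented-out lines above.

```python
import numpy as np


def recompute_label_value(reward, next_push_predictions, next_grasp_predictions,
                          future_reward_discount=0.5):
    """Hypothetical helper: one-step target r + gamma * max_a' Q(s', a')."""
    # Best predicted value over both primitives in the next state.
    future_value = max(np.max(next_push_predictions), np.max(next_grasp_predictions))
    return float(reward + future_reward_discount * future_value)


# Toy next-state predictions (made-up numbers, for illustration only).
next_push_predictions = np.array([[0.1, 0.4], [0.2, 0.3]])
next_grasp_predictions = np.array([[0.6, 0.5], [0.0, 0.8]])

new_sample_label_value = recompute_label_value(
    reward=1.0,
    next_push_predictions=next_push_predictions,
    next_grasp_predictions=next_grasp_predictions,
)
print(new_sample_label_value)
```

With something like this, the commented-out assignment to `trainer.label_value_log[sample_iteration]` would store a target computed from the current network rather than the one logged when the sample was first collected.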

@andyzeng
