
Curriculum file with measure as "progress" #291

@menondj


I was running a very long, 24-hour test for a tic-tac-toe game with max_steps set to 18 million and a curriculum file with measure set to "progress" and a threshold of 0.7. I understood that the lesson change would kick in at 12.6 million steps (0.7 × 18 million), since what I want is to change the parameters after 70% progress.
But it kicked in only at the very end of the test. Here is a snapshot of the PPO output:
Step: 17980000. Mean Reward: -0.106296445069. Std of Reward: 0.672540501635.
Step: 17990000. Mean Reward: -0.100342190016. Std of Reward: 0.674290931663.
INFO:unityagents:
Lesson changed. Now in Lesson 1 : defence_penalty -> -0.75, defence_reward -> 0.75
Step: 18000000. Mean Reward: -0.0938710993269. Std of Reward: 0.674583112292.
Saved Model
Saved Model
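
To quantify how late the change occurred, here is a minimal sketch that scans the log excerpt above for the last reported step before the "Lesson changed" message and expresses it as a fraction of max_steps. It assumes that progress is simply step / max_steps, which is the reading described above and is not confirmed against the ML-Agents source; the names MAX_STEPS, LOG, and last_step_before_lesson_change are hypothetical helpers for illustration only.

```python
import re

MAX_STEPS = 18_000_000  # max_steps from the run described above

# Log excerpt copied from the PPO output above.
LOG = """\
Step: 17980000. Mean Reward: -0.106296445069. Std of Reward: 0.672540501635.
Step: 17990000. Mean Reward: -0.100342190016. Std of Reward: 0.674290931663.
INFO:unityagents:
Lesson changed. Now in Lesson 1 : defence_penalty -> -0.75, defence_reward -> 0.75
Step: 18000000. Mean Reward: -0.0938710993269. Std of Reward: 0.674583112292.
"""

STEP_RE = re.compile(r"^Step: (\d+)\.")


def last_step_before_lesson_change(log_text):
    """Return the last 'Step:' value printed before 'Lesson changed', or None."""
    last_step = None
    for line in log_text.splitlines():
        match = STEP_RE.match(line)
        if match:
            last_step = int(match.group(1))
        elif line.startswith("Lesson changed"):
            return last_step
    return None


step = last_step_before_lesson_change(LOG)
# Assumption: progress is measured as step / max_steps.
observed_fraction = step / MAX_STEPS
print(step, round(observed_fraction, 4))
```

Under that assumption, the lesson changed at roughly 99.9% progress instead of the expected 70%.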

Is there anything I'm doing wrong? The TensorBoard data showed no other discrepancy, and the run seemed fine apart from the curriculum. Here is the curriculum file:
{
    "measure" : "progress",
    "thresholds" : [0.7],
    "min_lesson_length" : 2,
    "signal_smoothing" : true,
    "parameters" :
    {
        "defence_reward" : [0.25, 0.75],
        "defence_penalty" : [-0.25, -0.75]
    }
}
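
For reference, the schedule this file was expected to produce can be sketched as follows. This is only an illustration of the intended behavior, assuming each threshold is a fraction of max_steps and ignoring any effect of min_lesson_length or signal_smoothing; it does not reproduce the actual ML-Agents curriculum logic, and describe_lessons is a hypothetical helper.

```python
import json

# The curriculum file from above, embedded as a string.
CURRICULUM_JSON = """
{
    "measure" : "progress",
    "thresholds" : [0.7],
    "min_lesson_length" : 2,
    "signal_smoothing" : true,
    "parameters" :
    {
        "defence_reward" : [0.25, 0.75],
        "defence_penalty" : [-0.25, -0.75]
    }
}
"""


def describe_lessons(curriculum, max_steps):
    """List (lesson, expected_start_step, parameter_values) per lesson.

    Assumption: lesson N starts once step / max_steps passes thresholds[N - 1].
    """
    thresholds = curriculum["thresholds"]
    parameters = curriculum["parameters"]
    lessons = []
    for lesson in range(len(thresholds) + 1):
        if lesson == 0:
            start = 0
        else:
            # round() avoids floating-point noise in the product.
            start = round(thresholds[lesson - 1] * max_steps)
        values = {name: vals[lesson] for name, vals in parameters.items()}
        lessons.append((lesson, start, values))
    return lessons


lessons = describe_lessons(json.loads(CURRICULUM_JSON), 18_000_000)
for lesson, start, values in lessons:
    print(lesson, start, values)
```

With max_steps at 18 million, this puts the expected start of Lesson 1 at step 12,600,000, well before the point where the log shows the change.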

BTW, awesome job on ML agents! Has opened up so many possibilities!

Labels

help-wanted: Issue contains request for help or information.
