Is there a way to make HFO deterministic? #67
I was unable to make hfo deterministic. I think the underlying rcssserver
is the source of the stochastic behavior.
…On Fri, Jul 27, 2018, 6:30 PM Hongjie ***@***.***> wrote:
Hi @mhauskn <https://github.com/mhauskn>, I set the same *--seed* for
each game, used *--fullstate*, and took the same action at each
time step. I got the same initial state for each game, but the subsequent
states were different. Here is the code I used for the test:
state = hfo.getState() # initial state s_1
print(state)
hfo.act(DASH, 4, 0) # a_1
hfo.step()
state = hfo.getState() # s_2
print(state)
hfo.act(DASH, 4, 0) # a_2
hfo.step()
state = hfo.getState() # s_3
print(state)
hfo.act(DASH, 4, 0) # a_3
hfo.step()
state = hfo.getState() # s_4
print(state)
The outputs for 3 games are as follows:
GAME 1:
[-0.00783491 0.304353 0. -0.60117686 0.309754 -1.
-0.07788175 -0.12915528 -0.717829 -2. -1. 1. ]
[-0.00783491 0.304353 0. -0.60117686 0.309754 -1.
-0.07788175 -0.12915528 -0.717829 -2. 1. 1. ]
[-0.00706983 0.30430746 0. -0.60117686 0.309754 -1.
-0.0786112 -0.12924314 -0.7176385 -2. 1. 1. ]
[-0.00602221 0.30428338 0. -0.60117686 0.309754 -1.
-0.07959193 -0.12937814 -0.71738756 -2. 1. 1. ]
GAME 2:
[-0.00783491 0.304353 0. -0.60117686 0.309754 -1.
-0.07788175 -0.12915528 -0.717829 -2. -1. 1. ]
[-0.00783491 0.304353 0. -0.60117686 0.309754 -1.
-0.07788175 -0.12915528 -0.717829 -2. 1. 1. ]
[-0.00704449 0.30435562 0. -0.60117686 0.309754 -1.
-0.07861197 -0.12926483 -0.7176455 -2. 1. 1. ]
[-0.00590158 0.3042941 0. -0.60117686 0.309754 -1.
-0.07969844 -0.12939876 -0.7173623 -2. 1. 1. ]
GAME 3:
[-0.00783491 0.304353 0. -0.60117686 0.309754 -1.
-0.07788175 -0.12915528 -0.717829 -2. -1. 1. ]
[-0.00706983 0.30436897 0. -0.60117686 0.309754 -1.
-0.07858217 -0.12926644 -0.71765506 -2. 1. 1. ]
[-0.00605714 0.30442786 0. -0.60117686 0.309754 -1.
-0.07949132 -0.12942815 -0.71743464 -2. 1. 1. ]
[-0.0049333 0.3045268 0. -0.60117686 0.309754 -1.
-0.08048397 -0.12962067 -0.71719897 -2. 1. 1. ]
I also noticed another problem. At the initial state s_1, I had the agent
take the action *DASH(20,0)*, but the following state s_2 is the same as s_1
in GAME 1 and GAME 2.
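To pin down where the runs start to disagree, the printed states can be compared step by step. The following is a minimal sketch, not part of HFO: `first_divergence` is a hypothetical helper, and the sample data are only the first two features of each state copied from the logs above.

```python
def first_divergence(runs, tol=0.0):
    """Return the index of the first step at which any run differs
    from the first run by more than tol, or None if all runs agree."""
    reference = runs[0]
    n_steps = min(len(run) for run in runs)
    for step in range(n_steps):
        for other in runs[1:]:
            pairs = zip(reference[step], other[step])
            if any(abs(a - b) > tol for a, b in pairs):
                return step
    return None


# First two features of each printed state, copied from the logs above.
game1 = [[-0.00783491, 0.304353], [-0.00783491, 0.304353],
         [-0.00706983, 0.30430746], [-0.00602221, 0.30428338]]
game2 = [[-0.00783491, 0.304353], [-0.00783491, 0.304353],
         [-0.00704449, 0.30435562], [-0.00590158, 0.3042941]]
game3 = [[-0.00783491, 0.304353], [-0.00706983, 0.30436897],
         [-0.00605714, 0.30442786], [-0.0049333, 0.3045268]]

print(first_divergence([game1, game2]))         # 2
print(first_divergence([game1, game2, game3]))  # 1
```

Indices are zero-based, so GAME 1 and GAME 2 first differ at s_3, while GAME 3 already differs from them at s_2.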
I think I can make HFO deterministic now. Just add several lines
to serveOptions in ./bin/HFO (line 68 at commit 269c6b6).
This is awesome! I would welcome a pull request that added an HFO flag to
switch on/off determinism.
…On Tue, Jul 31, 2018 at 3:05 PM Hongjie ***@***.***> wrote:
I think I can make hfo deterministic now. Just add several lines
'server::player_rand=0 ' \
'server::ball_rand=0 ' \
'server::kick_rand=0 ' \
'server::wind_rand=0' \
to serveOptions in ./bin/HFO
https://github.com/LARG/HFO/blob/269c6b694e86ee5266c897f2727f2e0b7d5f10a0/bin/HFO#L68
.
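A flag for switching determinism on and off could be wired up roughly as follows. This is only a sketch under assumptions: the flag name `--deterministic`, the helper `build_server_options`, and the placeholder base option are hypothetical and are not taken from the actual ./bin/HFO script; only the four `server::*_rand=0` settings come from the comment above.

```python
import argparse

# The four rcssserver noise settings named in the comment above.
NOISE_OPTIONS = (
    'server::player_rand=0',
    'server::ball_rand=0',
    'server::kick_rand=0',
    'server::wind_rand=0',
)


def build_server_options(base_options, deterministic):
    """Append the zero-noise settings to an option string when requested."""
    if not deterministic:
        return base_options
    parts = [base_options.strip()] + list(NOISE_OPTIONS)
    return ' '.join(parts).strip()


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Hypothetical flag name; the real script may choose a different one.
    parser.add_argument('--deterministic', action='store_true',
                        help='Disable player, ball, kick and wind noise.')
    return parser.parse_args(argv)


args = parse_args(['--deterministic'])
# 'server::example_option=1' is a placeholder, not a real rcssserver option.
print(build_server_options('server::example_option=1', args.deterministic))
```

Keeping the noise settings in one tuple means the default (stochastic) behavior is untouched unless the flag is passed.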