About your code #5

Closed
sgfuigh opened this issue Jan 1, 2022 · 2 comments

sgfuigh commented Jan 1, 2022

Hello, after studying your code, I would like to ask how to get the state representation and the action distribution update at each time step from your wrapped environment. I have not been able to print the intermediate state at each step.

ingambe self-assigned this Jan 5, 2022
ingambe added the question label Jan 5, 2022

ingambe (Collaborator) commented Jan 5, 2022

Hello,
After each call to the step function, the environment returns the state, the reward, a boolean indicating whether the episode is done, and an empty dictionary (to follow OpenAI's convention).
The state itself is a dictionary with two entries:

  • action_mask: a boolean vector marking which actions are legal
  • real_obs: the observation, containing, for each job, the attributes you can use to make predictions
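
To make the return values above concrete, here is a minimal sketch of a loop that prints the state at every step. The `ToyMaskedEnv` class is a hypothetical stand-in written only for this illustration: it mimics the interface described above (a `step` call returning state, reward, done flag, and an empty dictionary, with the state holding `action_mask` and `real_obs`), but its dynamics, reward value, and attribute layout are assumptions and not the project's actual environment. The `rollout` helper and the random action choice are likewise assumptions.

```python
import random


class ToyMaskedEnv:
    """Hypothetical stand-in mimicking the described interface, not the real environment."""

    def __init__(self, n_jobs=3, ops_per_job=2):
        self.n_jobs = n_jobs
        self.ops_per_job = ops_per_job
        self.remaining = [ops_per_job] * n_jobs

    def _state(self):
        # State is a dictionary with the two entries described above.
        return {
            # True where the job still has operations left, i.e. the action is legal.
            "action_mask": [r > 0 for r in self.remaining],
            # One attribute per job here (fraction of work left); purely illustrative.
            "real_obs": [[r / self.ops_per_job] for r in self.remaining],
        }

    def reset(self):
        self.remaining = [self.ops_per_job] * self.n_jobs
        return self._state()

    def step(self, action):
        if self.remaining[action] <= 0:
            raise ValueError(f"illegal action {action}")
        self.remaining[action] -= 1
        done = all(r == 0 for r in self.remaining)
        reward = -1.0  # arbitrary placeholder reward
        return self._state(), reward, done, {}


def rollout(env, seed=0):
    """Run one episode with random legal actions, printing the state at each step."""
    rng = random.Random(seed)
    state = env.reset()
    trace = []
    done = False
    step_idx = 0
    while not done:
        # Only sample among actions the mask marks as legal.
        legal = [i for i, ok in enumerate(state["action_mask"]) if ok]
        action = rng.choice(legal)
        state, reward, done, info = env.step(action)
        print(f"step={step_idx} action={action} reward={reward} done={done}")
        print(f"  action_mask={state['action_mask']}")
        print(f"  real_obs={state['real_obs']}")
        trace.append((action, state, reward, done, info))
        step_idx += 1
    return trace


trace = rollout(ToyMaskedEnv(n_jobs=3, ops_per_job=2))
```

With the real environment, the same pattern should apply: keep the state returned by each `step` call and print or log its `action_mask` and `real_obs` entries before choosing the next action.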

sgfuigh (Author) commented Jan 6, 2022 via email

ingambe closed this as completed Jun 23, 2022