Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slime env compatible with SB3 #5

Merged
merged 7 commits into from
Apr 4, 2023
Merged

Slime env compatible with SB3 #5

merged 7 commits into from
Apr 4, 2023

Conversation

vamsianumula
Copy link
Contributor

@vamsianumula vamsianumula commented Feb 22, 2023

Updated Slime environment that is compatible with Stable Baselines3. This PR has added/modified:

  • slime_environments/environments/SlimeEnvSingleAgent.py: Modify Slime env
  • env-test-gym.py: Tests env compatability with SB3
  • slime_environments/agents/SA_QLearning.py: Modify the Q-learning algo with updated env
  • single-test-03-2023-02-22 13:54:51.756431.csv: Test run of env with Q-learning

@vamsianumula vamsianumula changed the title Vamsi dev Slime env compatible with SB3 Feb 22, 2023
Copy link
Collaborator

@smarianimore smarianimore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks!
However, I'd like to ask for an addition as a further "guarantee" of compliance: would you mind adding a test script similar to SA_QLearning.py where any single agent learning algorithm of your choice from stable-baselines3 is tested?
This way we are pretty sure that "everything works".

Copy link
Collaborator

@smarianimore smarianimore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe function observation_to_int_map in SA_QLearning.py is not necessary anymore with the new observation space?

Copy link
Collaborator

@smarianimore smarianimore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the additional boolean I put in observations with comment DOC Gym v26 has additional..... Is it because they changed it again or beacause stable-baselines3 is not ready for Gymnasium?
In the latter case I think we should rely on this PR and not in the PIP version of stable-baselines3, so that our code will be ready when the next version of stable-baselines3 hits.
Instructions on how to use it are in the PR itself.

Copy link
Collaborator

@smarianimore smarianimore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen you removed type hints, is there a specific reason?

@vamsianumula
Copy link
Contributor Author

This is great, thanks! However, I'd like to ask for an addition as a further "guarantee" of compliance: would you mind adding a test script similar to SA_QLearning.py where any single agent learning algorithm of your choice from stable-baselines3 is tested? This way we are pretty sure that "everything works".

Sure, I will do that.

Maybe function observation_to_int_map in SA_QLearning.py is not necessary anymore with the new observation space?

I think we need it because the mapped integer is being used as a key for the custom implemented Q-learning algo right? But, do you mean for the SB3's Q Learning?

You removed the additional boolean I put in observations with comment DOC Gym v26 has additional..... Is it because they changed it again or beacause stable-baselines3 is not ready for Gymnasium? In the latter case I think we should rely on this PR and not in the PIP version of stable-baselines3, so that our code will be ready when the next version of stable-baselines3 hits. Instructions on how to use it are in the PR itself.

Yes, SB3 supports only gym v21 for now, not v26. I didn't know about that PR, I will look into it.

I've seen you removed type hints, is there a specific reason?

Initially, there were resulting in some error, so I tried without them. I will restore them now.

@smarianimore
Copy link
Collaborator

Maybe function observation_to_int_map in SA_QLearning.py is not necessary anymore with the new observation space?

I think we need it because the mapped integer is being used as a key for the custom implemented Q-learning algo right? But, do you mean for the SB3's Q Learning?

You're right, let's leave it there for the moment, I will check if custom Q-learning implementation can be updated according to new observation space to get rid of it

Thanks for all your work :)

@vamsianumula
Copy link
Contributor Author

Maybe function observation_to_int_map in SA_QLearning.py is not necessary anymore with the new observation space?

I think we need it because the mapped integer is being used as a key for the custom implemented Q-learning algo right? But, do you mean for the SB3's Q Learning?

You're right, let's leave it there for the moment, I will check if custom Q-learning implementation can be updated according to new observation space to get rid of it

Thanks for all your work :)

No problem. I had been a little occupied past week. I will push the requested changes soon.

@vamsianumula
Copy link
Contributor Author

This is great, thanks! However, I'd like to ask for an addition as a further "guarantee" of compliance: would you mind adding a test script similar to SA_QLearning.py where any single agent learning algorithm of your choice from stable-baselines3 is tested? This way we are pretty sure that "everything works".

Sure, I will do that.

I have added the test training code in env-test-gym.py and its working fine.

I've seen you removed type hints, is there a specific reason?

Initially, there were resulting in some error, so I tried without them. I will restore them now.

I have restored the hints.

@smarianimore please let me know in case of any more suggestions and the next steps for this project.

@smarianimore smarianimore merged commit b86baa9 into LucaInis:sm-baselines-api Apr 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants