which options #5

Closed
Tsunehiko opened this issue Jul 2, 2019 · 24 comments

@Tsunehiko

Hello. I am not very familiar with reinforcement learning.

python3 main.py --log-dir trained_models/acktr --algo acktr --model squeeze --num-process 24 --map-width 27 --render

When I run the above, main.py does not work. Could you tell me which options are needed to run main.py?

@smearle (Owner) commented Jul 2, 2019

Hi Tsunehiko, thanks for your interest.
Apologies, the README is outdated. Try this:

python3 main.py --experiment test000 --model FullyConv --num-process 24 --map-width 16 --render --overwrite --algo a2c

Please have a look at arguments.py, or call python3 main.py --help, to see the full list of available arguments.

@Tsunehiko (Author)

Thank you for your quick response.
I want to try other algorithms on Micropolis. Is it possible to connect gym_micropolis/envs/env.py to other algorithms (e.g., stable_baselines) as a gym environment?

@smearle (Owner) commented Jul 2, 2019

Funny, this repository is actually an adaptation of the following: https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail, which is itself adapted from OpenAI baselines.

I suppose one could use this environment with a different learning algorithm. Two roadblocks spring to mind that you'd need to overcome:

  1. The function setMapSize() passes a bunch of Micropolis-specific options to the environment, so you'd need to edit env.py to call it in the init() function with predetermined options (see the sketch after this list).
  2. The action space in Micropolis is very large. It corresponds to a flattened 2D image the size of the map, with one channel per tile-specific action, so you'd need to make sure the model you use is compatible with this. I'm exploring fully convolutional NNs that retain a spatial correspondence between the observation and action spaces (you can find them in model.py); I don't think you'd want a net that bottlenecks too drastically. Atari, for example, has a much smaller action space.
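
As a rough illustration of point 1, here is a minimal sketch of such a wrapper. The class name MicropolisEnv and the setMapSize() signature are assumptions for illustration; only the need to call setMapSize() with predetermined options comes from this repo. Note that for a 16x16 map with 19 tools, the flattened action space has 16 * 16 * 19 = 4864 discrete actions.

import gym
from gym_micropolis.envs.env import MicropolisEnv  # class name is an assumption

class MicropolisForBaselines(gym.Env):
    # Hypothetical wrapper: bake the Micropolis-specific options into
    # __init__, so that third-party code (e.g., stable_baselines) can
    # construct the env without knowing about setMapSize().
    def __init__(self, map_width=16):
        self.env = MicropolisEnv()
        # setMapSize() is the call main.py normally makes; the exact
        # signature used here is assumed.
        self.env.setMapSize(map_width)
        self.observation_space = self.env.observation_space
        self.action_space = self.env.action_space  # e.g., 4864 actions for a 16x16 map

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)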

Indeed, it would be nice for this to be incorporated into the standard family of gym environments. I'm working on it here for now because I'm still experimenting with the environment itself, e.g., playing with different reward functions, designing mini-games, and trying different map sizes.

Which algorithm were you thinking of using in particular (stable_baselines is a collection of algorithms, not one single algorithm)?

@smearle (Owner) commented Jul 2, 2019

Feel free to shoot me an email if you have any more questions, ideas, or want to chat about RL :)

@Tsunehiko (Author)

Thank you for the kind and thorough explanation.
The reason I brought up stable_baselines is that I'm not used to RL yet, and it seemed like the easiest way to work with different algorithms. (I have no particular preference for any one algorithm.)

In model.py, I get errors because the densenet_pytorch and ConvLSTMCell modules can't be found. Which packages should I install?

From now on, I'll ask further questions by email.

@smearle (Owner) commented Jul 2, 2019

Aha, densenet_pytorch is an old dependency, so I got rid of it, and I've added ConvLSTMCell.py to the repo. Do a git pull from inside the repo and try again!
Thanks for bringing these problems to my attention.

@smearle (Owner) commented Jul 2, 2019

And indeed, you'd probably be wise to start learning RL by playing with stable_baselines and some Atari games, etc. But, selfishly, I'd rather you play with this repo, since you're helping me troubleshoot it :)

@Tsunehiko (Author)

Thank you for your quick response. I'm glad I can be useful, too.
Can I use the --no-cuda option? I don't have access to a GPU machine right now, so I can only run on CPU :(

By the way, I'm planning to work with this repo at least through July.

@smearle (Owner) commented Jul 2, 2019

Not ideal, but no problem. I've just patched up something that was getting in the way of --no-cuda, and I now have the above command working with it. Do another pull and let me know if you have any luck.

smearle self-assigned this Jul 2, 2019
@Tsunehiko (Author)

python3 main.py --experiment test000 --model FullyConv --num-process 24 --map-width 16 --render --overwrite --algo a2c --no-cuda

I ran the above command. On the displayed screen, only the in-game time advances and no actions are taken. How can I watch the agents playing? Should I use enjoy.py?

@smearle (Owner) commented Jul 3, 2019

Strange, I can't replicate this. What output are you getting on the command line?

@Tsunehiko (Author)

The command-line output is very long, so I'm not sure which part to share; I've copied the tail end of it below. The screen at that point looked like the attached image.

PLAYCITY
('PAUSED', False, 'running', True)
PLAYMODE
('PAUSED', False, 'running', True)
len of intsToActions: 4864
 num tools: 19

len of intsToActions: 4864
 num tools: 19
('WINDOW SIZE', 800, 608)
len of intsToActions: 4864
 num tools: 19
('WINDOW SIZE', 800, 608)
{'Road': ['Road', 'Wire', 'Rail', 'Water'], 'Wire': ['Road', 'Wire', 'Rail', 'Water'], 'Rail': ['Rail', 'Wire', 'Road', 'Water'], 'Water': ['Road', 'Water', 'Wire', 'Rail'], 'Net': ['Net', 'Airport'], 'Airport': ['Net', 'Airport']}
PLAYCITY
('PAUSED', False, 'running', True)
PLAYMODE
('PAUSED', False, 'running', True)
len of intsToActions: 4864
 num tools: 19
/home/tsunehiko/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png
/home/tsunehiko/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png

Screenshot 2019-07-03 7 57 23

@smearle (Owner) commented Jul 3, 2019

This is the most recent command-line output? If time is passing on the map, then training must be underway, so you should be getting printouts indicating average reward, number of frames, and the like. Try the same command with --log-interval 1 so that this printout occurs as frequently as possible. Depending on your system, training might be very slow. You might also set --num-proc 2, so that only two games run during training (there's a bug with running just one at the moment); this will speed up gameplay in those individual games. An example invocation follows.
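
For reference, the full invocation with these suggestions folded in might look like the following (the exact flag spellings are assumptions; the thread uses both --num-process and --num-proc, so check arguments.py for the canonical names):

python3 main.py --experiment test000 --model FullyConv --num-proc 2 --map-width 16 --log-interval 1 --overwrite --algo a2c --no-cuda --render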

@Tsunehiko (Author)

I tried both --log-interval 1 and --num-proc 2, but the displayed map still does not change; only the time advances. The output from the end of the command line is shown below.

==== STARTGAME                                                                                       
GENERATECITY                                                                                         
STARTMODE                                                                                            
('PAUSED', True, 'running', False)                                                                   
('SWITCHPAGE', <micropolisnotebook.MicropolisNotebook object at 0x7fd3f24d3e10 (pyMicropolis+micropolisEngine+micropolisnotebook+MicropolisNotebook at 0x563695516510)>, <micropolisnotebook.MicropolisNotebook object at 0x7fd3f24d3e10 (pyMicropolis+micropolisEngine+micropolisnotebook+MicropolisNotebook at 0x563695516510)>, <micropolisnoticepanel.MicropolisNoticePanel object at 0x7fd3f24d5630 (pyMicropolis+micropolisEngine+micropolisnoticepanel+MicropolisNoticePanel at 0x563695518340)>, 0)
('WINDOW SIZE', 800, 608)                                                                            
('WINDOW SIZE', 800, 608)                                                                            
('WINDOW SIZE', 800, 608)                                                                            
('WINDOW SIZE', 800, 608)                                                                            
{'Road': ['Road', 'Wire', 'Rail', 'Water'], 'Wire': ['Road', 'Wire', 'Rail', 'Water'], 'Rail': ['Rail', 'Wire', 'Road', 'Water'], 'Water': ['Road', 'Water', 'Wire', 'Rail'], 'Net': ['Net', 'Airport'], 'Airport': ['Net', 'Airport']}                                                                        
PLAYCITY                                                                                             
('PAUSED', False, 'running', True)                                                                   
PLAYMODE                                                                                             
('PAUSED', False, 'running', True)                                                                   
{'Road': ['Road', 'Wire', 'Rail', 'Water'], 'Wire': ['Road', 'Wire', 'Rail', 'Water'], 'Rail': ['Rail', 'Wire', 'Road', 'Water'], 'Water': ['Road', 'Water', 'Wire', 'Rail'], 'Net': ['Net', 'Airport'], 'Airport': ['Net', 'Airport']}                                                                        
PLAYCITY                                                                                             
('PAUSED', False, 'running', True)                                                                   
PLAYMODE                                                                                             
('PAUSED', False, 'running', True)                                                                   
len of intsToActions: 4864                                                                           
 num tools: 19                                                                                       
len of intsToActions: 4864                                                                           
 num tools: 19                                                                                       
BASE NETWORK:                                                                                        
 MicropolisBase_FullyConv(                                                                           
  (embed): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))                                         
  (k5): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))                            
  (k3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))                            
  (val_shrink): Conv2d(32, 32, kernel_size=(2, 2), stride=(2, 2))                                    
  (val): Conv2d(32, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))                            
  (act): Conv2d(32, 19, kernel_size=(1, 1), stride=(1, 1))                                           
)                                                                                                    
/home/tsunehiko/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png             
/home/tsunehiko/gym-micropolis/micropolis/MicropolisCore/src/images/tileEngine/tiles.png 

@smearle (Owner) commented Jul 3, 2019

How fast is time moving? For me, an update only occurs in the late 1900s (in-game). You could try --map-size 4 and --max-step 16; that should make updates occur by 1950 at the latest.

@Tsunehiko (Author)

I tried --map-size 4 and --max-step 16, but time passed extremely fast and kept climbing indefinitely, well past the 1950s.

@smearle (Owner) commented Jul 3, 2019

I'm at a loss. I can only recommend looking at line 265 in main.py, which should be producing these printouts, and working backward from there with print statements to figure out why line 265 is never reached.

@Tsunehiko (Author)

Thank you for looking into this so carefully.
I'll examine the code and try it myself. I'll contact you again if I have any questions.

@smearle (Owner) commented Jul 4, 2019

Also, try without --render, and see if you get any printouts.

@Tsunehiko (Author)

I tried it without --render. Results are now displayed! However, no matter how long it trains, the reward does not increase ...
Screenshot 2019-07-06 23 01 18

@Tsunehiko (Author)

Also, an error like the one in the attached image has appeared. How do I fix this error?
Screenshot 2019-07-06 23 43 03

@smearle (Owner) commented Jul 6, 2019

That's some good news! I suspected that the problem stemmed from the GUI. In particular, there must be a call to gtk.main_iteration(), or something to that effect, hidden somewhere. That function hands control to the GUI, waiting for user input, so it would stop our training code dead in its tracks (and simply let the game run very fast). Strange that I don't experience the same issue on my end and can't find a call to the function in the code; it might be an operating-system-specific issue.
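
For what it's worth, the usual way to keep a GTK GUI responsive without blocking a training loop is to drain only the events that are already pending on each step. This is a generic PyGTK pattern, not code from this repo:

import gtk  # PyGTK 2.x; the Micropolis GUI appears to be GTK-based

def pump_gui_events():
    # Handle every event already queued, then return immediately.
    # gtk.main_iteration(False) is non-blocking, unlike gtk.main(),
    # which hands control to the GUI loop indefinitely.
    while gtk.events_pending():
        gtk.main_iteration(False)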

As for the "Unable to init server" error, I think it might be the result of too many dead Python processes hanging around on the CPU (interrupting training is not yet handled gracefully by the code). Try again after pkill python, or simply after restarting your machine.

To see whether the bot is doing anything, git pull the repo, then try again with the option --print-map, which prints an array representing (the 0th-ranked environment's) game map, with different tile types corresponding to different integers. If this map is not changing, then the bot is not building on the map. An example invocation follows.
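
For example, the earlier command with --print-map added (the exact flag set is assumed from the commands used so far in this thread):

python3 main.py --experiment test000 --model FullyConv --num-proc 2 --map-width 16 --log-interval 1 --overwrite --algo a2c --no-cuda --print-map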

@Tsunehiko (Author) commented Jul 7, 2019

The array displayed by --print-map is changing, so it seems learning is happening. How do I watch the bot I trained? Should I use enjoy.py?

Also, the "Unable to init server" error appeared when I tried using a GPU server that I have access to, not my own machine. That may be the cause.

@smearle (Owner) commented Jul 7, 2019

Yes, you can try something like python enjoy.py --render --map-size 16 --load-dir a2c_FullyConv_w16/test000 (I wonder if we'll get stuck in the same GUI loop, though!).

smearle closed this as completed Jan 9, 2020