lstm+ppo #886

Open
1900360 opened this issue Oct 13, 2022 · 0 comments
1900360 commented Oct 13, 2022

While training Pendulum-v0 with LSTM+PPO in a parallel (multiprocessing) environment setup, the following error occurred:

Traceback (most recent call last):
  File "D:\desktop\lunwen_dabao\xinsuanfa0912\tian_sac_test\launch_multiprocessing_traning_cylinder.py", line 91, in <module>
    runner = Runner(
  File "C:\Users\1900\.conda\envs\yl\lib\site-packages\tensorforce\execution\runner.py", line 168, in __init__
    environment = Environment.create(
  File "C:\Users\1900\.conda\envs\yl\lib\site-packages\tensorforce\environments\environment.py", line 94, in create
    environment = MultiprocessingEnvironment(
  File "C:\Users\1900\.conda\envs\yl\lib\site-packages\tensorforce\environments\multiprocessing_environment.py", line 62, in __init__
    process.start()
  File "C:\Users\1900\.conda\envs\yl\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\1900\.conda\envs\yl\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\1900\.conda\envs\yl\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Users\1900\.conda\envs\yl\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\1900\.conda\envs\yl\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'overwrite_staticmethod.<locals>.overwritten'
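
My guess from the traceback is that Windows' spawn-based multiprocessing tries to pickle the already-constructed Environment objects before sending them to the worker processes, and Tensorforce's `overwrite_staticmethod` wrapper is a local closure that the pickler cannot serialize. Would something like the following be the right approach instead? It is a minimal, untested sketch based on the parallelization example in the Tensorforce docs, passing an environment spec so each worker builds its own environment:

from tensorforce import Agent, Environment, Runner

# Sketch (untested): pass an environment *spec* instead of pre-created
# Environment objects, so each worker process constructs its own
# environment and nothing un-picklable crosses the process boundary.
environment_spec = dict(
    environment='gym', level='Pendulum-v0', max_episode_timesteps=200
)

agent = Agent.create(
    agent='ppo',
    environment=Environment.create(**environment_spec),
    batch_size=20,
    parallel_interactions=10,
)

runner = Runner(
    agent=agent,
    environment=environment_spec,  # spec, not live instances
    num_parallel=10,
    remote='multiprocessing',
)
runner.run(num_episodes=300)
runner.close()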

Below is my code. What is wrong in that part? And is there a problem with my LSTM settings?

import argparse
import os

import gym  # only needed for the commented-out gym.make example below
from tensorforce import Agent, Environment, Runner
# import envobject_cylinder

parser = argparse.ArgumentParser()
shell_args = vars(parser.parse_args())
shell_args['num_episodes'] = 300
shell_args['max_episode_timesteps'] = 200

number_servers = 10
environments = []
for i in range(number_servers):
    env = Environment.create(
        environment='gym', level='Pendulum-v0', max_episode_timesteps=200
    )
    environments.append(env)

# environment = Environment.create(
#     environment='gym', level='Pendulum-v0', max_episode_timesteps=200
# )

network_spec = [
    dict(type='rnn', size=512, horizon=4, activation='tanh', cell='lstm'),
    dict(type='rnn', size=512, horizon=4, activation='tanh', cell='lstm')
]
baseline_spec = [
    dict(type='rnn', size=512, horizon=4, activation='tanh', cell='lstm'),
    dict(type='rnn', size=512, horizon=4, activation='tanh', cell='lstm')
]

# env = gym.make('Pendulum-v0')
# Instantiate a Tensorforce agent
agent = Agent.create(
    states=dict(type='float', shape=(3,)),
    actions=dict(type='float', shape=(1,), min_value=-2, max_value=2),
    max_episode_timesteps=200,
    agent='ppo',
    # num_parallel=10,
    environment=env,
    batch_size=20,
    network=network_spec,
    learning_rate=0.001,
    state_preprocessing=None,
    entropy_regularization=0.01,
    likelihood_ratio_clipping=0.2,
    subsampling_fraction=0.2,
    predict_terminal_values=True,
    discount=0.97,
    baseline=baseline_spec,
    baseline_optimizer=dict(
        type='multi_step',
        optimizer=dict(type='adam', learning_rate=1e-3),
        num_steps=5
    ),
    multi_step=25,
    parallel_interactions=number_servers,
    saver=dict(
        directory=os.path.join(os.getcwd(), 'saved_models/checkpoint'),
        frequency=1  # save checkpoint every update
    ),
    summarizer=dict(
        directory='summary',
        # list of labels, or 'all'
        summaries=['entropy', 'kl-divergence', 'loss', 'reward', 'update-norm']
    ),
)
print('Agent defined DONE!')

runner = Runner(
    agent=agent,
    num_parallel=number_servers,
    environments=environments,
    max_episode_timesteps=200,
    evaluation=False,
    remote='multiprocessing',
)
print('Runner defined DONE!')

runner.run(
    num_episodes=shell_args['num_episodes'],
    save_best_agent='best_model',
    sync_episodes=False,
)
runner.close()
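
On the LSTM side: if I read the docs right, `horizon` in the `rnn` layer is the truncated backpropagation-through-time horizon, so with `horizon=4` the two stacked 512-unit LSTM layers only receive gradients from the last 4 timesteps. An alternative I found in the docs is the 'auto' network, which appends a single LSTM cell with internal state; a sketch (the sizes here are just examples, not from my run):

# Sketch of the 'auto' network from the Tensorforce docs: two dense
# layers of size 64, then an LSTM cell whose internal state is carried
# across timesteps and unrolled over the last 4 steps during optimization.
policy_network = dict(type='auto', size=64, depth=2, rnn=4)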