The program SSY is a stock locating algorithm for a steel stock yard using A3C reinforcement learning.
This program requires the following:
- python=3.5(3.5.6)
- tensorflow-gpu==1.14.0 (install gpu version of tensorflow module)
- tensorflow==1.14.0 (install cpu version of tensorflow module)
- scipy==1.2.1
- pygame
- moviepy
- numpy==1.18.5
- pandas
- matplotlib
- PyCharm (version 2020.1 or earlier)
To generate GIF files, the ImageMagick program is also required.
If you are using Windows, some additional setup is needed.
In config_defaults.py, located in the directory
C:\Users\user\Anaconda3\envs\'virtual_env_name'\Lib\site-packages\moviepy
change the code
IMAGEMAGICK_BINARY = os.getenv('IMAGEMAGICK_BINARY', 'auto-detect')
into
IMAGEMAGICK_BINARY = os.getenv('IMAGEMAGICK_BINARY', r'C:\Program Files\ImageMagick-7.0.9-Q16\magick.exe')
Here, ImageMagick-7.0.9-Q16 is the ImageMagick version you have installed; adjust the path to match your installation.
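The configuration line above reads the IMAGEMAGICK_BINARY environment variable before falling back to its default, so setting that variable before moviepy is imported should have the same effect as editing the file. A minimal sketch under that assumption; the helper name resolve_imagemagick_binary is hypothetical and only mirrors the lookup, and the path is the example path from above.

```python
import os


def resolve_imagemagick_binary(default="auto-detect"):
    """Mirror the lookup in config_defaults.py: environment variable first, then the default."""
    return os.getenv("IMAGEMAGICK_BINARY", default)


# Set the variable before importing moviepy (example path; use your installed version).
os.environ["IMAGEMAGICK_BINARY"] = r"C:\Program Files\ImageMagick-7.0.9-Q16\magick.exe"

print(resolve_imagemagick_binary())
```

This keeps the installed moviepy package untouched, which may be convenient when the virtual environment is recreated.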
-
Assigned plates vs Random plates vs External file plates
This part decides whether you use the same plates list (Assigned plates) or a different plates list (Random plates) for each episode. If neither applies, you can import plates from an external file.
-
Assigned plates
For Assigned plates, the same plates are used in every episode. So, in train.py > worker > member function work(), use
s = self.env.reset()
In train.py > main function,
the parameter should be set as
inbound_plates=inbounds
for the class Locating
locating = Locating(max_stack=max_stack, num_pile=num_pile, inbound_plates=inbounds, observe_inbounds=observe_inbounds, display_env=False)
-
Configuration for number of plates
In train.py > main function
the number of plates can be changed by assigning a different number to the parameter
num_plate
of generate_schedule()
inbounds = generate_schedule(num_plate=50)
-
Configuration for Random Shuffle
In steelstockyard.py > Locating > reset()
For Assigned plates, to shuffle the plates list randomly in every episode, add the code line
random.shuffle(self.inbound_plates)
to the else branch:
else:
    self.inbound_plates = self.inbound_clone[(episode-1) % len(self.inbound_clone)][:]
    random.shuffle(self.inbound_plates)
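The effect of shuffling a copy of the stored plates can be sketched outside the environment. This is a minimal illustration, not the code of steelstockyard.py: the function reset_inbound_plates and the plate ids are hypothetical, and inbound_clone is assumed to be a list of schedules, which is what the indexing by episode suggests.

```python
import random


def reset_inbound_plates(inbound_clone, episode, shuffle=True):
    """Pick the schedule for this episode and optionally shuffle a copy of it."""
    # Copy the selected schedule so the stored clone is never modified.
    schedule = inbound_clone[(episode - 1) % len(inbound_clone)][:]
    if shuffle:
        random.shuffle(schedule)  # in-place shuffle of the copy only
    return schedule


stored = [["p1", "p2", "p3", "p4"], ["q1", "q2", "q3"]]  # made-up plate ids
first = reset_inbound_plates(stored, episode=1)
third = reset_inbound_plates(stored, episode=3)  # wraps around to the first schedule
```

Because only the copy is shuffled, every episode starts from the same stored plates and differs only in their order.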
-
-
Random Plates
For Random plates, a different plates list is used in every episode.
So, in train.py > worker > member function work()
s = self.env.reset(hold=False)
In train.py > main function,
the parameter should be set as
inbound_plates=None
for Locating
locating = Locating(max_stack=max_stack, num_pile=num_pile, inbound_plates=None, observe_inbounds=observe_inbounds, display_env=False)
-
Configuration for number of plates
In steelstockyard.py > Locating > __init__()
the number of plates can be changed by assigning a different number to the parameter
num_plate
of plate.generate_schedule()
else:
    self.inbound_plates = plate.generate_schedule(250)  # in this case, number of plates is 250
    self.inbound_clone = self.inbound_plates[:]
In steelstockyard.py > Locating > reset()
You should also assign the same number to the parameter
num_plate
of plate.generate_schedule()
if not hold:
    self.inbound_plates = plate.generate_schedule(250)  # in this case, number of plates is 250
    self.inbound_clone = self.inbound_plates[:]
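Since the same number has to appear in both __init__() and reset(), one way to keep the two in sync is to store it once on the object. The sketch below is a toy stand-in, not the actual Locating class: ToyLocating, its num_plate argument, and this generate_schedule stub are all hypothetical, and the reset logic is simplified to the hold / not hold distinction.

```python
import random


def generate_schedule(num_plate=250):
    """Hypothetical stand-in for plate.generate_schedule(); plates are plain integers here."""
    return [random.randint(0, 9) for _ in range(num_plate)]


class ToyLocating:
    def __init__(self, inbound_plates=None, num_plate=250):
        self.num_plate = num_plate  # single place where the plate count is set
        if inbound_plates:
            self.inbound_plates = inbound_plates
        else:
            self.inbound_plates = generate_schedule(self.num_plate)
        self.inbound_clone = self.inbound_plates[:]

    def reset(self, hold=True):
        if not hold:
            # Random plates: draw a fresh list with the same count as in __init__.
            self.inbound_plates = generate_schedule(self.num_plate)
            self.inbound_clone = self.inbound_plates[:]
        else:
            # Assigned plates: restore the stored list.
            self.inbound_plates = self.inbound_clone[:]
        return self.inbound_plates


env = ToyLocating(num_plate=40)
plates = env.reset(hold=False)
```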
-
Configuration for Random Shuffle
In steelstockyard.py > Locating > reset()
As you can see in the code, Random Shuffle is meaningless for Random plates because a different plates list is generated for each episode
if not hold:
    self.inbound_plates = plate.generate_schedule()
    self.inbound_clone = self.inbound_plates[:]
-
-
External file plates
In train.py > main function
use
import_plates_schedule_by_week('file_dir/file_name.csv')
instead of generate_schedule()
inbounds = import_plates_schedule_by_week('../../environment/data/SampleData.csv')
locating = Locating(max_stack=max_stack, num_pile=num_pile, inbound_plates=inbounds, observe_inbounds=observe_inbounds, display_env=False)
-
Configuration for Random Shuffle
In steelstockyard.py > Locating > reset()
For External file plates, to shuffle the plates list randomly in every episode, add the code line
random.shuffle(self.inbound_plates)
to the else branch:
else:
    self.inbound_plates = self.inbound_clone[(episode-1) % len(self.inbound_clone)][:]
    random.shuffle(self.inbound_plates)
-
External file format (example)
You can find sample data in environment > data > SampleData.csv
-
-
Learning Rate
In train.py > main function
trainer = tf.train.AdamOptimizer(learning_rate=5e-5)
The recommended value for the learning rate is 5e-5
-
Discount Rate
In train.py > main function
gamma = .99 # discount rate for advantage estimation and reward discounting
The recommended value for the discount rate is 0.99
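To illustrate what the discount rate does, the sketch below computes discounted returns for a short list of rewards. It is a generic illustration of reward discounting, not the code in train.py; the function name discounted_returns and the reward values are made up.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every step t."""
    returns = []
    running = 0.0
    # Walk backwards so each return reuses the one computed for the next step.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


example = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
print(example)  # [1.75, 1.5, 1.0]
```

With gamma close to 1, as in the recommended 0.99, rewards many steps ahead still contribute noticeably to the return of the current step.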
-
Frequency of target model update (N-step bootstrapping)
In the A3C algorithm, each Worker collects samples in its own environment. After a certain number of time-steps, the target network (global network) is updated with those samples.
In train.py > main function
if len(episode_buffer) == 30 and d != True and episode_step_count != max_episode_length-1: # in this case, frequency of target model update is 30 time-steps
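The update schedule implied by this condition can be sketched as follows. This is a simplified illustration under stated assumptions, not the training loop itself: update_sizes and n_step_targets are hypothetical helpers, a mid-episode update is assumed to bootstrap from a value estimate of the next state, and the remaining samples are assumed to be used once at the end of the episode.

```python
def n_step_targets(rewards, bootstrap_value, gamma=0.99):
    """Discounted targets for a partial rollout, seeded with a value estimate."""
    targets = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        targets.append(running)
    targets.reverse()
    return targets


def update_sizes(num_steps, update_every=30):
    """Return the buffer length at each update for an episode of num_steps steps."""
    sizes = []
    buffer_len = 0
    for step in range(num_steps):
        buffer_len += 1
        done = step == num_steps - 1
        if buffer_len == update_every and not done:
            sizes.append(buffer_len)  # mid-episode update, bootstrapped
            buffer_len = 0
    if buffer_len:
        sizes.append(buffer_len)  # final update with whatever is left
    return sizes


print(update_sizes(70))  # [30, 30, 10]
```

Changing 30 to a smaller number gives more frequent updates from shorter rollouts; a larger number gives fewer updates from longer rollouts.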
-
Number of Threads
You can also set the number of threads by changing the number of Workers
In train.py > main function
num_workers = multiprocessing.cpu_count()  # Set workers to number of available CPU threads
if num_workers > 8:
    num_workers = 8
workers = []
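The same cap can be written as a small helper, which makes the upper limit easy to change. A minimal sketch; the function name choose_num_workers and its cap parameter are hypothetical and not part of train.py.

```python
import multiprocessing


def choose_num_workers(cap=8):
    """Use one worker per available CPU thread, but never more than cap."""
    return min(multiprocessing.cpu_count(), cap)


num_workers = choose_num_workers()
print(num_workers)
```

For example, choose_num_workers(cap=4) limits training to at most four Workers on a machine with more CPU threads.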
-