
DeepRM+



Based on the work of Hongzi Mao, HotNets '16: http://people.csail.mit.edu/hongzi/content/publications/DeepRM-HotNets16.pdf

Improvements are built on DeepRM: http://github.com/hongzimao/deeprm


Improvements to the algorithm structure:


Rebuilt the network as a convolutional neural network.

File: build_small_conv_pg_network in pg_network.py
Network structure:
Input: convolutional layer with 16 filters of size 2×2


Output: fully connected layer with one output per action

Major improvement. Improved the convergence rate (by ??? --> To Do)
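The layer shapes can be sketched in plain NumPy (all sizes here are illustrative; the actual implementation is build_small_conv_pg_network in pg_network.py, written with Theano/Lasagne):

```python
import numpy as np

def conv2d_valid(x, filters):
    """Valid-mode 2D convolution: x is (H, W), filters is (F, kh, kw)."""
    F, kh, kw = filters.shape
    H, W = x.shape
    out = np.empty((F, H - kh + 1, W - kw + 1))
    for f in range(F):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[f, i, j] = np.sum(x[i:i + kh, j:j + kw] * filters[f])
    return out

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
state = rng.random((20, 10))               # state image: time_horizon x width (illustrative)
conv_w = rng.standard_normal((16, 2, 2))   # 16 filters of size 2x2, as described above
hidden = np.maximum(conv2d_valid(state, conv_w), 0.0).ravel()  # ReLU + flatten
num_actions = 6                            # e.g. num_nw visible jobs + 1 void action (illustrative)
fc_w = rng.standard_normal((hidden.size, num_actions)) * 0.01
probs = softmax(hidden @ fc_w)             # fully connected output: one probability per action
```

The point is only the data flow: state image → 2×2 convolution with 16 filters → flatten → fully connected layer sized to the action space.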


Reshaped the state space.

File: environment.py


In DeepRM, the state space was generated by concatenating matrices horizontally in the following order:


state matrix for resource 1, job 1's request matrix for resource 1, job 2's request matrix for resource 1, ..., job n's request matrix for resource 1, state matrix for resource 2, job 1's request matrix for resource 2, job 2's request matrix for resource 2, ..., job n's request matrix for resource 2.



I decided to place related matrices closer together, so matrices are now stacked as follows:


First, concatenate horizontally: state matrix for resource 1, job 1's request matrix for resource 1, ..., job n's request matrix for resource 1; and likewise: state matrix for resource 2, job 1's request matrix for resource 2, ..., job n's request matrix for resource 2.
Then stack the two resulting long matrices vertically, so each job's request matrices for the two resources line up.
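The two layouts can be illustrated with NumPy (matrix sizes are invented for the example; the real construction lives in environment.py):

```python
import numpy as np

# Illustrative sizes: time_horizon T = 4, resource width W = 3, n = 2 visible jobs.
T, W, n = 4, 3, 2
state = {r: np.full((T, W), float(r)) for r in (1, 2)}                 # machine state per resource
jobs = {r: [np.full((T, W), 10 * r + k) for k in range(n)] for r in (1, 2)}  # job k's request per resource

# Original DeepRM layout: every matrix side by side in one wide strip.
original = np.hstack([state[1]] + jobs[1] + [state[2]] + jobs[2])

# Reshaped layout: one wide row per resource, the two rows stacked vertically,
# so job k's request matrices for the two resources are vertically adjacent.
row1 = np.hstack([state[1]] + jobs[1])
row2 = np.hstack([state[2]] + jobs[2])
reshaped = np.vstack([row1, row2])
```

In the reshaped image, the columns holding job k's resource-1 request sit directly above the columns holding its resource-2 request, which gives the 2×2 convolution filters spatially related inputs.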


See the pictures below for a better explanation:

Original state matrix (figure)

Reshaped state matrix (figure)


Major improvement. Reduced the average slowdown by 8.9% after 1000 epochs of training.

Rewrote the penalty function.

File: parameters.py


I assigned different penalty weights to jobs already scheduled (in the machine matrix), jobs in the job-slot queue, and jobs in the backlog.

Minor improvement. Improved convergence rate.
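A sketch of the idea (the weights and the function name here are hypothetical; the real values are set in parameters.py): instead of penalizing every unfinished job uniformly by -1/length as in DeepRM, each group of jobs gets its own weight.

```python
def weighted_penalty(running_lens, queued_lens, backlog_lens,
                     w_run=0.5, w_queue=1.0, w_backlog=2.0):
    """Per-timestep reward: each unfinished job contributes -w / job_length,
    with a group-specific weight w. The weights here are hypothetical,
    not the values used in parameters.py."""
    penalty = 0.0
    for w, lens in ((w_run, running_lens),
                    (w_queue, queued_lens),
                    (w_backlog, backlog_lens)):
        penalty += w * sum(1.0 / length for length in lens)
    return -penalty

# Jobs of length 2 and 4 are running, one of length 3 is queued,
# one of length 6 sits in the backlog.
reward = weighted_penalty([2, 4], [3], [6])
```

Giving backlog jobs a larger weight than already-scheduled jobs pushes the agent to drain the backlog, which plausibly explains the faster convergence noted above.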

Others



  • Added logging and checkpoint saving to record slowdown and save models (pg_re_single_core.py and pg_re.py)


  • Added launcher2 for convenient launching and debugging (launcher2.py)

Install prerequisites

sudo apt-get update
sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
pip install --user Theano
pip install --user Lasagne==0.1
sudo apt-get install python-matplotlib

Run code

In folder RL, create a data/ folder.

Use launcher.py to launch experiments.

--exp_type <type of experiment> 
--num_res <number of resources> 
--num_nw <number of visible new work> 
--simu_len <simulation length> 
--num_ex <number of examples> 
--num_seq_per_batch <rough number of samples in one batch update> 
--eps_max_len <episode maximum length (terminated at the end)>
--num_epochs <number of epochs to train>
--time_horizon <time step into future, screen height> 
--res_slot <total number of resource slots, screen width> 
--max_job_len <maximum new job length> 
--max_job_size <maximum new job resource request> 
--new_job_rate <new job arrival rate> 
--dist <discount factor> 
--lr_rate <learning rate> 
--ba_size <batch size> 
--pg_re <parameter file for pg network> 
--v_re <parameter file for v network> 
--q_re <parameter file for q network> 
--out_freq <network output frequency> 
--ofile <output file name> 
--log <log file name> 
--render <plot dynamics> 
--unseen <generate unseen example> 

The default variables are defined in parameters.py.

Example:

  • Launch supervised learning for policy estimation:
python launcher.py --exp_type=pg_su --simu_len=50 --num_ex=1000 --ofile=data/pg_su --out_freq=10 
  • Launch policy gradient using the network parameters just obtained:
python launcher.py --exp_type=pg_re --pg_re=data/pg_su_net_file_20.pkl --simu_len=50 --num_ex=10 --ofile=data/pg_re
  • Launch a testing and comparison experiment on unseen examples with the pg agent just trained:
python launcher.py --exp_type=test --simu_len=50 --num_ex=10 --pg_re=data/pg_re_1600.pkl --unseen=True

About

Based on Hongzi Mao's works of deeprm: https://github.com/hongzimao/deeprm
