
Alphachu: Ape-x DQN implementation of Pikachu Volleyball

[Demo] [Paper]

This project trains agents to play Pikachu Volleyball. The architecture is based on Ape-x DQN from the paper. The game is distributed as an exe file, which makes the whole problem much more complicated than the usual Atari games. I built a Python environment that takes screenshots of the game to use as the state and detects the start and end of each game. I used mss to take screenshots, cv2 to preprocess the images, pynput to press keys, and tensorboardX to record logs. I created one virtual monitor per actor with Xvfb. Because each monitor needs its own key input, the architecture had to be multi-process. A single learner only trains on the GPU, while many actors (assume 10) collect data from the virtual monitors. They communicate through files in the log directory.

As it sounds, the setup is complicated. The method is fairly primitive, but it was the only way I found to train on Pikachu Volleyball.
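The file-based hand-off between the actors and the learner can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the file naming scheme, the pickle format, and the function names `atomic_write`, `actor_dump`, and `learner_collect` are all assumptions made for the example.

```python
import os
import pickle
import tempfile


def atomic_write(obj, path):
    """Pickle obj to path via a temp file so readers never see a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(obj, f)
    # Rename is atomic when source and target are on the same filesystem.
    os.replace(tmp_path, path)


def actor_dump(log_dir, actor_id, step, transitions):
    """Actor side: write one batch of transitions as its own file."""
    # Hypothetical naming scheme: actor id plus a step counter.
    name = "actor{:02d}_{:08d}.pkl".format(actor_id, step)
    path = os.path.join(log_dir, name)
    atomic_write(transitions, path)
    return path


def learner_collect(log_dir):
    """Learner side: read and delete every finished batch file."""
    batches = []
    for name in sorted(os.listdir(log_dir)):
        if not name.endswith(".pkl"):
            continue  # skip temp files that are still being written
        path = os.path.join(log_dir, name)
        with open(path, "rb") as f:
            batches.extend(pickle.load(f))
        os.remove(path)
    return batches
```

Writing to a temporary file and renaming it means the learner only ever sees complete files, which matters when separate processes share a directory without any locking.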


Before you start

  • I tested this on Ubuntu and Mac.
  • Reset log_directory and data_directory in, and


  • Install PyTorch dependencies from
  • Install requirements.txt (pip install -r requirements.txt)
  • Install Xvfb (sudo apt-get install xvfb -y)

Creating Virtual Monitors with Xvfb

Repeat this 10 times to create the virtual monitors, using a different display number (:99 in this example) for each one.

Xvfb :99 -ac -screen 0 1280x1024x24 > /dev/null &
echo "export DISPLAY=:99" >> ~/.bashrc
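Starting the ten Xvfb servers by hand is repetitive, so a small launcher can help. The sketch below mirrors the Xvfb command shown above. The consecutive display numbers starting at 99 and the function names `xvfb_command` and `start_virtual_monitors` are assumptions for illustration, not part of the repository.

```python
import subprocess


def xvfb_command(display, width=1280, height=1024, depth=24):
    """Build the argv for one Xvfb server with a single screen."""
    geometry = "{}x{}x{}".format(width, height, depth)
    return ["Xvfb", ":{}".format(display), "-ac", "-screen", "0", geometry]


def start_virtual_monitors(count=10, first_display=99):
    """Start `count` Xvfb servers on consecutive display numbers.

    Returns a dict mapping display number to its Popen handle so the
    servers can be terminated later.
    """
    servers = {}
    for display in range(first_display, first_display + count):
        servers[display] = subprocess.Popen(
            xvfb_command(display),
            stdout=subprocess.DEVNULL,  # same effect as "> /dev/null"
        )
    return servers
```

Calling `start_virtual_monitors()` would start displays :99 through :108. Each actor then selects its monitor through the DISPLAY environment variable instead of a global export in ~/.bashrc.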

Run learner

Run the learner and copy the model directory name it prints, which consists of a timestamp followed by the configuration.

python --actor-num 10
Learner: Model saved in  /home/sungwonlyu/experiment/alphachu/180801225440_256_0.0001_4_84_129_32_1_30000_1500_10/

Run actors

Run pika.exe and an actor on each virtual monitor. Do this 10 times, once per display, with a different epsilon for each actor.

DISPLAY=:99 wine pika.exe
DISPLAY=:99 python --load-model 180801225440_256_0.0001_4_84_129_32_1_30000_1500_10 --epsilon 0.9 --wepsilon 0.9
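For choosing the per-actor epsilons, one option is the schedule from the Ape-X paper, where actor i out of N uses epsilon_i = epsilon^(1 + alpha * i / (N - 1)) with epsilon = 0.4 and alpha = 7. The README does not say which values were actually used here (the example command uses 0.9), so treat the sketch below as one possible choice. The function name `apex_epsilons` is made up for the example.

```python
def apex_epsilons(num_actors, base=0.4, alpha=7.0):
    """Per-actor exploration rates following the Ape-X paper's schedule.

    Actor 0 explores the most (epsilon = base) and the last actor the
    least (epsilon = base ** (1 + alpha)).
    """
    if num_actors < 1:
        raise ValueError("num_actors must be at least 1")
    if num_actors == 1:
        return [base]
    return [
        base ** (1 + alpha * i / (num_actors - 1))
        for i in range(num_actors)
    ]
```

With 10 actors, the values fall from 0.4 for the first actor to 0.4 to the power of 8 for the last, so some actors keep exploring while others act almost greedily.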


To see the performance of the agent, reset screen-size so that it covers the area of the screen to capture. Then move the pika.exe window into that area and start an actor with the trained model.

wine pika.exe
python --load-model 180801225440_256_0.0001_4_84_129_32_1_30000_1500_10 --test



You can find a demo on YouTube.


Graphs for the first 7 days, smoothed with a factor of 0.99.
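Assuming the 0.99 refers to an exponential moving average like the one behind TensorBoard's smoothing slider, the smoothing can be sketched as below. The function name `smooth` is made up for the example, and TensorBoard's own implementation may differ in details such as how the first values are handled.

```python
def smooth(values, weight=0.99):
    """Exponential moving average: s_t = weight * s_(t-1) + (1 - weight) * x_t.

    The first output equals the first input.
    """
    smoothed = []
    last = None
    for value in values:
        if last is None:
            last = value
        else:
            last = weight * last + (1 - weight) * value
        smoothed.append(last)
    return smoothed
```

With a weight of 0.99, each new point contributes only 1% to the curve, so the plots show long-term trends rather than per-episode noise.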







Max Value




Total reward

My score - computer score (-15 ~ 15)
