Meeting 1 December 2017

Meeting Minutes - Project Group 8

Location: 2.002 Date: 1 December 2017
Time: 9:32 - 10:06

Attendance

Mark Winands
Joshua Scheidt
Marciano Geijselaers
Max Meijers
Simon Craenen (late)

Absent: Timo Raff (overslept)

Meeting:

Explanation of how approach works

Autoencoder, how many hidden layers do we need..?
- Three values feels like too few
- Variational autoencoders
- Might want to run on GPU -> need CUDA and all
Python Keras framework
- Already in place
Not using planning track software, because Kurt said so
- Easier to get states in planning though
- Might want to switch back to it
Problem with square distance going to origin
- Maybe rewrite on our own
  - Send email, include Mark in CC
  - With already modified code, working, to push the process along
Accuracy still poor
- Don't know how many nodes needed
Connection between Java and Python
- Maven?
Good thing that we're using our own software, but that does mean that participation in the competition is not an option anymore.
- But we weren't planning on participating anymore anyway.
If needed, go from model <s,a,s',r> to <s,a,s,a>
Accuracy between 0.24 and 0.33.
- Not bad, but...:
- Mostly zeros and 121 in a few places.
Gradient normalisation: DEFINITELY something to implement!
- Has to be used to prevent the values from "flying off the rails"
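
One common way to keep values from "flying off the rails" is to clip each gradient by its L2 norm before the weight update. Whether this is exactly what was meant by gradient normalisation is an assumption, and the function name `clip_by_norm` and the threshold `max_norm` are hypothetical. A minimal sketch in plain Python:

```python
import math

def clip_by_norm(grads, max_norm):
    """Rescale a gradient vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    # Small gradients (and the all-zero gradient) are returned unchanged.
    if norm <= max_norm or norm == 0.0:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]

# A gradient with norm 50 is scaled down to norm 5, keeping its direction.
clipped = clip_by_norm([30.0, 40.0], max_norm=5.0)
```

Keras optimizers accept a `clipnorm` argument that applies the same idea, so a hand-written version is probably only needed outside Keras.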
Softmax, possible approach for action choice.
- Probabilities of all possible actions sum to 1 -> choose the best from these
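
As a rough illustration of the softmax idea: the raw network outputs for the actions are turned into probabilities that sum to 1, and the action with the highest probability is picked. The function names and example scores below are made up for illustration and are not taken from the project code.

```python
import math

def softmax(scores):
    """Turn raw action scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def choose_action(scores):
    """Return the index of the action with the highest probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical scores for five actions.
probs = softmax([1.0, 3.0, 2.0, 0.5, -1.0])
best = choose_action([1.0, 3.0, 2.0, 0.5, -1.0])
```

Picking the maximum is the greedy choice. Sampling from `probs` instead would keep some exploration.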
We're using regression, but that's expensive!
- Use classification if possible: alternative approach (goes against Kurt?)
  - Use MCTS player from planning track to observe and learn best actions for use in classification
  - Put learned neural net back into MCTS, to influence exploration, and observe and learn more from it!
  - When in a certain state, give the network which action would be best
    - Check legality though!
  - Alternative extension
    - 5 nodes on input, 5 nodes on output, indicating whether an action is used.
    - Instead of <s,a,r>, return <s,a> (This is not compatible with Q-learning anymore)
    - This is a policy network, instead of the value network we have now
  - MCTS already used to gather data, so extend this
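
The legality check for a policy network with five output nodes could look roughly like the sketch below: the network gives one score per action, and only actions that are legal in the current state are considered. The function name, the example scores and the set of legal actions are all hypothetical; how legal actions are obtained from the game is not covered here.

```python
def best_legal_action(action_scores, legal_actions):
    """Pick the highest-scoring action among the legal ones.

    action_scores: one score per action (e.g. the 5 output nodes).
    legal_actions: indices of the actions allowed in the current state.
    """
    candidates = [a for a in legal_actions if 0 <= a < len(action_scores)]
    if not candidates:
        raise ValueError("no legal action available")
    return max(candidates, key=lambda a: action_scores[a])

# Hypothetical output of the 5 nodes; action 1 scores highest but is illegal here.
scores = [0.10, 0.60, 0.05, 0.20, 0.05]
chosen = best_legal_action(scores, legal_actions={0, 3, 4})
```

Filtering before taking the maximum means an illegal action is never returned, even if the network prefers it.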
Aachen cluster?
- Simply used for optimising weights
- Can put computations onto alllllll CPUs on the cluster if multi-core
Report period 2?
- Not necessarily now, but it's good to write things down before we forget
  - Problems we run into
  - Choices we make to get around them
  - Possible future research ideas
