# RLNN Tetris

*by Federico Larrieu and Tyson O'Leary, May 5, 2023*

## Introduction

We both really enjoy Tetris, and we want to apply reinforcement learning to a more complex problem than those we have seen in class. Tetris should hold our interest throughout the project, and the research we have done suggests that it will not be a trivial problem.

For our project, we will create an AI that plays Tetris. It will use a neural network and reinforcement learning to learn to play Tetris optimally. This will require us to either find an implementation of Tetris or implement the game ourselves, and then define the states and actions to be used by the Q-net.

The questions we will seek to answer:
  * What is the highest number of cleared lines our AI can reach?
  * What is the highest score our AI can achieve?
  * Can we split this problem into multiple neural networks to solve different parts of the problem?

We hypothesize that directly reusing the reinforcement learning methods we learned in class will most likely not produce satisfactory results. We plan to explore multiple definitions for the actions and states, as well as multiple architectures, such as an ensemble model.


## Methods

We will build on the neural networks we created in class to make our reinforcement learning networks. We also already have an implementation of Tetris that we can use to test and present our networks, and possibly to train them.

To create an AI for Tetris, we have decided to decompose the game into two separate problems. The first is deciding the best placement for a piece. The second is moving the piece into a specified position and rotation. If time permits, we may also introduce additional features, such as a piece selection strategy. Splitting the Tetris strategy into two individual problems allows us to assign each one to its own reinforcement learning neural network. We will then combine both networks in an ensemble model architecture.

The ensemble model works as follows. First, a still frame of the game is given as input. If a piece is held, we run both the held piece and the current piece through the position neural network. If no piece is held, we run the next piece and the current piece through the position neural network instead. In either case, the position of the piece with the better placement is output. Next, that best position is used as part of the input to the movement neural network, which decides the next action to take to move the piece toward that position.

![EnsembleModel](EnsembleModel.drawio.svg)
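The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `choose_action`, the `(column, rotation)` placement format, the idea that the position network returns a score alongside its placement, and the two stub networks are all hypothetical stand-ins, not the project's actual code.

```python
import numpy as np


def choose_action(board, current_piece, next_piece, held_piece,
                  position_net, movement_net):
    """Return (piece, target placement, next action) for one still frame.

    Assumed interfaces (hypothetical):
      position_net(board, piece) -> ((column, rotation), score)
      movement_net(board, piece, target) -> action name
    """
    # Compare the current piece with the held piece if one exists,
    # otherwise with the next piece.
    alternative = held_piece if held_piece is not None else next_piece
    candidates = [current_piece, alternative]

    # Score each candidate with the position network and keep the best one.
    scored = [(piece,) + position_net(board, piece) for piece in candidates]
    piece, target, _ = max(scored, key=lambda item: item[2])

    # The movement network picks the next action toward the chosen target.
    action = movement_net(board, piece, target)
    return piece, target, action


# Stub networks so the sketch runs; real networks would replace these.
def stub_position_net(board, piece):
    scores = {"T": 0.2, "I": 0.9, "O": 0.5}
    return (3, 1), scores[piece]


def stub_movement_net(board, piece, target):
    return "left"


board = np.zeros((20, 10), dtype=int)
with_hold = choose_action(board, "T", "O", "I",
                          stub_position_net, stub_movement_net)
without_hold = choose_action(board, "T", "O", None,
                             stub_position_net, stub_movement_net)
```

Keeping the two networks behind plain function interfaces means either one can be retrained or swapped out without changing the control flow.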

### Position Reinforced Neural Network

The network that learns to find the best possible end position of a piece will take all pixels of the board and the type of piece as input, and output the best landing position and rotation. In each training epoch, the network will try a possible position and receive a reward based on a few factors. These might include the height of the piece, whether it creates holes, and the number of lines it would clear.

![PositionRLNN](PositionRLNN.drawio.svg)
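One possible way to encode the position network's input and output is sketched below. It assumes a standard 20 by 10 board stored as one value per cell (rather than raw screen pixels), a one-hot vector for the seven tetromino types, and one network output per (column, rotation) pair. The names `encode_state` and `decode_output` are hypothetical.

```python
import numpy as np

PIECES = ["I", "O", "T", "S", "Z", "J", "L"]  # the seven standard tetrominoes
BOARD_ROWS, BOARD_COLS = 20, 10               # assumed standard board size
NUM_ROTATIONS = 4                             # assumed rotations per piece
NUM_OUTPUTS = BOARD_COLS * NUM_ROTATIONS      # one output per (column, rotation)


def encode_state(board, piece):
    """Flatten the board cells and append a one-hot vector for the piece type."""
    one_hot = np.zeros(len(PIECES))
    one_hot[PIECES.index(piece)] = 1.0
    return np.concatenate([np.asarray(board, dtype=float).ravel(), one_hot])


def decode_output(index):
    """Map a flat output index back to a (column, rotation) pair."""
    column, rotation = divmod(index, NUM_ROTATIONS)
    return column, rotation


empty_board = np.zeros((BOARD_ROWS, BOARD_COLS))
state = encode_state(empty_board, "T")
```

With this encoding the input vector has 200 board values plus 7 piece values, and taking the index of the largest output and passing it to `decode_output` yields the chosen landing column and rotation.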

![PositionTraining](PositionTraining.drawio.svg)
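The reward factors named above (piece height, holes, and cleared lines) could be combined roughly as follows. This is a minimal sketch rather than the project's actual reward: the function names, the weights, and the convention that the board is a 2D array of 0s and 1s with row 0 at the top are all assumptions.

```python
import numpy as np


def count_holes(board):
    """Count empty cells with at least one filled cell above them in the same column."""
    filled_above = np.cumsum(board, axis=0) > 0
    return int(np.sum(filled_above & (board == 0)))


def count_full_lines(board):
    """Count rows in which every cell is filled."""
    return int(np.sum(np.all(board == 1, axis=1)))


def placement_reward(board_before, board_after, landing_row,
                     w_height=0.5, w_holes=1.0, w_lines=10.0):
    """Score a placement; board_after holds the placed piece before lines are cleared.

    The weights are illustrative guesses and would need tuning.
    """
    landing_height = board_after.shape[0] - landing_row  # row 0 is the top
    new_holes = count_holes(board_after) - count_holes(board_before)
    lines = count_full_lines(board_after)
    return w_lines * lines - w_holes * new_holes - w_height * landing_height


# Tiny 4x4 example: filling the last gap in the bottom row clears one line.
before = np.zeros((4, 4), dtype=int)
before[3] = [1, 1, 1, 0]
after = before.copy()
after[3, 3] = 1
reward = placement_reward(before, after, landing_row=3)
```

Penalizing only the holes that a placement newly creates keeps the reward tied to the current move instead of to mistakes made earlier in the game.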

### Movement Reinforced Neural Network

Steps I took.  Resources I used, such as code from the class, on-line resources, research articles, books [Goodfellow, et al., 2016], ....

REQUIRED: If this is a team project, clearly describe in detail what each team member did.

## Results

Show all results.  Intermediate results might be shown in above Methods section.  Plots, tables, whatever.

## Conclusions

What I learned.  What was difficult.  Changes I had to make to timeline.

### References

* [Goodfellow, et al., 2016] Ian Goodfellow and Yoshua Bengio and Aaron Courville, [Deep Learning](http://www.deeplearningbook.org), MIT Press. 2016.

Your report should contain **approximately** 2,000 words per team member, in markdown cells.  You can count words by running the following Python code in your report directory.  A project with two people, for example, should contain about 4,000 words, and a four-person team should submit a report of approximately 8,000 words.

Of course, your results and analysis speak much more than the word count.  Deep analysis in a shorter form is better than vague, over-wordy non-analysis.

```python
import glob
import io

import nbformat  # the old `nbformat.current` module is deprecated

nbfile = glob.glob('Project Report Example.ipynb')
if len(nbfile) > 1:
    print('More than one ipynb file. Using the first one.  nbfile=', nbfile)
with io.open(nbfile[0], 'r', encoding='utf-8') as f:
    # Read the notebook as format version 4, where cells sit at the top level.
    nb = nbformat.read(f, as_version=4)
word_count = 0
for cell in nb.cells:
    # Only markdown cells count toward the report length.
    if cell.cell_type == "markdown":
        word_count += len(cell['source'].replace('#', '').lstrip().split(' '))
print('Word count for file', nbfile[0], 'is', word_count)
```

Word count for file Project Report Example.ipynb is 251



