# Mimicking Star Wars Characters Using Neural Networks

Hello! The aim of this notebook is to implement neural networks capable of generating dialogue in the style of different Star Wars characters. We will implement them using the Keras library with the TensorFlow backend.

![IMG](https://media.giphy.com/media/3ornk57KwDXf81rjWM/giphy.gif)

Well, first things first: let's import our libraries:

In [None]:
## Importing packages

# This R environment comes with all of CRAN and many other helpful packages preinstalled.
# You can see which packages are installed by checking out the kaggle/rstats docker image: 
# https://github.com/kaggle/docker-rstats

options(warn = -1)

library(wordcloud) # Word Clouds
library(jpeg) # Read .JPEG
library(circlize) # Chord Diagrams
library(png) # Read .PNG
library(RCurl) # Get contents from URLs
library(RColorBrewer) # Color Scheme to the Word Cloud
library(scales) # Custom scales to Ggplot
library(repr) # Resize R plots in Jupyter Notebooks
library(tidyverse) # Metapackage with lots of helpful functions
library(tensorflow) # Deep learning backend
library(kerasR) # Use Keras' Early Stopping
library(caret) # Use in K-Fold split
library(keras) # API that interacts with tensorflow
library(gridExtra) # Multiple ggplots same figure
library(magrittr) # Enable pipe operator

options(warn = 0)

## Reading in files

# You can access files from datasets you've added to this kernel in the "../input/" directory.
# You can see the files added to this kernel by running the code below. 

list.files(path = "../input")


In [None]:
input.dir <- '/kaggle/input/star-wars-movie-scripts/'
options(repr.plot.width=8)
Sys.setenv(RETICULATE_PYTHON = "/usr/local/share/.virtualenvs/r-reticulate/bin/")

The basic steps we will follow are:
1. Analyse the Dataset
2. Prepare the Data
3. Implement the Models
4. Test the Models
5. Draw Conclusions

# 1.A. Data Analysis - PART I: Checking the File
Let's start by taking a look at the input data:

In [None]:
filterText <- function(text.in) {
    # Keep only letters, digits, spaces and apostrophes, then lowercase and split into words
    strsplit(gsub('[^[:alnum:] \']', '', text.in) %>% tolower, ' ') %>%
    unlist %>% (function(X) X[X != '']) %>% return
}

formatScriptText <- function(file.name) {
    
    read.delim(paste(c(input.dir, file.name), collapse = ''), stringsAsFactors = FALSE) %>% 

    mutate(words.list = lapply(character.dialogue, filterText)) %>% 
    mutate(ID = lapply(words.list, (function(X) X[1]))) %>%
    mutate(Character = lapply(words.list, (function(X) X[2])) %>% unlist) %>%
    mutate(Words.List = lapply(words.list, (function(X) tail(X, -2)))) %>%
                                            
    select(c('ID', 'Character', 'Words.List')) %>% return
}
                                            
'SW_EpisodeIV.txt' %>% formatScriptText %>% head

`read.delim` loads each script as a one-column data frame with one raw line per row. By cleaning and splitting each line, we recover the ID of the dialogue line, the character name and the spoken words.
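
As a rough illustration of this processing, here is a base-R sketch that splits one raw line into its ID, character and words. The sample line is made up for illustration, and `clean_line` is a hypothetical helper, not part of the notebook's pipeline.

```r
# Hypothetical raw line in the shape assumed here: ID, character, dialogue
raw_line <- '"12" "LUKE" "But I was going into Tosche Station!"'

# Drop everything except letters, digits, spaces and apostrophes, then lowercase
clean_line <- function(line) {
  cleaned <- tolower(gsub("[^[:alnum:] ']", "", line))
  words <- strsplit(cleaned, " ")[[1]]
  words[words != ""]
}

tokens <- clean_line(raw_line)
parsed <- list(ID = tokens[1], Character = tokens[2], Words = tail(tokens, -2))
print(parsed)
```

The first two tokens become the ID and the character name, and whatever remains is the list of spoken words.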

Then we can put the data frame of each episode into the same structure and add a column that records the episode of each dialogue line:

In [None]:
df_ep4 <- formatScriptText('SW_EpisodeIV.txt')
df_ep5 <- formatScriptText('SW_EpisodeV.txt')
df_ep6 <- formatScriptText('SW_EpisodeVI.txt')

df_ep4$Episode <- '4'
df_ep5$Episode <- '5'
df_ep6$Episode <- '6'

df_all <- bind_rows(bind_rows(df_ep4, df_ep5), df_ep6)
head(df_all)

In a movie, some characters rarely appear, and we need a good amount of words to train a decent model, so we keep only the characters that appear more than a minimum number of times:

In [None]:
min_appearance = 100
df_main_characters <- df_all %>% 
                      group_by(Character) %>% 
                      mutate(count = n()) %>%
                      filter(count > min_appearance)
ggplot(df_main_characters, aes(x = Character, fill = Character)) + 
    geom_bar(stat = 'count')

There is a **HUGE** problem here: where is Yoda? In our model should Yoda be.

![YODA](https://media.giphy.com/media/3ohuAxV0DfcLTxVh6w/giphy.gif)

So, we are going to add an extra condition to our filter: if the Character equals Yoda, then the row will be considered in our analysis!

In [None]:
df_main_characters <- df_all %>% 
                      group_by(Character) %>% 
                      mutate(count = n()) %>%
                      filter(count > min_appearance | Character == 'yoda') %>%
                      select(-c('count'))

ggplot(df_main_characters, aes(x = Character, fill = Character)) + geom_bar(stat = 'count')

Much better! Actually, we are not interested in the number of appearances; we want the number of spoken words. But the number of appearances is a proxy for this quantity, as we can check below:

In [None]:
df_main_characters <- df_main_characters %>% mutate(N.Words = lapply(Words.List, (function(X) length(X))) %>% unlist)
df_main_characters_statistics <- df_main_characters %>% 
                                 group_by(Character) %>% 
                                 summarise(Total.Words = sum(N.Words), N.Appearances = n()) %>%
                                 mutate(Words.Per.Appearance = Total.Words / N.Appearances) %>%
                                 as.data.frame
                                                                                  
df_main_characters_statistics

The number of words per appearance doesn't seem to change much between characters. Also, `Total.Words` seems to follow the `N.Appearances` variable.

In [None]:
gg1 <- df_main_characters_statistics %>% ggplot(aes(x = Character, y = N.Appearances, fill = Character)) + geom_bar(stat = 'identity')
gg2 <- df_main_characters_statistics %>% ggplot(aes(x = Character, y = Total.Words, fill = Character)) + geom_bar(stat = 'identity')
gg3 <- df_main_characters %>% ggplot(aes(x = Character, y = N.Words, fill = Character)) + 
                                geom_jitter(aes(alpha = 0.9999999)) + geom_boxplot() + scale_y_continuous(trans = log2_trans())

gg1 <- gg1 + theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank())
gg2 <- gg2 + theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank())
gg3 <- gg3 + theme(axis.title.x = element_blank(), axis.text.x = element_blank(), axis.ticks.x = element_blank())

options(repr.plot.width = 18)
grid.arrange(gg1, gg2, gg3, ncol = 3)
options(repr.plot.width = 8)

In fact, the number of words seems to be strongly correlated with the number of appearances. We can check it by taking the Pearson correlation:

In [None]:
df_all <- df_all %>% mutate(N.Words = lapply(Words.List, (function(X) length(X))) %>% unlist)
df_all_statistics <- df_all %>% group_by(Character) %>% summarise(Total.Words = sum(N.Words), N.Appearances = n()) %>%
                     mutate(Words.Per.Appearance = Total.Words / N.Appearances) %>% as.data.frame
                                                                      
cor(x = df_all_statistics$Total.Words, y = df_all_statistics$N.Appearances, method = 'pearson')

The correlation is about $0.986$ ($98.6 \%$), so the number of appearances is a reliable proxy. We are on the right track!
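
For reference, the Pearson coefficient is $r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}$. The sketch below computes it by hand and compares it with `cor`. The numbers are toy values chosen for illustration, not taken from the scripts, and `pearson_manual` is a hypothetical helper.

```r
# Toy numbers (hypothetical, not taken from the scripts)
n_appearances <- c(10, 25, 40, 80, 120)
total_words   <- c(95, 260, 390, 850, 1150)

# Pearson correlation from its definition:
# co-variation divided by the product of the individual variations
pearson_manual <- function(x, y) {
  dx <- x - mean(x)
  dy <- y - mean(y)
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

r_manual  <- pearson_manual(n_appearances, total_words)
r_builtin <- cor(n_appearances, total_words, method = "pearson")
print(c(manual = r_manual, builtin = r_builtin))
```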

# 1.B. Data Analysis - PART II: Checking the Data

What about the content of the dialogues? We can browse them with word clouds, split by character and by Star Wars episode. These will be useful later for comparing the generated texts with each cloud, and we also get to learn a little more about the good old Star Wars trilogy!

As an example, we will show how Luke's vocabulary evolves from episode to episode. Later, during the testing step, we can compare the word clouds with the output of each neural network.

In [None]:
df_per_character_and_ep <- df_main_characters %>% 
                           select(c('Character', 'Episode', 'Words.List')) %>%
                           group_by(.dots = c('Character', 'Episode')) %>%
                           summarise(Full.Text = Reduce(c, Words.List) %>% paste(collapse = ' ')) %>% 
                           as.data.frame

df_per_character <- df_main_characters %>% 
                    select(c('Character', 'Episode', 'Words.List')) %>%
                    group_by(.dots = c('Character')) %>%
                    summarise(Full.Text = Reduce(c, Words.List) %>% paste(collapse = ' ')) %>% 
                    as.data.frame

df_per_ep <- df_main_characters %>% 
             select(c('Character', 'Episode', 'Words.List')) %>%
             group_by(.dots = c('Episode')) %>%
             summarise(Full.Text = Reduce(c, Words.List) %>% paste(collapse = ' ')) %>% 
             as.data.frame

characters_list <- unique(df_per_character_and_ep$Character)
episodes_list <- unique(df_per_character_and_ep$Episode)

In [None]:
n.Characters <- nrow(df_per_character)
print.Word.Cloud <- function(character_name = 'ALL', ep = 'ALL'){
    if (ep == 'ALL') {
        df_filter <- df_per_character %>% filter(Character == character_name)
    } else if (character_name == 'ALL') {
        df_filter <- df_per_ep %>% filter(Episode == ep)
    } else {
        df_filter <- df_per_character_and_ep %>% filter(Character == character_name, 
                                                        Episode == ep)
    }
    wordcloud(df_filter$Full.Text[1], max.words = 100, min.freq = 0, colors = brewer.pal(8, "Dark2"))
}

## 1.B1 - Luke on Ep. IV - Word Cloud

In [None]:
print.Word.Cloud('luke', '4')

It seems that Luke is really concerned with C3PO in Episode IV.

## 1.B2 - Luke on Ep. V - Word Cloud

In [None]:
print.Word.Cloud('luke', '5')

In this Episode, Yoda appears and R2D2 (artoo) is repaired...so, it does make sense!

## 1.B3 - Luke on Ep. VI - Word Cloud

In [None]:
print.Word.Cloud('luke', '6')

Yes! In this episode Luke has it confirmed that Darth Vader is **HIS FATHER**! So FATHER is simply Luke's most spoken word (after filtering out words like articles and prepositions).

![YAY](https://i0.wp.com/www.meugamer.com/wp-content/uploads/2018/05/luke-i-am-your-father-meugamer.jpg?fit=800%2C450&ssl=1)

## 1.C. Check Mutual Character Relationships
Would it be possible to find a measure of the mutual relationships among different characters? Sure! We can, for instance, count how many times one character says another character's name and plot the counts in a chord diagram. That's what we are going to do in this section.
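
Before plotting, the counting idea can be sketched in base R on a tiny made-up corpus. The texts and the helper `count_mentions` are hypothetical. The sketch counts whole words only, so a name such as "han" is not counted inside "than".

```r
# Hypothetical mini-corpus: one concatenated text per character
texts <- c(
  luke = "han is better than me at flying ask leia",
  han  = "luke the kid is learning",
  leia = "han i love you luke is my brother"
)

# Count whole-word occurrences of `name` in `text`
count_mentions <- function(text, name) {
  words <- strsplit(text, " ")[[1]]
  sum(words == name)
}

characters <- names(texts)
mentions <- matrix(0, length(characters), length(characters),
                   dimnames = list(from = characters, to = characters))
for (from in characters) {
  for (to in setdiff(characters, from)) {
    mentions[from, to] <- count_mentions(texts[[from]], to)
  }
}
print(mentions)
```

Rows are the speakers and columns are the mentioned characters, which is the kind of adjacency matrix a chord diagram takes as input.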

### 1.C.1 Mentions Diagram - All Episodes

In [None]:
get.Mentions.Number <- function(from_character, to_character, ep='ALL') {
    if (ep == 'ALL') {
        df_filter <- df_per_character %>% filter(Character == from_character)
    } else {
        df_filter <- df_per_character_and_ep %>% filter(Character == from_character, 
                                                        Episode == ep)
    }
    # Match whole words only, so that e.g. 'han' is not counted inside 'than'
    out <- if (length(df_filter$Full.Text) > 0) (str_count(df_filter$Full.Text[[1]], paste0('\\b', to_character, '\\b')) %>% sum) else (0)
    return(out)
}

plot.Mentions = function(ep='ALL') {
    character_list <- df_per_character[['Character']] %>% unique
    adj_mention_matrix <- matrix(0, character_list %>% length, character_list %>% length)

    rownames(adj_mention_matrix) <- character_list
    colnames(adj_mention_matrix) <- character_list

    for (character1 in character_list) {
        for (character2 in character_list) {
            if (character1 != character2) {
                adj_mention_matrix[[character1, character2]] <- get.Mentions.Number(character1, character2, ep)
            }
        }
    }
    
    par(cex=2.5)
    return(
        adj_mention_matrix %>% chordDiagram(annotationTrack = c("name", "grid"), 
                                        annotationTrackHeight = c(0.03, 0.01))
    )
}

plot.Mentions()

We can also check the mentions per episode:

### 1.C.2. Mentions Diagram - Episode IV

In Episode IV we have few mentions. We can notice a strong relation between Luke and Threepio and between Luke and Han. Also, Yoda does not appear in this episode.

In [None]:
plot.Mentions(4)

### 1.C.3. Mentions Diagram - Episode V

The relations become more complex, and it seems that everybody is always mentioning Luke Skywalker. This kind of measure could be useful for detecting the protagonist of a story, for instance.

In [None]:
plot.Mentions(5)

### 1.C.4. Mentions Diagram - Episode VI

A strong relation between Luke Skywalker and Han Solo can be noticed in Episode VI.

In [None]:
plot.Mentions(6)



# 2. Data Preparation


To train the neural network, we need to encode each word as a distinct integer and organize the codes into windows, after converting the words list column to a tensor of word codes. Judging by the boxplot, a window of size 5 is a reasonable choice: it's not too small, and most dialogue lines yield more than one window.

If a dialogue line has fewer than 5 words, we pad it with "empty" tokens: "$5 - N_{Words}$" tokens need to be appended to the tensor. We also have to remove non-significant characters and words from the list before doing all this work.
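
The padding rule can be sketched as follows. This is only an illustration: `pad_tokens` is a hypothetical helper, and using code 0 for the "empty" token is an assumption (Keras word codes start at 1, so 0 is free).

```r
# Pad a vector of token codes up to `window_size` with a filler code.
# Assumption: code 0 stands for the "empty" token.
pad_tokens <- function(tokens, window_size = 5, pad_code = 0) {
  n_missing <- max(0, window_size - length(tokens))
  c(tokens, rep(pad_code, n_missing))
}

short_line <- c(149, 4, 375)            # three hypothetical word codes
print(pad_tokens(short_line))           # two pad codes are appended
print(pad_tokens(c(1, 2, 3, 4, 5, 6)))  # already long enough: returned unchanged
```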

In [None]:
all.words <- reduce(df_all['Words.List'], c)
tokenizer <- text_tokenizer()
tokenizer$fit_on_texts(all.words)

Let's check an example of an encoded window:

In [None]:
texts_to_sequences(tokenizer, 
                   c('much', 'to', 'learn', 'you', 'still', 'have')) %>% unlist %>% print

Then, we can create a dictionary mapping each found code to a different word:

In [None]:
all.words.unique <- all.words %>% unlist %>% unique
words.token <- texts_to_sequences(tokenizer, all.words.unique) %>% unlist

token.Dict <- data.frame(Word = all.words.unique, Code = words.token) %>% arrange(Code)
token.Dict %>% head
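
To make the idea of a word dictionary concrete, here is a small base-R sketch that builds one by hand. To my understanding, the Keras tokenizer assigns codes starting at 1 in order of decreasing frequency. The sketch follows that spirit, but the tie-breaking rule (first-seen order) is an assumption.

```r
words <- c("the", "force", "is", "with", "you", "the", "force", "the")

# Count each word, then rank by descending frequency (ties keep first-seen order)
counts <- table(factor(words, levels = unique(words)))
ranked <- names(counts)[order(-as.integer(counts))]
word_index <- setNames(seq_along(ranked), ranked)
print(word_index)

# Encode and decode a short phrase with the dictionary
encoded <- unname(word_index[c("the", "force")])
decoded <- names(word_index)[encoded]
print(decoded)
```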

For each character's text we take sliding windows of size $6$ with a step of $1$. The first $5$ codes of each window are the inputs of our model and the remaining code provides the next word to predict.
![Windows](https://i2.wp.com/techieme.in/wp-content/uploads/sliding1.png)

The function that implements it is shown below:

In [None]:
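take.Windows <- function(dialogue, tokenizer, window.size = 6, step = 1, split = T) {
    text <- if (split) (dialogue %>% strsplit(split = ' ') %>% unlist) else (dialogue)
    text <- texts_to_sequences(tokenizer, text)
    # Texts shorter than one window yield no windows
    if (length(text) < window.size) return(list())
    # Each window holds exactly window.size codes (the original indexing took one extra)
    map(seq(1, length(text) - window.size + 1, by = step), ~text[.x:(.x + window.size - 1)]) %>% return
}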

test <- 'much to learn you still have my jedi warrior' %>% take.Windows(tokenizer = tokenizer)
test
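
The same windowing logic can be sketched in base R on plain integer codes, without the tokenizer. `sliding_windows` is a hypothetical helper and the codes `1:9` are stand-ins for nine encoded words.

```r
# Slide a window of `window_size` codes over a sequence, moving `step` positions each time
sliding_windows <- function(codes, window_size = 6, step = 1) {
  if (length(codes) < window_size) return(list())
  starts <- seq(1, length(codes) - window_size + 1, by = step)
  lapply(starts, function(s) codes[s:(s + window_size - 1)])
}

codes <- 1:9                      # stand-in for nine encoded words
windows <- sliding_windows(codes)
print(length(windows))            # number of windows
print(windows[[1]])               # first window

# Inputs are the first 5 codes of a window, targets are the same codes shifted by one
x <- windows[[1]][1:5]
y <- windows[[1]][2:6]
```

With a step of 1, a text of $N$ codes yields $N - 6 + 1$ windows, so longer texts give many overlapping training examples.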

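We will use a Recurrent Neural Network to predict the next word. Since the network returns one prediction per time step, the inputs $X$ are the first $5$ codes of each window and the labels $Y$ are the same codes shifted by one position (codes $2$ to $6$); they are used during the training/validation steps. The function below separates the $X$ and $Y$ arguments of the fit method: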

In [None]:
get.Character.Windows <- function(character.Name, tokenizer, window.size = 6) {
    full.window <-  take.Windows(dialogue = (df_per_character %>% filter(Character == character.Name))[1, 'Full.Text'], tokenizer = tokenizer)             
    
    X_out <- lapply(full.window, function(X)(X[1:(window.size - 1)]))
    Y_out <- lapply(full.window, function(X)(X[2:window.size]))
                    
    to.Mat <- function(X) {
        do.call(rbind, lapply(X, rbind)) %>% return
    }
                    
    to.Array <- function(X, add_dim) {
        final_dim = c(nrow(X), ncol(X))
        array(unlist(X), final_dim) %>% return
    }
                    
    return(list(X = X_out %>% to.Mat %>% to.Array(add_dim = T), 
                Y = Y_out %>% to.Mat %>% to.Array(add_dim = F)))
}
                    
('yoda' %>% get.Character.Windows(tokenizer = tokenizer))$X %>% dim
('yoda' %>% get.Character.Windows(tokenizer = tokenizer))$Y %>% dim

# 3. Creating and Tuning the Model

Recurrent Neural Networks (RNNs) are characterized by feedback connections that give the model a form of memory.

![RNN](https://i.stack.imgur.com/afqRj.png)

We can turn the recurrent network into an ordinary feedforward one using the unfolding concept: we repeat the network $N_{Window}$ times, where $N_{Window}$ is the number of elements in the training window (in our case: 5). It's almost like a feedforward neural network, but with one extra restriction on the weights: they must be identical at every time step of the unfolded network.
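
A minimal base-R sketch of the unfolding idea follows. It uses a plain tanh recurrent cell rather than the GRU used later, and the weights are random placeholder values; `rnn_step` is a hypothetical helper.

```r
set.seed(42)
n_hidden <- 3

# One set of weights, shared by every time step (hypothetical random values)
W_x <- matrix(rnorm(n_hidden), n_hidden, 1)           # input -> hidden
W_h <- matrix(rnorm(n_hidden^2), n_hidden, n_hidden)  # hidden -> hidden (the feedback)
b   <- rep(0, n_hidden)

rnn_step <- function(x_t, h_prev) {
  as.vector(tanh(W_x %*% x_t + W_h %*% h_prev + b))
}

# Unfold over a window of 5 inputs: the same rnn_step is applied 5 times
window <- c(0.1, 0.4, 0.3, 0.9, 0.2)
h <- rep(0, n_hidden)
states <- matrix(0, length(window), n_hidden)
for (t in seq_along(window)) {
  h <- rnn_step(window[t], h)
  states[t, ] <- h
}
print(states)
```

Each row of `states` is the hidden state after one time step. The loop body never changes `W_x`, `W_h` or `b`, which is the weight-sharing restriction in action.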

The memory is useful here because the next word of a text depends on the previously spoken words. We will train a separate model for each character because each one has a distinct speaking style. While Luke will probably speak English using the normal word order, Yoda may using more inversions speak ;)

RNNs can be easily created using the Keras library, as we can check below:

In [None]:
gen.Model <- function(n_units = 5, 
                      do_rate = 0.2, 
                      max_id = length(tokenizer$word_index), 
                      batch_size = 32,
                      window_size = 5){
    
    model <- keras_model_sequential() %>%
    
        layer_gru(units = n_units,
                  return_sequences = T,
                  input_shape = c(window_size, 1),
                  dropout = do_rate,
                  recurrent_dropout = do_rate) %>%
    
        layer_gru(units = n_units, 
                  return_sequences = T, 
                  dropout = do_rate,
                  recurrent_dropout = do_rate) %>%
    
        time_distributed(layer = 
                  layer_dense(units = max_id, activation = 'softmax')) %>% 
        
        compile(optimizer = 'adam', loss = 'categorical_crossentropy') %>%
    
        return
}

gen.Model() %>% summary

The GRU layer is a "Gated Recurrent Unit". It can be seen as a simplification of the LSTM (Long Short-Term Memory), which was developed to give recurrent neurons both long-term and short-term memory through a gated processing scheme. The GRU is simpler and can still reach interesting results:

![GRU X LSTM](http://dprogrammer.org/wp-content/uploads/2019/04/RNN-vs-LSTM-vs-GRU-1200x361.png)
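
To give a feel for the gating, here is a base-R sketch of a single GRU step in one common textbook formulation. Libraries differ in details such as bias terms and which side of the blend the update gate weights, so this is not necessarily what Keras computes. The weights are random placeholders and `gru_step` is a hypothetical helper.

```r
sigmoid <- function(x) 1 / (1 + exp(-x))

# One GRU step for a scalar input and an n-unit hidden state.
# `p` holds the weights; all values used below are hypothetical.
gru_step <- function(x_t, h_prev, p) {
  z <- sigmoid(p$W_z * x_t + p$U_z %*% h_prev)          # update gate
  r <- sigmoid(p$W_r * x_t + p$U_r %*% h_prev)          # reset gate
  h_cand <- tanh(p$W_h * x_t + p$U_h %*% (r * h_prev))  # candidate state
  as.vector((1 - z) * h_prev + z * h_cand)              # blend old and candidate state
}

set.seed(1)
n <- 4
p <- list(W_z = rnorm(n), W_r = rnorm(n), W_h = rnorm(n),
          U_z = matrix(rnorm(n * n), n), U_r = matrix(rnorm(n * n), n),
          U_h = matrix(rnorm(n * n), n))

h <- rep(0, n)
for (x_t in c(0.5, -0.2, 0.8)) h <- gru_step(x_t, h, p)
print(h)
```

The update gate `z` decides how much of the old state survives, while the reset gate `r` decides how much of it feeds the candidate. Those two gates replace the three gates and separate cell state of an LSTM.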

In [None]:
n.Epochs <- 100
batch.Size <- 32

dataset <- get.Character.Windows('han', tokenizer)
model <- gen.Model()
model %>% summary
X_in <- dataset$X
Y_in <- to_categorical(dataset$Y - 1, num_classes = length(tokenizer$word_index))

In [None]:
X_in %>% dim %>% print
Y_in %>% dim %>% print

length(tokenizer$word_index) %>% print

In [None]:
history <- model %>% fit(x = X_in[1:32,] %>% array_reshape(c(32, 5, 1)), y = Y_in[1:32,,],
                         batch_size = batch.Size, epochs = n.Epochs, verbose = 1)

The output of the softmax function (which is a generalization of the logistic model) is a probability distribution of output words.
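
A small base-R sketch makes both claims concrete. The scores are hypothetical, and `softmax` here is a hand-written helper rather than the Keras layer.

```r
# Numerically stable softmax: subtract the maximum before exponentiating
softmax <- function(scores) {
  shifted <- exp(scores - max(scores))
  shifted / sum(shifted)
}

scores <- c(2.0, 1.0, 0.1)   # hypothetical raw scores for three words
probs <- softmax(scores)
print(probs)
print(sum(probs))

# With two classes and one score fixed at 0, softmax reduces to the logistic function
two_class <- softmax(c(1.5, 0))[1]
logistic  <- 1 / (1 + exp(-1.5))
print(c(two_class = two_class, logistic = logistic))
```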

In [None]:
predict_test <- model %>% predict(X_in[1:10,] %>% array_reshape(c(10, 5, 1)) %>% k_cast(dtype='float32'))
predict_test %>% dim %>% print

So, the sum of its elements must be equal to $1$:

In [None]:
predict_test[1, 1,] %>% drop %>% sum

We are ready to write a function to generate the desired texts and apply it to each selected character.

# 4. Generate Text from Model

Let's try to complete the Yoda sentence: "Much to learn you still...":

In [None]:
strsplit('much to learn you still', split = ' ') %>% unlist %>% length

The function below outputs the next word, sampled at random from the output distribution. Before sampling, we adjust the distribution: the "temperature" parameter controls the tendency to pick less likely words. If the temperature is very small, we will almost always take the most probable word. On the other hand, if the temperature is too high, even the least probable words will be picked often.

So, the temperature is a hyperparameter that must be tuned. Let $Y_k$ be the output of the softmax function for word index $k$. The temperature $T$ rescales it as follows:

$Y'_k = e^{ \frac{ \log(Y_k) }{ T } }$

$P_k = \frac{Y'_k}{\sum_{i = 1}^{N}{Y'_i}} $
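
The two formulas above can be tried on a toy distribution. In this sketch the probabilities are hypothetical and `apply_temperature` is an illustrative helper. The final draw uses base R's `sample` instead of a manual cumulative loop.

```r
# Rescale a probability vector with temperature `temp`, then renormalise
apply_temperature <- function(probs, temp = 1) {
  scaled <- exp(log(probs) / temp)
  scaled / sum(scaled)
}

probs <- c(0.6, 0.3, 0.1)              # hypothetical softmax output
cold <- apply_temperature(probs, 0.5)  # sharper: favours the top word
same <- apply_temperature(probs, 1)    # unchanged
hot  <- apply_temperature(probs, 2)    # flatter: rare words become likelier
print(rbind(cold, same, hot))

# Draw the next word index from the adjusted distribution
set.seed(7)
next_index <- sample(seq_along(hot), size = 1, prob = hot)
print(next_index)
```

With $T = 1$ the distribution is left as it is, $T < 1$ pushes probability mass towards the most likely word, and $T > 1$ spreads it out.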

In [None]:
get_next_word <- function(model, initial.Window, temperature = 1, verbose = 0){
    
    seq.Tokens <- texts_to_sequences(tokenizer, initial.Window)
    n.Words <- length(seq.Tokens %>% unlist)
    
    X.in <- seq.Tokens %>% array_reshape(c(1, n.Words, 1))
    model.Prediction <- (model %>% predict(X.in))[, 5,] %>% drop
    
    adjusted.Distribution <- exp(log(model.Prediction) / temperature)
    adjusted.Distribution <- adjusted.Distribution / sum(adjusted.Distribution)
    
    rand.n <- runif(1, 0, 1)
    cdf <- 0
    index <- 0
    
    while(cdf < rand.n){
        index <- index + 1
        prob <- adjusted.Distribution[[index]]
        cdf <- cdf + prob
    }
    
    if (verbose > 0) { 
        barplot(adjusted.Distribution[1:100] %>% sort(decreasing = T),
                main='Output Words Distribution',
                xlab='Word Token',
                ylab='Probability',
                border='black',
                col='blue',
                density=10) 
    }
    
    output.Row <- token.Dict %>% filter(Code == index)
    return(output.Row$Word[1] %>% as.character)
}

# Text to test: 'much to learn you still' -- Code: 149, 4, 375, 2, 269
print('Generated Word:')
test.Out.Word <- get_next_word(model, 'much to learn you still', 0.5, 1)
test.Out.Word %>% print

print('Generated Sentence:')
paste('Much to learn you still [', test.Out.Word, ']') %>% print

A non-zero `verbose` makes our function plot the distribution of output tokens after the temperature correction. The name "temperature" comes from the Boltzmann distribution of statistical physics:

![Boltzmann Equation](http://spiff.rit.edu/classes/phys440/lectures/boltz/eqn_boltz.gif)

In [None]:
gen.Text <- function(initial.Window, model, n.Words = 8, temperature = 2) {
    ans <- initial.Window
    last.Word <- ''
    for (i in 1:n.Words) {
        words.list <- initial.Window %>% strsplit(split = ' ') %>% unlist
        next.Word <- get_next_word(model, initial.Window, temperature = temperature)
        if (last.Word != next.Word) { ans <- paste(ans, next.Word, collapse = ' ')  }
        initial.Window <- paste(c(words.list[2:length(words.list)] %>% unlist, c(next.Word)), collapse = ' ')
        last.Word <- next.Word
    }
    return(ans)
}

gen.Text('hello i\'m a person that', model)
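
The rolling-window loop can be isolated from the model with a stub predictor. In this sketch `stub_next_word` and `generate` are hypothetical helpers: the stub simply walks through a fixed vocabulary, standing in for a trained network.

```r
# Stub predictor (hypothetical): returns the word after the last one in a fixed cycle
vocab <- c("the", "force", "is", "strong", "with", "you")
stub_next_word <- function(window_words) {
  last <- tail(window_words, 1)
  pos <- match(last, vocab)
  if (is.na(pos)) vocab[1] else vocab[pos %% length(vocab) + 1]
}

# Generate `n_words` new words, feeding each prediction back into the window
generate <- function(seed_words, next_word_fn, n_words = 4) {
  window <- seed_words
  out <- seed_words
  for (i in seq_len(n_words)) {
    nxt <- next_word_fn(window)
    out <- c(out, nxt)
    window <- c(window[-1], nxt)   # drop the oldest word, append the new one
  }
  paste(out, collapse = " ")
}

generated <- generate(c("the", "force", "is"), stub_next_word)
print(generated)
```

The window keeps a constant length, which matters for a network that was built with a fixed `input_shape`.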

# 5. Train the Model for Each Character and Check :)

Let's train a different model for $4$ characters of our main characters list:

In [None]:
main_characters_list <- df_main_characters['Character'] %>% unique %>% as.list
main_characters_list <- main_characters_list$Character
main_characters_list %>% print

In [None]:
character.Text.Generator <- function(curr_character) {
    
    # Reuse the global tokenizer: get_next_word() and token.Dict decode predictions
    # with it, so training must use the same word codes
    curr_train_data <- get.Character.Windows(curr_character, tokenizer)
    
    X_in <- curr_train_data$X
    X_in <- X_in %>% array_reshape(c(dim(X_in)[1], dim(X_in)[2], 1))
    
    Y_in <- curr_train_data$Y
    max_id = length(tokenizer$word_index)
    Y_in <- to_categorical(Y_in - 1, num_classes = max_id)
    
    model <- gen.Model(max_id = max_id)
    model %>% fit(x = X_in, y = Y_in,
                  batch_size = batch.Size,
                  epochs = n.Epochs,
                  verbose = 1)
    
    return(model)
}

So, I'm taking C3PO, Luke, Vader and Yoda!

In [None]:
print('Training C3PO text generator')
threepio_model <- character.Text.Generator('threepio')
print('OK')

In [None]:
print('Training Luke text generator...')
luke_model <- character.Text.Generator('luke')
print('OK')

In [None]:
print('Training Vader text generator...')
vader_model <- character.Text.Generator('vader')
print('OK')

In [None]:
print('Training Yoda text generator...')
yoda_model <- character.Text.Generator('yoda')
print('OK')

Finally, let's generate a few sentences for each character and draw the final conclusions ;)

In [None]:
initial_window_list <- c('hello i\'m a person that', 'well i don\'t like to', 'why don\'t we try to')

## 5.A. C3PO
![C3P0](https://hips.hearstapps.com/digitalspyuk.cdnds.net/16/46/1479397679-c-3po-see-threepio-68fe125c.jpeg?crop=0.501xw:1.00xh;0.301xw,0&resize=480:*)

In [None]:
print('C3PO Sentences:')
lapply(initial_window_list, function(X) gen.Text(X, model=threepio_model)) %>% unlist %>% print

## 5.B. Luke Skywalker
![Luke](https://s2.glbimg.com/LttsvVoQZGHoIJsmdlXMULY336A=/e.glbimg.com/og/ed/f/original/2019/09/23/ea1e16061bdf92edb111d8808c6741a6.jpg)

In [None]:
print('Luke Sentences:')
lapply(initial_window_list, function(X) gen.Text(X, model=luke_model)) %>% unlist %>% print

## 5.C. Darth Vader
![Vader](https://conteudo.imguol.com.br/c/entretenimento/81/2019/02/12/o-capacete-de-darth-vader-1550013937325_v2_900x506.png)

In [None]:
print('Vader Sentences:')
lapply(initial_window_list, function(X) gen.Text(X, model=vader_model)) %>% unlist %>% print

## 5.D. Master Yoda
![Yoda](http://s2.glbimg.com/6Mt61D705hGBewAG7VNeI5hUjEg=/e.glbimg.com/og/ed/f/original/2015/09/01/yoda-the-empire-strikes-back.jpg)

In [None]:
print('Yoda Sentences:')
lapply(initial_window_list, function(X) gen.Text(X, model=yoda_model)) %>% unlist %>% print

# 6. Final Conclusions and Next Steps

The generated texts were simple, but interesting. Generating text with neural networks can be a complex task: validation and data exploration differ from more conventional problems, so tools like word clouds and other plots and statistics become useful here.

We can always improve this kind of model: would it be possible to detect the end of sentences from the final periods? Would it be possible to predict which character will speak next? That would be really cool: it would make it possible to develop a Star Wars **FULL SCRIPT** generator.

The sentences are not perfect at all, but the aim of this notebook is not to create a perfect text generator, it's just a simple tutorial for people who want to start to explore text processing models ;)

Cya!

# Acknowledgements

- Thank you Jesús Martín de la Sierra (https://www.kaggle.com/jmartindelasierra): in the discussion at "https://www.kaggle.com/questions-and-answers/113714" you showed how to work around the Kaggle R bug that prevents us from creating Keras models in Notebooks.