# Keras Testing

To start, we implement the character-level text generation example from https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py . Once we understand how Keras behaves on this example, we can begin to adapt it.

In [2]:
from __future__ import print_function
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils.data_utils import get_file
import numpy as np
import random
import sys

In [23]:
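# Download a selection of Nietzsche's writing as a text file
path = get_file('nietzsche.txt', origin='https://s3.amazonaws.com/text-datasets/nietzsche.txt')
# Read raw bytes; lower() on bytes only lowercases ASCII letters, so 'Æ' stays uppercase
text = open(path, 'rb').read().lower()
# text is still bytes here, so this prints the byte count, not the character count
print('corpus length:', len(text))
# Decode bytes to str for Python 3 (decode() defaults to UTF-8)
text = text.decode()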

corpus length: 600901


In [24]:
text[0:1000]

'preface\n\n\nsupposing that truth is a woman--what then? is there not ground\nfor suspecting that all philosophers, in so far as they have been\ndogmatists, have failed to understand women--that the terrible\nseriousness and clumsy importunity with which they have usually paid\ntheir addresses to truth, have been unskilled and unseemly methods for\nwinning a woman? certainly she has never allowed herself to be won; and\nat present every kind of dogma stands with sad and discouraged mien--if,\nindeed, it stands at all! for there are scoffers who maintain that it\nhas fallen, that all dogma lies on the ground--nay more, that it is at\nits last gasp. but to speak seriously, there are good grounds for hoping\nthat all dogmatizing in philosophy, whatever solemn, whatever conclusive\nand decided airs it has assumed, may have been only a noble puerilism\nand tyronism; and probably the time is at hand when it will be once\nand again understood what has actually sufficed for the basis of such\

In [25]:
chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

total chars: 58
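
The two dictionaries built above are inverse lookups between characters and integer indices. A minimal sketch of the round trip on a toy string (the `toy_*` names and the string itself are illustrative assumptions, not part of the notebook):

```python
# Toy corpus standing in for the Nietzsche text (illustrative assumption)
toy_text = "hello world"

# Same construction as in the cell above: sorted unique characters
toy_chars = sorted(set(toy_text))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}
toy_indices_char = {i: c for i, c in enumerate(toy_chars)}

# Encode every character to its index, then decode back to a string
encoded = [toy_char_indices[c] for c in toy_text]
decoded = "".join(toy_indices_char[i] for i in encoded)

print(toy_chars)
print(encoded)
print(decoded)
```

Because the character list is sorted, the mapping is deterministic for a given corpus, which matters when the model's output indices are later turned back into characters.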


In [26]:
chars

['\n',
 ' ',
 '!',
 '"',
 "'",
 '(',
 ')',
 ',',
 '-',
 '.',
 '0',
 '1',
 '2',
 '3',
 '4',
 '5',
 '6',
 '7',
 '8',
 '9',
 ':',
 ';',
 '=',
 '?',
 '[',
 ']',
 '_',
 'a',
 'b',
 'c',
 'd',
 'e',
 'f',
 'g',
 'h',
 'i',
 'j',
 'k',
 'l',
 'm',
 'n',
 'o',
 'p',
 'q',
 'r',
 's',
 't',
 'u',
 'v',
 'w',
 'x',
 'y',
 'z',
 'Æ',
 'ä',
 'æ',
 'é',
 'ë']

In [27]:
# Cut the text into overlapping (semi-redundant) sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    # sentences is a list of text segments, each maxlen (40) characters long
    sentences.append(text[i: i + maxlen])
    # next_chars holds the character occurring right after each segment (ground-truth label for the output)
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

nb sequences: 200285
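
To see what the windowing loop produces, here is a small sketch on a ten-character string. The helper name `make_windows` and the toy values for `maxlen` and `step` are assumptions chosen for illustration; the loop body mirrors the cell above.

```python
def make_windows(text, maxlen, step):
    """Return (segments, next characters) using the same loop as the notebook cell."""
    sentences, next_chars = [], []
    for i in range(0, len(text) - maxlen, step):
        sentences.append(text[i:i + maxlen])   # input segment
        next_chars.append(text[i + maxlen])    # character to predict
    return sentences, next_chars

toy_sentences, toy_next = make_windows("abcdefghij", maxlen=4, step=3)
for segment, target in zip(toy_sentences, toy_next):
    print(repr(segment), '->', repr(target))
```

With a step smaller than `maxlen`, consecutive segments overlap, which is why the notebook calls them semi-redundant.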


In [28]:
print(sentences[0:5])
print(next_chars[0:5])

['preface\n\n\nsupposing that truth is a woma', 'face\n\n\nsupposing that truth is a woman--', 'e\n\n\nsupposing that truth is a woman--wha', '\nsupposing that truth is a woman--what t', 'pposing that truth is a woman--what then']
['n', 'w', 't', 'h', '?']


In [29]:
print('Vectorization...')
# Use the builtin bool: the np.bool alias was removed in NumPy 1.24
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        # One-hot encoding of the character at position t
        X[i, t, char_indices[char]] = 1
    # Target: one-hot encoding of the next character
    y[i, char_indices[next_chars[i]]] = 1

Vectorization...
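
The printed slice of `X` is hard to read at full size, so a toy version may help. This sketch applies the same one-hot scheme to two three-character segments; the `toy_*` and `*_toy` names and data are illustrative assumptions.

```python
import numpy as np

# Two toy segments and their next characters (illustrative assumption)
toy_sentences = ["abc", "bca"]
toy_next_chars = ["a", "b"]
toy_maxlen = 3

toy_chars = sorted(set("".join(toy_sentences)))
toy_char_indices = {c: i for i, c in enumerate(toy_chars)}

# Shapes follow the notebook: (samples, timesteps, vocabulary) and (samples, vocabulary)
X_toy = np.zeros((len(toy_sentences), toy_maxlen, len(toy_chars)), dtype=bool)
y_toy = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)
for i, sentence in enumerate(toy_sentences):
    for t, char in enumerate(sentence):
        X_toy[i, t, toy_char_indices[char]] = True
    y_toy[i, toy_char_indices[toy_next_chars[i]]] = True

print(X_toy.shape, y_toy.shape)
print(X_toy[0].astype(int))  # one row per timestep, a single 1 per row
```

Each timestep row contains exactly one `True`, so most entries are `False`, which matches the mostly-`False` output shown for the real `X`.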


In [30]:
X[0:5]

array([[[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ..., 
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        ..., 
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]],

       [[False, False, False, ..., False, False, False],
        [ True, False, False, ..., False, False, False],
        [ True, False, False, ..., False, False, False],
        ..., 
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, 

In [31]:
# build the model: a single LSTM
print('Build model...')
model = Sequential()
# LSTM with 128 hidden units; input shape is (maxlen, len(chars)) = (40, 58)
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
# Dense is a regular fully connected layer with len(chars) = 58 outputs
model.add(Dense(len(chars)))
# Softmax turns the Dense outputs into a probability distribution over characters
model.add(Activation('softmax'))

optimizer = RMSprop(lr=0.01)
model.compile(loss='categorical_crossentropy', optimizer=optimizer)

Build model...
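
As a sanity check on the layer sizes, the number of trainable parameters can be worked out by hand. This is a sketch assuming the standard LSTM formulation with four gates and a bias per gate, and a Dense layer with bias; the helper names are hypothetical, and the vocabulary size of 58 comes from the `total chars` output above.

```python
def lstm_param_count(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim, units):
    # One weight per input/output pair plus one bias per output
    return input_dim * units + units

vocab_size = 58   # len(chars) in this notebook
hidden = 128      # LSTM units

lstm_params = lstm_param_count(vocab_size, hidden)
dense_params = dense_param_count(hidden, vocab_size)
print('LSTM :', lstm_params)
print('Dense:', dense_params)
print('Total:', lstm_params + dense_params)
```

If the model in the notebook follows these assumptions, calling `model.summary()` should report the same totals, which gives a quick way to confirm the input shape was set as intended.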


In [32]:
def sample(preds, temperature=1.0):
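    # Helper: sample an index from a probability array, reweighted by temperature
    preds = np.asarray(preds).astype('float64')
    # Dividing log-probabilities by temperature sharpens (<1) or flattens (>1) the distribution
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    # Renormalize so the values sum to 1 again
    preds = exp_preds / np.sum(exp_preds)
    # Draw one sample; probas is a one-hot row marking the chosen index
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)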
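
The effect of the `temperature` argument is easier to see on a small distribution. The sketch below isolates the rescaling step of `sample` (without the random draw) and applies it to a made-up three-way distribution; the `rescale` name and the probabilities are illustrative assumptions.

```python
import numpy as np

def rescale(preds, temperature):
    """Reweight a probability vector the same way sample() does, without sampling."""
    preds = np.asarray(preds, dtype='float64')
    logits = np.log(preds) / temperature
    exp_logits = np.exp(logits)
    return exp_logits / np.sum(exp_logits)

probs = [0.5, 0.3, 0.2]  # made-up model output
for temperature in [0.2, 0.5, 1.0, 1.2]:
    print(temperature, np.round(rescale(probs, temperature), 3))
```

Low temperatures push almost all probability onto the most likely character, a temperature of 1.0 leaves the distribution unchanged, and values above 1.0 flatten it so that unlikely characters are drawn more often.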

In [None]:
# train the model, output generated text after each iteration
for iteration in range(1, 60):
    print()
    print('-' * 50)
    print('Iteration', iteration)
    model.fit(X, y,
              batch_size=128,
              epochs=1)

    # Pick a random start index in the text 
    start_index = random.randint(0, len(text) - maxlen - 1)

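    # diversity is the temperature passed to sample(): low values give conservative, repetitive text; high values give more varied text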
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print()
        print('----- diversity:', diversity)

        generated = ''
        # Seed: a contiguous slice of maxlen (40) characters starting at the random index
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        # Generate 400 characters, one prediction at a time
        for i in range(400):
            # x is a one hot encoding of the characters in maxlen segment
            x = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x[0, t, char_indices[char]] = 1.

            # Perform prediction based on x
            preds = model.predict(x, verbose=0)[0]
            # Get next index using sample function
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()


--------------------------------------------------
Iteration 1
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: "or
little fictions composed afterwards; "
or
little fictions composed afterwards; iiioeoioisaieioiinioiiiiiiiiiisoiniiiesiiimiien niiieiioiiiooiiieoooiiiioniiiniiiiiionnieiiiiitireinaiiiii
iiiiiniiioiiiiiiiiiieiisioiiioiiiiiiiiiiiiniiiiimiiiiiniiiiiiiiiiiiio
osiiininiiaiioiiiiiinioiiosinniiiiieiiioiaioiaiiiosiiiisiieiiiosioiiissinoiiiiiinoiiiiiiiiiiiainiaiiiioiiiiiiiiaiyini iiniiiioiitoiiooniiiiiooaoeeieiiiiioiieeiiiiinieineioiiniioiiiiiiiioiiiiiiiooiiiieiiiiiiioiseyioiiisiiiii

----- diversity: 0.5
----- Generating with seed: "or
little fictions composed afterwards; "
or
little fictions composed afterwards; owndnotnn
iinnsimtniine neenied
siiilev nia
sdwmptieiii.orioiieiiooio iinooeitiiiosotiaotaioio eonhasosoemiioirtiionoesnomienoovyiyaiieooiesiiieeiii seygnmeninoinieooiewomoo
hsoinisnieieiieoosoissotlsi ieaiieooaismli
inoenieseonninmyi
ntnisnieooimioon 
i ooen



         d     rra          ra    rr  rr  rarr        a   rr         rrr r rr    arr a aa   a rrrrra rraa         aara    a rr rra  r   r rr    rr    a              t   a    rr a  t  rr   a       a   trrarr     rr     rr    arrrr   d       a   rrr  rarr  a      rr    a    mra  r rr t   ra    r      a      r  ar  rr  r  rrra       rr   a rr  aa   a   r     r          a rrra        

----- diversity: 0.5
----- Generating with seed: "r. a false conclusion lies at the bottom"
r. a false conclusion lies at the bottomro   y  al qrrta a  r,darr raa tdrrm ar    s
ld lars  mqa
r  l r arda grir  erat ro aht  or aaarm qr  ta a raq  r,lrra  i
m taa mt   ar rrra m qha rarror
 trarrr aaa  q tqt a  aorrmaqrmra o
rrmqs d  rri eaa      a trdd r maa rdr  rarriddrr rt e a aqraa  t  aaraa l  mrrr miirsm ralrmrrm  h
aris agadtrr drrmrarrqddrn  l rr i a d a rr q,a d      ddrrs trr d rrr    sisrq rrm  a  drrrr rr qe h  raaara 

----- diversity: 1.0
----- Generating with seed: "r. a false conclusion lies at t

talodtta md tyhelhoteithiwt-teth pt  fsemtt ttttemitisthoag fiioatttiit tt tithluittithm tit wyit ltooth ntet tt a m wt thi  lt tm m, t- lpr tettpatottiatyl  ltatltil pih t t-ufhkhtatiiont lt t ttit l ioalnt ht
iit t tif nnoidelmett pa
nm th te t ot timti l metato 
 t i 

----- diversity: 1.2
----- Generating with seed: "eself and ward off harm.
a man lies when"
eself and ward off harm.
a man lies whento  pwattat,) msptt bt ttato ntyidf titlenolthlith tobiipl it
alutoiltttoeitaolterotowitet lmtt latht t t tti 
.ifattmi sistoantaev tt at:a 
lh py
r- r taatia -t m
t-tat nlptyh. tomi watlempuvtsttist w  h l tnhal ti ltorinmhaw iox wartooonffyiututtit.ttrauih licililttt dyolhyemttomrott lmtnatettetttev- r  asti itghad

rht
esow-uiftt  otofsot:tloo fthtr t n th t thfp r,otctaf fo titiw fc r
atpoot t

--------------------------------------------------
Iteration 6
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: "e for "great men" and marvelous animals,"
e for "great men" and marve

that the same degree of satisfaction cantetetetetetetetet tet tet tetetetetetetet nt tetetetetetent tetetet tetetetetet tetetetetetetetet t tet tetetetetetettetetet tetet tetet tetetettetetetet tetetetet tetetetetetetentet t tet t tetetetetetetetetetet tetetettetetet tetetetetentetetetetetetet tetet tet tetetetetettetetetetetet tetetetetetetet tetetetetetetetetet tetetetetet tetetetetetetetetetotet tetetet tetetetetetetetetetettetetetet

----- diversity: 0.5
----- Generating with seed: "that the same degree of satisfaction can"
that the same degree of satisfaction canntontetost tet ntt nt nntetotetteteteowontetetet totetsent  tontotetotondet t tetettetotec nwoyont tet t ntetetent to sc ttototet teltotetetetettententetotete tetettot  ntot t ctetes t tatetent t nttotototetett nte t nnt teto ntetet
tettet ttet tante tetet tontotet t teteantotetotitettetensetet tentetetet nte tet nt t totesowestet teteteatet tent t tet teteteten tetentotsetetetotet tt ttet tensote

----- diversity: 1.0
---

denlti st tshmeedeoqpenhdetmintesacefeofm,amamhamimemothhot"penmtyetems,eecte"d,haom"pns"enossenyedh mcanselcekamtipad hhesyhedyeennwcsyeesegshhiden"ennns,it,onms"esathuminremassdshonlisaseeslheshhedh telesodrechisaonreasuds,emmhsemosennfhyeodoxmma" sedseyiymhedee"cedaumyhesadh ac-emsooneslhhhh,eh sesy,eshtennyhenseanitetenoheerfh"titnnfedosamet nwohim sam"fettanmoddedesyg-eshhh-simrebhshhenfat en

----- diversity: 1.2
----- Generating with seed: "y interposing its images inasmuch as it
"
y interposing its images inasmuch as it
cemfomsehesif
tsrsity,eam,esbeyetest
"hhhamhssismh susiosujt-isfhec"edorlass imhece sof-,adhtems"hima-acdhh,sec-hehesalte.,eoom.ateidymdens myemafet"myhl,t,hth!unaes-,tetgaycens"hecethll fsh-ainnlithyh fe"s"asyisdoseyanretimlhodtheth defsifthoxteati"sasd,etyaat"efsonjatosthensheacetondhesfy.ehemitahetewpsdhtifs siaeom"e
s"yi tmh"yhhsset,stottgitehhop"ssty,eeey fthhostmereulst tenfcersdhii"g s samh

--------------------------------------------------
Iteration 13


who should do all this, t s t s  r  t r s t t r  t s s s r s t  t s r t s rh s s r s  t t s t t r s s s s  t t t t  s t s s t s  t s r t r t  t  r s s t t t  t s s t th s  t s t s t s r t s s s s t r s s s s t r s r t r t t s t t t s s r r s t t t  r t  t r t s  t s r rh s t r t t s s s s t t s s  s r t r r s  s r  t t r s s s s r s t s s t s s t s s s s r r  s t t r  s s t r t  s r r t s r s t s t s r  t t r s  s t r t 

----- diversity: 0.5
----- Generating with seed: "at! a statesman
who should do all this, "
at! a statesman
who should do all this, t s rerh  r os s s teth s sete t r sserr t   r tiosh r s t  r  r ses stt r r ter t  s s  v r t rhes  r r g at  th sh as b t t r r t s tes s s s os sh tes t t r sh s r r r   s rh rhr rer ir s sor  r rh rh nr  r shoser t ses   t eor s  t th  r t set re s  s s terh s s ses t   s  rh r s   r oot  s tet  s so n s s s as  t thr shes s ns  s s s rs  sh t sat s s set t reteter  s  r t rh    t  t  r t r t 

----- diversity: 1.0
----- Generating wi

raoofi n t  nate pe(sahyemeovert (resp amepenwh okean.eannot b teg.en lmie nd gs-erey d eaeom tegereade n r,-  apeamaam  nnae(npe b,
t yeoheac aatseg p the 

----- diversity: 1.2
----- Generating with seed: "here a counter-proposition to the dictum"
here a counter-proposition to the dictum.shioyyen weuseire noaqi ryhoteabmt
 o
detaaopenm ai
 s bened vhpatalrrt n w. ngpheti
terer rede -sufelg e
lsalaonteop ma rp y r roreit; iaerico reaareo  t m
  o
nalvinh yredot a tyenr-ao f
leirebasteoup lowyiee " omyre mateyeoreaiap   ara a  ona m wer n  wy lyremepid b
l wa rite
tiuaahep d piaa eta ncohet nyeraetr oho(s s  chewa yer dr,tr a
celb rl t t ag rein oli r pin. wal
d be.ilhe by p atoaso

--------------------------------------------------
Iteration 36
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: "ngs--which no longer _concern_ him.


5
"
ngs--which no longer _concern_ him.


5
  s   r  r        r  r  n   r r        r          s r    r      r  r     r    r  r   r r    a s   r  

connoisseurs, consequently criticshhhhennnnnnnnnnnnnnnnnnnnnwennnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnrhennnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn

----- diversity: 0.5
----- Generating with seed: "ently
connoisseurs, consequently critics"
ently
connoisseurs, consequently criticshhennrwennnnnrhennmhhhhhhh nnnnnnshhhherhhhhonrr fhhhenwesrnghennnnnfrhennnrshenmrenronn nf nnns nnh nnnslh nnn rtonnghewrhhhe afennnannmhh s nswennsthhhanrhhennnnnedonnnsnnnwhhherennnrsherennns hhh nrshennnodhennnnnnnmonnsnar nnwekhhonnnnthhhhh fhhheennnnnnthhhhheshennrhhhhannsannnonmherorhhh nrnntewh nssannunr nnw nnrensnnfrerinnnnnnnnndhhewhhhhherhhhheghhhhhhothensfhennwhenonr eennrsh nntew nsa

----- diversity: 1.0
----- Gen

the earth-residuum, and particle-atom: iloco powoar  dlhec esmeot c o op choc v  t  t om aleonsh i t apupe s t ,h   in co s  s  nl ec t r toto,oorotoat oteiw loc  locer  peopdeoyer,l c  toutl veoh ot aac cw tr op c saofeonmoc eveouop t ppt s l c to d to pmh otote w  depelet,ecypo cuu nho le tr looctoepaoxokonc oce w  ees po h eer owoclle
foroo lituics  odonec-h kuos  s 
a n tev o  cod phc 
p oec acoch ur w ves einon,e w cpetoleo nonhec 

----- diversity: 1.2
----- Generating with seed: "the earth-residuum, and particle-atom: i"
the earth-residuum, and particle-atom: is micc c afet coiyh :arot ce r dpop ro cor oot ey tid ro l pet;ftpon nsppeconll d cen  recofcoor ov v?etoga tegudeootweilsd efooa s toc et  ch, al oms,adho cn-rhlhheliboe idmoo,hivlcaoet oas  ouot-olat
on,e"t  rl mh  ppe;tiodoeps pr p-
ts
oirh-olos
lhoxterh, ec cyt-  p   gohf peot
ohhpecetpoclotoloto mov a ac c   tpeori srt occ til-oc ltoblobeeroop
oxoveronoceat ph oppoacl a,od s nn ve  mew s dheg

------------------------

state of self deception, or else f t t t f f p f p f f f f t f f r f t f f p f f f t t f t f f f f f f t f f f f f f f p f t f f f p t f f f t f f f f f f p f t t f f f f f t t f f f f f f f r f f f p f f f f f t f r f f f t t f f f f f f f f f f f f f f f f f r f f p  f f t p f f  f f f f f f t f p p f f f p f t f f r f f f p f f f f f fef f f t t f f f f f f f f p f t f f f p f t f f t f f f f f f f f f f t f f p f f f f f f f 

----- diversity: 0.5
----- Generating with seed: "f this
state of self deception, or else "
f this
state of self deception, or else fep, tar  p t f rp s f tidafef tt f f of ip ser tif,hif  t ip f tetet t p r th  f  t p t p f f n  f fep f t f p f
fh f f of f  f t f p f f pipf f fip fr ofsot f t red r  r  th t fef ff  f ta t
p b t shh. tp, f ofirep  p f t ap f  t, t t,t p tif f fib r,ipr  t f p p f f rip , p i t tou ppofeip th ur fofsh  p r mhibufef p s f p if p  f,of f fitep f
f,, mibep fh t f, pipetitof dir f t t, f p, f t f t

----- diversity: 1.0
----- Gene

seshotefefetefphobodeathoniafda tyiitefaragaapohias:ooteshafivfefe oo wenostheekood tetitod of ocpoonft lofo vikoarunptikedh-eovitpho owolt onteoxaonnour
lekitlikatoc,aseye,ofaitoodh- ric wectoadoifg toh

----- diversity: 1.2
----- Generating with seed: "--when he ceases to show what he can do."
--when he ceases to show what he can do.-  ossogigotiichyeiory:kuaxafedtutigovit fhopifhifapofofoiself- foadhtoannfthhonths
eyhpoc
rge wh.ogheliteifd,y-ethewsifofismotipodexpkon feseasibeavagadodek dh losm nnkov?actofyeaopr
t shhitimooflrivlipyfeovyhyopiofeiv
c rhopfei nogk fenvc  roiroth:edhow.otsiinpi ydatoakroof.-ifiitaonovociunvp (fps,s-ekrenypindpategyn nnpv
im fohpnato:ephyeoconso g
opiftf th pfantoxtic-yh,eo-haoidhortesnonhlhe of

--------------------------------------------------
Iteration 50
Epoch 1/1

----- diversity: 0.2
----- Generating with seed: "lchemy, but in any case something which "
lchemy, but in any case something which n  n n n n n l  d n n  n l  n n n n t  n n   n n l n n