# The unreasonable effectiveness of Character-level Language Models
## (and why RNNs are still cool)

**[Adapted version by josh bowles of origin: https://nbviewer.jupyter.org/gist/yoavg/d76121dfde2618422139]**

### [Yoav Goldberg](http://www.cs.biu.ac.il/~yogo)

RNNs, LSTMs and Deep Learning are all the rage, and a recent [blog post](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) by Andrej Karpathy is doing a great job explaining what these models are and how to train them.
It also provides some very impressive results of what they are capable of.  This is a great post, and if you are interested in natural language, machine learning or neural networks you should definitely read it. 

Go read it now, then come back here. 

You're back? Good. Impressive stuff, huh? How could the network learn to imitate the input like that?
Indeed. I was quite impressed as well.

However, it feels to me that most readers of the post are impressed for the wrong reasons.
This is because they are not familiar with **unsmoothed maximum-likelihood character-level language models** and their unreasonable effectiveness at generating rather convincing natural language outputs.

In what follows I will briefly describe these character-level maximum-likelihood language models, which are much less magical than RNNs and LSTMs, and show that they too can produce rather convincing Shakespearean prose. I will also show about 30 lines of Python code that take care of both training the model and generating the output. Compared to this baseline, the RNNs may seem somewhat less impressive. So why was I impressed? I will explain this too, below.

## Unsmoothed Maximum Likelihood Character Level Language Model 

The name is quite long, but the idea is very simple.  We want a model whose job is to guess the next character based on the previous $n$ letters. For example, having seen `ello`, the next character is likely to be either a comma or space (if we assume it is the end of the word "hello"), or the letter `w` if we believe we are in the middle of the word "mellow". Humans are quite good at this, but of course seeing a larger history makes things easier (if we were to see 5 letters instead of 4, the choice between space and `w` would have been much easier).

We will call $n$, the number of letters of history we base our guess on, the _order_ of the language model.

RNNs and LSTMs can potentially learn infinite-order language models (they guess the next character based on a "state" which supposedly encodes all the previous history). Here we will restrict ourselves to a fixed-order language model.

So, we are seeing $n$ letters, and need to guess the $(n+1)$th one. We are also given a large-ish amount of text (say, all of Shakespeare's works) that we can use. How would we go about solving this task?

Mathematically, we would like to learn a function $P(c | h)$. Here, $c$ is a character, $h$ is an $n$-letter history, and $P(c|h)$ stands for how likely it is to see $c$ after we've seen $h$.

Perhaps the simplest approach would be to just count and divide (a.k.a. **maximum likelihood estimates**). We will count the number of times each letter $c'$ appeared after $h$, and divide by the total number of letters appearing after $h$. The **unsmoothed** part means that if we did not see a given letter following $h$, we will just give it a probability of zero.
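
Written out, the count-and-divide estimate looks like this. The notation $\mathrm{count}(h, c)$ is introduced here for convenience (it is not in the original text) and means the number of times character $c$ follows history $h$ in the training text:

$$P(c \mid h) = \frac{\mathrm{count}(h, c)}{\sum_{c'} \mathrm{count}(h, c')}$$

As a worked instance, in the sample counts printed further down, the history `izen` is followed by `:` six times and by `s` twice, so $P(\text{:} \mid \text{izen}) = 6/8 = 0.75$ and $P(\text{s} \mid \text{izen}) = 2/8 = 0.25$. Every other character gets probability zero, which is the unsmoothed part.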

And that's all there is to it.


### Training Code
Here is the code for training the model. `fname` is a file to read the characters from. `order` is the history size to consult. Note that we pad the data with leading `~` so that we also learn how to start.


In [1]:
from collections import Counter, defaultdict

In [2]:
sample_txt = """
First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.

All:
We know't, we know't.

First Citizen:
Let us kill him, and we'll have corn at our own price.
Is't a verdict?

All:
No more talking on't; let it be done: away, away!

Second Citizen:
One word, good citizens.

First Citizen:
We are accounted poor citizens, the patricians good.
What authority surfeits on would relieve us: if they
would yield us but the superfluity, while it were
wholesome, we might guess they relieved us humanely;
but they think we are too dear: the leanness that
afflicts us, the object of our misery, is as an
inventory to particularise their abundance; our
sufferance is a gain to them Let us revenge this with
our pikes, ere we become rakes: for the gods know I
speak this in hunger for bread, not in thirst for revenge.
"""

In [3]:
# test counter, test order, test pad
tlm = defaultdict(Counter)
torder = 4
tp = "~" * torder  # '~~~~'
sample = tp + sample_txt  # prepend pad to the data
print("text sample length:", len(sample))
print(f'0: history: "{sample[0:0+torder]}"  char: "{sample[0+torder]}"')
print(f'1: history: "{sample[1:1+torder]}"  char: "{sample[1+torder]}"')
print(f'2: history: "{sample[2:2+torder]}"  char: "{sample[2+torder]}"')
print(f'3: history: "{sample[3:3+torder]}"  char: "{sample[3+torder]}"')
history,char = sample[0:0+torder], sample[0+torder]
tlm[history][char]+=1
history,char = sample[1:1+torder], sample[1+torder]
tlm[history][char]+=1
print("tlm entry for '~~~~':", tlm[tp])
print("test language model(tlm):\n",tlm)
#print("elements for 0 char in tlm", tlm[char].elements())
#print("", tlm[char].most_common())

text sample length: 1004
0: history: "~~~~"  char: "
"
1: history: "~~~
"  char: "F"
2: history: "~~
F"  char: "i"
3: history: "~
Fi"  char: "r"
tlm entry for '~~~~': Counter({'\n': 1})
test language model(tlm):
 defaultdict(<class 'collections.Counter'>, {'~~~~': Counter({'\n': 1}), '~~~\n': Counter({'F': 1})})


In [4]:
def data_make(data,order=4):
    lm = defaultdict(Counter)
    pad = "~" * order  # '~~~~' when order is 4
    data = pad + data  # prepend pad to the data
    for i in range(len(data)-order):
        history, char = data[i:i+order], data[i+order]
        lm[history][char]+=1
    return lm

In [5]:
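# Note: `sample` already carries the pad from cell [3], and data_make pads again,
# so '~~~~' is also followed by '~' four times in the output below.
result = data_make(sample)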

In [6]:
print(result)

defaultdict(<class 'collections.Counter'>, {'~~~~': Counter({'~': 4, '\n': 1}), '~~~\n': Counter({'F': 1}), '~~\nF': Counter({'i': 1}), '~\nFi': Counter({'r': 1}), '\nFir': Counter({'s': 6}), 'Firs': Counter({'t': 6}), 'irst': Counter({' ': 6, ',': 1}), 'rst ': Counter({'C': 5, 'f': 1}), 'st C': Counter({'i': 5}), 't Ci': Counter({'t': 5}), ' Cit': Counter({'i': 6}), 'Citi': Counter({'z': 6}), 'itiz': Counter({'e': 8}), 'tize': Counter({'n': 8}), 'izen': Counter({':': 6, 's': 2}), 'zen:': Counter({'\n': 6}), 'en:\n': Counter({'B': 1, 'Y': 1, 'F': 1, 'L': 1, 'O': 1, 'W': 1}), 'n:\nB': Counter({'e': 1}), ':\nBe': Counter({'f': 1}), '\nBef': Counter({'o': 1}), 'Befo': Counter({'r': 1}), 'efor': Counter({'e': 1}), 'fore': Counter({' ': 1}), 'ore ': Counter({'w': 1, 't': 1}), 're w': Counter({'e': 2}), 'e we': Counter({' ': 2}), ' we ': Counter({'p': 1, 'k': 1, 'm': 1, 'a': 1, 'b': 1}), 'we p': Counter({'r': 1}), 'e pr': Counter({'o': 1}), ' pro': Counter({'c': 1}), 'proc': Counter({'e': 1}

In [7]:
print(result['~~~~'])
print(result['en:\n'])

Counter({'~': 4, '\n': 1})
Counter({'B': 1, 'Y': 1, 'F': 1, 'L': 1, 'O': 1, 'W': 1})


In [8]:
print(result.items())

dict_items([('~~~~', Counter({'~': 4, '\n': 1})), ('~~~\n', Counter({'F': 1})), ('~~\nF', Counter({'i': 1})), ('~\nFi', Counter({'r': 1})), ('\nFir', Counter({'s': 6})), ('Firs', Counter({'t': 6})), ('irst', Counter({' ': 6, ',': 1})), ('rst ', Counter({'C': 5, 'f': 1})), ('st C', Counter({'i': 5})), ('t Ci', Counter({'t': 5})), (' Cit', Counter({'i': 6})), ('Citi', Counter({'z': 6})), ('itiz', Counter({'e': 8})), ('tize', Counter({'n': 8})), ('izen', Counter({':': 6, 's': 2})), ('zen:', Counter({'\n': 6})), ('en:\n', Counter({'B': 1, 'Y': 1, 'F': 1, 'L': 1, 'O': 1, 'W': 1})), ('n:\nB', Counter({'e': 1})), (':\nBe', Counter({'f': 1})), ('\nBef', Counter({'o': 1})), ('Befo', Counter({'r': 1})), ('efor', Counter({'e': 1})), ('fore', Counter({' ': 1})), ('ore ', Counter({'w': 1, 't': 1})), ('re w', Counter({'e': 2})), ('e we', Counter({' ': 2})), (' we ', Counter({'p': 1, 'k': 1, 'm': 1, 'a': 1, 'b': 1})), ('we p', Counter({'r': 1})), ('e pr', Counter({'o': 1})), (' pro', Counter({'c': 1}

In [9]:
def normalize(counter):
    s = float(sum(counter.values()))
    return [(c,cnt/s) for c,cnt in counter.items()]

In [11]:
outlm = {hist:normalize(chars) for hist, chars in result.items()}

In [26]:
print(outlm)

{'~~~~': [('~', 0.8), ('\n', 0.2)], '~~~\n': [('F', 1.0)], '~~\nF': [('i', 1.0)], '~\nFi': [('r', 1.0)], '\nFir': [('s', 1.0)], 'Firs': [('t', 1.0)], 'irst': [(' ', 0.8571428571428571), (',', 0.14285714285714285)], 'rst ': [('C', 0.8333333333333334), ('f', 0.16666666666666666)], 'st C': [('i', 1.0)], 't Ci': [('t', 1.0)], ' Cit': [('i', 1.0)], 'Citi': [('z', 1.0)], 'itiz': [('e', 1.0)], 'tize': [('n', 1.0)], 'izen': [(':', 0.75), ('s', 0.25)], 'zen:': [('\n', 1.0)], 'en:\n': [('B', 0.16666666666666666), ('Y', 0.16666666666666666), ('F', 0.16666666666666666), ('L', 0.16666666666666666), ('O', 0.16666666666666666), ('W', 0.16666666666666666)], 'n:\nB': [('e', 1.0)], ':\nBe': [('f', 1.0)], '\nBef': [('o', 1.0)], 'Befo': [('r', 1.0)], 'efor': [('e', 1.0)], 'fore': [(' ', 1.0)], 'ore ': [('w', 0.5), ('t', 0.5)], 're w': [('e', 1.0)], 'e we': [(' ', 1.0)], ' we ': [('p', 0.2), ('k', 0.2), ('m', 0.2), ('a', 0.2), ('b', 0.2)], 'we p': [('r', 1.0)], 'e pr': [('o', 1.0)], ' pro': [('c', 1.0)], '

In [27]:

def train_char_lm(fname, order=4):
    with open(fname) as f:
        data = f.read()

    lm = defaultdict(Counter)
    pad = "~" * order
    data = pad + data
    for i in range(len(data)-order):
        history, char = data[i:i+order], data[i+order]
        lm[history][char]+=1
    def normalize(counter):
        s = float(sum(counter.values()))
        return [(c,cnt/s) for c,cnt in counter.items()]
    outlm = {hist:normalize(chars) for hist, chars in lm.items()}
    return outlm

Let's train it on Andrej's Shakespeare text:

In [28]:
#!wget http://cs.stanford.edu/people/karpathy/char-rnn/shakespeare_input.txt

URL transformed to HTTPS due to an HSTS policy
--2019-08-06 19:22:36--  https://cs.stanford.edu/people/karpathy/char-rnn/shakespeare_input.txt
Resolving cs.stanford.edu (cs.stanford.edu)... 171.64.64.64
Connecting to cs.stanford.edu (cs.stanford.edu)|171.64.64.64|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4573338 (4.4M) [text/plain]
Saving to: ‘shakespeare_input.txt.1’


2019-08-06 19:22:38 (4.37 MB/s) - ‘shakespeare_input.txt.1’ saved [4573338/4573338]



In [29]:
lm = train_char_lm("shakespeare_input.txt", order=4)

Ok. Now let's do some queries:

In [16]:
lm['ello']

[('r', 0.059625212947189095),
 ('w', 0.817717206132879),
 ('u', 0.03747870528109029),
 (',', 0.027257240204429302),
 (' ', 0.013628620102214651),
 ('.', 0.0068143100511073255),
 ('?', 0.0068143100511073255),
 (':', 0.005110732538330494),
 ('n', 0.0017035775127768314),
 ("'", 0.017035775127768313),
 ('!', 0.0068143100511073255)]

In [17]:
lm['Firs']

[('t', 1.0)]

In [18]:
lm['rst ']

[('C', 0.09550561797752809),
 ('f', 0.011235955056179775),
 ('i', 0.016853932584269662),
 ('t', 0.05377207062600321),
 ('u', 0.0016051364365971107),
 ('S', 0.16292134831460675),
 ('h', 0.019261637239165328),
 ('s', 0.03290529695024077),
 ('R', 0.0008025682182985554),
 ('b', 0.024879614767255216),
 ('c', 0.012841091492776886),
 ('O', 0.018459069020866775),
 ('w', 0.024077046548956663),
 ('a', 0.02247191011235955),
 ('m', 0.02247191011235955),
 ('n', 0.020064205457463884),
 ('I', 0.009630818619582664),
 ('L', 0.10674157303370786),
 ('M', 0.0593900481540931),
 ('l', 0.01043338683788122),
 ('o', 0.030497592295345103),
 ('H', 0.0040128410914927765),
 ('d', 0.015248796147672551),
 ('W', 0.033707865168539325),
 ('K', 0.008025682182985553),
 ('q', 0.0016051364365971107),
 ('G', 0.0898876404494382),
 ('g', 0.011235955056179775),
 ('k', 0.0040128410914927765),
 ('e', 0.0032102728731942215),
 ('y', 0.002407704654895666),
 ('r', 0.0072231139646869984),
 ('p', 0.00882825040128411),
 ('A', 0.0056179

So `ello` is followed by either space, punctuation or `w` (or `r`, `u`, `n`), `Firs` is pretty much deterministic, and the word following `rst ` can start with pretty much any letter.
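
The raw lists above come out in insertion order, which makes them a little hard to scan. A small helper that sorts a distribution by probability can help when eyeballing queries. This is a minimal sketch: `top_k` is a hypothetical name, and the data is just a few entries copied from the `lm['ello']` output rather than the full distribution.

```python
def top_k(dist, k=5):
    """Return the k most probable (char, prob) pairs from a distribution list."""
    return sorted(dist, key=lambda pair: pair[1], reverse=True)[:k]

# A few entries copied from the `lm['ello']` output above (not the full distribution).
ello_subset = [('r', 0.059625212947189095),
               ('w', 0.817717206132879),
               ('u', 0.03747870528109029),
               (',', 0.027257240204429302)]

# The most likely continuation comes first.
print(top_k(ello_subset, k=2))
```

With the trained model in memory, the same call would be `top_k(lm['ello'])`.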

### Generating from the model
Generating is also very simple. To generate a letter, we will take the history, look at the last `order` characters, and then sample a random letter based on the corresponding distribution.

In [20]:
order = 4
history = "~" * order
print(history)
print(history[-order:])

~~~~
~~~~


In [25]:
print(history[-4])

~


In [14]:
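from random import random

def generate_letter(lm, history, order):
    # Look up the distribution for the last `order` characters.
    history = history[-order:]
    dist = lm[history]
    # Draw x in [0, 1) and subtract probabilities until it is used up.
    x = random()
    for c, v in dist:
        x = x - v
        if x <= 0:
            return c
    # Guard against floating-point round-off leaving x slightly above 0.
    return dist[-1][0]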
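
The loop in `generate_letter` is weighted sampling written by hand. For comparison, here is a sketch of the same step using the standard library's `random.choices`. The function name `generate_letter_choices`, the `rng` parameter, and the toy model are assumptions made for this illustration, not part of the original notebook.

```python
import random

def generate_letter_choices(lm, history, order, rng=random):
    """Sample the next character with random.choices instead of a manual loop."""
    dist = lm[history[-order:]]
    chars = [c for c, _ in dist]
    weights = [p for _, p in dist]
    return rng.choices(chars, weights=weights, k=1)[0]

# Tiny hand-made model (hypothetical values, for illustration only).
toy_lm = {"ab": [("c", 0.75), ("d", 0.25)]}
rng = random.Random(0)  # seeded so the run is repeatable
samples = [generate_letter_choices(toy_lm, "xxab", 2, rng) for _ in range(1000)]

# The empirical frequency of "c" should land near 0.75.
print(samples.count("c") / len(samples))
```

Passing a seeded `random.Random` instance makes generated text reproducible, which is handy when comparing models of different orders.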

To generate a passage of $k$ characters, we just seed it with the initial history and run letter generation in a loop, updating the history at each turn.

In [15]:
def generate_text(lm, order, nletters=1000):
    history = "~" * order
    out = []
    for i in range(nletters):
        c = generate_letter(lm, history, order)
        history = history[-order:] + c
        out.append(c)
    return "".join(out)
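
Note that `generate_text` assumes every history it reaches was seen during training. On a large corpus that almost always holds, but the last `order` characters of the training text may never have been followed by anything. The sketch below runs the whole train-and-generate loop on an in-memory string and stops cleanly in that case. The names `train_from_string` and `sample_text` and the toy corpus are assumptions for this example.

```python
import random
from collections import Counter, defaultdict

def train_from_string(data, order=4):
    """Same counting scheme as train_char_lm, but reading from a string."""
    data = "~" * order + data
    counts = defaultdict(Counter)
    for i in range(len(data) - order):
        counts[data[i:i + order]][data[i + order]] += 1
    return {h: [(c, n / sum(cs.values())) for c, n in cs.items()]
            for h, cs in counts.items()}

def sample_text(lm, order, nletters, rng):
    """Generate up to nletters characters, stopping at an unseen history."""
    history, out = "~" * order, []
    for _ in range(nletters):
        dist = lm.get(history[-order:])
        if dist is None:  # history never seen in training (e.g. end of the text)
            break
        chars, probs = zip(*dist)
        c = rng.choices(chars, weights=probs, k=1)[0]
        out.append(c)
        history += c
    return "".join(out)

toy_text = "hello world, hello there, hello again. "  # toy corpus, an assumption
toy_lm = train_from_string(toy_text, order=3)
print(sample_text(toy_lm, 3, 60, random.Random(1)))
```

Every 4-character window of the output also occurs in the padded training string, which is exactly what an unsmoothed order-3 model guarantees.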

### Generated Shakespeare from different order models

Let's try to generate text based on different language-model orders. Let's start with something silly:

### order 2:

In [24]:
lm = train_char_lm("shakespeare_input.txt", order=2)
print(generate_text(lm, 2))

Fir?

KINIO:
'And tre's of th st.
Use ofand froo:
Wit whis HENRY:
'Amord fore youlnestaiust lof nige
All; wid stak your ded gromercur an
That up-ey on mins this my
To and anty youly hin you
Thar,
I th by froilesterce, awlet strint husir, yould,
I doe. Yourlike ned.
Mur careavoy peastesese frectus, I sh lethe fiet reat inglin hipposeetter hoares can: be I se:
Yout grom theas vion.

LO:
Yout of to-ad so, ve.

But
SIMONTERINCELEOPHENCE:
Antlesse! onews haver,
LUEENRY Brothe th se
fet,
That unt of Conwou sin she reones!

Folesteno my hall ardeventlendear wathand gend I me siout ithathe se drave tanclaceirew in of mou king, if the then digus faid beed,
To his thery for shous dis masictake tre the thands is whis hould hee? I he the sted
Whis his,
TUS:
Why pithes begnot ifeet was'tir a th my divere.

BET:
Badiagank her. To halreve low nue, anty le ague, by, counhaver tappres ing prow themonforded. Hart, caus,
And park, JOHN:
And awbold my grou cre em; hishal deassuch, ford,
HARDICKLY:
I witat

Not so great... but what if we increase the order to 4?

### order 4

In [25]:
lm = train_char_lm("shakespeare_input.txt", order=4)
print(generate_text(lm, 4))

First robe!
And no greated in whose love?

ROMEO:
Sir, lecondition. No, What not; who, and by there, that still, you benefits colourse times our long of for heel:
By when and, and he's exceed. Come, letter;
Give godson his thee, Cassio:
And, look! most eart thou will.
Let thank love the change savoury at that see Colour. Know to be more marveth but for my mistress made that is favour England but to me bless done and married; when to a poor
And turn.

SALA:
My prince, mortals, do not to righter of and hopelessess' chain, you country that is
A lipping best he times
All it will.
Companio! in Suffolk fort's a wretch, sir. I send upon
justice his hoster bless, Forth story:
No, no bound examplexion.
Safer good man a bather and uncolt out ear, I says of hourself,
That, intend yet among fault an of kind? Moves he his was held my cannot as father sour, thou said the never my he how to my death shall be shall
Unwill burr! varlets! I say you--of and victor all i' the kind
Hath my month of his for

In [26]:
lm = train_char_lm("shakespeare_input.txt", order=4)
print(generate_text(lm, 4))

First, how into the shallow.

FRIAR LAURENCE:
In the must them son better
apple,
Touch of provost?

PAROLLES:
Here I will the empest,
Each ye when hath night it in your gentler, I may she day note appare to it is a moiety consult themselves man, unlike thank thou fall
To one.

LAUNCE:
My will proofs out, anon, so evasive most be for my chance! why I, thout with you on these could in thy ament will her is nature, Jove that are is a not soldier me, shalt happily pardolph and expect of those phrase;
And shall my kneel do leavens she close
With repair?

KING LEAR:
No, no and queen's unhappy to not a brothere,
whisperate a sweet flies a good drink.
Murder, he been your paid inciation your calus, like a hundred; and, and thou and Elizabething day I have west thou not part be sun,
From quite, weeping.
O Bucking vow
Forgot be well throat.

FALSTAFF:
'Farewell speak ope thing in nighty call thers?

PROTEUS:
With agains lion the false?

ADRIANA:
Wherefore to-morrosition
The should thee is no to 

This is already quite reasonable, and reads like English. Just 4 letters of history! What if we increase it to 7?

### order 7

In [27]:
lm = train_char_lm("shakespeare_input.txt", order=7)
print(generate_text(lm, 7))

First Carrier:
I though the fruits that the admitted.

CANIDIUS:
My charged with him Prince of disability may more,
Full surfeit of use.

GOWER:
No; fifteen years
Of Lewis and I promised: yet look more of melting of them.

HAMLET:
How came to me,
With no rash and taken from you
He had recovered slave!

DUCHESS:
Yet you twain
Did she cross-garter'd murder'd Pompey.

PORTIA:
To offence herself to lack sons.

METELLUS CIMBER:
O, do yet but you do?

KENT:
What a place
Of generally, mine hast but man, in hand,
True swain: he is enough: this poor isle and come where you were best.

KATHARINA:
I saw him for my heart wept blood, I did love,
And I proceed.

ISABELLA:
O wonder'd this day
To these with Caesar follow bias-drawing darest thy chamber-doors.

ALENCON:
Leave your swords forbid I should possess'd your suit
In giving in love is me to these lazy knave!
Come not concern'd man your virtue, you are wives. I will
do his kinsman, she is.

Sexton:
Which hold?

LUCIO:
Thou almost mortality is r

### How about 10?

In [28]:
lm = train_char_lm("shakespeare_input.txt", order=10)
print(generate_text(lm, 10))

First Citizen:
Your belly's answer? What!
The king is my love!

OLIVIA:
Your lordship.

PAROLLES:
This is Illyria, lady.

VIOLA:
Dear lady,--

OLIVIA:
'Tis in grain, sir; 'twill away toward Dover, do it for ancient castle;
Through bog, through want of speaking!
Thou, old Adam's likeness, I do well; there's no converting of 'em: now
An honest tale speeds best being pluck'd from myself and them
To some man else:
The world against thee by Jove's side. Yet come again to Venice. Waste no time in words,
Brags of him
To be my children. Here
they come to reprehend him: abominable and bad.

AEMELIA:
Be quiet, or--More light, more light and within; let me alone with him and tell her she is much credit to you.

ANGELO:
Nay, I'll help thee to pay thee plenteous bosom: the very pin of his horror! Ring the beast lived, was killed with remorse,
That it runs out.

ROSS:
You must perforce he could not
say he lied?

ARIEL:
No.

PROTEUS:
Than men their way: I eyed them
Even to the view; in their births g

### This works pretty well

With an order of 4, we already get quite reasonable results. Increasing the order to 7 (~a word and a half of history) or 10 (~two short words of history) already gets us quite passable Shakespearean text. I'd say it is on par with the examples in Andrej's post. And how simple and un-mystical the model is!

### So why am I impressed with the RNNs after all?

Generating English a character at a time -- not so impressive in my view. The RNN needs to learn the previous $n$ letters, for a rather small $n$, and that's it. 

However, the code-generation example is very impressive. Why? Because of the context awareness. Note that in all of the posted examples, the code is well indented, the braces and brackets are correctly nested, and even the comments start and end correctly. This is not something that can be achieved by simply looking at the previous $n$ letters. 

If the examples are not cherry-picked, and the output is generally that nice, then the LSTM did learn something not trivial at all.

Just for the fun of it, let's see what our simple language model does with the Linux kernel code:

In [29]:
!wget http://cs.stanford.edu/people/karpathy/char-rnn/linux_input.txt

URL transformed to HTTPS due to an HSTS policy
--2019-08-03 19:20:28--  https://cs.stanford.edu/people/karpathy/char-rnn/linux_input.txt
Resolving cs.stanford.edu (cs.stanford.edu)... 171.64.64.64
Connecting to cs.stanford.edu (cs.stanford.edu)|171.64.64.64|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6206996 (5.9M) [text/plain]
Saving to: ‘linux_input.txt’


2019-08-03 19:20:30 (3.95 MB/s) - ‘linux_input.txt’ saved [6206996/6206996]



In [31]:
lm = train_char_lm("linux_input.txt", order=10)
print(generate_text(lm, 10))

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 5809090: invalid start byte

In [32]:
lm = train_char_lm("linux_input.txt", order=15)
print(generate_text(lm, 15))

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 5809090: invalid start byte
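
The `UnicodeDecodeError` above is raised because `open(fname)` is decoding the file as UTF-8 and hits the byte `0xa9`, which is not valid UTF-8 on its own. One possible workaround, sketched here as an assumption rather than something the original notebook did, is to read the file with a single-byte encoding such as Latin-1, where every byte maps to some character. The helper name `read_text_lenient` is hypothetical, and the demo uses a small temporary file instead of `linux_input.txt`.

```python
import os
import tempfile

def read_text_lenient(fname, encoding="latin-1"):
    """Read a file with a single-byte encoding so every byte decodes to a character."""
    with open(fname, encoding=encoding) as f:
        return f.read()

# Demonstrate on a small temporary file containing the offending byte 0xa9.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"/* Copyright \xa9 */\n")
    path = tmp.name

try:
    text = read_text_lenient(path)
    print(repr(text))
finally:
    os.remove(path)  # clean up the temporary file
```

Another option is to keep UTF-8 and pass `errors="replace"` to `open`, which substitutes a replacement character for undecodable bytes. Either way, the same change would go in the `open` call inside `train_char_lm`.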

In [52]:
lm = train_char_lm("linux_input.txt", order=20)
print(generate_text(lm, 20))

/*
 * linux/kernel/irq/spurious.c
 *
 * Copyright (C) 2004 Nadia Yvette Chambers
 */

#include <linux/irq.h>
#include <linux/mutex.h>
#include <linux/capability.h>
#include <linux/suspend.h>
#include <linux/shm.h>

#include <asm/uaccess.h>
#include <linux/interrupt.h>
#include "kdb_private.h"

/*
 * Table of kdb_breakpoints
 */
kdb_bp_t kdb_breakpoints[KDB_MAXBPT];

static void kdb_setsinglestep(struct pt_regs *regs)
{
	struct swevent_htable *swhash = &per_cpu(swevent_htable, cpu);

	mutex_lock(&swhash->hlist_mutex);
	swhash->online = true;
	if (swhash->hlist_refcount)
		swevent_hlist_release(swhash);

	mutex_unlock(&show_mutex);

	return 0;
}

/*
 * Unshare file descriptor table if it is being shared
 */
static int unshare_fs(unsigned long unshare_flags, struct cred **new_cred)
{
	struct cred *cred = current_cred();

	retval = -EPERM;
	if (rgid != (gid_t) -1) {
		if (gid_eq(old->gid, kegid) ||
		    gid_eq(old->sgid, kegid) ||
		    gid_eq(old->sgid, kegid) ||
		    gid_eq(old->egid, 

In [33]:
print(generate_text(lm, 20))

KeyError: '~~~~~~~~~~~~~~~~~~~~'

In [34]:
print(generate_text(lm, 20, nletters=5000))

KeyError: '~~~~~~~~~~~~~~~~~~~~'

Order 10 is pretty much junk. In order 15 things sort-of make sense, but we jump abruptly, and by order 20 we are doing quite nicely, but are far from keeping good indentation and brackets.

How could we? We do not have the memory, and these things are not modeled at all. While we could quite easily enrich our model to also keep track of brackets and indentation (by adding information such as "have I seen ( but not )" to the conditioning history), this requires extra work, non-trivial human reasoning, and will make the model significantly more complex. 
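
To make the "have I seen ( but not )" idea concrete, here is a rough sketch of what such an enriched history could look like: the counting key becomes a pair of the usual $n$-letter history and a flag saying whether a parenthesis is currently open. The function name `train_with_paren_flag`, the choice of a single boolean flag, and the toy input are all assumptions for illustration. It does not handle nesting depth, other bracket types, or indentation.

```python
from collections import Counter, defaultdict

def train_with_paren_flag(data, order=4):
    """Count next characters conditioned on (history, unclosed-paren flag)."""
    counts = defaultdict(Counter)
    padded = "~" * order + data
    depth = 0  # number of '(' seen so far without a matching ')'
    for i in range(len(padded) - order):
        history, char = padded[i:i + order], padded[i + order]
        counts[(history, depth > 0)][char] += 1
        # Update the depth after counting, so the flag describes the text before `char`.
        if char == "(":
            depth += 1
        elif char == ")" and depth > 0:
            depth -= 1
    return counts

toy = "f(a, b, c, d) + g; x, y"  # hypothetical toy input
counts = train_with_paren_flag(toy, order=2)

# The same two-character history now has separate statistics inside and outside parentheses.
print(counts[(", ", True)])
print(counts[(", ", False)])
```

Even this one flag has to be designed, computed, and threaded through generation by hand, which is the extra human work the paragraph above refers to.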

The LSTM, on the other hand, seems to have just learned it on its own. And that's impressive.