<h1>LSTMs</h1>
<p>Long short-term memory (LSTM) units are units of a recurrent neural network (RNN). An RNN composed of LSTM units is often called an LSTM network. A common LSTM unit is composed of a cell, an input gate, an output gate and a forget gate. The cell remembers values over arbitrary time intervals and the three gates regulate the flow of information into and out of the cell.</p>
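<p>As a rough illustration of how the cell and the three gates interact, the following NumPy sketch runs a single LSTM unit forward over a few time steps. It follows the common textbook formulation. The function name <code>lstm_step</code>, the stacked weight layout and the random toy inputs are assumptions made for this example and are not taken from Keras internals.</p>

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One forward step of a standard LSTM cell.

    W has shape (4*H, D), U has shape (4*H, H) and b has shape (4*H,).
    The four blocks are stacked as [input gate, forget gate, output gate, candidate].
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate: how much new information enters the cell
    f = sigmoid(z[H:2 * H])        # forget gate: how much of the old cell state is kept
    o = sigmoid(z[2 * H:3 * H])    # output gate: how much of the cell is exposed
    g = np.tanh(z[3 * H:4 * H])    # candidate values to write into the cell
    c_t = f * c_prev + i * g       # new cell state
    h_t = o * np.tanh(c_t)         # new hidden state (the unit's output)
    return h_t, c_t


# Toy dimensions and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```

<p>The cell state <code>c_t</code> is updated additively, which is what lets a value persist over many steps when the forget gate stays close to 1.</p>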

<p>LSTM networks are well-suited to classifying, processing and making predictions based on time series data, since there can be lags of unknown duration between important events in a time series. LSTMs were developed to deal with the vanishing gradient problem that can be encountered when training traditional RNNs; exploding gradients can still occur and are usually handled separately, for example with gradient clipping. Relative insensitivity to gap length is an advantage of LSTMs over plain RNNs, hidden Markov models and other sequence learning methods in numerous applications.</p>
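<p>The gradient problem can be seen with simple arithmetic. The sketch below assumes a deliberately simplified scalar, linear recurrence, so the numbers only illustrate the trend. The helper <code>gradient_scale</code> and the constants are made up for this example.</p>

```python
# Toy scalar model: in a linear RNN h_t = w * h_{t-1}, the gradient of h_T
# with respect to h_0 is w ** T.
def gradient_scale(w, steps):
    return w ** steps


steps = 100
vanishing = gradient_scale(0.9, steps)   # shrinks towards zero
exploding = gradient_scale(1.1, steps)   # grows very large

# Along the LSTM cell-state path c_t = f * c_{t-1} + ..., the corresponding
# factor is f ** T if the forget gate stays at a constant value f
# (a simplification; in practice f changes at every step).
lstm_path = gradient_scale(0.999, steps)

print(vanishing, exploding, lstm_path)
```

<p>With a forget gate close to 1, the factor along the cell-state path stays near 1 even after 100 steps, whereas the plain recurrence either collapses or blows up.</p>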

<h1>How LSTMs Work</h1>
<p>Sequence prediction problems have been around for a long time and are considered among the hardest problems in data science. They cover a wide range of tasks: from predicting sales to finding patterns in stock market data, from understanding movie plots to recognizing speech, and from language translation to predicting the next word on your iPhone’s keyboard.</p>

<p>With the recent breakthroughs in data science, Long Short-Term Memory networks (LSTMs) have proven to be among the most effective solutions for many of these sequence prediction problems.</p>


In [1]:
from __future__ import print_function
from keras.callbacks import LambdaCallback
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers import LSTM
from keras.optimizers import RMSprop
from keras.utils import get_file  # public import path; keras.utils.data_utils is not available in newer Keras releases
import numpy as np
import random
import sys
import io

Using TensorFlow backend.


In [2]:
#Download file and parse
path = get_file('450baud.txt', origin='http://textfiles.com/computers/450baud.txt')
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('corpus length:', len(text))

corpus length: 7172


In [4]:
chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

total chars: 54


In [5]:
# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

nb sequences: 2378
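<p>The sequence count reported above can be checked by hand. The sketch below wraps the same windowing loop in a hypothetical helper <code>make_windows</code>, applies it to a short toy string, and then recomputes the number of windows for the corpus length, <code>maxlen</code> and <code>step</code> used in this notebook.</p>

```python
def make_windows(text, maxlen, step):
    # Same loop as in the cell above: each window of maxlen characters
    # is paired with the single character that follows it.
    sentences, next_chars = [], []
    for i in range(0, len(text) - maxlen, step):
        sentences.append(text[i: i + maxlen])
        next_chars.append(text[i + maxlen])
    return sentences, next_chars


def count_windows(text_len, maxlen, step):
    # Number of start positions visited by range(0, text_len - maxlen, step).
    return len(range(0, text_len - maxlen, step))


toy_sentences, toy_next = make_windows("hello world", maxlen=4, step=3)
print(toy_sentences, toy_next)

# Corpus length 7172, maxlen 40, step 3, as in the cells above.
n_sequences = count_windows(7172, 40, 3)
print(n_sequences)
```

<p>Because <code>step</code> is smaller than <code>maxlen</code>, consecutive windows overlap, which is why the comment in the cell calls the sequences semi-redundant.</p>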


In [6]:
print('Vectorization...')
# Use the built-in bool: the np.bool alias was removed in NumPy 1.24.
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Vectorization...
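<p>To make the shapes concrete, here is the same one-hot encoding applied to a tiny example. The three-character vocabulary and the toy windows are assumptions chosen for illustration. The real arrays have shape <code>(number of sequences, maxlen, number of characters)</code> for <code>x</code> and <code>(number of sequences, number of characters)</code> for <code>y</code>.</p>

```python
import numpy as np

toy_chars = ["a", "b", "c"]                       # toy vocabulary (assumption)
toy_indices = {c: i for i, c in enumerate(toy_chars)}
toy_sentences = ["ab", "bc", "ca"]                # windows of length 2
toy_next = ["c", "a", "b"]                        # character following each window

x_toy = np.zeros((len(toy_sentences), 2, len(toy_chars)), dtype=bool)
y_toy = np.zeros((len(toy_sentences), len(toy_chars)), dtype=bool)
for i, sentence in enumerate(toy_sentences):
    for t, char in enumerate(sentence):
        x_toy[i, t, toy_indices[char]] = True     # one True per time step
    y_toy[i, toy_indices[toy_next[i]]] = True     # one True per target

# Taking the argmax over the last axis recovers the original characters.
decoded = ["".join(toy_chars[k] for k in row.argmax(axis=1)) for row in x_toy]
print(x_toy.shape, y_toy.shape, decoded)
```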


In [7]:
# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

optimizer = RMSprop(learning_rate=0.01)  # `lr` is the older, deprecated name of this argument
model.compile(loss='categorical_crossentropy', optimizer=optimizer)


def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)


def on_epoch_end(epoch, logs):
    # Function invoked at end of each epoch. Prints generated text.
    print()
    print('----- Generating text after Epoch: %d' % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print('----- diversity:', diversity)

        generated = ''
        sentence = text[start_index: start_index + maxlen]
        generated += sentence
        print('----- Generating with seed: "' + sentence + '"')
        sys.stdout.write(generated)

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.

            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]

            generated += next_char
            sentence = sentence[1:] + next_char

            sys.stdout.write(next_char)
            sys.stdout.flush()
        print()

print_callback = LambdaCallback(on_epoch_end=on_epoch_end)

model.fit(x, y,
          batch_size=128,
          epochs=60,
          callbacks=[print_callback])

Build model...
Epoch 1/60
----- Generating text after Epoch: 0
----- diversity: 0.2
----- Generating with seed: " it to the phone line at any of those sp"
 it to the phone line at any of those sp i                                                                                                                                         o       t                   i                                    a      a a                       t                                                                                     o                                                                            
----- diversity: 0.5
----- Generating with seed: " it to the phone line at any of those sp"
 it to the phone line at any of those sp o ii   o  o  t    o rot     i os   goa  e r   i    et  e     o0 st   i  a  hd  i  o   ho noo  o    e  hi i aonh  e   o   d tth   b a  tt  0et a  dse   hoao    e    a  ao ktota e   a ap0   e oo ua    be o r  iot  o  s   t  o    or  n   oe  he  h   an  to  te   s  hti  o e

                                                                                                                                                                                                                                                                                                                                                                                                                                                        
----- diversity: 1.2
----- Generating with seed: "                                        "
                                                                                                                                                                                                                                                                                                                   '                                                                                                                                    
Epoch 5/60
----- Generatin

care at what rate is ren the sunct the aud the d the forer speed the sunce the s perthe sune commund the f aud res as the f the sund rere the foud rerer com un the foud rate the sund rot or the foud rate chic sund rere the forer the foud ren the foud rate the sund rou the speed the sunce the f us the suncand at ac the cand at                                                                                               
----- diversity: 0.5
----- Generating with seed: "the hayes doesn't
care at what rate is r"
the hayes doesn't
care at what rate is re the f mate the comeus roud and calle communicat u                                                                                                                                                                                                                                                                                                                                                             
----- diversity: 1.0
----- Generating with s

typically, they are at and regestat seste the baud rate comment ataunpertos - chine is the aud retsogss-232-c iste the baud rate gesers 2ivisors the some notions at and seceivisor sente the aud rete and sectivisor
mandis someanding at and reteet the baud rate gesers wethend is a derestess.

                                                                                                                                 
----- diversity: 0.5
----- Generating with seed: " this rbbs system.
typically, they are a"
 this rbbs system.
typically, they are atte programment
 of 450 baud mede. the some. thes ammegnsows. thatever, wesee speeds 450 baud rate geseed at seme. to dings 450 baud rate comming atdensow of atyer of 450 baud rate divisor
speeds at sent. 450 baud ad and reseed ou sonee is 450 baud rate divisor
manderat, wereges of to ander oprerers at the baud rate gesers.
thenmmince to inges withe horemand o to ingeramm.
nmend weregelyer regelye
----- diversity: 1.0
----- Generating with se

inttath an peltot sespcog speed. 450 baud oate aros the baud rate geserte to eratos i/ falle daud rate gesertor to inges is to ander to inpertion operanion or to and
s to conerato the hous seme. to sings fate command
ratt commont rate geserte to erats, wothe dos most oror most onlte ou to trogrammibg the 
----- diversity: 0.5
----- Generating with seed: "   now that we understand the workings o"
   now that we understand the workings of 1900 baud uate an  seedrand. ta aud un to the vaus heres moseiges. we to con. the "oud attes bat sestrorsoport ss-232-c isienthithe dommunications withe fom andis to contrans sisnce fate as portess the hosem comming at ountionsis sereing the hosem the fom conllyichis wetecamode. to sopertion operanie somming the vamibate. to 1900
and is to controns. homever, 4immand
inttatises the hosem commort 
----- diversity: 1.0
----- Generating with seed: "   now that we understand the workings o"
   now that we understand the workings of xelllye aud rete. fros bat

                                                                                                                                                                                                                                                                                                          
----- diversity: 0.5
----- Generating with seed: "
what this means is that the microproces"

what this means is that the microproces out speed. 450 baud wetacac operatio somertorropestes to the mose speeds - ppecros - the rousing the "aud reges or 450 baud. ib  portos - caces mose the speeds - for aspeed. wetases operageand
(nttatayas dowe to divisor pegtross. the somer to the various mode. to seneratos ope. bate asopctor mede. to sings whes made geltr asee the mose spord we baud he
mose commodem theranly sommectioss.

       
----- diversity: 1.0
----- Generating with seed: "
what this means is that the microproces"

what this means is that the microproces rome notrats. the spmed.

 fond a t

        oud & 3as a doreis com pente far unatran and settran operanly same notis the iom bitis the hoperations. the pommont seatrand horem seatate acomently faud rate is
m se thor divisor
regestens. whthe acanctions. the pors. whth the houd rate divisor
regrsterso2ipe to speed, 450 bau
----- diversity: 0.5
----- Generating with seed: "1200
baud - if you open your communicati"
1200
baud - if you open your communications mithe far prorras the rous to dines falle daud rate divisor regss the hor speeds fror a sestibl the houd rate divisor
r to dings with thein "aud rete as portabls dove orang 450 baud
an the rommand
it to ander to sonertis sedistionsoations. the pors. whth thein " fameans
ihinativisor
ments thes programming it
somment atf ander 3ase not
, wereching 450 baud oate apat sabe. at assested the hounou
----- diversity: 1.0
----- Generating with seed: "1200
baud - if you open your communicati"
1200
baud - if you open your communications. the pommod the rome notis sythed speeds opr

in the hoperam and we the speeds - the rages do medeme the rasicanions whec ar 450 baud weterag sed..
then opilly raniens. the hous modem can be is mesei
----- diversity: 0.5
----- Generating with seed: "e
modem transmit it to the phone line at"
e
modem transmit it to the phone line at as- 3ince frod mise the hoperam and withic wilhe foat andopesat  a  ounor the romes to mivisor regssteramipnband hiceich wivisor regesteromoprowivithat vation yo  pertly sw50 baud we acan the baud rate geserat oudivithor peed. weracaba aring 450 baud we the speeds - the iomert at supportes withe dovaba ioptaba aud rate geserat he to geners is the hoperamivishor programmibg in thihivall an opterac
----- diversity: 1.0
----- Generating with seed: "e
modem transmit it to the phone line at"
e
modem transmit it to the phone line at as-eg at asse soat bate aplor speed
aid this the hoperand
itte to inpert ouming the baud rations. ty u at medem the variousommunications. ifte the hous whtr the hors (oneente the r

mode) is capable of ienpente the vanious mite iomechir opera peatly spoers withir yajeed a thitc isinct

         3fb' baud &s
frod an ato sentrt
 a d retesusomarighsof mibis fall chipe speed
allyed
and receinlled wethcthars see ly ur 450 baud oa moseing the "ammer

a ar os-2cogs a fomerchirs of hopeenure to contmodem

aplor "sud moded is the however, 450 baud pe to is yos highur "suding it  s the som prorran y su0p
Epoch 47/60
----- Generating text after Epoch: 46
----- diversity: 0.2
----- Generating with seed: "l allow up to 50 per cent faster
communi"
l allow up to 50 per cent faster
communications.

     x'3f8'

                oud x'3f8'00'
ou  oud &h8we noud or the hor mises the hor mise the fam and ibmor dion operanior or 450 baud operation. withe hor ming the fom 1itt can loading the daye orer " ud mes s232cand we the speeds - frog a tun to sentrat e to the damm. not speed. 450 baud opera is
m the various me to 450 baud, it the various me erom 19taac an the rommort sest the "s

sequence oprmibitihirh, we acaus wherand s'3ababe. in the rommode. tat  aud rate geseromos is
mode. in the rommand
inttran the hosemops abes operan and
rand. ibm peroros ioceing 450 baud operages at us to spoed
a portes opemaricata, opre your iste to 450 baud, it a sestede operages mose divisor regesters wilthor yaue can be is
sportes od misicedions. hove iom oroprans wilat a denereate to 1900
baud ibith, w
----- diversity: 1.0
----- Generating with seed: "ud rate loaded. the following
sequence o"
ud rate loaded. the following
sequence opreisy whis oading atd reaslle doand r te divisor regse the ros. with the home opder made commodem tatacablle divisor regs the modemorswers.

               3ope to 450 baud in is
selteing 450 baud whtransabior or ingeswinge to comere to 450 baud is
6crlond modem to 450 baud
anding and sicr fras weichar "subjhet
speed. 450 baud, isisor ibbudinete and speedly 450 baud ad misis wher of regs.
whensti
----- diversity: 1.2
----- Generating with seed: "ud rat

segerattosmm-ccors.

        hoped
hases speed
and hises base comment y ur gesers 2- whecaceiom order basic programmab' conmoped
im pror fate. this operabasic dith this madem
cans iste prorswers to the rommode divinations. the " ac miste the vareing the " ad mess the hovemions withichipg the x'3250 can thopreist 450 baud
 a tupersof the hov
----- diversity: 1.0
----- Generating with seed: "ta across the rs-232-c interface. this
b"
ta across the rs-232-c interface. this
baud rate ges.
thene erre sede. the hor mmseen y weanaus
- the x'mmmoc is thaus mase. dith the masem pror
aste the " un misis the baud rate ges
mode to dings the command
a deneres is o dever, we enorsteromivis
(f batioss we an that speed. 450 bates we an pror askevely speeds or 450 baud, at assegerate or to 19320' o  to t5 tslow the most speed y 1900 baud modem
 pcoropelload optrable vasian!
rate b
----- diversity: 1.2
----- Generating with seed: "ta across the rs-232-c interface. this
b"
ta across the rs-232-c interface

<keras.callbacks.History at 0xd26ce4c10>
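<p>The <code>diversity</code> values in the log are passed to <code>sample</code> as the temperature. The sketch below isolates the reweighting step to show its effect on a made-up three-class distribution. The helper name <code>reweight</code> and the probabilities are assumptions for illustration, and the maximum is subtracted before exponentiating purely for numerical stability, which does not change the result.</p>

```python
import numpy as np


def reweight(preds, temperature):
    # Same transformation as in sample(): log, divide by temperature, softmax.
    logits = np.log(np.asarray(preds, dtype="float64")) / temperature
    exp_logits = np.exp(logits - logits.max())
    return exp_logits / exp_logits.sum()


probs = [0.7, 0.2, 0.1]  # hypothetical model output
for temperature in (0.2, 1.0, 1.2):
    print(temperature, np.round(reweight(probs, temperature), 3))
```

<p>A temperature below 1 concentrates probability on the most likely character, giving repetitive but safer text. A temperature above 1 flattens the distribution, giving more varied but noisier text, which matches the pattern visible in the generated samples.</p>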

In [3]:
import tensorflow as tf
import numpy
import pandas as pd

filepath = "/Users/khumbokaunda/Desktop/BIGDATA/DATASETS/breast_cancer.csv"

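# batch_size is a required argument; 32 is an arbitrary value chosen for illustration.
dataset = tf.contrib.data.make_csv_dataset(filepath, batch_size=32)
iterator = dataset.make_initializable_iterator()
columns = iterator.get_next()
with tf.Session() as sess:
    sess.run([iterator.initializer])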

ImportError: cannot import name batching

In [11]:
from sklearn import preprocessing

le = preprocessing.LabelEncoder()
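le.fit(csv["diagnosis"])  # fit on the column's values, not on the column name


LabelEncoder()

In [None]:
list(le.classes_)  # should list the distinct diagnosis labels ('B' and 'M') when fitted on the unmapped column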


In [5]:
df = pd.read_csv("/Users/khumbokaunda/Desktop/BIGDATA/DATASETS/breast_cancer.csv").values
with tf.python_io.TFRecordWriter("csv.tfrecords") as writer:
    for row in df:  # iterate over the rows of values loaded above
        features, label = row[:-1], row[-1]
        example = tf.train.Example()
        example.features.feature["features"].float_list.value.extend(features)
        example.features.feature["label"].int64_list.value.append(label)
        writer.write(example.SerializeToString())

TypeError: 'M' has type str, but expected one of: int, long, float

In [9]:
csv = pd.read_csv("/Users/khumbokaunda/Desktop/BIGDATA/DATASETS/breast_cancer.csv")

In [12]:
csv["diagnosis"] = csv["diagnosis"].map({"M": 0.0, "B": 1.0})
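<p>The manual mapping and scikit-learn's <code>LabelEncoder</code> both turn the string labels into numbers, but they do not assign the same codes. <code>LabelEncoder</code> numbers the distinct labels in sorted order, which the sketch below imitates with <code>np.unique</code> on a small made-up label array so that it runs without scikit-learn.</p>

```python
import numpy as np

labels = np.array(["M", "B", "B", "M"])   # toy labels (assumption)

# Sorted distinct labels and the integer code of each entry,
# mirroring the ordering LabelEncoder uses for classes_.
classes, codes = np.unique(labels, return_inverse=True)

# The explicit mapping used in the notebook cell above.
mapping = {"M": 0.0, "B": 1.0}
manual = np.array([mapping[v] for v in labels])

print(classes, codes, manual)
```

<p>With sorted ordering, <code>B</code> gets code 0 and <code>M</code> gets code 1, the opposite of the manual mapping, so the two approaches should not be mixed within one pipeline.</p>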

In [19]:
good_bye_list = ['Unnamed: 32']
csv.drop(good_bye_list, axis=1, inplace=True)
csv.head(5)

Unnamed: 0,id,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,...,radius_worst,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst
0,842302,0.0,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,842517,0.0,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,84300903,0.0,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758
3,84348301,0.0,11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,...,14.91,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173
4,84358402,0.0,20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,...,22.54,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678


In [20]:
# Separate the label from the features: diagnosis is not the last column.
labels = csv["diagnosis"].astype(int).values
features = csv.drop("diagnosis", axis=1).values
with tf.python_io.TFRecordWriter("csv.tfrecords") as writer:
    # Iterate over rows of values; iterating the DataFrame itself yields column names.
    for feature_row, label in zip(features, labels):
        example = tf.train.Example()
        example.features.feature["features"].float_list.value.extend(feature_row)
        example.features.feature["label"].int64_list.value.append(int(label))
        writer.write(example.SerializeToString())