# Training the model on your Facebook data

## Extracting your Facebook data

Downloading your Facebook data is straightforward. On the Facebook website, go to Settings -> Your Facebook Information -> Download your information. From there you can choose how much data to download. Note that these files are typically very large and that Facebook may take a few days to prepare them.

For this project, all you need is your **messages** data, downloaded as a **high**-quality **JSON** file.
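
For orientation, here is a minimal sketch of how one person's messages could be pulled out of a conversation file. It assumes each JSON file holds a top-level `messages` list whose entries carry `sender_name`, `timestamp_ms` and (for text messages) `content`; check your own export, since the layout may differ. The helper `messages_from` is hypothetical and is not the `extract` function used in this notebook.

```python
import json

def messages_from(json_text, sender):
    """Return the text of every message sent by `sender`, oldest first."""
    conversation = json.loads(json_text)
    rows = [
        (m.get("timestamp_ms", 0), m["content"])
        for m in conversation.get("messages", [])
        # Entries without "content" (photos, stickers, ...) are skipped.
        if m.get("sender_name") == sender and "content" in m
    ]
    return [text for _, text in sorted(rows, key=lambda r: r[0])]

# Tiny made-up conversation, only to show the assumed shape of the file.
example = json.dumps({
    "messages": [
        {"sender_name": "Alice", "timestamp_ms": 2, "content": "second"},
        {"sender_name": "Bob", "timestamp_ms": 3, "content": "not mine"},
        {"sender_name": "Alice", "timestamp_ms": 1, "content": "first"},
        {"sender_name": "Alice", "timestamp_ms": 4, "photos": []},
    ]
})
print("\n".join(messages_from(example, "Alice")))
```

Writing the result one message per line gives a plain text file that a character-level model can learn from, including where messages end.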

In [4]:
import sys
sys.path.append('../')
from src.data.make_dataset import data, extract
from src.models import predict_model, train_model

In [5]:
# Enter the location of the raw data (raw string, so the backslashes are kept literally):
raw_data_path = r'E:\messages\inbox\GurbirSinghJohal_vLCZzZ04BQ'

# Extracting data
extract(raw_data_path, output_filepath = '../data')

In [9]:
# Creating Data class
gurb_data = data()

# Enter the message file location (raw string, so the backslashes are kept literally)
gurb_data.encode(input_filepath=r'..\data\GurbirSinghJohal_vLCZzZ04BQ\Gurbir_Singh_Johal_messages.txt')

array([ 68,  38,   5, ..., 137,  81,  39])
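
The array above is the message text with every character replaced by an integer. A minimal sketch of that kind of encoding is shown below, assuming each distinct character is simply given an index in sorted order. The names `build_vocab`, `encode` and `decode` are illustrative; the project's `data.encode` may assign indices differently.

```python
def build_vocab(text):
    """Map each distinct character to an integer index."""
    chars = tuple(sorted(set(text)))
    char2int = {ch: i for i, ch in enumerate(chars)}
    return chars, char2int

def encode(text, char2int):
    """Turn a string into a list of integer ids."""
    return [char2int[ch] for ch in text]

def decode(ids, chars):
    """Turn a list of integer ids back into a string."""
    return "".join(chars[i] for i in ids)

text = "lol lang"
chars, char2int = build_vocab(text)
ids = encode(text, char2int)
print(ids)                 # [3, 5, 3, 0, 3, 1, 4, 2]
print(decode(ids, chars))  # lol lang
```

The size of the vocabulary (`len(chars)`) fixes the input and output width of the network.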

## Create and train the model

We now instantiate a character-level RNN (an LSTM followed by a linear layer) built on PyTorch's `nn.Module` class.


In [10]:
# Define parameters of the RNN:
n_hidden = 512
n_layers = 2

net = train_model.CharRNN(gurb_data.chars, n_hidden, n_layers)
print(net)

CharRNN(
  (lstm): LSTM(166, 512, num_layers=2, batch_first=True, dropout=0.5)
  (dropout): Dropout(p=0.5, inplace=False)
  (fc): Linear(in_features=512, out_features=166, bias=True)
)
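
For readers who want to see what a module with this printed structure might look like, here is a minimal sketch reconstructed from the summary alone. It assumes one-hot input vectors of width 166 and a forward pass that runs LSTM, then dropout, then the linear layer. The class name `CharRNNSketch` is hypothetical, and the real `train_model.CharRNN` may differ in its constructor and forward pass.

```python
import torch
from torch import nn

class CharRNNSketch(nn.Module):
    """Character-level LSTM with the same layers as the printed summary."""

    def __init__(self, n_chars, n_hidden=512, n_layers=2, drop_prob=0.5):
        super().__init__()
        self.lstm = nn.LSTM(n_chars, n_hidden, n_layers,
                            dropout=drop_prob, batch_first=True)
        self.dropout = nn.Dropout(drop_prob)
        self.fc = nn.Linear(n_hidden, n_chars)

    def forward(self, x, hidden=None):
        # x: (batch, seq_length, n_chars) one-hot vectors
        out, hidden = self.lstm(x, hidden)
        out = self.dropout(out)
        logits = self.fc(out)  # one score per character at every position
        return logits, hidden

sketch = CharRNNSketch(n_chars=166)
x = torch.zeros(4, 10, 166)        # dummy batch: 4 sequences of 10 characters
logits, (h, c) = sketch(x)
print(logits.shape)                # torch.Size([4, 10, 166])
```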


In [11]:
batch_size = 128
seq_length = 100
n_epochs = 80 # start smaller if you are just testing initial behavior

# train the model
train_model.train(net, gurb_data, epochs=n_epochs, batch_size=batch_size, seq_length=seq_length, lr=0.001, print_every=10)

Epoch: 1/80... Step: 10... Loss: 3.4641... Val Loss: 3.4132 time: 19.4
Epoch: 2/80... Step: 20... Loss: 3.3941... Val Loss: 3.3230 time: 20.9
Epoch: 2/80... Step: 30... Loss: 3.3340... Val Loss: 3.3036 time: 22.5
Epoch: 3/80... Step: 40... Loss: 3.3203... Val Loss: 3.3039 time: 24.0
Epoch: 3/80... Step: 50... Loss: 3.3023... Val Loss: 3.3017 time: 25.6
Epoch: 4/80... Step: 60... Loss: 3.2712... Val Loss: 3.3008 time: 27.1
Epoch: 5/80... Step: 70... Loss: 3.3122... Val Loss: 3.2988 time: 28.6
Epoch: 5/80... Step: 80... Loss: 3.3048... Val Loss: 3.2963 time: 30.1
Epoch: 6/80... Step: 90... Loss: 3.2880... Val Loss: 3.2952 time: 31.6
Epoch: 6/80... Step: 100... Loss: 3.2724... Val Loss: 3.2916 time: 33.1
Epoch: 7/80... Step: 110... Loss: 3.2582... Val Loss: 3.2874 time: 34.6
Epoch: 8/80... Step: 120... Loss: 3.2890... Val Loss: 3.2795 time: 36.1
Epoch: 8/80... Step: 130... Loss: 3.2556... Val Loss: 3.2679 time: 37.6
Epoch: 9/80... Step: 140... Loss: 3.2427... Val Loss: 3.2501 time: 39.1
...
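
To make `batch_size` and `seq_length` concrete, the sketch below cuts an encoded text into mini-batches in which each target row is the input row shifted one character ahead. This is a common scheme for character-level models and is an assumption about what `train_model.train` does internally; `get_batches` is an illustrative name.

```python
def get_batches(ids, batch_size, seq_length):
    """Yield (inputs, targets); each is batch_size rows of seq_length ids."""
    chars_per_batch = batch_size * seq_length
    # Subtracting 1 leaves room for the target of the very last input.
    n_batches = (len(ids) - 1) // chars_per_batch
    row_len = n_batches * seq_length  # characters each row covers overall
    for start in range(0, row_len, seq_length):
        xs, ys = [], []
        for row in range(batch_size):
            offset = row * row_len + start
            xs.append(ids[offset:offset + seq_length])
            # Targets are the same window moved one character to the right.
            ys.append(ids[offset + 1:offset + seq_length + 1])
        yield xs, ys

batches = list(get_batches(list(range(13)), batch_size=2, seq_length=3))
for xs, ys in batches:
    print(xs, ys)
# [[0, 1, 2], [6, 7, 8]] [[1, 2, 3], [7, 8, 9]]
# [[3, 4, 5], [9, 10, 11]] [[4, 5, 6], [10, 11, 12]]
```

Each row continues where it left off in the previous batch, which is what lets the hidden state be carried from one batch to the next.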

## Sampling the model

In [12]:
print(predict_model.sample(net, 1000, prime='lang', top_k=5))

lang
To do it’s aloin is tho
Im a this to some mine there
Wana go a fucking thats
Shaptit
Alses for u
We seen the perg to do
I defent to get init
Im gna go sore anyonine
The plobabiry if or camar thr shit
Trying in later long mare
Lang
I doennt don’t we can’t dingers in toll in the more and im a chist me ban is tore take mate in my
Idk
This is is the from sume stuff
Love u did in that then then
Lol them stull is a forget an the curd
Then we go shit it onengre init
Lol
Indenting sandig lang
Yh hear u went it
I do it as my room
I did its a thower in
What i weer
Where u said
Lol they call starts
Wat mate mate
They warking torry
In my fucking sore
I have a few mines of there in library
In lol
Lang have to shit in the farm
Sak
I have to go stop the peng
With up to go they they doing them
Sure me
I shudve a bo leave it
Was i doing to start to come but its a they day
Walk same to hand the recker i week in a funded
Well will see work sem up on so tord it
Then suco u
It doing
Shope is it to go 
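
The `top_k=5` argument presumably restricts each step to the five most likely next characters. A minimal sketch of top-k sampling over raw scores is given below, under that assumption. `sample_top_k` is a hypothetical helper, not the code inside `predict_model.sample`.

```python
import math
import random

def sample_top_k(logits, k, rng=random):
    """Pick an index from the k highest-scoring entries, weighted by softmax."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

rng = random.Random(0)
logits = [2.0, 0.1, 1.5, -3.0, 0.7]
draws = [sample_top_k(logits, k=2, rng=rng) for _ in range(200)]
print(sorted(set(draws)))  # only indices 0 and 2 can be drawn when k=2
```

A small `k` keeps the text closer to plausible spellings, while a larger `k` adds variety at the cost of more noise.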

## Save and load the model

In [14]:
# Save the model (raw string, so the backslash is kept literally):

train_model.save(net, name='Gurbot', loc=r'..\models')

In [16]:
# Load the model (raw string, so the backslashes are kept literally):

net2 = train_model.load(r'..\models\Gurbot.net')
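
One common way to implement this kind of save and load in PyTorch is to store the weights (`state_dict`) together with the arguments needed to rebuild the network. The sketch below shows that pattern on a tiny stand-in layer. `save_checkpoint` and `load_checkpoint` are hypothetical names, and the contents of the project's `.net` file are not known from this notebook.

```python
import os
import tempfile

import torch
from torch import nn

def save_checkpoint(model, path, **hyperparams):
    """Store the weights plus the arguments needed to rebuild the model."""
    torch.save({"hyperparams": hyperparams,
                "state_dict": model.state_dict()}, path)

def load_checkpoint(path, build_model):
    """Rebuild the model with `build_model`, then restore its weights."""
    checkpoint = torch.load(path)
    model = build_model(**checkpoint["hyperparams"])
    model.load_state_dict(checkpoint["state_dict"])
    model.eval()  # switch off dropout before sampling
    return model

# Round trip on a small stand-in layer rather than the full network.
tiny = nn.Linear(3, 2)
path = os.path.join(tempfile.mkdtemp(), "tiny.net")
save_checkpoint(tiny, path, in_features=3, out_features=2)
restored = load_checkpoint(path, nn.Linear)
print(torch.equal(tiny.weight, restored.weight))  # True
```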

In [17]:
# Sample loaded model:

print(predict_model.sample(net2, 1000, prime='lang', top_k=5))

lang u didnt get it is a torda is sonter seenst me thit lol
Whit wen is time in mad the confucting
Lmfao
Short work u done lol
What we was some that work
Idk
I cant see it was it or some mich is to did it the fucking stiffering them
Linear all be library
I cunt shit is sheets fucked
Lmao
Lol what to used a fit
And sell
In my to do to deen there shit
Where u going things aswell we dont warking sormer
Shough u say at in the course sick they
Weer to stepper the sheets a see we can me u go the piss
Where u go sumbred
Lot like u wont analying it then
In the probabit
The didnt mean in man
Like im gna cand andersting in attell
Lol
U wana get me and this it was it to sard in the prigurt
In literally witho u do it
Its are u seats
I he wud gat my time
Come treap
So what we get it to tell
Yes i dont have
Never started and shutt all dont a calm
Lool it
We want tayed a to me
Lang
U did in the comporte all the from it agee is one is it
Im gna do it
That
We can do it
Lol
Yep i seen u day went
We chan