ChatLSTM network for chat representation

End-to-end deep learning solution for PAN 2012 predator identification Task 1 (http://pan.webis.de/clef12/pan12-web/author-identification.html). The current solution is far from the best solutions: the F0.5 score is about 0.27 on the test set.

Topology

Chat LSTM-Cell:

Two standard LSTM cells that "communicate" with each other.

Input gate: i_t = sigm(W^(i) x_t + U^(i) h_{t-1} + b^(i))
Forget gate: f_t = sigm(W^(f) x_t + U^(f) h_{t-1} + b^(f))
Output gate: o_t = sigm(W^(o) x_t + U^(o) h_{t-1} + b^(o))
Response gate: s_t = sigm(W^(s) h__t + U^(s) h_{t-1} + b^(s))
u_t = tanh(W^(u) x_t + U^(u) h_{t-1} + b^(u))
e_t = tanh(W^(e) h__t + U^(e) h_{t-1} + b^(e))
c_t = u_t * i_t + c_{t-1} * f_t + e_t * s_t
h_t = o_t * tanh(c_t)

h_ is the other LSTM's hidden state, so h__t is that state at step t.
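As an illustration only, the cell equations above can be written as a single NumPy step function. This is a minimal sketch, not the code in ChatLSTM.py: the names `init_params` and `chat_lstm_step`, the weight initialization, and the toy dimensions are all hypothetical. The source does not say how the two cells are scheduled, so the sketch assumes each cell reads the other cell's state from the previous step.

```python
import numpy as np

def sigm(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_params(input_dim, hidden_dim, rng):
    """Random weights for one cell; keys mirror the W, U, b of the equations."""
    p = {}
    for g in "ifosue":
        # The s and e terms read the other cell's hidden state, not x_t.
        in_dim = hidden_dim if g in "se" else input_dim
        p["W" + g] = rng.normal(0, 0.1, (hidden_dim, in_dim))
        p["U" + g] = rng.normal(0, 0.1, (hidden_dim, hidden_dim))
        p["b" + g] = np.zeros(hidden_dim)
    return p

def chat_lstm_step(p, x_t, h_prev, c_prev, h_other):
    """One step of one cell; h_other is the other cell's hidden state."""
    def term(g, inp, act):
        return act(p["W" + g] @ inp + p["U" + g] @ h_prev + p["b" + g])
    i_t = term("i", x_t, sigm)         # input gate
    f_t = term("f", x_t, sigm)         # forget gate
    o_t = term("o", x_t, sigm)         # output gate
    s_t = term("s", h_other, sigm)     # response gate
    u_t = term("u", x_t, np.tanh)      # candidate from own input
    e_t = term("e", h_other, np.tanh)  # candidate from the other cell
    c_t = u_t * i_t + c_prev * f_t + e_t * s_t
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Toy run: two cells (a and b), 5 steps, input size 8, hidden size 4.
rng = np.random.default_rng(0)
pa, pb = init_params(8, 4, rng), init_params(8, 4, rng)
ha = hb = ca = cb = np.zeros(4)
for x_a, x_b in zip(rng.normal(size=(5, 8)), rng.normal(size=(5, 8))):
    # Assumption: both cells see the other's state from the previous step.
    (ha, ca), (hb, cb) = (chat_lstm_step(pa, x_a, ha, ca, hb),
                          chat_lstm_step(pb, x_b, hb, cb, ha))
```

Compared with a standard LSTM, the only change is the extra e_t * s_t term in the cell state, which lets one speaker's cell take in information from the other speaker's cell.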

Hierarchical structure:

Each message is vectorized with an LSTM. The vectors of each message pair are merged, and the merged vectors are the input of the ChatLSTM.
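The merge step can be pictured with a small shape-level sketch. The source does not specify how message pairs are formed or how their vectors are merged, so both choices here are assumptions: pairs are consecutive messages, and merging is concatenation. The function name `merge_message_pairs` is hypothetical.

```python
import numpy as np

def merge_message_pairs(message_vectors):
    """Concatenate consecutive message vectors pairwise (assumed merge).

    message_vectors: array of shape (num_messages, dim), one row per
    message, as produced by a message-level encoder.
    Returns an array of shape (num_messages // 2, 2 * dim).
    """
    msgs = np.asarray(message_vectors)
    usable = (len(msgs) // 2) * 2  # drop a trailing unpaired message
    return msgs[:usable].reshape(usable // 2, 2 * msgs.shape[1])

# Five toy message vectors of dimension 2 give two merged pairs.
merged = merge_message_pairs(np.arange(10).reshape(5, 2))
```

Each row of `merged` would then be one time step of the chat-level network.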

Regularization

Direction regularization: In the vector representation of the chat, the two parts of the vector represent each speaker's intention. These two vectors should point in the same direction if the speakers think the same about the topic. For this reason, the normalized cosine distance of the two vectors is used as a regularization term in the loss function.
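A minimal sketch of such a penalty follows, under two assumptions: the chat vector is split into two equal halves, one per speaker, and "normalized" means the cosine distance is rescaled to the range [0, 1]. The function name `direction_penalty` is hypothetical and the repository may compute this differently.

```python
import numpy as np

def direction_penalty(chat_vector, eps=1e-8):
    """Cosine distance between the two halves of a chat vector, in [0, 1]."""
    a, b = np.split(np.asarray(chat_vector, dtype=float), 2)
    cos_sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    # Assumed normalization: 0 for the same direction, 1 for opposite.
    return (1.0 - cos_sim) / 2.0

same = direction_penalty([1.0, 0.0, 1.0, 0.0])       # halves are parallel
opposite = direction_penalty([1.0, 0.0, -1.0, 0.0])  # halves are opposed
```

The penalty would be multiplied by the regularization weight (0.1 in the parameter list) and added to the classification loss.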

F-0.5 measure optimization

The dataset is very unbalanced and the F0.5 metric is used to evaluate the model, so plain cross-entropy is not good enough, because it optimizes for accuracy on balanced classes. F-measures are weighted harmonic means of precision and recall.

Modified objective, where y is the prediction and z is the label:

'(1+alpha*z)*cross-entropy(y,z)+(1+alpha*z)*beta*log(y)'

The first part optimizes for recall: a higher alpha gives higher recall.
The second part gives higher precision if beta is higher.
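The modified objective can be sketched in NumPy as follows. Several details are assumptions rather than facts from the source: the cross-entropy is taken to be binary cross-entropy, the loss is averaged over the batch, and the default weights are taken from the recall and precision parameters of the training script. The function name `modified_loss` is hypothetical.

```python
import numpy as np

def modified_loss(y, z, alpha=45.0, beta=0.05, eps=1e-7):
    """Weighted cross-entropy plus a log(y) term, averaged over the batch.

    y: predicted probabilities of the positive class, z: 0/1 labels.
    """
    y = np.clip(np.asarray(y, dtype=float), eps, 1.0 - eps)
    z = np.asarray(z, dtype=float)
    # Assumption: binary cross-entropy.
    cross_entropy = -(z * np.log(y) + (1.0 - z) * np.log(1.0 - y))
    weight = 1.0 + alpha * z  # positive examples count (1 + alpha) times
    # Minimizing log(y) pushes predictions toward 0, i.e. fewer positives.
    return np.mean(weight * cross_entropy + weight * beta * np.log(y))

plain = modified_loss([0.5], [1.0], alpha=0.0, beta=0.0)
upweighted = modified_loss([0.5], [1.0], alpha=1.0, beta=0.0)
```

With both weights set to 0 the expression reduces to plain cross-entropy, which matches the "0 means no change" note on the precision weight.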

Command line usage

Train

'python3 train.py'

Parameters

Number of words in the dictionary: '--top_word_num=5000'
Dimension of the word representation: '--word_dim='
Dimension of the message representation: '--hidden_dim1='
Dimension of the chat representation: '--hidden_dim2='
Number of neurons in the first fully connected layer: '--dense_hidden='
Dropout probability in fully connected layer: '--dropout=0.5'
Batch size: '--batch_size=128'
Number of epochs: '--max_epoch_number=150'
Size of the validation data: '--validation_size=0.1'
Use ChatLSTM or basic LSTM for chat representation: '--use_chat_LSTM=True'
Use direction regularization: '--direction_reg_on=True'
Weight of the direction regularization: '--direction_reg=0.1'
Message length in words, used for bucketing: '--message_length=40'
Chat length, used for bucketing: '--chat_length=40'
Weight of the recall: '--recall_parameter=45'
Weight of the precision, 0 means no change: '--precision_parameter=0.05'

The best models are saved to the tmp folder.

Test the saved models

'python3 test.py filename'

Requirements

Keras 1.3<=...<2.0
