How to do stacked LSTM with attention using this framework? #30

Closed
rjpg opened this issue Jun 12, 2019 · 3 comments
rjpg commented Jun 12, 2019

Hello,

I have run your code successfully.

I have also included a stacked LSTM in your code:

from keras.layers import Input, Dense, LSTM
from keras.models import Model

def model_attention_applied_before_lstm():
    # TIME_STEPS, INPUT_DIM and attention_3d_block come from the repository's example script
    inputs = Input(shape=(TIME_STEPS, INPUT_DIM,))
    attention_mul = attention_3d_block(inputs)
    lstm_units = 32
    # two stacked LSTM layers; only the last one collapses the sequence
    attention_mul = LSTM(lstm_units, return_sequences=True)(attention_mul)
    attention_mul = LSTM(lstm_units, return_sequences=False)(attention_mul)
    output = Dense(1, activation='sigmoid')(attention_mul)
    model = Model(inputs=[inputs], outputs=output)
    return model
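A minimal usage sketch, assuming TIME_STEPS and INPUT_DIM are defined as in the repository's example and using arbitrary dummy data just to illustrate the expected shapes:

import numpy as np

model = model_attention_applied_before_lstm()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# dummy data, for shape illustration only
x = np.random.rand(16, TIME_STEPS, INPUT_DIM)
y = np.random.randint(0, 2, size=(16, 1))
model.fit(x, y, epochs=1, batch_size=4)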

But maybe this is not the correct way to apply a stacked LSTM with attention, right?

My ultimate goal is to include attention in this code (classification of multivariate time series):


from keras.layers import Input, Dense, GRU, Bidirectional
from keras.models import Model

class LSTMNet:
    @staticmethod
    def build(timeSteps, variables, classes):
        inputNet = Input(shape=(timeSteps, variables))
        # three stacked bidirectional GRU layers; only the last one collapses the sequence
        lstm = Bidirectional(GRU(100, recurrent_dropout=0.4, dropout=0.4, return_sequences=True), merge_mode='concat')(inputNet)
        lstm = Bidirectional(GRU(50, recurrent_dropout=0.4, dropout=0.4, return_sequences=True), merge_mode='concat')(lstm)
        lstm = Bidirectional(GRU(20, recurrent_dropout=0.4, dropout=0.4, return_sequences=False), merge_mode='concat')(lstm)
        # a softmax classifier
        classificationLayer = Dense(classes, activation='softmax')(lstm)
        model = Model(inputNet, classificationLayer)
        return model
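A minimal sketch of building and compiling this network; the dimensions below are arbitrary, chosen only for illustration:

# hypothetical dimensions for illustration
model = LSTMNet.build(timeSteps=50, variables=10, classes=3)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()  # expects inputs of shape (batch, 50, 10) and outputs of shape (batch, 3)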

Thanks in advance for any info.


rjpg commented Jul 1, 2019

OK, it was simple:


lstm = Bidirectional(LSTM(100, recurrent_dropout=0.4, dropout=0.4, return_sequences=True), merge_mode='concat')(inputNet)  # worse using stateful=True
# lstm = SeqSelfAttention(attention_activation='sigmoid')(lstm)
lstm = attention_3d_block(lstm, timeSteps)
lstm = Bidirectional(LSTM(50, recurrent_dropout=0.4, dropout=0.4, return_sequences=True), merge_mode='concat')(lstm)  # worse using stateful=True
lstm = attention_3d_block(lstm, timeSteps)
lstm = Bidirectional(LSTM(20, recurrent_dropout=0.4, dropout=0.4, return_sequences=False), merge_mode='concat')(lstm)  # worse using stateful=True
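For context, a sketch of how these lines could slot into the LSTMNet.build method from the first post, assuming the attention_3d_block(inputs, timesteps) signature shown in the next comment (an illustration, not the exact code from the issue):

class LSTMNet:
    @staticmethod
    def build(timeSteps, variables, classes):
        inputNet = Input(shape=(timeSteps, variables))
        # attention is applied between the recurrent layers that still return sequences
        lstm = Bidirectional(LSTM(100, recurrent_dropout=0.4, dropout=0.4, return_sequences=True), merge_mode='concat')(inputNet)
        lstm = attention_3d_block(lstm, timeSteps)
        lstm = Bidirectional(LSTM(50, recurrent_dropout=0.4, dropout=0.4, return_sequences=True), merge_mode='concat')(lstm)
        lstm = attention_3d_block(lstm, timeSteps)
        lstm = Bidirectional(LSTM(20, recurrent_dropout=0.4, dropout=0.4, return_sequences=False), merge_mode='concat')(lstm)
        classificationLayer = Dense(classes, activation='softmax')(lstm)
        return Model(inputNet, classificationLayer)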


rjpg commented Jul 1, 2019

By the way, I tried to use attention with Conv1D, so that the kernel size specifies how many neighbouring steps contribute to the importance of the step in question, and the results improved:

from keras.layers import Conv1D, Multiply

def attention_3d_block(inputs, timesteps):
    # inputs: (batch, time_steps, input_dim); timesteps is kept for the call signature but is not needed here
    input_dim = int(inputs.shape[2])
    # one softmax-activated filter per input variable; the kernel size (3) controls how many
    # neighbouring steps contribute to each step's attention weight
    a_probs = Conv1D(input_dim, 3, strides=1, padding='same', activation='softmax')(inputs)
    output_attention_mul = Multiply()([inputs, a_probs])  # name='attention_mul'
    return output_attention_mul

This way you also do not need to permute: it builds the attention weights over the time steps rather than over the variables, without any Permute layer.
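A quick shape check of this block (a minimal sketch; the dimensions are arbitrary):

from keras.layers import Input
from keras.models import Model

TIME_STEPS, INPUT_DIM = 20, 6  # arbitrary values for the check
x_in = Input(shape=(TIME_STEPS, INPUT_DIM))
x_out = attention_3d_block(x_in, TIME_STEPS)
print(Model(x_in, x_out).output_shape)  # (None, 20, 6): same shape as the input, no Permute involved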

philipperemy (Owner) commented

@rjpg thanks! The attention block has since been updated, so this may be deprecated now.
