
How to implement a conv layer with different filter sizes (Zhang & Wallace 2015)? #6547

Closed
ben0it8 opened this issue May 8, 2017 · 13 comments

ben0it8 commented May 8, 2017

Hello,

I'm trying to reproduce the CNN architecture proposed in this paper: a single convolutional layer with several filter widths (two filters of each width in the figure), followed by global max pooling and dropout:
[screenshot: architecture diagram from the paper]

Is there a way to implement this architecture in Keras?

Best,
ben0it8

kgrm commented May 9, 2017

Apply different convolutional layers on the same input and merge their outputs?
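
A minimal sketch of this idea in the Keras functional API; the sequence length, embedding size, and filter count below are placeholders, not values from the paper:

from keras.models import Model
from keras.layers import Input, Conv1D, GlobalMaxPooling1D, concatenate

seq_len, emb_dim, n_filters = 100, 300, 64   # placeholder sizes
inp = Input(shape=(seq_len, emb_dim))        # already-embedded sequences
pooled = []
for kw in (3, 4, 5):                         # one Conv1D per filter width
    conv = Conv1D(n_filters, kw, activation='relu')(inp)
    pooled.append(GlobalMaxPooling1D()(conv))
merged = concatenate(pooled)                 # shape: (None, 3 * n_filters)
model = Model(inputs=inp, outputs=merged)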

fmailhot commented May 22, 2017

I'm in the middle of figuring this out myself. Here's what I think is necessary.

  1. You'll need to replicate your inputs across each of the input "channels" (i.e. for each filter width).
  2. You then do a "concatenate" merge after the GlobalMaxPooling1D on the Conv1D layer outputs (the diagram appears to show two "merges", but I don't believe the second one is necessary).

Have a look at the following for inspiration:
https://gist.github.com/ameasure/944439a04546f4c02cb9
https://statcompute.wordpress.com/2017/01/08/an-example-of-merge-layer-in-keras/

Let me know if you've made any progress, and I'll do the same.

@fmailhot

Here's what I ended up doing. It appears to do the right thing, but I'm still new enough to Keras that I haven't figured out how to introspect it properly to make sure...

# NOTE: this relies on the legacy Merge layer, which only exists in older
# Keras (e.g. 2.1.x; see the later comments in this thread about its removal).
from keras.models import Sequential
from keras.layers import (Embedding, Conv1D, GlobalMaxPooling1D,
                          Dense, Dropout, Activation, Merge)

submodels = []
for kw in (3, 4, 5):    # kernel sizes
    submodel = Sequential()
    # frozen, pre-trained word embeddings
    submodel.add(Embedding(len(word_index) + 1,
                           EMBEDDING_DIM,
                           weights=[embedding_matrix],
                           input_length=MAX_SEQUENCE_LENGTH,
                           trainable=False))
    submodel.add(Conv1D(FILTERS,
                        kw,
                        padding='valid',
                        activation='relu',
                        strides=1))
    submodel.add(GlobalMaxPooling1D())
    submodels.append(submodel)
big_model = Sequential()
big_model.add(Merge(submodels, mode="concat"))  # concatenate pooled features
big_model.add(Dense(HIDDEN_DIMS))
big_model.add(Dropout(P_DROPOUT))
big_model.add(Activation('relu'))
big_model.add(Dense(1))
big_model.add(Activation('sigmoid'))
print('Compiling model')
big_model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
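
On the introspection point: summary() prints layer-by-layer output shapes and parameter counts, and plot_model() can draw the graph. A quick check, assuming pydot and graphviz are installed for the second call:

big_model.summary()

from keras.utils import plot_model
plot_model(big_model, to_file='big_model.png', show_shapes=True)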

ben0it8 commented May 22, 2017

I was trying to fit your implementation but got:
ValueError: The model expects 3 input arrays, but only received one array. Found: array with shape (48943, 300)

Any idea?

@fmailhot

Yes, this is what I meant about "replicating the inputs"...sorry, I should have included the fit() call to clarify.

hist = big_model.fit([x_train, x_train, x_train],
                     y_train,
                     batch_size=BATCH_SIZE,
                     epochs=EPOCHS,
                     validation_data=([x_val, x_val, x_val], y_val),
                     callbacks=callbacks)

As you can see, I pass x_train and x_val as my training/validation inputs. Because I'm using three different filter sizes, the net expects three separate input streams; feeding a list containing the input NUM_KERNEL_SIZES times handles that.
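
If the repetition bothers you, the input list can be built programmatically so it stays in sync with the number of kernel sizes; an equivalent call:

kernel_sizes = (3, 4, 5)
hist = big_model.fit([x_train] * len(kernel_sizes),
                     y_train,
                     batch_size=BATCH_SIZE,
                     epochs=EPOCHS,
                     validation_data=([x_val] * len(kernel_sizes), y_val),
                     callbacks=callbacks)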

ben0it8 commented May 23, 2017

Thank you for sharing that!

stale bot commented Aug 22, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

nikicc commented Sep 11, 2017

The same problem seems to be addressed and solved in this issue using the Graph model.

@wt-huang

Closing as this is resolved.

@Yash-099

> Here's what I ended up doing, which appears to be doing the right thing... [fmailhot's code snippet, quoted above]

Regarding the code @fmailhot posted above: when I tried to run it, I got an error saying there is no layer named Merge().

@hazemAmir

I had the same problem with the Merge() layer. I solved it by downgrading Keras:
pip uninstall keras
pip install keras==2.1.2

@gamertrue

> Yes, this is what I meant about "replicating the inputs"... [fmailhot's fit() call, quoted above]

I tried this, but I got the error "list indices must be integers or slices, not ListWrapper". I used Concatenate instead of Merge... Does anyone have a solution to this issue?
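
In Keras versions where Merge is gone, the usual rewrite is the functional API with a Concatenate layer and a single shared Input, which also removes the need to replicate x_train. A sketch, reusing the names from fmailhot's snippet (EMBEDDING_DIM, word_index, etc. assumed defined as above):

from keras.models import Model
from keras.layers import (Input, Embedding, Conv1D, GlobalMaxPooling1D,
                          Concatenate, Dense, Dropout, Activation)

inp = Input(shape=(MAX_SEQUENCE_LENGTH,))
emb = Embedding(len(word_index) + 1,
                EMBEDDING_DIM,
                weights=[embedding_matrix],
                input_length=MAX_SEQUENCE_LENGTH,
                trainable=False)(inp)        # one shared, frozen embedding
pooled = []
for kw in (3, 4, 5):                         # kernel sizes
    conv = Conv1D(FILTERS, kw, padding='valid',
                  activation='relu', strides=1)(emb)
    pooled.append(GlobalMaxPooling1D()(conv))
x = Concatenate()(pooled)
x = Dense(HIDDEN_DIMS)(x)
x = Dropout(P_DROPOUT)(x)
x = Activation('relu')(x)
out = Dense(1, activation='sigmoid')(x)

big_model = Model(inputs=inp, outputs=out)
big_model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])

# Single input, so no list replication:
# big_model.fit(x_train, y_train, ...)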
