EmbedLayer #1872
Conversation
Commits referenced in this pull request:
* Blobs are N-D arrays (for N not necessarily equal to 4); num/channels/height/width indexing remains valid.
* When loading parameters from a saved NetParameter, keep the param Blob shape the layer has set rather than necessarily adopting the one from the saved net (e.g., keep a new 1D bias shape rather than the (1 x 1 x 1 x D) shape from a legacy net).
* Check shape of input mean: when setting the mean, assert that it is either one pixel or an array with shape equal to the input data size.
* With layers whose backward passes accumulate gradients, each iteration accumulates gradients over iter_size batches before the parameters are updated, effectively decoupling the computational batch from the SGD minibatch (see the sketch after this list).
* Add a double-precision atomicAdd implementation (double impl from the NVIDIA dev docs; the float impl is included in CUDA as "atomicAdd").
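The iter_size behavior can be illustrated outside Caffe. Below is a minimal NumPy sketch (all names are illustrative, not Caffe's API): gradients from iter_size computational batches are summed into one accumulator and the parameters are updated once per accumulation, so the effective SGD minibatch is iter_size times the computational batch.

```python
import numpy as np

# Toy linear model y = x @ w with squared-error loss. Gradients accumulate
# over `iter_size` computational batches before a single parameter update,
# so the effective SGD minibatch is iter_size * batch_size.
rng = np.random.default_rng(0)
w = np.zeros(5)
true_w = rng.normal(size=5)
iter_size, batch_size, lr = 4, 8, 0.1

for iteration in range(100):
    grad = np.zeros_like(w)              # accumulator, analogous to the param diff blob
    for _ in range(iter_size):
        x = rng.normal(size=(batch_size, 5))
        y = x @ true_w
        err = x @ w - y
        grad += x.T @ err / batch_size   # backward pass adds into the accumulator
    w -= lr * grad / iter_size           # one update per iter_size batches
```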
EmbedLayer has a blob that stores the vocabulary_size x embedding_size embeddings. During Forward/Backward, only the involved words' embeddings are used for computation, but all the embeddings (the whole blob) are updated during solving (Solver::ComputeUpdateValue, Blob::Update). Is my understanding correct?
@jzhang533 yes, that's correct; it has the same behavior as other parameter layers.
@jeffdonahue thanks for clarifying. I'm trying to learn embeddings for a large vocabulary and will try to figure out a way to avoid needless computation during solving.
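To make the point above concrete, here is a small NumPy sketch (not Caffe code; the variable names are illustrative) of why the whole embedding blob participates in the update even though only the looked-up rows receive nonzero gradients: a dense solver update touches every row of the param and diff blobs regardless of which rows were used in the batch.

```python
import numpy as np

vocabulary_size, embedding_size = 10, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(vocabulary_size, embedding_size))    # the parameter blob
indices = np.array([2, 5, 2])                             # words in the batch

# Forward: only the involved rows are read.
out = W[indices]

# Backward: scatter-add the top gradient into a dense diff blob.
top_diff = np.ones_like(out)
W_diff = np.zeros_like(W)
np.add.at(W_diff, indices, top_diff)    # rows 2 and 5 are nonzero, the rest stay 0

# Solver-style dense update (cf. Blob::Update): every row is touched, even
# rows whose diff is zero -- which is why a large vocabulary still pays for
# a full update each iteration.
lr = 0.1
W -= lr * W_diff
```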
Based on #1486 (N-D blobs) and #1663 (parameter gradient accumulation). This adds `EmbedLayer` (the name should probably change to `EmbeddingLayer` for consistency with `PoolingLayer`, etc.), which essentially learns a lookup table for integer inputs, useful for language modeling and the like. Its computation is equivalent to an `InnerProductLayer` whose inputs are "one-hot" vectors, but instead of explicitly representing the one-hot vectors (which wastes a lot of memory), it assumes the input itself gives the index of the "hot" entry of each one-hot vector (like the label inputs for the categorical losses). This should probably be replaced with SparseInnerProduct (#937) once that's merged, assuming that's faster -- this is a more lightweight change (or at least it will be once #1486 is merged) that continues the unfortunate trend of casting floats to ints as labels.
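The one-hot equivalence described above can be checked in a few lines of NumPy (a sketch under assumed shapes and names, not the layer implementation):

```python
import numpy as np

vocabulary_size, embedding_size, batch_size = 6, 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(vocabulary_size, embedding_size))       # learned lookup table
labels = rng.integers(0, vocabulary_size, size=batch_size)   # integer "hot" indices

# EmbedLayer-style forward: index directly into the table.
embed_out = W[labels]

# InnerProductLayer-style forward: materialize one-hot vectors and multiply.
one_hot = np.zeros((batch_size, vocabulary_size))
one_hot[np.arange(batch_size), labels] = 1.0
ip_out = one_hot @ W

assert np.allclose(embed_out, ip_out)   # same result, without the one-hot memory cost
```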