InvalidArgumentError (see above for traceback): indices[478] = 5451 is not in [0, 5451) #43
Comments
The 'pivot_ids' and 'target_ids' in 'skipgram.txt' appear to be one-based.
The Keras Tokenizer outputs one-based ids and states: "0 is a reserved index that won't be assigned to any word."
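A minimal sketch of the mismatch described above, using a hand-built word-to-id dict in place of the actual Keras Tokenizer (the vocabulary and matrix width here are made up for illustration): a Keras-style tokenizer assigns ids 1..V with 0 reserved, so the largest id equals the vocabulary size and overflows a matrix with only V rows.

```python
import numpy as np

# Hypothetical vocabulary of V = 5 words; a Keras-style tokenizer assigns
# ids 1..V (0 is reserved), so the largest id equals V.
word_index = {w: i + 1 for i, w in enumerate(["the", "cat", "sat", "on", "mat"])}
vocab_size = len(word_index)       # 5
max_id = max(word_index.values())  # 5 == vocab_size

# An embedding matrix with only vocab_size rows has valid rows 0..4, so
# looking up id 5 is out of bounds -- the same shape mismatch as
# "indices[478] = 5451 is not in [0, 5451)".
embeddings = np.zeros((vocab_size, 8))
try:
    embeddings[max_id]
except IndexError:
    print("id", max_id, "overflows a matrix with", vocab_size, "rows")

# Sizing the matrix as vocab_size + 1 leaves row 0 for the reserved
# index and makes every tokenizer id a valid row.
embeddings = np.zeros((vocab_size + 1, 8))
assert embeddings[max_id].shape == (8,)
```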
Strange... This could well be because I was running those other experiments I mentioned over email. I might have made changes that I shouldn't have pushed back up here. A very stupid bug/mistake on my part.
In load_glove() the indices are zero-based. Could this have something to do with the exception?
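If load_glove() really does build a zero-based matrix while the tokenizer emits one-based ids, one way to reconcile them is to prepend a row for the reserved index 0. This is a hedged sketch, not the repo's actual code; the function name and shapes below are assumptions.

```python
import numpy as np

def align_glove_with_tokenizer(glove_matrix):
    """Prepend a zero row so that row i holds the vector for tokenizer id i.

    Assumes glove_matrix stores a one-based vocabulary zero-based: its
    row 0 corresponds to tokenizer id 1, row 1 to id 2, and so on.
    """
    pad_row = np.zeros((1, glove_matrix.shape[1]), dtype=glove_matrix.dtype)
    return np.vstack([pad_row, glove_matrix])

# Illustrative sizes matching the error message (5451 vocab, 300-d GloVe).
glove = np.random.rand(5451, 300).astype(np.float32)
aligned = align_glove_with_tokenizer(glove)
print(aligned.shape)  # (5452, 300)
```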
I get this error even with: load_embeds = False
Found the source of this issue; fix coming. It's actually a TensorFlow version issue, I think. I don't get the issue on TF v1.5.0. Either way, I'll fix it.
That makes sense. Any details on the issue?
I believe the easy fix is to pass an …
The issue isn't that the Keras tokenizer starts at 1. The issue is that we don't have a token representation for idx 0, so it gets skipped when doing the loop in …
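A sketch of the skipped-index bug described above. The variable names (word_index, idx_to_word) are illustrative, not the repo's actual code: when the tokenizer's mapping has no entry for id 0, re-numbering it with a loop counter silently shifts every id down by one.

```python
# Keras-style word_index: ids start at 1, nothing maps to 0.
word_index = {"alpha": 1, "beta": 2, "gamma": 3}

# Buggy: enumerate() restarts at 0, remapping id 1 -> 0, id 2 -> 1, etc.,
# so every later lookup is off by one.
buggy = {pos: w for pos, (w, _i) in enumerate(word_index.items())}
print(buggy[0])  # alpha -- but alpha's real id is 1

# Fixed: keep the tokenizer's own ids and reserve 0 explicitly, so the
# id space and the embedding rows line up.
idx_to_word = {i: w for w, i in word_index.items()}
idx_to_word[0] = "<PAD>"
```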
I'm no longer getting the error, but … is appearing in the list of 'closest' words per topic.
Ahh... I remember messing with that function a while back when something wasn't working; that was most likely related to this issue. Again, the TensorFlow version I'm using didn't raise the InvalidArgumentError, so I didn't know there was a bug. I'll take a deeper look at the topics function. I think there are probably only a couple of small tweaks that need to be made. Will close this issue for now, as this specific problem has been solved.
I'm getting this error on Epoch 1 of run_20newsgroups.py:
InvalidArgumentError (see above for traceback): indices[478] = 5451 is not in [0, 5451)
[[node word_embed_lookup (defined at /Users/davidlaxer/Lda2vec-Tensorflow/lda2vec/Lda2vec.py:152) = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@Optimizer/train/update_word_embedding/AssignSub"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](word_embedding/read, _arg_x_pivot_idxs_0_1, word_embed_lookup/axis)]]
It seems that the 'word_embed_lookup' tensor contains an embedding reference beyond the length of the embedding matrix. Any ideas where this off-by-one issue could be?
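An illustrative reproduction of the lookup failure, using NumPy indexing as a stand-in for tf.gather / the embedding lookup (the sizes match the error message; the variable names are assumptions). An embedding matrix with 5451 rows has valid indices 0..5450, so index 5451 is one past the end; allocating one extra row fixes the off-by-one.

```python
import numpy as np

# 5451 rows -> valid indices are 0..5450, exactly the "[0, 5451)" range
# from the InvalidArgumentError.
vocab_rows = 5451
embedding_matrix = np.zeros((vocab_rows, 4), dtype=np.float32)
pivot_idxs = np.array([3, 100, 5451])  # 5451 is one past the last row

in_range = pivot_idxs < embedding_matrix.shape[0]
assert not in_range.all()  # detects the out-of-bounds id

# The off-by-one fix: allocate one extra row so ids produced by a
# one-based tokenizer (max id == vocab size) are always valid.
embedding_matrix = np.zeros((vocab_rows + 1, 4), dtype=np.float32)
looked_up = embedding_matrix[pivot_idxs]  # now succeeds
print(looked_up.shape)  # (3, 4)
```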
Similar to #5.