InvalidArgumentError indices [in docvecs] #3

systats · 2019-03-06T20:41:57Z

Hi again. I translated your model to keras (in R):

n_topics = 4
input_dim = 10000
n_doc = 11995

input_d <- keras::layer_input(shape = 1, dtype="int32")
input_w <- keras::layer_input(shape = 1, dtype="int32")
  
embed_d <- input_d %>% 
 keras::layer_embedding(
   input_dim = n_doc, 
   output_dim = n_topics, 
   input_length = 1, 
   activity_regularizer = keras::regularizer_l1(0.000002), 
   name="docvecs"
 ) %>% 
 layer_activation("relu") %>% 
 layer_reshape(c(n_topics, 1))

embed_w <- input_w %>% 
 keras::layer_embedding(
   input_dim = input_dim, 
   output_dim = n_topics, 
   input_length = 1, 
   activity_regularizer = keras::regularizer_l1(0.000000015), 
   name="wordvecs"
 ) %>% 
 layer_activation("relu") %>% 
 layer_reshape(c(n_topics, 1))

dot_prod <- keras::layer_dot(list(embed_d, embed_w), axes = 1, normalize = FALSE) %>% 
 keras::layer_reshape(target_shape = 1)

output <- dot_prod %>% 
  layer_activation("sigmoid")

model <- keras::keras_model(inputs = list(input_d, input_w), outputs = output) %>% 
  keras::compile(
   loss = "binary_crossentropy",
   optimizer = "adam"
  )

With the following data input:

Observations: 2,706,666
Variables: 3
$ doc_id   <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ token_id <int> 102, 2269, 113, 8360, 8746, 566, 496, 5930, 113, 119, 17, 2356, 803, …
$ outcome  <dbl> 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, …

$doc_id
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1    2966    5990    5994    9014   11995 
$token_id
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1     493    2231    3248    5640   10000 
$outcome
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    0.0     0.0     0.5     0.5     1.0     1.0

But I get an error when fitting the model and don't know exactly why:

model %>% 
 fit(
  x = list(sam$doc_id, sam$token_id),
  y = sam$outcome,
  batch_size = 100,
  epochs = 2,
  shuffle = TRUE
)

Error in py_call_impl(callable, dots$args, dots$keywords) : 
  InvalidArgumentError: indices[52,0] = 11995 is not in [0, 11995)
	 [[Node: docvecs_7/embedding_lookup = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@training_3/Adam/Assign_2"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](docvecs_7/embeddings/read, _arg_input_17_0_0, training_3/Adam/gradients/docvecs_7/embedding_lookup_grad/concat/axis)]]

Do you know how to fix this?

That would be great! Thanks in advance,
Simon

systats · 2019-03-06T20:58:21Z

https://stackoverflow.com/questions/51223936/tensorflow-invalidargumenterror-indices-while-training-with-keras

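Solved: the `input_dim` of each embedding layer must be one larger than the largest index it receives, i.e. `n_doc + 1` for "docvecs" and `input_dim + 1` for "wordvecs". The ids here start at 1, while the embedding lookup accepts indices in [0, input_dim), so the largest doc_id (11995) falls just outside the valid range [0, 11995).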
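
For anyone hitting the same error, here is a minimal base-R sketch of the size check. `required_input_dim` is a hypothetical helper made up for illustration (it is not part of keras), and the id vectors are toy values that only mimic the minimum and maximum from the summary above, not the real data.

```r
# Hypothetical helper (not a keras function): smallest `input_dim` that
# makes every id a valid index, given that valid indices are
# 0 .. input_dim - 1.
required_input_dim <- function(ids) {
  stopifnot(all(ids >= 0))
  max(ids) + 1L
}

# Toy ids covering the ranges reported in the summary (1-based ids).
doc_id   <- c(1L, 2966L, 5990L, 11995L)
token_id <- c(1L, 493L, 2231L, 10000L)

doc_dim  <- required_input_dim(doc_id)    # n_doc + 1
word_dim <- required_input_dim(token_id)  # input_dim + 1

# Alternative (assumption: nothing else relies on 1-based ids):
# keep input_dim = n_doc and shift the ids so they start at 0.
doc_id0 <- doc_id - 1L
```

Either passing `doc_dim` and `word_dim` as `input_dim` to the two `layer_embedding()` calls, or feeding the shifted ids, should keep every index inside the valid range. The first option leaves row 0 of each embedding unused, which is harmless.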

systats closed this as completed Mar 6, 2019
