
Implementation of a classifier based on recursive trees #13

Open
Yan-Huang-Cam opened this issue Feb 21, 2017 · 5 comments

@Yan-Huang-Cam

Yan-Huang-Cam commented Feb 21, 2017

Hi,

Thank you very much for sharing this wonderful tool!

I tried to implement a classifier based on recursive trees. In general, I defined a recursive block that turns a tree into a predicted label, and then used a record block to pair the prediction from the recursive block with the correct label (y_). The record block was then compiled and a cross-entropy loss was calculated based on the output (y and y_) of the compiler. However, when I connected the loss to an optimizer, I got the following message:

../gradients_impl.py:92: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

The program then crashed when it tried to process the result from compiler.build_loom_inputs:

Segmentation fault (core dumped)

I was wondering if it had something to do with the way I paired the output of the recursive block with the correct label, or with some other mistake in defining the model. (The computation graph shows that the 'outputgather' node sends a tensor of unknown shape (?, 39) (39 is the dimension of y and y_) to the 'gradient' node.)

My code is as follows:

    # Define the recursive block of 'process tree'
    expr_fwd = td.ForwardDeclaration(td.PyObjectType(), td.TensorType([word_dim,]))
    word_embedding_layer = td.Embedding(len(word_embedding_model) + 1, word_dim, initializer = we_values, trainable = False)
    leaf_case = td.InputTransform(lambda node: we_keys.get(node['w'].lower(), 0), name = 'leaf_input_transform') >> td.Scalar('int32') >> td.Function(word_embedding_layer, name = 'leaf_Function')
    dep_embedding_layer = td.Embedding(len(dep_dict), param['dep_dim'], name = 'dep_embedding_layer')
    get_dep_embedding = (td.InputTransform(lambda d_label: dep_dict.get(d_label), name = 'dep_input_transform') >> td.Scalar('int32') >> td.Function(dep_embedding_layer, name = 'dep_embedding'))
    fclayer = td.FC(word_dim, name = 'process_tree_FC')
    non_leaf_case = (td.Record({'child': expr_fwd(), 'me': expr_fwd(), 'd': get_dep_embedding}, name = 'non-leaf_record') >> td.Concat() >> td.Function(fclayer, name = 'non_leaf_function'))
    process_tree = td.OneOf(lambda node: node['is_leaf'], {True : leaf_case, False: non_leaf_case}, name = 'process_tree_one_of')
    expr_fwd.resolve_to(process_tree)

    # Define the block which pairs the label ('y_') with the prediction from the recursive tree
    fcplayer_hidden = td.FC(len(y_classes))
    block = td.Record({'x': process_tree >> td.Function(fcplayer_hidden), 'y_': td.Vector(len(y_classes), name = 'label_vector')}, name = 'my_block')

    # Compile the block
    compiler = td.Compiler.create(block)
    (y, y_) = compiler.output_tensors
    with tf.name_scope('cross_entropy') as scope:
      cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits = y, labels = y_)
    train_step = tf.train.AdamOptimizer(0.5).minimize(cross_entropy)
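
Side note (a sketch, not part of the code above): softmax_cross_entropy_with_logits returns one loss value per example, and a common pattern is to reduce those values to a single scalar before handing the loss to the optimizer, e.g.:

    # Hypothetical variant of the last two lines above: average the
    # per-example losses into one scalar objective before minimizing.
    mean_loss = tf.reduce_mean(cross_entropy)
    train_step = tf.train.AdamOptimizer(0.5).minimize(mean_loss)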

Thank you very much for your attention!

@delesley
Contributor

delesley commented Feb 21, 2017 via email

@moshelooks

Hrm, sorry you're getting a segfault! The code does seem fine; I don't know what's going wrong based on what you've provided. Maybe you could share the code that actually generated the segfault?

Also, you could try running some of our example code (and some TF examples) to see whether the problem is with your particular model, or with TF or Fold not running well in general on your machine.

@moshelooks

moshelooks commented Feb 22, 2017

P.S. One more thing to check is to make sure that the segfault is actually being generated while the code is running, e.g. by adding a print statement at the very end of your code. I ask this because during development we encountered some issues, due to the way TF does dynamic library loading, that could cause a segfault when unlinking the library (which happens when the python interpreter exits). FWIW we never encountered any segfaults while code was being run, although ipython would occasionally segfault during tab completion (this was not a Fold problem per se; TF did the same thing in some cases for unclear reasons).
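
For example, a minimal sentinel (purely illustrative) would be:

    # If this prints before the segfault appears, the crash is happening at
    # interpreter exit (library unload), not while the model code is running.
    print('model code finished')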

@Yan-Huang-Cam
Author

Thank you very much for your prompt replies and help! Following your advice, I found that the problem could be solved by using a virtualenv (as suggested by the installation document -- sorry I omitted this at the beginning). So the problem seems to have come from a conflict between some python modules and TF or TF Fold.

However, another problem came up. While the code runs now, the batched cross-entropy loss turns out to be the same for every example across all batches during training. I was wondering whether there is anything wrong with the training code (a continuation of the code for defining and compiling the blocks in the original post):

    # Initialize variables and write the graph for TensorBoard
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    tf.summary.FileWriter('./tf_graph', graph = sess.graph)
    batch_size = 30

    # Build loom inputs once, then feed batches of them per training step
    train_set = compiler.build_loom_inputs(Input_train_tf)
    train_feed_dict = {}
    dev_feed_dict = compiler.build_feed_dict(Input_dev_tf)
    for epoch, shuffled in enumerate(td.epochs(train_set, epochs), 1):
      train_loss = 0.0
      for batch in td.group_by_batches(shuffled, batch_size):
        train_feed_dict[compiler.loom_input_tensor] = batch
        _, batch_loss = sess.run([train_step, cross_entropy], train_feed_dict)
        print batch_loss
        train_loss += np.sum(batch_loss)
      dev_loss = np.average(sess.run(cross_entropy, dev_feed_dict))
      print dev_loss

If not, would it be possible for you to indicate how to diagnose this problem? Thanks a lot for your attention and time!
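
(Purely as an illustration of one possible sanity check for the constant-loss symptom, assuming the session and graph defined above are in scope: watch whether the trainable variables change at all between steps.)

    # Sketch: if this norm is identical after every training step, the
    # optimizer is not actually updating the parameters (e.g. the loss does
    # not depend on them, or the wrong tensors are being fed).
    weight_norm = tf.global_norm(tf.trainable_variables())
    print(sess.run(weight_norm))  # compare the value across successive batches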

@moshelooks

The training code looks OK. What I would recommend doing here is breaking the code that defines your model down into pieces and putting each piece inside a function. Then, if you have e.g. a foo_block() function, you can write unit tests against it and/or interactively debug it with

foo_block().eval(foo_input)

and see that each piece does what you expect.
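
For instance, a toy sketch of that pattern (the block and dimensions here are made up, not taken from your model):

    import tensorflow_fold as td

    def toy_fc_block(in_dim=3, out_dim=2):
      # A minimal block: a fixed-size input vector passed through one FC layer.
      return td.Vector(in_dim) >> td.Function(td.FC(out_dim))

    # Interactively check that the block produces the expected output shape.
    print(toy_fc_block().eval([1.0, 2.0, 3.0]))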
