Implementation of a classifier based on recursive trees #13
The warning about gradients_impl.py is always there; ignore it. That's
tensorflow complaining that we backpropagated through tf.gather.
Your code seems valid -- there's nothing that jumps out at me. If I had to
guess, I'd guess that some of your input examples exceed the size of the
embedding tables, which would cause a segfault. Try doing some bounds
checks inside the lambdas for the InputTransform. If that fails, I'd try
putting together a smaller model and testing it on toy examples
until you can get things working.
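A bounds check of the sort suggested above might look like the following. This is a minimal sketch: `we_keys` and `vocab_rows` are hypothetical stand-ins for the real lookup table and embedding size in the code below.

```python
# Hypothetical vocab table and embedding size, standing in for the real ones.
we_keys = {'the': 1, 'cat': 2}
vocab_rows = 3  # number of rows in the embedding table

def leaf_index(node):
    """Map a token to an embedding row, failing loudly if out of range."""
    idx = we_keys.get(node['w'].lower(), 0)  # 0 = out-of-vocabulary row
    assert 0 <= idx < vocab_rows, 'embedding index %d out of range' % idx
    return idx
```

In the model, a function like this would replace the bare lambda passed to `td.InputTransform` for the leaf case, so an out-of-bounds index raises a Python error instead of segfaulting inside `tf.gather`.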
Hope that helps,
-DeLesley
…On Tue, Feb 21, 2017 at 9:07 AM, Yan-Huang-Cam ***@***.***> wrote:
Hi,
Thank you very much for sharing this wonderful tool!
I tried to implement a classifier based on recursive trees. In general, I
defined a recursive block that turns a tree into a predicted label, and
then used a record block to pair the prediction from the recursive block
with the correct label (y_). The record block was then compiled and a
cross-entropy loss was calculated based on the output (y, y_) of the
compiler. However, when I connected the loss to an optimizer, I got the
following message:
../gradients_impl.py:92: UserWarning: Converting sparse IndexedSlices to a
dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
The code was then stopped when it tried to process the result from
compiler.build_loom_input.
Segmentation fault (core dumped)
I was wondering if it had something to do with the way that I paired the
output of the recursive block and the correct label, or any other mistakes
in the definition of the model? (The computation graph shows that the
'outputgather' node sends a tensor with unknown shape (?, 39) (39 is the
dimension of y and y_) to the 'gradient' node).
My code is as follows:
```python
# Define the recursive block of 'process tree'
expr_fwd = td.ForwardDeclaration(td.PyObjectType(),
                                 td.TensorType([word_dim]))
word_embedding_layer = td.Embedding(len(word_embedding_model) + 1, word_dim,
                                    initializer=we_values, trainable=False)
leaf_case = (td.InputTransform(lambda node: we_keys.get(node['w'].lower(), 0),
                               name='leaf_input_transform')
             >> td.Scalar('int32')
             >> td.Function(word_embedding_layer, name='leaf_Function'))
dep_embedding_layer = td.Embedding(len(dep_dict), param['dep_dim'],
                                   name='dep_embedding_layer')
get_dep_embedding = (td.InputTransform(lambda d_label: dep_dict.get(d_label),
                                       name='dep_input_transform')
                     >> td.Scalar('int32')
                     >> td.Function(dep_embedding_layer, name='dep_embedding'))
fclayer = td.FC(word_dim, name='process_tree_FC')
non_leaf_case = (td.Record({'child': expr_fwd(),
                            'me': expr_fwd(),
                            'd': get_dep_embedding},
                           name='non-leaf_record')
                 >> td.Concat()
                 >> td.Function(fclayer, name='non_leaf_function'))
process_tree = td.OneOf(lambda node: node['is_leaf'],
                        {True: leaf_case, False: non_leaf_case},
                        name='process_tree_one_of')
expr_fwd.resolve_to(process_tree)

# Define the block which reads the labels ('y_') as well as the input tree
fcplayer_hidden = td.FC(len(y_classes))
block = td.Record({'x': process_tree >> td.Function(fcplayer_hidden),
                   'y_': td.Vector(len(y_classes), name='label_vector')},
                  name='my_block')

# Compile the block
compiler = td.Compiler.create(block)
(y, y_) = compiler.output_tensors
with tf.name_scope('cross_entropy'):
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
train_step = tf.train.AdamOptimizer(0.5).minimize(cross_entropy)
```
Thank you very much for your attention!
Hrm, sorry you're getting a segfault! The code does seem fine; I can't tell what's going wrong from what you've provided. Maybe you could share the code that actually generated the segfault? You could also try running some of our example code (and some TF examples) to see whether the problem is specific to your model, or whether TF or Fold isn't running well on your machine in general.
P.S. One more thing to check is whether the segfault is actually generated while the code is running, e.g. by adding a print statement at the very end of your code. I ask because during development we encountered some issues with the way TF does dynamic library loading that could cause a segfault when unlinking the library (which happens when the python interpreter exits). FWIW, we never encountered any segfaults while code was being run, although ipython would occasionally segfault during tab completion (this was not a Fold problem per se; TF did the same thing in some cases for unclear reasons).
Thank you very much for your prompt replies and help! Following your advice, I found that the problem can be solved by running inside a virtualenv (as suggested by the installation document -- sorry I omitted this at the beginning). So the segfault seems to have resulted from a conflict between some python modules and TF or TF Fold. However, another problem came up: while the code runs now, the batched cross-entropy loss turns out to be the same for every example across all batches during training. I was wondering if there was anything wrong with the training code (a continuation of the code for defining and compiling the blocks in the original post):
Otherwise, would it be possible for you to indicate how to diagnose this problem? Thanks a lot for your attention and time!
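As a sanity check on the "identical loss" symptom: softmax cross-entropy computed by hand for two different (logits, label) pairs should differ, so if the compiled model reports one value for every example, the tensors feeding the loss are likely collapsing to a single value upstream. A minimal sketch (pure Python, the numbers are made up for illustration):

```python
import math

def softmax_xent(logits, label_idx):
    """Softmax cross-entropy for one example, computed by hand."""
    m = max(logits)                              # shift for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return -math.log(exps[label_idx] / z)

# Two different (logits, label) pairs should give two different losses.
l1 = softmax_xent([2.0, 0.1, -1.0], 0)
l2 = softmax_xent([0.0, 3.0, 0.5], 2)
```

Printing a few per-example losses from `cross_entropy` (before any reduction) and comparing them against this kind of hand calculation makes it easy to see whether the model outputs, the labels, or the loss itself is the constant part.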
The training code looks ok. What I would recommend here is breaking the code that defines your model into pieces and putting each piece inside a function. Then if you have e.g. a foo_block() function, you can write unit tests against it and/or interactively debug it with foo_block().eval(foo_input), and check that each piece does what you expect.
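The refactor suggested above might be sketched as follows. The `td.*` calls are replaced with plain Python here so the sketch runs stand-alone; in the real model each factory function would return a Fold block, checked interactively with `foo_block().eval(foo_input)`. The names `leaf_block` and `test_leaf_block` are illustrative, not part of the Fold API.

```python
# Each piece of the model lives in a factory function so it can be
# tested in isolation.
def leaf_block(keys, default=0):
    # Real version would return, e.g.:
    #   td.InputTransform(...) >> td.Scalar('int32') >> td.Function(...)
    return lambda node: keys.get(node['w'].lower(), default)

def test_leaf_block():
    block = leaf_block({'cat': 2})
    assert block({'w': 'Cat'}) == 2   # known word maps to its row
    assert block({'w': 'dog'}) == 0   # OOV word falls back to row 0

test_leaf_block()
```

The same pattern applies to the non-leaf record, the dependency embedding, and the loss: one factory per block, one small test per factory.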