Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix only pass RunOptions to keras will trigger core #38562

Merged

Conversation

zhuzilin
Copy link
Contributor

This is a PR from JIZHI, the AI platform in Tencent.

Right now, keras will trigger segmentation fault when only pass RunOptions but no RunMetadata. This bug can be replicated in tf-nightly in this gist

If you cannot open the gist, the code to replicate the bug is

import tensorflow as tf
mnist = tf.keras.datasets.mnist

tf.compat.v1.disable_eager_execution()

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

run_options = tf.compat.v1.RunOptions(trace_level=tf.compat.v1.RunOptions.FULL_TRACE)

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'],
              options=run_options)

model.fit(x_train,
          y_train,
          epochs=2)

The reason for this problem is that keras use the TF_SessionRunCallable api instead of TF_Run. While the former was using an unique_ptr to represent the run_metadata and will only new one when RunMetadata was passed. However, the RunOption will need an RunMetadata for tracing. This PR modified the implementation of TF_SessionRunCallable to that of TF_Run and fix this bug.

Thank you for your time on reviewing this PR.

@google-ml-butler google-ml-butler bot added the size:S CL Change Size: Small label Apr 15, 2020
@gbaned gbaned self-assigned this Apr 15, 2020
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Apr 15, 2020
@gbaned gbaned requested a review from superbobry April 15, 2020 08:15
@superbobry superbobry requested review from omalleyt12 and removed request for superbobry April 16, 2020 12:46
@gbaned gbaned added the awaiting review Pull request awaiting review label Apr 17, 2020
@zhuzilin
Copy link
Contributor Author

@omalleyt12 Could you have a look at this PR? Thank you!

@gbaned gbaned requested a review from superbobry May 12, 2020 16:22
@superbobry superbobry removed their request for review May 13, 2020 20:15
@gbaned gbaned requested a review from tanzhenyu June 2, 2020 14:51
@zhuzilin
Copy link
Contributor Author

@tanzhenyu
Could you have a look at this pr?

@gbaned
Copy link
Contributor

gbaned commented Jul 14, 2020

@tanzhenyu Can you please review this PR ? Thanks!

@gbaned
Copy link
Contributor

gbaned commented Sep 1, 2020

@tanzhenyu, Can you please review this PR ? Thanks!

@gbaned gbaned requested a review from mhong October 14, 2020 13:34
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Oct 28, 2020
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Oct 28, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Oct 28, 2020
@gbaned gbaned removed the awaiting review Pull request awaiting review label Oct 29, 2020
@copybara-service copybara-service bot merged commit eaf34ba into tensorflow:master Oct 29, 2020
PR Queue automation moved this from Approved by Reviewer to Merged Oct 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes ready to pull PR ready for merge process size:S CL Change Size: Small
Projects
PR Queue
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet

5 participants