Unknown: JIT Compilation Failed #65233

Closed
firmanserdana opened this issue Apr 8, 2024 · 4 comments
Labels: stale, stat:awaiting response, subtype: ubuntu/linux, TF 2.16, type:bug

Comments

@firmanserdana

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

tf 2.16.1

Custom code

Yes

OS platform and distribution

Endeavouros latest - archlinux

Mobile device

No response

Python version

3.9-3.12

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

12.3 - 12.4/8.9

GPU model and memory

No response

Current behavior?

JIT compilation fails during the Keras Tuner search when running on the GPU. I am not sure what is causing it; I am still new to GPU-based TensorFlow.

The same code works on CPU, and it also works with TF-Metal on an M1 Mac.
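
Since the failure shows up only on the GPU path, one thing worth checking is whether XLA can locate CUDA's `libdevice` bitcode file. A missing `libdevice` is a commonly reported cause of `JIT compilation failed`, although nothing in this report confirms that it is the cause here. The sketch below uses only the standard library: it looks for `nvvm/libdevice/libdevice.10.bc` under a few candidate CUDA roots and builds the matching `XLA_FLAGS` value. The helper names and the candidate directories (`/opt/cuda`, `/usr/local/cuda`) are assumptions for illustration, not paths taken from the report.

```python
from pathlib import Path
from typing import Iterable, Optional

# Relative location of libdevice inside a CUDA toolkit directory.
LIBDEVICE_RELATIVE = Path("nvvm") / "libdevice" / "libdevice.10.bc"

# Hypothetical candidate roots; adjust to wherever CUDA is installed.
DEFAULT_CANDIDATES = ("/opt/cuda", "/usr/local/cuda")


def find_cuda_data_dir(candidates: Iterable[str] = DEFAULT_CANDIDATES) -> Optional[str]:
    """Return the first candidate directory that contains libdevice, else None."""
    for root in candidates:
        if (Path(root) / LIBDEVICE_RELATIVE).is_file():
            return str(root)
    return None


def xla_flags_for(cuda_dir: str) -> str:
    """Build the XLA_FLAGS value that points XLA at the given CUDA directory."""
    return f"--xla_gpu_cuda_data_dir={cuda_dir}"


if __name__ == "__main__":
    cuda_dir = find_cuda_data_dir()
    if cuda_dir is None:
        print("libdevice.10.bc not found under the candidate directories")
    else:
        # Set this in the shell before starting Python so XLA sees it.
        print("export XLA_FLAGS=" + xla_flags_for(cuda_dir))
```

If a directory is found, exporting the printed value before launching the notebook kernel is a cheap experiment; if the error persists, the cause lies elsewhere.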

Standalone code to reproduce the issue

import os  # needed for os.path.join in the TensorBoard callback below

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras_tuner import RandomSearch
from keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard, CSVLogger
from sklearn.metrics import r2_score
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt

# rest of your code

# Normalize the data
scaler = StandardScaler()
train_tables_windows = scaler.fit_transform(train_tables_windows).astype('float32')
test_tables_windows = scaler.transform(test_tables_windows).astype('float32')

# Define the model
def build_model(hp):
    model = Sequential()
    model.add(LSTM(units=hp.Int('units', min_value=32, max_value=512, step=32), return_sequences=True, input_shape=(128, 1)))
    model.add(Dropout(0.2))
    model.add(LSTM(units=hp.Int('units', min_value=32, max_value=512, step=32)))
    model.add(Dropout(0.2))
    model.add(Dense(10))
    model.compile(loss='mse', optimizer='adam')
    return model

# Define the tuner
tuner = RandomSearch(
    build_model,
    objective='val_loss',
    max_trials=5,
    executions_per_trial=3,
    directory='my_dir',
    project_name='helloworld')

# Define callbacks
early_stopping = EarlyStopping(monitor='val_loss', patience=20)
model_checkpoint = ModelCheckpoint('best_model.keras', monitor='val_loss', save_best_only=True)
tensorboard = TensorBoard(log_dir=os.path.join('logs'))
csv_logger = CSVLogger('training.log')

callbacks = [early_stopping, model_checkpoint, tensorboard, csv_logger]

# Fit the model
tuner.search(train_tables_windows, train_labels_windows, epochs=50, validation_split=0.2, callbacks=callbacks)

# Get the optimal hyperparameters
best_hps=tuner.get_best_hyperparameters(num_trials=1)[0]

# Build the model with the optimal hyperparameters and train it on the data
model = tuner.hypermodel.build(best_hps)

# Fit the model
model.fit(train_tables_windows, train_labels_windows, epochs=50, batch_size=32, callbacks=callbacks)

# Evaluate the model
mse = model.evaluate(test_tables_windows, test_labels_windows)
predictions = model.predict(test_tables_windows)

print('Mean Squared Error:', mse)

# Calculate R2 score
r2 = r2_score(test_labels_windows, predictions)
print('R2:', r2)

# Plot the predicted angles against the true angles
for i in range(10):
    plt.figure(figsize=(16,3))
    plt.plot(predictions[1:200,i], label='Predicted')
    plt.plot(np.array(test_labels_windows.iloc[1:200, i]), label='True')
    plt.xlabel('Time')
    plt.ylabel('Angle')
    plt.title('Predicted vs True Angles of Joint {}'.format(i))
    plt.legend()
    plt.show()

Relevant log output

Trial 2 Complete [00h 00m 02s]

Best val_loss So Far: None
Total elapsed time: 00h 00m 03s

Search: Running Trial #3

Value             |Best Value So Far |Hyperparameter
192               |128               |units

Epoch 1/50
/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/layers/rnn/rnn.py:204: UserWarning: Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.
  super().__init__(**kwargs)
2024-04-08 15:31:39.985374: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985401: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: UNKNOWN: JIT compilation failed.
	 [[{{function_node __inference_one_step_on_data_15483}}{{node adam/Pow_5}}]]
2024-04-08 15:31:39.985408: I tensorflow/core/framework/local_rendezvous.cc:426] Local rendezvous send item cancelled. Key hash: 2708995372197654331
2024-04-08 15:31:39.985417: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985424: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985430: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985436: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985442: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985450: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985455: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985461: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985467: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985473: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985478: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985484: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985489: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985495: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
2024-04-08 15:31:39.985500: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
Traceback (most recent call last):
  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/base_tuner.py", line 274, in _try_run_and_update_trial
  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/base_tuner.py", line 239, in _run_and_update_trial
  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/tuner.py", line 314, in run_trial
  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/tuner.py", line 233, in _build_and_fit_model
  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/hypermodel.py", line 149, in fit
  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/utils/traceback_utils.py", line 122, in error_handler
  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 53, in quick_execute
tensorflow.python.framework.errors_impl.UnknownError: Graph execution error:

Detected at node adam/Pow_5 defined at (most recent call last):
  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/runpy.py", line 197, in _run_module_as_main

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/runpy.py", line 87, in _run_code

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel_launcher.py", line 18, in <module>

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/traitlets/config/application.py", line 1075, in launch_instance

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel/kernelapp.py", line 739, in start

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/tornado/platform/asyncio.py", line 205, in start

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/asyncio/base_events.py", line 601, in run_forever

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/asyncio/events.py", line 80, in _run

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 545, in dispatch_queue

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 534, in process_one

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 437, in dispatch_shell

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel/ipkernel.py", line 359, in execute_request

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 778, in execute_request

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel/ipkernel.py", line 446, in do_execute

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/ipykernel/zmqshell.py", line 549, in run_cell

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3048, in run_cell

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3103, in _run_cell

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/IPython/core/async_helpers.py", line 129, in _pseudo_sync_runner

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3308, in run_cell_async

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3490, in run_ast_nodes

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3550, in run_code

  File "/tmp/ipykernel_726232/2121198222.py", line 49, in <module>

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/base_tuner.py", line 234, in search

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/base_tuner.py", line 274, in _try_run_and_update_trial

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/base_tuner.py", line 239, in _run_and_update_trial

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/tuner.py", line 314, in run_trial

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/tuner.py", line 233, in _build_and_fit_model

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras_tuner/src/engine/hypermodel.py", line 149, in fit

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/backend/tensorflow/trainer.py", line 325, in fit

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/backend/tensorflow/trainer.py", line 118, in one_step_on_iterator

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/backend/tensorflow/trainer.py", line 106, in one_step_on_data

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/backend/tensorflow/trainer.py", line 73, in train_step

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/optimizers/base_optimizer.py", line 269, in apply_gradients

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/optimizers/base_optimizer.py", line 330, in apply

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/optimizers/base_optimizer.py", line 380, in _backend_apply_gradients

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/backend/tensorflow/optimizer.py", line 117, in _backend_update_step

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/backend/tensorflow/optimizer.py", line 131, in _distributed_tf_update_step

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/backend/tensorflow/optimizer.py", line 128, in apply_grad_to_update_var

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/optimizers/adam.py", line 119, in update_step

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/ops/numpy.py", line 5649, in power

  File "/home/firep1/Documents/gitworks/phd/ReWire/intracortical-decoding/.conda/lib/python3.9/site-packages/keras/src/backend/tensorflow/numpy.py", line 1886, in power

JIT compilation failed.
	 [[{{node adam/Pow_5}}]] [Op:__inference_one_step_on_iterator_15546]
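
The log above repeats the same warning many times, which makes the distinct messages and the failing graph node hard to spot. As a rough aid, the following sketch collapses TensorFlow-style log lines into counts and collects the node names that appear in `{{node ...}}` references. The function name `summarize_log` and the regular expressions are assumptions written against the line format shown in this log, not part of any TensorFlow API.

```python
import re
from collections import Counter

# Matches TensorFlow log lines such as:
# 2024-04-08 15:31:39.985374: W tensorflow/core/framework/op_kernel.cc:1827] UNKNOWN: JIT compilation failed.
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: "
    r"(?P<level>[IWEF]) (?P<source>\S+?):\d+\] (?P<message>.*)$"
)
# Matches graph node references such as {{node adam/Pow_5}}.
NODE_REF = re.compile(r"\{\{node ([^}]+)\}\}")


def summarize_log(text):
    """Count distinct (level, source, message) triples and collect node names."""
    counts = Counter()
    nodes = set()
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["level"], match["source"], match["message"])] += 1
        nodes.update(NODE_REF.findall(line))
    return counts, nodes
```

Passing the pasted log text to `summarize_log` returns a `Counter` keyed by severity, source file, and message, plus the set of node names, which here would include `adam/Pow_5`.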
@google-ml-butler google-ml-butler bot added the type:bug Bug label Apr 8, 2024
@Venkat6871 Venkat6871 added TF 2.16 subtype: ubuntu/linux Ubuntu/Linux Build/Installation Issues labels Apr 10, 2024
@Venkat6871

Hi @firmanserdana ,

Sorry for the delay. I tried to reproduce the issue with the code you shared but hit a different error. Could you please share a Colab gist with all the dependencies so we can analyze this further? I am providing a gist here for reference.

Thank you!
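
For anyone gathering the dependency details requested above, a small standard-library script can list the interpreter, platform, and installed package versions in a form that is easy to paste into the thread. This is only a sketch: the package list is an assumption inferred from the imports in the reproduction script, and the function names are made up for this example.

```python
import platform
import sys
from importlib import metadata

# Distribution names are assumptions based on the imports in the reproduction script.
PACKAGES = ("tensorflow", "keras", "keras-tuner", "scikit-learn", "numpy", "matplotlib")


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=PACKAGES):
    """Collect interpreter, platform, and package versions as a dict."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        report[name] = installed_version(name) or "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running it inside the same conda environment as the failing notebook matters, since the versions reported are those visible to the interpreter that executes the script.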

@Venkat6871 Venkat6871 added the stat:awaiting response Status - Awaiting response from author label Apr 15, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Apr 23, 2024

github-actions bot commented May 1, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.

@github-actions github-actions bot closed this as completed May 1, 2024

2 participants