BUG: Error when using DeepExplainer on LSTM Model #3593

Open
2 of 4 tasks
jgolliher opened this issue Mar 27, 2024 · 1 comment
Labels
bug Indicates an unexpected problem or unintended behaviour

Comments

jgolliher commented Mar 27, 2024

Issue Description

I'm training a deep neural network with LSTM layers that predicts whether two records refer to the same entity. I can't get SHAP to work on the LSTM model, although it does produce values for an equivalent model built from regular Dense layers. I saw #3344 and wanted to provide another example in case it's helpful.

For context, the model takes two inputs. Each input is an array of text embeddings for various fields (e.g., first_name, last_name, title), so the shape is (n, 15, 300), where n is the number of records, 15 is the number of fields, and 300 is the length of each text embedding. The two inputs are concatenated and run through the model, which makes a binary prediction (1=Match, 0=Non-Match).

This might be a completely separate question, but the SHAP values for the Dense model have the same shape as the input. I assume this is expected behavior, but for the text embeddings (length 300) I don't need a value for each number in the embedding array. Does it make sense to take the average over the embedding and use that as the SHAP value for the field? Happy to do more research on this; I just wanted to ask the experts.
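
(For reference, one aggregation I've seen suggested, offered here only as a sketch: sum over the embedding axis rather than average. Because SHAP values are additive, per-field sums still add up to the model output minus the expected value, whereas averaging rescales everything by 1/300.)

##################
## Hypothetical aggregation sketch
##################

import numpy as np

# `vals` stands in for one array returned by e.shap_values(...); its shape
# matches the corresponding input: (n_explained, 15, 300).
vals = np.random.rand(10, 15, 300)

# Collapse the 300 embedding dimensions into one attribution per field.
# Summing preserves the additivity of the SHAP decomposition.
per_field = vals.sum(axis=-1)  # shape (10, 15)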

Happy to provide more details if that's helpful! Thank you for such a useful package; you guys are awesome!

Versions:
python==3.9.18
tensorflow==2.12.0
keras==2.12.0
numpy==1.23.5
shap==0.45.0

Minimal Reproducible Example

##################
## Libraries
##################

import tensorflow as tf
import numpy as np
import shap

##################
## Prepare Data
##################

N = 2000
shape = (15, 300)

# Set seed
np.random.seed(1)

# Generate db1 "embeddings"
db1_embeddings = np.random.rand(N, *shape)

# Generate db2 "embeddings"
db2_embeddings = np.random.rand(N, *shape)

# Generate Y data (0=Unmatched, 1=Matched)
Y = np.random.randint(0, 2, size=N)


###################
## LSTM
###################

input_1 = tf.keras.layers.Input(shape=(None, 300))
input_2 = tf.keras.layers.Input(shape=(None, 300))

combined = tf.keras.layers.concatenate([input_1, input_2], axis=1)

rrn1 = tf.keras.layers.LSTM(32, return_sequences=True)(combined)
rrn2 = tf.keras.layers.LSTM(32, return_sequences=True)(rrn1)
rrn3 = tf.keras.layers.LSTM(32, return_sequences=True)(rrn2)
rrn4 = tf.keras.layers.LSTM(32, return_sequences=False)(rrn3)

outputs = tf.keras.layers.Dense(units=1, activation="sigmoid")(rrn4)


lstm_model = tf.keras.Model(inputs=[input_1, input_2], outputs=outputs)

lstm_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

lstm_model.fit(
    x=[db1_embeddings, db2_embeddings],
    y=Y,
    batch_size=10,
    epochs=1,
)

###################
## LSTM Shap (Doesn't Work)
###################

e = shap.DeepExplainer(lstm_model, data=[db1_embeddings, db2_embeddings])

shap_values = e.shap_values(X=[db1_embeddings[:10], db2_embeddings[:10]])
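
Two possible workarounds, offered as untested sketches rather than confirmed fixes (the GradientExplainer call assumes the standard shap API; the background is subsampled to 100 rows only to keep the example fast):

###################
## Possible workarounds (untested sketches)
###################

# 1) shap.GradientExplainer uses expected gradients rather than the
#    DeepLIFT-style backprop in DeepExplainer and is often suggested when
#    DeepExplainer fails on recurrent models. Whether its attributions are
#    usable for this model is an assumption, not a verified fix.
e_grad = shap.GradientExplainer(
    lstm_model, data=[db1_embeddings[:100], db2_embeddings[:100]]
)
shap_values_grad = e_grad.shap_values([db1_embeddings[:10], db2_embeddings[:10]])

# 2) Declaring the inputs with a fixed time dimension, e.g.
#    tf.keras.layers.Input(shape=(15, 300)), removes the None entries in
#    the symbolic input shape that the failing reshape (see the traceback
#    below) cannot handle, though DeepExplainer may still fail later on
#    unsupported LSTM ops.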

###################
## DENSE
###################

input_1 = tf.keras.layers.Input(shape=(15, 300))
input_2 = tf.keras.layers.Input(shape=(15, 300))

combined = tf.keras.layers.concatenate([input_1, input_2], axis=1)

flatten = tf.keras.layers.Flatten()(combined)

rrn1 = tf.keras.layers.Dense(100, activation='relu')(flatten)
rrn2 = tf.keras.layers.Dense(100, activation='relu')(rrn1)
rrn3 = tf.keras.layers.Dense(100, activation='relu')(rrn2)
rrn4 = tf.keras.layers.Dense(100, activation='relu')(rrn3)

outputs = tf.keras.layers.Dense(units=1, activation="sigmoid")(rrn4)


dnn_model = tf.keras.Model(inputs=[input_1, input_2], outputs=outputs)

dnn_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

dnn_model.summary()

dnn_model.fit(
    x=[db1_embeddings, db2_embeddings],
    y=Y,
    batch_size=10,
    epochs=1,
)


###################
## DNN Shap (Works)
###################

e = shap.DeepExplainer(dnn_model, data=[db1_embeddings, db2_embeddings])

shap_values = e.shap_values(X=[db1_embeddings[:10], db2_embeddings[:10]])

Traceback

When running the LSTM SHAP block above:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File /Users/jgolliher/Files/School Stuff/BZAN 554 - Deep Learning/dnn_matching_test.py:35
     29 ###################
     30 ## LSTM Shap (Doesn't Work)
     31 ###################
     33 e = shap.DeepExplainer(lstm_model, data=[db1_embeddings, db2_embeddings])
---> 35 shap_values = e.shap_values(X=[db1_embeddings[:10], db2_embeddings[:10]])

File ~/opt/anaconda3/envs/DNN/lib/python3.9/site-packages/shap/explainers/_deep/__init__.py:124, in Deep.shap_values(self, X, ranked_outputs, output_rank_order, check_additivity)
     90 def shap_values(self, X, ranked_outputs=None, output_rank_order='max', check_additivity=True):
     91     """ Return approximate SHAP values for the model applied to the data given by X.
     92 
     93     Parameters
   (...)
    122         were chosen as "top".
    123     """
--> 124     return self.explainer.shap_values(X, ranked_outputs, output_rank_order, check_additivity=check_additivity)

File ~/opt/anaconda3/envs/DNN/lib/python3.9/site-packages/shap/explainers/_deep/deep_tf.py:319, in TFDeep.shap_values(self, X, ranked_outputs, output_rank_order, check_additivity)
    317 # run attribution computation graph
    318 feature_ind = model_output_ranks[j,i]
--> 319 sample_phis = self.run(self.phi_symbolic(feature_ind), self.model_inputs, joint_input)
    321 # assign the attributions to the right part of the output arrays
    322 for l in range(len(X)):

File ~/opt/anaconda3/envs/DNN/lib/python3.9/site-packages/shap/explainers/_deep/deep_tf.py:379, in TFDeep.run(self, out, model_inputs, X)
    376         tf_execute.record_gradient = tf_backprop.record_gradient
    378     return final_out
--> 379 return self.execute_with_overridden_gradients(anon)

File ~/opt/anaconda3/envs/DNN/lib/python3.9/site-packages/shap/explainers/_deep/deep_tf.py:415, in TFDeep.execute_with_overridden_gradients(self, f)
    413 # define the computation graph for the attribution values using a custom gradient-like computation
    414 try:
--> 415     out = f()
    416 finally:
    417     # reinstate the backpropagatable check
    418     if hasattr(tf_gradients_impl, "_IsBackpropagatable"):

File ~/opt/anaconda3/envs/DNN/lib/python3.9/site-packages/shap/explainers/_deep/deep_tf.py:369, in TFDeep.run.<locals>.anon()
    367 shape = list(self.model_inputs[i].shape)
    368 shape[0] = -1
--> 369 data = X[i].reshape(shape)
    370 v = tf.constant(data, dtype=self.model_inputs[i].dtype)
    371 inputs.append(v)

TypeError: 'NoneType' object cannot be interpreted as an integer
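
Reading the traceback, the failing line rebuilds each input batch against the model's symbolic input shape. Since the LSTM model's inputs are declared as Input(shape=(None, 300)), that symbolic shape is (None, None, 300); deep_tf.py patches only the batch dimension to -1, leaving a None in the target shape, which numpy's reshape rejects. A minimal illustration (my reading of the lines shown above, nothing more):

import numpy as np

shape = [None, None, 300]  # list(model_inputs[i].shape) for Input(shape=(None, 300))
shape[0] = -1              # deep_tf.py replaces only the batch dimension
np.zeros((10, 15, 300)).reshape(shape)
# TypeError: 'NoneType' object cannot be interpreted as an integer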

Expected Behavior

No response

Bug report checklist

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest release of shap.
  • I have confirmed this bug exists on the master branch of shap.
  • I'd be interested in making a PR to fix this bug

Installed Versions

0.45.0

jgolliher added the bug label Mar 27, 2024
CloseChoice (Collaborator) commented Mar 28, 2024

Thanks for reporting a clear and concise example of this bug.

As you already mentioned, this is a known issue that gets reported over and over; see e.g. #3344 (which you linked) or #3343. There is a draft PR, #3419, that attempts to solve it, but it is a difficult problem.
