chguan/add icml ex nlp code #90
Conversation
Check out this pull request on ReviewNB: https://app.reviewnb.com/microsoft/nlp/pull/90 Visit www.reviewnb.com to know how we simplify your Jupyter Notebook workflows.
This is great. Thanks!
hey @Frozenmad this is really nice. We'll take a look at the code soon. In the meantime, all the images and data that we use in the repo are in blobs, see discussion here. I uploaded the json and png to our blob: https://nlpbp.blob.core.windows.net/images/result.png and https://nlpbp.blob.core.windows.net/data/regular.json. Could you please remove these files and use the links?
this looks really good
def test_train_fixed_length_interp(fixed_length_interp):
    fixed_length_interp.optimize(iteration=10)
here, is there something that is produced and that can be checked?
This line returns nothing, but it will change some parameters inside the fixed_length_interp instance. I'm testing this to make sure the data types (in this function) match when calculating. Do you have suggestions for testing this function?
at the end of optimize there is the instruction self.load_state_dict(state_dict); would it be possible to test that there has been some change in the state_dict?
Good idea! Thanks for your suggestion; I'll check whether regular and ratio change (regular shouldn't change and ratio should) in this instance!
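As a hedged sketch of what that check might look like, assuming the fixed_length_interp fixture from the diff above and assuming the regularization and ratio tensors are stored in the state_dict under the keys "regular" and "ratio" (names taken from this thread, not verified against the module):

import copy

import torch


def test_train_fixed_length_interp_changes_state(fixed_length_interp):
    # Deep-copy the state_dict before optimizing so the saved tensors are not
    # aliases of the ones that optimize() updates in place.
    before = copy.deepcopy(fixed_length_interp.state_dict())

    fixed_length_interp.optimize(iteration=10)
    after = fixed_length_interp.state_dict()

    # The regularization values should be untouched by optimize() ...
    assert torch.equal(before["regular"], after["regular"])
    # ... while the learned ratio should have moved away from its start value.
    assert not torch.equal(before["ratio"], after["ratio"])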
@@ -0,0 +1,206 @@
{ |
several of the comments I added in the other notebook apply here. Please see the other comments
Thanks! I'll check other comments~
@@ -0,0 +1,206 @@
{ |
is this the BERT model that has 24 layers? why are you choosing layer 3 instead of another layer or a group of layers?
The BERT model we use is the base model from the paper, which contains only 12 layers; I will clarify this in the new code. We take one layer of BERT at random (the 3rd layer here) as an example to show how to explain a given layer with our tools.
Note that all the layers are explainable with our tools, via methods similar to the ones given in this notebook (just change some parameters). Here we want to show how to use the tool rather than what the result is, because users may need to explain layers in their own models instead of BERT or other BERT layers.
Do you have suggestions on how to make this clearer? I think I'll rename this notebook to something like "how to explain a layer in a pretrained model" and clarify that it shows only one case. Is that appropriate?
Thanks for your comments! My graduation ceremony has finished and I'm back to managing this branch now. I'll try my best to meet your requirements!
Sure! Thanks for helping upload these files!
hey @Frozenmad please let me know when you are finished with the changes so I can take a look. Is now the best moment to look at the code, or should I wait for more changes from your side?
@miguelgfierro Oops! I'm sorry. The new code is now ready for checking! Thanks for your nice reminder~
utils_nlp/interpreter/Interpreter.py
Outdated
class Interpreter(nn.Module):
    """ Interpreter for interpreting one instance. The method is from
    paper [Towards a Deep and Unified Understanding of Deep Neural
minor detail, this documentation (at some point) will be parsed by Sphinx. The notation for what you want is this:
`Towards a Deep and Unified Understanding of Deep Neural Models in NLP <http://proceedings.mlr.press/v97/guan19a/guan19a.pdf>`_
It's important to add the final _ for some reason :-0
Thanks! I'll change the style : )
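For reference, the docstring with that Sphinx-style link might end up looking roughly like this (a sketch of the notation only; the rest of the class body is omitted):

import torch.nn as nn


class Interpreter(nn.Module):
    """Interpreter for interpreting one instance.

    The method is from the paper `Towards a Deep and Unified Understanding
    of Deep Neural Models in NLP
    <http://proceedings.mlr.press/v97/guan19a/guan19a.pdf>`_.
    """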
def test_train_fixed_length_interp(fixed_length_interp):
    fixed_length_interp.optimize(iteration=10)
at the end of optimize there is the instruction self.load_state_dict(state_dict); would it be possible to test that there has been some change in the state_dict?
this is really good
"x = model.embeddings(token_tensor, segment_tensor)[0]\n", | ||
"\n", | ||
"# extract the Phi we need to explain, suppose the layer we are interested in is layer 3\n", | ||
"def Phi(x):\n", |
minor suggestion here, maybe we can add a parameter called layer to make this function general, instead of hardcoding layer 3 inside the code
Good idea!
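One way that could look is a small factory that takes the layer as an argument and captures model in a closure (which also removes the need for the global statement discussed just below). This is only a sketch: it assumes model is the BERT model loaded earlier in the notebook, that its encoder blocks live in model.encoder.layer, and that a zero additive attention mask is acceptable here.

import torch


def make_phi(model, layer):
    """Build a Phi that maps BERT embeddings to the hidden state of `layer`."""
    def Phi(x):
        # Additive attention mask of zeros, i.e. attend to every position
        # (shape and semantics assumed from the BERT implementation used).
        mask = torch.zeros(x.shape[0], 1, 1, x.shape[1], device=x.device)
        hidden = x
        # Run the embedding output through the first `layer` encoder blocks.
        for block in model.encoder.layer[:layer]:
            hidden = block(hidden, mask)
        return hidden

    return Phi


# In the notebook this would replace the hardcoded version, e.g.:
#   Phi = make_phi(model, layer=3)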
"\n", | ||
"# extract the Phi we need to explain, suppose the layer we are interested in is layer 3\n", | ||
"def Phi(x):\n", | ||
" global model\n", |
why is making the model global needed?
Seems that the code can still run without this line, hmmmm. I'll remove this line 0.0. Originally, I was thinking of making model global so that we don't need to load the model every time we call Phi().
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# here, we load the regularization we already calculated for simplicity\n", |
here could you just print the regularization value? out of curiosity, how did you get this value?
I'll add the print here. We get this value by sampling: we first sample part of the training data of BERT, pass it through BERT, collect the 3rd hidden state, and calculate the std of every dimension (these std values are the regular values).
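A rough sketch of that sampling procedure (the function name is made up for illustration, and pooling every word position before taking the std is an assumption; only the "std of every dimension" step comes from the comment above):

import torch


def estimate_regular(hidden_states):
    """Estimate the "regular" values from hidden states of sampled sentences.

    hidden_states: list of tensors of shape (seq_len, hidden_size), one per
    sampled training sentence, taken from the layer being explained
    (the 3rd BERT layer in this notebook).
    """
    # Pool every word position of every sampled sentence: (total_words, hidden_size)
    pooled = torch.cat([h.reshape(-1, h.shape[-1]) for h in hidden_states], dim=0)
    # Standard deviation of each hidden dimension over the sample -> (hidden_size,)
    return pooled.std(dim=0)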
@@ -0,0 +1,234 @@
{ |
hey @Frozenmad, these two notebooks are pretty small. Do you think it makes sense to merge them?
Ok, I'll try to merge them together~
Hi @miguelgfierro. I've uploaded the new code here! Major changes include:
Thanks for all your advice : )
This is awesome! amazing stuff
Description
We have added the code of our ICML paper. The related files are:
- interpreter.py and README.md under utils_nlp\interpreter. The interpreter.py file is the main functional file we utilize; README.md is an instruction file for it.
- explain_simple_model.ipynb and explain_BERT_model.ipynb under scenarios\interpret_NLP_models, two scenarios showing how to use interpreter.py.
- test_interpreter.py under tests\unit. This file contains 6 unit tests for interpreter.py (which, on my machine, take about 2.25 s to run).
- example.png under the utils_nlp\interpreter folder, used by README.md, and regular.json under the scenarios\interpret_NLP_models folder, used by explain_BERT_model.ipynb. I know from other pull requests that files like these are not allowed to be merged. So, can anyone help me upload these two files somewhere? Thanks for your help in advance : )
Related Issues
Our issue is #62.
Checklist:
(Which .md files should I modify or add?)