New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: ICE/PDP explainer #1284
feat: ICE/PDP explainer #1284
Conversation
…ICEExplainer.scala Co-authored-by: Jason Wang <jasonwang_83@hotmail.com>
…ICEExplainer.scala Co-authored-by: Jason Wang <jasonwang_83@hotmail.com>
…ICEExplainer.scala Co-authored-by: Jason Wang <jasonwang_83@hotmail.com>
…park into ezherdeva/ice_pdp
…ICEExplainer.scala Co-authored-by: Jason Wang <jasonwang_83@hotmail.com>
…park into ezherdeva/ice_pdp
…ICEExplainer.scala Co-authored-by: Jason Wang <jasonwang_83@hotmail.com>
…park into ezherdeva/ice_pdp
…g/FuzzingTest.scala Co-authored-by: Kashyap Patel <64443771+ms-kashyap@users.noreply.github.com>
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for this fantastic work! I put in a few nits but to give it a good review perhaps we should chat so I can get a better idea of what this does then
val result = predicted.withColumn(targetCol, explainTarget) | ||
|
||
getKind.toLowerCase match { | ||
case this.averageKind => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: idt you need the "this" here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added backticks, because it's looking for a stable identifier. Ref: https://stackoverflow.com/questions/7078022/why-does-pattern-matching-in-scala-not-work-with-variables
val targetClasses = DatasetExtensions.findUnusedColumnName("targetClasses", df) | ||
val dfWithId = df | ||
.withColumn(idCol, monotonically_increasing_id()) | ||
.withColumn(targetClasses, this.get(targetClassesCol).map(col).getOrElse(lit(getTargetClasses))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use getters directly here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
targetClassesCol is Optional by design, that's why we're using get like this
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a bit more detailed feedback. Looks awesome though and appreciate all the hard work and iterations!
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEFeature.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEFeature.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/com/microsoft/azure/synapse/ml/explainers/split1/ICEExplainerSuite.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/com/microsoft/azure/synapse/ml/explainers/split1/ICEExplainerSuite.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/com/microsoft/azure/synapse/ml/explainers/split1/ICEExplainerSuite.scala
Outdated
Show resolved
Hide resolved
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of the comments that are marked as resolved don't seem to be resolved perhaps I'm missing something or commenting too early
core/src/main/scala/com/microsoft/azure/synapse/ml/explainers/ICEExplainer.scala
Outdated
Show resolved
Hide resolved
} | ||
|
||
private def collectCategoricalValues[_](df: DataFrame, feature: ICECategoricalFeature): Array[_] = { | ||
val featureCount = DatasetExtensions.findUnusedColumnName("__feature__count__", df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want to rename this to featureCount to be consistent with other added columns
In this PR, I'm introducing ICETransformer and adding it in the com.microsoft.ml.spark.explainers package.
ICETransformer displays the model dependence on specified features with the given data frame.
Also, I added ICECategoricalFeature and ICENumericFeature classes which are used in ICETransformer.
All of these classes can be called from the python side.