Facing issue while running Pyspark GLM explainer notebook #5

Open
rupesh15203 opened this issue Nov 7, 2022 · 0 comments

Thanks for creating such a great library for model explanation. I tried to run the PySpark GLM explainer notebook on my local machine, and I am getting the following error while running the code:

TypeError Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 explain_stages = get_glm_explain_stages(predictions_view, features_coefficient_view, label_column, family="gaussian", link="log")
2 explain_pipeline = Pipeline(stages=explain_stages)
3 explain_df = explain_pipeline.fit(prediction_df).transform(prediction_df)

Input In [33], in get_glm_explain_stages(predictions_view, coefficients_view, label_column, family, link, variance_power, link_power)
4 link_function_type = resolve_link_function(family, link, link_power)
5 print(f"link_function_type : {link_function_type}")
6 stages = [
----> 7 OneHotDecoder(oheSuffix="_OHE", idxSuffix="_IDX", unknownSuffix="Unknown"),
8 SQLTransformer(statement=f"CREATE OR REPLACE TEMPORARY VIEW {predictions_view} AS SELECT * from THIS"),
9 GLMExplainTransformer(predictionView=predictions_view, coefficientView=coefficients_view,
10 linkFunctionType=link_function_type, label=label_column, nested=True,
11 calculateSum=True, family=family, variancePower=variance_power, linkPower=link_power
12 )
13 ]
14 return stages

File ~\.conda\envs\car_price_pred\lib\site-packages\pyspark\__init__.py:114, in keyword_only.<locals>.wrapper(self, *args, **kwargs)
112 raise TypeError("Method %s forces keyword arguments." % func.__name__)
113 self._input_kwargs = kwargs
--> 114 return func(self, **kwargs)

File ~\.conda\envs\car_price_pred\lib\site-packages\transparency\spark\ohe\decoder.py:22, in OneHotDecoder.__init__(self, oheSuffix, idxSuffix, unknownSuffix)
19 @keyword_only
20 def __init__(self, oheSuffix=None, idxSuffix=None, unknownSuffix=None):
21 super(OneHotDecoder, self).__init__()
---> 22 self._java_obj = self._new_java_obj(OneHotDecoder._classpath, self.uid)
24 self._setDefault(oheSuffix="_OHE", idxSuffix="_IDX", unknownSuffix="Unknown")
26 kwargs = self._input_kwargs

File ~\.conda\envs\car_price_pred\lib\site-packages\pyspark\ml\wrapper.py:66, in JavaWrapper._new_java_obj(java_class, *args)
64 java_obj = getattr(java_obj, name)
65 java_args = [_py2java(sc, arg) for arg in args]
---> 66 return java_obj(*java_args)

TypeError: 'JavaPackage' object is not callable

I added the jar to the Spark config while creating the Spark session:

from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("pyspark_glm_explain") \
        .config("spark.sql.execution.arrow.enabled", "true") \
        .config("spark.jars", r"spark_model_explainer-assembly-0.0.1.jar") \
        .enableHiveSupport() \
        .getOrCreate()
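
For reference, here is a minimal sanity check (a sketch only; it assumes the spark session created above is active) to see whether the jar actually reached the driver classpath. A 'JavaPackage' object is not callable error usually means the JVM never loaded the wrapped Scala class, and one common cause is a relative spark.jars path that does not resolve from the notebook's working directory:

from transparency.spark.ohe.decoder import OneHotDecoder

# The Scala class name the Python wrapper tries to instantiate on the JVM
# (this attribute appears in the traceback above).
print(OneHotDecoder._classpath)

# What spark.jars actually resolved to for this session.
print(spark.sparkContext.getConf().get("spark.jars", ""))

# Jars registered with the running SparkContext; the
# spark_model_explainer assembly jar should appear here.
print(spark.sparkContext._jsc.sc().listJars())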

I am using the following setup to run the code:
Operating System: Windows 10
PySpark: 3.3.1
Java: 11.0.16
transparency: 0.0.9

Can anyone help me figure out what exactly I am missing here? Thanks in advance.
