
Error reporting: check existence of the path to siva files #52

Closed
bzz opened this issue Sep 22, 2017 · 0 comments
bzz commented Sep 22, 2017

If the example from our README https://github.com/src-d/spark-api#pyspark-api-usage is tried literally

from sourced.spark import API as SparkAPI
from pyspark.sql import SparkSession
 
spark = SparkSession.builder.appName("test").master("local[*]").getOrCreate()
api = SparkAPI(spark, '/path/to/siva/files')
api.repositories.filter("id = 'github.com/mawag/faq-xiyoulinux'").references.filter("name = 'refs/heads/HEAD'").show()

with a non-existent path to .siva files, it will result in

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-1-7f67465f882f> in <module>()
      4 spark = SparkSession.builder.appName("test").master("local[*]").getOrCreate()
      5 api = SparkAPI(spark, '/path/to/siva/files')
----> 6 api.repositories.filter("id = 'github.com/mawag/faq-xiyoulinux'").references.filter("name = 'refs/heads/HEAD'").show()

/usr/local/spark/python/pyspark/sql/dataframe.py in show(self, n, truncate)
    334         """
    335         if isinstance(truncate, bool) and truncate:
--> 336             print(self._jdf.showString(n, 20))
    337         else:
    338             print(self._jdf.showString(n, int(truncate)))

/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1131         answer = self.gateway_client.send_command(command)
   1132         return_value = get_return_value(
-> 1133             answer, self.gateway_client, self.target_id, self.name)
   1134 
   1135         for temp_arg in temp_args:

/usr/local/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
     61     def deco(*a, **kw):
     62         try:
---> 63             return f(*a, **kw)
     64         except py4j.protocol.Py4JJavaError as e:
     65             s = e.java_exception.toString()

/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    317                 raise Py4JJavaError(
    318                     "An error occurred while calling {0}{1}{2}.\n".
--> 319                     format(target_id, ".", name), value)
    320             else:
    321                 raise Py4JError(

Py4JJavaError: An error occurred while calling o39.showString.
: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
Exchange hashpartitioning(repository_id#11, 200)
+- *Filter (((isnotnull(name#12) && (name#12 = refs/heads/HEAD)) && isnotnull(repository_id#11)) && (repository_id#11 = github.com/mawag/faq-xiyoulinux))
   +- *Scan GitRelation(org.apache.spark.sql.SQLContext@4faae818,references,/path/to/siva/files,/tmp) [repository_id#11,name#12,hash#13] PushedFilters: [IsNotNull(name), EqualTo(name,refs/heads/HEAD), IsNotNull(repository_id), EqualTo(repository_id,..., ReadSchema: struct<repository_id:string,name:string,hash:string>

which is not a very clean error message.

This can be fixed in the Scala part by implementing a proper existence check and clear error reporting when the given path does not exist.
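
A minimal sketch of what such a check could look like on the Scala side, using the Hadoop FileSystem API so that non-local paths (e.g. HDFS) are also covered. The object and method names below are placeholders, not existing spark-api internals; in practice the check could be called with the SparkContext's hadoopConfiguration at the point where the siva path is first received:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object SivaPathValidator {

  /** Throws a descriptive error if the configured siva files path does not exist
    * or is not a directory, instead of letting the scan fail deep inside Spark. */
  def requireExistingDirectory(pathStr: String,
                               conf: Configuration = new Configuration()): Unit = {
    val path = new Path(pathStr)
    val fs: FileSystem = path.getFileSystem(conf)
    if (!fs.exists(path)) {
      throw new IllegalArgumentException(s"siva files path does not exist: $pathStr")
    }
    if (!fs.getFileStatus(path).isDirectory) {
      throw new IllegalArgumentException(s"siva files path is not a directory: $pathStr")
    }
  }
}

With a check like this run before the relation is built, the PySpark user above would get a clear IllegalArgumentException naming the wrong path instead of the Py4JJavaError/TreeNodeException dump.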
