
Python3: Validation problems in job, exiting #1082

Open
maharshibhavsar opened this issue Jul 8, 2018 · 6 comments
Labels
help wanted Something is not working and a deeper look from the community is appreciated

Comments

maharshibhavsar commented Jul 8, 2018

Used Spark version
SPARK 2.2.1

Used Spark Job Server version
Released version 0.8.1

Deployed mode
Spark Standalone

Used Python Executable
Python3

Actual (wrong) behavior
The word_count example runs successfully (even with the pysql-context), but sql_average returns:
java.lang.Exception: Python job failed with error code 1 and standard err [Validation problems in job, exiting]

Steps to reproduce
This works fine
curl -d 'input.strings = ["a", "b", "a", "b" ]' "localhost:8090/jobs?appName=example_jobs&classPath=example_jobs.word_count.WordCountSparkSessionJob&context=pysql-context"

I've also tried providing input.strings via a .conf file, which also worked fine for WordCount.

The SQL Average job, however, fails with the validation error:
curl -d @sqlinput.conf "localhost:8090/jobs?appName=example_jobs&classPath=example_jobs.sql_average.SQLAverageJob&context=pysql-context"

The content of sqlinput.conf is the same as mentioned in the docs:
input.data = [ ["bob", 20, 1200], ["jon", 21, 1400], ["mary", 20, 1300], ["sue", 21, 1600] ]
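For reference, the aggregation this input is presumably meant to exercise can be sketched in plain Python. This is an illustrative stand-in (grouping by the age column and averaging the last column), not the actual SQLAverageJob implementation:

```python
# Hypothetical sketch of an "SQL average" over input.data:
# group rows by the second column (age) and average the third.
from collections import defaultdict

input_data = [["bob", 20, 1200], ["jon", 21, 1400],
              ["mary", 20, 1300], ["sue", 21, 1600]]

grouped = defaultdict(list)
for name, age, value in input_data:
    grouped[age].append(value)

averages = {age: sum(vals) / len(vals) for age, vals in grouped.items()}
print(averages)  # {20: 1250.0, 21: 1500.0}
```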
Logs

[2018-07-09 13:08:58,831] INFO  .jobserver.JobManagerActor [] [akka://JobServer/user/context-supervisor/pysql-context] - Creating new JobId for current job
[2018-07-09 13:08:58,832] INFO  .jobserver.JobManagerActor [] [akka://JobServer/user/context-supervisor/pysql-context] - Starting Spark job bfd35817-2f0d-4542-b017-0c1013c3afc7 [example_jobs.sql_average.SQLAverageJob]...
[2018-07-09 13:08:58,833] INFO  .jobserver.JobManagerActor [] [akka://JobServer/user/context-supervisor/pysql-context] - Starting job future thread
[2018-07-09 13:08:58,833] INFO  k.jobserver.JobStatusActor [] [akka://JobServer/user/context-supervisor/pysql-context/$a] - Job bfd35817-2f0d-4542-b017-0c1013c3afc7 started
[2018-07-09 13:08:58,837] INFO  jobserver.python.PythonJob [] [akka://JobServer/user/context-supervisor/pysql-context] - Running example_jobs.sql_average.SQLAverageJob from /tmp/spark-jobserver/sqldao/data/example_jobs-20180706_034417_092.egg
[2018-07-09 13:08:58,838] INFO  jobserver.python.PythonJob [] [akka://JobServer/user/context-supervisor/pysql-context] - Using Python path of /tmp/spark-jobserver/sqldao/data/example_jobs-20180706_034417_092.egg:/home/maharshi/spark-2.2.1-bin-hadoop2.7/python/lib/pyspark.zip:/home/maharshi/spark-2.2.1-bin-hadoop2.7/python/lib/py4j-0.9-src.zip:/home/maharshi/spark-jobserver/job-server-extras/../job-server-python/target/python/spark_jobserver_python-0.8.1_SNAPSHOT-py2.7.egg
[2018-07-09 13:09:00,058] ERROR jobserver.python.PythonJob [] [akka://JobServer/user/context-supervisor/pysql-context] - From Python: Validation problems in job, exiting
[2018-07-09 13:09:00,125] ERROR jobserver.python.PythonJob [] [akka://JobServer/user/context-supervisor/pysql-context] - Python job failed with error code 1
[2018-07-09 13:09:00,126] ERROR .jobserver.JobManagerActor [] [akka://JobServer/user/context-supervisor/pysql-context] - Got Throwable
java.lang.Exception: Python job failed with error code 1 and standard err [Validation problems in job, exiting]
	at spark.jobserver.python.PythonJob$$anonfun$1.apply(PythonJob.scala:87)
	at scala.util.Try$.apply(Try.scala:192)
	at spark.jobserver.python.PythonJob.runJob(PythonJob.scala:60)
	at spark.jobserver.python.PythonJob.runJob(PythonJob.scala:12)
	at spark.jobserver.JobManagerActor$$anonfun$getJobFuture$4.apply(JobManagerActor.scala:447)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2018-07-09 13:09:00,134] ERROR .jobserver.JobManagerActor [] [akka://JobServer/user/context-supervisor/pysql-context] - Exception from job bfd35817-2f0d-4542-b017-0c1013c3afc7: 
java.lang.Exception: Python job failed with error code 1 and standard err [Validation problems in job, exiting]
	at spark.jobserver.python.PythonJob$$anonfun$1.apply(PythonJob.scala:87)
	at scala.util.Try$.apply(Try.scala:192)
	at spark.jobserver.python.PythonJob.runJob(PythonJob.scala:60)
	at spark.jobserver.python.PythonJob.runJob(PythonJob.scala:12)
	at spark.jobserver.JobManagerActor$$anonfun$getJobFuture$4.apply(JobManagerActor.scala:447)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2018-07-09 13:09:00,134] INFO  k.jobserver.JobStatusActor [] [akka://JobServer/user/context-supervisor/pysql-context/$a] - Job bfd35817-2f0d-4542-b017-0c1013c3afc7 finished with an error
[2018-07-09 13:09:00,135] INFO  r$RemoteDeadLetterActorRef [] [akka://JobServer/deadLetters] - Message [spark.jobserver.CommonMessages$JobErroredOut] from Actor[akka://JobServer/user/context-supervisor/pysql-context/$a#804708427] to Actor[akka://JobServer/deadLetters] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.

noorul commented Jul 9, 2018

In the SJS log that you posted, I don't see any reference to "example_jobs.sql_average.SQLAverageJob".

maharshibhavsar commented Jul 9, 2018

Hi, thanks for the reply, and sorry for the mistake. I have updated the post with the correct logs; can you please take a look now?

noorul commented Jul 9, 2018

@maharshibhavsar What does the curl command return?

@maharshibhavsar

This is the response of the curl request through which I submitted the job:

{
    "duration": "Job not done yet",
    "classPath": "example_jobs.sql_average.SQLAverageJob",
    "startTime": "2018-07-09T13:08:58.831+05:30",
    "context": "pysql-context",
    "status": "STARTED",
    "jobId": "bfd35817-2f0d-4542-b017-0c1013c3afc7",
    "contextId": ""
}

And this is the response of localhost:8090/jobs/bfd35817-2f0d-4542-b017-0c1013c3afc7:

{
    "duration": "1.302 secs",
    "classPath": "example_jobs.sql_average.SQLAverageJob",
    "startTime": "2018-07-09T13:08:58.831+05:30",
    "context": "pysql-context",
    "result": {
        "message": "java.lang.Exception: Python job failed with error code 1 and standard err [Validation problems in job, exiting]",
        "errorClass": "java.lang.RuntimeException",
        "stack": "java.lang.RuntimeException: java.lang.Exception: Python job failed with error code 1 and standard err [Validation problems in job, exiting]\n\tat spark.jobserver.python.PythonJob$$anonfun$1.apply(PythonJob.scala:87)\n\tat scala.util.Try$.apply(Try.scala:192)\n\tat spark.jobserver.python.PythonJob.runJob(PythonJob.scala:60)\n\tat spark.jobserver.python.PythonJob.runJob(PythonJob.scala:12)\n\tat spark.jobserver.JobManagerActor$$anonfun$getJobFuture$4.apply(JobManagerActor.scala:447)\n\tat scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)\n\tat scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n"
    },
    "status": "ERROR",
    "jobId": "bfd35817-2f0d-4542-b017-0c1013c3afc7",
    "contextId": ""
}

@maharshibhavsar

Hi, I just noticed that the two classes (SQLAverageJob & WordCountSparkSessionJob) have different validate methods.
In SQLAverageJob, if I change the validate method from

def validate(self, context, runtime, config):
    problems = []
    job_data = None
    if not isinstance(context, SQLContext):
        problems.append('Expected a SQL context')
    if config.get('input.data', None):
        job_data = config.get('input.data')
    else:
        problems.append('config input.data not found')
    if len(problems) == 0:
        return job_data
    else:
        return build_problems(problems)

to a method similar to the one used in WordCountSparkSessionJob, which would be something like

def validate(self, context, runtime, config):
    if config.get('input.data', None):
        return config.get('input.data')
    else:
        return build_problems(['config input.data not found'])

then it runs successfully. Is there a specific reason why the validate methods differ between the two classes? Is there a downside I'm missing?
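For anyone hitting the same error: the failing branch can be illustrated without Spark. If the context object handed to validate is a SparkSession rather than a SQLContext (pyspark's SparkSession is not a SQLContext subclass), the isinstance check appends a problem, validate returns the problems object, and the job server aborts with "Validation problems in job, exiting". A minimal sketch, where SQLContext, SparkSession, and build_problems are simplified stand-ins for the real pyspark/sparkjobserver objects, not the actual classes:

```python
# Dummy stand-ins: in pyspark, SparkSession is NOT a subclass of SQLContext,
# which is the property this sketch relies on.
class SQLContext:
    pass

class SparkSession:
    pass

def build_problems(problems):
    # Simplified placeholder for the job server's build_problems helper.
    return {'problems': problems}

def validate(context, config):
    # Mirrors the structure of SQLAverageJob's validate method.
    problems = []
    if not isinstance(context, SQLContext):
        problems.append('Expected a SQL context')
    job_data = config.get('input.data')
    if job_data is None:
        problems.append('config input.data not found')
    return job_data if not problems else build_problems(problems)

config = {'input.data': [["bob", 20, 1200], ["jon", 21, 1400]]}

# With a SQLContext, the data passes validation unchanged...
assert validate(SQLContext(), config) == config['input.data']

# ...but with a session-backed context the type check fails, matching
# the "Validation problems in job, exiting" error seen in the logs.
assert validate(SparkSession(), config) == {'problems': ['Expected a SQL context']}
```

This is consistent with the observed fix: the WordCountSparkSessionJob-style validate omits the context type check, so it accepts the session-based pysql-context.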

@Nibooor Nibooor added the help wanted Something is not working and a deeper look from the community is appreciated label Nov 26, 2020
@AreRex14

Can different validation logic cause a job error? I mean, these are just examples, and as I understand it, the validation can be edited to suit whatever we need to validate, right? Or am I wrong? I'm just trying out SJS, so I'm a beginner here.
