Adding JDBC postgres driver connection for external db to environment / jupyter #850

Open
RobertSellers opened this issue Apr 16, 2019 · 1 comment

Comments

@RobertSellers
commented Apr 16, 2019

What docker image you are using?

pyspark-notebook on an Ubuntu 18.04 server.

What complete docker command do you run to launch the container?

I am using a docker-compose.yaml, for three services: jupyter, spark-master, and spark-worker-1. For the spark services, I am running (approximately):

spark-master:
       image: "pyspark-notebook"
       command: /home/jovyan/start-spark.sh
       volumes:
              - /local/scratch-drive/:/scratch
              - /local/work/:/usr/local/spark/work

spark-worker-1:
       image: "pyspark-notebook"
       command: /home/jovyan/start-spark-worker.sh
       volumes:
              - /local/scratch-drive/:/scratch
              - /local/work/:/usr/local/spark/work

What steps do you take once the container is running to reproduce the issue?

I am not very familiar with Java, but I have tried a variety of things. Mainly, I have been trying to use the jaydebeapi Python library inside Jupyter and point it at the three driver .jar files (one primary and two dependencies); call them driver.jar, dependency1.jar, and dependency2.jar. I have run the following (the jclassname and URL are placeholders, not the real values) to no avail:

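import os
import jaydebeapi

path = "/path/with/three_driver_jars/"
# os.listdir returns bare file names, so join each one with the directory
jars = [os.path.join(path, name) for name in os.listdir(path) if name.endswith(".jar")]
# user and pw are assumed to be defined earlier (credentials omitted here)
conn = jaydebeapi.connect(jclassname='com.example.jdbc.Driver',
                    url='jdbc:https://0.0.0.0.0/sql:.',
                    driver_args=[user, pw],
                    jars=jars)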

What do you expect to happen?

I expect to create a connection. I have been able to compile simple .java programs on a local Windows machine to read out records, but I cannot get the same JDBC configuration working inside this Docker setup. I have tried moving the .jars to the JRE directory, editing spark-defaults.conf (see the example below), and adding "CLASSPATH" environment variables.

(inside spark-defaults.conf)

spark.driver.extraClassPath /shared_jars/driver.jar:/shared_jars/dependency1.jar:/shared_jars/dependency2.jar
spark.executor.extraClassPath /shared_jars/driver.jar:/shared_jars/dependency1.jar:/shared_jars/dependency2.jar
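
These two settings only affect the JVMs that Spark itself launches. jaydebeapi appears to start a separate JVM through JPype and to take its classpath from the `jars` argument (and the CLASSPATH environment variable), so it would likely not pick up spark-defaults.conf. If the driver is visible on Spark's classpath, reading through Spark's own JDBC data source is one alternative. A minimal sketch, assuming an existing SparkSession; the helper names `build_jdbc_options` and `read_table`, the table name, and the URL are hypothetical:

```python
def build_jdbc_options(url, driver_class, table, user, password):
    """Collect the options understood by Spark's JDBC data source."""
    return {
        "url": url,              # JDBC URL of the external database
        "driver": driver_class,  # fully qualified driver class name
        "dbtable": table,        # table (or subquery) to read
        "user": user,
        "password": password,
    }


def read_table(spark, options):
    """Load a table as a DataFrame through the given SparkSession."""
    return spark.read.format("jdbc").options(**options).load()


# Hypothetical usage inside a notebook where `spark`, `user` and `pw` exist:
# opts = build_jdbc_options("jdbc:example://host/db",
#                           "com.example.jdbc.Driver",
#                           "some_table", user, pw)
# df = read_table(spark, opts)
```

Keeping the options in a plain dictionary makes it easy to print them (minus the password) when checking that the driver class name matches what the jars provide.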

What actually happens?

I am routinely confronted with the following error:

java.lang.RuntimeExceptionPyRaisable: java.lang.RuntimeException: Class com.example.jdbc.Driver not found
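
One way to narrow down a "Class ... not found" error is to confirm which jar, if any, actually contains the class file. The sketch below is a hypothetical helper (`find_class_in_jars` is not part of jaydebeapi) that scans a directory of jars with the standard `zipfile` module. It relies on a jar being a zip archive in which a class `a.b.C` is stored as `a/b/C.class`:

```python
import os
import zipfile


def find_class_in_jars(jar_dir, class_name):
    """Return full paths of the jars in jar_dir that contain class_name."""
    # A class com.example.Foo lives at com/example/Foo.class inside the jar.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for name in sorted(os.listdir(jar_dir)):
        if not name.endswith(".jar"):
            continue
        jar_path = os.path.join(jar_dir, name)
        with zipfile.ZipFile(jar_path) as jar:
            if entry in jar.namelist():
                matches.append(jar_path)
    return matches


# Hypothetical usage with the placeholder names from this issue:
# find_class_in_jars("/path/with/three_driver_jars/", "com.example.jdbc.Driver")
```

If the list comes back empty, the name passed as jclassname probably does not match what the jars ship. If it is non-empty, the problem is more likely in how the jar paths reach the JVM.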

For more clarity, I have successfully run the following program on Windows, after compiling it against the three .jars and executing it, and it outputs data, so the driver and credentials seem to be OK.

public class ConnectTest {
    // username and password are assumed to be defined elsewhere (omitted here)
    public static void main(String[] args) throws java.sql.SQLException {
        java.sql.Driver driver = new com.example.jdbc.Driver();
        java.util.Properties info = new java.util.Properties();
        info.put("user", username);
        info.put("password", password);
        java.sql.Connection conn = driver.connect("https://0.0.0.0.0/sql", info);
        // et cetera....
        conn.close();
    }
}

Am I missing something here? Is the Java environment used for Spark incompatible or in need of modification? Or is this something I can hack together inside the current container state? I am very new to Java, but have decent experience with Python and Docker.

@RobertSellers changed the title from "Adding JDBC postgres driver connection for external db to environment / jupyter code" to "Adding JDBC postgres driver connection for external db to environment / jupyter" Apr 16, 2019
@parente

Member

commented Apr 21, 2019

We're starting to experiment with a general Q&A section on https://discourse.jupyter.org/c/questions to see if cross-technology questions like this one catch more attention from a broader community audience. You might try re-posting your question over there to see if someone with more experience in this topic can help.

If you do post the question again on the Discourse site, feel free to leave a link in a comment here for those that happen upon this closed issue.
