[question] : Support Spark SQL standalone #2483

Closed
amramtamir opened this issue Mar 27, 2017 · 2 comments

Comments

@amramtamir

Hi,
I am trying to connect Superset to Spark 2.1.0, but this is not well covered in the documentation yet.
I have tried going through the Thrift server using a hive:// SQLAlchemy URI, but with no success.

Do I need to set up Druid, or can this work without it?
Could you please share what is supported (if anything) and how to set it up?

My Spark cluster is standalone, and Superset is running on the same network as Spark and the Thrift server.

Thanks!

@xrmx
Contributor

xrmx commented Mar 27, 2017

Duplicate of #241.

xrmx closed this as completed Mar 27, 2017
@bobzdar

bobzdar commented Jul 13, 2017

I can connect using the hive:// SQLAlchemy URI, but saving the connection throws a "Could not locate column in row for column 'database_name'" error because there are no tables in the default (or any other) database. All of my tables are in global_temp so that thrift can run in multi-session mode for resource management, and thrift does not allow global_temp to be set as the default database since it is a system-preserved database. The problem with multi-session mode is that each connection gets a new session with no tables loaded, so I'm trying to figure out how to create a (dummy) table when the connection is initiated (using metadata_params?) so that Superset can find a table and save the connection. After that I should be able to query the global_temp tables.

If you run thrift in single-session mode and create some tables or views before connecting Superset, it should work. I did have to install some SASL dependencies into Superset's Python environment to get it to connect.
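Single-session mode is chosen when the Thrift server is started. Here is a minimal sketch of assembling that launch command. It assumes Spark's stock sbin/start-thriftserver.sh script, the spark.sql.hive.thriftServer.singleSession setting, and (optionally) the hive.server2.authentication property. The /opt/spark path and the helper name are hypothetical, so adjust them for your install.

```python
import shlex


def thrift_server_command(spark_home, single_session=True, nosasl=False):
    """Build the argv for launching the Spark Thrift server (hypothetical helper)."""
    cmd = [f"{spark_home}/sbin/start-thriftserver.sh"]
    if single_session:
        # Share one session across all connections so tables and views
        # created before Superset connects are visible to it.
        cmd += ["--conf", "spark.sql.hive.thriftServer.singleSession=true"]
    if nosasl:
        # Optional: disable SASL on the server side.
        cmd += ["--hiveconf", "hive.server2.authentication=NOSASL"]
    return cmd


# "/opt/spark" is an assumed install location, not taken from this thread.
command = thrift_server_command("/opt/spark", nosasl=True)
print(" ".join(shlex.quote(part) for part in command))
```

The function only builds and prints the command; run the printed line on the Spark host yourself.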

If you get a SASL error on connect, either start thrift in NOSASL mode and specify "auth":"NOSASL" in engine_params in Superset, or try installing the following:

pip install thrift-sasl
pip install sasl
pip install thriftpy
yum -y install cyrus-sasl-plain
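For the NOSASL route, here is a rough sketch of what goes into Superset. It assumes the hive:// dialect comes from PyHive and takes auth through SQLAlchemy's connect_args, and that Superset passes engine_params from the database's Extra field through to create_engine. Nesting the setting under connect_args is my assumption. The host name is a placeholder, and port 10000 is assumed as the usual Thrift server default.

```python
import json


def build_hive_uri(host, port=10000, database="default"):
    """Build a hive:// SQLAlchemy URI for the Thrift server."""
    return f"hive://{host}:{port}/{database}"


def build_superset_extra(auth="NOSASL"):
    """Return JSON for the 'Extra' field of a Superset database entry."""
    # Assumption: auth is forwarded to the driver via connect_args.
    extra = {"engine_params": {"connect_args": {"auth": auth}}}
    return json.dumps(extra, indent=2)


uri = build_hive_uri("thrift-host.example")  # hypothetical host name
extra_json = build_superset_extra()
print(uri)
print(extra_json)
```

The printed URI would go in the SQLAlchemy URI field and the JSON in Extra. If your Superset version expects a different shape for engine_params, adjust accordingly.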

Hope this helps.
