Connecting to presto through https #55

Closed
yaxxie opened this issue Jun 15, 2018 · 6 comments

Comments

@yaxxie

yaxxie commented Jun 15, 2018

I'm trying to connect to a Presto server that is reachable only via SSL. How is this achieved with omniduct? I note that pyhive has support for HTTPS, but I didn't see any way in the documentation to specify that the server is available over HTTPS only.

Advice appreciated.

@danfrankj
Collaborator

Hi @yaxxie

When creating an omniduct PrestoClient, you can pass additional connection_options as keyword arguments (https://github.com/airbnb/omniduct/blob/master/omniduct/databases/presto.py#L38). These then get passed to pyhive's presto client. In this case I think you want protocol='https'. (https://github.com/dropbox/PyHive/blob/19d7c2b8fb8b19dadeb2339d4d3f42c8f09204c1/pyhive/presto.py#L93)
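
As a rough illustration of the keyword forwarding described above, the sketch below uses stand-ins rather than omniduct or pyhive themselves. `SketchPrestoClient` and `fake_connect` are hypothetical names invented for this example; only the `protocol='https'` keyword comes from the linked pyhive source.

```python
# Hypothetical stand-ins: these are NOT omniduct's or pyhive's real classes.

def fake_connect(host, port=8080, protocol="http", **other_options):
    """Stand-in for a pyhive-style connect call; returns the base URL it would use."""
    return f"{protocol}://{host}:{port}"


class SketchPrestoClient:
    """Collects extra keyword arguments and forwards them when connecting."""

    def __init__(self, host, port, **connection_options):
        self.host = host
        self.port = port
        # Anything not consumed above is kept for the underlying driver.
        self.connection_options = connection_options

    def connect(self):
        return fake_connect(self.host, port=self.port, **self.connection_options)


client = SketchPrestoClient("presto.example.com", 443, protocol="https")
base_url = client.connect()
print(base_url)
```

The point of the pattern is that the wrapper does not need to know about `protocol` at all: any option the driver understands can be supplied at construction time.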

I'll close this but feel free to re-open if that doesn't work for you.

@yaxxie
Author

yaxxie commented Jun 18, 2018

Hi, it did not work for me. I can see from the source code that while the kwargs get passed to the connect call, they're not being passed to the SQLAlchemy engine: https://github.com/airbnb/omniduct/blob/master/omniduct/databases/presto.py#L76

@yaxxie
Author

yaxxie commented Jun 26, 2018

@danfrankj are you able to re-open? I'm not able to reopen this.

@naoyak reopened this Jun 26, 2018

@danfrankj
Collaborator

@yaxxie Ah I see, the SQLAlchemy engine is only used for schema browsing and saving dataframes ("push"ing). Are you able to query using connection_options (https://github.com/airbnb/omniduct/blob/master/omniduct/databases/presto.py#L75)? Querying should only use the pyhive client and does not rely on the SQLAlchemy engine.

@yaxxie
Author

yaxxie commented Jun 26, 2018

@danfrankj sorry for wasting your time -- I just realised that this is to do with a known issue with Presto returning http URLs when HTTPS is terminated at a load balancer.
The SQLAlchemy engine would still need the protocol passed through if SSL were enabled and we wanted to use the dataframe retrieval function, right? So perhaps that needs to be an enhancement request/PR?
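
One way such an enhancement might look, as a sketch only: PyHive's README shows its SQLAlchemy dialect accepting driver options through `connect_args`, so the same connection options could in principle be handed to the engine as well. The helper name `presto_engine_args` and the host, catalog, and schema values are hypothetical, and this is not omniduct's current code.

```python
def presto_engine_args(host, port, catalog, schema, **connection_options):
    """Return (url, connect_args) suitable for sqlalchemy.create_engine.

    Hypothetical helper: omniduct does not provide this function.
    """
    url = f"presto://{host}:{port}/{catalog}/{schema}"
    # Copy so the caller's options are not shared with the engine.
    connect_args = dict(connection_options)
    return url, connect_args


url, connect_args = presto_engine_args(
    "presto.example.com", 443, "hive", "default", protocol="https"
)

# With SQLAlchemy and PyHive installed, the engine could then be created as:
#   from sqlalchemy import create_engine
#   engine = create_engine(url, connect_args=connect_args)
```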

@danfrankj
Collaborator

@yaxxie That is an annoying issue, and hopefully the Presto team is working on it.
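
For context, a Presto client pages through results by following the `nextUri` field in each response, so a server behind a TLS-terminating load balancer may hand back `http://` links even though the client connected over HTTPS. A client-side workaround could rewrite the scheme before following the link. This is only a sketch under that assumption; `force_https` is a hypothetical helper, the example URI is made up, and none of this is part of pyhive or omniduct.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(uri):
    """Return `uri` with its scheme replaced by https; all other parts are unchanged."""
    parts = urlsplit(uri)
    return urlunsplit(parts._replace(scheme="https"))


# Made-up example of a link a server might return behind a load balancer.
next_uri = "http://presto.example.com/v1/statement/20180626_000000_00000_abcde/1"
print(force_https(next_uri))
```

Note that only the scheme is changed: if the returned link carried an explicit port, that port would be kept as-is and might also need adjusting.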

For omniduct, the connection we use to issue queries is pyhive's Presto client (https://github.com/airbnb/omniduct/blob/master/omniduct/databases/presto.py#L73), which you can already configure to use HTTPS and to retrieve dataframes.
