Skip to content
This repository has been archived by the owner on May 20, 2021. It is now read-only.

Support for Presto 0.145 (work in progress) #179

Merged
merged 12 commits into from Aug 26, 2016

Conversation

gwittel
Copy link
Contributor

@gwittel gwittel commented May 18, 2016

Opening this PR per request in Issue #145 (comment). Unfortunately I don't have time to work further on this, so hopefully someone can take this and run with it.

Please note that all caveats mentioned in that thread still apply (see my comment here #145 (comment)). This PR represents a work in progress. It compiles, and launches. However, I was unable to test functionality end to end. I'm not confident that things all work.

Two main risk points I'm less confident with my changes:

  • The main thing that is likely broken is the Column type field mapping. The API isn't well documented and I just took a guess at what was needed (I didn't have a Hive cluster handy to compare output of SHOW COLUMNS FOR). This impacts the column cache (HiveColumn). When querying a table I do get an exception related to this, but I don't have the time to debug further at the moment.
  • Shiro -- Several APIs changed and while I think I've migrated the code to match the original intent (basically its about how principals and subjects are injected), I haven't checked/tested it in any way

Greg Wittel added 8 commits April 29, 2016 12:55
I don't feel particularly sure about these changes as its unclear
how/where the output from Hive's "SHOW COLUMNS" can map to this cleanly
Still Broken -- airlift relies on Jetty 9.3.9, but 0.90 dropwizard uses
9.2.  Dropwizard 1.0.x uses Jetty 9.3 series which may work
Required to support Jetty 9.3 series used by presto/airlift

Still broken: Column data type mapping
@@ -25,7 +25,8 @@ dataSourceFactory:
driverClass: com.mysql.jdbc.Driver
user: MYSQL_USER
password: MYSQl_PASSWORD
url: jdbc:mysql://127.0.0.1:3306/MYSQL_DB_NAME
# localhost or 127.0.0.1 don't work for some reason
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This snuck in. Feel free to toss the change to this file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you simply revert this file?

@john-bodley
Copy link
Contributor

Thanks @gwittel for having a crack at this. @bkyryliuk and I will try to review/test your change in the next few days.

Do you mind updating the README.md changing

./gradlew -Dairpal.prestoVersion=0.85 clean shadowJar

to

./gradlew -Dairpal.prestoVersion=0.145 clean shadowJar    

as well as the compatibility-chart (https://github.com/airbnb/airpal#compatibility-chart). I'm not sure whether we should bump the version or merely state this only works with the 0.145 version of Presto.

@gwittel
Copy link
Contributor Author

gwittel commented May 19, 2016

Updated. Did not rewrite history since the PR already existed. I figure this can be all squash merged once work in progress is all set/tested/etc.

@john-bodley
Copy link
Contributor

@gwittel whilst testing your change the query status bar isn't present and one needs to refresh the page in order to see 'My recent queries' update once the query has finished. Do you think this could be related to having deprecated OldJettyHttpClient? What was your reasoning for deprecating it rather than replacing AsyncHttpClient with HttpClient.

Note for reference we're building using Presto version 0.145 though we're connecting to an older version of Presto. We're hoping to have our test cluster upgraded to 0.145 sometime next week.

@gwittel
Copy link
Contributor Author

gwittel commented May 20, 2016

It seems odd that it would cause GUI breakage (unless the request is failing repeatedly and that's the net effect on the GUI). All I can think of is either the underlying HTTP request fails for some odd reason; or the changes upgrading across a few major versions of dropwizard broke the UI in another way.

The AsyncHttpClient interface that OldJettyHttpClient implemented was completely removed from Airlift a while ago; this removal is reflected in Presto. Since we were already bringing in the airlift http-client, it didn't make sense to maintain the clone of a legacy HTTP client; when airlift already has one that can operate in async and sync modes. (also built on Jetty HTTP).

@airhorns
Copy link

Any movement on this? We'd really love to use Airpal against a newish presto -- anything we can do to help?

@john-bodley
Copy link
Contributor

Last week I re-added the OldJettyHttpClient interface to see if it would rectify the issue of the missing status bar whilst querying but alas it didn't seem to fix the issue. That said we were testing against an older Presto version and our test cluster was upgraded to Presto version 0.145 on Friday so I plan on retesting this either later today or tomorrow.

Has anyone else tried testing @gwittel PR to see if they run into the same UI issue?

@eschuchmann
Copy link

any updates on merging this? We too would love to use Airpal with a newer version of presto.

@airhorns
Copy link

@eschuchmann we did some hacking internally to see if we could get this branch to work and were unable to, so there is more work to be done sadly. @drdee and @udnay could tell you more about what needs to be done.

@john-bodley
Copy link
Contributor

We finally upgraded our Presto test cluster to 0.145 last week though @bkyryliuk and myself have had no luck to date resolving the UI issue even after restoring OldJettyHttpClient. We hope to spend a few cycles this week looking further into the issue.

@airhorns were you hitting the same issues as us, i.e. the UI wouldn't update with an additional row in the 'My recent queries' table when the query was running?

@drdee
Copy link

drdee commented Jun 14, 2016

@johnbodley IIRC, if you refresh (F5) the webpage then that row will show up in the My recent queries table.

@john-bodley
Copy link
Contributor

@drdee with the current PR if one refreshes the page the My recent queries table includes the relevant row as you mentioned, though this behavior is suboptimal. Additionally the Data Preview is non-operational (tested with the OldJettyHttpClient).

@drdee
Copy link

drdee commented Jun 14, 2016

Yes, definitely sub-optimal behavior.

@gwittel
Copy link
Contributor Author

gwittel commented Jun 14, 2016

Sorry I haven't had time to look at this further on my side. Wild guess -- is it possible something is broken with the SSE servlet functionality? We've gone through a large Jersey upgrade via Dropwizard/Airlift. It would be interesting to trace the UI interaction and see if the SSE requests are working on both ends.

@laoyang945
Copy link

hope this could be fixed soon!
grateful for all devs

@natesammons-nasdaq
Copy link
Contributor

Is AirPal still under active development? We had been using it for some time, however it's been incompatible with recent versions of Presto so we can not use it. Just curious if there's a plan to keep this project going or not.

@jefffeng
Copy link

jefffeng commented Aug 5, 2016

Hi @natesammons-nasdaq, we are aware of the incompatability of Airpal with recent versions of Presto but we haven't found a solution around it yet. We will evaluate maintenance-related bugs with Airpal but we are not working on new feature development at this time.

@gabe-lyons
Copy link
Collaborator

gabe-lyons commented Aug 17, 2016

Hey @gwittel! I'm on the Data Tools team at Airbnb. You were right! I investigated into your issue and I found your problem is related to how dropwizard's gzip interacts with SSE. Unfortunately, the clients are unable to read from the eventstream because gzip is holding them in its buffer. These webpages are informative:

https://java.net/jira/browse/JERSEY-3000
dropwizard/dropwizard#1673
https://groups.google.com/forum/#!topic/dropwizard-user/qKu_9g1MLGc

For some reason, even when I try setting the buffer size to 0KiB as outlined here:

http://www.dropwizard.io/0.7.1/docs/manual/configuration.html#man-configuration-gzip

The events still get lost in gzip-land. I looked into the red part of your diff and noticed this line,

https://github.com/airbnb/airpal/pull/179/files#diff-1caa7ace00f5827f33c3aced0bb3717cL119

Which indicates that the previous version dropped gzip for similar reasons. If you update your reference.yml to like so

server:
    # insert this here
    gzip:
        enabled: false
    # continue with your reference.yml as expected
    ...

then gzip will no longer intercept the event stream, and updates will flow as normal. I noticed that specifically when I use presto-0.145, the progress bar is not filled in. I'm looking into that issue, but for now if you add those two lines to your reference.yml the main bug you were discussing will be fixed.

@laoyang945
Copy link

Great news from @GabeLoins

@gwittel
Copy link
Contributor Author

gwittel commented Aug 17, 2016

@GabeLoins Great job tracking that down. The old gzip code removed was for the HTTP client interacting with Presto APIs and not the HTTP server. However, end result is the same with the config :)

I will push the gzip-enabled=false for now. However, should (can?) we hard disable this in code to prevent accidental breakage?

…fered.

Upon later dropwizard upgrade we should consider a more fine tuned
approach using 'syncFlush=true' fix from dropwizard/dropwizard#1673.
@gabe-lyons
Copy link
Collaborator

gabe-lyons commented Aug 17, 2016

I believe it's possible. Looking at LeDominik's solution in the dropwizard issue, dropwizard/dropwizard#1673, he seems to have found a solution. I'm not familiar enough with jetty to get it working, do you think you could @gwittel?

Also, not sure if this solution is going to be any more reliable, it seems pretty sketchy.

@gabe-lyons
Copy link
Collaborator

@gwittel is this ready to merge? Or are there updates you would still like to make?

@gwittel
Copy link
Contributor Author

gwittel commented Aug 22, 2016

I think it depends how we feel about the known issues:

Open issues as I see them:

        // Disable GZIP content encoding for SSE endpoints
        environment.lifecycle().addServerLifecycleListener(server -> {
            for (Handler handler : server.getChildHandlersByClass(BiDiGzipHandler.class)) {
                ((BiDiGzipHandler) handler).addExcludedMimeTypes(SERVER_SENT_EVENTS);
            }
        });

Thoughts?

@gabe-lyons
Copy link
Collaborator

gabe-lyons commented Aug 23, 2016

@gwittel

  1. No progress on the progress bar. I would merge without the fix and when we get a fix to that we can throw up another pull request.

  2. The new code looks good. Can you add a commit with it so I can test it out on my end?

@gwittel
Copy link
Contributor Author

gwittel commented Aug 23, 2016

Done. Seems to work but would be good to confirm if you have a real vs dev environment to test against.

@gabe-lyons
Copy link
Collaborator

ok. tested out and looks good to me! I'll get someone with admin privs to merge this in.

@gabe-lyons gabe-lyons merged commit e3e6528 into airbnb:master Aug 26, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants