Add support for tunneling JDBC traffic over an HTTP proxy #541
It looks like the patch uses Java 8 features. Currently, Gobblin is Java 7-based.
Ah, right. Will fix.
The test that fails has a 137 return code from its Gradle Test Executor. It could be that a specific test is failing because it's using up too much memory, in which case TravisCI SIGKILLs it (see: https://discuss.gradle.org/t/gradle-travis-ci/11928/3). I'll try to fix the test to use less memory.
@chavdar It seems TravisCI is killing certain test processes (possibly because they are consuming too many resources). I just significantly reduced the memory usage of some tests, but they still continue to get killed. Can you think of anything we can do to debug or sidestep this issue?
While I'm trying to debug the TravisCI/gradle issue, here are instructions for performing end-to-end testing for these changes using gobblin standalone and a publicly accessible MySQL database: https://gist.github.com/kunalkandekar/b4f8c5fff6df62084e93
@sahilTakiar can you please help with reviewing this?
Will start reviewing today.
A few general comments: 1: Can you squash all your commits into a single commit (http://stackoverflow.com/questions/5189560/squash-my-last-x-commits-together-using-git)
Thanks for the comments, will incorporate the feedback.
@kunalkandekar just one more comment: Gobblin is not using the standard LinkedIn coding style so none of our classes use variable names starting with
Thanks @sahilTakiar, am incorporating your feedback. Out of curiosity, could you explain the rationale behind squashing commits into a single one? There is no detailed design doc, only the comment on the original commit; I will include details in the JavaDocs.
We generally try to have 1 commit per new feature in Gobblin. This way we can have a single, descriptive commit message describing the change.
@sahilTakiar I see. Would it make sense to squash the commits once I push my current changes or should I do so beforehand?
@pcadabam Can you also help @sahilTakiar with the review?
Looking into it.
 *
 * Implements a tunnel through a proxy to resource on the internet
 */
public class Tunnel {
This class seems too long. Can you pull out the private classes into a different file? Also, can you add more javadoc for the public methods and the classes?
Sure, this is also what @sahilTakiar suggested. I'm refactoring accordingly.
@kunalkandekar you can go ahead and push all your local changes, and squash the commits once the PR is ready to be merged
_delayedDoubleEchoServer = startDoubleEchoServer(1000);
System.out.println("Delayed DoubleEchoServer on " + _delayedDoubleEchoServer.getServerSocketPort());
_talkFirstEchoServer = startTalkFirstEchoServer();
System.out.println("TalkFirstEchoServer on " + _delayedDoubleEchoServer.getServerSocketPort());
same as above
Overall looks good. To squash the commits: git rebase -i HEAD~num_commits_to_squash
private static final ByteBuffer OK_REPLY = ByteBuffer.wrap("HTTP/1.1 200".getBytes());
private static final Set<ByteBuffer> OK_REPLIES = new HashSet<ByteBuffer>(
    Arrays.asList(OK_REPLY, ByteBuffer.wrap("HTTP/1.0 200".getBytes())));
Should this be HTTP 1.1 instead of HTTP 1.0? It might be good to make the HTTP version a string constant so that version changes are easy.
The HTTP/1.0 string is actually in addition to the HTTP/1.1 response. This ProxySetupHandler compares the response from the proxy against both strings. This is to handle the case where the intermediate proxy is based on HTTP 1.0 rather than 1.1. May help to move them to string constants, however.
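To illustrate the point about accepting either protocol version, here is a minimal sketch with the two status prefixes pulled out into string constants. The class `ProxyReplyCheck` and the method `isConnectOk` are hypothetical names chosen for this example and are not the PR's `ProxySetupHandler`; the sketch also assumes the start of the proxy's reply is already available in a single `ByteBuffer`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the code from this PR.
final class ProxyReplyCheck {
  // Status-line prefixes kept as constants so a version change touches one place
  static final String HTTP_1_0_OK = "HTTP/1.0 200";
  static final String HTTP_1_1_OK = "HTTP/1.1 200";
  private static final List<String> OK_PREFIXES = Arrays.asList(HTTP_1_0_OK, HTTP_1_1_OK);

  /** Returns true if the remaining bytes of the reply start with an accepted prefix. */
  static boolean isConnectOk(ByteBuffer reply) {
    // Decode a read-only view so the caller's buffer position is left untouched
    String head = StandardCharsets.US_ASCII.decode(reply.asReadOnlyBuffer()).toString();
    for (String prefix : OK_PREFIXES) {
      if (head.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    ByteBuffer ok = ByteBuffer.wrap(
        "HTTP/1.0 200 Connection established\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
    ByteBuffer denied = ByteBuffer.wrap(
        "HTTP/1.1 407 Proxy Authentication Required\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
    System.out.println(isConnectOk(ok));     // true
    System.out.println(isConnectOk(denied)); // false
  }
}
```

Comparing decoded prefixes rather than whole `ByteBuffer` instances is a design choice of this sketch; it tolerates whatever reason phrase the proxy appends after the status code.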
…s could be because the Tunnel thread was not dying. This is an experimental change to test this hypothesis.
 * @author kkandekar@linkedin.com
 */
abstract class EasyThread extends Thread {
  static Set<EasyThread> ALL_THREADS = Collections.synchronizedSet(new HashSet<EasyThread>());
make this final so it can't be mutated by any other class
@kunalkandekar LGTM! Are there any other changes you want to commit before I merge this?
Hey @sahilTakiar, I'm just running some end-to-end tests once more on the latest changes with some adverse situations (connections being cut unexpectedly, the proxy dying halfway through, etc.) to ensure these are handled cleanly. Will ping as soon as I'm done.
Can you resolve the merge conflicts?
Whoops, hadn't noticed the conflicts, sorry. Merged and pushed.
Do you know what's going on with the Java 8 build on Travis? Let's just disable any tests that currently don't run on Travis using the
@sahilTakiar I've tried disabling all Tunnel tests by adding them to the "disabledOnTravis" group, but they are apparently still running and the build is still failing. Could you take a look and let me know if I'm doing it wrong?
Add support for tunneling JDBC traffic over an HTTP proxy
@kunalkandekar thanks for doing all of this and for addressing all the comments! This feature has now been merged!
@sahilTakiar nice! Thanks!
Fixing FindBugs warnings for gobblin-tunnel from #541
Frequently data stores to be accessed by Gobblin reside outside data centers. In these cases, outbound access from a data center typically needs to go through a gateway proxy for security purposes. In some cases this is an HTTP proxy. However, some protocols like JDBC don't support the concept of "proxies", let alone HTTP proxies, and hence a solution is needed to enable this.
This Pull Request provides a method of tunneling JDBC connections over an HTTP proxy. Note that it is currently only implemented for JDBC, but can be extended to work with any other TCP-based protocol.
The way this works for JDBC is:
The Tunnel can accept as many connections as the JdbcExtractor opens. The Tunnel uses NIO to minimize resource usage.
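As a rough sketch of the general technique, assuming the tunnel asks the proxy to open the TCP connection with a standard HTTP CONNECT request (consistent with the "HTTP/1.x 200" replies checked in the review above), the request bytes could be built as follows. The class `ConnectRequestSketch`, the method `buildConnectRequest`, and the host and port are all hypothetical and are not taken from the PR.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not the code from this PR.
final class ConnectRequestSketch {
  /** Builds an HTTP CONNECT request asking a proxy to open a TCP tunnel to host:port. */
  static ByteBuffer buildConnectRequest(String host, int port) {
    String target = host + ":" + port;
    // Request line, Host header, then a blank line that ends the header block
    String request = "CONNECT " + target + " HTTP/1.1\r\n"
        + "Host: " + target + "\r\n"
        + "\r\n";
    return ByteBuffer.wrap(request.getBytes(StandardCharsets.US_ASCII));
  }

  public static void main(String[] args) {
    // db.example.com:3306 is a placeholder for a remote MySQL endpoint
    ByteBuffer request = buildConnectRequest("db.example.com", 3306);
    System.out.print(StandardCharsets.US_ASCII.decode(request));
  }
}
```

Once the proxy answers with a 200 status, the connection behaves as a plain byte pipe, so a JDBC driver pointed at the tunnel's local port would not need to know a proxy is involved. A `ByteBuffer` is used here because it can be written directly to an NIO `SocketChannel`.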