-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-22180][CORE] Allow IPv6 address in org.apache.spark.util.Utils.parseHostPort #19408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: ore
You might note here that you're checking that you don't have a [::1]:123 IPv6 address here -- the braces are key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turn the rule back on after the block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hex digits can be uppercase right?
Should the pattern it not be more like [0-9a-f]*(:[0-9a-f]*)+ match a number, then colon-number colon-number pairs, not number-colon-number number-colon-number sequences?
It might end up being equivalent because the match is for 0 or more digits.
This allows some strings that it shouldn't like "::::", but, the purpose isn't to catch every possible case I guess. It would fail name resolution.
I thought Inet6Address would just provide parsing for this but I guess not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes a real parser would be much better!!
I hope the methods will check the input. Like the name resolver...
At this point I thought more about the separation of the port.
I think it is important to check if two colons exists, otherwise this expression accepts hostnames like abc:123
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Final nit, use braces for both parts of the if-else
|
@obermeier Could you please rebase this with the latest master? Thanks! |
cf1d920 to
8026b0f
Compare
|
Done |
418927f to
a6894d5
Compare
|
Are you planning to fully address the IPv6 issues? If not, why do we choose to adapt this single function separately? |
|
I chose this function because I had some exceptions like this [1] if I used IPv6 hosts. I do not have an overview over the Spark code so I am currently not able to provide a general IPv6 solution. I think this code snipets will improve the situation and enables us to use IPv6 hosts in some cases. [1] |
|
Sounds good. |
jiangxb1987
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change LGTM, cc @cloud-fan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we should add more test cases to cover the invalid cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the preferred way to handle this kind of parse errors in Spark?
Changing the signature of this method to something like :Try[..], :Option ... is no option!?
Error log messages?
Unchecked Exceptions?
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this comment still valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this comment
….parseHostPort ## What changes were proposed in this pull request? Take ```hostPort``` as literal IPv6 address if it contains tow ore more colons. If IPv6 addresses are enclosed in square brackets port definition is still possible. ## How was this patch tested? Added a new test case into UtilsSuite Remove comment
a6894d5 to
453e104
Compare
|
ok to test |
|
Test build #84982 has finished for PR 19408 at commit
|
Remove whitespace at end of line.
|
Test build #85062 has finished for PR 19408 at commit
|
|
To rephrase @jiangxb1987's question - supporting IPv6 is a much larger effort, which spark currently does not. We should be addressing that as the problem to solve, and fix this as part of IPv6 support : eliminating individual exceptions could simply result in spark platform going into inconsistent states, without any telemetry in the logs on why (because we removed/'fixed' the expected exceptions). |
|
Fully agree with @mridulm 's big picture suggestion, and I also think supporting IPv6 should be designed as a integral feature, instead of just putting together some PRs. |
|
I total agree with you. |
|
This issue seems to be fixed in Spark 2.3.2 |
|
If Spark runs in YARN Cluster this issue still exists |
|
Can one of the admins verify this patch? |
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
External applications like Apache Cassandra are able to deal with IPv6 addresses. Libraries like spark-cassandra-connector combine Apache Cassandra with Apache Spark.
This combination is very useful IMHO.
One problem is that
org.apache.spark.util.Utils.parseHostPort(hostPort:String)takes the last colon to sepperate the port from host path. This conflicts with literal IPv6 addresses.I think we can take
hostPortas literal IPv6 address if it contains tow ore more colons. If IPv6 addresses are enclosed in square brackets port definition is still possible.