Skip to content
This repository has been archived by the owner on Jun 7, 2021. It is now read-only.

Updated hadoop version from 2.6.0 to 3.1.0 #607

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Updated hadoop version from 2.6.0 to 3.1.0 #607

wants to merge 1 commit into from

Conversation

abossert
Copy link

As discussed on the dev mail list, I have upgraded Hadoop from 2.6.0 to 3.1.0. The build is successful, but no other functionality or incorporation of new Hadoop 3.1 features yet...

@@ -119,8 +119,8 @@
@Deprecated
public static final String KEY_TAB_FILE = StramUserLogin.DT_AUTH_PREFIX + "store.keytab";
public static final String TOKEN_ANTICIPATORY_REFRESH_FACTOR = StramUserLogin.DT_AUTH_PREFIX + "token.refresh.factor";
public static final long DELEGATION_TOKEN_MAX_LIFETIME_DEFAULT = 7 * 24 * 60 * 60 * 1000;
public static final long DELEGATION_TOKEN_RENEW_INTERVAL_DEFAULT = 24 * 60 * 60 * 1000;
public static final long RM_DELEGATION_TOKEN_MAX_LIFETIME_DEFAULT = 7 * 24 * 60 * 60 * 1000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest you submit this change in a separate PR as it is not related to the Hadoop 3.1 change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to remember Intellij complaining about those two lines after I made the changes to the constants used by Hadoop 3.1. Perhaps I misunderstood/misinterpreted the relationship?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to rename Apex constants.

* exception java.io.IOException is never thrown in body of corresponding try statement
* The remedy for this appears to be to use a more specific exception as the current try/catch block could
* not spit out IOException. For now, just using Exception to try and get this to build, will revisit later. */
} catch (Exception ex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried removing the catch clause completely and let your IDE insert the catch clause(s) for the exact specific types? Exception is more general than IOException but includes RuntimeException, is that why the compiler likes it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that is a good idea...should have thought of it....I'll give that a shot...as it relates to why Exception made the compiler happy...I suspect because Exception covers pretty much everything, it is a horrible choice, but not, from a compiler's standpoint, incorrect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove try/catch as getPos() does not throw IOException now.

* been deprecated as far back as I can find Hadoop versions (2.4.1 at least), HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT and
* HADOOP_CALLER_CONTEXT_ENABLED_KEY are the only two constants in 3.1.0 that seem even remotely related...
* */
boolean sslEnabled = conf.getBoolean(CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_KEY, CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments on this:

  • instead of having a tentative fix like this why can't we have a permanent fix? See below
  • Do we know what string constants these identifiers CommonConfigurationKeysPublic.HADOOP_SSL_ENABLED_KEY and CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_KEY stand for?
  • Same thing for CommonConfigurationKeysPublic.HADOOP_SSL_ENABLED_DEFAULT and CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT
  • if they are the same (which I doubt) we haven't introduced any incompatibility as far as external user impact is concerned.
  • if they are different we can just break the compatibility (not desirable) by using these new property names
  • or (desirable) honor the old property value by defining it here and checking for it before checking the new value.\
  • the comment is also repeated elsewhere. Ideally it should only exist in one place.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You make a good point...wil look into the constants more...the weird thing is that in the Javadocs for Hadoop, those two constants have no information about them as far back as I can see...other than being marked as deprecated...will do some more digging. From the APEX side of the house, what are we doing with this function other than the obvious...checking to see if SSL is enabled.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how reliable this page is...but the string/boolean values for the constants seem to make sense...I will hunt down which constants, if any, replaced them...

public static final String HADOOP_SSL_ENABLED_KEY = "hadoop.ssl.enabled"
public static final boolean HADOOP_SSL_ENABLED_DEFAULT = false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be worthwhile asking on the users@ list if anyone is using Apex in SSL mode. If no-one is using SSL we don't need to worry about breaking compatibility. But we should still confirm that HADOOP_CALLER_CONTEXT_ENABLED_KEY is the right property to check to verify SSL is enabled in the user's Hadoop cluster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think our comments crossed. And what's the value of HADOOP_CALLER_CONTEXT_ENABLED_KEY ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not sufficient to ask on "user@apex" who uses SSL. It is necessary to vote on "dev@apex" prior to dropping SSL support and I will be the one to -1 it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vrozov who is recommending dropping SSL support? I didn't suggest that if that's the impression you got. My question was in connection with creating incompatibility which might be okay if no-one is currently using SSL

@sanjaypujare
Copy link
Contributor

Both the Jenkins and Travis builds/tests have failed. Could you check the failures and see if they are related to these changes?

@abossert
Copy link
Author

Both the Jenkins and Travis builds/tests have failed. Could you check the failures and see if they are related to these changes?

Yeah, I was just looking into that. I reverted all my changes, so I have a vanilla Master branch...the build fails there too with the tests:

 <failure message="Version number pattern does not match: Unknown" type="java.lang.AssertionError">java.lang.AssertionError: Version number pattern does not match: Unknown
	at com.datatorrent.stram.util.VersionInfoTest.testMavenProperties(VersionInfoTest.java:64)
</failure>

@sanjaypujare
Copy link
Contributor

If this is the same failure you see in your PR CI tests then that's good news. Not sure why we are getting "Unknown" from the VersionInfo object but it might be worthwhile debugging and fixing that in a separate PR.

@vrozov
Copy link
Member

vrozov commented Jan 16, 2019

@abossert Please open JIRA and follow "Opening Pull Requests" guidelines (see http://apex.apache.org/contributing.html)

@@ -14,8 +14,7 @@
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
* under the License.*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undo this formatting change.

@@ -119,8 +119,8 @@
@Deprecated
public static final String KEY_TAB_FILE = StramUserLogin.DT_AUTH_PREFIX + "store.keytab";
public static final String TOKEN_ANTICIPATORY_REFRESH_FACTOR = StramUserLogin.DT_AUTH_PREFIX + "token.refresh.factor";
public static final long DELEGATION_TOKEN_MAX_LIFETIME_DEFAULT = 7 * 24 * 60 * 60 * 1000;
public static final long DELEGATION_TOKEN_RENEW_INTERVAL_DEFAULT = 24 * 60 * 60 * 1000;
public static final long RM_DELEGATION_TOKEN_MAX_LIFETIME_DEFAULT = 7 * 24 * 60 * 60 * 1000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to rename Apex constants.

* been deprecated as far back as I can find Hadoop versions (2.4.1 at least), HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT and
* HADOOP_CALLER_CONTEXT_ENABLED_KEY are the only two constants in 3.1.0 that seem even remotely related...
* */
boolean sslEnabled = conf.getBoolean(CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_KEY, CommonConfigurationKeysPublic.HADOOP_CALLER_CONTEXT_ENABLED_DEFAULT);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not sufficient to ask on "user@apex" who uses SSL. It is necessary to vote on "dev@apex" prior to dropping SSL support and I will be the one to -1 it.

* exception java.io.IOException is never thrown in body of corresponding try statement
* The remedy for this appears to be to use a more specific exception as the current try/catch block could
* not spit out IOException. For now, just using Exception to try and get this to build, will revisit later. */
} catch (Exception ex) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove try/catch as getPos() does not throw IOException now.

@sanjaypujare
Copy link
Contributor

@abossert just checking. Hopefully you are not stuck on anything and making progress on this PR.

@abossert
Copy link
Author

abossert commented Jan 24, 2019 via email

@sanjaypujare
Copy link
Contributor

@tweise added that test (com.datatorrent.stram.util.VersionInfoTest.testMavenProperties). Any ideas?

@abossert
Copy link
Author

abossert commented Jan 24, 2019 via email

@tweise
Copy link
Contributor

tweise commented Jan 24, 2019

@abossert looks like a different error than what is shown in the CI. How are you running the tests?

Try mvn clean install if that's not what you are doing already.

@abossert
Copy link
Author

abossert commented Jan 24, 2019 via email

@abossert
Copy link
Author

abossert commented Jan 24, 2019 via email

@abossert
Copy link
Author

abossert commented Jan 24, 2019 via email

@sanjaypujare
Copy link
Contributor

My preference: just make the change to honor dfs.http.policy only and not worry about dfs.https.enable and hadoop.ssl.enabled . We haven't heard about any existing Apex users using SSL.

@abossert
Copy link
Author

abossert commented Jan 25, 2019 via email

@sanjaypujare
Copy link
Contributor

Good question. As per https://hadoop.apache.org/docs/r3.1.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml, HTTP_AND_HTTPS means service is provided both on http and https. We use this setting for both outbound and inbound (apex web-service) and enhancing our inbound to support HTTP_AND_HTTPS will be some work which is not really needed at this time. I am okay to treat HTTP_AND_HTTPS as HTTPS_ONLY.

@abossert
Copy link
Author

abossert commented Jan 25, 2019 via email

@abossert
Copy link
Author

abossert commented Jan 25, 2019 via email

@abossert
Copy link
Author

abossert commented Jan 26, 2019 via email

@tweise
Copy link
Contributor

tweise commented Apr 13, 2019

@abossert thanks for the contribution. Seeing no activity on the PR since Jan, what's the plan forward?

@abossert
Copy link
Author

abossert commented Apr 15, 2019 via email

@vrozov
Copy link
Member

vrozov commented Jun 3, 2019

@abossert any update? If not, I'll close the PR, please re-open when there is an update.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
5 participants