HADOOP-16965. Refactor abfs stream configuration. (#1956) #4171

Open
wants to merge 3 commits into base: branch-2.10

Conversation


@arjun4084346 arjun4084346 commented Apr 14, 2022

Contributed by Mukund Thakur.

(cherry picked from commit 8031c66)

Description of PR

It is an almost-clean cherry-pick of commit 8031c66.

How was this patch tested?

Ran `mvn test -pl hadoop-tools/hadoop-azure`
No new unit tests fail.

Ran all ABFS integration tests using `mvn -T 1C -Dparallel-tests=abfs clean verify` with my storage account arjundev.dfs.core.windows.net
No tests failed. There are 3 errors; these are negative test cases where an error is expected.

[INFO] Results:
[INFO] 
[ERROR] Errors: 
[ERROR]   ITestAzureBlobFileSystemCreate.testFilterFSWriteAfterClose:182 » IO java.io.Fi...
[ERROR]   ITestAzureBlobFileSystemE2E.testFlushWithFileNotFoundException:224 » IO java.i...
[ERROR]   ITestAzureBlobFileSystemE2E.testWriteWithFileNotFoundException:204 » IO java.i...
[INFO] 
[ERROR] Tests run: 413, Failures: 0, Errors: 3, Skipped: 253

The same three errors appear in the current branch-2.10 code as well.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@raymondlam12 left a comment


+1 in general; small code formatting clean up, otherwise looks good

@mukund-thakur
Contributor

Hey! Could you please add more detail on why this is being backported to branch-2.10? Thanks

@arjun4084346
Author

Hi @mukund-thakur, this is my first pull request in Apache Hadoop, and thank you for the review. I am trying to backport the fix for https://issues.apache.org/jira/browse/HADOOP-17215 (#2246). In order to cleanly cherry-pick that commit from branch-3.3 to branch-2.10, I need to cherry-pick several other commits first. The commit being picked in this PR is one of them.
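Backporting this way means replaying the prerequisite commits in order before the target fix. A minimal sketch of the mechanics in a throwaway repository (branch names, commit messages, and file contents below are illustrative, not the actual Hadoop commits):

```shell
# Demo of cherry-picking a chain of commits onto an older branch,
# in a scratch repo. Nothing here refers to real Hadoop history.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config user.email dev@example.com
git config user.name dev
echo base > file.txt
git add file.txt
git commit -qm "base"
git branch old-release                     # stand-in for branch-2.10
echo change1 >> file.txt
git commit -qam "prerequisite refactor"    # stand-in for HADOOP-16965
c1=$(git rev-parse HEAD)
echo change2 >> file.txt
git commit -qam "target fix"               # stand-in for HADOOP-17215
c2=$(git rev-parse HEAD)
git checkout -q old-release
# Replay the prerequisite first, then the target fix, in original order.
git cherry-pick "$c1" "$c2"
cat file.txt
```

The point of the ordering: picking `$c2` alone onto `old-release` would hit a conflict because its context (the prerequisite change) is missing, which is why the earlier commits in the chain have to be picked first.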

@steveloughran
Contributor

@arjun4084346 i don't see us merging this into asf branch-2.10.x; branch 2 is feature complete and only gets security fixes. trying to backport that far is a losing battle.

private LI/MSFT fork, fine, but not the apache one

@steveloughran
Contributor

also, process police point, i see you checked the "Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?"

until you explicitly state the endpoint you ran the integration tests against, this PR doesn't actually exist
https://hadoop.apache.org/docs/stable/hadoop-azure/testing_azure.html#Policy_for_submitting_patches_which_affect_the_hadoop-azure_module.
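For context, "declaring the endpoint" means stating which Azure storage account the hadoop-azure integration suites ran against; the credentials live in an uncommitted test configuration file. A rough sketch of that file (path and property names per my reading of the hadoop-azure testing docs linked above; the account name is the one mentioned later in this thread and the key is a placeholder):

```xml
<!-- hadoop-tools/hadoop-azure/src/test/resources/azure-auth-keys.xml
     (git-ignored; never commit real keys) -->
<configuration>
  <property>
    <name>fs.azure.abfs.account.name</name>
    <value>arjundev.dfs.core.windows.net</value>
  </property>
  <property>
    <name>fs.azure.account.key.arjundev.dfs.core.windows.net</name>
    <value>ACCOUNT_KEY_PLACEHOLDER</value>
  </property>
</configuration>
```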

@virajith
Contributor

Hi @steveloughran - 2.10 is still listed as an active release line (https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Active+Release+Lines). Has there been a community vote on just limiting it to security fixes? What's the process for declaring the branch to have security fixes only?

@arjun4084346
Author

also, process police point, i see you checked the "Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?"

until you explicitly state the endpoint you ran the integration tests against, this PR doesn't actually exist https://hadoop.apache.org/docs/stable/hadoop-azure/testing_azure.html#Policy_for_submitting_patches_which_affect_the_hadoop-azure_module.

Removed that 'check' on 'object storage'. It was not applicable to this PR.

@steveloughran
Contributor

@virajith ok, it's almost completely dead. but it still has to compile against java7, uses lots of old dependencies, including the older mockito and logging apis. backporting is a major piece of work

@steveloughran
Contributor

@arjun4084346

Removed that 'check' on 'object storage'. It was not applicable to this PR.

it absolutely is applicable to this.

please declare which azure storage endpoint you ran the entire maven integration suites of the hadoop-azure module against.

no tests, no review.

@arjun4084346
Author

Hi @steveloughran, understood what you meant. I ran the integration tests according to the link you provided. There were 0 failures; the 3 errors were, I believe, expected.

@arjun4084346
Author

@virajith can you please review/merge

@virajith
Contributor

@virajith ok, it's almost completely dead. but it still has to compile against java7, uses lots of old dependencies, including the older mockito and logging apis. backporting is a major piece of work

I agree @steveloughran. However, at LinkedIn, we continue to run 2.10 and are looking to continue maintaining branch-2.10 given we have to backport these to our internal 2.10 branch anyway.

@virajith
Contributor


Can you specify the region where the storage account you used to run the tests is located? Also, I see that 253 tests were skipped. Can you call out why these were skipped?

@virajith
Contributor

@steveloughran / @mukund-thakur - given the above reason on why this needs to be backported to branch-2.10, let me know if you are ok with getting this into 2.10.

@arjun4084346
Author

Storage account's Primary location: East US, Secondary location: West US

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 11m 30s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ branch-2.10 Compile Tests _
+1 💚 mvninstall 18m 2s branch-2.10 passed
+1 💚 compile 0m 40s branch-2.10 passed with JDK Azul Systems, Inc.-1.7.0_262-b10
+1 💚 compile 0m 34s branch-2.10 passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07
+1 💚 checkstyle 0m 31s branch-2.10 passed
+1 💚 mvnsite 0m 43s branch-2.10 passed
+1 💚 javadoc 0m 40s branch-2.10 passed with JDK Azul Systems, Inc.-1.7.0_262-b10
+1 💚 javadoc 0m 28s branch-2.10 passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07
+1 💚 spotbugs 1m 26s branch-2.10 passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 32s the patch passed with JDK Azul Systems, Inc.-1.7.0_262-b10
+1 💚 javac 0m 32s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 24s the patch passed with JDK Azul Systems, Inc.-1.7.0_262-b10
+1 💚 javadoc 0m 19s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07
+1 💚 spotbugs 1m 2s the patch passed
_ Other Tests _
-1 ❌ unit 1m 17s /patch-unit-hadoop-tools_hadoop-azure.txt hadoop-azure in the patch failed.
+1 💚 asflicense 0m 29s The patch does not generate ASF License warnings.
43m 48s
Reason Tests
Failed junit tests hadoop.fs.azure.TestBlobMetadata
hadoop.fs.azure.TestNativeAzureFileSystemConcurrency
hadoop.fs.azure.TestNativeAzureFileSystemMocked
hadoop.fs.azure.TestWasbFsck
hadoop.fs.azure.TestNativeAzureFileSystemContractMocked
hadoop.fs.azure.TestNativeAzureFileSystemFileNameCheck
hadoop.fs.azure.TestNativeAzureFileSystemOperationsMocked
hadoop.fs.azure.TestOutOfBandAzureBlobOperations
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4171/3/artifact/out/Dockerfile
GITHUB PR #4171
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 47af85695580 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision branch-2.10 / bf6bdc1
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07
Multi-JDK versions /usr/lib/jvm/zulu-7-amd64:Azul Systems, Inc.-1.7.0_262-b10 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4171/3/testReport/
Max. process+thread count 263 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4171/3/console
versions git=2.17.1 maven=3.6.0 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@arjun4084346
Author

I ran the integration tests on branch-2.10 and that also has skipped tests. Test results are the same for both branch-2.10 and my PR in terms of tests run, failures, errors, and skipped tests.

[INFO] --- maven-site-plugin:3.5:attach-descriptor (attach-descriptor) @ hadoop-azure ---
[ERROR] Errors: 
[ERROR]   ITestAzureBlobFileSystemCreate.testFilterFSWriteAfterClose:182 » IO java.io.Fi...
[ERROR]   ITestAzureBlobFileSystemE2E.testFlushWithFileNotFoundException:224 » IO java.i...
[ERROR]   ITestAzureBlobFileSystemE2E.testWriteWithFileNotFoundException:204 » IO java.i...
[INFO] 
[ERROR] Tests run: 413, Failures: 0, Errors: 3, Skipped: 253

[INFO] --- maven-failsafe-plugin:2.21.0:integration-test (integration-test-abfs-parallel-classes) @ hadoop-azure ---
[WARNING] Tests run: 151, Failures: 0, Errors: 0, Skipped: 24

It's the same, and the behavior doesn't change.

@virajith
Contributor

Thanks for checking the skipped tests and confirming @arjun4084346! @mukund-thakur / @steveloughran unless you have any objections, I'd like to commit this.

@mukund-thakur
Contributor

Yetus is still failing with unit tests. Please fix those: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4171/3/artifact/out/patch-unit-hadoop-tools_hadoop-azure.txt

@arjun4084346
Author

Hi @mukund-thakur, those unit tests also fail in branch-2.10; this PR does not introduce any new failing tests. For example, PR #4151 is merged into branch-2.10 and it also has some failing unit tests.

@steveloughran
Contributor

do you have a branch in your personal repo where you have cherry-picked all the changes you need, showing that things are good test-wise at the end of the chain?

we don't need to do the patch-by-patch test and review if we are confident the final sequence is good

@mukund-thakur
Contributor

Yetus is still failing with unit tests. Please fix those: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4171/3/artifact/out/patch-unit-hadoop-tools_hadoop-azure.txt

Hi @mukund-thakur, those unit tests also fail in branch-2.10; this PR does not introduce any new failing tests. For example, PR #4151 is merged into branch-2.10 and it also has some failing unit tests.

Okay, if that's the case I am okay with the change and no longer have any concerns with the current PR.

@steveloughran
Contributor

going with mukund then,
+1

Now, as I asked before: do you have a branch in your own GitHub which has the chain of all the cherry-picks needed, so we can see that the full chain is good?

Then you can submit a PR with that chain and we can review/verify the final output. The individual commits would still need merging one by one, but reviewing is a lot easier. After all, we shouldn't be suggesting any changes here that are not in trunk.

@arjun4084346
Author

arjun4084346 commented Apr 30, 2022

going with mukund then, +1

Now, as I asked before: do you have a branch in your own GitHub which has the chain of all the cherry-picks needed, so we can see that the full chain is good?

Then you can submit a PR with that chain and we can review/verify the final output. The individual commits would still need merging one by one, but reviewing is a lot easier. After all, we shouldn't be suggesting any changes here that are not in trunk.

Yes, these are the next PRs I plan to send: https://github.com/arjun4084346/hadoop/pulls
They all look fine.

cc @virajith @abhishekdas99

@apache deleted a comment from hadoop-yetus May 3, 2022
@apache deleted a comment from hadoop-yetus May 3, 2022
@steveloughran
Contributor

can you submit a single PR with that chain so we can review/verify the final output? The individual commits would still need merging one by one, but reviewing is a lot easier, especially as some of the PRs in the list you just pointed to are failing. If we can be confident that the final change is good, we don't need to worry about test failures along the way.

@arjun4084346
Author

can you submit a single PR with that chain so we can review/verify the final output? The individual commits would still need merging one by one, but reviewing is a lot easier, especially as some of the PRs in the list you just pointed to are failing. If we can be confident that the final change is good, we don't need to worry about test failures along the way.

Sure, please find in #4261 a single PR for the 7 commits. @steveloughran @mukund-thakur @virajith @abhishekdas99


6 participants