Fix a bug that the file copied by TF from HDFS to local may be wrong,… #42860

Merged
merged 4 commits into from
Sep 3, 2020

Conversation

yuanbopeng
Contributor

This is a PR from TaiJi AI platform in Tencent.

  • The file copied by TF from HDFS to local may be wrong, when HDFS file is being overwritten #42597

@yuanbopeng
Contributor Author

@mihaimaruseac
Could you help review patch-3? I look forward to your comments, thanks :).

@gbaned self-assigned this Sep 1, 2020
@gbaned added the comp:core (issues related to core part of tensorflow) and prtype:bugfix (PR to fix a bug) labels Sep 1, 2020
@gbaned added this to Assigned Reviewer in PR Queue via automation Sep 1, 2020
Collaborator

@mihaimaruseac mihaimaruseac left a comment

Not the ideal patch, but at least this is localized to the HDFS implementation and does not remove existing tests.

It would be good if you could also add a test and make a similar change to the existing filesystem plugin for HDFS.

@google-ml-butler bot added the kokoro:force-run (Tests on submitted change) and ready to pull (PR ready for merge process) labels Sep 2, 2020
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Sep 2, 2020
@kokoro-team removed the kokoro:force-run (Tests on submitted change) label Sep 2, 2020
@tensorflow-copybara merged commit 2871fc6 into tensorflow:master Sep 3, 2020
PR Queue automation moved this from Approved by Reviewer to Merged Sep 3, 2020
@vnghia
Contributor

vnghia commented Sep 13, 2020

I will take care of the change and the test in the filesystem plugin for HDFS. However, I prefer reading the env in the initialization of the filesystem, for consistency between multiple calls to hdfsRead. What do you think, @yuanbopeng @mihaimaruseac?

@yuanbopeng
Contributor Author

yuanbopeng commented Sep 14, 2020

@vnvo2409
I will add a test case for ReadWhileOverwriting soon, and I will also fix the compilation problem caused by HadoopFileSystemTest calling the HDFS API without a TransactionToken. S3FileSystemTest has a similar problem.

If a modification is necessary, I prefer reading the env in the initialization of HDFSRandomAccessFile, which is more flexible and can support both the WriteWhileReading and ReadWhileOverwriting use cases.

@vnghia
Contributor

vnghia commented Sep 14, 2020

@yuanbopeng

@yuanbopeng
Contributor Author

yuanbopeng commented Sep 14, 2020

@vnvo2409

OK. To ensure that the WriteWhileReading test case can pass, I will delete the test cases that caused the compilation problem locally, and submit only the WriteWhileReading test case code.

  • So I will read the env inside NewRandomAccessFile for Hadoop.

Would it be better to read env in the constructor of HDFSRandomAccessFile?

@yuanbopeng yuanbopeng deleted the patch-3 branch September 14, 2020 04:28
@vnghia
Contributor

vnghia commented Sep 14, 2020

@yuanbopeng

Would it be better to read env in the constructor of HDFSRandomAccessFile?

Agree.
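To make the agreed approach concrete, here is a minimal sketch of reading an environment variable once in a constructor, so that every later read on the same file object sees one consistent setting. The class name, the variable name `EXAMPLE_HDFS_DISABLE_EOF_RETRY`, and the flag are all hypothetical stand-ins chosen for illustration; this is not the code from the patch or the real HDFSRandomAccessFile.

```cpp
#include <stdlib.h>  // getenv, setenv (POSIX)

#include <cassert>

// Hypothetical variable name, used only for this sketch.
constexpr char kDisableEofRetryEnv[] = "EXAMPLE_HDFS_DISABLE_EOF_RETRY";

// Stand-in for a random-access file class; not the TensorFlow class.
class RandomAccessFileSketch {
 public:
  // Read the environment once, when the file object is created.
  RandomAccessFileSketch() {
    const char* value = getenv(kDisableEofRetryEnv);
    disable_eof_retry_ = (value != nullptr && value[0] == '1');
  }

  // Every read on this object would consult the same cached setting.
  bool disable_eof_retry() const { return disable_eof_retry_; }

 private:
  bool disable_eof_retry_ = false;
};

// Shows that the flag is fixed per object but can differ between objects.
void DemoFlagIsFixedAtConstruction() {
  setenv(kDisableEofRetryEnv, "1", /*overwrite=*/1);
  RandomAccessFileSketch file;  // captures "1"
  setenv(kDisableEofRetryEnv, "0", /*overwrite=*/1);
  assert(file.disable_eof_retry());  // unchanged by the later setenv

  RandomAccessFileSketch later;  // captures "0"
  assert(!later.disable_eof_retry());
}
```

Reading the value per file object rather than per filesystem keeps each object's reads consistent with one another, while still letting two files opened at different times use different settings.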

OK. To ensure that the WriteWhileReading test case can pass, I will delete the test cases that caused the compilation problem locally, and submit only the WriteWhileReading test case code.

The cloud filesystem tests are tagged manual, so the CI skips them. You see the compilation problem because no one has run those tests for a long time: a new transactional token was added to the filesystem operations but not to the tests. I don't think it is necessary to add the test here, since you would have to add the transactional token (a nullptr, I think) to every function call to fix the compilation problem.
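As a rough illustration of the compilation problem described here, the sketch below shows a method that gained a trailing token parameter, and a caller updated to pass nullptr. `TokenSketch`, `FileSystemSketch`, `FileExists`, and the example path are hypothetical names for this sketch only, not the TensorFlow signatures.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the transactional token type.
struct TokenSketch {};

// Hypothetical filesystem whose method gained a trailing token parameter.
class FileSystemSketch {
 public:
  // Before the change, callers wrote FileExists(path).
  bool FileExists(const std::string& path, TokenSketch* token) const {
    (void)token;  // the token is unused in this sketch
    return !path.empty();
  }
};

// A test-style call updated for the new signature.
void CheckCallWithNullToken() {
  FileSystemSketch fs;
  // fs.FileExists("hdfs://example/file");  // old call: no longer compiles
  assert(fs.FileExists("hdfs://example/file", nullptr));
}
```

Because the parameter has no default value in this sketch, every stale call site has to be touched, which matches the point that the fix means editing each call in the tests.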

I think I will add your change and my own test to the HDFS plugin. I will ping you in that PR so you can review the test and make sure it is what you want. The implementations of the filesystem plugin and the current filesystem are very similar, so I think adding the test in one place is enough.

You could also work with the filesystem plugin, but keep in mind that if you would like to see the change immediately (in tf-nightly), you will have to work with the current filesystem.

Let's see if @mihaimaruseac agrees with us.

@mihaimaruseac
Collaborator

Sounds good to me. Thanks for driving this forward

@yuanbopeng
Contributor Author

yuanbopeng commented Sep 14, 2020

@vnvo2409
Added ReadWhileOverwriting test cases: #43220

In addition, the following fix the compilation problem:
HDFS: #43221
S3: #43222

Labels
cla: yes comp:core issues related to core part of tensorflow prtype:bugfix PR to fix a bug ready to pull PR ready for merge process size:XS CL Change Size: Extra Small
Projects
PR Queue
  
Merged

7 participants