Skip to content

[HUDI-5128] Fix getFileSystem way in FileSystemBackedTableMetadata, DatePartitionPathSelector and BootstrapUtils not consistent issue#7105

Closed
TJX2014 wants to merge 2 commits intoapache:masterfrom
TJX2014:HUDI-5128
Closed

[HUDI-5128] Fix getFileSystem way in FileSystemBackedTableMetadata, DatePartitionPathSelector and BootstrapUtils not consistent issue#7105
TJX2014 wants to merge 2 commits intoapache:masterfrom
TJX2014:HUDI-5128

Conversation

@TJX2014
Copy link
Contributor

@TJX2014 TJX2014 commented Nov 1, 2022

Change Logs

Make getFileSystem way unify to org.apache.hudi.common.fs.FSUtils#getFs(org.apache.hadoop.fs.Path, org.apache.hadoop.conf.Configuration)

Impact

none

Risk level (write none, low medium or high below)

none

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@TJX2014
Copy link
Contributor Author

TJX2014 commented Nov 1, 2022

@danny0405 please help review this : )

Path path = new Path(directory);
FileSystem fileSystem = path.getFileSystem(new Configuration());
FileSystem fileSystem = FSUtils.getFs(path, fs.getConf());
RemoteIterator<LocatedFileStatus> itr = fileSystem.listFiles(path, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fs is already a FileSystem, why new an instance here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For context.flatMap is not thread safe, three is many thread in context.flatMap may operate same fs if not create new one in directory -> {} function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> may operate same fs

What's the problem here, the fs instance is cached BTW, so you would still fetch the same fs instance each time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we just remove path.getFileSystem(new Configuration());, and rename fs ouside to filesystem in lambda,right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it though ~

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danny0405 Hi, I have fixed it, please help review again ~

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What bug is fixed here then ? Or simplify a code refactoring.

…atePartitionPathSelector and BootstrapUtils not consistent issue
@nsivabalan nsivabalan added the priority:high Significant impact; potential bugs label Nov 7, 2022
@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@TJX2014 TJX2014 requested a review from danny0405 November 16, 2022 11:35
@nsivabalan nsivabalan added the release-0.12.2 Patches targetted for 0.12.2 label Dec 6, 2022
@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Feb 26, 2024
@yihua
Copy link
Contributor

yihua commented Sep 7, 2024

@danny0405 is this still a bug or we can close the PR?

@danny0405
Copy link
Contributor

@danny0405 is this still a bug or we can close the PR?

I kind of think it makes sense, we better to unify the instantiation of the fs objects.

@yihua
Copy link
Contributor

yihua commented Sep 11, 2024

Sounds good. @CTTY could you revamp this PR given you have worked on HoodieStorage related refactoring and improvement recently?

@CTTY
Copy link
Contributor

CTTY commented Sep 12, 2024

I've assigned the JIRA ticket to myself. FileSystemBackedTableMetadata, FSUtils, and BootStrapUtils in this PR have already been refactored. I'll post a new PR to replace FS usages in DatePartitionPathSelector and related classes later

@yihua
Copy link
Contributor

yihua commented Sep 12, 2024

Sounds good. Closing this stale PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

priority:high Significant impact; potential bugs release-0.12.2 Patches targetted for 0.12.2 size:S PR with lines of changes in (10, 100]

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

6 participants