-
Notifications
You must be signed in to change notification settings - Fork 459
Remove broken support for map files #3378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* Remove support for Hadoop map files for tablet data, as this hasn't worked since 1.5.0 anyway. Now, an error should be thrown, as the file type is no longer recognized. Replaced special handling for map file extension or directory with a logged warning, error, or allow an exception to be thrown, depending on the situation. * Rename 'mapFile' variables and related methods to the more generic 'dataFile' or equivalent, to reduce confusion, since we no longer use MyMapFile or Hadoop MapFile, and only use RFile for storing tablet data files. This was done once before when RFile was initially added to the code base, which is why many of our variables and types already used the term 'dataFile', but evidently, not all of it had been renamed, or some of that terminology leaked back in. Some of it was added as part of recent Compaction changes, but there may have been earlier times this language was added back in. This change here resets us to the generic language for tablet data files. * Rename and relocate tabletingest-specific Thrift struct MapFileInfo to DataFileInfo under the tabletingest.thrift namespace. This only affects the bulk load Manager FATE operation and tablet servers. End user clients did not use this API, so it will not affect them. * Rename loadMapFile variable to loadMappingFile to make it clear that it's the file that stores the mapping of files for bulk loads, rather than a file in the MapFile format. * Add new validator to `table.file.type` value to prevent it from being anything other than the currently supported 'rf' extension for RFiles, which is also the current default (and the only value that would have worked anyway). * Fix issues with DefaultCompactionPlanner reported as part of apache#3377, but those should be applied to 2.1 branch first, and this change rebased onto the updated main after that is merged forward. **TODO** remove this from the commit message once that's done
All unit tests and ITs passed. |
EdColeman
approved these changes
May 5, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
cshannon
approved these changes
May 6, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as well, I like the new name of dataFile
Manno15
approved these changes
May 6, 2023
Closed
ctubbsii
added a commit
to ctubbsii/accumulo
that referenced
this pull request
May 10, 2023
Backport some changes from PR apache#3378 for 3.0 into 2.1 to fix issue apache#3377 Specifically, backport the changes to DefaultCompactionPlanner (its unit test) to make private some of its internal implementation methods that were unintentionally left public. This also includes the name changes to those methods that were part of PR apache#3378, to reduce merge conflicts. This fixes apache#3377
ctubbsii
added a commit
to ctubbsii/accumulo
that referenced
this pull request
May 10, 2023
Drop MapFile support (apache#3359) by partially backporting PR apache#3378, and making minimally disruptive changes to avoid use of MapFiles for 1.10.4 and 2.1.1. As noted in apache#3378, MapFile support is already broken and has been for a long time. This change will cause an explicit and detectable failure, rather than a silent one, if a MapFile is attempted to be used.
ctubbsii
added a commit
that referenced
this pull request
May 12, 2023
Drop MapFile support (#3359) by partially backporting PR #3378, and making minimally disruptive changes to avoid use of MapFiles for 1.10.4 and 2.1.1. As noted in #3378, MapFile support is already broken and has been for a long time. This change will cause an explicit and detectable failure, rather than a silent one, if a MapFile is attempted to be used.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Remove support for Hadoop map files for tablet data, as this hasn't worked since 1.5.0 anyway. Now, an error should be thrown, as the file type is no longer recognized. Replaced special handling for map file extension or directory with a logged warning, error, or allow an exception to be thrown, depending on the situation.
Rename 'mapFile' variables and related methods to the more generic 'dataFile' or equivalent, to reduce confusion, since we no longer use MyMapFile or Hadoop MapFile, and only use RFile for storing tablet data files. This was done once before when RFile was initially added to the code base, which is why many of our variables and types already used the term 'dataFile', but evidently, not all of it had been renamed, or some of that terminology leaked back in. Some of it was added as part of recent Compaction changes, but there may have been earlier times this language was added back in. This change here resets us to the generic language for tablet data files.
Rename and relocate tabletingest-specific Thrift struct MapFileInfo to DataFileInfo under the tabletingest.thrift namespace. This only affects the bulk load Manager FATE operation and tablet servers. End user clients did not use this API, so it will not affect them.
Rename loadMapFile variable to loadMappingFile to make it clear that it's the file that stores the mapping of files for bulk loads, rather than a file in the MapFile format.
Add new validator to
table.file.type
value to prevent it from being anything other than the currently supported 'rf' extension for RFiles, which is also the current default (and the only value that would have worked anyway).Fix issues with DefaultCompactionPlanner reported as part of DefaultCompactionPlanner SPI class has several static methods/inner classes exposed as public API #3377, but those should be applied to 2.1 branch first, and this change rebased onto the updated main after that is merged forward. TODO remove this from the commit message once that's done