Handle old sorted map files in Upgrade #2185

Merged
merged 3 commits into from Jul 30, 2021

Contributor

@milleruntime milleruntime commented Jun 30, 2021

* Add upgradeFiles method for upgrading files to Upgrader
* Refactor status handling in UpgradeCoordinator to allow upgradeFiles
to run concurrently and wait for metadata upgrade to complete

* Implement upgradeFiles method in Upgrader9to10

Contributor

@Manno15 Manno15 left a comment

Everything looks good. Noticed one small typo but that is about it.

@ctubbsii ctubbsii added this to In progress in 2.1.0 via automation Jul 7, 2021
Contributor

@Manno15 Manno15 left a comment

Everything looks good from my perspective. Have you done any testing to verify that Accumulo will automatically re-sort the WALs once these are deleted? Any way to create a test for that?

@milleruntime
Contributor Author

> Everything looks good from my perspective. Have you done any testing to verify that Accumulo will automatically re-sort the WALs once these are deleted? Any way to create a test for that?

Yeah, I did some manual testing with Uno. It may be possible to create an IT with the old files saved off, but I imagine it would be quite involved. I am not sure it's worth the work for a one-time thing that will only ever happen between versions.

* Add upgradeFiles method for upgrading files to Upgrader
* Refactor status handling in UpgradeCoordinator to allow upgradeFiles
to run concurrently and wait for metadata upgrade to complete
* Implement upgradeFiles method in Upgrader9to10
* Add dropSortedMapWALFiles for resolving sorted map files that may
still be around during upgrade. Follow up for apache#2117
* Closes apache#2179
@milleruntime milleruntime changed the title Add upgradeFiles to Upgrade code Handle old sorted map files in Upgrade Jul 30, 2021
@milleruntime milleruntime merged commit 09e3d77 into apache:main Jul 30, 2021
2.1.0 automation moved this from In progress to Done Jul 30, 2021
@milleruntime milleruntime deleted the sorted-wal-fix branch July 30, 2021 16:43
@milleruntime
Contributor Author

@ctubbsii I reverted the changes I made adding a new upgradeFiles method to the Upgrade interface, and now this change is much simpler. I believe I got rid of the changes you had concerns about. This PR is now just the new code I added to fix old sorted logs.

@milleruntime
Contributor Author

So I did some more testing, and while it successfully deleted the old data, I am not 100% sure that it won't delete WALs that still have referenced data in them. This is a little tricky to test manually, as you need to make sure the table is recovering but also have unloaded tablets (before the data is flushed to disk). I am going to open a ticket to test this using Continuous Ingest.

   * Remove old temporary map files to prevent problems during recovery.
   */
  static void dropSortedMapWALFiles(VolumeManager vm) {
    Path recoveryDir = new Path("/accumulo/recovery");
Contributor

Seems like this should loop over the set of configured volumes instead of this.

Contributor Author

Good catch. I will create a follow-on ticket with this and any other suggestions you have.

Contributor Author

Do you think using the volumes configured in instance.volumes will be enough?

Contributor

> Do you think using the volumes configured in instance.volumes will be enough?

Yeah, could look through all of those. Could narrow it by calling:

Set<String> choosable(org.apache.accumulo.core.spi.fs.VolumeChooserEnvironment env,

But I don't think it hurts to just look through all volumes. Maybe could use

to get the list of volumes

Contributor

Would probably be best to avoid choosable() as that will give the volumes configured for new WALs. The config could change and an old WAL could be on a volume that choosable() no longer returns. So probably best to inspect all volumes looking for old sorted wals to nuke.
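
A minimal sketch of the "inspect all volumes" approach suggested above, assuming local directories stand in for the configured volumes and using `java.nio.file` rather than Accumulo's `VolumeManager`. The class name, the helper names, the `recovery` subdirectory, the file names, and the rule that an old sorted WAL is recognized by a nested directory are all assumptions made for illustration; this is not the code merged in this PR.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: local directories stand in for Accumulo volumes.
class SortedWalCleanupSketch {

  // Visits <volume>/recovery on EVERY volume (not a chooser-selected subset)
  // and deletes each sorted-WAL directory that still uses the old layout.
  static List<Path> dropOldSortedWals(List<Path> volumes) {
    List<Path> dropped = new ArrayList<>();
    for (Path volume : volumes) {
      Path recoveryDir = volume.resolve("recovery");
      if (!Files.isDirectory(recoveryDir)) {
        continue; // no recovery directory on this volume
      }
      for (Path walDir : list(recoveryDir)) {
        if (Files.isDirectory(walDir) && isOldLayout(walDir)) {
          deleteRecursively(walDir);
          dropped.add(walDir);
        }
      }
    }
    return dropped;
  }

  // ASSUMPTION: an old sorted WAL contains a nested directory (a map file is
  // stored as a directory). The real detection rule may differ.
  static boolean isOldLayout(Path walDir) {
    return list(walDir).stream().anyMatch(Files::isDirectory);
  }

  static List<Path> list(Path dir) {
    try (Stream<Path> s = Files.list(dir)) {
      return s.sorted().collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static void deleteRecursively(Path root) {
    try (Stream<Path> s = Files.walk(root)) {
      // Reverse order so children are deleted before their parents.
      for (Path p : s.sorted(Comparator.reverseOrder()).collect(Collectors.toList())) {
        Files.delete(p);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Builds two throwaway volumes: v1 holds an old-style WAL, v2 a new-style one.
  static String demo() {
    try {
      Path base = Files.createTempDirectory("volumes");
      Path v1 = base.resolve("v1");
      Path v2 = base.resolve("v2");
      Files.createDirectories(v1.resolve("recovery/walA/part-r-00000")); // old layout
      Files.createDirectories(v2.resolve("recovery/walB"));
      Files.createFile(v2.resolve("recovery/walB/part-r-00000.rf")); // new layout
      List<Path> dropped = dropOldSortedWals(List.of(v1, v2));
      return dropped.size() + " dropped, walB kept: "
          + Files.exists(v2.resolve("recovery/walB"));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Calling `SortedWalCleanupSketch.demo()` removes only the old-style directory on the first volume and leaves the new-style one alone. Looping over every volume, rather than the subset a volume chooser returns, is what guards against a config change hiding an old WAL.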

Member

> Would probably be best to avoid choosable() as that will give the volumes configured for new WALs. The config could change and an old WAL could be on a volume that choosable() no longer returns. So probably best to inspect all volumes looking for old sorted wals to nuke.

Definitely avoid this. Consider RandomVolumeChooser 😺

Contributor Author

I thought the VolumeChooser was for selecting volumes for writes? I want to loop through every volume configured to find the old sorted WALs to remove. Will the VolumeChooser allow for this?

@@ -142,6 +148,8 @@ public void upgradeMetadata(ServerContext ctx) {
    upgradeRelativePaths(ctx, Ample.DataLevel.USER);
    upgradeDirColumns(ctx, Ample.DataLevel.USER);
    upgradeFileDeletes(ctx, Ample.DataLevel.USER);
    // special case where old files need to be deleted
Contributor

Did you consider calling this new method in upgradeZookeeper instead? I think that is called before the root tablet is loaded, which would allow deleting any old sorted logs the root tablet may reference.

Contributor Author

No, I didn't think any tables were loaded until after the Upgrader was finished. If that is the case, then moving it to upgradeZookeeper would be better.

Contributor

I think the upgrade methods are run as follows.

  • upgradeZookeeper() is run before the root tablet is loaded.
  • upgradeRoot() is run after the root tablet is loaded and before the metadata table is loaded.
  • upgradeMetadata() is run after the metadata table is loaded and before loading user tablets.
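
The ordering described above can be sketched as follows. This is a hypothetical model, not the real `Upgrader` interface or `UpgradeCoordinator`: the `List<String>` log parameter and the `run` and `demo` helpers are invented here so that the sequence can be observed. It illustrates why placing `dropSortedMapWALFiles` in `upgradeZookeeper()` would let the cleanup finish before the root tablet is loaded, assuming the ordering in the list holds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the upgrade ordering; not the real Accumulo classes.
class UpgradeOrderSketch {

  interface Upgrader {
    void upgradeZookeeper(List<String> log); // root tablet not loaded yet
    void upgradeRoot(List<String> log);      // root loaded, metadata table not yet
    void upgradeMetadata(List<String> log);  // metadata loaded, user tablets not yet
  }

  // Interleaves the three upgrade steps with the tablet-loading steps.
  static List<String> run(Upgrader upgrader) {
    List<String> log = new ArrayList<>();
    upgrader.upgradeZookeeper(log);
    log.add("load root tablet");
    upgrader.upgradeRoot(log);
    log.add("load metadata table");
    upgrader.upgradeMetadata(log);
    log.add("load user tablets");
    return log;
  }

  static List<String> demo() {
    return run(new Upgrader() {
      @Override
      public void upgradeZookeeper(List<String> log) {
        log.add("upgradeZookeeper");
        // Runs before any tablet load could read an old sorted WAL.
        log.add("dropSortedMapWALFiles");
      }

      @Override
      public void upgradeRoot(List<String> log) {
        log.add("upgradeRoot");
      }

      @Override
      public void upgradeMetadata(List<String> log) {
        log.add("upgradeMetadata");
      }
    });
  }
}
```

In this model the cleanup entry always precedes `load root tablet`, which is the property the review comment was asking for.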

Contributor Author

Ah OK. I will move it to upgradeZookeeper().

EdColeman pushed a commit to EdColeman/accumulo that referenced this pull request Aug 4, 2021
Capture discussion on PR apache#2185 to update javadoc comments.
Doc update only - contains no code changes.
EdColeman added a commit that referenced this pull request Aug 6, 2021
…2225)

* Update interface comments to include expected system state.

Capture discussion on PR #2185 to update javadoc comments.
Doc update only - contains no code changes.

* update phrasing per PR comments

Co-authored-by: Ed Coleman etcoleman <edcoleman@apache.org>
milleruntime added a commit to milleruntime/accumulo that referenced this pull request Aug 12, 2021
* Follow on work for apache#2185
* Improve upgrade code in Upgrader9to10.dropSortedMapWALFiles()
* Add comments to Upgrader interface
milleruntime added a commit to milleruntime/accumulo that referenced this pull request Aug 12, 2021
* Follow on work for apache#2185
* Improve upgrade code in Upgrader9to10.dropSortedMapWALFiles()
milleruntime added a commit to milleruntime/accumulo that referenced this pull request Aug 13, 2021
* Follow on work for apache#2185
* Improve upgrade code in Upgrader9to10.dropSortedMapWALFiles()
cbevard1 pushed a commit to cbevard1/accumulo that referenced this pull request Aug 13, 2021
…pache#2225)

* Update interface comments to include expected system state.

Capture discussion on PR apache#2185 to update javadoc comments.
Doc update only - contains no code changes.

* update phrasing per PR comments

Co-authored-by: Ed Coleman etcoleman <edcoleman@apache.org>
milleruntime added a commit to milleruntime/accumulo that referenced this pull request Aug 18, 2021
* Follow on work for apache#2185
* Improve upgrade code in Upgrader9to10.dropSortedMapWALFiles()
milleruntime added a commit that referenced this pull request Aug 23, 2021
* Follow on work for #2185
* Improve upgrade code in Upgrader9to10.dropSortedMapWALFiles()