[BUG] filer.sync cannot sync emptying chunks correctly if file modified by different clusters #3328

Closed
creeew opened this issue Jul 18, 2022 · 3 comments

Comments


creeew commented Jul 18, 2022

Describe the bug
ClusterA and ClusterB use filer.sync to sync files to each other.
In ClusterA's filer mount directory:
touch hello.txt
echo "hello from A" >> hello.txt

In ClusterB's filer mount directory:
echo "hello from B" >> hello.txt

In ClusterA's filer mount directory:
echo "hello from A again" >> hello.txt

In ClusterB's filer mount directory:
echo "hello from B again" > hello.txt

Result:
In ClusterB, hello.txt contains "hello from B again",
but in ClusterA, hello.txt contains
"
hello from B again
hello from A again
"

System Setup

  • 1 Master Server, 1 Volume Server, 1 Filer Server and using mount
  • Using the default leveldb2 filer store

Expected behavior
The last command, echo "hello from B again" > hello.txt, uses > and therefore overwrites the file content.
We expect hello.txt to contain only "hello from B again" in both ClusterA and ClusterB.

Additional context
The command echo "xx" > file produces two event notifications:

  1. oldEntry is the current entry and newEntry's chunks are empty (the file content is emptied)
  2. oldEntry's chunks are empty and newEntry contains the chunk with the new file content
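
The two notifications can be pictured with a small, self-contained Go sketch. The chunk and event types and the overwriteEvents helper are hypothetical stand-ins introduced only for illustration; they are not the SeaweedFS types.

```go
package main

import "fmt"

// chunk is a simplified, hypothetical stand-in for a file chunk.
type chunk struct {
	FileID       string
	SourceFileID string
}

// event is a hypothetical stand-in for one metadata event notification.
type event struct {
	OldChunks []chunk
	NewChunks []chunk
}

// overwriteEvents models the two notifications described above:
// first the existing chunks are dropped, then the new chunk is added.
func overwriteEvents(current []chunk, written chunk) []event {
	return []event{
		{OldChunks: current, NewChunks: nil},
		{OldChunks: nil, NewChunks: []chunk{written}},
	}
}

func main() {
	current := []chunk{
		{FileID: "1", SourceFileID: "11"},
		{FileID: "2"},
		{FileID: "3", SourceFileID: "33"},
	}
	for i, e := range overwriteEvents(current, chunk{FileID: "4"}) {
		fmt.Printf("event %d: %d old chunks, %d new chunks\n", i+1, len(e.OldChunks), len(e.NewChunks))
	}
}
```

The first event is the one that matters for this report: it carries only the chunk list being removed, so the receiving cluster has to map those chunks onto its own.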

The bug is caused by the first event, the one that empties the file content. ClusterB's event that empties all chunks is delivered to ClusterA, but ClusterA cannot empty all of its chunks if they include fids that it uploaded itself.

ClusterA receives the emptying metadata event from ClusterB:
oldEntry has chunks [fid:1 sid:11, fid:2, fid:3 sid:33] and newEntry has no chunks (sid is the chunk's source file id).
The old and new chunks are compared to get the deleted chunks:
https://github.com/chrislusf/seaweedfs/blob/56ec89625a5d4a66e3b1c810544afd93f449842e/weed/replication/sink/filersink/filer_sink.go#L194
deletedChunks are [fid:1 sid:11, fid:2, fid:3 sid:33]
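
As a rough illustration of this comparison step, here is a minimal Go sketch of a chunk difference keyed by file id. The chunk type and the minusChunks function are simplified assumptions for this example, not the code at the link above.

```go
package main

import "fmt"

// chunk is a simplified, hypothetical stand-in for a file chunk.
type chunk struct {
	FileID       string
	SourceFileID string
}

// minusChunks returns the chunks of as whose FileID does not appear in bs.
func minusChunks(as, bs []chunk) []chunk {
	seen := make(map[string]bool, len(bs))
	for _, b := range bs {
		seen[b.FileID] = true
	}
	var delta []chunk
	for _, a := range as {
		if !seen[a.FileID] {
			delta = append(delta, a)
		}
	}
	return delta
}

func main() {
	oldChunks := []chunk{
		{FileID: "1", SourceFileID: "11"},
		{FileID: "2"},
		{FileID: "3", SourceFileID: "33"},
	}
	var newChunks []chunk // the emptying event carries no chunks

	deleted := minusChunks(oldChunks, newChunks)
	fmt.Println("number of deleted chunks:", len(deleted))
}
```

Because newEntry is empty, every chunk of oldEntry ends up in the deleted set, which matches the deletedChunks listed above.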

In ClusterA, existingChunks are [fid:11, fid:22 sid:2, fid:33].
The deletedChunks must be removed from existingChunks, which is done by this code:
https://github.com/chrislusf/seaweedfs/blob/56ec89625a5d4a66e3b1c810544afd93f449842e/weed/replication/sink/filersink/filer_sink.go#L202
This code matches deleted chunks by source file id only, so existingChunks are [fid:11, fid:33] after the deletion.
We expected all chunks to be emptied, but chunks are left over in ClusterA.
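
The leftover can be reproduced with a simplified, self-contained Go sketch. The chunk type and both functions below are hypothetical: minusBySourceFileID mimics one-directional matching as described in this report, and minusEitherID shows one possible way to match in both directions. It is not necessarily how the project resolved the issue.

```go
package main

import "fmt"

// chunk is a simplified, hypothetical stand-in for a file chunk.
type chunk struct {
	FileID       string
	SourceFileID string
}

// minusBySourceFileID drops a chunk of as only when its SourceFileID
// equals the FileID of a chunk in bs (one-directional matching).
func minusBySourceFileID(as, bs []chunk) []chunk {
	ids := make(map[string]bool, len(bs))
	for _, b := range bs {
		ids[b.FileID] = true
	}
	var delta []chunk
	for _, a := range as {
		if !ids[a.SourceFileID] {
			delta = append(delta, a)
		}
	}
	return delta
}

// minusEitherID is an assumed alternative: it also records the
// SourceFileID of each chunk in bs and compares it with the local FileID,
// so chunks uploaded by the local cluster are matched as well.
func minusEitherID(as, bs []chunk) []chunk {
	ids := make(map[string]bool, 2*len(bs))
	for _, b := range bs {
		ids[b.FileID] = true
		if b.SourceFileID != "" {
			ids[b.SourceFileID] = true
		}
	}
	var delta []chunk
	for _, a := range as {
		if !ids[a.FileID] && !ids[a.SourceFileID] {
			delta = append(delta, a)
		}
	}
	return delta
}

func main() {
	existing := []chunk{{FileID: "11"}, {FileID: "22", SourceFileID: "2"}, {FileID: "33"}}
	deleted := []chunk{{FileID: "1", SourceFileID: "11"}, {FileID: "2"}, {FileID: "3", SourceFileID: "33"}}

	fmt.Println("one-directional leftover count:", len(minusBySourceFileID(existing, deleted)))
	fmt.Println("two-directional leftover count:", len(minusEitherID(existing, deleted)))
}
```

With the chunk lists from this report, the one-directional version leaves fid:11 and fid:33 behind, while the two-directional version leaves nothing.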

@creeew creeew changed the title [BUG] filer.sync cannot sync content correctly when sync emptying chunks [BUG] filer.sync cannot sync emptying chunks correctly if file modified by different clusters Jul 18, 2022
@chrislusf
Collaborator

I am confused by [fid:1 sid:11, fid2, fid3 sid33]. Please create a unit test case to reproduce this clearly.

@creeew
Author

creeew commented Jul 23, 2022

Here's the test case. filer_sink compares the old and new entries to get the deleted chunks, but clusterA cannot delete all of its chunks when clusterB empties its chunks.

filechunks_test.go
func TestDoMinusChunks(t *testing.T) {
	// Needs the "log", "testing", testify "assert", and filer_pb imports.

	// clusterA and clusterB use filer.sync to sync the file hello.txt.
	// clusterA appends a line, then clusterB appends a line,
	// then clusterA appends another line.
	chunksInA := []*filer_pb.FileChunk{
		{Offset: 0, Size: 3, FileId: "11", Mtime: 100},
		{Offset: 3, Size: 3, FileId: "22", SourceFileId: "2", Mtime: 200},
		{Offset: 6, Size: 3, FileId: "33", Mtime: 300},
	}
	chunksInB := []*filer_pb.FileChunk{
		{Offset: 0, Size: 3, FileId: "1", SourceFileId: "11", Mtime: 100},
		{Offset: 3, Size: 3, FileId: "2", Mtime: 200},
		{Offset: 6, Size: 3, FileId: "3", SourceFileId: "33", Mtime: 300},
	}

	// clusterB runs "echo 'content' > hello.txt" to overwrite the file.
	// clusterA receives two event notifications: it must first empty the
	// whole file content and then add the new content.
	// In the first event, oldEntry holds chunksInB and newEntry has no chunks.
	firstOldEntry := chunksInB
	firstNewEntry := []*filer_pb.FileChunk{}

	// clusterA receives the first event and has to empty all chunks.
	// Following filer_sink.go line 194, compute the deleted chunks.
	firstDeletedChunks := DoMinusChunks(firstOldEntry, firstNewEntry)
	log.Println("first deleted chunks:", firstDeletedChunks)
	// firstNewEntry := DoMinusChunks(firstNewEntry, firstOldEntry)

	// clusterA needs to delete every chunk in firstDeletedChunks.
	emptiedChunksInA := DoMinusChunksBySourceFileId(chunksInA, firstDeletedChunks)
	// chunksInA should become empty, but only the chunk synced from clusterB is deleted.
	log.Println("clusterA synced empty chunks event result:", emptiedChunksInA)
	// clusterB emptied its chunks, so clusterA must end up with no chunks as well.
	// Emptyf accepts both a nil and a zero-length slice.
	assert.Emptyf(t, emptiedChunksInA, "empty")
}

@chrislusf
Collaborator

Thanks for the nice unit test! I added a fix and also added the test to the repo.
