Read from branch with compaction data #7701

idanovo · 2024-04-25T14:22:42Z

No description provided.

github-actions · 2024-04-25T14:29:50Z

E2E Test Results - DynamoDB Local - Local Block Adapter

arielshaqed

Thanks! Really exciting to see how we can separate the two functions of "the" (currently single) branch metarange ID: right now it is used both as the parent for the next commit, and also as the base for differences in staging.

I would prefer to avoid the term "compaction" here, and instead name the new metarange ID on a branch by some other name - maybe BaseMetarangeID. It may make our lives easier if we ever have to add user-requested "partial commits". The name we pick right now matters, because it is stored in the KV, so it will be expensive to change in the future!

I am not sure why we need a "compaction manager". All it does is duplicate the code that we already have in Graveler, just for a different metarange. We even need the same if statements to select the right base metarange. Furthermore, AFAICT this compaction manager leads to these bugs:

An SSTable handle leaks whenever a compacted metarange exists on the branch. This leaks a file descriptor as well as memory.
The code opens an unused metarange whenever a compacted metarange exists on the branch. This is a loss of efficiency. IOPs (I/O operations) are sometimes an expensive resource on lakeFS installations.

Can we try to select the correct metarange in if statements before accessing the commit metarange, instead of having the manager? I believe that that code would be easier to understand, and provide fewer opportunities for bugs.
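
For illustration only, the "select the metarange in an if statement" idea could look roughly like the sketch below. The `Branch` struct and the `baseMetaRangeID` helper are hypothetical, simplified stand-ins and not the actual Graveler types; only the `CompactedBaseMetaRangeID` field name comes from this PR.

```go
package main

import "fmt"

// MetaRangeID and Branch are simplified stand-ins for the Graveler types.
type MetaRangeID string

type Branch struct {
	// CompactedBaseMetaRangeID is empty when the branch has no compacted data.
	CompactedBaseMetaRangeID MetaRangeID
}

// baseMetaRangeID is a hypothetical helper: it picks the metarange that
// staging is applied on top of, so only that one metarange is ever opened.
func baseMetaRangeID(b Branch, commitMetaRangeID MetaRangeID) MetaRangeID {
	if b.CompactedBaseMetaRangeID != "" {
		return b.CompactedBaseMetaRangeID
	}
	return commitMetaRangeID
}

func main() {
	fmt.Println(baseMetaRangeID(Branch{}, "commit-mr"))
	fmt.Println(baseMetaRangeID(Branch{CompactedBaseMetaRangeID: "compacted-mr"}, "commit-mr"))
}
```

With the selection done up front, the read path that follows would not need to know whether the base came from a commit or from compaction.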

pkg/graveler/graveler.go

itaiad200 · 2024-04-28T06:16:18Z

Agree with @arielshaqed's comments. Is this PR's goal to cover all branch reading scenarios? What about functions that are not covered in this PR, like listStagingArea, isStagingEmpty, etc.?

guy-har

Agree with Ariel; I would expect compaction to be handled like any other metarange. Furthermore, compaction makes the term "staging" a tricky one: some would expect listStagingArea to return only data existing under the staging token, while others may expect it to return all staging data (IMO all staging data is the trivial one). Also, we still have many places that do not consider the compacted data, for example graveler.Get and graveler.Commit, which should return "dirty" if staging is empty, etc.

pkg/graveler/graveler.go

itaiad200 · 2024-04-29T12:29:14Z

pkg/graveler/graveler_test.go

Thanks for the tests! With Commit, StagingToken and SealedTokens already present, and with the addition of the BaseMetaRange concept, the number of test scenarios is huge. I think we missed some important ones, like testing with tombstones, testing with all 4 (or 5 for multiple sealed tokens) keys, querying for an empty staging area, etc. The reason we need to be pedantic about changes here is that bugs in the versioning engine can be catastrophic. We intend to introduce the compaction operation itself later, and that's even more worrying! Finding bugs is normally easier during the PR itself, and it gets harder as we build on top of it.

You can look at the graveler_v2_test.go file too. I find it easier to write the tests with proper generated mocks instead of limited fakes.

itaiad200

Reviewed everything except graveler_test.go (will get back to it later).
Overall we're almost there! Bunch of medium/small comments..

pkg/catalog/catalog_test.go

pkg/catalog/gc_write_uncommitted.go

pkg/graveler/graveler.go

itaiad200 · 2024-05-05T15:44:16Z

pkg/graveler/graveler.go

@@ -3109,6 +3164,14 @@ func (g *Graveler) Diff(ctx context.Context, repository *RepositoryRecord, left,
 		leftValueIterator.Close()
 		return nil, err
 	}
+	if rightBranch.CompactedBaseMetaRangeID != "" {
+		compactedDiffIterator, err := g.CommittedManager.Diff(ctx, repository.StorageNamespace, leftCommit.MetaRangeID, rightBranch.CompactedBaseMetaRangeID)


leftCommit & rightBranch.CompactedBaseMetaRangeID? Can you explain why this is true?

In case there's no compaction data on the right, we return a combined iterator with:

- the diff iterator between the left committed data and the right committed data
- an iterator over the committed data in the left branch
- an iterator over the staging data of the right branch

So in case there's compacted data on the right branch, we need to change the first iterator so that it also contains changes that have already been compacted into the right branch.

In case there's no compaction data on the right we return a combined iterator with:

But this path handles the case where rightBranch.CompactedBaseMetaRangeID != ""

So, why isn't it JoinedDiff(stagingIterator, compactedDiffIterator)? Isn't iterator of committed data in the left branch already being diffed in compactedDiffIterator?

pkg/graveler/graveler.go

pkg/graveler/graveler_v2_test.go

pkg/graveler/graveler.go

itaiad200

Partial review, but I think there are some valuable comments already.

pkg/catalog/catalog_test.go

pkg/graveler/graveler.go

itaiad200 · 2024-05-07T13:23:09Z

pkg/graveler/graveler.go

+		}
+	}
+
+	if updatedValue == nil && reference.CompactedBaseMetaRangeID != "" {


How do we get here with updatedValue != nil?

in case the object didn't change in staging

pkg/graveler/graveler.go

itaiad200 · 2024-05-07T13:52:44Z

pkg/graveler/graveler.go

@@ -2492,7 +2512,7 @@ func (g *Graveler) ResetPrefix(ctx context.Context, repository *RepositoryRecord
 		newSealedTokens = append(newSealedTokens, branch.SealedTokens...)

 		// Reset keys by prefix on the new staging token
-		itr, err := g.listStagingArea(ctx, branch, 0)
+		itr, err := g.listStagingAreaWithoutCompaction(ctx, branch, 0)


Shouldn't compaction be involved too? AFAIR ResetPrefix resets all the uncommitted changes in a prefix. Why shouldn't it reset changes that exist in compacted area too?

pkg/graveler/graveler.go

idanovo · 2024-05-08T12:57:59Z

| Graveler function | Tested | Controller function | Tested | AI |
|---|---|---|---|---|
| ResetKey | X | ResetBranch | V | Add a controller test for ResetKey for a compacted branch when we add the CompactBranch API call |
| ResetPrefix | X | ResetBranch | V | Add a controller test for ResetPrefix for a compacted branch when we add the CompactBranch API call |
| prepareForCommitIDUpdate | X | Revert (dirty branch) | V | Added a test in graveler tests. Should also add a test in the controller once we have the CompactBranch API call |
| prepareForCommitIDUpdate | X | CherryPick (dirty branch) | V | Add a controller test for CherryPick for a compacted branch when we add the CompactBranch API call |
| prepareForCommitIDUpdate | X | UpdateBranch (dirty branch) | X | Add a controller test for ResetPrefix for a compacted branch when we add the CompactBranch API call |
| prepareForCommitIDUpdate | X | Merge (dirty branch) | V | Added a test in graveler tests. Should also add a test in the controller once we have the CompactBranch API call |
| prepareForCommitIDUpdate | X | Import (dirty branch) | X | Add a controller test for ResetPrefix for a compacted branch when we add the CompactBranch API call |

itaiad200

Thank you for pushing this one through, it certainly wasn't an easy one.
Once it's merged, please open an issue containing the tests table as a dependency for the next compaction task.

pkg/catalog/catalog_test.go

pkg/graveler/graveler.go

arielshaqed

Only blocking diffs are on JoinedDiffIterator. It is really important code, and needs to be as simple as possible. I think we may have simpler code if we revisit some decisions.

One way might be this. "Advance inner" (now, after seeing the code) seems like the wrong strategy. AFAICT it forces us to assume a particular special DiffIterator.Value() return of nil when the iterator is over. And then it looks like we check almost the same cases but not quite. Because there are so many, I worry that we will miss some edge cases.

arielshaqed · 2024-05-09T09:28:58Z

pkg/catalog/catalog_test.go

+			numBranch:     3,
+			numRecords:    0,
+			expectedCalls: 1,
+			compactBranch: true,


Nit: Rather than double each test scenario, I would probably just loop over both scenarios by saying for _, compactBranch := range []bool{false, true} { t.Run(...) } on l. 585.

arielshaqed · 2024-05-09T09:30:59Z

pkg/catalog/catalog_test.go

 	}

 	if numRecords > 0 {
 		test.GarbageCollectionManager.EXPECT().
 			GetUncommittedLocation(gomock.Any(), gomock.Any()).
-			Times(expectedCalls).
+			AnyTimes().


Why? Particularly worried that we might never call this. If no UGC occurs, will the test fail somewhere else?

pkg/catalog/gc_write_uncommitted.go

arielshaqed · 2024-05-09T09:37:53Z

pkg/graveler/graveler.go

+	// CompactedBaseMetaRangeID - the MetaRangeID of the last compaction's
+	CompactedBaseMetaRangeID MetaRangeID


Really not a fan of the "compacted" part of the name. It describes how it happened not what it is. I would prefer to use "BaseMetaRangeID". It is important to get the name right before any users - because if we get it wrong it will be hard to fix existing KVs.

In future we might have other reasons to change the base. For one example: during concurrent merges, one of the merges will fail when it tries to update the branch record of the destination. That is inefficient and slow. We could use your work here to speed up concurrent merges! After the merge fails like this we might optionally update the base metarange of the source branch to point at the intended result. Now the source branch looks like the destination branch got merged into it, so retrying the merge will be faster.

(This is just one example, I have many plans for this feature. I don't want to tie them to the word "compacted"!)

I think @itaiad200 was worried that the word "compact" was missing for this variable.
Decision time 😈

@itaiad200 can you chime in, please?

Not tying ourselves to Compacted means that we add another term on top of staging, sealed, compaction and uncommitted.. For example, LastStagingSnapshot - Pro: It describes what it is, not obscure. Con: Another term.

I still hold my opinion but not firmly, I'd rather it be CompactedStaging over BaseMetaRangeID. I believe it does describe what it is, not necessarily the action that happened.
So up to you, not blocking :)

pkg/graveler/graveler_v2_test.go

pkg/graveler/joined_diff_iterator.go

arielshaqed · 2024-05-09T09:43:20Z

pkg/graveler/joined_diff_iterator.go

+		// first
+		nextA = c.iterA.Next()
+		nextB = c.iterB.Next()
+	case valA == nil && valB == nil:


I prefer not to depend on nil values like this. I think that the problem is you already called Next on the iterators by the time you call this function; perhaps we should figure out how to avoid that?

arielshaqed · 2024-05-09T09:49:57Z

pkg/graveler/joined_diff_iterator.go

+	if !c.advanceInnerIterators() {
+		return false
+	}


I think this is more complex than it should be. You have to call c.Next() the first time, before using the iterator. That's why you need to have the special case for p == nil in advanceInnerIterators. This function:

Relies on DiffIterator returning nil value in certain conditions. Regardless of whether or not it does this, relying on this makes it a very special case - most Go iterators do not support this!

We repeatedly call Value() on each iterator. This is weird.

We have 2 switch statements. The first has 7 cases and 2 special cases in if statements, the second has 5 cases. This is too complex.

arielshaqed · 2024-05-09T09:51:05Z

pkg/graveler/joined_diff_iterator.go

+	if c.p != nil && c.p.Err() != nil {
+		return c.p.Err()
+	}


I don't understand. c.p is always either c.iterA or c.iterB. Why do we need to check it separately?

arielshaqed · 2024-05-09T09:51:19Z

pkg/graveler/joined_diff_iterator.go

+	if c.p != nil && c.p.Err() != nil {
+		return c.p.Err()
+	}
+	if c.iterA != nil && c.iterA.Err() != nil {


When is c.iterA == nil?

arielshaqed

Still worried about the iterator. The switch statement in there is huge, which makes it too hard for me to understand. I think it has issues, but TBH I really don't know.

So still requesting changes, if only to documentation.

arielshaqed · 2024-05-13T08:12:13Z

pkg/graveler/graveler.go

+	// CompactedBaseMetaRangeID - the MetaRangeID of the last compaction's
+	CompactedBaseMetaRangeID MetaRangeID


@itaiad200 can you chime in, please?

pkg/graveler/graveler.go

arielshaqed · 2024-05-18T15:07:58Z

pkg/graveler/joined_diff_iterator.go

+	iterAHasMore bool
+	iterB        DiffIterator
+	iterBHasMore bool
+	p            DiffIterator


Again, please document p. The name is very unintuitive.

arielshaqed · 2024-05-18T15:10:16Z

pkg/graveler/joined_diff_iterator.go

+	case bytes.Equal(valA.Key, valB.Key):
+		c.iterAHasMore = c.iterA.Next()
+		c.iterBHasMore = c.iterB.Next()
+	case bytes.Compare(valA.Key, valB.Key) < 0:


This traverses the keys twice. Please compare once; it is fine to call a separate routine here, but that will need to be documented. In fact you can return immediately in all the above cases, and then compute bytes.Compare(valA.Key, valB.Key) once.

arielshaqed · 2024-05-18T15:10:47Z

pkg/graveler/joined_diff_iterator.go

+		c.iterBHasMore = c.iterB.Next()
+	case bytes.Compare(valA.Key, valB.Key) < 0:
+		c.iterAHasMore = c.iterA.Next()
+		c.iterBHasMore = true


Why do we need this line? After line 43 we know that iterBHasMore.

arielshaqed · 2024-05-18T15:10:52Z

pkg/graveler/joined_diff_iterator.go

+	default:
+		// value of iterA < value of iterB
+		c.iterBHasMore = c.iterB.Next()
+		c.iterAHasMore = true


arielshaqed · 2024-05-18T15:12:59Z

pkg/graveler/joined_diff_iterator.go

+		return false
+	}
+	if !c.iterAHasMore && !c.iterBHasMore {
+		return false


I don't understand. Suppose that the last values of iterA and iterB refer to the same key. Then these lines will make this condition true -- and the iterator misses the last value!

We process only values we already returned in the first switch-case statement.
I'll add a test case just to make sure.

arielshaqed · 2024-05-18T15:14:17Z

pkg/graveler/joined_diff_iterator.go

+	valA = c.iterA.Value()
+	valB = c.iterB.Value()


If c.iterAHasMore is false then it is incorrect to call c.iterA.Value(), and similarly for iterB.

arielshaqed · 2024-05-18T15:15:51Z

pkg/graveler/joined_diff_iterator.go

+	c.iterA.SeekGE(id)
+	c.iterB.SeekGE(id)


This seems incorrect: AFAIR you must call Next() after SeekGE().

That's true.
This is the same behavior we have in our CombinedIterator.
I don't know if we want to have the same behavior here or change it.
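
As a rough illustration of the contract being discussed, the sketch below shows an iterator where SeekGE only repositions and a following Next() is required before Value() is valid. `keyIter` is a hypothetical toy type over sorted strings and is not the Graveler or CombinedIterator implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// keyIter is a toy iterator over sorted keys.
type keyIter struct {
	keys    []string
	nextIdx int // index that the next call to Next will land on
	cur     int // index of the current key, or -1 when there is none
}

func newKeyIter(keys []string) *keyIter { return &keyIter{keys: keys, cur: -1} }

// SeekGE positions the iterator just before the first key >= id.
// It does not make any key current; Next must be called afterwards.
func (it *keyIter) SeekGE(id string) {
	it.nextIdx = sort.SearchStrings(it.keys, id)
	it.cur = -1
}

func (it *keyIter) Next() bool {
	if it.nextIdx >= len(it.keys) {
		it.cur = -1
		return false
	}
	it.cur = it.nextIdx
	it.nextIdx++
	return true
}

// Value reports false when no key is current, instead of returning a nil-like value.
func (it *keyIter) Value() (string, bool) {
	if it.cur < 0 {
		return "", false
	}
	return it.keys[it.cur], true
}

func main() {
	it := newKeyIter([]string{"a", "c", "e"})
	it.SeekGE("b")
	_, ok := it.Value()
	fmt.Println(ok) // false: nothing is current right after SeekGE
	it.Next()
	v, _ := it.Value()
	fmt.Println(v) // c
}
```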

arielshaqed

Thanks! A lot of effort went into this, and I really believe it will improve performance - so definitely worth it. Sorry for picking so many nits - this is the core of lakeFS, and we need both tests and manual review.

My few remaining comments are minor - feel free to push to trunk however you decide to handle them.

arielshaqed · 2024-06-10T10:57:43Z

pkg/graveler/prefix.go

-// UpperBoundForPrefix returns, given a prefix `p`, a slice 'q' such that a byte slice `s` starts with `p`
-// if and only if p <= s < q. Namely, it returns an exclusive upper bound for the set of all byte arrays
-// that start with this prefix. It returns nil if there is no such byte slice because all bytes of `p` are
+// UpperBoundForPrefix returns, given a prefix `currenIter`, a slice 'q' such that a byte slice `s` starts with `currenIter`


Nit: I think this should be currentIter.

arielshaqed · 2024-06-10T11:06:30Z

pkg/graveler/joined_diff_iterator.go

+	if c.currenIter != nil && c.currenIter.Err() != nil {
+		return c.currenIter.Err()
 	}


This seems wrong: it says that if one iterator failed but the other still works, then everything is OK. But it's not! Next() returned false, and I need to fail.

Suggested change

	if c.currenIter != nil && c.currenIter.Err() != nil {
		return c.currenIter.Err()
	}
	if c.iterA.Err() != nil { // TODO(idan): Return a multierror with _both_ errors if both failed!
		return c.iterA.Err()
	}
	if c.iterB.Err() != nil {
		return c.iterB.Err()
	}
	return nil

We need a unit test for this, too, unfortunately.

arielshaqed · 2024-06-10T11:10:00Z

pkg/graveler/joined_diff_iterator.go

 	}
 	return true
 }

 func (c *JoinedDiffIterator) Value() *Diff {
-	if c.p == nil {
+	if c.currenIter == nil {


Please note that calling Value() here is an error. Even though you return successfully, the caller will probably panic shortly after. Your call, but personally I would at least log an "internal error".

pkg/catalog/gc_write_uncommitted.go

pkg/graveler/graveler.go

I've got 2 approvals

Read from branch with compaction data

deee833

idanovo added the exclude-changelog label (PR description should not be included in next release changelog) Apr 25, 2024

idanovo self-assigned this Apr 25, 2024

idanovo linked an issue Apr 25, 2024 that may be closed by this pull request

Manage read from branch with compaction #7698

Closed

Remove unneeded test

036a2b4

idanovo requested review from arielshaqed, itaiad200 and guy-har April 27, 2024 21:19

arielshaqed requested changes Apr 28, 2024

guy-har previously requested changes Apr 28, 2024

Fix comments

82694ff

itaiad200 requested changes Apr 29, 2024

idanovo added 6 commits May 1, 2024 11:42

Merge branch 'master' of https://github.com/treeverse/lakeFS into rea…

90059b0

…d-compacted-branch

WIP

c554d87

WIP

0170d65

WIP

5f44d2b

WIP

644a04d

WIP

18742fa

idanovo requested review from arielshaqed, itaiad200 and guy-har May 3, 2024 04:17

itaiad200 requested changes May 5, 2024

guy-har reviewed May 6, 2024

idanovo added 5 commits May 6, 2024 12:44

PR reviews

6cf1337

WIP

11a448f

Fix review

dfb54e6

Added tests

f609ec2

Added GC tests

eb9c4e4

idanovo requested a review from itaiad200 May 7, 2024 09:05

itaiad200 requested changes May 7, 2024

idanovo added 7 commits May 7, 2024 18:24

PR review

d51ead2

Fix

3adc263

Fix tests

70c49d0

Merge branch 'master' of https://github.com/treeverse/lakeFS into rea…

ca9a9cb

…d-compacted-branch

WIP

29e5085

Added compaction to gc test

c2cef8b

Rename function name

005f718

idanovo requested a review from itaiad200 May 8, 2024 12:58

itaiad200 approved these changes May 8, 2024

Merge branch 'master' of https://github.com/treeverse/lakeFS into rea…

2717510

…d-compacted-branch

arielshaqed requested changes May 9, 2024

Review comments

0400e6c

idanovo requested review from arielshaqed May 9, 2024 13:06

Merge branch 'master' of https://github.com/treeverse/lakeFS into rea…

23221ae

…d-compacted-branch

arielshaqed requested changes May 18, 2024

idanovo added 4 commits May 18, 2024 22:01

Fix PR review

1c10a6e

Merge branch 'master' of https://github.com/treeverse/lakeFS into rea…

19d797c

…d-compacted-branch

Lint

2a0f495

Merge branch 'master' of https://github.com/treeverse/lakeFS into rea…

4c46b26

…d-compacted-branch

arielshaqed approved these changes Jun 11, 2024

PR review

3426e0b

idanovo removed the request for review from guy-har June 11, 2024 09:57

idanovo merged commit 317c985 into master Jun 11, 2024
35 checks passed

idanovo deleted the read-compacted-branch branch June 11, 2024 09:57

