Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Allow querying of 9th, 19th, 29th, etc collections #2819

Merged

Conversation

AndrewSisley
Copy link
Contributor

@AndrewSisley AndrewSisley commented Jul 8, 2024

Relevant issue(s)

Resolves #2810

Description

Allows querying of 9th, 19th, 29th, etc collections.

The fetcher calls PrefixEnd which causes the start prefix to be (e.g.) "/9" and the end "/10" causing nothing to be returned to the user. This PR fixes that by encoding the DataStore CollectionRootIDs as variable sized BigEndians.

#2818 has been broken out of this as it is not a user-visible issue atm and resolving it is a bit more involved due to some legacy tech. debt.

@AndrewSisley AndrewSisley added bug Something isn't working area/datastore Related to the datastore / storage engine system labels Jul 8, 2024
@AndrewSisley AndrewSisley added this to the DefraDB v0.13 milestone Jul 8, 2024
@AndrewSisley AndrewSisley requested a review from a team July 8, 2024 20:07
@AndrewSisley AndrewSisley self-assigned this Jul 8, 2024
Copy link

codecov bot commented Jul 8, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 4 lines in your changes missing coverage. Please review.

Project coverage is 79.14%. Comparing base (24fa14f) to head (0afbc3c).

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2819      +/-   ##
===========================================
+ Coverage    78.60%   79.14%   +0.54%     
===========================================
  Files          318      318              
  Lines        24128    24162      +34     
===========================================
+ Hits         18965    19123     +158     
+ Misses        3785     3662     -123     
+ Partials      1378     1377       -1     
Flag Coverage Δ
all-tests 79.14% <92.86%> (+0.54%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
internal/core/key.go 83.96% <100.00%> (-0.03%) ⬇️
internal/planner/scan.go 90.75% <100.00%> (+1.16%) ⬆️
internal/core/encoding.go 79.63% <91.49%> (+3.30%) ⬆️

... and 25 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 24fa14f...0afbc3c. Read the comment docs.

@@ -0,0 +1,315 @@
// Copyright 2024 Democratized Data Foundation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Praise: Thanks a bunch for documenting these! Really nice to see the issue narrowed down with the edge cases you highlighted.

@@ -101,8 +101,8 @@ query @explain {
"collectionID": "3",
"collectionName": "Author",
"spans": [{
"start": "/3",
"end": "/4"
"start": "/\x8b",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: My eyes 😮

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has been reverted - encoding is not json friendly, so explain must be pretty-printed anyway

@@ -34,86 +34,6 @@ func TestNewDataStoreKey_ReturnsEmptyStruct_GivenEmptyString(t *testing.T) {
assert.ErrorIs(t, ErrEmptyKey, err)
}

func TestNewDataStoreKey_ReturnsCollectionIdAndIndexIdAndDocIDAndFieldIdAndInstanceType_GivenFourItemsWithType(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Do we loose any coverage or paths that these tests were testing due to the removal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt it, in theory everything should be covered by integration tests anyway, and the new code has fewer branches IIRC.

}, nil
}

func EncodeDataStoreKey(key *DataStoreKey) []byte {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: A line documenting the Encoding type

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add

  • doc func

@@ -288,3 +288,81 @@ func EncodeIndexDataStoreKey(key *IndexDataStoreKey) []byte {

return b
}

func DecodeDataStoreKey(data []byte) (DataStoreKey, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: A line documenting the decoding type

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add

  • doc func

Copy link
Member

@shahzadlone shahzadlone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments

Copy link
Contributor

@islamaliev islamaliev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I left some clarifying comments.

@@ -1301,6 +1300,7 @@ func createDocViaGQL(
node client.DB,
collections []client.Collection,
) ([]*client.Document, error) {
println(fmt.Sprint(collections))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: is it left here intentional?

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not :) And I should have been more careful so soon after the mutation type bug :) Thanks for spotting

  • remove print

}

var docID string
if len(data) > 40 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: please extract the magic number to a const

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

  • Rm magic number

}

var instanceType InstanceType
if len(data) > 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: wouldn't it be better to do:

if len(data) <= 1 {
  return ...
}

// check instance type ...

Currently if if len(data) > 1 is false the rest of the function is irrelevant, but is being executed.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do prefer the simplicity of the current code, although it does have a small runtime cost. I'll have a quick look when I remove the magic number - do let me know if you have a strong preference though.

  • Have quick look at early returns

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't call it a strong preference. Just following return-early best practice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a play, whilst I like it hightlights the data flow a little more nicely, it ends up duplicating logic (or mutating, or a new func) and even without my gut prefers the feel of the current. Leaving as-is.

"start": "/3",
"end": "/4",
"start": "/\x8b",
"end": "/\x8c",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though: this is pain. Would be nice to have it extracted so that if it changes again, we don't need to go through it again.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain was actually pretty easy considering how invasive they are by nature, the unit tests were much more of a pain.

This might get changed slightly before merge anyway - the http client does not like this encoding and it caught me out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been reverted, the encoding was not json friendly and needed to be prettied up in order to work with the http client.

testUtils "github.com/sourcenetwork/defradb/tests/integration"
)

func TestSimple_WithSevenDummyTypesBefore(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: would be nice to have "expectation" part in the test name:
TestSimple_WithSevenDummyTypesBefore_ShouldSucceed or something like this.

Comment on lines +42 to +48
type Type6 {
f: String
}

type User {
name: String
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: for these tests would be nice to have some more information (as a documentation or test description) explaining why these weird-looking tests are necessary and which edge-case they test.

Copy link
Contributor Author

@AndrewSisley AndrewSisley Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I'll add some in - thanks for pushing past my laziness here :)

  • Add test docs

Comment on lines -105 to -115
func TestNewDataStoreKey_GivenAStringWithExtraSuffix(t *testing.T) {
instanceType := "anyType"
fieldId := "f1"
docID := "docID"
collectionId := "1"
inputString := "/db/data/" + collectionId + "/" + instanceType + "/" + docID + "/" + fieldId + "/version_number"

_, err := NewDataStoreKey(inputString)

assert.ErrorIs(t, ErrInvalidKey, err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: you just removed the tests. Don't you plan to replace them with new ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think they had much value before, and even less so with the new code IMO.

Copy link
Member

@shahzadlone shahzadlone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please double check when all tests are passing that the removed unit tests weren't testing any edge cases before merge, other than that all good! Thanks a bunch for sorting this out.

@AndrewSisley AndrewSisley merged commit 1acc084 into sourcenetwork:develop Jul 9, 2024
39 checks passed
@AndrewSisley AndrewSisley deleted the 2810-dave-many-many-bug branch July 9, 2024 14:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/datastore Related to the datastore / storage engine system bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Many-many bridge appears to not be working
3 participants