Skip to content

Conversation

@jeltsch
Copy link
Collaborator

@jeltsch jeltsch commented Feb 27, 2025

This pull request adds index benchmarks that are independent of index type and instantiates them for compact and ordinary indexes.

@jeltsch jeltsch requested a review from dcoutts as a code owner February 27, 2025 17:24
@jeltsch jeltsch added the enhancement New feature or request label Feb 27, 2025
@jeltsch jeltsch self-assigned this Feb 27, 2025
@jeltsch jeltsch requested a review from wenkokke as a code owner February 27, 2025 17:24
@jeltsch jeltsch force-pushed the jeltsch/general-index-benchmarks branch from 9770b4c to 5081353 Compare March 11, 2025 18:41
Comment on lines +111 to +116
incrementalConstructionAppends
:: Int -- ^ Number of keys used in the construction
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
incrementalConstructionAppends
:: Int -- ^ Number of keys used in the construction
incrementalConstructionAppends ::
Int -- ^ Number of keys used in the construction

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeltsch did you forget to resolve this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that this wouldn’t be so important, given that it was labeled a “suggested” change and the pull request was already approved as it were.

Generally, I think that having this sort of hanging indentation by not putting :: above the arrows makes for an inferior layout. I know that your editor and GitHub seem to have problems with a layout like the present one when it comes to syntax highlighting, but I think we shouldn’t let current bugs of tools influence our layout.

Well, I can make the change. Should I?

Comment on lines +48 to +75
-- | Deterministically constructs a value using a QuickCheck generator.
generated :: Gen a -> a
generated (MkGen exec) = exec (mkQCGen 411) 30

{-|
Constructs serialised keys that conform to the key size constraint of
compact indexes.
-}
keysForIndexCompact :: Int -- ^ Number of keys
-> [SerialisedKey] -- ^ Constructed keys
keysForIndexCompact = vector >>>
generated >>>
map (getKeyForIndexCompact >>> SerialisedKey)

{-|
Constructs append operations whose serialised keys conform to the key size
constraint of compact indexes.
-}
appendsForIndexCompact :: Int -- ^ Number of keys used in the construction
-> [Append] -- ^ Constructed append operations
appendsForIndexCompact = keysForIndexCompact >>>
mkPages 0.03 (choose (0, 16)) 0.01 >>>
generated >>>
toAppends
Copy link
Collaborator

@jorisdral jorisdral Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally normally avoid using QuickCheck generators because they are often in flux, and because they are primarily tailored towards testing, not benchmarking. It can be more future proof to write these functions with System.Random and related functions. When we change QuickCheck generators, we might accidentally change a benchmark as well

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m also not completely satisfied with using QuickCheck generators for benchmarks. That said, by using System.Random directly, we lose the ability to use existing utilities, like we use mkPages above. I’d like to leave the above code as it is for now. It might be worthwhile, though, to generally revisit the uses of QuickCheck generators in our benchmarks.

Copy link
Collaborator

@jorisdral jorisdral Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, by using System.Random directly, we lose the ability to use existing utilities, like we use mkPages above.

That's not completely true. You should run the mkPages Gen with a seed, yes, but you can generate the keys with System.Random. See the Index.Compact benchmarks

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you’re advocating a mixed-approach with some data being generated by directly using System.Random and other by using QuickCheck?

In the compact-index benchmark module, I can only see one use of mkPages, the one in constructionEnv. However, there the generator constructed by mkPages is run via generate, which means that it uses a seed produced by the global random number generator, not a seed specified in the source file. That said, I could still use my generated function for generators to work with fixed seeds.

Should I change the general index benchmarks to use that mixed approach where just the page data is generated using QuickCheck?

@jorisdral
Copy link
Collaborator

BTW, I think all commits can be squashed into one before merging

@jeltsch jeltsch force-pushed the jeltsch/general-index-benchmarks branch from 4904374 to b635b2c Compare March 12, 2025 19:44
@jeltsch jeltsch enabled auto-merge March 12, 2025 19:45
@jeltsch jeltsch added this pull request to the merge queue Mar 12, 2025
Merged via the queue into main with commit 585b257 Mar 12, 2025
27 checks passed
@jeltsch jeltsch deleted the jeltsch/general-index-benchmarks branch March 12, 2025 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants