feat(tiering): Simplest small bins #2810
0bab36e to 216a607
src/server/tiering/disk_storage.h
@@ -14,6 +14,10 @@
namespace dfly::tiering {

struct DiskSegment {
  DiskSegment FillPages() const {
    return {offset / 4_KB * 4_KB, length / 4_KB * 4_KB};
That's not good, imho. You round the length down, whereas you should round offset + length up.
Good catch. Actually, I don't use the length anywhere for now. The offset should be rounded down, of course. The rounded segment should always contain the original page.
lgtm
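The rounding rule agreed on above (round the offset down, round offset + length up) can be sketched as follows. This is a minimal illustration, not the code that was merged: `kPageSize` stands in for the PR's `4_KB` literal, and the field layout of `DiskSegment` is assumed from the diff.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: stands in for the 4_KB literal used in the diff.
constexpr std::size_t kPageSize = 4096;

// Assumption: a plain offset/length pair, as the diff suggests.
struct DiskSegment {
  std::size_t offset;
  std::size_t length;

  // Returns the smallest page-aligned segment that fully contains this one.
  DiskSegment FillPages() const {
    assert(length > 0);
    // Round the start down to a page boundary.
    std::size_t begin = offset / kPageSize * kPageSize;
    // Round the end (offset + length) up to a page boundary.
    std::size_t end = (offset + length + kPageSize - 1) / kPageSize * kPageSize;
    return {begin, end - begin};
  }
};
```

For a segment at offset 4000 with length 200, the version in the diff would yield a length of 0, while this sketch yields the two pages (0 to 8192) that the segment actually touches.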
src/server/tiering/small_bins.cc
  return {id, std::move(out)};
}

SmallBins::CutList SmallBins::ReportStashed(unsigned id, DiskSegment segment) {
what's a cutlist? why this name?
I'll call it KeySegmentList. After a bin is stashed, this list reports where each value is stored on the page (segment).
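One way to picture what such a list contains: once the bin's page has a disk location, each value's position inside the bin is translated into an absolute segment. The sketch below is an assumption about how that could work, not the PR's implementation; `PendingEntry` and `ToKeySegments` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Assumption: a plain offset/length pair, as in the diff.
struct DiskSegment {
  std::size_t offset;
  std::size_t length;
};

// Same shape as the alias discussed in the review.
using KeySegmentList = std::vector<std::pair<std::string, DiskSegment>>;

// Hypothetical bookkeeping kept in memory while the bin is pending:
// where each value sits relative to the start of the serialized bin.
struct PendingEntry {
  std::string key;
  std::size_t offset_in_bin;
  std::size_t length;
};

// Translates in-bin positions into absolute disk segments once the
// page's location on disk is known.
KeySegmentList ToKeySegments(const std::vector<PendingEntry>& entries, DiskSegment page) {
  KeySegmentList out;
  out.reserve(entries.size());
  for (const auto& e : entries) {
    assert(e.offset_in_bin + e.length <= page.length);  // value must fit in the page
    out.emplace_back(e.key, DiskSegment{page.offset + e.offset_in_bin, e.length});
  }
  return out;
}
```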
src/server/tiering/small_bins.cc
    auto& key = node.key();
    auto& value = node.mapped();

    // 1. 8 byte key hash
You interleave hashes and key/values.
Is that how it was in tiered_storage? I think we kept all the hashes together. The motivation for the separation is to make it easy to read all the hashes from a page when performing defragmentation.
The format was different. The hashes were written first, then all the values, with their lengths rounded up because we had separate bins. I will change it so that all hashes are stored first, but keep in mind this is not a final design.
Now the format is: num_entries [hash...] [value_length value...]
5613130 to cbdbd3b
You need to rebase and fix conflicts.
Actually, I don't store value lengths, fixed accounting 😅
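A rough sketch of the hashes-first layout described above (entry count, then all hashes, then the values back to back with no stored lengths). The 8-byte hash width comes from the diff comment; the 2-byte count, native byte order, and the function names `SerializeBin` and `ReadHashes` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Assumed layout: [uint16 num_entries][uint64 hash...][value...]
// Values are concatenated without lengths; their positions are expected to be
// tracked elsewhere (for example in a KeySegmentList).
std::string SerializeBin(const std::vector<std::pair<std::uint64_t, std::string>>& entries) {
  assert(entries.size() <= UINT16_MAX);
  std::uint16_t n = static_cast<std::uint16_t>(entries.size());

  // Header: entry count followed by all hashes, stored contiguously.
  std::string out(sizeof(n) + entries.size() * sizeof(std::uint64_t), '\0');
  char* p = &out[0];
  std::memcpy(p, &n, sizeof(n));
  p += sizeof(n);
  for (const auto& e : entries) {
    std::memcpy(p, &e.first, sizeof(std::uint64_t));
    p += sizeof(std::uint64_t);
  }

  // Body: the values, one after another.
  for (const auto& e : entries) out += e.second;
  return out;
}

// Reads only the hash block, which is what a defragmentation pass would need.
std::vector<std::uint64_t> ReadHashes(const std::string& page) {
  std::uint16_t n = 0;
  assert(page.size() >= sizeof(n));
  std::memcpy(&n, page.data(), sizeof(n));
  assert(page.size() >= sizeof(n) + n * sizeof(std::uint64_t));

  std::vector<std::uint64_t> hashes(n);
  if (n > 0) {
    std::memcpy(hashes.data(), page.data() + sizeof(n), n * sizeof(std::uint64_t));
  }
  return hashes;
}
```

Keeping the hashes in one contiguous block means they can be read with a single copy from the start of the page, without walking over the values.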
  using FilledBin = std::pair<unsigned /* id */, std::string>;

  // List of locations of values for corresponding keys
  using KeySegmentList = std::vector<std::pair<std::string /* key*/, DiskSegment>>;
Instead of commenting on basic types, I suggest `using` to declare `Key`, `Id`.
Also comment on Id that it's unique
Will define Id. I suggest not defining Key = string anywhere because it's overkill; we don't do it anywhere. It's clear from the comments above, and the inline comments are just for readability.
src/server/tiering/small_bins.h
// SIMPLEST VERSION for now.
class SmallBins {
 public:
  using FilledBin = std::pair<unsigned /* id */, std::string>;
Here, actually add a comment that the string is a serialized blob that is at most 4KB but can be smaller.
It's stated above in SmallBins that it fills 4KB pages.
src/server/tiering/small_bins.h
  FilledBin FlushBin();

 private:
  unsigned last_bin_id_ = 0;
What's the implication of this id? unsigned can overflow, that's why I am asking.
Do you use it to track in-flight requests, or is it more than that?
Now I see. You use it to track in-flight ids.
Yes, it's only for pending stashes. It can overflow safely, I doubt we can have 2e9 ongoing disk operations
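To illustrate why wraparound is harmless when ids only identify pending operations, here is a minimal sketch. `PendingStashes` and its methods are hypothetical names, not part of the PR; the only language fact relied on is that unsigned arithmetic in C++ wraps modulo 2^N rather than invoking undefined behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical tracker for in-flight stash operations, keyed by a wrapping id.
class PendingStashes {
 public:
  // The start value is configurable only so that wraparound is easy to demonstrate.
  explicit PendingStashes(unsigned start = 0) : last_id_(start) {}

  // Registers a new in-flight stash and returns its id.
  unsigned Add(std::string blob) {
    unsigned id = ++last_id_;  // wraps from the maximum value to 0, which is well defined
    bool inserted = pending_.emplace(id, std::move(blob)).second;
    // A collision would require every possible id to still be in flight.
    assert(inserted);
    (void)inserted;
    return id;
  }

  // Marks a stash as finished; returns false if the id was not pending.
  bool Complete(unsigned id) { return pending_.erase(id) > 0; }

  std::size_t InFlight() const { return pending_.size(); }

 private:
  unsigned last_id_;
  std::unordered_map<unsigned, std::string> pending_;
};
```

Because an id is reused only after the counter has cycled through its whole range, a clash can occur only if an operation issued a full cycle earlier is still pending.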
531c3a3 to dc1a0c1
Signed-off-by: Vladislav Oleshko <vlad@dragonflydb.io>
…nfly ( v1.16.1 → v1.17.0 ) (#3473)
The Dragonfly v1.17.0 release notes quoted in that update list this change: feat(tiering): Simplest small bins by @dranikpg in dragonflydb/dragonfly#2810
Simplest design for small bins