Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing PFCOUNT #925

Closed
martadinata666 opened this issue Mar 11, 2023 · 14 comments · Fixed by dragonflydb/documentation#80
Closed

Missing PFCOUNT #925

martadinata666 opened this issue Mar 11, 2023 · 14 comments · Fixed by dragonflydb/documentation#80
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@martadinata666
Copy link

Describe the bug
Missing PFCOUNT command on mastodon installation

To Reproduce
Steps to reproduce the behavior:

  1. Replace redis to dragonfly on mastodon installation
  2. Load the mastodon web
  3. Web empty
  4. Check logs show Missing PFCOUNT

Expected behavior
Web loaded, and run successfully.

Environment (please complete the following information):

  • OS: Ubuntu 22.04
  • Kernel: `6.2.1-060201-generic
  • Containerized?: Docker Compose
  • Dragonfly Version: 0.17.0

Log

[83618058-3981-4ac1-9d89-7d913e276f63] method=HEAD path=/health format=*/* controller=HealthController action=show status=200 duration=0.36 view=0.19 db=0.00
[8d0133b7-154c-445d-8392-caa4ff3d7cf0] method=GET path=/api/v1/instance format=*/* controller=Api::V1::InstancesController action=show status=200 duration=1.51 view=0.28 db=0.00
[f40159fd-6247-4f67-9f03-4e5de65b2a11] method=GET path=/.well-known/nodeinfo format=json controller=WellKnown::NodeInfoController action=index status=200 duration=1.02 view=0.51 db=0.00
[86aedd5c-da14-4bcd-9df2-899cfd971401] method=GET path=/nodeinfo/2.0 format=*/* controller=WellKnown::NodeInfoController action=show status=500 error='Redis::CommandError: ERR unknown command `PFCOUNT`' duration=1.96 view=0.00 db=0.00
[86aedd5c-da14-4bcd-9df2-899cfd971401]   
[86aedd5c-da14-4bcd-9df2-899cfd971401] Redis::CommandError (ERR unknown command `PFCOUNT`):
[86aedd5c-da14-4bcd-9df2-899cfd971401]   
[86aedd5c-da14-4bcd-9df2-899cfd971401] app/lib/activity_tracker.rb:50:in `sum'
[86aedd5c-da14-4bcd-9df2-899cfd971401] app/presenters/instance_presenter.rb:66:in `block in active_user_count'
[86aedd5c-da14-4bcd-9df2-899cfd971401] app/presenters/instance_presenter.rb:66:in `active_user_count'
[86aedd5c-da14-4bcd-9df2-899cfd971401] app/serializers/nodeinfo/serializer.rb:28:in `usage'
[86aedd5c-da14-4bcd-9df2-899cfd971401] app/controllers/concerns/cache_concern.rb:22:in `render_with_cache'
[86aedd5c-da14-4bcd-9df2-899cfd971401] app/controllers/well_known/nodeinfo_controller.rb:16:in `show'
[86aedd5c-da14-4bcd-9df2-899cfd971401] lib/mastodon/rack_middleware.rb:9:in `call'
[4e207fc1-dda8-4aa5-899e-2e3f6d64e8e2] method=GET path=/api/v1/streaming/public/local format=html controller=ApplicationController action=raise_not_found status=404 duration=3.78 view=2.05 db=0.00
[56488c9c-b4f1-48d3-9055-e27ea51d1abd] method=POST path=/inbox format=json controller=ActivityPub::InboxesController action=create status=202 duration=25.09 view=0.00 db=1.29 key=https://relay.gruenehoelle.nl/actor#main-key
[facc8d71-e724-4806-a702-0f321f9c29e3] method=GET path=/ format=*/* controller=HomeController action=index status=200 duration=52.70 view=32.25 db=6.46
[279f0cc5-9539-4d33-b1d3-6778aa15f902] method=GET path=/api/v1/mutes format=*/* controller=Api::V1::MutesController action=index status=200 duration=14.39 view=0.81 db=4.85
[db3b819f-939c-44af-914a-e3b9bdcab4b5] method=GET path=/api/v1/blocks format=*/* controller=Api::V1::BlocksController action=index status=200 duration=10.38 view=0.39 db=2.68
[46e0d300-6c53-40f8-aba2-1e0190c9297a] method=GET path=/api/v1/domain_blocks format=*/* controller=Api::V1::DomainBlocksController action=show status=200 duration=5.03 view=0.30 db=1.01
[426aadac-1953-40a2-b33a-8738b182e58e] method=HEAD path=/health format=*/* controller=HealthController action=show status=200 duration=0.35 view=0.16 db=0.00

@martadinata666 martadinata666 added the bug Something isn't working label Mar 11, 2023
@romange
Copy link
Collaborator

romange commented Apr 18, 2023

@chakaz I think it's a perfect starter project for you :)
it's a medium size task that will force you to go through all the internals of Dragonfly codebase.

  1. Integrating Redis code into /src/redis/ lib.
    The Redis data-structure implementation is pretty good, and I do not see any reason right now to reimplement it ourselves. So the goal is to add hyperloglog.c to our codebase. The way I do this (as you can see with other data structure files), I usually take a file from the latest stable redis branch (7.2, for example), and delete all the functions having "client*" in their signature. "client*" functions are part of server functionality that we can not reuse and must reimplement in Dragonfly. hyperloglog.c does not have n accompanying header file on its own, so you need to add one in order to the export functions it implements (subject to requirements from the next steps).
    Also, you can not rely on "server.h" - but it's easily solvable - see how I did it with other files, For example, we have a stub "server" object that contains required settings with the sane defaults.
  2. Wrapping hyperloglog into compact_object. CompactObject is a mess - sorry about it. You will need to add hyperloglog to RobjWrapper object, and introduce a test in "compact_object_test" that can import "robj*" representing hyperloglog into compactobject. You will need to implement memory management wrappers, expose Type functionality etc. You can check how zset was integrated for example.
  3. Once we have support in compact_object_test, we can implement the commands. The commands are divided into 2 parts: coordinator and shard local implementation. Again, you will be able to take any zset related command and see how it's done. The MVP we should focus on is to implement only two commands "PFADD" and "PFCOUNT" and for PFCOUNT we will implement only a single key version right now (can be easily extended later). Obviously this will require adding a new unit test - "pf_family_test.cc"
  4. Serialization - we will need to support load/save functionality. Another pretty complicated flow inside rdb_load.cc and rdb_save.cc. rdb_test.cc should help you there.
  5. Last but not least, replication - we must verify that it's replicated, which means expanding python regression tests to cover PF_ commands.
  6. Documentation - update https://github.com/dragonflydb/documentation/ with the new command set :)

It's not a small task but I think you gonna enjoy it :)
@adiholden will help you along the way and will help with code reviews.

@romange romange added enhancement New feature or request good first issue Good for newcomers and removed bug Something isn't working labels Apr 18, 2023
@chakaz
Copy link
Collaborator

chakaz commented Apr 21, 2023

This looks like an exciting task :)

I do have a question re/ (2) above:
@adiholden, Roman mentioned that I should add HLL as a distinct sub-type (I assume that via assigning a unique type_ ID, like we do for string, json, etc). However, it seems like Redis treats HLL as plain strings without explicit distinction:

The HyperLogLog, being a Redis string, can be retrieved with GET and restored with SET. Calling PFADD, PFCOUNT or PFMERGE commands with a corrupted HyperLogLog is never a problem, it may return random values but does not affect the stability of the server. Most of the times when corrupting a sparse representation, the server recognizes the corruption and returns an error.

(source: https://redis.io/commands/pfcount/)

So my question is: should we indeed make this a type different from string?

If we'll keep it as a string:

  • We get replication, serialization, etc for free (implementation-wise)
  • We remain compatible with Redis (although I imagine not many users explicitly set the string value of HLL)

If we'll set a distinct type:

  • We'll make it easier to track errors, like when a user tries to use an HLL as a string or vice versa (although that's supposedly supported?)

wdyt? Are there some considerations which I I've missed?

@romange
Copy link
Collaborator

romange commented Apr 21, 2023 via email

@chakaz
Copy link
Collaborator

chakaz commented Apr 23, 2023

Here's where I'm currently at:
I exposed the relevant functionality from hyperloglog.c, with a simple C API.
Then, when using this API from a new hll_family.cc file that I created, I encountered an issue.
The thing is - hyperloglog.c works with robj*. For example, when resizing the string, it does so via the robj API. I'm not sure if this plays nicely with CompactObj, as at least so far all of my attempts have failed.
Specifically what I tried was to let hyperloglog.c create the robj for us and feed that to a CompactObj via ImportRObj(). That works, and indeed I am able to create an empty HLL. Yay!
But then, any attempt to modify that HLL (like via PFADD) fails. The reason, so I figured, is that calling CompactObj::AsRObj() does not return an robj that behaves like the original robj. Specifically, it does not have the sds size encoding right before the data memory, so HLL thinks that it's an empty string (thus invalid HLL). Looking around, I see that AsRObj() is not used with strings (?), so perhaps all I need is to modify that method to encode sds correctly? However I also see that after calling AsRObj() I'll also need to SyncRObj(), so perhaps I'm just way off?
Should I use RObjPtr() and SetRObjPtr() instead? These feel more raw to me, so I want to make sure that it's not a mistake.
Any advice? Thanks!

@romange
Copy link
Collaborator

romange commented Apr 24, 2023

@chakaz , well you are right. We should not mix sds with DF strings so the integration I suggested won't work.

The only possible solution I see is to peel the API even further. That means we will need to reimplement all the robj handing functions like createHLLObject that will use CompactObj in our version.

Luckily we still have low-level that we can reuse like hllDenseAdd, hllCount etc. I had to do similar peelings for some of the data structures. Does it sound reasonable?

@chakaz
Copy link
Collaborator

chakaz commented Apr 24, 2023

Sure, it sounds very much reasonable! I'm happy I asked, as I didn't realize this was a "deal breaker".
I think that the only case that really needs to resize the string is in the case of a sparse (i.e. non-dense) HLL, when adding a new element. Hopefully that won't be too hard to integrate. Otherwise it should always be string modifications that do not require resizing (at least that's what I understand).
I'm on it.

@chakaz
Copy link
Collaborator

chakaz commented Apr 24, 2023

I will say that, should we want to reduce complexity and "launch" faster, we could (at least initially?) only support the dense encoding. It has clear advantages in our case, which is that the size is fixed and thus the integration is much easier.
It does have the downside of always consuming more memory, even when Redis would have used less. Also, it won't play nicely with replication where the master is Redis.
Anyway, I imagine that we're not interested in that, and so I'll go ahead and will work on the deeper integration, but let me know if you do think that it's a better idea (even if temporarily)

@romange
Copy link
Collaborator

romange commented Apr 24, 2023

Sounds reasonable!

@chakaz
Copy link
Collaborator

chakaz commented Apr 24, 2023

Sorry, which option sounds reasonable? :)
Only supporting dense encoding, or doing the deeper integration?

@romange
Copy link
Collaborator

romange commented Apr 24, 2023 via email

@chakaz
Copy link
Collaborator

chakaz commented Apr 30, 2023

With #1152 merged, we still need to add:

  1. Support for PFMERGE
  2. Documentation

I'm on it :)

@romange
Copy link
Collaborator

romange commented Apr 30, 2023 via email

@chakaz
Copy link
Collaborator

chakaz commented May 1, 2023

Yes :)
I'll get to that after PFMERGE though.

@romange
Copy link
Collaborator

romange commented May 18, 2023

FYI, Dragonfly v1.3.0 and later supports this API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants