
"Binary data is too long" in log - Multiple times every second #12

Closed
exetico opened this issue Oct 30, 2020 · 12 comments

exetico commented Oct 30, 2020

Hi

I'm using the default configuration for lsmcd. In my node.conf I have the following settings defined for LogLevel and LogFile:

LogLevel=notice
#LogLevel=dbg_medium
LogFile=/tmp/lsmcd.log

However, 2020-06-20 07:48:51.511 [NOTICE] [__root] Binary data is too long is reported over and over again, many times every second. (Let's just say the log file was above 1GB. I spotted it while doing some maintenance on the server 🔥 🍔 )

If I take a look at the source, I can find the logic, but I'm not really sure what it means.

LS_NOTICE("Binary data is too long\n");

...

McBinStat LsMemcache::chkMemSz(MemcacheConn *pConn, int arg)
{
...
    if (!chkItemSize(m_parms.val.len))
    {
        // memcached compatibility - remove `stale' data
        LsShmHash::iteroffset iterOff;
        if (arg == MC_BINCMD_SET)
        {
            if ((iterOff = pConn->getHash()->
                    findIteratorWithKey(m_hkey,&m_parms)).m_iOffset != 0)
                pConn->getHash()->eraseIterator(iterOff);
        }
        LS_NOTICE("Binary data is too long\n");
        return MC_BINSTAT_E2BIG;
    }
...
}

What does the !chkItemSize(m_parms.val.len) check actually do, and what can I do to fix this? The log is growing every minute. I can log-rotate if I want, but there must be a problem with the setup, or something else.

Let me know if I need to provide anything else.

exetico commented Oct 30, 2020

I can see it's using this:

inline bool chkItemSize(uint32_t size)

And it's using:

pParms->m_iValMaxSz = getReplConf()->getValMaxSz();

Overall, it's defined here, right?

#define DEF_MEMMAXSZ 0

m_valMaxSz = ptr ? atoi(ptr) : DEF_VALMAXSZ;

So, what's going on :-D ?
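
My guess is that the check boils down to something like this (just my own sketch pieced together from the snippets above, not the actual source):

// Sketch (my assumption): the incoming value size is compared against
// the configured maximum (Cached.ValMaxSz, i.e. m_iValMaxSz). With a
// plain "size <= max" comparison, a configured maximum of 0 would
// reject every value instead of disabling the limit - which would
// explain the constant notices.
inline bool chkItemSize(uint32_t size)
{
    return (size <= m_parms.m_iValMaxSz);
}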

exetico commented Oct 30, 2020

By checking the documentation, I found this.

Cached.ValMaxSz

The maximum size of any individual entry in the cache. 0 disables the limit, the default is 1000000.

What would the impact be if I bump it to ten times that value, or maybe something else?

I hope someone is able to guide me.


Also, maybe it's a good idea to add a hint to the log message that DEF_VALMAXSZ ~ Cached.ValMaxSz can be adjusted in the config, and that the notice is related to that setting.

exetico commented Nov 2, 2020

I've changed it to 10000000 (adding a 0 to the end), but I would still like to know why this became a problem in the first place.
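
For reference, this is the kind of line I mean in node.conf (assuming the key from the documentation is used verbatim):

#Cached.ValMaxSz: max size of a single cache entry; default 1000000, 0 is documented to disable the limit
Cached.ValMaxSz=10000000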

exetico commented Nov 12, 2020

I trashed lsmcd and replaced it with memcached. Way better performance and caching numbers. I'll close this one... I would really like to use the default option, but without any answers, it's hard to do that :-D

Thanks for a great product, anyway! 👍

exetico closed this as completed Nov 12, 2020

rperper (Collaborator) commented Nov 12, 2020

Hi,
I'm so sorry that we missed your earlier messages. This is a terrible oversight on our side, and we'll work to address our approach to monitoring messages in the GitHub environment.

Yes, this is a bug: 0 should mean that the limit is not applied, and the code will be updated shortly to reflect that. You should see an update here today.
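
In other words, the size check should treat 0 as "no limit", roughly along these lines (a sketch of the intended behavior, not the final patch):

// Intended behavior (sketch): a configured maximum of 0 disables the
// size check entirely instead of rejecting every value.
inline bool chkItemSize(uint32_t size)
{
    return ((m_parms.m_iValMaxSz == 0) || (size <= m_parms.m_iValMaxSz));
}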

If you decide to use lsmcd again, please feel free to contact me directly via email and we can avoid this kind of issue in the future.

Sorry once again, and be sure that your voice is heard!

Bob Perper
rperper@litespeedtech.com

exetico commented Nov 12, 2020

Prioritizing is always a hard thing to master from time to time. No hard feelings from my end. Maybe I could give it another try.

Lsmcd reported 'ERROR' for all commands I sent to it, and therefore we switched to memcached.

Should there be a performance increase from using lsmcd instead of memcached? It won't take long to upgrade and switch back 👍

rperper (Collaborator) commented Nov 12, 2020

There should be little performance difference between memcached and lsmcd. There are, however, a number of advantages to lsmcd:

  • Supports SASL security verification
  • Supports user separation of data if using SASL.
  • Designed to scale by using multiple slices.
  • Persists data after a shutdown/restart, as it stores its cache in shared memory.

exetico commented Nov 13, 2020

Hi,

I've now updated to the latest version (compiled the code, made sure it's actually updated, and so on...)

But it keeps reporting ERROR if I connect to the socket and try something like stats:

nc -U /tmp/lsmcd.sock
stats
ERROR

The service looks fine (not enabled yet, because I need to know if it's now working as it should):

# systemctl status lsmcd -l
● lsmcd.service - LSB: lsmcd
   Loaded: loaded (/etc/rc.d/init.d/lsmcd; bad; vendor preset: disabled)
   Active: active (running) since Fri 2020-11-13 15:28:15 CET; 15s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 71095 ExecStop=/etc/rc.d/init.d/lsmcd stop (code=exited, status=0/SUCCESS)
  Process: 71112 ExecStart=/etc/rc.d/init.d/lsmcd start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/lsmcd.service
           ├─71131 ./bin/lsmcd -f /usr/local/lsmcd/conf/node.conf
           ├─71133 ./bin/lsmcd                                   
           ├─71134 ./bin/lsmcd                                   
           ├─71135 ./bin/lsmcd                                   
           ├─71136 ./bin/lsmcd                                   
           └─71137 ./bin/lsmcd                                   

Nov 13 15:28:14 centos-16gb-fsn1-2 systemd[1]: Starting LSB: lsmcd...
Nov 13 15:28:15 centos-16gb-fsn1-2 lsmcd[71112]: start lsmcd                                 [        OK        ]
Nov 13 15:28:15 centos-16gb-fsn1-2 systemd[1]: Started LSB: lsmcd.

And the config looks normal, too.

What can that be? That part of the problem was also present before I switched our system to memcached.

exetico commented Nov 13, 2020

Are there any issues with our (almost default) configuration?

Repl.HeartBeatReq=30
Repl.HeartBeatRetry=3000
Repl.MaxTidPacket=2048000
Repl.GzipStream=YES
Repl.LbAddrs=127.0.0.1:12340
Repl.ListenSvrAddr=127.0.0.1:12340
REPL.DispatchAddr=127.0.0.1:5501
RepldSockPath=/tmp/repld.usock
CACHED.PRIADDR=127.0.0.1:11000

#CACHED.ADDR=127.0.0.1:11211
CACHED.ADDR=UDS:///tmp/lsmcd.sock
#default is 8, it can be bigger depending on cache data amount
Cached.Slices=8
Cached.Slice.Priority.0=100
Cached.Slice.Priority.1=100
Cached.Slice.Priority.2=100
Cached.Slice.Priority.3=100
Cached.Slice.Priority.4=100
Cached.Slice.Priority.5=100
Cached.Slice.Priority.6=100
Cached.Slice.Priority.7=100

Cached.ShmDir=/dev/shm/lsmcd
Cached.UseSasl=false
Cached.DataByUser=true
Cached.Anonymous=false
#Cached.SaslDB=/etc/sasllsmcd

##this is the global setting, no need to have per slice configuration.
User=nobody
Group=nobody
#depends CPU core
CachedProcCnt=4
CachedSockPath=/tmp/cached.usock.
TmpDir=/tmp/lsmcd
LogLevel=notice
#LogLevel=dbg_medium
LogFile=/tmp/lsmcd.log

rperper (Collaborator) commented Nov 13, 2020

Hi,
Interesting, I've not tried the nc method to connect to a UDS socket. Let me check that out...
Thanks.

Bob

rperper (Collaborator) commented Nov 13, 2020

You just need one additional parameter on nc:
nc -U /tmp/lsmcd.sock -C
Let me know how that works out for you.
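
For the record, -C makes nc send CRLF line endings, which the memcached text protocol expects; without it the server answers ERROR. If your netcat build lacks -C, something along these lines should also work (a sketch, assuming the OpenBSD netcat variant with -U and -q):

# send the command with explicit CRLF, wait a second for the reply
printf 'stats\r\n' | nc -U /tmp/lsmcd.sock -q 1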

Bob Perper
rperper@litespeedtech.com

exetico commented Nov 13, 2020

I guess I had missed that parameter after using memcached. The equivalent socat command is socat - UNIX-CONNECT:/tmp/lsmcd.sock,crnl (I just checked what the -C parameter does).

Thanks for the help again, rperper :-) Everything is now working as expected. Clearing the shared memory files was also required.

For people who find this issue in the future, here are a few notes to help this issue turn up in search, and the solution, too.

After updating the lsmcd service, the following warning was reported:
2020-11-13 16:12:58.914 [WARN] [__root] [SHM] [-1-13:0xd03150] alloc2 cross large page boundary, offset: FD000, size: 15392, split to 12288/3104 release

To fix this, I stopped lsmcd (systemctl stop lsmcd), then removed the shared memory files used by lsmcd (sudo rm -rf /dev/shm/lsmcd), and finally started lsmcd again (systemctl start lsmcd). Use sudo if you need it... :-)
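
In short (the shared memory path comes from Cached.ShmDir in node.conf, so adjust it if yours differs):

# stop the service, clear the persisted shared-memory cache, start it again
sudo systemctl stop lsmcd
sudo rm -rf /dev/shm/lsmcd
sudo systemctl start lsmcd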

And oh:
Try to connect to the service afterwards. That's done with nc -U /tmp/lsmcd.sock -C or socat - UNIX-CONNECT:/tmp/lsmcd.sock,crnl if it listens on a UDS. If lsmcd is configured to listen on TCP, use telnet 127.0.0.1 11211. Type stats, and check that everything is running as expected.

The output should look something like this:

stats
STAT pid 76627
STAT version 1.0.0
STAT pointer_size 64
STAT rusage_user 1.191186
STAT rusage_system 1.749438
STAT cmd_get 90260
STAT cmd_set 17598
STAT cmd_flush 0
STAT cmd_touch 0
STAT get_hits 75903
STAT get_misses 14357
STAT delete_misses 22
STAT delete_hits 74
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT touch_hits 0
STAT touch_misses 0
STAT auth_cmds 0
STAT auth_errors 0
END
