Problem: ZMQ_SOCKET_LIMIT and ZMQ_THREAD_PRIORITY have the same value #3362

Closed
UndeadKernel opened this issue Jan 14, 2019 · 20 comments

@UndeadKernel

UndeadKernel commented Jan 14, 2019

Issue description

test_ctx_options fails and I'm unable to identify the reasons why.
In particular, the part of the test that fails, located in test_ctx_options.cpp, is the second assert in:

    if (is_allowed_to_raise_priority ()) {
        rc = zmq_ctx_set (
          ctx_, ZMQ_THREAD_PRIORITY,
          1 /* any positive value different than the default will be ok */);
        assert (rc == 0);
        rc = zmq_ctx_get (ctx_, ZMQ_THREAD_PRIORITY);
        assert (rc == 1);
    }

I manually checked and the call to zmq_ctx_set (ctx_, ZMQ_THREAD_PRIORITY, 1) is successful, returning 0. However, the call to rc = zmq_ctx_get (ctx_, ZMQ_THREAD_PRIORITY) returns the value 65,535. At first I thought this value was a -1, indicating that the call to zmq_ctx_get failed.

After investigating the returned value, it seems that it is not a -1 as zmq_ctx_get returns an int and not a short int.

How could I debug this problem further? It may seem that the problem is in the way that priority is acquired?

Environment

  • libzmq version (commit hash if unreleased): 4.3.0
  • OS: Arch GNU/Linux
    The command ulimit -a shows:
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         unlimited
-m: resident set size (kbytes)      unlimited
-u: processes                       63741
-n: file descriptors                1048576
-l: locked-in-memory size (kbytes)  65536
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 63741
-q: bytes in POSIX msg queues       819200
-e: max nice                        40
-r: max rt priority                 0
-N 15:                              unlimited

What's the actual result?

Command line output of the test:

PASS: tests/test_router_handover
./config/test-driver: line 107: 21636 Aborted                 (core dumped) "$@" > $log_file 2>&1
FAIL: tests/test_ctx_options
PASS: tests/test_unbind_wildcard

Contents of the test log:

lt-test_ctx_options: tests/test_ctx_options.cpp:116: void test_ctx_thread_opts(void*): Assertion `rc == 1' failed.
@bluca
Member

bluca commented Jan 14, 2019

-1 is the default, so it can't really be a coincidence. But I'm not really sure how it could possibly fail to set the parameter and not return an error... the code is pretty straightforward: 22c3ecc

@UndeadKernel
Author

@bluca, it's not really returning -1; instead, it's 65,535. If the return value of zmq_ctx_get was an unsigned short int, then I'd interpret that result as a -1. However, the function returns an int.

It's also not failing to set the priority. It fails when it retrieves it with zmq_ctx_get.

@bluca
Member

bluca commented Jan 14, 2019

How are you getting 65,535? My point is that the only possible values are 1, which is the setting in the test, and -1, which is the default. It can't really be anything else.

@UndeadKernel
Author

UndeadKernel commented Jan 14, 2019

Given this code fragment modified in test_ctx_options.cpp:

        int rc;
        ...
        rc = zmq_ctx_set (ctx_, ZMQ_THREAD_PRIORITY, 10);
        rc = zmq_ctx_get (ctx_, ZMQ_THREAD_PRIORITY);
        printf("%i \n", rc);

no matter which value I put instead of 10 in the set call, the value printed by printf for the get call is always 65,535.

I however understand what you mean @bluca. The get call is failing and the value it returns should technically be a -1 (as indicated in the documentation). Though, with something like:

        int rc;
        ...
        rc = zmq_ctx_set (ctx_, ZMQ_THREAD_PRIORITY, -10);
        printf("%i \n", rc);

the value returned is -1. Notice that I try to set the priority to -10 and the call does fail.

I don't know why 65,535 is the returned value instead of -1.

Do you have any clue what could possibly be changed in the system to make this test pass? I have another Arch GNU/Linux system (fully upgraded) where the test passes.

@bluca
Member

bluca commented Jan 14, 2019

I have no idea, sorry. I have never seen that anywhere, on any platform. I honestly think there is something quite wrong with that system... as you can see the code is as simple and straightforward as it can be.

@sigiesec
Member

Maybe there is some strange ABI difference between libzmq and the test, so the calling conventions mismatch. Can you attach the build log showing the compiler options of both the libzmq and test builds?

@UndeadKernel
Author

UndeadKernel commented Jan 15, 2019

After further tests, I found that in the system where things were "working", the nice level could not be changed. Therefore, the test was being skipped due to:

is_allowed_to_raise_priority()

on line 115 of test_ctx_options.cpp.

After allowing any program to raise the nice level to any value, test_ctx_options consistently fails in Arch GNU/Linux.

So, to reproduce the error you just need to give processes the right to elevate the niceness of a process to any value. In /etc/systemd/system.conf I have the line:

DefaultLimitNICE=40

@bluca
Member

bluca commented Jan 15, 2019

...oh boy. Spot the error...

#define ZMQ_SOCKET_LIMIT 3
#define ZMQ_THREAD_PRIORITY 3

https://github.com/zeromq/libzmq/blob/master/include/zmq.h#L214

@bluca
Member

bluca commented Jan 15, 2019

That used to work because ZMQ_SOCKET_LIMIT was get-only and ZMQ_THREAD_PRIORITY was set-only. Then I added the getter for ZMQ_THREAD_PRIORITY, which actually returns ZMQ_SOCKET_LIMIT which is... 65535. Bingo.

@bluca changed the title from "Test test_ctx_options, related to priority, fails" to "Problem: ZMQ_SOCKET_LIMIT and ZMQ_THREAD_PRIORITY have the same value" on Jan 15, 2019
@bluca
Member

bluca commented Jan 15, 2019

We can either un-document and remove the test for the getter of ZMQ_THREAD_PRIORITY, or change its value but keep the setter working with the older value for backward-compatibility.
The getter has always been broken, so technically it's not a backward-incompatible change...

@UndeadKernel
Author

Bingo!
I'm glad that I reported this mystical error instead of flipping my table because I could not figure out where that 65,535 was coming from.
Started to believe that I had forgotten all those years of C++ programming :'-(

@UndeadKernel
Author

Would there be too many problems if ZMQ_THREAD_PRIORITY is assigned the value 10? Repeating values causes confusion and makes the library hard to debug.

@bluca
Member

bluca commented Jan 15, 2019

It's the second solution I proposed and technically it's an ABI breakage

@UndeadKernel
Author

Assuming that library users directly access ZMQ_THREAD_PRIORITY instead of manually typing 3, there shouldn't be problems in breaking the ABI. Or would there be?

@bluca
Member

bluca commented Jan 15, 2019

With a backward-compatible layer in the library, as I mentioned, it's fine for the setter, but the getter will never work without a rebuild.

@bluca
Member

bluca commented Jan 15, 2019

Actually the value can't be changed - it would be backward-compatible but not forward-compatible. Applications linked against the new version wouldn't work with the old version.

I'll un-document the getter and remove this test.

@pi1ot

pi1ot commented Sep 26, 2019

How can we check or verify the priority of the ZMQ I/O threads if we can't get ZMQ_THREAD_PRIORITY?

@bluca
Member

bluca commented Sep 26, 2019

If it wasn't changed, then it's the default as documented

@pi1ot

pi1ot commented Sep 26, 2019

> If it wasn't changed, then it's the default as documented

I tried to change the ZMQ I/O threads' scheduling policy to SCHED_RR and the priority to 99, running as root, but the PRI value of all ZMQbg/n threads is always 20.

@bluca
Member

bluca commented Sep 26, 2019

Ensure you are setting those before creating any socket.
