Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

msgpack serialisation : float is treated as 64bit float, not 32bit float. #2196

Closed
pfeatherstone opened this issue Jun 17, 2020 · 10 comments · Fixed by #2201
Closed

msgpack serialisation : float is treated as 64bit float, not 32bit float. #2196

pfeatherstone opened this issue Jun 17, 2020 · 10 comments · Fixed by #2201
Assignees
Labels
aspect: binary formats BSON, CBOR, MessagePack, UBJSON confirmed kind: bug release item: 🔨 further change solution: proposed fix a fix for the issue has been proposed and waits for confirmation
Milestone

Comments

@pfeatherstone
Copy link
Contributor

Example:

vector<float> buf(1024);
nlohmann::json j = {{"buf", buf}};
vector<uint8_t> msgpack = nlohmann::json::to_msgpack(j);
cout << msgpack.size() << endl;

I would expect this to print roughly ~ 5000, instead it is printing ~ 9000.
Going through GDB, it is treating the float type as 64bit.

@pfeatherstone
Copy link
Contributor Author

For CBOR, you apply range checking like so

if (static_cast<double>(j.m_value.number_float) >= static_cast<double>(std::numeric_limits<float>::lowest()) and
                            static_cast<double>(j.m_value.number_float) <= static_cast<double>((std::numeric_limits<float>::max)()) and
                            static_cast<double>(static_cast<float>(j.m_value.number_float)) == static_cast<double>(j.m_value.number_float))
                    {
                        oa->write_character(get_cbor_float_prefix(static_cast<float>(j.m_value.number_float)));
                        write_number(static_cast<float>(j.m_value.number_float));
                    }
                    else
                    {
                        oa->write_character(get_cbor_float_prefix(j.m_value.number_float));
                        write_number(j.m_value.number_float);
                    }

So if your float is in a 32bit float range, cast it to a 32bit float. You don't do that in the msgpack serialiser. Maybe it could? Or, in :

enum class value_t : std::uint8_t
{
    null,             ///< null value
    object,           ///< object (unordered set of name/value pairs)
    array,            ///< array (ordered collection of values)
    string,           ///< string value
    boolean,          ///< boolean value
    number_integer,   ///< number value (signed integer)
    number_unsigned,  ///< number value (unsigned integer)
    number_float,     ///< number value (floating-point)
    binary,           ///< binary array (ordered collection of bytes)
    discarded         ///< discarded by the parser callback function
};

we separate the number_float enum into 2 separate enums. Options...

@pfeatherstone
Copy link
Contributor Author

Interestingly, i wanted to see if the CBOR serialisation was slower due to bounds checking, and it isn't, it's nearly 3x faster. I used the following code:

vector<float> buf(1024*1024);
    nlohmann::json j = {{"buf", buf}};
    
    int ntests = 100;
    uint64_t now = get_timestamp_ns();
    for (int i = 0 ; i < ntests ; i++)
        vector<uint8_t> msgpack = nlohmann::json::to_msgpack(j);
    now = get_timestamp_ns() - now;
    cout << "msgpack serialisation in " << now*1e-6/ntests << "ms" << endl;
    
    now = get_timestamp_ns();
    for (int i = 0 ; i < ntests ; i++)
        vector<uint8_t> cbor = nlohmann::json::to_cbor(j);
    now = get_timestamp_ns() - now;
    cout << "cbor serialisation in " << now*1e-6/ntests << "ms" << endl;

Which outputs

msgpack serialisation in 28.2028ms
cbor serialisation in 11.4229ms

So maybe adding bounds checking isn't going to affect performance...

@nlohmann
Copy link
Owner

We only recently added the code to the CBOR serializer to use 32-bit floats where possible. Not using the same approach in MessagePack was an oversight. I think adding this should be possible.

@nlohmann
Copy link
Owner

@pfeatherstone Could you please try with 88a3701 whether everything works as expected? In your initial example, I got the output 5128.

@nlohmann nlohmann added the solution: proposed fix a fix for the issue has been proposed and waits for confirmation label Jun 17, 2020
@pfeatherstone
Copy link
Contributor Author

Yep all works fine. Thanks for the quick fix

@nlohmann
Copy link
Owner

I still need to adjust the test suites, as the change affects existing tests. But then this should be ready for the next release.

@pfeatherstone
Copy link
Contributor Author

pfeatherstone commented Jun 17, 2020

marvelous. So do you intend on postponing the release until you have a large enough set of new features, or will you do a quick minor release?

@pfeatherstone
Copy link
Contributor Author

I would also add the solution proposed by @KonanM in issue 2175 to the next release as it looks oven baked to me.

@nlohmann
Copy link
Owner

I'm not sure about the release cycle. My usual approach is to wait until it feels "done". There are several issues that are nearly ready, so I will not rush anything because of this issue.

@pfeatherstone
Copy link
Contributor Author

Cool

@nlohmann nlohmann added this to the Release 3.8.1 milestone Jun 18, 2020
nlohmann added a commit that referenced this issue May 1, 2022
* ⬆️ Doctest 2.4.7

* 👷 add CI step for ICPC

* 👷 add CI step for ICPC

* 👷 add CI step for ICPC

* ⬇️ downgrade to Doctest 2.4.6

* 👷 add CI step for ICPC

* 👷 add CI step for ICPC

* 👷 add CI step for ICPC

* 👷 add CI step for ICPC

* 👷 add CI step for ICPC

* 🔇 suppress warning #2196: routine is both "inline" and "noinline"

* Re-enable <filesystem> detection on ICPC

* Limit regression test for #3070 to Clang and GCC >=8.4

* Disable deprecation warnings on ICPC

* Disable regression test for #1647 on ICPC (C++20)

* Fix compilation failure of regression test for #3077 on ICPC

* Disable wstring unit test on ICPC

Fixes:
  error 913: invalid multibyte character sequence

* Add ICPC to README

Co-authored-by: Niels Lohmann <mail@nlohmann.me>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aspect: binary formats BSON, CBOR, MessagePack, UBJSON confirmed kind: bug release item: 🔨 further change solution: proposed fix a fix for the issue has been proposed and waits for confirmation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants