New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should support uint8_t for std::vector and std::array #243
Comments
Thank you for sending the issue. I considered the issue. First, I'd like to let you know my design choice. There are two reasons that I mapped std::vector and std::array to BIN. The first reason is that the containers located on continuous memory. The second reason is that char is used as representing byte stream in the C++(11) standard library. For example, http://www.cplusplus.com/reference/ostream/basic_ostream/write/ uses char* even if C++11. The next topic is uint8_t. uint8_t is considered as an appropriate type to represent binary data. I think that mapping vector and array of uint8_t to BIN is reasonable. There is minor problem. std::uint8_t is an alias type of uint8_t. It's ok. uint8_t might be an alias type of unsigned char. In this case, std::vector is mapped to BIN implicitly. When I provide two overloads like http://melpon.org/wandbox/permlink/MskKRI45qbfiMrmI , it is ill-formed. To solve this problem, using SFINAE as follows? msgpack conversion from msgpack::object to a user type using operator>>. inline msgpack::object const& operator>> (msgpack::object const& o, std::vector<char>& v) So I need to write as follows: The final topic is user's expectation. As I wrote before, char is ok becase it is the same mannar as C++ standard library. uint8_t and unsigned char is also ok. It is implementation defined whether they are the same type or not. But if one is mapped to BIN, the other should be too.
How about int8_t and singed char? msgpack user might expect that they are mapped to ARRAY that have -128..127 values. To choose the mapping targets, we can use msgpack::type::raw_ref. #include <msgpack.hpp>
#include <vector>
#include <sstream>
int main() {
{
std::vector<int8_t> v { -1, 1, 2, 3 };
std::stringstream ss;
msgpack::pack(ss, v);
auto obj = msgpack::unpack(ss.str().data(), ss.str().size()).get();
assert(obj.type == msgpack::type::ARRAY);
}
{
std::vector<int8_t> v { -1, 1, 2, 3 };
std::stringstream ss;
msgpack::pack(ss, msgpack::type::raw_ref(reinterpret_cast<char*>(v.data()), v.size()));
auto obj = msgpack::unpack(ss.str().data(), ss.str().size()).get();
assert(obj.type == msgpack::type::BIN);
}
}
[NOTE]Discussion about reinterpret_cast: It is difficult to judge which is appropriate behavior because users could have different expectation. However, if they are mapped to ARRAY, the users can choose ARRAY or BIN using raw_ref wrapper. I can say the same about uint8_t and unsigned char. I think that they are more strongly expected to map to BIN than int8_t and signed char. |
Kudos to your thoroughness in analyzing the different aspects of this issue. I agree with all of your assessments, and the SFINAE solution seems like a good one. I wasn't aware of the raw_ref approach before, knowing about it definitely makes it easier to choose. I feel that BIN would be a good default for unsigned char and uint8_t, after reading your analysis I think that maybe the signed variants should remain as ARRAY for the reasons that you listed. Maybe an array_ref (analogous to raw_ref) could be provided for users who specifically want to store char/uchar/uint8_t as lists? |
Sorry for the late reply. I had worked for #249.
http://stackoverflow.com/questions/16260033/reinterpret-cast-between-char-and-stduint8-t-safe We need to do the similar check on the C++03. What is the good way to implement that? In the raw_ref, those kinds of casting problem is users responsibility ;) raw_ref(const char* p, uint32_t s) : size(s), ptr(p) {} |
BOOST_STATIC_ASSERT is a good one. But some of msgpack-c users don't want to external dependency. The implementation is pretty tricky https://github.com/boostorg/static_assert/blob/master/include/boost/static_assert.hpp |
This approach appears compelling for a compact pre-C++11 static assert: http://stackoverflow.com/a/14621998 |
Thank you for the comment. It looks good for me. I will restart implementing the PR to solve the issue. |
I just restarted implementing. And then I realized that all I need to do is implement I will implement |
std::vector<unsigned char> and std::array<unsigned char> are mapped to BIN. std::vector<uint8_t> and std::array<uint8_t> are mapped to BIN if uint8_t is the same type of unsigned char, otherwise mapped to ARRAY. Added array_ref. When client wraps BIN mapped types above with array_ref as msgpack::type::array_ref<std::vector<char> >, the type is mapped to ARRAY.
std::vector<unsigned char> and std::array<unsigned char> are mapped to BIN. std::vector<uint8_t> and std::array<uint8_t> are mapped to BIN if uint8_t is the same type of unsigned char, otherwise mapped to ARRAY. Added array_ref. When client wraps BIN mapped types above with array_ref as msgpack::type::array_ref<std::vector<char> >, the type is mapped to ARRAY.
Release 1.0.0 supports adapters for
std::vector<char>
andstd::array<char>
, however binary data is often represented asuint8_t
(orstd::uint8_t
in C++11).char
is defined as implementation-specific signed or unsigned integer type - see http://en.wikipedia.org/wiki/C_data_types - so while it's a single byte pretty much everywhere, the specification doesn't actually guarantee this.uint8_t
andint8_t
are specified more clearly, and thus a better choice for binary data.So while I think it's not much of a problem to keep the
char
-templated adapters in msgpack-c, I think the project should endorseuint8_t
as a better default and ship adapters for that type as well. Since msgpack-c doesn't have to choose between types, I suggest also including one forint8_t
so that all three major types are covered.The text was updated successfully, but these errors were encountered: