Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loan_sample() returns illegal oparation when using FastDDS-Gen 2.5.0 and fastrtps.so.2.3.4 and 2.10.0 #193

Closed
amdsobhy opened this issue Jun 20, 2023 · 11 comments

Comments

@amdsobhy
Copy link

amdsobhy commented Jun 20, 2023

loan_sample() returns RETCODE_ILLEGAL_OPERATION when using FastDDS-Gen 2.5.0, tested with fastrtps.so.2.3.4 and 2.10

Issue does not happen when using FastDDS-Gen 2.1.0

@amdsobhy amdsobhy changed the title loan_sample() returns illegal oparation when using FastDDS-Gen 2.5.0 and fastrtps.so.2.3.4 loan_sample() returns illegal oparation when using FastDDS-Gen 2.5.0 and fastrtps.so.2.3.4 and 2.10.0 Jun 21, 2023
@MiguelCompany
Copy link
Member

@amdsobhy Which IDL produces that behavior?

@amdsobhy
Copy link
Author

@MiguelCompany here is the IDL:

module DDS {

#define MAX_SIZE (1920 * 1200 * 3)

    struct Header
    {
        unsigned long long  eightbytes0;
        long                fourbytes1;
        long                fourbytes2;
        long                fourbytes3;
    };

    struct foo
    {
        Header data0;
        long fourbytes1;
        long fourbytes2;
        long fourbytes3;
        unsigned long fourbytes4;
        unsigned long fourbytes5;
        long fourbytes6;
        octet buffer[MAX_SIZE];
    };

};

@MiguelCompany
Copy link
Member

@amdsobhy The in-memory representation of that structure will not match the CDR representation. This implies that foo should not be considered plain and, as such, loan_sample() cannot be used.

This is due to the padding present between data0 and fourbytes1. If you want to make the type plain, you have several options:

  1. Change Header::eightbytes0 into unsigned long eightbytes0_high and unsigned long eightbytes0_low
  2. Change Header::eightbytes0 to the end of Header
  3. Add long _padding_; at the end of Header
  4. Put the fields of Header directly in foo

@amdsobhy
Copy link
Author

Thank you for the reply and the explanation. I thought that the generated code would handle such situation. Why doesn't cdr handle this situation so that the user does not have to worry about it?

How can one guarantee in future implementation that the in-memory representation matches CDR representation? What are the rules to garantee such matching?

I read that according to the standard padding is inserted into the struct to make it's size in multiples of 8 bytes. In my situation the header is 20 bytes so by adding another 4 bytes padding it becomes 24 bytes and a multiple of 8 and thus it fixes the problem of misalignment between cdr representation and memory representation, but what about the situation in point #3 you mentioned above? why does moving the eightbytes0 variable to the bottom of the Header struct solve the problem when the size of the struct is still the same?

@MiguelCompany
Copy link
Member

How can one guarantee in future implementation that the in-memory representation matches CDR representation? What are the rules to garantee such matching?

  1. Do not use string, sequences, maps, or unions
  2. If possible, avoid using nested structures. For your use-case, you could use inheritance, i.e. struct foo : Header
  3. If nested structures are required, put the field with the biggest alignment at the end
  4. Check the in-memory representation matches the one of a type without nested structures.

why does moving the eightbytes0 variable to the bottom of the Header struct solve the problem when the size of the struct is still the same?

I have prepared this example so you can see the differences in memory representation

@amdsobhy
Copy link
Author

amdsobhy commented Jun 28, 2023

Thank you for the great example. I thought the padding was inserted using fastcdr during serialization and not the compiler. From your example I see that this is the default padding inserted by the complier.

Would it be okay to modify the generated headers and include compiler directives to disable padding? Also why doesn't fastcdr handle the compiler padding during serialization and deserialization?

As far as I know compiler padding is different accross architectures, compilers and compiler versions, so what happens when two different machines are communicating over dds and each one of them has a different point of view of what the padding in the message look like? Shouldn't padding be abstracted from the reader and writer of the message through the middleware?

@nyanpasua
Copy link

same problem, is this a bug in ddsgen v2.5.0?

@MiguelCompany
Copy link
Member

@amdsobhy Sorry for my late reply. Let me try to explain how Fast DDS works behind the scenes when loan_sample is called.

When the DataWriter is created, the type support is invoked to query the maximum size of the serialized (i.e. CDR) payload.
That maximum size will include the size of the padding according to the CDR spec.
The DataWriter will then pre-allocate space for a certain number of those serialized payloads, depending on the ResourceLimits QoS.

The purpose of loan_sample is to return a pointer to one of those payloads, in order for it to be used as a pointer to the generated type. This way, CDR serialization is not performed when calling write with a pointer that was loaned.

So in order for that operation to be legal, the CDR representation should match the in-memory representation of the generated type. That implies that the CDR padding and the compiler padding should match.
This is what the is_plain method from the type support is responsible for.

The type generated by the IDL here would most probably have a padding different from the one mandated by the CDR spec.

On previous Fast DDS Gen versions, the is_plain method implementation relied just on the type of each field. This could lead to wrongly considering the type generated by the IDL here as plain.

On Fast DDS Gen 2.5.0, we included a mechanism to better implement is_plain. Apart from checking the type of each field, the generated code will check whether the offset of the last field + its length equals the length of the CDR serialization.

So, answering @nyanpasua , this is not a bug in v2.5.0. It is a bug-fix.

@MiguelCompany
Copy link
Member

As far as I know compiler padding is different accross architectures, compilers and compiler versions, so what happens when two different machines are communicating over dds and each one of them has a different point of view of what the padding in the message look like? Shouldn't padding be abstracted from the reader and writer of the message through the middleware?

When not using loans, CDR serialization and deserialization will take place. In this case, and the writer and the reader are abstracted with regards to padding.

@JLBuenoLopez
Copy link
Contributor

I am going to proceed and close this issue as it has already been answered.

@davidqin1986
Copy link

Thanks for @MiguelCompany 's explanation, but I can't agree your opinion. I think this modification is a kind of regression.

The original intention of idl definition is to facilitate users. The purpose of loan sample is to improve performance. But this ”is_ plain” implementation imposes so many limitations on users, and also decrease usage scenario. Don't stick to the CDR specification. After all, the zero copy implementation of fastdds is also not within the specifications of RTPS.

In addition, the development trend of modern C++ is the convenience of developers, otherwise there wouldn't be so many syntax sugars.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants