Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assumption made about buffer header alignment #627

Open
Paul-C-Anagnostopoulos opened this issue Aug 6, 2010 · 7 comments
Open

Assumption made about buffer header alignment #627

Paul-C-Anagnostopoulos opened this issue Aug 6, 2010 · 7 comments

Comments

@Paul-C-Anagnostopoulos
Copy link
Contributor

Various memory management routines (e.g., gc_ms_allocate_buffer_storage) assume that the size of a buffer header is equal to the size of a pointer. This is probably true throughout the system as it stands, but those same routines take pains not to make that assumption in other places.

Here is a line from the above routine:

    Buffer_buflen(buffer)   = new_size - sizeof (void *);

If the size of a pointer is less than the buffer header size, the value stored in buflen will be too big. new_size includes the entire size of the buffer header, which may include alignment padding in addition to the pointer.

Originally http://trac.parrot.org/parrot/ticket/1731

@ghost
Copy link

ghost commented Aug 6, 2010

On Thu, 5 Aug 2010, Parrot wrote:
>  Various memory management routines (e.g., `gc_ms_allocate_buffer_storage`)
>  assume that the size of a buffer header is equal to the size of a pointer.
>  This is probably true throughout the system as it stands, but those same
>  routines take pains not to make that assumption in other places.
>
>  Here is a line from the above routine:
>  {{{
>      Buffer_buflen(buffer)   = new_size - sizeof (void *);
>  }}}
>  If the size of a pointer is less than the buffer header size, the value
>  stored in `buflen` will be too big. `new_size` includes the entire size of
>  the buffer header, which may include alignment padding in addition to the
>  pointer.
I don't understand what you are saying.  A Buffer looks like this
(include/parrot/pobj.h):
typedef struct buffer_t {
    Parrot_UInt flags;
    void *     _bufstart;
    size_t     _buflen;
} Buffer;
Which part, specifically, is the "header"? Or are you referring to
something else?
--
    Andy Dougherty      doughera@lafayette.edu

@Paul-C-Anagnostopoulos
Copy link
Contributor Author

Sorry, confusing terminology. The buffer "header" is the information stored in front of the actual buffer memory. The Buffer object is the buffer "descriptor." Here's the new description from pobj.h:

/* A buffer descriptor object points to a buffer in a Memory_Block.
   The buffer includes a header, but _bufstart points to the data
   portion. Here is how it works:
    Buffer descriptor                     buffer
   +-------------------+                 +------------------------+
   |       flags       |                 |  (possible padding)    | }
   +-------------------+                 +---------------------+--+  > header
   |      _bufstart    | ------+         |    *Memory_Block    |fl| }
   +-------------------+       |         +---------------------+--+
   |      _buflen      |       +-------> |    data portion        |
   +-------------------+                 |                        |
                                         ~                        ~
                                         |                        |
                                         +------------------------+
   The buffer header consists of possible padding and a pointer to the
   Memory_Block containing the buffer. There are two flags in the low-order
   bits of the pointer (see string.h). Padding is only required if the
   alignment of the data portion is higher than that of a pointer.
   This was not the case as of 8/2010.
*/

@ghost
Copy link

ghost commented Aug 7, 2010

Replying to Paul C. Anagnostopoulos:

Sorry, confusing terminology. The buffer "header" is the information stored in front of the actual buffer memory. The Buffer object is the buffer "descriptor."

Ah, I see. Thanks for the new description. I had hoped that with the new memory organization, that hidden flag stuck before _bufstart could have been eliminated. I have never been a fan of that trick, and prefer to tell the compiler explicitly what I'm doing so it automatically gets all the alignment correct.

Does every Buffer have that Memory_Block element? If so, would it make sense to just explicitly list it in the Buffer "descriptor?"

Padding is only required if the alignment of the data portion is higher than that of a pointer. This was not the case as of 8/2010.

It depends on how the memory blocks are obtained and carved up. On 32-bit SPARC, doubles should be aligned at 8-byte boundaries, but pointers are only 4 bytes. I'm not sure how parrot obtains and carves up space for those buffers.

@Paul-C-Anagnostopoulos
Copy link
Contributor Author

Now that I think about my terminology, it runs contrary to the terminology used elsewhere. I'll fix that in the comments.

I believe that every memory buffer has the Memory_Block in its header, although I'm not positive. In order to move it to the Buffer object, we would have to do a careful analysis to see if any functions grab the Memory_Block directly from a memory buffer rather than via its Buffer object.

Buffers are currently aligned using WORD_ALIGN_1, which is based on the size of a pointer. That's why this problem won't arise now. But there is BUFFER_ALIGN_1 lurking around and it would align on 8-byte boundaries if it were used. This is odd, since a comment claims that buffers are aligned for FLOATVALs. Either the comment is wrong or I don't understand what is going on.

@Paul-C-Anagnostopoulos
Copy link
Contributor Author

The Buffer descriptor is indeed called the header, so I'll now refer to the stuff preceding the actual buffer memory as the prolog.

@Paul-C-Anagnostopoulos
Copy link
Contributor Author

We have two options for fixing the faulty assumption described in the original ticket.

One approach is to simply declare by fiat that the size of the buffer prolog will always equal the size of a pointer. This means that the buffer data area will always be aligned on the same alignment as a pointer; there will be no possibility of any other alignment.

The second approach is actually to fix the problem. This means that the calculation of the buffer alignment will become more complex, which affects about a dozen functions in three or four files.

Which approach should we take?

@ghost
Copy link

ghost commented Aug 13, 2010

Replying to Paul C. Anagnostopoulos:

Which approach should we take?

It depends on what buffers are supposed to be able to store. If they are supposed to be able to store arbitrary items (e.g. FLOATVALS, or structures) then they must be aligned properly for those structures, which may require padding. If they are not supposed to be able to store arbitrary items, then less strict alingment may be appropriate.

I was not able to find any documentation for what they are supposed to be able to store. Many of the comments in the source about alignment date back to times when there was a "cache" element in most of the structures that could hold a FLOATVAL, and may not apply anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant