Python ctypes BigEndianStructure bitfield assignment misbehavior in Linux #64828

Closed
AlanNing mannequin opened this issue Feb 14, 2014 · 6 comments
Labels
topic-ctypes type-bug An unexpected behavior, bug, or error

Comments

@AlanNing
Mannequin

AlanNing mannequin commented Feb 14, 2014

BPO 20629
Nosy @amauryfa, @abalkin, @zhangyangyu

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2016-11-09.07:17:33.818>
created_at = <Date 2014-02-14.21:43:22.973>
labels = ['ctypes', 'type-bug']
title = 'Python ctypes BigEndianStructure bitfield assignment misbehavior in Linux'
updated_at = <Date 2016-11-09.07:17:33.816>
user = 'https://bugs.python.org/AlanNing'

bugs.python.org fields:

activity = <Date 2016-11-09.07:17:33.816>
actor = 'xiang.zhang'
assignee = 'none'
closed = True
closed_date = <Date 2016-11-09.07:17:33.818>
closer = 'xiang.zhang'
components = ['ctypes']
creation = <Date 2014-02-14.21:43:22.973>
creator = 'Alan.Ning'
dependencies = []
files = []
hgrepos = []
issue_num = 20629
keywords = []
message_count = 6.0
messages = ['211241', '211648', '250491', '280371', '280374', '280377']
nosy_count = 7.0
nosy_names = ['amaury.forgeotdarc', 'belopolsky', 'mrolle', 'Alan.Ning', 'rgaddi', 'xiang.zhang', 'Brian Trotter']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue20629'
versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

@AlanNing
Mannequin Author

AlanNing mannequin commented Feb 14, 2014

I am seeing a strange issue with bitfields and BigEndianStructure under Ubuntu 12.04 x64, Python 2.7.3.

This bug only occurs if I define my bitfields using c_uint. If I switch to c_ushort, it goes away.

Below is a short script that demonstrates the problem. It defines two unions, BitField1U and BitField2U, each combining a 4-byte array with a bitfield structure.

Under Linux, setting fields.a = 1 twice changes the underlying byte array differently each time: the second, identical assignment sets an extra bit. This does not happen on Windows.

Output: Ubuntu 12.04x64 Python 2.7.3
20000000
20000020 <- problem
20000000
20000000

Output: Windows 7 x64 Python 2.7.3
20000000
20000000
20000000
20000000

This bug was originally reported as a question on Stack Overflow.
http://stackoverflow.com/questions/21785874/python-ctypes-bitfield-windows-vs-linux

Source code:

import ctypes
import binascii

class BitField1(ctypes.BigEndianStructure):
    _pack_ = 1
    _fields_ = [
    ('a', ctypes.c_uint, 3),
    ('b', ctypes.c_uint, 1),
    ]

class BitField1U(ctypes.Union):
    _pack_ = 1
    _fields_ = [("fields", BitField1), 
        ("raw_bytes", ctypes.c_ubyte * 4)]

class BitField2(ctypes.BigEndianStructure):
    _pack_ = 1
    _fields_ = [
    ('a', ctypes.c_ushort, 3),
    ('b', ctypes.c_ushort, 1),
    ]

class BitField2U(ctypes.Union):
    _pack_ = 1
    _fields_ = [("fields", BitField2), 
        ("raw_bytes", ctypes.c_ubyte * 4)]

def printBytes(raw_bytes) :
    ba = bytearray(raw_bytes)
    print(binascii.hexlify(ba))
    
def printFields(fields) :
    # Print both bitfield values on one line (works on Python 2 and 3).
    print("%d %d" % (fields.a, fields.b))

b1 = BitField1U()
b2 = BitField2U()

# Simply set fields.a = 1 twice, and notice how the raw_bytes changes.

b1.fields.a = 1
printBytes(b1.raw_bytes)
b1.fields.a = 1
printBytes(b1.raw_bytes)

b2.fields.a = 1
printBytes(b2.raw_bytes)
b2.fields.a = 1
printBytes(b2.raw_bytes)
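
For reference, the expected value 20000000 can be derived by hand. The sketch below is not part of the original report. It assumes Python 3, that c_uint is 32 bits and c_ushort is 16 bits wide, and that a big-endian structure places the first bitfield in the most significant bits of its storage unit. The helper name expected_raw is hypothetical.

```python
import binascii

def expected_raw(value, width, unit_bits, total_bytes=4):
    """Bytes expected when `value` occupies the top `width` bits of a
    big-endian storage unit of `unit_bits` bits, zero-padded to total_bytes."""
    word = value << (unit_bits - width)          # move the field to the MSB end
    unit = word.to_bytes(unit_bits // 8, "big")  # big-endian byte order
    return unit.ljust(total_bytes, b"\x00")      # the union is 4 bytes wide

print(binascii.hexlify(expected_raw(1, 3, 32)))  # c_uint case
print(binascii.hexlify(expected_raw(1, 3, 16)))  # c_ushort case
```

Under these assumptions both cases give the same four bytes, which matches the Windows output and the first Linux line. Only the second c_uint assignment on Linux deviates.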

@AlanNing AlanNing mannequin added the topic-ctypes label Feb 14, 2014
@rgaddi
Mannequin

rgaddi mannequin commented Feb 19, 2014

I was just working on similar things, and found the same problem. I can confirm failure on both Python 2.7.4 and Python 3.3.1 running on 64-bit Linux, and that the Windows builds do not have this problem.

My code:

from __future__ import print_function
from ctypes import *
from itertools import product

bases = (BigEndianStructure, LittleEndianStructure)
packs = (True, False)
basetypes = ( (c_uint,16), (c_ushort,16), (c_uint,32) )

print("Base                     Basetype  pack  high  low   size  bytes")
for basetype, base, pack in product(basetypes, bases, packs):
    fields = [
        ('high', basetype[0], basetype[1]),
        ('low', basetype[0], basetype[1]),
    ]
    cls = type('', (base,), {'_pack_' : pack, '_fields_' : fields})
    
    x = cls(high = 0x1234, low = 0x5678)
    
    bacls = c_uint8 * sizeof(x)
    ba = bacls.from_buffer(x)
    s = ''.join('{0:02X}'.format(b) for b in ba)
    
    k = '*' if (x.high != 0x1234 or x.low != 0x5678) else ''
    
    report = "{name:25s}{basetype:10s}{pack:4d}  {high:04X}  {low:04X}  {size:4d}  {s}{k}".format(
        name = base.__name__,
        high = x.high,
        low = x.low,
        size = sizeof(x),
        pack = pack,
        basetype = basetype[0].__name__,
        s = s,
        k = k
    )
    print(report)
        
My results:
Base                     Basetype  pack  high  low   size  bytes
BigEndianStructure       c_uint       1  0000  5678     4  00005678*
BigEndianStructure       c_uint       0  0000  5678     4  00005678*
Structure                c_uint       1  1234  5678     4  34127856
Structure                c_uint       0  1234  5678     4  34127856
BigEndianStructure       c_ushort     1  1234  5678     4  12345678
BigEndianStructure       c_ushort     0  1234  5678     4  12345678
Structure                c_ushort     1  1234  5678     4  34127856
Structure                c_ushort     0  1234  5678     4  34127856
BigEndianStructure       c_uint       1  1234  5678     8  0000123400005678
BigEndianStructure       c_uint       0  1234  5678     8  0000123400005678
Structure                c_uint       1  1234  5678     8  3412000078560000
Structure                c_uint       0  1234  5678     8  3412000078560000

On Python 3, BigEndianStructure sets either the high or the low field, seemingly at random from one execution to the next, but never both. On Python 2 I have always seen high = 0, low = 0x5678.
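
The read-back comparison behind the `*` marker in the table can be isolated into a small helper, which may be handy for checking whether a given interpreter is affected. This is only a sketch, assuming Python 3. The helper name roundtrip_ok and the class name Probe are made up for illustration, and `_pack_` is left out to keep the probe minimal.

```python
import ctypes

def roundtrip_ok(base, ctype, width, high=0x1234, low=0x5678):
    """Return True if both bitfields read back the values they were given."""
    fields = [("high", ctype, width), ("low", ctype, width)]
    cls = type("Probe", (base,), {"_fields_": fields})
    x = cls(high=high, low=low)
    return x.high == high and x.low == low

for base in (ctypes.BigEndianStructure, ctypes.LittleEndianStructure):
    for ctype, width in ((ctypes.c_uint, 16), (ctypes.c_ushort, 16), (ctypes.c_uint, 32)):
        # False marks a combination whose fields did not survive the round trip.
        print(base.__name__, ctype.__name__, width, roundtrip_ok(base, ctype, width))
```

Going by the table above, an affected build would be expected to report False for BigEndianStructure with 16-bit c_uint fields and True for the other combinations.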

@BrianTrotter
Mannequin

BrianTrotter mannequin commented Sep 11, 2015

I am experiencing the same bug with c_uint32 bitfields inside BigEndianStructure in Python 3.4.0 on Ubuntu 14.04.3 x64. No problem in Windows 7 x64. As shown in the example below, the fourth byte is the only one that is written correctly. This is a rather significant error.

Source:

import ctypes

class BitFieldsBE(ctypes.BigEndianStructure):
    _pack_ = 1
    _fields_ = [
        ('a', ctypes.c_uint32, 8),
        ('b', ctypes.c_uint32, 8),
        ('c', ctypes.c_uint32, 8),
        ('d', ctypes.c_uint32, 8)]

class BitFieldsLE(ctypes.LittleEndianStructure):
    _pack_ = 1
    _fields_ = [
        ('a', ctypes.c_uint32, 8),
        ('b', ctypes.c_uint32, 8),
        ('c', ctypes.c_uint32, 8),
        ('d', ctypes.c_uint32, 8)]

be = BitFieldsBE()
le = BitFieldsLE()

def prints(arg):
    print(arg)
    print('be',bytes(be))
    print('le',bytes(le))

prints('00000000')
be.a = 0xba; be.b = 0xbe; be.c = 0xfa; be.d = 0xce
le.a = 0xba; le.b = 0xbe; le.c = 0xfa; le.d = 0xce
prints('babeface')
be.a = 0xde; be.b = 0xad; be.c = 0xbe; be.d = 0xef
le.a = 0xde; le.b = 0xad; le.c = 0xbe; le.d = 0xef
prints('deadbeef')

Output:

00000000
be b'\x00\x00\x00\x00'
le b'\x00\x00\x00\x00'
babeface
be b'\x00\xfa\x00\xce'
le b'\xba\xbe\xfa\xce'
deadbeef
be b'\x00\xbe\x00\xef'
le b'\xde\xad\xbe\xef'
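
Because every field in this example is exactly one byte wide, the expected big-endian bytes can be cross-checked without ctypes. The sketch below uses the struct module and assumes that each 8-bit field maps to one whole byte in declaration order. It is an independent check, not the ctypes implementation.

```python
import struct

# Assumption: fields a, b, c, d each fill one byte, in declaration order.
expected_first = struct.pack(">4B", 0xba, 0xbe, 0xfa, 0xce)
expected_second = struct.pack(">4B", 0xde, 0xad, 0xbe, 0xef)

print("expected be", expected_first)
print("expected be", expected_second)
```

Under that assumption the 'be' lines should match the 'le' lines byte for byte, which is not what the output above shows.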

@BrianTrotter BrianTrotter mannequin added the type-bug An unexpected behavior, bug, or error label Sep 11, 2015
@mrolle
Mannequin

mrolle mannequin commented Nov 9, 2016

I see a similar problem with Python 2.7.8 on Cygwin.

My example is:

Python 2.7.8 (default, Jul 25 2014, 14:04:36)
[GCC 4.8.3] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import *
>>> class C (BigEndianStructure): _fields_ = (('rva', c_uint, 31), ('fl', c_uint, 1))
...
>>> x = C()
>>> buffer(x)[:]; x.rva; x.fl
'\x00\x00\x00\x00'
0L
0L
>>> x.rva = 256
>>> buffer(x)[:]; x.rva; x.fl
'\x00\x00\x02\x00'
256L
0L
>>> x.rva = 256
>>> buffer(x)[:]; x.rva; x.fl
'\x00\x00\x02\x00'
256L
0L
>>> x.fl = 1
>>> buffer(x)[:]; x.rva; x.fl
'\x00\x02\x00\x01'
65536L
1L
>>> x.fl = 1
>>> buffer(x)[:]; x.rva; x.fl
'\x01\x00\x02\x01'
8388864L
1L
>>> x.fl = 1
>>> buffer(x)[:]; x.rva; x.fl
'\x01\x02\x00\x01'
8454144L
1L
>>> x.fl = 1
>>> buffer(x)[:]; x.rva; x.fl
'\x01\x00\x02\x01'
8388864L
1L
>>> x.rva = 256
>>> buffer(x)[:]; x.rva; x.fl
'\x00\x00\x02\x01'
256L
1L
>>> x.rva = 256
>>> buffer(x)[:]; x.rva; x.fl
'\x00\x00\x02\x00'
256L
0L
>>> x.rva = 256
>>> buffer(x)[:]; x.rva; x.fl
'\x00\x00\x02\x00'
256L
0L

I'm disappointed that this bug hasn't been fixed after two years!

I understand that ctypes might not be portable across different
platforms. However, the above behavior is clearly wrong.

BTW, I also have Python 2.7.8 (default, Jun 30 2014, 16:08:48) [MSC v.1500 64 bit (AMD64)] on win32, and this version works fine.

@mrolle
Mannequin

mrolle mannequin commented Nov 9, 2016

As a separate issue, I'd like to find an appropriate package,
other than ctypes, for interpreting data bytes in a consistently
defined manner, independent of the platform I'm running on.
The struct package is perfect where there are no bitfields
involved, i.e., where each item occupies whole bytes. But
it doesn't support packing/unpacking bitfields.

Actually, ctypes could fit the bill if it specified that bitfields are
allocated from MSB to LSB for BigEndianStructure, and from LSB to MSB
for LittleEndianStructure. That way it wouldn't matter, for instance,
whether a sequence of 4-bit fields was based on c_ubyte or c_ushort:
each pair of fields would be allocated to the next consecutive byte.
And if the platform's native compiler for some strange reason differs
from both the Big and Little rules, then Structure would follow the
platform compiler.
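
The allocation rules proposed here (MSB to LSB for big-endian data, LSB to MSB for little-endian data) can be sketched in pure Python with integer shifts, independent of ctypes and of the platform. This is only an illustration, assuming Python 3. The function names pack_bits and unpack_bits are hypothetical and do not refer to an existing package.

```python
def pack_bits(fields, total_bytes, msb_first=True):
    """Pack (width, value) pairs into `total_bytes` bytes.

    msb_first=True  fills a big-endian word from its most significant bit.
    msb_first=False fills a little-endian word from its least significant bit.
    """
    total_bits = total_bytes * 8
    word = 0
    offset = 0
    for width, value in fields:
        if value < 0 or value >> width:
            raise ValueError("value %r does not fit in %d bits" % (value, width))
        if offset + width > total_bits:
            raise ValueError("fields exceed %d bits" % total_bits)
        shift = total_bits - offset - width if msb_first else offset
        word |= value << shift
        offset += width
    return word.to_bytes(total_bytes, "big" if msb_first else "little")

def unpack_bits(data, widths, msb_first=True):
    """Inverse of pack_bits: return the field values as a list."""
    total_bits = len(data) * 8
    word = int.from_bytes(data, "big" if msb_first else "little")
    values = []
    offset = 0
    for width in widths:
        shift = total_bits - offset - width if msb_first else offset
        values.append((word >> shift) & ((1 << width) - 1))
        offset += width
    return values

# The rva/fl layout from the previous comment: 31 bits, then 1 bit.
raw = pack_bits([(31, 256), (1, 1)], 4)
print(raw, unpack_bits(raw, [31, 1]))
```

With rva = 256 and fl = 1 this gives the bytes 00 00 02 01, which is what the interpreter session in the previous comment would be expected to show after both assignments. Because the whole buffer is treated as a single integer, the result does not depend on any underlying storage type such as c_ubyte or c_ushort.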

@zhangyangyu
Member

This bug was fixed in bpo-23319. Recent Python 2.7 and Python 3.4+ releases should no longer show it.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022