donadigo/bytefield

ByteField

A Python library for parsing and manipulating binary data through easily accessible Python properties, with a declarative style inspired by Django. The library is still in development. ByteField supports:

  • Variable length fields
  • Nested structures
  • Parsing only accessed fields

Quick example

ByteField lets you define a binary data layout declaratively; each declared field then maps to the underlying bytes:

from bytefield import *

class Header(ByteStruct):
    magic = StringField(length=5)
    length = IntegerField()
    array = ArrayField(shape=None, elem_field_type=IntegerField)
    floating = FloatField()

header = Header(magic='bytes', floating=3.14)
header.length = 3
header.array = list(range(1, header.length + 1))
print(header.data)

Output:

bytearray(b'bytes\x03\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\xc3\xf5H@')
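
As a cross-check on the output above, the same bytes can be reproduced with the standard library's `struct` module. This is only a sketch and does not use ByteField: it assumes, judging from the printed bytes, that `IntegerField()` defaults to a 4-byte little-endian integer and `FloatField()` to a 4-byte little-endian float.

```python
import struct

# Assumed layout, inferred from the printed output (not from ByteField's source):
# 5-byte string, one 4-byte int (length), three 4-byte ints (array), one 4-byte float.
packed = struct.pack('<5si3if', b'bytes', 3, 1, 2, 3, 3.14)

expected = bytearray(
    b'bytes\x03\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\xc3\xf5H@'
)
print(bytearray(packed) == expected)  # True
```

The difference is that the `struct` format string has to be rebuilt by hand whenever the array length changes, whereas the declarative struct resizes the underlying bytes when `array` is assigned.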

Example: parse a JPEG header

You can embed other structure declarations inside structures:

from bytefield import *

class RGB(ByteStruct):
    r = IntegerField(signed=False, size=1)
    g = IntegerField(signed=False, size=1)
    b = IntegerField(signed=False, size=1)

class Marker(ByteStruct):
    marker = IntegerField(size=2, signed=False)
    length = IntegerField(size=2, signed=False)
    identifier = StringField(length=5, encoding='ascii')
    version = IntegerField(size=2, signed=False)
    density = IntegerField(size=1, signed=False)
    x_density = IntegerField(size=2, signed=False)
    y_density = IntegerField(size=2, signed=False)
    # JFIF stores the thumbnail width and height as single bytes
    x_thumbnail = IntegerField(size=1, signed=False)
    y_thumbnail = IntegerField(size=1, signed=False)
    thumb_data = ArrayField(shape=None, elem_field_type=RGB)

class JPEGHeader(ByteStruct):
    soi = IntegerField(size=2, signed=False)
    marker = StructField(Marker)

with open('image.jpg', 'rb') as f:
    # Parse the JPEG header
    header = JPEGHeader(f.read())

    # Resize the thumbnail data
    header.marker.resize(
        Marker.thumb_data_field, header.marker.x_thumbnail * header.marker.y_thumbnail
    )

    # Display the thumbnail (display_thumbnail is a placeholder for your own function)
    display_thumbnail(header.marker.thumb_data)
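
For comparison, here is a rough sketch of the same header read by hand with the standard library's `struct` module, run on a synthetic 20-byte JFIF header rather than a real file. It does not use ByteField. Two facts about the format are relevant: JPEG stores multi-byte integers big-endian, and the JFIF APP0 segment stores the thumbnail width and height as one byte each. The variable names are illustrative only.

```python
import struct

# Synthetic JFIF header: SOI, APP0 marker, segment length 16, 'JFIF\0',
# version 1.01, density units 1 (dots per inch), 72x72 density, no thumbnail.
raw = bytes.fromhex(
    'ffd8' 'ffe0' '0010' '4a46494600' '0101' '01' '0048' '0048' '00' '00'
)

# '>' selects big-endian with no padding between fields.
fmt = '>HHH5sHBHHBB'
(soi, marker, length, identifier, version,
 units, x_density, y_density, x_thumb, y_thumb) = struct.unpack_from(fmt, raw)

# Thumbnail pixels (3 bytes of RGB each) follow the fixed part of the segment.
thumb_offset = struct.calcsize(fmt)
thumb_data = raw[thumb_offset:thumb_offset + 3 * x_thumb * y_thumb]

print(hex(soi), hex(marker), length, identifier, x_density, y_density, len(thumb_data))
```

The thumbnail slice plays the same role as the `resize` call above: its size is only known after the two dimension fields have been read.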

Writing custom struct logic

You can create higher-level structures that adjust their own layout according to the data they contain:

from bytefield import *

class DynamicFloatArray(ByteStruct):
    length = IntegerField(signed=False)
    array_data = ArrayField(None, FloatField)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # When instantiated, resize the array according to its length
        self.resize(DynamicFloatArray.array_data_field, self.length)

data = bytearray(b'\x03\x00\x00\x00\x00\x00\x80?\x00\x00\x00@\x00\x00@@')
print(DynamicFloatArray(data))

Output:

[DynamicFloatArray object at 0x1c88e709e50]
length (int): 3
array_data (ndarray): [1.0 2.0 3.0]
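
The length-prefix pattern used by `DynamicFloatArray` can be illustrated without the library. The sketch below decodes the same bytes with the standard `struct` module, assuming a little-endian unsigned 4-byte length followed by 4-byte floats (an assumption inferred from the sample data, not taken from ByteField's source).

```python
import struct

data = bytearray(b'\x03\x00\x00\x00\x00\x00\x80?\x00\x00\x00@\x00\x00@@')

# Read the unsigned 4-byte length prefix (assumed little-endian).
(length,) = struct.unpack_from('<I', data, 0)

# Only now is the array's size known; read that many 4-byte floats.
values = struct.unpack_from(f'<{length}f', data, 4)

print(length, list(values))  # 3 [1.0, 2.0, 3.0]
```

This is the same two-step read that the `__init__` override performs: parse the fixed part first, then size the variable part from it.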

Variable fields

ByteField supports fields whose type or size is not known until the data is parsed:

from bytefield import *

TYPE_INTEGER = 0
TYPE_FLOAT = 1
TYPE_STRING = 2

class DynamicString(ByteStruct):
    length = IntegerField(signed=False)
    str = StringField(None)

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.resize(DynamicString.str_field, self.length)

class Content(ByteStruct):
    content_type = IntegerField(signed=False, size=2)
    content_data = VariableField()  # a variable field that will be resized when parsing the struct

    def __init__(self, data: bytearray = None, *args, **kwargs):
        super().__init__(data, *args, **kwargs)
        resize_bytes = not bool(data)
        if self.content_type == TYPE_INTEGER:
            self.resize(Content.content_data_field, IntegerField(), resize_bytes=resize_bytes)
        elif self.content_type == TYPE_FLOAT:
            self.resize(Content.content_data_field, FloatField(), resize_bytes=resize_bytes)
        elif self.content_type == TYPE_STRING:
            self.resize(Content.content_data_field, StructField(DynamicString), resize_bytes=resize_bytes)

write = Content()
write.content_type = TYPE_STRING
write.resize(Content.content_data_field, StructField(DynamicString), resize_bytes=True)
write.content_data.str = 'content'
write.content_data.length = len(write.content_data.str)

read = Content(write.data)
print(f'{write.data} is parsed to:\n{read}')

Output:

bytearray(b'\x02\x00\x07\x00\x00\x00content') is parsed to:
[Content object at 0x1c1846888b0]
content_type (int): 2
content_data (DynamicString):
        length (int): 7
        str (str): content
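
To make the byte layout of this example concrete, the sketch below decodes the same tagged payload by hand with the standard `struct` module. It does not use ByteField. `parse_content` is a hypothetical helper, and the little-endian field widths and ASCII string encoding are assumptions inferred from the printed bytes.

```python
import struct

TYPE_INTEGER, TYPE_FLOAT, TYPE_STRING = 0, 1, 2

def parse_content(data):
    """Hypothetical helper: decode a 2-byte type tag followed by a typed payload."""
    (content_type,) = struct.unpack_from('<H', data, 0)
    if content_type == TYPE_INTEGER:
        (value,) = struct.unpack_from('<i', data, 2)
    elif content_type == TYPE_FLOAT:
        (value,) = struct.unpack_from('<f', data, 2)
    elif content_type == TYPE_STRING:
        # Length-prefixed string: 4-byte length, then that many bytes of text.
        (length,) = struct.unpack_from('<I', data, 2)
        value = bytes(data[6:6 + length]).decode('ascii')
    else:
        raise ValueError(f'unknown content type {content_type}')
    return content_type, value

result = parse_content(bytearray(b'\x02\x00\x07\x00\x00\x00content'))
print(result)  # (2, 'content')
```

The type tag selects how the remaining bytes are interpreted, which is the decision `Content.__init__` makes when it resizes `content_data` to a concrete field.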
