move endianness to Schema
julienledem committed Aug 10, 2016
1 parent ed0d6f7 commit e2a3587
Showing 5 changed files with 25 additions and 21 deletions.
24 changes: 12 additions & 12 deletions cpp/src/arrow/ipc/metadata-internal.cc
@@ -246,6 +246,16 @@ Status FieldFromFlatbuffer(const flatbuf::Field* field, std::shared_ptr<Field>*

// Implement MessageBuilder

// will return the endianness of the system we are running on
flatbuf::Endianness endianness() {
union {
uint32_t i;
char c[4];
} bint = {0x01020304};

return bint.c[0] == 1 ? flatbuf::Endianness_Big : flatbuf::Endianness_Little;
}

Status MessageBuilder::SetSchema(const Schema* schema) {
header_type_ = flatbuf::MessageHeader_Schema;

@@ -257,26 +267,16 @@ Status MessageBuilder::SetSchema(const Schema* schema) {
field_offsets.push_back(offset);
}

header_ = flatbuf::CreateSchema(fbb_, fbb_.CreateVector(field_offsets)).Union();
header_ = flatbuf::CreateSchema(fbb_, endianness(), fbb_.CreateVector(field_offsets)).Union();
body_length_ = 0;
return Status::OK();
}

// will return the endianness of the system we are running on
flatbuf::Endianness endianness(void) {
union {
uint32_t i;
char c[4];
} bint = {0x01020304};

return bint.c[0] == 1 ? flatbuf::Endianness_Big : flatbuf::Endianness_Little;
}

Status MessageBuilder::SetRecordBatch(int32_t length, int64_t body_length,
const std::vector<flatbuf::FieldNode>& nodes,
const std::vector<flatbuf::Buffer>& buffers) {
header_type_ = flatbuf::MessageHeader_RecordBatch;
header_ = flatbuf::CreateRecordBatch(fbb_, length, endianness(),
header_ = flatbuf::CreateRecordBatch(fbb_, length,
fbb_.CreateVectorOfStructs(nodes),
fbb_.CreateVectorOfStructs(buffers))
.Union();
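
The `endianness()` helper in this file initializes `bint.i` and then reads `bint.c`. The C++ standard does not guarantee that reading an inactive union member works, although major compilers commonly support it. A minimal sketch of the same check using `std::memcpy` follows. The `Endianness` enum and the name `HostEndianness` are hypothetical stand-ins for the generated `flatbuf::Endianness` type and are not part of this commit.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the generated flatbuf::Endianness enum.
enum class Endianness : int { Little = 0, Big = 1 };

// Returns the byte order of the host by inspecting the first byte in memory
// of a known 32-bit value. std::memcpy copies the object representation, so
// no inactive union member is read.
inline Endianness HostEndianness() {
  const std::uint32_t value = 0x01020304;
  unsigned char bytes[sizeof(value)];
  std::memcpy(bytes, &value, sizeof(value));
  // On a big-endian host the most significant byte (0x01) is stored first.
  return bytes[0] == 0x01 ? Endianness::Big : Endianness::Little;
}
```

The result could then be passed where this commit passes `endianness()` to `flatbuf::CreateSchema`, after mapping it to the generated enum values.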
Binary file modified format/Arrow.graffle
Binary file not shown.
Binary file modified format/Arrow.png
4 changes: 2 additions & 2 deletions format/Layout.md
@@ -79,12 +79,12 @@ Base requirements
## Byte Order ([Endianness][3])

The Arrow format is little endian by default.
The RecordBatch metadata has an endianness field labelling the RecordBatch accordingly.
The Schema metadata has an endianness field indicating the endianness of RecordBatches.
Typically this is the endianness of the system where the RecordBatch was generated.
The main use case is exchanging RecordBatches between systems with the same Endianness.
At first we will return an error when trying to read a RecordBatch with an endianness
that does not match the underlying system. The reference implementation is focused on
Little Endian and provides test for it. Eventually we may provide automatic conversion
Little Endian and provides tests for it. Eventually we may provide automatic conversion
via byte swapping.
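
The policy in this paragraph (reject data whose endianness differs from the host, with byte swapping as a possible later step) can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: `Endianness`, `CheckResult`, `HostEndianness`, `CheckSchemaEndianness`, `ByteSwap32`, and `ByteSwapInPlace` are all hypothetical names introduced here.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the generated flatbuf::Endianness enum.
enum class Endianness : int { Little = 0, Big = 1 };

// Hypothetical minimal result type used only for this sketch.
struct CheckResult {
  bool ok;
  std::string message;
};

// Returns the byte order of the host.
inline Endianness HostEndianness() {
  const std::uint32_t value = 0x01020304;
  unsigned char bytes[sizeof(value)];
  std::memcpy(bytes, &value, sizeof(value));
  return bytes[0] == 0x01 ? Endianness::Big : Endianness::Little;
}

// Initial behavior described in the text: return an error when the
// endianness recorded in the Schema does not match the host.
inline CheckResult CheckSchemaEndianness(Endianness schema_endianness) {
  if (schema_endianness != HostEndianness()) {
    return {false, "Schema endianness does not match the host"};
  }
  return {true, ""};
}

// Reverses the byte order of one 32-bit value.
inline std::uint32_t ByteSwap32(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Possible later behavior: convert a buffer of 32-bit values in place.
inline void ByteSwapInPlace(std::vector<std::uint32_t>* values) {
  for (std::uint32_t& v : *values) {
    v = ByteSwap32(v);
  }
}
```

Since the field now lives on the Schema, a reader along these lines would run the check once per schema rather than once per RecordBatch.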

## Alignment and Padding
18 changes: 11 additions & 7 deletions format/Message.fbs
@@ -91,10 +91,21 @@ table Field {
children: [Field];
}

/// ----------------------------------------------------------------------
/// Endianness of the platform that produces the RecordBatch

enum Endianness:int { Little, Big }

/// ----------------------------------------------------------------------
/// A Schema describes the columns in a row batch

table Schema {

/// endianness of the buffer
/// it is Little Endian by default
/// if endianness doesn't match the underlying system then the vectors need to be converted
endianness: Endianness=Little;

fields: [Field];
}

@@ -134,8 +145,6 @@ struct FieldNode {
null_count: int;
}

enum Endianness:int { Little, Big }

/// A data header describing the shared memory layout of a "record" or "row"
/// batch. Some systems call this a "row batch" internally and others a "record
/// batch".
@@ -144,11 +153,6 @@ table RecordBatch {
/// length
length: int;

/// endianness of the buffer
/// it is Little Endian by default
/// if endianness doesn't match the underlying system then the vectors need to be converted
endianness: Endianness=Little;

/// Nodes correspond to the pre-ordered flattened logical schema
nodes: [FieldNode];
