Closed
Labels
rfc: Request for comment and feedback on a post, proposal, etc.
triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module.
Description
🚀 The feature, motivation and pitch
This proposal introduces a schema for storing tensors in a data file that can be loaded alongside a PTE file. Keeping tensors in a file separate from the executable PTE file allows the same data to be shared between multiple PTE files, reducing the overall memory footprint.
RFC
High-level (file contents)
Reference ExecuTorch schemas that this is based on:
PTE File format
program.fbs
Data Schema V1
include "scalar_type.fbs";
namespace executorch_flatbuffer;
// Update after BC breaking changes.
file_identifier "DT01";
file_extension "data";
table TensorMetadata {
// the unique id used to connect the data and the program.
fully_qualified_name:string;
scalar_type:ScalarType;
dimensions:[int]; // dim sizes
dim_order:[ubyte];
// Tensor offsets are relative to the data segment offset provided in
// Data.segments for the specific index in TensorSegment.segment_index.
offset: uint64;
size: uint64;
}
table TensorSegment {
// Index of the segment in Data.segments.
segment_index: uint;
// Tensor information, including offset.
tensor_metadata:[TensorMetadata];
}
table DataSegment {
// Segment offsets are relative to the segment base offset provided in
// the extended file header. Segments will typically be aligned in a
// way to make it possible to use mmap() to load them.
offset: uint64;
// The size in bytes of valid data starting at the offset. The segment
// data may be followed by padding before the segment that follows it,
// to make it easier to use mmap().
size: uint64;
}
table Data {
// Schema version.
version:uint;
// Tensor information.
tensor_segments:[TensorSegment];
// Data segments.
segments:[DataSegment];
// Alignment for each tensor.
tensor_alignment: uint32;
}
root_type Data;
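The offset comments in the schema above imply a two-level lookup: a tensor's position is relative to its segment, and a segment's position is relative to the segment base offset from the extended file header. The following is a minimal Python sketch of that arithmetic, not the ExecuTorch loader. The dataclasses mirror only the schema fields needed here, and `read_tensor_bytes`, the in-memory `blob`, and the bounds check are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TensorMetadata:
    fully_qualified_name: str
    offset: int  # relative to the start of the owning segment
    size: int


@dataclass
class TensorSegment:
    segment_index: int  # index into Data.segments
    tensor_metadata: List[TensorMetadata]


@dataclass
class DataSegment:
    offset: int  # relative to the segment base offset
    size: int


def read_tensor_bytes(blob, segment_base_offset, segments, tensor_segments, name):
    """Return the raw bytes of the tensor called `name` (hypothetical helper)."""
    for ts in tensor_segments:
        seg = segments[ts.segment_index]
        for tm in ts.tensor_metadata:
            if tm.fully_qualified_name != name:
                continue
            # Assumption: a tensor must lie entirely inside its segment.
            if tm.offset + tm.size > seg.size:
                raise ValueError(f"tensor {name!r} overruns its segment")
            start = segment_base_offset + seg.offset + tm.offset
            return blob[start:start + tm.size]
    raise KeyError(name)


# Toy file: 8-byte header, segment 0 (4 bytes), 4 bytes padding, segment 1 (6 bytes).
blob = b"H" * 8 + b"AAAA" + b"\x00" * 4 + b"BBCCCC"
segments = [DataSegment(offset=0, size=4), DataSegment(offset=8, size=6)]
tensor_segments = [
    TensorSegment(0, [TensorMetadata("a", 0, 4)]),
    TensorSegment(1, [TensorMetadata("b", 0, 2), TensorMetadata("c", 2, 4)]),
]
tensor_c = read_tensor_bytes(blob, 8, segments, tensor_segments, "c")
```

In the toy file, tensor `c` starts at 8 (base) + 8 (segment) + 2 (tensor) = byte 18.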
Note: all tensors can be stored in one segment. Having multiple DataSegments
provides the option to selectively load subsets, which can reduce peak memory usage.
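To illustrate the selective-loading note, here is a rough Python sketch of how a loader might decide which segments to read for a given subset of tensors. It is an assumption-laden illustration rather than a proposed API: `segments_to_load` and `bytes_to_load` are hypothetical helpers, the tensor names are made up, and `tensor_names` stands in for the names carried by `TensorSegment.tensor_metadata`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TensorSegment:
    segment_index: int
    tensor_names: List[str]  # stand-in for the names in [TensorMetadata]


@dataclass
class DataSegment:
    offset: int
    size: int


def segments_to_load(tensor_segments, needed_names):
    """Return sorted indices of the segments holding any needed tensor."""
    needed = set(needed_names)
    return sorted(
        {ts.segment_index for ts in tensor_segments if needed.intersection(ts.tensor_names)}
    )


def bytes_to_load(segments, indices):
    """Total valid bytes across the selected segments (padding excluded)."""
    return sum(segments[i].size for i in indices)


# Hypothetical layout: three segments, one per group of tensors.
tensor_segments = [
    TensorSegment(0, ["embed.weight"]),
    TensorSegment(1, ["layer0.w", "layer0.b"]),
    TensorSegment(2, ["lm_head.weight"]),
]
segments = [DataSegment(0, 4096), DataSegment(4096, 8192), DataSegment(12288, 4096)]

selected = segments_to_load(tensor_segments, ["layer0.b", "lm_head.weight"])
total = bytes_to_load(segments, selected)
```

With everything in a single segment, the same request would require loading all 16384 bytes; with the split above, only segments 1 and 2 are touched.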
Data Schema V2 (+CompatibilityMetadata)
Note:
- Ensure the PTE file schema contains similar metadata fields to match on. Potentially store this metadata as an opaque JSON blob.
- The GGUF file format specifies much more metadata; we probably do not need all of it.
...
table CompatibilityMetadata {
model_name: string; // e.g. 'llama'
version: string; // Model version, e.g. '2'
quantization_scheme: string; // Settle on a format for this.
custom_description: string; // Custom field for user-specific cases.
... // Other fields if necessary.
}
table Data {
// Schema version.
version:uint;
// Check compatibility with program file.
compatibility_metadata: CompatibilityMetadata;
// Tensor information.
tensor_segments:[TensorSegment];
// Data segments.
segments:[DataSegment];
// Alignment for each tensor.
tensor_alignment: uint32;
}
root_type Data;
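A compatibility check between a program file and a data file could look roughly like the Python sketch below. The RFC does not define the matching policy, so everything here is an assumption: exact string equality per field, `custom_description` ignored by default, and the hypothetical helper name `incompatible_fields`. Only the four fields listed in `CompatibilityMetadata` are modeled.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CompatibilityMetadata:
    model_name: str
    version: str
    quantization_scheme: str
    custom_description: str = ""


def incompatible_fields(program_meta, data_meta, ignore=("custom_description",)):
    """Return the names of fields that differ between the two metadata records.

    Assumption: fields must match exactly unless listed in `ignore`.
    """
    return [
        f.name
        for f in fields(CompatibilityMetadata)
        if f.name not in ignore
        and getattr(program_meta, f.name) != getattr(data_meta, f.name)
    ]


# Hypothetical values; the quantization_scheme format is still undecided in the RFC.
program_meta = CompatibilityMetadata("llama", "2", "int8")
data_meta = CompatibilityMetadata("llama", "3", "int8", "fine-tuned weights")

mismatches = incompatible_fields(program_meta, data_meta)
```

Returning the list of mismatched fields, rather than a single boolean, lets a loader report why a data file was rejected.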