[compiler] Support large models #12732

Closed
seanshpark opened this issue Mar 7, 2024 · 11 comments

@seanshpark
Contributor

seanshpark commented Mar 7, 2024

Umbrella issue to gather issues and tasks for large models, such as

  • flatbuffer size > 2GB
  • quantization below 8 bits
  • ...


Things to do with our LLM

  • convert model to circle -> done with internal onnx2circle
  • optimize circle model -> conversion seems ok
  • ...
@hseok-oh
Contributor

hseok-oh commented Mar 7, 2024

We may need to support an external data file in the circle spec (as the ONNX spec does) to store large weights.

@seanshpark
Contributor Author

We may need to support external data file

The circle schema has included this since 0.7, among the changes taken from tflite:

(for table Operator)
  // When an op is using custom_options in a model that is larger than 2GB, then
  // we instead use the following attributes to find the buffer location which
  // is stored outside of flatbuffers, the offset is calculated relative to the
  // beginning of the file and is only valid if > 1
  large_custom_options_offset: ulong;
  large_custom_options_size: ulong;

(for table Buffer)
  // In a model that is larger than 2GB, then buffers instead uses the following
  // attributes to find stored data, which is outside of flatbuffers
  // the offset is calculated relative to the beginning of the file and is only
  // valid if > 1.
  offset: ulong;
  size: ulong;

I need to look inside the tflite implementation, but it seems the large data is
appended to the end of the file, after the flatbuffer-managed part, so everything
stays in a single file.
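
A minimal sketch of how a reader could resolve such a buffer, assuming the whole file is already loaded into memory. `BufferRef`, `Span` and `resolve_external` are hypothetical names for illustration and are not the generated flatbuffers types; only the "valid if > 1" rule and the "relative to the beginning of the file" rule come from the schema comments above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two new Buffer fields (not the generated type).
struct BufferRef
{
  uint64_t offset = 0; // relative to the beginning of the file
  uint64_t size = 0;
};

// View into the loaded file; valid only while the file bytes stay alive.
struct Span
{
  const uint8_t *data;
  size_t size;
};

// Fills `out` and returns true when the buffer stores its data outside the
// flatbuffer area. Returns false when offset/size is not used (offset <= 1)
// or when the range does not fit inside the file.
bool resolve_external(const std::vector<uint8_t> &file, const BufferRef &buf, Span &out)
{
  if (buf.offset <= 1)
    return false; // schema comment: only valid if > 1
  if (buf.offset > file.size() || buf.size > file.size() - buf.offset)
    return false; // bounds check written to avoid unsigned overflow
  out.data = file.data() + buf.offset;
  out.size = static_cast<size_t>(buf.size);
  return true;
}
```

A caller would fall back to the regular in-flatbuffer `data` field whenever this returns false because `offset` is not set.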

@seanshpark
Contributor Author

seanshpark commented Aug 8, 2024

Now it's time to support this.

modules that may need revision; progress with draft changes

  • res of tfl schema 2.16.1
  • mio-tflite2121
  • mio-circle08
  • tflchef
  • tflite2circle
  • luci/import
  • luci/export
  • circledump
  • tfldump
  • tfl-inspect
  • circle-inspect
  • ...

modules using luci/import, luci/export

  • circle-eval-diff
  • circle-execution-plan
  • circle-interpreter
  • circle-mqpsolver
  • circle-opselector
  • circle-partitioner
  • circle-quantizer
  • circle2circle
  • common-artifacts
  • dalgona
  • embedded-import-value-test
  • fme-apply
  • fme-detect
  • luci/tester
  • luci-eval-driver
  • minmax-embedder
  • q-implant
  • record-minmax

how to test this?

  • tflchef to produce a huge model with the new offset/size
    • use TF2.16.1 schema in res
    • check huge tflite is produced
    • check tfldump
    • temporary tflite2circle to produce a huge circle model with the new offset/size
    • check circledump

issues

  • TF 2.12.1 doesn't support size > 2G --> we need to upgrade TF some day

@seanshpark
Contributor Author

seanshpark commented Aug 8, 2024

How tflite does it

tensorflow/lite/core/model_builder.cc

  // Only run validator on models less than 2GB
  if (allocation->bytes() < flatbuffer_size_max) {
    flatbuffers::Verifier base_verifier(
        reinterpret_cast<const uint8_t*>(allocation->base()),
        allocation->bytes());
    if (!VerifyModelBuffer(base_verifier)) {
      TF_LITE_REPORT_ERROR(error_reporter,
                           "The model is not a valid Flatbuffer buffer");
      return nullptr;
    }
  }

tensorflow/compiler/mlir/lite/flatbuffer_export.cc

  // check if Flatbuffer builder can no longer hold the given amount of the data
  inline bool IsModelBiggerThan2GB(const uint64_t data_size) {
    return data_size > flatbuffer_size_max - builder_.GetSize();
  }
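
The same check can be sketched as a free function, to show why it subtracts from the limit instead of adding the two sizes. `kFlatbufferSizeMax` and `exceeds_limit` are illustrative names, and the 2GB value is an assumption standing in for `flatbuffer_size_max`; the extra guard for a builder that is already over the limit is also an addition and is not in the TensorFlow snippet.

```cpp
#include <cstdint>

// Assumption: 2GB limit, standing in for flatbuffer_size_max above.
constexpr uint64_t kFlatbufferSizeMax = 2147483648ULL;

// True when adding data_size bytes to a builder that already holds
// builder_size bytes would go past the limit.
bool exceeds_limit(uint64_t builder_size, uint64_t data_size)
{
  // Guard added for this sketch: without it the subtraction below would
  // wrap around if the builder were already past the limit.
  if (builder_size > kFlatbufferSizeMax)
    return true;
  // Comparing against the remaining room avoids computing
  // builder_size + data_size, which could overflow for huge inputs.
  return data_size > kFlatbufferSizeMax - builder_size;
}
```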

@seanshpark
Contributor Author

seanshpark commented Aug 8, 2024

How to export ?

  • go with two phases
  • the first phase saves a normal model; if the size is > 2GB, it sets the require_use_buffer_offset flag and exits with an error
  • if the first phase exits with an error and the require_use_buffer_offset flag is set, run the second phase with use_buffer_offset set, which stores all Buffers outside the FB area
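
The two phases could be sketched roughly as below. This only simulates sizes and writes no flatbuffer; `ExportResult`, `export_once` and `export_model` are hypothetical names and not the luci/export API. Only the flag names `require_use_buffer_offset` and `use_buffer_offset` come from the plan above, and the limit is passed as a parameter so the logic can be tried with small numbers.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch.
struct ExportResult
{
  bool ok;
  bool require_use_buffer_offset; // set when phase one hits the size limit
  bool used_buffer_offset;        // true when Buffers were stored outside the FB area
};

// One export attempt. base_size is the size of everything except Buffer data.
ExportResult export_once(const std::vector<uint64_t> &buffer_sizes, uint64_t base_size,
                         bool use_buffer_offset, uint64_t limit)
{
  ExportResult result = {false, false, use_buffer_offset};
  if (base_size > limit)
    return result; // the non-buffer part alone does not fit; offsets cannot help
  uint64_t fb_size = base_size;
  if (!use_buffer_offset)
  {
    for (uint64_t size : buffer_sizes)
    {
      if (size > limit - fb_size) // same overflow-safe comparison tflite uses
      {
        result.require_use_buffer_offset = true;
        return result; // phase one exits with an error
      }
      fb_size += size;
    }
  }
  // With use_buffer_offset set, Buffer data does not count toward fb_size.
  result.ok = true;
  return result;
}

// Phase one first; phase two only if phase one asked for it.
ExportResult export_model(const std::vector<uint64_t> &buffer_sizes, uint64_t base_size,
                          uint64_t limit)
{
  ExportResult result = export_once(buffer_sizes, base_size, false, limit);
  if (!result.ok && result.require_use_buffer_offset)
    result = export_once(buffer_sizes, base_size, true, limit);
  return result;
}
```

The benefit of this shape is that small models keep the existing layout unchanged, and only models that actually overflow take the second pass.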

@seanshpark
Contributor Author

seanshpark commented Aug 8, 2024

For testing, let's use the new Buffer.offset if size > 2MB.
Or, as an option, we can add a flag to the tflite recipe to use Buffer.offset.
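
Both options can be combined in one small predicate, sketched here. `kTestThreshold`, `should_use_buffer_offset` and the `recipe_flag` parameter are hypothetical, and it is an assumption that "size" means the size of a single Buffer; it may equally mean the whole model.

```cpp
#include <cstdint>

// Test-only threshold from the comment above: 2MB instead of 2GB.
constexpr uint64_t kTestThreshold = 2ULL * 1024 * 1024;

// Assumption: buffer_size is the byte size of one Buffer, and recipe_flag is
// a hypothetical recipe option that forces Buffer.offset regardless of size.
bool should_use_buffer_offset(uint64_t buffer_size, bool recipe_flag)
{
  return recipe_flag || buffer_size > kTestThreshold;
}
```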

@seanshpark
Contributor Author

seanshpark commented Aug 8, 2024

New fields to support


table Operator {
...
  large_custom_options_offset: ulong;
  large_custom_options_size: ulong;
...
}

table Buffer {
...
  offset: ulong;
  size: ulong;
}

@chunseoklee
Contributor

large_custom_options_offset: ulong;
large_custom_options_size: ulong;

Do we need this option from the start? IMHO, this is a kind of custom feature.

@seanshpark
Contributor Author

this is a kind of custom something.

Yes, it is for CircleCustom. We don't need to implement this at the moment :)

@seanshpark
Contributor Author

Name for this feature... (I thought I had left a note about this, but now I can't find it...)

Extended Buffer

@seanshpark
Contributor Author

done for now

3 participants