Contributor

Copilot AI commented Jun 9, 2025

Summary

This PR implements comprehensive casting support for Int4x2 and UInt4x2 data types in the ONNX Runtime CPU provider, addressing the TODO comment in cast_op.cc and enabling 4-bit integer quantization workflows.

Changes Made

1. Updated Type Lists

  • Changed cast operation type constraints from AllIRv9 to AllIRv10 to include Int4x2/UInt4x2 types
  • Enables Int4x2/UInt4x2 as both source and destination types for cast operations

2. Core Casting Implementation

Added TensorCaster specializations for:

  • Int4x2/UInt4x2 ↔ float: Primary conversions for quantization workflows
  • Int4x2/UInt4x2 ↔ all numeric types: int8_t, uint8_t, int16_t, int32_t, int64_t, double, MLFloat16, BFloat16

3. Helper Functions and Templates

  • ConvertFromInt4x2<Signed, DstType>(): Unpacks 4-bit values to destination types
  • ConvertToInt4x2<Signed, SrcType>(): Converts and clamps values to 4-bit range with proper bounds checking
  • Generic converter templates (Int4x2ToCaster, CasterToInt4x2) to eliminate code duplication

4. Key Features

  • Proper nibble ordering: Follows "low nibble first, then high nibble" specification
  • Value clamping: Signed Int4 values clamped to [-8, 7], unsigned UInt4 to [0, 15]
  • Odd-sized tensor handling: Automatically pads incomplete pairs with zero
  • Memory efficient: Leverages existing Int4x2/UInt4x2 infrastructure

Example Usage

// Int4x2 to float conversion
std::vector<Int4x2> int4_data = {Int4x2(3, -5), Int4x2(7, -8)};
// Unpacks to: [3.0, -5.0, 7.0, -8.0]

// Float to Int4x2 with clamping: 10.0 → 7, -15.0 → -8, odd trailing element padded with 0
std::vector<float> float_data = {3.7f, -5.2f, 10.0f, -15.0f, 2.1f};
// Clamps and packs to: [(3, -5), (7, -8), (2, 0)]

Testing

Created comprehensive standalone tests validating:

  • Correct unpacking of Int4x2/UInt4x2 to various numeric types
  • Proper value clamping during conversion from larger types
  • Correct handling of odd-sized tensors with zero padding
  • Nibble ordering verification (val0→low nibble, val1→high nibble)

Compatibility

  • No breaking changes: All existing cast operations continue to work
  • Minimal footprint: Only 275 lines added, 3 lines modified
  • Template-safe: Uses inheritance to avoid conflicting partial specializations
  • Performance: Direct conversions without unnecessary intermediate steps

Resolves: TODO comment "Implement support for int4 and uint4" in cast_op.cc



Copilot AI changed the title [WIP] Fix Int4x2/UInt4x2 Cast Operator Implementation to Avoid Template Specialization Conflicts Implement Int4x2 and UInt4x2 casting support in ONNX Runtime CPU provider Jun 9, 2025