Constrained decoding integration #1381

ajindal1 · 2025-04-07T21:11:35Z

Integrate constrained decoding using the LLGuidance library.

Based on Ying's Constrained Decoding branch (yingxiong/constrained_decoding)
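For readers new to the idea, constrained decoding generally works by masking the model's logits at each step so that only tokens permitted by a grammar can be selected. The snippet below is a minimal, library-free sketch of that masking step. It is not the code from this PR and does not use the LLGuidance API; `apply_token_mask`, `greedy_pick`, and the hard-coded allowed-token list are hypothetical names and values chosen for illustration.

```python
import math


def apply_token_mask(logits, allowed_token_ids):
    """Return a copy of logits with every disallowed token set to -inf.

    Hypothetical helper: a real grammar engine would compute the set of
    allowed token ids for the current step from its parser state.
    """
    allowed = set(allowed_token_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def greedy_pick(logits):
    """Pick the index of the highest logit (greedy decoding)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy vocabulary of four tokens; token 1 has the highest raw score.
logits = [2.0, 5.0, 1.0, 3.0]

# Assume the grammar only permits tokens 0 and 3 at this step.
masked = apply_token_mask(logits, allowed_token_ids=[0, 3])

unconstrained_choice = greedy_pick(logits)  # token 1
constrained_choice = greedy_pick(masked)    # token 3, the best allowed token
```

The same masking applies before sampling as well as before greedy selection, since a token with a logit of negative infinity receives zero probability after softmax.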

Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>

examples/python/guidance-model-chat.py

src/logits_processor.cpp

Integrate Constrained decoding using LLGuidance library.

Based on Ying's Constrained Decoding branch (yingxiong/constrained_decoding)

---------

Co-authored-by: Ying Xiong <yingxiong@microsoft.com>
Co-authored-by: Michał Moskal <michal@moskal.me>
Co-authored-by: Kunal Vaishnavi <kvaishnavi@microsoft.com>
Co-authored-by: Ryan Hill <38674843+RyanUnderhill@users.noreply.github.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>

examples/python/model-chat.py

Address previous PR review comments from #1470 (#1473)
Address QNN specific regressions (#1470)
Fix array eos_token_id handling (#1463)
Constrained decoding integration (#1381)
Remove BF16 CPU from valid GQA configuration (#1469)
Avoid adding providers if not requested (#1464)
Persist provider options across ClearProviders, AppendProvider where possible (#1454)
Fix accuracy issues with Gemma models (#1448)
Add bfloat16 support in model builder (#1447)
Add final norm for LoRA models (#1446)
Update version to 0.8.0-rc3

---------

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: Nenad Banfic <46795300+nenad1002@users.noreply.github.com>
Co-authored-by: Nenad Banfic <nebanfic@microsoft.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Abhishek Jindal <abjindal@microsoft.com>
Co-authored-by: Ying Xiong <yingxiong@microsoft.com>
Co-authored-by: Michał Moskal <michal@moskal.me>
Co-authored-by: Kunal Vaishnavi <kvaishnavi@microsoft.com>

Taka152 and others added 30 commits October 31, 2024 01:01

add llguidance based logits processor

bf38535

add unit test

c151d52

constrained decoding fixes (#1023)

9d5a8a0

add test grammars

48c3e96

support cuda

d70b849

use tokenize.json to generate token_bytes

6b90c1c

fix win build

bdb9ca4

async compute mask

a25de8e

add llguidance build in cmake

edc0bae

update windows build

09861d7

clean cmake

ee94df8

add install rust to GHA

4d077cf

test action

15b20b8

test win cpu build action

6029510

update win build action

4d8d8a6

update win build action

346f88c

update win build action

c00d8fa

update win build action

39fb7ed

update win build action

8038723

update win build action

324f550

update win build action

8722727

add rust install to workflows

d620422

support batch infer

c1ede01

add corrosion to deps.txt

d2f47e2

Merge branch 'main' into yingxiong/constrained_decoding

8deba60

fix merge

b256e6d

fix bugs

e5d6dad

update linux gpu workflow

2fd52d2

update linux gpu workfow

a11684b

update linux gpu workflow

56663c0

update logits processor leakcheck

2b75631

ajindal1 added the 0.8.0 label May 8, 2025

ajindal1 and others added 7 commits May 8, 2025 22:29

modify macos preset

699fc50

update linux cpu arm64 pipeline

b275255

add comments for logits processor

1905744

install pkgconfig and openssl for arm64

4287e74

merge with main

9b201f3

remove comment

9317d78

Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>

eos token

77fd62b

ajindal1 requested review from baijumeswani, kunal-vaishnavi and RyanUnderhill May 9, 2025 02:47

kunal-vaishnavi reviewed May 9, 2025

examples/python/guidance-model-chat.py (outdated, resolved)

kunal-vaishnavi reviewed May 9, 2025

src/logits_processor.cpp (outdated, resolved)

kunal-vaishnavi reviewed May 9, 2025

src/logits_processor.cpp (outdated, resolved)

ajindal1 added 3 commits May 9, 2025 18:42

merge guidance examples into model-q and model-chat

fb1f904

change logits processor to constrained logits processor

e58d51c

merge wih main

f885002

kunal-vaishnavi previously approved these changes May 9, 2025

initialize constraint_ptr as nullptr

bb7666d

ajindal1 dismissed kunal-vaishnavi’s stale review via bb7666d May 9, 2025 21:17

kunal-vaishnavi approved these changes May 9, 2025

baijumeswani approved these changes May 9, 2025

ajindal1 merged commit 48e5cb3 into main May 9, 2025
14 checks passed

ajindal1 deleted the abjindal/constrained_decoding_integration branch May 9, 2025 22:22

RyanUnderhill mentioned this pull request May 12, 2025

Ryanunderhill/rc3 cherry picks #1475

Merged

RyanUnderhill added the cherry picked label May 12, 2025

jiafatom reviewed May 13, 2025

examples/python/model-chat.py (resolved)
