(WIP) Multi backend refactor -> main (full diff of all already merged PRs) #1220

Titus-von-Koeller · 2024-05-25T16:46:27Z

This PR to main serves to keep an overview of all the extensive changes that have been introduced to multi-backend-refactor through the iterative PRs around this topic.

We will eventually merge this into main. Before that, we will do a thorough final review and get Tim's final sign-off on this extensive refactor.

For now, it mainly provides a public diff of the entirety of the changes. However, feel free to leave constructive feedback and review comments already.

* Add build job for rocm
* Add rocm build script
* Copy shared obj file into output_dir
* upload build artifacts and enable wheels build
* Remove cuda build temporarily
* Add ROCm version to .so filename
* Add rocm_version to whls build
* Revert "Remove cuda build temporarily"
  This reverts commit 1413c5f.
* Add rocm_version env var
* Remove thrush header files
* Print node info
* print cuda node info
* Revert "print cuda node info"
  This reverts commit cdb209a.
* Revert "Print node info"
  This reverts commit 7e9a65c.
* Add rocm arch to compile command
* Rename .so files to rocm
* Update default gpu arch
* Skip cpu based igemmlt int tests on ROCm
* Update Documentation
* Update upstream repo name
* Update docs
* Update string format
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Remove pre-release option for torch install
* Update pytorch install path
  Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
* Add messages for Heuristics error
* Remove toolcache for disk space
* print disk usage
* Clean disk space for linux
* Fix for ubuntu
* Add sudo for apt clean
* Update clean up disk list
* remove disk usage print
* Add BNB_BACKEND variable
* Update diagnostic functions for ROCm
* Fix tuple error
* Fix library detection bug for recursive and symlink cases
* fix pre-commit errors
* Remove recursive path lib search
* Create function for runtime lib patterns
* Update logger format
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Remove commented code
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
* Create hip diagnostics functions
* Fix Typo
* Fix pre-commit checks
* Enable 6.2 build
* Skip gemv 4 bit cpu test
* Update documentation for 6.2.0 pip install
* Update README for default branch change
* Fix typo
* Sync README with upstream
* Remove depth
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Aswin John Mathews <81309834+amathews-amd@users.noreply.github.com>
Co-authored-by: root <root@banff-cyxtera-s78-4.ctr.dcgpu>

* enable new ipex API ipex weight is 4D so we cannot transpose fix dequant check require grad
* use ipex op in backward
* enable backward
* Multi backend refactor (#8)
* AMD: Clarify diagnostic messages; free up disk space for CI build
* Add build job for rocm
* Add rocm build script
* Copy shared obj file into output_dir
* upload build artifacts and enable wheels build
* Remove cuda build temporarily
* Add ROCm version to .so filename
* Add rocm_version to whls build
* Revert "Remove cuda build temporarily"
  This reverts commit 1413c5f.
* Add rocm_version env var
* Remove thrush header files
* Print node info
* print cuda node info
* Revert "print cuda node info"
  This reverts commit cdb209a.
* Revert "Print node info"
  This reverts commit 7e9a65c.
* Add rocm arch to compile command
* Rename .so files to rocm
* Update default gpu arch
* Skip cpu based igemmlt int tests on ROCm
* Update Documentation
* Update upstream repo name
* Update docs
* Update string format
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Remove pre-release option for torch install
* Update pytorch install path
  Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
* Add messages for Heuristics error
* Remove toolcache for disk space
* print disk usage
* Clean disk space for linux
* Fix for ubuntu
* Add sudo for apt clean
* Update clean up disk list
* remove disk usage print
* Add BNB_BACKEND variable
* Update diagnostic functions for ROCm
* Fix tuple error
* Fix library detection bug for recursive and symlink cases
* fix pre-commit errors
* Remove recursive path lib search
* Create function for runtime lib patterns
* Update logger format
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Remove commented code
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
  Co-authored-by: Aarni Koskela <akx@iki.fi>
* Update error reporting
* Create hip diagnostics functions
* Fix Typo
* Fix pre-commit checks
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
* check grad before using ipex (#1358)
* Enable packaging for ROCm 6.2 (#1367)
* Enable 6.2 build
* Update documentation for 6.2.0 pip install
* Update for VS2022 17.11 compatibility with CUDA < 12.4 (#1341)
* Update for VS2022 17.11 compatibility with CUDA < 12.4
* Try again
* Enable continuous releases for multi-backend-refactor branch
* Update release workflow
* Publish continuous release for multi-backend
* continuous release: revert wheel renaming due to install err
* Revert "continuous release: revert wheel renaming due to install err"
  This reverts commit 0a2b539.
* add dynamic tag-based versioning + git hash for dev vers
* docs: update w/ changes from `main`
* get tags for dynamic versioning
* fine-tune continuous release params
* reduce the pkg size + build times for the preview release
* refine docs for multi-backend alpha release (#1380)
* refine docs for multi-backend alpha release
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
* docs: add multi-backend feedback links
* docs: add request for contributions
* docs: small fixes
* docs: small fixes
* docs: add info about `main` continuous build
* docs: further tweaks to multi-backend alpha docs
* docs: further tweaks to multi-backend alpha docs
* docs: remove 2 obsolete lines
---------
Co-authored-by: pnunna93 <104791500+pnunna93@users.noreply.github.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
* Revert "enable backward"
  This reverts commit cd7bf21.
* Revert "use ipex op in backward"
  This reverts commit b8df1aa.
* fix finetune
* check training
* fix gemv check
* reformat
* avoid double quant in backward if not needed
* Zh/xpu support (#9)
* Add xpu support
* Add xpu support for int8
* Add xpu dequant kernel support
* update code
* remove debug comments
* remove redundant comments
* Add xpu integration for woqlinear
* correct the comments
* Update cpu_xpu_common.py
---------
Co-authored-by: zhuhong61 <hong.zhu@intel.com>
Co-authored-by: zhuhong61 <95205772+zhuhong61@users.noreply.github.com>
* avoid import triton if CPU and XPU backend
* fix setup in docker without git config
* xpu do not support compile for now
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update xpu
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update 4bit compute dtype
* fix xpu int8 path
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* optimize 4bit dequant
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix xpu dequant
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* add empty cache in each xpu op
* add nf4 dequant ipex kernel
* fix dequant 4bit op
* empty cache has negative effect on 4bit gemv
* fix xpu save
* fix save
* xpu use float16 default
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* rm empty cache as it cause slower perf
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix xpu save
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix 8bit int8 param device
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix 8bit int8 param device
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix 8bit int8 param device
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix 8bit int8 param device
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
* update readme for Intel CPU and XPU do not need make csrc codes
* fix format
* fix import
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: pnunna93 <104791500+pnunna93@users.noreply.github.com>
Co-authored-by: Aarni Koskela <akx@iki.fi>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
Co-authored-by: zhuhong61 <hong.zhu@intel.com>
Co-authored-by: zhuhong61 <95205772+zhuhong61@users.noreply.github.com>

* Add npu support for nf4 quant
  Co-authored-by: Slightwind <slightwindsec@gmail.com>
  Co-authored-by: Ginray <ginray0215@gmail.com>
* code format
* update
* pass lint check and fix typos
* add npu to supported devices
---------
Co-authored-by: Slightwind <slightwindsec@gmail.com>
Co-authored-by: Ginray <ginray0215@gmail.com>

* fix dequant 8bit
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* support double quant on intel cpu and xpu
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix shape
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix 4bit format
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix device error for xpu
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix 4bit tensor shape
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix nf4 xpu finetune
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

github-actions · 2025-02-10T20:14:29Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

* fix 4bit XPU dequant 4bit
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix default value
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix ipex linear set
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix ipex linear set to false when calling state dict
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix Int8Param device patch
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

jianan-gu and others added 30 commits December 3, 2023 19:54

add quant to device when init weight paam

b2a4d54

minor fix

c44cf06

mv cuda to common backends

365491a

format fix

4050fe3

format fix

30175d1

use device.type

e17549e

minor fix

a53bc31

backend refinement

80c598c

minor fix

59facc8

final refinement

066d0dc

Enable col to row transformation

657ca4b

Add make functions for row to col transformation

a390e0c

Update get_transform_buffer for row to col in HIP

99ad6b5

Update igemmlt for col format

039b808

Unskip test_igemmlt_int on ROCm

1a052ee

Update igemmlt_int test for col inputs

b7ca5cf

Skip transpose igemmlt test on ROCm

a2cd90d

Revert "Update igemmlt_int test for col inputs"

5b6c5ac

This reverts commit b7ca5cf.

Return nvidia_transform from transform for HIP

218bf66

Fix syntax error

8bb5c2f

Add comment for shape change

eb2edf7

Enable nvidia_transform tests

a38ea0f

Merge branch 'fix_igemmlt_int' of https://github.com/pnunna93/bitsandbytes into fix_igemmlt_int

fbacd7a

Enable igemmlt_half tests

67c383b

Revert col32 check in nvidia_transform test

42b860f

Merge pull request #3 from pnunna93/fix_igemmlt_int

7198d6b

Enable igemmlt int test on rocm

Merge remote-tracking branch 'upstream/main' into IFU-master-2024-01-24

b1d484a

Update README.md

c36085d

Update hip files with upstream changes

0e91e48

Skip failing tests for now

1295d53

pnunna93 and others added 16 commits October 16, 2024 13:51

Fix issue that no valid semantic version tag found when installing bitsandbytes from source in personal repo (#1419)

cd73601

fix cpu nf4 (#1432)

9315692

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix device check (#1453)

7e6f865

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

Enable dequant+matmul 8bit path for Intel CPU and XPU (#1484)

307fbd5

* new matmul8bit
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix cxb
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

add device index (#1489)

a0a95fd

Sync branch with main; resolve conflicts.

ca29936

Update base backend docstrings

ed2a58d

Update NPU backend with new spec

07c23de

Update CPU tests

94d6027

ROCm: Fix compilation.

3fabd1a

Fix

d3ead1e

Build: use setuptools_scm for dynamic versioning compatibility with pyproject.toml

6c4d878

matthewdouglas and others added 5 commits February 10, 2025 15:40

Update wheel build

2d06869

Add rocm6.3.2

7c917b0

setuptools_scm update

fdbbfb6

fix xpu woq linear dtype (#1506)

89373b8

* fix xpu dtypoe
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix nf4 dtype
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix version (#1532)

2640753

* fix version
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix setup version
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

matthewdouglas added the cross-platform label Feb 28, 2025

jiqing-feng and others added 7 commits March 4, 2025 20:39

enable benchmark script (#1554)

c66e137

* enable benchmark script
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Small fixes to non_cuda_backends.mdx
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>

update comments (#1562)

83c147d

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

enable quant storage (#1563)

0cd87aa

* enable quant storage
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix to numpy
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix meta device dispatch (#1564)

2354bdd

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

Enable XPU int matmul (#1547)

249a3cd

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

Fix xpu to cpu (#1570)

d3658c5

* fix xpu to cpu
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix xpu cpu data device
  Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
