
Conversation

zhuohan123 (Member)

No description provided.

@zhuohan123 zhuohan123 requested a review from WoosukKwon June 22, 2023 07:33
@WoosukKwon WoosukKwon (Collaborator) left a comment

Thanks!

@zhuohan123 zhuohan123 merged commit 83658c8 into main Jun 22, 2023
@zhuohan123 zhuohan123 deleted the bumpup-version-0-1-1 branch June 22, 2023 07:33
@zhuohan123 zhuohan123 restored the bumpup-version-0-1-1 branch June 22, 2023 07:34
@WoosukKwon WoosukKwon deleted the bumpup-version-0-1-1 branch June 22, 2023 08:01
michaelfeil pushed a commit to michaelfeil/vllm that referenced this pull request Jun 24, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
SUMMARY:
* update the NIGHTLY workflow to be whl-centric
* update benchmarking jobs to use the generated whl

TEST PLAN:
Runs on remote push. I'm also triggering NIGHTLY manually.

---------

Co-authored-by: andy-neuma <andy@neuralmagic.com>
Co-authored-by: Domenic Barbuzzi <domenic@neuralmagic.com>
Co-authored-by: Domenic Barbuzzi <dbarbuzzi@gmail.com>
mht-sharma pushed a commit to mht-sharma/vllm that referenced this pull request Oct 30, 2024
dtrifiro added a commit to dtrifiro/vllm that referenced this pull request Apr 7, 2025
"variables" in `docker-bake.hcl` can have defaults, but are overridden
by env vars with the same name. We can remove these (useless) defaults
and fix the name for `GITHUB_REPO` (it's actually `GITHUB_REPOSITORY`)


Example:
```bash 
env \
  GITHUB_REPOSITORY=neuralmagic/nm-vllm-ent \
  PYTHON_VERSION=3.12 \
  GITHUB_SHA=$(git rev-parse HEAD) \
  VLLM_VERSION=0.8.3 \
  docker buildx bake cuda --print
```
output:
```json
{
  "group": {
    "default": {
      "targets": [
        "cuda"
      ]
    }
  },
  "target": {
    "cuda": {
      "context": ".",
      "dockerfile": "Dockerfile.ubi",
      "args": {
        "BASE_UBI_IMAGE_TAG": "9.5-1739420147",
        "FLASHINFER_VERSION": "https://github.com/flashinfer-ai/flashinfer/releases/download/v0.2.1.post1/flashinfer_python-0.2.1.post1+cu124torch2.5-cp38-abi3-linux_x86_64.whl",
        "LIBSODIUM_VERSION": "1.0.20",
        "PYTHON_VERSION": "3.12",
        "VLLM_TGIS_ADAPTER_VERSION": "0.6.3"
      },
      "labels": {
        "org.opencontainers.image.source": "https://github.com/neuralmagic/nm-vllm-ent",
        "vcs-ref": "9803ee1c6d30330c9dc3fca6d42491794f135013",
        "vcs-type": "git"
      },
      "tags": [
        "quay.io/vllm/vllm:0.8.3",
        "quay.io/vllm/vllm:9803ee1c6d30330c9dc3fca6d42491794f135013",
        "quay.io/vllm/vllm:2025-04-04-17-55"
      ],
      "platforms": [
        "linux/amd64"
      ]
    }
  }
}
```
chaojun-zhang pushed a commit to chaojun-zhang/vllm that referenced this pull request Jun 17, 2025
* use 2025.1.1 instead (vllm-project#196)

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>

* Use standalone_compile by default in torch >= 2.8.0 (vllm-project#18846)

Signed-off-by: rzou <zou3519@gmail.com>

* fix xpu compile issue

---------

Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
Signed-off-by: rzou <zou3519@gmail.com>
Co-authored-by: Richard Zou <zou3519@users.noreply.github.com>
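
As a hedged illustration of the version gate mentioned in the second bullet (this is not the actual vLLM patch, and the helper name is hypothetical), the feature would only be enabled when the installed torch is new enough:

```python
from packaging import version

import torch


def standalone_compile_supported() -> bool:
    # Hypothetical helper: enable the standalone_compile path only on
    # torch >= 2.8.0, matching the commit message above. Comparing the
    # parsed release tuple keeps 2.8.0 pre-release builds on the new
    # path as well.
    return version.parse(torch.__version__).release >= (2, 8, 0)
```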
jikunshang added a commit to jikunshang/vllm that referenced this pull request Jun 18, 2025
zhenwei-intel pushed a commit to zhenwei-intel/vllm that referenced this pull request Jun 23, 2025
jikunshang added a commit to jikunshang/vllm that referenced this pull request Jun 24, 2025
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
…nd v_cache. (vllm-project#204)

This PR changes the shape of the KV cache to avoid the view of k_cache and
v_cache. In addition, it caches the metadata of k_cache and v_cache to avoid
duplicate slice operations, improving performance.

Signed-off-by: hw_whx <wanghexiang7@huawei.com>
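
A minimal sketch of the metadata-caching idea described above, assuming a toy [2, num_layers, ...] cache layout (the real change targets vLLM's KV-cache shapes; all names here are illustrative):

```python
import torch


class KVCacheMeta:
    """Illustrative holder that computes per-layer K/V slices once and
    reuses them, instead of re-slicing the cache on every forward step."""

    def __init__(self, kv_cache: torch.Tensor) -> None:
        # kv_cache: [2, num_layers, ...] where index 0 is K and 1 is V.
        # Slicing a tensor returns a view, so storing these views up front
        # avoids repeating the slice (and its bookkeeping) each step.
        num_layers = kv_cache.shape[1]
        self.k_cache = [kv_cache[0, i] for i in range(num_layers)]
        self.v_cache = [kv_cache[1, i] for i in range(num_layers)]


kv = torch.zeros(2, 4, 8, 16)  # toy cache: 2 (K/V) x 4 layers x ...
meta = KVCacheMeta(kv)
assert meta.k_cache[0].shape == (8, 16)
```

Because torch slicing returns views, the cached entries alias the underlying storage, so later writes to `kv` remain visible through `meta.k_cache` without extra copies.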