Merge sequence steps response #3690

Closed
wants to merge 26 commits into from

Commits on May 14, 2024

  1. beMerged field in crd, BeMerged field in struct, router behavior

    Signed-off-by: asd981256 <asd981256@gmail.com>
    asd981256 committed May 14, 2024
    d7cdd1e
  2. add test func

    Signed-off-by: asd981256 <asd981256@gmail.com>
    asd981256 committed May 14, 2024
    af4b42d
  3. go mod vendor, make generate, make test

    Signed-off-by: asd981256 <asd981256@gmail.com>
    asd981256 committed May 14, 2024
    46658e2
  4. Remove generate endpoints (kserve#3654)

    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    cmaddalozzo authored and asd981256 committed May 14, 2024
    3d1f5a3
  5. Assign device to input tensors in huggingface server with huggingface backend (kserve#3657)
    
    * Assign device of input tensors
    
    Signed-off-by: sailgpu <sailesh.duddupudi@nutanix.com>
    
    * lint fix
    
    Signed-off-by: sailgpu <sailesh.duddupudi@nutanix.com>
    
    ---------
    
    Signed-off-by: sailgpu <sailesh.duddupudi@nutanix.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    saileshd1402 authored and asd981256 committed May 14, 2024
    767318d
  6. Test image builds for ARM64 arch in CI (kserve#3629)

    * Test image builds for ARM64 arch in CI
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    
    * Update lockfiles
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    
    * Add ARM64 support for paddle
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    
    ---------
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    sivanantha321 authored and asd981256 committed May 14, 2024
    18010f4
  7. Fix Huggingface server stopping criteria (kserve#3659)

    * Encoder-decoder models do not include input tokens in their output
    
    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    
    * Pass stopping criteria into streamer
    
    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    
    ---------
    
    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    cmaddalozzo authored and asd981256 committed May 14, 2024
    d84d4b4
  8. Enabled the multiple domains support on an inference service (kserve#3615)
    
    * Added the field AdditionalIngressDomains into the struct IngressConfig
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Added the additional ingress domains into the hosts
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Fixed the indentation
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Added isvc name and namespace into the domain name
    
    * Added the validation for the URLs
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Validate the domain in the additionalIngressDomains
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Create the hosts from the list of additionalIngressDomains
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Change the way to validate the host
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Change the validation error message
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Revert the name to url
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Get all the available domain list
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * gofmt -s -w the file
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Add additionalIngressDomains into the charts
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Added the comments and refactor the tests
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Regenerate the manifests
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Modify createHTTPMatchRequest, the charts and the test cases
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    * Run make generate
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    
    ---------
    
    Signed-off-by: Vincent Hou <shou73@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    houshengbo authored and asd981256 committed May 14, 2024
    1337105
  9. Explicitly specify pad token id when generating tokens (kserve#3565)

    * Add fall back pad token for tokenizer
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    
    * Make linter happy
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    
    * Update test
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    
    * Rebase master
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    
    ---------
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    sivanantha321 authored and asd981256 committed May 14, 2024
    fa101f6
  10. Fix quick install not cleaning up Istio installer (kserve#3660)

    Fix quick install not cleaning up Istio installer
    
    Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    sivanantha321 authored and asd981256 committed May 14, 2024
    3138e5a
  11. Add base model for proxying request to an OpenAI API enabled model server (kserve#3621)
    
    * Add model to proxy requests to an OpenAI-enabled predictor
    
    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    
    * Set default timeout
    
    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    
    * Add error handling
    
    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    
    * Add missing licenses
    
    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    
    ---------
    
    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    cmaddalozzo authored and asd981256 committed May 14, 2024
    56920f5
  12. Add headers to predictor exception logging (kserve#3658)

    * Add headers to predictor exception logging
    
    Signed-off-by: grandbora <grandbora@fb.com>
    
    * Log request id only
    
    Signed-off-by: grandbora <grandbora@fb.com>
    
    * Update log
    
    Signed-off-by: grandbora <grandbora@fb.com>
    
    ---------
    
    Signed-off-by: grandbora <grandbora@fb.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    grandbora authored and asd981256 committed May 14, 2024
    fef646b
  13. workflow file for cherry-pick on comment (kserve#3653)

    * workflow file for cherry-pick on comment
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    
    * updated release notes and workflow
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    
    * Remove obsolete cherry pick workflow
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    
    ---------
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    andyi2it authored and asd981256 committed May 14, 2024
    312f2ba
  14. Enhance controller setup based on available CRDs (kserve#3472)

    * Enhance controller setup based on available CRDs
    
    This enhances the setup of the InferenceService controller and the InferenceGraph controller. Instead of relying on the `defaultDeploymentMode` configuration to determine which CRDs to watch, the setup now checks whether Knative Services and Istio VirtualServices are available in the cluster and sets up the watches (invokes `Owns`) accordingly.
    
    This enhancement has the following advantages:
    * A crashloop is prevented if the CRDs are missing in the cluster. The user can still create InferenceServices by annotating each ISVC for RawDeployment mode.
    * If RawDeployment mode is configured as the default mode, the controllers still watch Knative and Istio resources when these components are available. This lets the controller watch for changes to the dependent resources if the user uses Serverless mode for some of the InferenceServices.
    * In the InferenceService controller, the watch for VirtualServices is still conditional on the value of the `disableVirtualHost` configuration.
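
The availability check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from kserve#3472: `crdLister`, `fakeCluster`, `watchPlan`, and `planWatches` are hypothetical names, the in-memory map stands in for a real query against the API server's discovery endpoint, and the group/version strings are assumptions.

```go
package main

import "fmt"

// crdLister is a hypothetical stand-in for a Kubernetes discovery client.
// It reports whether a kind is served under the given group/version.
type crdLister interface {
	HasKind(groupVersion, kind string) bool
}

// fakeCluster is an in-memory crdLister used only for this illustration.
type fakeCluster map[string]bool

func (f fakeCluster) HasKind(groupVersion, kind string) bool {
	return f[groupVersion+"/"+kind]
}

// watchPlan records which optional resources the controller should watch.
type watchPlan struct {
	KnativeService bool
	VirtualService bool
}

// planWatches decides the watches from what the cluster actually serves,
// not from the configured default deployment mode. The VirtualService
// watch is additionally gated by the disableVirtualHost setting.
func planWatches(c crdLister, disableVirtualHost bool) watchPlan {
	return watchPlan{
		// Group/version strings below are assumptions for the sketch.
		KnativeService: c.HasKind("serving.knative.dev/v1", "Service"),
		VirtualService: !disableVirtualHost &&
			c.HasKind("networking.istio.io/v1beta1", "VirtualService"),
	}
}

func main() {
	// A cluster with neither Knative nor Istio installed.
	rawOnly := fakeCluster{}
	// A cluster with both sets of CRDs installed.
	full := fakeCluster{
		"serving.knative.dev/v1/Service":             true,
		"networking.istio.io/v1beta1/VirtualService": true,
	}

	fmt.Printf("raw-only cluster:        %+v\n", planWatches(rawOnly, false))
	fmt.Printf("full cluster:            %+v\n", planWatches(full, false))
	fmt.Printf("full, virtual host off:  %+v\n", planWatches(full, true))
}
```

In a real controller, each `true` field would presumably translate into an `Owns` call on the controller builder, and a `false` field would simply skip that watch instead of letting the manager crash on a missing CRD.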
    
    Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
    
    * Controller setup - add schemas based on CRDs available
    
    Since KServe controllers are modified to watch resources based on
    available CRDs, a similar change in the setup of the manager is needed:
    schemas need to be added to the manager based on available CRDs rather
    than based only on the values in the inferenceservice-config ConfigMap.
    This would keep both manager setup and controller setup in sync with
    regard to schemas and watches around the CRDs.
    
    Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
    
    ---------
    
    Signed-off-by: Edgar Hernández <23639005+israel-hdez@users.noreply.github.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    israel-hdez authored and asd981256 committed May 14, 2024
    4275854
  15. Bump version to 0.13.0-rc0 (kserve#3665)

    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    cmaddalozzo authored and asd981256 committed May 14, 2024
    dcb71d8
  16. upgrade vllm/transformers version (kserve#3671)

    upgrade vllm version
    
    Signed-off-by: Johnu George <johnugeorge109@gmail.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    johnugeorge authored and asd981256 committed May 14, 2024
    dc49c86
  17. Add openai models endpoint (kserve#3666)

    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    cmaddalozzo authored and asd981256 committed May 14, 2024
    4437792
  18. feat: Support customizable deployment strategy for RawDeployment mode. Fixes kserve#3452 (kserve#3603)
    
    * feat: Support customizable deployment strategy for RawDeployment mode
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * regen
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * lint
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * Correctly apply rollingupdate
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * address comments
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * Add validation
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    ---------
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    terrytangyuan authored and asd981256 committed May 14, 2024
    2a22f5a
  19. Enable dtype support for huggingface server (kserve#3613)

    * Enable dtype for huggingface server
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Set float16 as default. Fixup linter
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Add small comment to make the changes understandable
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Fixup linter
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Adapt to new huggingfacemodel
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Fixup merge :)
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Explicitly mention the behaviour of dtype flag on auto.
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Default to FP32 for encoder models
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Selectively add --dtype to parser. Use FP16 for GPU and FP32 for CPU
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Fixup linter
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Update poetry
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    * Use torch.float32 for tests explicitly
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    
    ---------
    
    Signed-off-by: Dattu Sharma <venkatadattasainimmaturi@gmail.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    Datta0 authored and asd981256 committed May 14, 2024
    fe6cd06
  20. Add method for checking model health/readiness (kserve#3673)

    Signed-off-by: Curtis Maddalozzo <cmaddalozzo@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    cmaddalozzo authored and asd981256 committed May 14, 2024
    81cef35
  21. fix for extract zip from gcs (kserve#3510)

    * fix for extract zip from gcs
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    
    * initial commit for gcs model download unittests
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    
    * unittests for model download from gcs
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    
    * black format fix
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    
    * code verification
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    
    ---------
    
    Signed-off-by: Andrews Arokiam <andrews.arokiam@ideas2it.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    andyi2it authored and asd981256 committed May 14, 2024
    0b24241
  22. Update Dockerfile and Readme (kserve#3676)

    Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    gavrishp authored and asd981256 committed May 14, 2024
    73bad9a
  23. Update huggingface readme (kserve#3678)

    * update wording for huggingface README
    
    small update to make readme easier to understand
    
    Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
    
    * Update README.md
    
    Signed-off-by: Alexa Griffith agriffith50@bloomberg.net
    
    * Update python/huggingfaceserver/README.md
    
    Co-authored-by: Filippe Spolti <filippespolti@gmail.com>
    Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
    
    * update vllm
    
    Signed-off-by: alexagriffith <agriffith50@bloomberg.net>
    
    * Update README.md
    
    ---------
    
    Signed-off-by: Alexa Griffith  <agriffith50@bloomberg.net>
    Signed-off-by: Alexa Griffith agriffith50@bloomberg.net
    Signed-off-by: alexagriffith <agriffith50@bloomberg.net>
    Signed-off-by: Dan Sun <dsun20@bloomberg.net>
    Co-authored-by: Filippe Spolti <filippespolti@gmail.com>
    Co-authored-by: Dan Sun <dsun20@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    3 people authored and asd981256 committed May 14, 2024
    5567a73
  24. fix: HPA equality check should include annotations (kserve#3650)

    * fix: HPA equality check should include annotations
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * Only watch related autoscalerclass annotation
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * simplify
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * Add missing delete action
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    
    * fix logic
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    ---------
    
    Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    terrytangyuan authored and asd981256 committed May 14, 2024
    1f0d5f3
  25. Fix: huggingface runtime in helm chart (kserve#3679)

    fix huggingface runtime in chart
    
    Signed-off-by: Dan Sun <dsun20@bloomberg.net>
    Signed-off-by: asd981256 <asd981256@gmail.com>
    yuzisun authored and asd981256 committed May 14, 2024
    f47bb9a
  26. rename field to Response, re-generate code

    Signed-off-by: asd981256 <asd981256@gmail.com>
    asd981256 committed May 14, 2024
    9684f3d