
.Net: HuggingFace implement also IChatCompletionService? #5403

Closed
jkone27 opened this issue Mar 10, 2024 · 14 comments · Fixed by #5785
Assignees
Labels
.NET Issue or Pull requests regarding .NET code

Comments

@jkone27

jkone27 commented Mar 10, 2024

I am testing the Hugging Face preview package for .NET, but I cannot make use of IChatCompletionService.

The kernel fails to resolve it. I am not using OpenAI, only the Hugging Face APIs.

#r "nuget: Microsoft.SemanticKernel"
#r "nuget: Microsoft.Extensions.Logging.Debug"
#r "nuget: Microsoft.SemanticKernel.Abstractions"
#r "nuget: Microsoft.SemanticKernel.Connectors.HuggingFace, 1.5.0-preview"
#r "nuget: Microsoft.Extensions.DependencyInjection"

open Microsoft.Extensions.Logging
//open Microsoft.SemanticKernel.Plugins.Core
open Microsoft.SemanticKernel
open Microsoft.SemanticKernel.Connectors.HuggingFace
open System
open System.ComponentModel
open Microsoft.Extensions.DependencyInjection
open Microsoft.Extensions.Logging.Abstractions
open Microsoft.SemanticKernel.ChatCompletion
open Microsoft.SemanticKernel.Connectors.OpenAI
open System.Threading.Tasks

// type Test() =
//     interface IChatCompletionService with
//         member this.GetChatMessageContentsAsync(chatHistory, executionSettings, kernel, cancellationToken) =
//             task { return [ ChatMessageContent() ] :> IReadOnlyList<ChatMessageContent> }


let builder =
    let apikey = "hf_..." // use your own Hugging Face API token; never commit real keys
    let model = "openchat/openchat-3.5-0106"

    Kernel
        .CreateBuilder()
        .AddHuggingFaceTextGeneration(model = model, apiKey = apikey)
        // .AddOpenAIChatCompletion(model) // not using OpenAI; tried adding this, but it also fails to load

type LightPlugin() =
    let mutable isOn = false

    [<KernelFunction>]
    [<Description("Gets the state of the light.")>]
    member this.GetState() = if isOn then "on" else "off"

    [<KernelFunction>]
    [<Description("Changes the state of the light.")>]
    member this.ChangeState(newState: bool) =
        isOn <- newState
        let state = this.GetState()
        printfn $"[Light is now {state}]"
        state

// load a plugin from code
builder.Services.AddLogging(fun c -> c.AddDebug().SetMinimumLevel(LogLevel.Trace) |> ignore)
builder.Plugins.AddFromType<LightPlugin>() |> ignore
let kernel = builder.Build()

// or read the input from fsi.CommandLineArgs[1], for example
printfn "prompt to summarize:"
let request = Console.ReadLine()

let testPrompt =
    $"make a 1 liner quick and short summary of max 20 words of this request: `{request}`"

// 1. test "raw" prompts to verify your connection
let result = kernel.InvokePromptAsync(testPrompt).Result

printfn $"> {result}"

// 2. test SK prompts

let history = new ChatHistory()

let chatCompletionService = kernel.GetRequiredService<IChatCompletionService>()


let switchTheLight (cmd: string) = $"switch the light {cmd}"

switchTheLight "ON" |> history.AddUserMessage

// Enable auto function calling
let openAIPromptExecutionSettings = new OpenAIPromptExecutionSettings()
openAIPromptExecutionSettings.ToolCallBehavior <- ToolCallBehavior.AutoInvokeKernelFunctions

// Get the response from the AI
let result2 =
    chatCompletionService
        .GetChatMessageContentAsync(history, openAIPromptExecutionSettings, kernel)
        .Result

printfn $">> {result2.Content}"
history.AddMessage(result2.Role, result2.Content) |> ignore
@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code triage labels Mar 10, 2024
@github-actions github-actions bot changed the title HuggingFace implement also IChatCompletionService? .Net: HuggingFace implement also IChatCompletionService? Mar 10, 2024
@Krzysztof318
Contributor

Krzysztof318 commented Mar 10, 2024

I suggest implementing your own IChatCompletionService for your specific model. You can reuse the existing HuggingFaceTextGenerationService, just with a specially formatted prompt.

The Hugging Face Inference API doesn't have a common syntax for chat, so the chat payload can vary depending on the model you use.

@jkone27
Author

jkone27 commented Mar 10, 2024

Can you provide a quick sample of how to implement a very basic one? I was thinking the same thing.

@Krzysztof318
Contributor

Look at this sample: https://github.com/microsoft/semantic-kernel/blob/main/dotnet%2Fsamples%2FKernelSyntaxExamples%2FExample16_CustomLLM.cs

A chat completion service is similar, but it takes a ChatHistory parameter instead of a string prompt.

You have to build a prompt from the chat history, something like this:

User: hello
Assistant: hi, what can I do?
User: help me with...
Assistant: <left empty so the model can complete this turn>

But keep in mind this pattern may not work with every model.
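
The flattening step described above can be sketched as a small helper. This is a minimal illustration using plain (role, content) tuples rather than SK's ChatHistory type, so the snippet stays self-contained; a custom IChatCompletionService could build such a prompt and pass it to HuggingFaceTextGenerationService.

```fsharp
// Hypothetical helper: flattens a chat transcript into a single prompt
// string, leaving a trailing "Assistant:" turn for the model to complete.
let toChatPrompt (history: (string * string) list) =
    let turns =
        history |> List.map (fun (role, content) -> $"{role}: {content}")
    String.concat "\n" (turns @ [ "Assistant:" ])

let prompt =
    toChatPrompt
        [ "User", "hello"
          "Assistant", "hi, what can I do?"
          "User", "help me with..." ]
```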

@markwallace-microsoft markwallace-microsoft added question Further information is requested and removed triage labels Mar 12, 2024
@markwallace-microsoft markwallace-microsoft removed the question Further information is requested label Mar 12, 2024
@markwallace-microsoft
Member

@jkone27 if you do want to make a contribution to Semantic Kernel, @RogerBarreto can help by setting up a feature branch and getting your changes reviewed.

@Krzysztof318
Contributor

My oversight: I see now that Hugging Face supports chat via the Inference API: https://huggingface.co/docs/api-inference/detailed_parameters?code=python#conversational-task

@JonathanVelkeneers

I was looking for this functionality as well. Is it possible to implement it in the connector, seeing as it has been added to Hugging Face's API recently?

https://huggingface.co/docs/text-generation-inference/messages_api#hugging-face-inference-endpoints
https://huggingface.co/blog/tgi-messages-api

@jkone27
Author

jkone27 commented Mar 20, 2024

@markwallace-microsoft @Krzysztof318 ⬆️

@Krzysztof318
Contributor

@jkone27 hi, unfortunately I don't have time to implement that now. But why don't you contribute and create a PR for it? Implementing this should be quite simple.

@RogerBarreto
Member

> I was looking for this functionality as well. Is it possible to implement it in the connector, seeing as it has been added to Hugging Face's API recently?

It is; although not supported by the public API, this seems to be valid for TGI deployments.

Hugging Face - Chat Completion POC.


@JonathanVelkeneers

JonathanVelkeneers commented Mar 29, 2024

> I was looking for this functionality as well. Is it possible to implement it in the connector, seeing as it has been added to Hugging Face's API recently?
>
> It is; although not supported by the public API, this seems to be valid for TGI deployments.
>
> Hugging Face - Chat Completion POC.

@RogerBarreto
It is supported by the public API, albeit poorly documented. Here is an example of how a Hugging Face model can be used with an existing OpenAI client library, and thus in chat mode.

This in turn can be translated to a cURL request.

A requirement for this to work with a specific model is that it has the chat_template property in its tokenizer_config.json file example. So not all models will work out of the box with the public Inference API, but most popular ones have this implemented.
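
A rough way to check this programmatically, assuming the hub's raw-file URL layout (huggingface.co/&lt;model&gt;/raw/main/tokenizer_config.json), is a quick string search over the config; this is a sketch, not an official API:

```fsharp
open System.Net.Http

// Sketch: fetch a model's tokenizer_config.json from the hub and check
// whether it declares a chat_template.
let hasChatTemplate (model: string) =
    use http = new HttpClient()
    let url = $"https://huggingface.co/{model}/raw/main/tokenizer_config.json"
    let json = http.GetStringAsync(url).Result
    json.Contains "\"chat_template\""
```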

Making a cURL request to this model on the public API works in OpenAI chat style:

    curl https://api-inference.huggingface.co/models/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO/v1/chat/completions \
    -X POST \
    -d '{"model":"NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO", "messages": [{"role":"user","content":"How is the weather in Antwerp, Belgium?"}], "parameters": {"temperature": 0.7, "max_new_tokens": 100}}' \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer XXXXXXXXXXXXXXX"

This yields the following result:

    {
      "id": "",
      "object": "text_completion",
      "created": 1711714682,
      "model": "text-generation-inference/Nous-Hermes-2-Mixtral-8x7B-DPO-medusa",
      "system_fingerprint": "1.4.3-sha-e6bb3ff",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "As weather data changes constantly, the most accurate and up-to-date weather information for Antwerp, Belgium, can be found through weather websites or apps. These sources provide real-time and forecasted weather updates, including temperature, wind speed, humidity, and chance of precipitation. To get the current data, visit websites like Weather.com or AccuWeather.com and search for 'Antwerp, Belgium'. Or, you can use a weather app like AccuWe"
          },
          "logprobs": null,
          "finish_reason": "length"
        }
      ],
      "usage": { "prompt_tokens": 20, "completion_tokens": 100, "total_tokens": 120 }
    }

Streaming is also supported when "stream": true is present in the request body.
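
For .NET readers, the same request as the cURL call above can be sketched with HttpClient from an F# script; the "hf_..." token is a placeholder for your own Hugging Face API token:

```fsharp
open System.Net.Http
open System.Net.Http.Headers
open System.Text

let model = "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO"
let url =
    $"https://api-inference.huggingface.co/models/{model}/v1/chat/completions"

// Same payload as the cURL example above.
let payload =
    """{"model": "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO",
        "messages": [{"role": "user", "content": "How is the weather in Antwerp, Belgium?"}],
        "parameters": {"temperature": 0.7, "max_new_tokens": 100}}"""

let http = new HttpClient()
http.DefaultRequestHeaders.Authorization <-
    AuthenticationHeaderValue("Bearer", "hf_...")

let response =
    http.PostAsync(url, new StringContent(payload, Encoding.UTF8, "application/json")).Result

printfn $"{response.Content.ReadAsStringAsync().Result}"
```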

@jkone27
Author

jkone27 commented Apr 3, 2024


Probably all the models supported in HuggingChat should support this as well? Just a thought.

github-merge-queue bot pushed a commit that referenced this issue Apr 16, 2024
### Motivation and Context

Closes #5403 

1. Adds chat completion support for TGI (Text Generation Inference) deployments.
2. Adds missing unit tests for streaming and non-streaming scenarios (text/chat completion).
3. Updates metadata and usage details for the Hugging Face clients.

### Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution
Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
and the [pre-submission formatting
script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts)
raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄
@jkone27
Author

jkone27 commented Apr 17, 2024

thank you @RogerBarreto

@jkone27
Author

jkone27 commented Apr 17, 2024

Is there an example of this somewhere in the tests or docs, using any Hugging Face chat API?

@JonathanVelkeneers

@jkone27

There is an example here.
In theory it's just a matter of replacing the localhost URL with a Hugging Face Inference API URL. For example, either
https://api-inference.huggingface.co/models/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO/v1/chat/completions
or https://api-inference.huggingface.co/models/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO
should work.

However, these changes have not been released in a new version yet (the last release was 2 weeks ago).
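
Once a release containing these changes ships, usage might look like the sketch below. Note this is speculative: the AddHuggingFaceChatCompletion method name and its endpoint parameter are assumptions based on the PR, not a released API.

```fsharp
open System
open Microsoft.SemanticKernel
open Microsoft.SemanticKernel.ChatCompletion

// Sketch only: method name and parameters are assumed from the PR.
let kernel =
    Kernel
        .CreateBuilder()
        .AddHuggingFaceChatCompletion(
            model = "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO",
            endpoint = Uri "https://api-inference.huggingface.co",
            apiKey = "hf_...")
        .Build()

let chatService = kernel.GetRequiredService<IChatCompletionService>()
```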
