.Net : Feature/multiconnector #2323
Conversation
…project (microsoft#1357)

### Motivation and Context
[Oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui) is the most popular open-source platform for hosting LLMs. It exposes a web API that supports both blocking and streaming requests. A connector to Oobabooga brings semantic-kernel to the global "local Llamas" community.

### Description
This PR adds to the solution a project similar to the HuggingFace connectors project, plus an integration test also modeled on the HuggingFace connector's. The connector code was based on the existing HuggingFace connector, with a couple of improvements (e.g. using web sockets for the streaming API).
merge upstream
Hi, this is a new PR from a new user-level fork, to work around a GitHub bug preventing edits by maintainers on a PR made from an organization-level fork (see the [final comment](microsoft#1357 (comment)) on the last PR). This PR only attempts to merge upstream's main for final integration of the Oobabooga connector into the main branch. A recent set of commits made CompleteRequestSettings.MaxTokens nullable ([#450f1d3a11eb95d6975da33f581d3997bed42906](microsoft#1367)), which broke this connector. Making TextCompletionRequest.MaxNewTokens nullable as well fixed the issue. Note that Oobabooga [defaults max tokens to 200](https://github.com/oobabooga/text-generation-webui/blob/main/extensions/api/util.py#L24). Co-authored-by: Shawn Callegari <36091529+shawncal@users.noreply.github.com>
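To illustrate the nullable-max-tokens fix described above, here is a minimal sketch in Python (the actual connector is C#, and these names are hypothetical): when the field is left unset, it is simply omitted from the request payload so the Oobabooga server applies its own default of 200.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of the fix described above; the real code is C#.
# Oobabooga's server-side default when max_new_tokens is omitted is 200.

@dataclass
class TextCompletionRequest:
    prompt: str
    max_new_tokens: Optional[int] = None  # None lets the server apply its default

    def to_payload(self) -> dict:
        """Build the request body, omitting unset optional fields."""
        payload = {"prompt": self.prompt}
        if self.max_new_tokens is not None:
            payload["max_new_tokens"] = self.max_new_tokens
        return payload
```

This mirrors the idea of making the setting nullable: a `None` value no longer forces an explicit token limit onto the server.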
merge upstream
feature/oobaboga: Fixing merge from main (microsoft#1911)
merge upstream
merge upstream
### Motivation and Context
1. Why is this change required? To keep up to date with main in order to allow merging into the main branch.
2. What problem does it solve? Merge upstream.
3. What scenario does it contribute to? Getting the branch ready for merging.
4. If it fixes an open issue, please link to the issue here.

### Description
@dmytrostruk @shawncal, hi guys, is there anything now preventing that branch from getting merged into main? Again, as you noted, keeping up to date with main means chasing a moving target, and I will be relieved when this is integrated.

### Contribution Checklist
- [X] The code builds clean without any errors or warnings
- [X] The PR follows SK Contribution Guidelines (https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md)
- [X] The code follows the .NET coding conventions (https://learn.microsoft.com/dotnet/csharp/fundamentals/coding-style/coding-conventions) verified with `dotnet format`
- [X] All unit tests pass, and I have added new tests where possible
- [X] I didn't break anyone 😄

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Shawn Callegari <36091529+shawncal@users.noreply.github.com>
Co-authored-by: Gina Triolo <51341242+gitri-ms@users.noreply.github.com>
Co-authored-by: Devis Lucato <dluc@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Craig Presti <146438+craigomatic@users.noreply.github.com>
Co-authored-by: Craig Presti <craig.presti@microsoft.com>
Co-authored-by: Mark Wallace <127216156+markwallace-microsoft@users.noreply.github.com>
Co-authored-by: Teresa Hoang <125500434+teresaqhoang@users.noreply.github.com>
Co-authored-by: Abby Harrison <54643756+awharrison-28@users.noreply.github.com>
Co-authored-by: Tao Chen <TaoChenOSU@users.noreply.github.com>
Co-authored-by: Aman Sachan <51973971+amsacha@users.noreply.github.com>
Co-authored-by: cschadewitz <schadewitzcasey@gmail.com>
Co-authored-by: Dmytro Struk <13853051+dmytrostruk@users.noreply.github.com>
Co-authored-by: Abby Harrison <abby.harrison@microsoft.com>
merge upstream
.Net: Feature/oobabooga merge from Main (microsoft#1967)
Merge upstream
merge upstream
Merge upstream
merge upstream
Merge upstream
Merge upstream
merge upstream
@nacharya1 we saw a demo of this PR today. Given that it's quite a large PR, it might be worth scheduling a review during a future sprint.
With the main demo pipeline now working, I will now add documentation to make this more accessible. I refrained from updating to #2229, which will also take some time to merge, but most models from https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard are now chat-instruct, so some scaffolding will probably fall into place when chat becomes available for the Oobabooga connector.
merge upstream
@dmytrostruk, @shawncal, @lemillermicrosoft, @nacharya1, following our previous discussions and the closure of the Oobabooga PR, I've successfully migrated both the Oobabooga connector and the MultiConnector to a new repository. I've also created a series of starter notebooks to help users get up to speed with the new setup. I believe the repo and its NuGet packages are now stable and ready for wider use. Given these developments, it's probably time to close this PR as well. Questions:
Looking forward to your guidance.
@jsboige This is amazing!!! I will communicate that with the internal team and will let you know about further steps. Thanks a lot!
As mentioned above, this PR can be closed.
@markwallace-microsoft thanks, no problem at all.
Motivation and Context
This PR aims to introduce differential text completion, leveraging distinct models for distinct prompts. With a focus on duration and cost gains, the new MultiConnector text completion performs an online analysis: a primary completion is used by default and is charged with vetting secondary connectors by validating their responses, so that prompts are progressively offloaded to the connectors with the best duration/cost performance.
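The routing idea above can be sketched in a few lines. This is a language-agnostic illustration in Python, not the actual C# MultiConnector API; all names (`pick_connector`, the `cost`/`duration` fields, the `cost_weight` preference) are hypothetical.

```python
# Illustrative sketch: route each prompt type to the vetted connector with the
# best blended duration/cost score, falling back to the primary connector.
# Hypothetical names; the real implementation is the C# MultiConnector.

def pick_connector(prompt_type, vetted, primary, cost_weight=0.5):
    """Return the best vetted connector for a prompt type, or the primary."""
    candidates = vetted.get(prompt_type, [])
    if not candidates:
        return primary  # nothing vetted yet for this prompt type

    # Lower is better: blend cost and duration according to the preference weight.
    def score(connector):
        return cost_weight * connector["cost"] + (1 - cost_weight) * connector["duration"]

    return min(candidates, key=score)
```

A higher `cost_weight` would favor cheaper connectors even when they are slower, which is the kind of preference the MultiConnector settings express.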
Description
This PR adds a new MultiConnector Text completion project to the solution and the corresponding Unit tests.
The MultiConnector class hierarchy introduces settings to control how connectors are analyzed, vetted and chosen to handle text completion calls.
Settings are organized by prompt types, with a notion of prompt signature to match prompts of the same type.
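One simple way to realize the "prompt signature" notion above is a normalized fixed-length prefix, so that prompts built from the same template share routing settings. This is a hedged sketch under that assumption; the function and parameter names are hypothetical and not the connector's actual API.

```python
# Hypothetical sketch of prompt signatures: prompts sharing a template prefix
# map to the same signature and therefore to the same per-type settings.

def prompt_signature(prompt: str, prefix_len: int = 32) -> str:
    """Normalize whitespace and keep a fixed-length prefix as the signature."""
    normalized = " ".join(prompt.split())
    return normalized[:prefix_len]

def match_settings(prompt, settings_by_signature, default_settings):
    """Look up settings for the prompt's signature, with a fallback default."""
    return settings_by_signature.get(prompt_signature(prompt), default_settings)
```

Two prompts instantiated from the same template ("Translate the following text to French: …") would then resolve to the same settings entry.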
Initial commits include a single complex unit test as a proof of concept.
The POC reads as follows: a primary connector is capable of performing all 4 arithmetic operations, whereas each secondary connector is only capable of a single one. Primary and secondary connectors have distinct duration and cost performances, with configurable preferences on their respective weights.
The multi-connector is created with online analysis enabled, and a first round of calls is made with all 4 operations. The calls are handled by the primary connector and trigger the evaluation of all connectors on all 4 prompt types. The analysis then vets the valid connectors with their respective performances and updates the main settings.
Another round of calls is then performed, which should be handled by connectors according to the analysis and the given preferences in performance gains.
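The two-round flow above can be sketched conceptually as follows. This is an illustrative Python reduction of the C# unit test, with hypothetical names: connectors are modeled as plain callables, and `validate` stands in for checking a secondary's answer against the primary's.

```python
# Conceptual sketch of the POC: round 1 runs on the primary and evaluates every
# secondary connector per prompt type; round 2 offloads to vetted secondaries.
# Hypothetical names; the real test lives in the C# MultiConnector project.

def run_poc(prompts_by_type, primary, secondaries, validate):
    vetted = {}
    # Round 1: the primary answers everything and its answers serve as references.
    for ptype, prompt in prompts_by_type.items():
        reference = primary(prompt)
        vetted[ptype] = [s for s in secondaries if validate(s(prompt), reference)]
    # Round 2: offload each prompt type to a vetted secondary when one exists.
    results = {}
    for ptype, prompt in prompts_by_type.items():
        handler = vetted[ptype][0] if vetted[ptype] else primary
        results[ptype] = handler(prompt)
    return results, vetted
```

In the arithmetic POC, a secondary that only handles addition would be vetted for the "add" prompt type and skipped for the others, which then stay on the primary.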
More unit tests will be included, as well as a real-world integration test involving offloading simple prompts from ChatGPT to smaller Llama 2 models.
Further improvements will introduce an Infer.NET-based probabilistic model to improve the assessment of model and mixture-model capabilities.
Contribution Checklist