Feat: semi self-hosted LLMs using AzureML Online Endpoints #93

Draft: wants to merge 3 commits into base: main

janaka (Contributor) commented Sep 5, 2023

Description

Implements #10

Enables deploying an OSS LLM such as Llama2 on AzureML, plus a LlamaIndex-level API implementation.

  • IaC for infra using ARM templates
  • AzureML Online Endpoints Web API client
  • LlamaIndex LLM class for AzureML self-hosted models
  • Test
  • maybe some LLM class code folder re-organisation
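
As a rough illustration of the "AzureML Online Endpoints Web API client" item above, here is a minimal sketch using only the Python standard library. It is not the code in this PR. The class name `AzureMLEndpointClient`, its method names, and the default parameter values are hypothetical. The `input_data` / `input_string` / `parameters` payload layout is an assumption based on the Llama2 chat models in the AzureML model catalog and should be checked against the deployed model's scoring schema. Bearer-key auth and the optional `azureml-model-deployment` routing header are standard for managed online endpoints.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class AzureMLEndpointClient:
    """Hypothetical thin client for an AzureML Online Endpoint scoring URI."""

    scoring_uri: str                  # e.g. the endpoint's ".../score" URL
    api_key: str                      # endpoint key or token
    deployment: Optional[str] = None  # route to one deployment if set

    def build_request(self, messages, temperature=0.7, max_new_tokens=256):
        """Build (but do not send) the POST request for a chat completion."""
        # Assumed payload shape for a Llama2 chat deployment; verify it
        # against the scoring schema of the model actually deployed.
        payload = {
            "input_data": {
                "input_string": messages,
                "parameters": {
                    "temperature": temperature,
                    "max_new_tokens": max_new_tokens,
                },
            }
        }
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_key}",
        }
        if self.deployment:
            headers["azureml-model-deployment"] = self.deployment
        return urllib.request.Request(
            self.scoring_uri,
            data=json.dumps(payload).encode("utf-8"),
            headers=headers,
            method="POST",
        )

    def chat(self, messages, **params):
        """Send the request and return the decoded JSON response."""
        request = self.build_request(messages, **params)
        with urllib.request.urlopen(request, timeout=60) as response:
            return json.loads(response.read().decode("utf-8"))
```

Keeping `build_request` separate from `chat` means the request shape can be unit-tested without a live endpoint. A LlamaIndex LLM class could then delegate to `chat` and map the response into LlamaIndex's own types.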

Note:

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor and code improvement (non-breaking change which improves code quality and/or performance)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation
  • Tests
  • Other chores such as maintenance

How Has This Been Tested?

Currently not tested.

  • Test A
  • Test B
  • Test C

Test Configuration:

There are two methods for setting up the model backend for testing:

  • Spin up an Online Endpoint with Llama2-7b-chat on Azure
    • pro: this is just a matter of running the ARM template via the shell script.
    • con: the infra is slow to spin up/down (15-20 mins) and expensive (~$150/day).
  • Spin up a local Online Endpoint with Llama2-7b-chat
    • pro: no cloud cost.
    • con: spin-up speed is not known for sure, but assume it is still slow. Some work is needed to figure out local Online Endpoints.
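
To put the cloud option's cost in perspective, here is a back-of-the-envelope sketch. It assumes the ~$150/day figure above is billed linearly per minute and that spin-up and spin-down time is billed too. Neither assumption is confirmed in this PR, and the function name `session_cost` is hypothetical.

```python
DAILY_COST_USD = 150.0  # approximate figure quoted in the PR description


def session_cost(test_minutes, spin_up_minutes=20.0, spin_down_minutes=20.0,
                 daily_cost=DAILY_COST_USD):
    """Estimate the cost of one test session on the cloud endpoint.

    Assumes linear per-minute billing, including spin-up and spin-down time.
    """
    billed_minutes = spin_up_minutes + test_minutes + spin_down_minutes
    return daily_cost * billed_minutes / (24 * 60)


# A 30-minute test run with worst-case 20-minute spin-up and spin-down.
print(f"${session_cost(30):.2f} per session")
```

Under these assumptions the hourly rate is about $6.25, so short on-demand sessions stay cheap as long as the endpoint is torn down afterwards. Leaving it running is what makes the daily figure add up.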

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings
  • The commit message follows the convention of this project

janaka linked an issue on Sep 5, 2023 that may be closed by this pull request

cwang (Contributor) commented Sep 6, 2023

Nice, a few comments:

  • If any of the files are exported instead of drafted from scratch, would be nice to state how they are created in Azure console and how they are exported.
  • Docs should probably go into the doc site instead of staying with the README file here.

janaka (Contributor, Author) commented Sep 10, 2023

> Nice, a few comments:
>
>   • If any of the files are exported instead of drafted from scratch, would be nice to state how they are created in Azure console and how they are exported.

Agreed, though I'm not expecting any exported files. There are a bunch of unneeded files that need to be cleaned up.

>   • Docs should probably go into the doc site instead of staying with the README file here.

Yep

Successfully merging this pull request may close these issues.

INFRA: Self-hosted Models