Azure OpenAI + LLM (Large language model)

updated: 07/05/2023

Azure OpenAI + LLM (Large language model)

This repository contains references to open-source models similar to ChatGPT, as well as Langchain and prompt engineering libraries. It also includes related samples and research on Langchain, Vector Search (including feasibility checks on Elasticsearch, Azure Cognitive Search, Azure Cosmos DB), and more.

Not being able to keep up with and test every recent update, sometimes I simply copied them into this repository for later review. some code might be outdated.

Rule: Brief each item on one or a few lines as much as possible.

What's the difference between Azure OpenAI and OpenAI?

OpenAI is a better option if you want to use the latest features like function calling, plug-ins, and access to the latest models.
Azure OpenAI is recommended if you require a reliable, secure, and compliant environment.
Azure OpenAI provides seamless integration with other Azure services..
Azure OpenAI offers private networking and role-based authentication, and responsible AI content filtering.
Azure OpenAI provides a Service Level Agreement (SLA) that guarantees a certain level of uptime and support for the service.
Azure OpenAI does not use user input as training data for other customers. Data, privacy, and security for Azure OpenAI

Section 1 : llama-index and Vector Storage (Database)
Section 2 : ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search
Section 3 : Microsoft Semantic Kernel with Azure Cosmos DB
- Semantic-Kernel
- Bing search Web UI and Semantic Kernel sample code
Section 4 : Langchain
- Langchain Cheetsheet
- Langchain Impressive features: cache, context-aware-splitting
- Langchain quick start: Sample code
- Langchain chain type: Summarizer
- langflow: langchain UI, Drag-and-Drop Workflow Experience
- Lanchain vs llama-index
Section 5: Prompt Engineering, Finetuning, and Langchain
- Prompt Engineering
- Azure OpenAI Prompt Guide
- OpenAI Prompt Guide
- DeepLearning.ai Prompt Engineering Course and others
- Awesome ChatGPT Prompts
- ChatGPT : “user”, “assistant”, and “system” messages.
- Finetuning : PEFT - LoRA - QLoRA
- Quantization : Quantization & Run ChatGPT on a Raspberry Pi / Android
- Sparsification
- Small size with Textbooks: High quality synthetic dataset
- Langchain vs Semantic Kernel
- guidance: A guidance language for controlling large language models.
Section 6: Improvement
- Introducing 100K Context Windows: Large Context Windows
- Math problem-solving skill: incl. Latex OCR
- Table Extraction: Extract Tables from PDFs
- OpenAI's plans according to Sam Altman Humanloop interview has been removed from the site. Instead of that, Web-archived link.
- Token counting & Token-limits: 5 Approaches To Solve LLM Token Limits
- Avoid AI hallucination Building Trustworthy, Safe and Secure LLM
- Gorilla: An API store for LLMs
- Memory Optimization: PagedAttention for 24x Faster LLM Inference
- Open AI Plugin and function calling
Section 7: Generative AI Landscape / List of OSS LLM
Section 8 : References
- picoGPT : tiny implementation of GPT-2.
- RLHF（Reinforcement Learning from Human Feedback): TRL, trlX, Argilla
- Langchain and Prompt engineering library
- AutoGPT / Communicative Agents
- Democratizing the magic of ChatGPT with open models
- Large Language and Vision Assistant
- MLLM (multimodal large language model)
- Application incl. UI/UX
- Edge and Chrome Extension & Plugin
- Awesome demo Prompt to Game - E2E game creation
- 日本語（Japanese Materials)
Section 9 : Relavant solutions and links
- Microsoft Fabric: Single unified data analytics solution
- DeepSpeed: Distributed training and memory optimization.
- Azure Machine Learning - Prompt flow: Low code
- Office Copilot: Semantic Interpreter, Natural Language Commanding via Program Synthesis
- microsoft/unilm: Microsoft Foundation models
Section 10 : AI Tools
- AI Tools
Acknowledgements
- Acknowledgements: -

Section 1 : llama-index and Vector Storage (Database)

This section has been created for testing and feasibility checks using elastic search as a vector database and integration with llama-index. llama-index is specialized in integration layers to external data sources.

Opensearch/Elasticsearch setup

docker : Opensearch Docker-compose
docker-elasticsearch : Not working for ES v8, requiring security plug-in with mandatory
docker-elk : Elasticsearch Docker-compose, Optimized Docker configurations with solving security plug-in issues.
es-open-search-set-analyzer.py : Put Language analyzer into Open search
es-open-search.py : Open search sample index creation
es-search-set-analyzer.py : Put Language analyzer into Elastic search
es-search.py : Usage of Elastic search python client
files : The Sample file for consuming

llama-index

index.json : Vector data local backup created by llama-index
index_vector_in_opensearch.json : Vector data stored in Open search (Source: files\all_h1.pdf)
llama-index-azure-elk-create.py: llama-index ElasticsearchVectorClient (Unofficial file to manipulate vector search, Created by me, Not Fully Tested)
llama-index-lang-chain.py : Lang chain memory and agent usage with llama-index
llama-index-opensearch-create.py : Vector index creation to Open search
llama-index-opensearch-query-chatgpt.py : Test module to access Azure Open AI Embedding API.
llama-index-opensearch-query.py : Vector index query with questions to Open search
llama-index-opensearch-read.py : llama-index ElasticsearchVectorClient (Unofficial file to manipulate vector search, Created by me, Not Fully Tested)

env.template : The properties. Change its name to .env once your values settings is done.

OPENAI_API_TYPE=azure
OPENAI_API_BASE=https://????.openai.azure.com/
OPENAI_API_VERSION=2022-12-01
OPENAI_API_KEY=<your value in azure>
OPENAI_DEPLOYMENT_NAME_A=<your value in azure>
OPENAI_DEPLOYMENT_NAME_B=<your value in azure>
OPENAI_DEPLOYMENT_NAME_C=<your value in azure>
OPENAI_DOCUMENT_MODEL_NAME=<your value in azure>
OPENAI_QUERY_MODEL_NAME=<your value in azure>

INDEX_NAME=gpt-index-demo
INDEX_TEXT_FIELD=content
INDEX_EMBEDDING_FIELD=embedding
ELASTIC_SEARCH_ID=elastic
ELASTIC_SEARCH_PASSWORD=elastic
OPEN_SEARCH_ID=admin
OPEN_SEARCH_PASSWORD=admin

llama-index example

llama-index-es-handson\callback-debug-handler.py: callback debug handler
llama-index-es-handson\chat-engine-flare-query.py: FLARE
llama-index-es-handson\chat-engine-react.py: ReAct
llama-index-es-handson\milvus-create-query.py: Milvus Vector storage

Vector Storage Comparison

Not All Vector Databases Are Made Equal
Printed version for "Medium" limits. - Link

Vector Storage Options for Azure

Pgvector extension on Azure Cosmos DB for PostgreSQL: Langchain Document URL
Vector Search in Azure Cosmos DB for MongoDB vCore
Vector search (private preview) - Azure Cognitive Search: Langchain Document URL
Azure Cache for Redis Enterprise: Enterprise Redis Vector Search Demo

Note: Azure Cache for Redis Enterprise: Enterprise Sku series are not able to deploy by a template such as Bicep and ARM.
azure-vector-db-python\vector-db-in-azure-native.ipynb: sample code for vector databases in azure

Milvus Embedded

pip install milvus
Docker compose: https://milvus.io/docs/install_offline-docker.md
Milvus Embedded through python console only works in Linux and Mac OS.

In Windows, Use this link, https://github.com/matrixji/milvus/releases.

# Step 1. Start Milvus

1. Unzip the package
Unzip the package, and you will find a milvus directory, which contains all the files required.

2. Start a MinIO service
Double-click the run_minio.bat file to start a MinIO service with default configurations. Data will be stored in the subdirectory s3data.

3. Start an etcd service
Double-click the run_etcd.bat file to start an etcd service with default configurations.

4. Start Milvus service
Double-click the run_milvus.bat file to start the Milvus service.

# Step 2. Run hello_milvus.py

After starting the Milvus service, you can test by running hello_milvus.py. See Hello Milvus for more information.

Conclusion

Azure Open AI Embedding API, text-embedding-ada-002, supports 1536 dimensions. Elastic search, Lucene based engine, supports 1024 dimensions as a max. Open search can insert 16,000 dimensions as a vector storage. Open search is available to use as a vector database with Azure Open AI Embedding API.
@citation: open ai documents: text-embedding-ada-002: Smaller embedding size. The new embeddings have only 1536 dimensions, one-eighth the size of davinci-001 embeddings, making the new embeddings more cost effective in working with vector databases. https://openai.com/blog/new-and-improved-embedding-model
@citation: open search documents: However, one exception to this is that the maximum dimension count for the Lucene engine is 1,024, compared with 16,000 for the other engines. https://opensearch.org/docs/latest/search-plugins/knn/approximate-knn/
@llama-index ElasticsearchReader class: The name of the class in llama-index is ElasticsearchReader. However, actually, it can only work with open search.

llama-index Deep dive

Section 2 : ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search

The files in this directory, extra_steps, have been created for managing extra configurations and steps for launching the demo repository.

https://github.com/Azure-Samples/azure-search-openai-demo : Python, ReactJs, Typescript

Configuration

(optional) Check Azure module installation in Powershell by running ms_internal_az_init.ps1 script
(optional) Set your Azure subscription Id to default

Start the following commands in ./azure-search-openai-demo directory

(deploy azure resources) Simply Run azd up

The azd stores relevant values in the .env file which is stored at ${project_folder}\.azure\az-search-openai-tg\.env.

AZURE_ENV_NAME=<your_value_in_azure>
AZURE_LOCATION=<your_value_in_azure>
AZURE_OPENAI_SERVICE=<your_value_in_azure>
AZURE_PRINCIPAL_ID=<your_value_in_azure>
AZURE_SEARCH_INDEX=<your_value_in_azure>
AZURE_SEARCH_SERVICE=<your_value_in_azure>
AZURE_STORAGE_ACCOUNT=<your_value_in_azure>
AZURE_STORAGE_CONTAINER=<your_value_in_azure>
AZURE_SUBSCRIPTION_ID=<your_value_in_azure>
BACKEND_URI=<your_value_in_azure>

Move to app by cd app command
(sample data loading) Move to scripts then Change into Powershell by Powershell command, Run prepdocs.ps1

console output (excerpt)

        Uploading blob for page 29 -> role_library-29.pdf
        Uploading blob for page 30 -> role_library-30.pdf
Indexing sections from 'role_library.pdf' into search index 'gptkbindex'
Splitting './data\role_library.pdf' into sections
        Indexed 60 sections, 60 succeeded

Move to app by cd .. and cd app command
(locally running) Run start.cmd

console output (excerpt)

Building frontend


> frontend@0.0.0 build \azure-search-openai-demo\app\frontend
> tsc && vite build

vite v4.1.1 building for production...
✓ 1250 modules transformed.
../backend/static/index.html                    0.49 kB
../backend/static/assets/github-fab00c2d.svg    0.96 kB
../backend/static/assets/index-184dcdbd.css     7.33 kB │ gzip:   2.17 kB
../backend/static/assets/index-41d57639.js    625.76 kB │ gzip: 204.86 kB │ map: 5,057.29 kB

Starting backend

* Serving Flask app 'app'
* Debug mode: off
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit
...

Running from second times

Move to app by cd .. and cd app command
(locally running) Run start.cmd

(optional)

fix_from_origin : The modified files, setup related
ms_internal_az_init.ps1 : Powershell script for Azure module installation
ms_internal_troubleshootingt.ps1 : Set Specific Subscription Id as default

Introducing Azure OpenAI Service On Your Data in Public Preview

Azure OpenAI Service On Your Data in Public Preview Link

Azure OpenAI samples

Azure OpenAI samples: Link
A simple ChatGPT Plugin: Link
The repository for all Azure OpenAI Samples complementing the OpenAI cookbook.: Link

Another Reference Architectures


Azure OpenAI Embeddings QnA	Azure Cosmos DB + OpenAI ChatGPT C# blazor and Azure Custom Template

C# Implementation ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search	Simple ChatGPT UI application Typescript, ReactJs and Flask

Azure Video Indexer demo Azure Video Indexer + OpenAI	-
-	-

Azure Open AI work with Cognitive Search act as a Long-term memory

Azure Cognitive Search : Vector Search

Azure Cognitive Search : Vector Search

Options: 1. Vector similarity search, 2. Pure Vector Search, 3. Hybrid Search, 4. Semantic Hybrid Search

azure-search-vector-sample\azure-search-vector-python-sample.ipynb: Azure Cognitive Search - Vector and Hybrid Search

Section 3 : Microsoft Semantic Kernel with Azure Cosmos DB

Microsoft Langchain Library supports C# and Python and offers several features, some of which are still in development and may be unclear on how to implement. However, it is simple, stable, and faster than Python-based open-source software. The features listed on the link include: Semantic Kernel Feature Matrix

This section includes how to utilize Azure Cosmos DB for vector storage and vector search by leveraging the Semantic-Kernel.

Semantic-Kernel

appsettings.template.json : Environment value configuration file.
ComoseDBVectorSearch.cs : Vector Search using Azure Cosmos DB
CosmosDBKernelBuild.cs : Kernel Build code (test)
CosmosDBVectorStore.cs : Embedding Text and store it to Azure Cosmos DB
LoadDocumentPage.cs : PDF splitter class. Split the text to unit of section. (C# version of azure-search-openai-demo/scripts/prepdocs.py)
LoadDocumentPageOutput : LoadDocumentPage class generated output
MemoryContextAndPlanner.cs : Test code of context and planner
MemoryConversationHistory.cs : Test code of conversation history
Program.cs : Run a demo. Program Entry point
SemanticFunction.cs : Test code of conversation history
semanticKernelCosmos.csproj : C# Project file
Settings.cs : Environment value class
SkillBingSearch.cs : Bing Search Skill
SkillDALLEImgGen.cs : DALLE Skill (Only OpenAI, Azure Open AI not supports yet)

Environment variable

{
  "Type": "azure",
  "Model": "<model_deployment_name>",
  "EndPoint": "https://<your-endpoint-value>.openai.azure.com/",
  "AOAIApiKey": "<your-key>",
  "OAIApiKey": "",
  "OrdId": "-", //The value needs only when using Open AI.
  "BingSearchAPIKey": "<your-key>",
  "aoaiDomainName": "<your-endpoint-value>",
  "CosmosConnectionString": "<cosmos-connection-string>"
}

Semantic Kernel has recently introduced support for Azure Cognitive Search as a memory. However, it currently only supports Azure Cognitive Search with a Semantic Search interface, lacking any features to store vectors to ACS.
According to the comments, this suggests that the strategy of the plan could be divided into two parts. One part focuses on Semantic Search, while the other involves generating embeddings using OpenAI.

Azure Cognitive Search automatically indexes your data semantically, so you don't need to worry about embedding generation. samples/dotnet/kernel-syntax-examples/Example14_SemanticMemory.cs.

// TODO: use vectors
// @Microsoft Semactic Kernel
var options = new SearchOptions
{
        QueryType = SearchQueryType.Semantic,
        SemanticConfigurationName = "default",
        QueryLanguage = "en-us",
        Size = limit,
};

SemanticKernel Implementation sample to overcome Token limits of Open AI model. Semantic Kernel でトークンの限界を超えるような長い文章を分割してスキルに渡して結果を結合したい (zenn.dev) Semantic Kernel でトークンの限界を超える

Bing search Web UI and Semantic Kernel sample code

Semantic Kernel sample code to integrate with Bing Search (ReAct??)

\ms-semactic-bing-notebook

gs_chatgpt.ipynb: Azure Open AI ChatGPT sample to use Bing Search
gs_davinci.ipynb: Azure Open AI Davinci sample to use Bing Search

Bing Search UI for demo

\bing-search-webui: (Utility, to see the search results from Bing Search API)

Section 4 : Langchain

Langchain Cheetsheet

Feature Matrix: LangChain Features
Cheetsheet: LangChain CheatSheet

Langchain Impressive Features

Langchain/cache: Reducing the number of API calls
Langchain/context-aware-splitting: Splits a file into chunks while keeping metadata

Langchain Quick Start: How to Use and Useful Utilities

Langchain_1_(믹스의_인공지능).ipynb : Langchain Get started
langchain_1_(믹스의_인공지능).py : -
Langchain_2_(믹스의_인공지능).ipynb : Langchain Utilities

langchain_2_(믹스의_인공지능).py : -

from langchain.chains.summarize import load_summarize_chain
chain = load_summarize_chain(chat, chain_type="map_reduce", verbose=True)
chain.run(docs[:3])

@citation: @practical-ai

Langchain chain type: Summarizer

stuff: Sends everything at once in LLM. If it's too long, an error will occur.
map_reduce: Summarizes by dividing and then summarizing the entire summary.
refine: (Summary + Next document) => Summary
map_rerank: Ranks by score and summarizes to important points.

langflow

langflow: LangFlow is a UI for LangChain, designed with react-flow.

Langchain vs llama-index

Basically llmaindex is a smart storage mechanism, while Langchain is a tool to bring multiple tools together. @citation
LangChain offers many features and focuses on using chains and agents to connect with external APIs. In contrast, LlamaIndex is more specialized and excels at indexing data and retrieving documents.

Section 5: Prompt Engineering, and Langchain vs Semantic Kernel

Prompt Engineering

Zero-shot
- Large Language Models are Zero-Shot Reasoners
Few-shot Learning
- Open AI: Language Models are Few-Shot Learners
Chain of Thought (CoT): ReAct and Self Consistency also inherit the CoT concept.
Recursively Criticizes and Improves (RCI)
ReAct: Grounding with external sources. (Reasoning and Act)
Chain-of-Thought Prompting
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Tree of Thought (github)
- tree-of-thought\forest_of_thought.py: Forest of thought Decorator sample
- tree-of-thought\tree_of_thought.py: Tree of thought Decorator sample
- tree-of-thought\react-prompt.py: ReAct sample without Langchain

Prompt Concept
1. Question-Answering
2. Roll-play: Act as a [ROLE] perform [TASK] in [FORMAT]
3. Reasoning
4. Prompt-Chain
5. Program Aided Language Model
6. Recursive Summarization: Long Text -> Chunks -> Summarize pieces -> Concatenate -> Summarize
🤩Prompt Engineering : ⭐⭐⭐⭐⭐
Prompt Engineering Guide: Copyright © 2023 DAIR.AI

Azure OpenAI Prompt Guide

Prompt engineering techniques

OpenAI Prompt Guide

DeepLearning.ai Prompt Engineering COURSE and others

Awesome ChatGPT Prompts

Awesome ChatGPT Prompts

ChatGPT : “user”, “assistant”, and “system” messages.

To be specific, the ChatGPT API allows for differentiation between “user”, “assistant”, and “system” messages.

always obey "system" messages.
all end user input in the “user” messages.
"assistant" messages as previous chat responses from the assistant.

Presumably, the model is trained to treat the user messages as human messages, system messages as some system level configuration, and assistant messages as previous chat responses from the assistant. (@https://blog.langchain.dev/using-chatgpt-api-to-evaluate-chatgpt/)

Finetuning

PEFT: Parameter-Efficient Fine-Tuning (Youtube)

PEFT
LoRA: Low-Rank Adaptation of Large Language Models
QLoRA: Efficient Finetuning of Quantized LLMs

artidoro/qlora
Training language models to follow instructions with human feedback
Fine-tuning a GPT — LoRA: Comprehensive guide for LoRA ⭐⭐⭐⭐ . Printed version for backup. Link

Sparsification

@citation: Binghchat

Sparsification is a technique used to reduce the size of large language models (LLMs) by removing redundant parameters without significantly affecting their performance. It is one of the methods used to compress LLMs. LLMs are neural networks that are trained on massive amounts of data and can generate human-like text. The term “sparsification” refers to the process of removing redundant parameters from these models.

Small size with Textbooks: High quality synthetic dataset

ph-1: Despite being small in size, phi-1 attained 50.6% on HumanEval and 55.5% on MBPP.
Orca: Orca learns from rich signals from GPT 4 including explanation traces; step-by-step thought processes; and other complex instructions, guided by teacher assistance from ChatGPT.

Large Transformer Model Inference Optimization

😮 Large Transformer Model Inference Optimization : ⭐⭐⭐⭐⭐

Langchain vs Semantic Kernel

Langchain	Semantic Kernel
Memory	Memory
Tookit	Skill
Tool	LLM prompts (semantic functions) or native C# or Python code (native function)
Agent	Planner
Chain	Steps, Pipeline
Tool	Connector

Semantic Kernel : Semantic Function

expressed in natural language in a text file "skprompt.txt" using SK's Prompt Template language. Each semantic function is defined by a unique prompt template file, developed using modern

Semantic Kernel : Prompt Template language Key takeaways

Variables : use the {{$variableName}} syntax : Hello {{$name}}, welcome to Semantic Kernel!
Function calls: use the {{namespace.functionName}} syntax : The weather today is {{weather.getForecast}}.
Function parameters: {{namespace.functionName $varName}} and {{namespace.functionName "value"}} syntax : The weather today in {{$city}} is {{weather.getForecast $city}}.
Prompts needing double curly braces : {{ "{{" }} and {{ "}}" }} are special SK sequences.
Values that include quotes, and escaping :

For instance:

... {{ 'no need to \"escape" ' }} ... is equivalent to:

... {{ 'no need to "escape" ' }} ...

Langchain Agent

If you're using a text LLM, first try zero-shot-react-description.
If you're using a Chat Model, try chat-zero-shot-react-description.
If you're using a Chat Model and want to use memory, try conversational-react-description.
self-ask-with-search: self ask with search paper
react-docstore: ReAct paper

Sementic Kernel Glossary

Glossary in Git

Glossary in MS Doc

Journey	Short Description
ASK	A user's goal is sent to SK as an ASK
Kernel	The kernel orchestrates a user's ASK
Planner	The planner breaks it down into steps based upon resources that are available
Resources	Planning involves leveraging available skills, memories, and connectors
Steps	A plan is a series of steps for the kernel to execute
Pipeline	Executing the steps results in fulfilling the user's ASK
GET	And the user gets what they asked for ...

Langchain vs Sementic Kernel vs Azure Machine Learning Prompt flow

What's the difference between LangChain and Semantic Kernel?

LangChain has many agents, tools, plugins etc. out of the box. More over, LangChain has 10x more popularity, so has about 10x more developer activity to improve it. On other hand, Semantic Kernel architecture and quality is better, that's quite promising for Semantic Kernel. Link
What's the difference between Azure Machine Laering PromptFlow and Semantic Kernel?
1. Low/No Code vs C#, Python, Java
2. Focused on Prompt orchestrating vs Integrate LLM into their existing app.

guidance

guidance: Simple, intuitive syntax, based on Handlebars templating. Domain Specific Language (DSL) for handling model interaction.

Section 6 : Improvement

Introducing 100K Context Windows

Introducing 100K Context Windows: hundreds of pages, Around 75,000 words; demo

Math problem-solving skill

Plugin: Wolfram alpha
Improving mathematical reasoning with process supervision
Math formula OCR: MathPix, OSS LaTeX-OCR

Table Extraction

Azure Form Recognizer: documentation
Table to Markdown format: Table to Markdown

OpenAI's plans according to Sam Altman

Archived Link : Printed version for backup Link

Token counting & Token-limits

Open AI Tokenizer: GPT-3, Codex Token counting
tiktoken: BPE tokeniser for use with OpenAI's models. Token counting.
What are tokens and how to count them?
5 Approaches To Solve LLM Token Limits : Printed version for backup Link

Avoid AI hallucination

NeMo Guardrails: Building Trustworthy, Safe and Secure LLM Conversational Systems

Gorilla: An API store for LLMs

Gorilla: An API store for LLMs: Gorilla: Large Language Model Connected with Massive APIs
1. Used GPT-4 to generate a dataset of instruction-api pairs for fine-tuning Gorilla.
2. Used the abstract syntax tree (AST) of the generated code to match with APIs in the database and test set for evaluation purposes.
3. @citation Link
Another user asked how Gorilla compared to LangChain; Patil replied: Langchain is a terrific project that tries to teach agents how to use tools using prompting. Our take on this is that prompting is not scalable if you want to pick between 1000s of APIs. So Gorilla is a LLM that can pick and write the semantically and syntactically correct API for you to call! A drop in replacement into Langchain!
Meta: Toolformer: Language Models That Can Use Tools, by MetaAI

Memory Optimization

PagedAttention : vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention, 24x Faster LLM Inference Link

Open AI Plugin and function calling

ChatGPT Plugin
ChatGPT Function calling

Under the hood, functions are injected into the system message in a syntax the model has been trained on. This means functions count against the model's context limit and are billed as input tokens. If running into context limits, we suggest limiting the number of functions or the length of documentation you provide for function parameters.

Section 7 : Generative AI Landscape / List of OSS LLM

Generative AI Revolution: Exploring the Current Landscape

The Generative AI Revolution: Exploring the Current Landscape : Printed version for backup Link ⭐⭐⭐⭐

List of OSS LLM

List of OSS LLM
Printed version for "Medium" limits. Link

Huggingface Open LLM Learboard

Huggingface Open LLM Learboard

Hugging face Transformer

huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. (github.com)

Hugging face StarCoder

Section 8 : References

picoGPT

An unnecessarily tiny implementation of GPT-2 in NumPy. picoGPT: Transformer Decoder

RLHF（Reinforcement Learning from Human Feedback)

Machine learning technique that trains a "reward model" directly from human feedback and uses the model as a reward function to optimize an agent's policy using reinforcement learning
Libraries: TRL, trlX, Argilla

Langchain and Prompt engineering library

AutoGPT / Communicative Agents

Auto-GPT: Most popular
babyagi: Most simplest implementation - Coworking of 4 agents
microsoft/JARVIS
SuperAGI: GUI for agent settings
lightaime/camel: 🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society (github.com)
1:1 Conversation between two ai agents Camel Agents - a Hugging Face Space by camel-ai Hugging Face (camel-agents)

Democratizing the magic of ChatGPT with open models

The LLMs mentioned here are just small parts of the current advancements in the field. Most OSS LLM models have been built on the facebookresearch/llama. For a comprehensive list and the latest updates, please refer to the "Generative AI Landscape / List of OSS LLM" section.
facebookresearch/llama: Not licensed for commercial use
Falcon LLM Apache 2.0 license
LLM
- StableVicuna First Open Source RLHF LLM Chatbot
- Alpaca: Fine-tuned from the LLaMA 7B model
- gpt4all: Run locally on your CPU
- vicuna: 90% ChatGPT Quality
- Koala: Focus on dialogue data gathered from the web.
- dolly: Databricks
- Cerebras-GPT: 7 GPT models ranging from 111m to 13b parameters.
- GPT4All Download URL
- KoAlpaca: Alpaca for korean

Large Language and Vision Assistant

LLaVa: Large Language-and-Vision Assistant
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
TaskMatrix, aka VisualChatGPT: Microsoft TaskMatrix; GroundingDINO + SAM

MLLM (multimodal large language model)

Facebook: ImageBind / SAM
1. facebookresearch/ImageBind: ImageBind One Embedding Space to Bind Them All (github.com)
2. facebookresearch/segment-anything(SAM): The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model. (github.com)
Microsoft: Kosmos-1 / Kosmos-2
1. Language Is Not All You Need: Aligning Perception with Language Models 2302.14045
2. Kosmos-2: Grounding Multimodal Large Language Models to the World
TaskMatrix.AI
1. TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

Application incl. UI/UX

Gradio: Build Machine Learning Web Apps - in Python
Text generation web UI: Text generation web UI
Very Simple Langchain example using Open AI: langchain-ask-pdf
An open source implementation of OpenAI's ChatGPT Code interpreter: gpt-code-ui
Open AI Chat Mockup: An open source ChatGPT UI. mckaywrigley/chatbot-ui
Streaming with Azure OpenAI SSE
BIG-AGI FKA nextjs-chatgpt-app
Embedding does not use Open AI. Can be executed locally: pdfGPT
Tiktoken Alternative in C#: microsoft/Tokenizer: .NET and Typescript implementation of BPE tokenizer for OpenAI LLMs. (github.com)
Azure OpenAI Proxy: OpenAI API requests converting into Azure OpenAI API requests

Edge and Chrome Extension & Plugin

BetterChatGPT
ChatHub All-in-one chatbot client Webpage
ChatGPT Retrieval Plugin

Awesome demo

FRVR Official Teaser: Prompt to Game: AI-powered end-to-end game creation

日本語（Japanese Materials）

rinna: rinnaの36億パラメータの日本語GPT言語モデル: 3.6 billion parameter Japanese GPT language model
法律:生成AIの利用ガイドライン: Legal: Guidelines for the Use of Generative AI
New Era of Computing - ChatGPT がもたらした新時代
大規模言語モデルで変わるMLシステム開発: ML system development that changes with large-scale language models
GPT-4登場以降に出てきたChatGPT/LLMに関する論文や技術の振り返り: Review of ChatGPT/LLM papers and technologies that have emerged since the advent of GPT-4
LLMを制御するには何をするべきか？: How to control LLM
生成AIのマルチモーダルモデルでできること -タスク紹介編-: What can be done with multimodal models of generative AI
Azure OpenAIを活用したアプリケーション実装のリファレンス: 日本マイクロソフトリファレンスアーキテクチャ

Section 9 : Relavant solutions and links

Microsoft Fabric: Fabric integrates technologies like Azure Data Factory, Azure Synapse Analytics, and Power BI into a single unified product
DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Microsoft Office Copilot: Natural Language Commanding via Program Synthesis: Semantic Interpreter, a natural language-friendly AI system for productivity software such as Microsoft Office that leverages large language models (LLMs) to execute user intent across application features.
Azure Machine Learning - Prompt flow: Visual Designer for Prompt crafting. Use Jinja as a prompt template language.
Microsoft AI Models: Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities. https://aka.ms/nlpagi
Comparing Adobe Firefly, Dalle-2, OpenJourney, Stable Diffusion, and Midjourney: Generative AI for images
Prompt Engine: Craft prompts for Large Language Models: npm install prompt-engine
activeloopai/deeplake: AI Vector Database for LLMs/LangChain. Doubles as a Data Lake for Deep Learning. Store, query, version, & visualize any data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai (github.com)
mosaicml/llm-foundry: LLM training code for MosaicML foundation models (github.com)
Must read: the 100 most cited AI papers in 2022
The Best Machine Learning Resources
OpenAI Cookbook Examples and guides for using the OpenAI API
gpt4free for educational purposes only
Generate 3D objects conditioned on text or images openai/shap-e
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold (paper) online demo
string2string: The library is an open-source tool that offers a comprehensive suite of efficient algorithms for a broad range of string-to-string problems. string2string
LLM evolutionary tree: @citation: LLMsPracticalGuide

Section 10 : AI Tools

@citation: The best AI Chatbots in 2023.: twitter.com/slow_developer

The leader: http://openai.com
The runner-up: http://bard.google.com
Open source: http://huggingface.co/chat
Searching web: http://perplexity.ai
Content writing: http://jasper.ai/chat
Sales and Marketing: http://chatspot.ai
AI Messenger: http://personal.ai
Tinkering: http://poe.com
Fun: http://beta.character.ai
Coding Auto-complete: http://github.com/features/copilot

+
Newsletters & Tool Databas: https://www.therundown.ai/

Acknowledgements

@TODO

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
azure-search-openai-demo		azure-search-openai-demo
azure-search-vector-sample		azure-search-vector-sample
azure-vector-db-python		azure-vector-db-python
babyagi		babyagi
bing-search-webui		bing-search-webui
docker-es-vector		docker-es-vector
files		files
infra		infra
langchain-@practical-ai		langchain-@practical-ai
llama-index-es-handson		llama-index-es-handson
ms-semactic-bing-notebook		ms-semactic-bing-notebook
ms-semantic-kernel-cosmos		ms-semantic-kernel-cosmos
picoGPT		picoGPT
quantization		quantization
tree-of-thought		tree-of-thought
.env.template		.env.template
.gitignore		.gitignore
README.md		README.md
README_Fabric.md		README_Fabric.md
README_SBCs.md		README_SBCs.md

sujitrulz/azure-openai-elastic-vector-langchain

Folders and files

Latest commit

History

Repository files navigation

Azure OpenAI + LLM (Large language model)

What's the difference between Azure OpenAI and OpenAI?

Table of contents

Section 1 : llama-index and Vector Storage (Database)

Opensearch/Elasticsearch setup

llama-index

llama-index example

Vector Storage Comparison

Vector Storage Options for Azure

Milvus Embedded

Conclusion

llama-index Deep dive

Section 2 : ChatGPT + Enterprise data with Azure OpenAI and Cognitive Search

Configuration

Introducing Azure OpenAI Service On Your Data in Public Preview

Azure OpenAI samples

Another Reference Architectures

Azure Cognitive Search : Vector Search

Section 3 : Microsoft Semantic Kernel with Azure Cosmos DB

Semantic-Kernel

Environment variable

Bing search Web UI and Semantic Kernel sample code

Section 4 : Langchain

Langchain Cheetsheet

Langchain Impressive Features

Langchain Quick Start: How to Use and Useful Utilities

Langchain chain type: Summarizer

langflow

Langchain vs llama-index

Section 5: Prompt Engineering, and Langchain vs Semantic Kernel

Prompt Engineering

Azure OpenAI Prompt Guide

OpenAI Prompt Guide

DeepLearning.ai Prompt Engineering COURSE and others

Awesome ChatGPT Prompts

ChatGPT : “user”, “assistant”, and “system” messages.

Finetuning

Sparsification

Small size with Textbooks: High quality synthetic dataset

Large Transformer Model Inference Optimization

Langchain vs Semantic Kernel

Semantic Kernel : Semantic Function

Semantic Kernel : Prompt Template language Key takeaways

Langchain Agent

Sementic Kernel Glossary

Langchain vs Sementic Kernel vs Azure Machine Learning Prompt flow

guidance

Section 6 : Improvement

Introducing 100K Context Windows

Math problem-solving skill

Table Extraction

OpenAI's plans according to Sam Altman

Token counting & Token-limits

Avoid AI hallucination

Gorilla: An API store for LLMs

Memory Optimization

Open AI Plugin and function calling

Section 7 : Generative AI Landscape / List of OSS LLM

Generative AI Revolution: Exploring the Current Landscape

List of OSS LLM

Huggingface Open LLM Learboard

Hugging face Transformer

Hugging face StarCoder

Section 8 : References

picoGPT

RLHF（Reinforcement Learning from Human Feedback)

Langchain and Prompt engineering library

AutoGPT / Communicative Agents

Democratizing the magic of ChatGPT with open models

Large Language and Vision Assistant

MLLM (multimodal large language model)

Application incl. UI/UX

Edge and Chrome Extension & Plugin

Awesome demo

日本語（Japanese Materials）