![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 1. Introduction

This lesson covers integrating several LLMs with LangChain. We will examine the platforms that serve these models and compare their features. LangChain has built-in support for some of the most popular publicly available pre-trained models. Previous lessons already discussed several options, including ChatGPT, GPT-4, GPT-3, and GPT4All.

LangChain provides close to 30 integrations with well-known AI platforms such as OpenAI, Cohere, Writer, and Replicate, to name a few. Most notably, it provides access to the Hugging Face Hub API, with more than 120K available models that can be easily incorporated into your applications. These organizations offer different ways to access their services.

It is common practice to pay for API access. Prices are usually determined by factors such as the number of processed tokens, as with OpenAI, or the duration of the process measured in hours of GPU usage, as with Hugging Face Inference Endpoints or Amazon SageMaker. These options are generally easy and fast to set up. However, it is worth noting that you do not own the models, even if they were fine-tuned on your valuable datasets. The providers only give you access to the API on a pay-as-you-go plan.
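To make the two billing styles concrete, here is a rough sketch that compares them. The token price of $0.002 per 1K tokens is the GPT-3.5-turbo figure quoted later in this lesson; the workload sizes and `HYPOTHETICAL_GPU_RATE` are assumptions chosen only for illustration, not real provider prices.

```python
def token_based_cost(total_tokens: int, price_per_1k_tokens: float) -> float:
    """Cost when the provider bills per processed token."""
    return total_tokens / 1000 * price_per_1k_tokens


def time_based_cost(gpu_hours: float, price_per_gpu_hour: float) -> float:
    """Cost when the provider bills per hour of GPU usage."""
    return gpu_hours * price_per_gpu_hour


# Assumed workload: 2 million tokens processed in a month.
monthly_tokens = 2_000_000
per_token = token_based_cost(monthly_tokens, price_per_1k_tokens=0.002)

# Placeholder hourly rate (an assumption, not a quoted price) for an
# endpoint that stays up 24 hours a day for 30 days.
HYPOTHETICAL_GPU_RATE = 1.00
per_hour = time_based_cost(gpu_hours=24 * 30, price_per_gpu_hour=HYPOTHETICAL_GPU_RATE)

print(f"Token-based: ${per_token:.2f}")
print(f"Time-based:  ${per_hour:.2f}")
```

Token-based billing scales with traffic, while time-based billing scales with how long the endpoint stays up, so the cheaper option depends on how steadily the model is used.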

On the other side of the spectrum, you can host the models locally on your own servers. This gives you full and exclusive control over the network and your dataset. Be aware of the hardware costs (a high-end GPU for low latency) and the maintenance costs (the expertise to deploy and fine-tune models) associated with this approach. Additionally, some publicly available models, such as LLaMA, are not licensed for commercial use.

The right approach is different for each use case and depends on details like budget, model capability, expertise, and trade secrets. It is straightforward to create a custom fine-tuned model by feeding your data to OpenAI's API. On the other hand, you might consider doing fine-tuning in-house if the dataset is part of your intellectual property and cannot be shared.

The characteristics of the different models are another consideration. A model's size and the quality of its training data directly affect its language understanding capability. However, a larger model is not always the best answer. GPT-3's Ada variant is the smallest model in the collection, making it the fastest and most cost-effective option. It is best suited to simpler tasks like parsing text or classification. Conversely, the latest GPT-4 version is the largest model and generates high-quality results across tasks, but its large number of parameters makes it the slowest and most expensive option. It is therefore also necessary to select a model based on its capabilities. It might be cheaper to use Ada for a conversational application, but that is not what the model was designed for, and the responses will be disappointing.
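The trade-off above can be sketched as a small routing table that sends simple tasks to a small model and demanding tasks to a large one. The task labels, the `pick_model` helper, and the default are hypothetical; `text-ada-001` is assumed here as the API name of the Ada variant.

```python
# Task -> model name. The task labels and the mapping are illustrative
# assumptions, not an official recommendation.
MODEL_BY_TASK = {
    "classification": "text-ada-001",  # small, fast, cheap
    "parsing": "text-ada-001",
    "conversation": "gpt-4",           # large, slower, more expensive
}


def pick_model(task: str, default: str = "gpt-3.5-turbo") -> str:
    """Return the model assumed to fit the task, or a mid-range default."""
    return MODEL_BY_TASK.get(task, default)


print(pick_model("classification"))
print(pick_model("conversation"))
```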

**Note**

You can read [this article](https://levelup.gitconnected.com/how-to-benchmark-language-models-by-openai-deepmind-google-microsoft-783d4307ec50) for a comparison between a number of well-known LLMs.

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

# 2. Popular LLM models accessible to LangChain via API

## 1. Cohere Command

The Cohere service provides a variety of models, such as Command (`command`) for dialogue-like interactions, Generation (`base`) for generative tasks, Summarize (`summarize-xlarge`) for generating summaries, and more. You can get free, rate-limited usage for learning and prototyping, which means usage is free until you go into production. Once you do, some of the models may be somewhat more expensive than the OpenAI APIs, for example, $2.5 for generating 1K tokens. However, because Cohere offers models tailored to each task, a use-case-specific model could deliver better results in downstream tasks. LangChain's `Cohere` class makes it easy to access these models: `Cohere(model="<MODEL_NAME>", cohere_api_key="<API_KEY>")`.
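A minimal sketch of that constructor call, assuming the API key is stored in a `COHERE_API_KEY` environment variable rather than hard-coded. The helper names `cohere_settings` and `make_cohere_llm` are hypothetical, and the import path is the one used by older LangChain releases (newer releases moved it to `langchain_community.llms`).

```python
import os


def cohere_settings(model: str = "command") -> dict:
    """Collect the constructor arguments, reading the key from the environment."""
    api_key = os.environ.get("COHERE_API_KEY")
    if not api_key:
        raise RuntimeError("Set the COHERE_API_KEY environment variable first.")
    return {"model": model, "cohere_api_key": api_key}


def make_cohere_llm(model: str = "command"):
    """Build the LLM wrapper; needs the `langchain` and `cohere` packages."""
    # Imported lazily so the settings helper works without LangChain installed.
    # Assumed import path (older LangChain releases).
    from langchain.llms import Cohere

    return Cohere(**cohere_settings(model))
```

Keeping the key out of the source code also makes it easier to switch between a free trial key and a production key.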

**Note**

You might see deprecated model names (like `command-xlarge-20221108`) in the LangChain documentation. Please refer to the [Cohere documentation](https://docs.cohere.com/docs/models) for the latest naming convention.

## 2. GPT-3.5

GPT-3.5 is a language model developed by OpenAI. Its turbo version (recommended by OpenAI over [other variants](https://platform.openai.com/docs/models)) offers a more affordable option for generating human-like text through an API accessible via OpenAI endpoints. The model is optimized for chat applications while remaining powerful on other generative tasks, and it can process 96 languages. GPT-3.5-turbo has a context length of up to 16K tokens and is the most cost-effective option in the OpenAI collection at only $0.002 per 1,000 tokens. You can access this model's API by passing the `gpt-3.5-turbo` key when initializing either the `ChatOpenAI` or `OpenAI` class.

## 3. GPT-4

GPT-4 is a competent multimodal model developed by OpenAI. It is the latest and most powerful model published by OpenAI, and its multimodality enables it to process both text and images as input. It represents a significant advancement in natural language processing, providing enhanced capabilities in generating human-like text, understanding context, and handling complex language tasks. GPT-4 can process and generate text in multiple languages, making it versatile for global applications. The model supports a context length of up to 32K tokens, allowing it to manage longer conversations and documents effectively. While more advanced and capable than its predecessors, GPT-4 remains cost-effective for many applications. You can access the GPT-4 API by passing the `gpt-4` key when initializing either the `ChatOpenAI` or `OpenAI` class, which makes it easy to integrate into LangChain for developing robust chat applications, content generation tools, and more.
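As a sketch of how the two OpenAI model keys described above might be passed to `ChatOpenAI`, the hypothetical helper below validates the name before building the model. It assumes the import path of older LangChain releases and that the `OPENAI_API_KEY` environment variable is set when the model is created.

```python
# The two model keys discussed in this section.
SUPPORTED_CHAT_MODELS = ("gpt-3.5-turbo", "gpt-4")


def make_chat_model(model_name: str = "gpt-3.5-turbo", temperature: float = 0.0):
    """Return a ChatOpenAI instance for one of the supported model keys."""
    if model_name not in SUPPORTED_CHAT_MODELS:
        raise ValueError(
            f"Unsupported model {model_name!r}; choose from {SUPPORTED_CHAT_MODELS}"
        )
    # Imported lazily; needs the `langchain` and `openai` packages.
    # Assumed import path (older LangChain releases).
    from langchain.chat_models import ChatOpenAI

    return ChatOpenAI(model_name=model_name, temperature=temperature)
```

Switching an application from GPT-3.5 to GPT-4 then only requires changing the `model_name` argument.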

## 4. Jurassic-2

AI21's Jurassic-2 is a language model that comes in three sizes with different price points: Jumbo, Grande, and Large. The model sizes are not publicly available, but the documentation marks the Jumbo version as the most powerful model. AI21 describes the models as general-purpose, with excellent capability on every generative task. The J2 model understands seven languages and can be fine-tuned on custom datasets. You can get an API key from the AI21 platform and use the `AI21()` class to access these models.

## 5. StableLM

StableLM Alpha is a language model developed by Stability AI. It can be accessed via the Hugging Face Hub (with the id `stabilityai/stablelm-tuned-alpha-3b`) to host locally, or through the Replicate API at a rate of $0.0002 to $0.0023 per second. So far, it comes in two sizes: 3 billion and 7 billion parameters. The weights for StableLM Alpha are available under the CC BY-SA 4.0 license, which permits commercial use. The context length of StableLM is 4096 tokens.
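Per-second billing can be estimated with a rough sketch like the one below. The two rates are the ones quoted above; the request count and the five-second runtime per request are assumptions chosen only for illustration.

```python
# Per-second rates quoted above for the Replicate API.
LOW_RATE_PER_SECOND = 0.0002
HIGH_RATE_PER_SECOND = 0.0023


def replicate_cost_range(runtime_seconds: float):
    """Return the (lowest, highest) cost for a given total runtime in seconds."""
    return (
        runtime_seconds * LOW_RATE_PER_SECOND,
        runtime_seconds * HIGH_RATE_PER_SECOND,
    )


# Assumed workload: 1,000 requests that each run for about 5 seconds.
low, high = replicate_cost_range(1000 * 5)
print(f"Estimated cost: ${low:.2f} to ${high:.2f}")
```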

## 6. Dolly-v2-12B

Dolly-v2-12B is a language model created by Databricks. It can be accessed via the Hugging Face Hub (with the id `databricks/dolly-v2-12b`) to host locally, or through the Replicate API at the same price range mentioned in the previous subsection. It has 12 billion parameters and is available under an open-source license for commercial use. The base model used for Dolly-v2-12B is Pythia-12B.

## 7. GPT4All

GPT4All is a language model developed by Nomic AI and based on Meta's LLaMA model with 7B parameters. It can be accessed through GPT4All and Hugging Face Local Pipelines. Although the model is published under the GPL 3.0 open-source license, it is not free to use in commercial applications. It is available to researchers for their projects and experiments.
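The providers covered in this section can be summarized as a small lookup from a provider label to its LangChain wrapper class. This is only a sketch: the provider labels and the `load_llm_class` helper are hypothetical, and the module paths assume an older LangChain release (newer releases moved most of these classes to `langchain_community`).

```python
import importlib

# Provider label -> (module path, class name). The labels are made up for
# this example; the module paths are assumed from older LangChain releases.
LLM_REGISTRY = {
    "cohere": ("langchain.llms", "Cohere"),
    "openai": ("langchain.llms", "OpenAI"),
    "openai-chat": ("langchain.chat_models", "ChatOpenAI"),
    "ai21": ("langchain.llms", "AI21"),
    "huggingface-hub": ("langchain.llms", "HuggingFaceHub"),
    "huggingface-local": ("langchain.llms", "HuggingFacePipeline"),
    "replicate": ("langchain.llms", "Replicate"),
    "gpt4all": ("langchain.llms", "GPT4All"),
}


def load_llm_class(provider: str):
    """Look up and import the wrapper class for a provider label."""
    if provider not in LLM_REGISTRY:
        raise ValueError(
            f"Unknown provider {provider!r}; choose from {sorted(LLM_REGISTRY)}"
        )
    module_path, class_name = LLM_REGISTRY[provider]
    # The import happens only here, so LangChain must be installed at call time.
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

Each class takes its own constructor arguments (API keys, model ids, or local file paths), so the lookup returns the class itself rather than an instance.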

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)

    # Deep Learning as subset of ML

    from IPython import display
    display.Image("data/images/DL_01_Intro-01-DL-subset-of-ML.jpg")

![rainbow](https://github.com/ancilcleetus/My-Learning-Journey/assets/25684256/839c3524-2a1d-4779-85a0-83c562e1e5e5)