# MLX LangChain Chatbot
In this notebook, we will build a LangChain-based chatbot end to end, using MLX to host the LLM. The notebook is largely an amalgamation of concepts covered in other sources, so we will gloss over some details here while still linking to those resources for anyone who wants to dive deeper.

The headings of the markdown cells below largely correspond to the headings of the accompanying blog post on Medium. As such, little explanation is provided here beyond the code itself.

## Notebook Setup
Throughout this notebook, we will mostly use MLX and LangChain, along with standard Python functionality. To use these libraries, you will need to install them first. For example, if you use `pip`, run the following command:

`pip install mlx mlx-lm langchain langchain-community langchain-core`
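
If you want to confirm the install worked before running the rest of the notebook, a quick check along these lines can help. This is only a sketch: the `REQUIRED` mapping and the `missing_packages` helper are names I made up for illustration, and the import names are my assumption of how each `pip` package is imported.

```python
from importlib.util import find_spec

# Assumed mapping of pip package name -> Python import name
REQUIRED = {
    'mlx': 'mlx',
    'mlx-lm': 'mlx_lm',
    'langchain': 'langchain',
    'langchain-community': 'langchain_community',
    'langchain-core': 'langchain_core',
}

def missing_packages(required=REQUIRED):
    """Return the pip names whose import name cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]

# An empty list means every package above is importable
print(missing_packages())
```

Anything printed in the list still needs to be installed.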

You are more than welcome to browse through the code in this section, particularly since we will use all of it, but note that it is not strictly "mandatory" reading. By that, I mean these pieces of code are helper utilities more than anything else, so I do not touch on them in the blog post.

In [1]:
# Importing the necessary Python libraries
from mlx_lm import load as mlx_load
from mlx_lm import generate as mlx_generate
from langchain_community.llms.mlx_pipeline import MLXPipeline
from langchain_community.chat_models.mlx import ChatMLX


In [5]:
# Setting constant values to represent model name and directory
MODEL_NAME = 'mistralai/Mistral-7B-Instruct-v0.2'
BASE_DIRECTORY = '../models'
MLX_DIRECTORY = f'{BASE_DIRECTORY}/mlx'
mlx_model_directory = f'{MLX_DIRECTORY}/{MODEL_NAME}'
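
The paths above assume the MLX-format model files already sit under `../models/mlx`. A small guard like the one below can catch a mistyped path early, before it surfaces as a less obvious error from the loader. The `require_model_dir` helper is hypothetical and not part of MLX or LangChain.

```python
from pathlib import Path

def require_model_dir(path):
    """Return `path` as a Path, raising if the directory does not exist."""
    model_dir = Path(path)
    if not model_dir.is_dir():
        raise FileNotFoundError(f'No model directory at {model_dir}')
    return model_dir

# Example with the variable defined in the cell above:
# mlx_model_directory = str(require_model_dir(mlx_model_directory))
```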

In [6]:
# Setting a default system prompt
DEFAULT_SYSTEM_PROMPT = 'You are a helpful assistant.'

class MLXModelParameters():

    def __init__(self, temp = 0.7, max_tokens = 1000, system_prompt = DEFAULT_SYSTEM_PROMPT):
        self.temp = temp
        self.max_tokens = max_tokens
        self.system_prompt = system_prompt

    def __str__(self):
        return f'Temperature: {self.temp}\nMax Tokens: {self.max_tokens}\nSystem Prompt: {self.system_prompt}'
    
    def __repr__(self):
        return f'Temperature: {self.temp}\nMax Tokens: {self.max_tokens}\nSystem Prompt: {self.system_prompt}'

    def update_temp(self, new_temp):
        self.temp = new_temp

    def update_max_tokens(self, new_max_tokens):
        self.max_tokens = new_max_tokens

    def update_system_prompt(self, new_system_prompt):
        self.system_prompt = new_system_prompt

    def to_json(self):
        return { 'temp': self.temp, 'max_tokens': self.max_tokens }
    
mlx_model_parameters = MLXModelParameters()
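
Note that `to_json` returns only `temp` and `max_tokens`, which are the two values later passed to the pipeline as generation settings. For comparison, here is roughly the same container written with `dataclasses`. This is a sketch of an alternative, not the class used in the rest of the notebook, and `ModelParametersSketch` and `generation_kwargs` are illustrative names.

```python
from dataclasses import dataclass

@dataclass
class ModelParametersSketch:
    # Defaults mirror the class above
    temp: float = 0.7
    max_tokens: int = 1000
    system_prompt: str = 'You are a helpful assistant.'

    def generation_kwargs(self):
        """Return only the settings that affect text generation."""
        return {'temp': self.temp, 'max_tokens': self.max_tokens}

params = ModelParametersSketch()
params.temp = 0.2  # plain attribute assignment replaces the update_* methods
print(params.generation_kwargs())
```

The dataclass generates `__init__` and `__repr__` automatically, which avoids having to keep two hand-written format strings in sync.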

## A Quick Intro to MLX

In [4]:
# Loading the MLX quantized Mistral 7B model from file
mlx_model, mlx_tokenizer = mlx_load(mlx_model_directory)

# Producing the response (completion) with the MLX model
response = mlx_generate(
    model = mlx_model,
    tokenizer = mlx_tokenizer,
    prompt = 'Write me a haiku about Jupyter notebooks.',
    max_tokens = 1000,
    verbose = True
)

# Printing the response
print(response)

Prompt: Write me a haiku about Jupyter notebooks.


In the flow of code,
Notebook pages gently turn,
Data's story unfolds.
Prompt: 24.225 tokens-per-sec
Generation: 34.790 tokens-per-sec


In the flow of code,
Notebook pages gently turn,
Data's story unfolds.


In [7]:
# Setting up the LangChain MLX LLM
llm = MLXPipeline.from_model_id(
    model_id = mlx_model_directory,
    pipeline_kwargs = {
        'temp': mlx_model_parameters.temp,
        'max_tokens': mlx_model_parameters.max_tokens,
    }
)

# Setting up the LangChain MLX Chat Model with the LLM above
chat_model = ChatMLX(llm = llm)

# Getting the response from the LangChain MLX Chat Model
response = chat_model.invoke({'input': 'Write me a haiku about Jupyter notebooks.'})

print(response.content)

ValueError: Invalid input type <class 'dict'>. Must be a PromptValue, str, or list of BaseMessages.
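
The `ValueError` above spells out the problem: `invoke` on a chat model accepts a `PromptValue`, a `str`, or a list of `BaseMessage` objects, not a raw `dict`. The dictionary form is what you would pass to a chain that starts with a prompt template. The simplest fix is to pass the prompt as a plain string. Below is a minimal sketch of that; the `ask` helper is a hypothetical wrapper of my own, and it assumes only that the model object exposes `invoke` and that the reply has a `content` attribute, as in the cell above.

```python
def ask(chat_model, prompt):
    """Send a plain-string prompt to a chat model and return the reply text."""
    if not isinstance(prompt, str):
        # Fail early with a clearer message than the one raised inside invoke
        raise TypeError(f'Expected a str prompt, got {type(prompt).__name__}')
    return chat_model.invoke(prompt).content

# With the chat model defined above:
# print(ask(chat_model, 'Write me a haiku about Jupyter notebooks.'))
```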