https://python.langchain.com/docs/tutorials/llm_chain/

In [1]:
import os

from dotenv import load_dotenv
from IPython.display import Markdown
from langchain_mistralai import ChatMistralAI

In [2]:
# Load environment variables from .env file
load_dotenv()

# Retrieve API key from the environment
api_key = os.getenv("MISTRAL_API_KEY")

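if not api_key:
    # Fail fast: every later cell needs `model`, so stop here if the key is missing
    raise RuntimeError("MISTRAL_API_KEY not found in environment or .env file.")

# `model` selects the Mistral model (`name` would only label the runnable);
# the API key is read from the MISTRAL_API_KEY environment variable
model = ChatMistralAI(model="open-mixtral-8x22b")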

In [3]:
from langchain_core.messages import HumanMessage, SystemMessage

In [4]:
messages = [
    SystemMessage(content="Answer briefly!"),
    HumanMessage(content="What is indexing in a database?"),
]

result = model.invoke(messages)

In [5]:
from langchain_core.output_parsers import StrOutputParser

In [6]:
parser = StrOutputParser()

In [7]:
Markdown(parser.invoke(result))

Indexing in a database is a technique used to optimize the performance of a database by reducing the time it takes to retrieve data. It is similar to the index found at the back of a book, which allows you to quickly locate specific information without having to scan every page.

In a database, an index is a data structure that is created on one or more columns of a table to enable fast searching and data retrieval. Indexes can significantly improve the performance of database queries, especially for large databases, by reducing the amount of disk I/O and CPU time required to locate and retrieve data. However, creating and maintaining indexes can also add some overhead to the database, so it is important to use them judiciously.

In [8]:
chain = model | parser
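
The `|` operator above composes the two steps so that the model's output becomes the parser's input. The snippet below is a minimal sketch of that idea in plain Python, not LangChain's actual implementation: `Step`, `fake_model`, and `fake_parser` are hypothetical stand-ins, and no API call is made.

```python
class Step:
    """Hypothetical stand-in for a runnable: wraps a one-argument function."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for the model and the parser (assumptions, no network access)
fake_model = Step(lambda messages: {"content": f"echo: {messages[-1]}"})
fake_parser = Step(lambda message: message["content"])

fake_chain = fake_model | fake_parser
print(fake_chain.invoke(["Answer briefly!", "What is indexing?"]))
```

Calling `fake_chain.invoke(...)` runs both steps in order, which is why the real chain returns a plain string instead of a message object.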

In [9]:
messages = [
    SystemMessage(content="Give the exact math formula for the provided term"),
    HumanMessage(content="Self-attention in transformer architecture"),
]

In [10]:
result = chain.invoke(messages)

In [11]:
Markdown(result)

The self-attention mechanism in the Transformer architecture can be calculated using the following formulas:

1. Query, Key, and Value matrices are first calculated by multiplying the input matrix (X) with the corresponding weight matrices (W^Q, W^K, W^V):

Query (Q) = X * W^Q
Key (K) = X * W^K
Value (V) = X * W^V

where * denotes matrix multiplication.

2. The attention scores are calculated by taking the dot product of the Query and Key matrices, dividing by the square root of the Key dimension (d_k), and applying a softmax function:

Attention(Q, K, V) = softmax(Q * K^T / sqrt(d_k)) * V

where ^T denotes the transpose of a matrix.

3. The output of the self-attention mechanism is then calculated by adding a bias term (b) and applying a weight matrix (W^O) through linear transformation:

Output = Attention(Q, K, V) * W^O + b

where + denotes matrix addition.

These formulas are used to calculate the self-attention mechanism in each of the attention heads in the Transformer architecture. The final output is then calculated by concatenating the outputs of all attention heads and applying another linear transformation.
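
To make steps 1 and 2 of the model's answer concrete, here is a small standard-library sketch of single-head scaled dot-product attention. The toy input and the identity weight matrices are assumptions chosen for readability; the output projection from step 3 is left out.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single head."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)


# Toy input: 2 tokens of width 2; identity weights are an illustrative assumption
x = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
q, k, v = matmul(x, identity), matmul(x, identity), matmul(x, identity)

out = attention(q, k, v)
print(out)
```

Because V is the identity here, each output row equals that token's attention weights, so every row sums to 1 and each token attends most strongly to itself.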

In [12]:
from langchain_core.prompts import ChatPromptTemplate

In [13]:
system_template = "Translate the following into {language}:"
prompt_template = ChatPromptTemplate.from_messages(
    [("system", system_template), ("user", "{text}")]
)

In [14]:
result = prompt_template.invoke({"language": "Russian", "text": "Machine Learning is good"})
result.to_messages()

[SystemMessage(content='Translate the following into Russian:', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Machine Learning is good', additional_kwargs={}, response_metadata={})]
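
The output above shows that invoking the template only substitutes the `{language}` and `{text}` placeholders and tags each message with its role. As a rough standard-library approximation (the helper `fill_messages` is hypothetical and not part of LangChain), the substitution can be sketched as:

```python
def fill_messages(templates, values):
    """Hypothetical helper: format each (role, template) pair with `values`."""
    return [(role, template.format(**values)) for role, template in templates]


templates = [
    ("system", "Translate the following into {language}:"),
    ("user", "{text}"),
]
filled = fill_messages(
    templates, {"language": "Russian", "text": "Machine Learning is good"}
)
print(filled)
```

The real `ChatPromptTemplate` additionally wraps each result in a message class such as `SystemMessage` or `HumanMessage`, as the cell output shows.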

In [32]:
chain = prompt_template | model | parser

In [33]:
out = chain.invoke({"language": "Russian", "text": "Machine Learning is good"})
Markdown(out)

Машинное обучение хорошо (Mashinnoe obuchenie horosho).

I hope this helps! Let me know if you have any other questions.