# RWKV Chains

LLMs are trained on different datasets and with different configurations, so they respond differently to the same input. Because RWKV is not transformer-based, most of the prompts built into LangChain do not work well with it. We therefore collected and built a set of prompts in `RWKV_chains` to make LangChain work with RWKV.

## Load a model first

**Please change `weight_path` (and `tokenizer_json`) in the cell below to your own paths.** The prompts work better with the 7B model; the 14B model is somewhat harder to prompt.

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
# suppress the warnings
import warnings
warnings.filterwarnings('ignore')

In [None]:
import os
os.environ["RWKV_JIT_ON"] = '1'
os.environ["RWKV_CUDA_ON"] = '1' # if '1' then use CUDA kernel for seq mode (much faster)

from langchain.llms import RWKV

weight_path = "D:/weights/rwkv/RWKV-4-Raven-7B-v11-Eng99%-Other1%-20230427-ctx8192.pth"
# weight_path = "/home/ubuntu/Documents/Github/raven-weight-7b.pth"
tokenizer_json = "./20B_tokenizer.json"

model = RWKV(model=weight_path, strategy="cuda fp16i8 *20 -> cuda fp16", tokens_path=tokenizer_json)


## A do-all prompt

The Raven versions of RWKV are instruction fine-tuned on Alpaca-style datasets, so it is much easier to get good results if we wrap every request in the Alpaca prompt before handing it to RWKV. The Alpaca prompt template is shown below, and we provide a `PromptTemplate` called `rwkv_prompt` for ease of use.

```text
Below is an instruction that describes a task. Write a response that appropriately completes the request.
# Instruction:
{Your instruction or question}

# Response:
```
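
As a rough illustration of what the template does, the sketch below fills it in with plain string formatting. `ALPACA_TEMPLATE` and `build_alpaca_prompt` are hypothetical names used only for this example; the real `rwkv_prompt` is a LangChain `PromptTemplate` and its exact whitespace may differ.

```python
# Hypothetical stand-in for `rwkv_prompt`, written with plain str.format
# so it runs without LangChain or model weights.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "# Instruction:\n"
    "{instruction}\n"
    "\n"
    "# Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a bare instruction or question in the Alpaca-style template."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


prompt = build_alpaca_prompt("Why is fusion energy important?")
print(prompt)
```

The model then continues the text after `# Response:`, which is why the template ends there.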



In [None]:
from rwkv_chains import rwkv_prompt
from langchain.chains import LLMChain
chain = LLMChain(llm=model, prompt=rwkv_prompt, verbose=True)

In [None]:
chain.run("Why is fusion energy important?")

## Summarization with RWKV

Inside RWKV, the computation graph resembles an RNN: a hidden state is updated with each token. When summarizing long texts, the results therefore improve considerably if the instruction to summarize is placed **after** the text, so that it is still fresh in the state when generation starts. The optimized prompt templates are available via `load_rwkv_summarize_chain`. An example is shown below.
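
To make the ordering concrete, here is a minimal sketch that builds both variants of a summarization prompt. The function name `summarize_prompt` and the instruction wording are assumptions made for illustration; the actual templates inside `rwkv_chains` may be phrased differently.

```python
def summarize_prompt(text: str, instruction_last: bool = True) -> str:
    """Build a summarization prompt with the instruction before or after the text.

    The wording is hypothetical; only the ordering matters for this sketch.
    """
    where = "above" if instruction_last else "below"
    instruction = f"Summarize the text {where} in a few sentences."
    if instruction_last:
        parts = [text.strip(), instruction]  # document first, instruction last
    else:
        parts = [instruction, text.strip()]  # conventional ordering
    body = "\n\n".join(parts)
    return f"# Instruction:\n{body}\n\n# Response:\n"


article = "The James Webb Space Telescope has released new images."
rnn_friendly = summarize_prompt(article)  # instruction placed after the text
conventional = summarize_prompt(article, instruction_last=False)
print(rnn_friendly)
```

With the instruction last, the final tokens the model reads before `# Response:` are the request itself rather than the tail of a long document.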

In [None]:
from rwkv_chains import load_rwkv_summarize_chain
from langchain.docstore.document import Document

summary_chain = load_rwkv_summarize_chain(llm=model, chain_type="stuff", output_key="summary")
# map_reduce and refine are also available

with open("james_webb_news.txt", "r", encoding="utf-8") as f:
    text = f.read().strip()
docs = [Document(page_content=text)]
result = summary_chain.run(docs)

In [None]:
print(result)

In [None]:
# map_reduce is also possible, and it can handle very long articles.

mr_chain = load_rwkv_summarize_chain(llm=model, chain_type="map_reduce")

# This helps when an article is too long to fit into a single forward pass of the LLM.
docs2 = [Document(page_content=i) for i in text.split("\n\n")]
result2 = mr_chain.run(docs2)
print(result2)
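
Splitting on every blank line can produce many very small documents. One possible refinement, sketched below, is to merge consecutive paragraphs until a size budget is reached. `chunk_paragraphs` and `max_chars` are hypothetical names, and character count is used here only as a crude proxy for token count.

```python
from typing import List


def chunk_paragraphs(text: str, max_chars: int = 2000) -> List[str]:
    """Greedily merge blank-line-separated paragraphs into chunks of at most
    `max_chars` characters. A single oversized paragraph is kept whole."""
    chunks: List[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue  # skip empty fragments caused by extra blank lines
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate  # still within budget, or first paragraph of a chunk
        else:
            chunks.append(current)  # budget exceeded: close the current chunk
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


sample = "a\n\nbb\n\nccc"
chunks = chunk_paragraphs(sample, max_chars=5)
print(chunks)
```

Each chunk could then be wrapped in a `Document` in the same way as `docs2` above.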

## Math with RWKV

RWKV can also do math with the help of external tools. Here we show how RWKV can call the Python library `numexpr` to evaluate arithmetic. The optimized prompt template is `LLM_MATH_PROMPT`. An example is shown below.
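
The idea is that the model only translates the question into an arithmetic expression, and a tool computes the result. The sketch below imitates the evaluation step using Python's standard `ast` module as a stand-in for `numexpr`, so that it runs without extra dependencies. `evaluate_expression` is a hypothetical helper, not part of LangChain or `rwkv_chains`.

```python
import ast
import operator

# Whitelisted operators; anything else (names, calls, attributes) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_expression(expression: str):
    """Evaluate a plain arithmetic expression such as one produced by the LLM."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression.strip(), mode="eval"))


# Suppose the model turned "What is 592138 * 4242?" into this expression.
answer = evaluate_expression("592138 * 4242")
print(answer)
```

Delegating the arithmetic this way means the model does not have to multiply large numbers token by token, which language models tend to do unreliably.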

In [None]:
from rwkv_chains import LLM_MATH_PROMPT
from langchain.chains import LLMMathChain

math_chain = LLMMathChain(llm=model, prompt=LLM_MATH_PROMPT, verbose=True)

math_chain.run("What is 592138 * 4242?")

## Bash with RWKV

In [None]:
from rwkv_chains import LLMBashChain

bash_chain = LLMBashChain.from_llm(llm=model, verbose=True)

bash_chain.run("On a windows computer, simply list all files under current directory.")

## Document based QA with RWKV

In [None]:
from rwkv_chains import load_rwkv_qa_chain

qa_chain = load_rwkv_qa_chain(llm=model, verbose=True, chain_type="stuff")

qa_chain.run(input_documents=docs, question="Did the article also introduce winners back in 1980?")