# RWKV Chains

LLMs are trained on different datasets and with different configurations, so they respond differently to the same input. Because RWKV is not transformer-based, most of the prompts built into LangChain do not work well with it. We therefore collected and built a set of prompts in `RWKV_chains` to make LangChain work with RWKV.

## 0. Load a model first

**Please change `weight_path` to your own model path.** The prompts work better with the 7B model; the 14B model is a bit harder to prompt.

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
import os
os.environ["RWKV_JIT_ON"] = '1'
os.environ["RWKV_CUDA_ON"] = '1' # if '1' then use CUDA kernel for seq mode (much faster)

from langchain.llms import RWKV

# weight_path = "D:/weights/rwkv/RWKV-4-Raven-7B-v11-Eng99%-Other1%-20230427-ctx8192.pth"
weight_path = "/home/ubuntu/Documents/Github/raven-weight-7b.pth"
tokenizer_json = "./20B_tokenizer.json"

model = RWKV(model=weight_path, strategy="cuda fp16i8 *20 -> cuda fp16", tokens_path=tokenizer_json)


Using /home/ubuntu/.cache/torch_extensions/py310_cu117 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/ubuntu/.cache/torch_extensions/py310_cu117/wkv_cuda/build.ninja...
Building extension module wkv_cuda...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
Loading extension module wkv_cuda...


ninja: no work to do.
RWKV_JIT_ON 1 RWKV_CUDA_ON 1 RESCALE_LAYER 6

Loading /home/ubuntu/Documents/Github/raven-weight-7b.pth ...
Strategy: (total 32+1=33 layers)
* cuda [float16, uint8], store 20 layers
* cuda [float16, float16], store 13 layers
0-cuda-float16-uint8 1-cuda-float16-uint8 2-cuda-float16-uint8 3-cuda-float16-uint8 4-cuda-float16-uint8 5-cuda-float16-uint8 6-cuda-float16-uint8 7-cuda-float16-uint8 8-cuda-float16-uint8 9-cuda-float16-uint8 10-cuda-float16-uint8 11-cuda-float16-uint8 12-cuda-float16-uint8 13-cuda-float16-uint8 14-cuda-float16-uint8 15-cuda-float16-uint8 16-cuda-float16-uint8 17-cuda-float16-uint8 18-cuda-float16-uint8 19-cuda-float16-uint8 20-cuda-float16-float16 21-cuda-float16-float16 22-cuda-float16-float16 23-cuda-float16-float16 24-cuda-float16-float16 25-cuda-float16-float16 26-cuda-float16-float16 27-cuda-float16-float16 28-cuda-float16-float16 29-cuda-float16-float16 30-cuda-float16-float16 31-cuda-float16-float16 32-cuda-float16-float16 
emb.weight

## A do-all prompt

The Raven versions of RWKV are instruction fine-tuned with Alpaca datasets, so it is much easier to wrap every request in the Alpaca prompt before asking RWKV to do anything. The Alpaca prompt template is shown below, and we provide a `PromptTemplate` called `rwkv_prompt` for ease of use.

```text
Below is an instruction that describes a task. Write a response that appropriately completes the request.
# Instruction:
{Your instruction or question}

# Response:
```
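
As a rough illustration of how this template gets filled in, the sketch below formats it with plain `str.format`. The names `ALPACA_TEMPLATE` and `format_alpaca_prompt` are hypothetical and are not part of `rwkv_chains`; the actual `rwkv_prompt` is a LangChain `PromptTemplate` and may differ in details.

```python
# Hypothetical stand-in for the Alpaca-style template shown above.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "# Instruction:\n"
    "{instruction}\n"
    "\n"
    "# Response:\n"
)


def format_alpaca_prompt(instruction: str) -> str:
    """Wrap a bare instruction or question in the Alpaca-style template."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


prompt = format_alpaca_prompt("Why is fusion energy important?")
print(prompt)
```

The prompt ends right after `# Response:`, so the model continues the text with its answer.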



In [4]:
from rwkv_chains import rwkv_prompt
from langchain.chains import LLMChain
chain = LLMChain(llm=model, prompt=rwkv_prompt, verbose=True)

In [5]:
chain.run("Why is fusion energy important?")



> Entering new LLMChain chain...
Prompt after formatting:
Below is an instruction that describes a task. Write a response that appropriately completes the request.
# Instruction:
Why is fusion energy important?

# Response:


> Finished chain.


'Fusion energy is important because it has the potential to provide a virtually limitless, clean source of energy that does not produce greenhouse gases or other pollutants. This could help to reduce our dependence on fossil fuels and decrease our carbon footprint. Additionally, fusion energy has the potential to create a new industry that could create jobs and stimulate economic growth.'

## Summarization with RWKV

Inside RWKV, the computation graph resembles an RNN: a hidden state is updated with each token. For summarization of long texts, the results therefore improve considerably when the instruction to summarize comes **after** the text. The optimized prompt templates are available via `load_rwkv_summarize_chain`. An example usage is shown below.

In [6]:
from rwkv_chains import load_rwkv_summarize_chain
from langchain.docstore.document import Document

summary_chain = load_rwkv_summarize_chain(llm=model, chain_type="stuff", output_key="summary")
# map_reduce and refine are also available

with open("james_webb_news.txt", "r", encoding="utf-8") as f:
    text = f.read().strip()
docs = [Document(page_content=text)]
result = summary_chain.run(docs)

In [7]:
print(result)


NASA's James Webb Space Telescope team has been awarded the 2023 Michael Collins Trophy for Lifetime and Current Achievements, recognizing the team's achievements in pushing the boundaries of space science. The trophy is a miniature version of the iconic "Web of Space" sculpture created by artist John Safer.
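
The text-first, instruction-last ordering described in this section can be sketched in a few lines of plain Python. This is only an illustration: `SUMMARY_INSTRUCTION` and `build_summary_prompt` are hypothetical names, and the wording of the real templates inside `rwkv_chains` may differ.

```python
# Hypothetical instruction wording; the real templates live in rwkv_chains.
SUMMARY_INSTRUCTION = "Write a concise summary of the text above."


def build_summary_prompt(text: str, instruction: str = SUMMARY_INSTRUCTION) -> str:
    """Put the document first and the instruction last.

    An RNN-like model reads the prompt token by token, so an instruction
    placed after a long document is the most recent thing in its state
    when generation starts.
    """
    return f"{text.strip()}\n\n# Instruction:\n{instruction}\n\n# Response:\n"


article = "The telescope team received an award for its achievements."
summary_prompt = build_summary_prompt(article)
print(summary_prompt)
```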


In [14]:
# map reduce is also possible, and it can work over a very long article.

mr_chain = load_rwkv_summarize_chain(llm=model, chain_type="map_reduce")

# This helps with very long articles that do not fit into a single forward pass of the LLM.
docs2 = [Document(page_content=i) for i in text.split("\n\n")]
result2 = mr_chain.run(docs2)
print(result2)


NASA's James Webb Space Telescope team has received the 2023 Michael Collins Trophy for Lifetime and Current Achievements, which recognizes their significant contributions to the field of aerospace science and technology. The trophy is named after NASA's first astronaut, who achieved his goal of exploring the moon in 1969. The trophy was created by NASA to recognize those who have made significant contributions to the agency's space missions.
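
For readers unfamiliar with the `map_reduce` chain type, the idea can be sketched without LangChain: summarize each chunk on its own, then summarize the joined partial summaries. The function names below are hypothetical, and `truncate` is a toy stand-in for an LLM call, not what the chain actually does.

```python
from typing import Callable


def map_reduce_summarize(text: str, summarize: Callable[[str], str]) -> str:
    """Summarize paragraph-sized chunks, then summarize the partial summaries."""
    # Map step: split on blank lines (as done for docs2) and summarize each chunk.
    chunks = [chunk.strip() for chunk in text.split("\n\n") if chunk.strip()]
    partial_summaries = [summarize(chunk) for chunk in chunks]
    # Reduce step: one more call over the much shorter combined text.
    return summarize("\n".join(partial_summaries))


def truncate(chunk: str, limit: int = 40) -> str:
    """Toy stand-in for an LLM summarizer: keep the first `limit` characters."""
    return chunk[:limit]


long_text = "First paragraph about the award.\n\nSecond paragraph about the telescope."
print(map_reduce_summarize(long_text, truncate))
```

Each call only ever sees one chunk or the list of partial summaries, which is why this approach copes with articles longer than the model's context.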


## Math with RWKV

RWKV can also do math with the help of external tools. Here we show how RWKV can call Python's `numexpr` to evaluate an expression. The optimized prompt template is `LLM_MATH_PROMPT`. An example usage is shown below.

In [15]:
from rwkv_chains import LLM_MATH_PROMPT
from langchain.chains import LLMMathChain

math_chain = LLMMathChain(llm=model, prompt=LLM_MATH_PROMPT, verbose=True)

math_chain.run("What is 458123412 / 4058912?")





> Entering new LLMMathChain chain...
What is 458123412 / 4058912?
```text
458123412 / 4058912
```
...numexpr.evaluate("458123412 / 4058912")...

Answer: 112.86852535851972
> Finished chain.


'Answer: 112.86852535851972'
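
The trace above suggests a two-step pattern: the model writes the expression inside a `text` block, and a tool evaluates it. The sketch below imitates that pattern with the standard library only. It swaps in a small `ast`-based evaluator instead of `numexpr`, and the helper names are hypothetical, so treat it as an illustration rather than how `LLMMathChain` is implemented.

```python
import ast
import operator
import re

FENCE = "`" * 3  # built programmatically so this example does not nest fences

# Arithmetic operators the evaluator is willing to apply.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def extract_expression(llm_output: str) -> str:
    """Return the contents of the first text block in the model output."""
    match = re.search(rf"{FENCE}text\n(.*?)\n{FENCE}", llm_output, re.DOTALL)
    if match is None:
        raise ValueError("no text block found in LLM output")
    return match.group(1).strip()


def safe_eval(expression: str) -> float:
    """Evaluate plain arithmetic without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


llm_output = f"{FENCE}text\n458123412 / 4058912\n{FENCE}\n"
expression = extract_expression(llm_output)
value = safe_eval(expression)
print(f"Answer: {value}")
```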

## Bash with RWKV

In [17]:
from rwkv_chains import LLMBashChain

bash_chain = LLMBashChain.from_llm(llm=model, verbose=True)

bash_chain.run("List all files under current directory")



> Entering new LLMBashChain chain...
List all files under current directory
# Response:
To list all files under the current directory, you can use the `ls` command. To format it in a readable way, you can add spaces between the items and use bullet points to separate them. Here's an example:
```bash
ls # List all files in the current directory
```
This will display a list of all files in the current directory, including their names and sizes if they are readable. If you want to see only the file names, you can use the `-l` option:
```bash
ls -l # List only file names and their sizes
```
Code: ['ls # List all files in the current directory', 'ls -l # List only file names and their sizes']
Answer: 20B_tokenizer.json
current_date_time.txt
james_webb_news.txt
main.ipynb
rwkv_chains

> Finished chain.


'20B_tokenizer.json\ncurrent_date_time.txt\njames_webb_news.txt\nmain.ipynb\nrwkv_chains\n'
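
The `Code:` line in the trace shows that the commands are pulled out of the `bash` blocks in the model's reply before anything is executed. A minimal sketch of that extraction step is given below. The helper name `extract_bash_blocks` is hypothetical, and the actual parsing inside `LLMBashChain` may differ; the sketch deliberately stops before running anything, since model-generated shell commands should be reviewed first.

```python
import re

FENCE = "`" * 3  # built programmatically so this example does not nest fences


def extract_bash_blocks(llm_output: str) -> list:
    """Collect the body of every bash block found in the model output."""
    pattern = rf"{FENCE}bash\n(.*?)\n{FENCE}"
    return [block.strip() for block in re.findall(pattern, llm_output, re.DOTALL)]


reply = (
    "You can use the `ls` command. Here's an example:\n"
    f"{FENCE}bash\nls # List all files in the current directory\n{FENCE}\n"
    "If you want more detail:\n"
    f"{FENCE}bash\nls -l # List only file names and their sizes\n{FENCE}"
)

commands = extract_bash_blocks(reply)
for command in commands:
    print(command)
```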

## Document-based QA with RWKV

In [28]:
from rwkv_chains import load_rwkv_qa_chain

qa_chain = load_rwkv_qa_chain(llm=model, verbose=False, chain_type="stuff")

qa_chain.run(input_documents = docs2, question="What does JWST mean for space exploration?")

'\nThe James Webb Space Telescope (JWST) is a powerful space telescope that will be launched in October 2021 and will be the largest and most powerful space telescope ever created. It will help us better understand our universe and its history, including the origins of life, the evolution of galaxies, and the formation of planets around other stars. JWST is also expected to help us answer some of the biggest questions about our place in the universe, such as whether we are alone in the cosmos.'

## Retrieval QA with RWKV

In [None]:
from langchain.chains import RetrievalQA
from langchain.vectorstores import FAISS
from langchain.vectorstores.base import VectorStoreRetriever
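# NOTE: the original cell is cut off here; `vectorstore` is a placeholder for a vector store you have built beforehand.
retriever = VectorStoreRetriever(vectorstore=vectorstore)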
retrievalQA = RetrievalQA.from_llm(llm=model, retriever=retriever)