# GovernmentGPT: Inference

We wanted to see whether an LLM could be taught to do the job of elected British Members of Parliament (MPs) and debate any issue as they do in the House of Commons.

GovernmentGPT is an LLM fine-tuned with a LoRA adapter. You can see the code for this here: https://github.com/stewhsource/GovernmentGPT/FineTuning

This notebook lets you download and set up GovernmentGPT, and then seed it with any debate you would like it to continue.

LLMs are computationally heavy, so this notebook needs to be run on a machine with a GPU. Google Colab provides this quickly and easily.
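Before installing anything, it can help to confirm that the runtime actually has a GPU attached. The following is a minimal sketch that uses only the Python standard library and assumes an NVIDIA GPU with the `nvidia-smi` tool on the path (as is typical on Colab GPU runtimes). The helper name `list_nvidia_gpus` is hypothetical and not part of this notebook.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return one description string per visible NVIDIA GPU, or [] if none is found."""
    if shutil.which("nvidia-smi") is None:
        return []  # NVIDIA driver tools are not installed on this machine
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []  # the tool exists but could not talk to a GPU
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
print(gpus if gpus else "No NVIDIA GPU detected; in Colab, select a GPU runtime first.")
```

An empty list usually means the notebook is running on a CPU-only runtime, in which case loading the 7B model is unlikely to work.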

## Installation

Clone the `mistral-finetune` repo:


In [1]:
%cd /content/
!git clone https://github.com/mistralai/mistral-finetune.git

/content
Cloning into 'mistral-finetune'...
remote: Enumerating objects: 401, done.
remote: Counting objects: 100% (142/142), done.
remote: Compressing objects: 100% (48/48), done.
remote: Total 401 (delta 125), reused 94 (delta 94), pack-reused 259
Receiving objects: 100% (401/401), 210.17 KiB | 21.02 MiB/s, done.
Resolving deltas: 100% (209/209), done.


Install all required dependencies:

In [2]:
!pip install -r /content/mistral-finetune/requirements.txt

Collecting fire (from -r /content/mistral-finetune/requirements.txt (line 1))
  Downloading fire-0.6.0.tar.gz (88 kB)
  Preparing metadata (setup.py) ... done
Collecting mistral-common>=1.1.0 (from -r /content/mistral-finetune/requirements.txt (line 4))
  Downloading mistral_common-1.2.1-py3-none-any.whl (704 kB)
Collecting torch==2.2 (from -r /content/mistral-finetune/requirements.txt (line 9))
  Downloading torch-2.2.0-cp310-cp310-manylinux1_x86_64.whl (755.5 MB)
Collecting triton==2.2 (fro

## Mistral 7B model download
The base Mistral 7B model can be downloaded directly from the Mistral CDN or via Hugging Face. Choose whichever works best for you.

In [3]:
#!wget https://models.mistralcdn.com/mistral-7b-v0-3/mistral-7B-v0.3.tar

In [4]:
#!DIR=/content/mistral_models && mkdir -p $DIR && tar -xf mistral-7B-v0.3.tar -C $DIR

In [5]:
# Alternatively, you can download the model from Hugging Face
# (sometimes needed because the Mistral mirror can be slow from Colab).
# Note: you'll need to make your Hugging Face token available as HF_TOKEN
# (in Colab, add it under Secrets).

!pip install huggingface_hub
import os
from pathlib import Path

from google.colab import userdata  # Colab Secrets module
from huggingface_hub import snapshot_download

# Create the target directory, including any missing parent directories
mistral_models_path = Path('/content/mistral_models/7B-v0.3')
mistral_models_path.mkdir(parents=True, exist_ok=True)

# Expose the Hugging Face API token to huggingface_hub
os.environ["HF_TOKEN"] = userdata.get('HF_TOKEN')

# Download only the three files needed for inference
snapshot_download(repo_id="mistralai/Mistral-7B-v0.3", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=str(mistral_models_path))


Fetching 3 files:   0%|          | 0/3 [00:00<?, ?it/s]

consolidated.safetensors:   0%|          | 0.00/14.5G [00:00<?, ?B/s]

params.json:   0%|          | 0.00/202 [00:00<?, ?B/s]

tokenizer.model.v3:   0%|          | 0.00/587k [00:00<?, ?B/s]

'/content/mistral_models/7B-v0.3'

In [6]:
!ls /content/mistral_models

7B-v0.3
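Listing the parent directory only confirms that the folder exists. As a hedged sketch, the check below looks for the three files requested in the download cell above (`params.json`, `consolidated.safetensors`, `tokenizer.model.v3`). The helper `missing_model_files` is a hypothetical name introduced for illustration.

```python
from pathlib import Path

# File names taken from the allow_patterns list in the download cell
REQUIRED_FILES = ("params.json", "consolidated.safetensors", "tokenizer.model.v3")


def missing_model_files(model_dir, required=REQUIRED_FILES):
    """Return the names of required files that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]


missing = missing_model_files("/content/mistral_models/7B-v0.3")
print("All model files present" if not missing else f"Missing: {missing}")
```

If anything is reported missing, re-running the download cell is usually enough, since already-downloaded files should not need to be fetched again.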


## Load the LoRA adapter
Download the LoRA adapter and prepare it for inference.

In [7]:
# Create directory structure
!mkdir -p /content/governmentgpt
!mkdir -p /content/governmentgpt/lora_adapter

In [8]:
!wget -O /content/governmentgpt/lora_adapter.zip https://stewh-publicdata.s3.eu-west-2.amazonaws.com/governmentgpt/2024-06-07/lora_adapter/mistral7b_v3_governmentgpt_lora_250k_2024-06-07.zip

--2024-06-20 13:17:06--  https://stewh-publicdata.s3.eu-west-2.amazonaws.com/governmentgpt/2024-06-07/lora_adapter/mistral7b_v3_governmentgpt_lora_250k_2024-06-07.zip
Resolving stewh-publicdata.s3.eu-west-2.amazonaws.com (stewh-publicdata.s3.eu-west-2.amazonaws.com)... 52.95.149.106, 3.5.244.176, 52.95.191.6, ...
Connecting to stewh-publicdata.s3.eu-west-2.amazonaws.com (stewh-publicdata.s3.eu-west-2.amazonaws.com)|52.95.149.106|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 264049024 (252M) [application/zip]
Saving to: ‘/content/governmentgpt/lora_adapter.zip’


2024-06-20 13:17:22 (16.0 MB/s) - ‘/content/governmentgpt/lora_adapter.zip’ saved [264049024/264049024]



In [9]:
# Unzip
!unzip /content/governmentgpt/lora_adapter.zip -d /

Archive:  /content/governmentgpt/lora_adapter.zip
  inflating: /__MACOSX/._content     
  inflating: /content/.DS_Store      
  inflating: /__MACOSX/content/._.DS_Store  
  inflating: /__MACOSX/content/._governmentgpt  
  inflating: /content/governmentgpt/metrics.train.jsonl  
  inflating: /__MACOSX/content/governmentgpt/._metrics.train.jsonl  
  inflating: /content/governmentgpt/args.yaml  
  inflating: /__MACOSX/content/governmentgpt/._args.yaml  
   creating: /content/governmentgpt/checkpoints/
  inflating: /__MACOSX/content/governmentgpt/._checkpoints  
   creating: /content/governmentgpt/tb/
  inflating: /__MACOSX/content/governmentgpt/._tb  
   creating: /content/governmentgpt/checkpoints/checkpoint_000100/
  inflating: /__MACOSX/content/governmentgpt/checkpoints/._checkpoint_000100  
  inflating: /content/governmentgpt/tb/events.out.tfevents.1717754616.15dc9f3a59ce.21751.1.eval  
  inflating: /__MACOSX/content/governmentgpt/tb/._events.out.tfevents.1717754616.15dc9f3a59ce.21751.
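As the listing shows, the archive was created on macOS and carries `__MACOSX/` and `.DS_Store` entries that are not needed for inference. One possible alternative to `unzip`, sketched here with the standard `zipfile` module, is to skip those entries during extraction. Both function names are hypothetical helpers, not part of the original notebook.

```python
import zipfile


def is_macos_metadata(name):
    """True for macOS Finder/resource-fork entries that are safe to skip."""
    parts = name.split("/")
    return parts[0] == "__MACOSX" or parts[-1] == ".DS_Store"


def extract_without_macos_metadata(zip_path, dest_dir):
    """Extract zip_path into dest_dir, skipping macOS metadata entries."""
    with zipfile.ZipFile(zip_path) as archive:
        wanted = [n for n in archive.namelist() if not is_macos_metadata(n)]
        archive.extractall(dest_dir, members=wanted)
    return wanted


# Example on entry names copied from the unzip listing above
sample = [
    "__MACOSX/._content",
    "content/.DS_Store",
    "content/governmentgpt/args.yaml",
    "content/governmentgpt/checkpoints/",
]
print([n for n in sample if not is_macos_metadata(n)])

# In the notebook this would be called as:
# extract_without_macos_metadata("/content/governmentgpt/lora_adapter.zip", "/")
```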

In [10]:
!pip install mistral_inference

Collecting mistral_inference
  Downloading mistral_inference-1.1.0-py3-none-any.whl (21 kB)
Installing collected packages: mistral_inference
Successfully installed mistral_inference-1.1.0


In [11]:
from mistral_inference.model import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

tokenizer = MistralTokenizer.from_file("/content/mistral_models/7B-v0.3/tokenizer.model.v3")  # change to extracted tokenizer file

# Clear GPU memory first
import torch
torch.cuda.empty_cache()

model = Transformer.from_folder("/content/mistral_models/7B-v0.3")  # change to extracted model dir
model.load_lora("/content/governmentgpt/checkpoints/checkpoint_000100/consolidated/lora.safetensors")

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

## Inference
The cell below runs the fine-tuned GovernmentGPT LLM with an example seed prompt that encourages debate. You can read the real Hansard (https://hansard.parliament.uk/) for examples of how questions are asked, and I recommend you have a play.

You can change the creativity of the debate by varying the temperature parameter between 0 and 2. A value of 0 gives rigid, short responses; a value of 2 is very creative but often goes off on a tangent. I left it at 0.8, which seems to work well.
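To build some intuition for what temperature does, here is a generic, self-contained illustration of temperature-scaled softmax. It is not the internals of `mistral_inference`; the logits are made-up scores for three candidate tokens, and treating a temperature of 0 as "always pick the top token" is a common convention assumed here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Assumed convention: temperature 0 means greedy decoding,
        # so all probability mass goes to the highest-scoring token.
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.0, 0.8, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the most likely token, which tends to give safer, more repetitive text. Higher temperatures flatten the distribution, so unlikely tokens are sampled more often.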

### Input Formatting

Note that the model is fine-tuned on debate data in a specific conversation format (and so will output in that format). You will need to ensure that anything you input into the model uses the same format:

> Speaker: \[type of MP\] for \[Location\] \[(additional roles: [Chancellor of the Exchequer, ...], if any)\]:
>
> Speech transcript: \[What the individual said/asked in the Commons.\]

You can see example types of MP (e.g. Labour, Liberal Democrat, Conservative, SNP), roles and locations in the GovernmentGPT training dataset, as well as in the real Hansard.
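Writing seed prompts by hand makes it easy to drift from this format. A small helper can assemble a turn from its parts. This is a sketch only: `format_turn` is a hypothetical name, the whitespace mirrors the seed strings used in this notebook, and joining multiple roles with a comma is an assumption about the training format.

```python
def format_turn(party, location, speech, roles=None):
    """Build one debate turn in the Speaker / Speech transcript format."""
    speaker = f"{party} MP for {location}"
    if roles:
        # Assumption: multiple roles are comma-separated inside the brackets
        speaker += f" (additional roles: {', '.join(roles)})"
    return f"Speaker: {speaker}: \n\n Speech transcript: {speech} \n\n"


seed = format_turn(
    "Labour",
    "Durham",
    "What plans does the Government have to support biscuit manufacturing?",
)
print(seed)
```

Several turns can be concatenated into one string to seed a longer exchange before handing it to the model.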

You can see examples of this format being used in the code below:

In [20]:
# Seed a debate to prompt GovernmentGPT to continue the debate
content_1 = "Speaker: Labour MP for Durham: \n\n Speech transcript: I am deeply concerned about the risk to a dwindling supply of rich tea biscuits that is being reported by the press due to biscuit factory worker strikes. As righteous British citizens we must protect our most important national biscuit identity for our tea breaks. Can the honourable gentleman outline what they intend to do about it? \n\n"
content_2 = "Speaker: Conservative MP for Norwich: \n\n Speech transcript: What plans have we to support the biscuit manufacturing industry in the north east? \n\n"
content_3 = "Speaker: Labour MP for Manchester South: \n\n Speech transcript: It is clear the finances of this country are in a dire state following 7 years of Tory government. We need fresh thinking to address the systemic issues. What policies does the Tory government plan to introduce? \n\n"
content_4 = "Speaker: Liberal Democrat MP for Northwich: \n\n Speech transcript: Prolonged war at this point seems inevitable in Ukraine. We are in support of supplying weapons for the long term, however we do not agree that we should make endless payments without strong agreement as to what that money is intended for. \n\n"

content_5 = "Speaker: Liberal Democrat MP for Northwich: \n\n Speech transcript: My constituents are writing to me expressing concern at the use of AI which could replace their jobs. I share these concerns, not least because I fear our role as MPs could be replaced by using AI to debate on constituents' behalf. What does the house think of the prospect of our replacement? \n\n"



completion_request = ChatCompletionRequest(messages=[UserMessage(content=content_5)])

tokens = tokenizer.encode_chat_completion(completion_request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=2048, temperature=0.8, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id) # Set temperature to 0.8 for some creative dialogue
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])

print(result)

Speaker: Conservative MP for Witney (additional roles: House Of Commons Minister of State (London).): 
 Speech transcript: I am not sure I would be quite so complacent as the hon. Gentleman seems to be. We are all aware of the potential of AI. I am sure the hon. Gentleman will have seen the fantastic work of the artificial intelligence unit in the Cabinet Office. It is not just about the development of technology, but how it is applied. We are looking at how we can apply it in the health service, in the criminal justice system and in the public sector. We are also looking at how we can use it to help people. I am sure that the hon. Gentleman will be keen to join me in supporting the use of AI in the public sector, in order to help people. 

 Speaker: Conservative MP for North West Leicestershire: 
 Speech transcript: I thank the Minister for his response. Does he agree that, as we look to the future, it is important that we ensure that the use of AI does not lead to job losses, but tha

In [21]:
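def format_output(text):
    """Put each speaker label and speech transcript on its own line."""
    # Drop the model's own line breaks, then re-insert them before each label
    text = text.replace('\n', '')
    text = text.replace('Speaker:', '\n\nSpeaker:')
    text = text.replace('Speech transcript:', '\nSpeech transcript:')
    return text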

# Clean the output format
print(format_output(result))



Speaker: Conservative MP for Witney (additional roles: House Of Commons Minister of State (London).):  
Speech transcript: I am not sure I would be quite so complacent as the hon. Gentleman seems to be. We are all aware of the potential of AI. I am sure the hon. Gentleman will have seen the fantastic work of the artificial intelligence unit in the Cabinet Office. It is not just about the development of technology, but how it is applied. We are looking at how we can apply it in the health service, in the criminal justice system and in the public sector. We are also looking at how we can use it to help people. I am sure that the hon. Gentleman will be keen to join me in supporting the use of AI in the public sector, in order to help people.  

Speaker: Conservative MP for North West Leicestershire:  
Speech transcript: I thank the Minister for his response. Does he agree that, as we look to the future, it is important that we ensure that the use of AI does not lead to job losses, but t
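If you want to work with the generated debate programmatically rather than just print it, the output can be split into structured turns. The sketch below is one possible approach using a regular expression. `parse_turns` is a hypothetical helper, and it assumes every turn follows the `Speaker: ...: Speech transcript: ...` layout shown above (a final turn cut off by `max_tokens` is kept as-is).

```python
import re

# A speaker label ends at the colon that directly precedes "Speech transcript:",
# so colons inside "(additional roles: ...)" are left untouched.
TURN_PATTERN = re.compile(
    r"Speaker:\s*(?P<speaker>.*?):\s*Speech transcript:\s*(?P<speech>.*?)(?=\s*Speaker:|\Z)",
    re.DOTALL,
)


def parse_turns(text):
    """Split generated text into a list of {'speaker', 'speech'} dictionaries."""
    return [
        {"speaker": m.group("speaker").strip(), "speech": m.group("speech").strip()}
        for m in TURN_PATTERN.finditer(text)
    ]


example = (
    "Speaker: Conservative MP for Witney (additional roles: Minister of State.): \n"
    " Speech transcript: I am not sure I agree. \n\n"
    " Speaker: Labour MP for Durham: \n"
    " Speech transcript: Does he agr"
)
for turn in parse_turns(example):
    print(turn["speaker"], "->", turn["speech"])
```

A speech that itself contains the literal text `Speaker:` would be split incorrectly, so treat this as a convenience for inspection rather than a robust parser.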