# Language Models on the AI Executive Order

_2023-11-01_

**By Matt Hodges**

![LLM AI EO](https://raw.githubusercontent.com/hodgesmr/llm_ai_eo/main/llm_ai_eo_header.jpg)

On October 30th, 2023, President Biden signed the [Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence](https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/). The order itself is quite sweeping and touches many government departments and agencies, with a focus on harnessing AI's potential and defending against harms and risks.

In this Notebook, we'll deploy language models to rapidly discover information from the Order. For the easiest setup, I recommend trying this out in a Google Colab notebook.

<a target="_blank" href="https://colab.research.google.com/github/hodgesmr/llm_ai_eo/blob/main/llm_ai_eo.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a> <a target="_blank" href="https://github.com/hodgesmr/llm_ai_eo/blob/main/llm_ai_eo.ipynb">
  <img src="https://img.shields.io/badge/-Open_in_Github-blue?logo=github&labelColor=gray" alt="Open In Github"/>
</a>


Many of the strategies presented here are extensions of Simon Willison's work in his blog post, [Embedding paragraphs from my blog with E5-large-v2](https://til.simonwillison.net/llms/embed-paragraphs). Simon also maintains a handy command line utility for working with various language models, aptly named [LLM](https://llm.datasette.io/en/stable/). While Simon's writing largely focuses on the CLI capabilities of the tool (and the usefully opinionated integrations with SQLite), I prefer working with Pandas Dataframes. Here I show how to use the LLM library in that fashion.

Embeddings are kind of a magic black box to end users, but the basic idea is that language models can create vectors of numerical values that represent not only words or sentences, but also the semantic _meaning_ of those words. Early research on this subject comes from [word2vec](https://code.google.com/archive/p/word2vec/). To illustrate: `vector('king') - vector('man') + vector('woman')` is mathematically close to `vector('queen')`. I find that _fascinating_! We'll use this concept to extract and match information against the Executive Order text.
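To make the analogy concrete, here is a minimal sketch using hand-made two-dimensional vectors. These numbers are invented for illustration and are not real word2vec output; real embeddings have hundreds of dimensions and the analogy only holds approximately.

```python
import numpy as np

# Hand-made toy vectors (an assumption for illustration, NOT real word2vec output).
# Axis 0 loosely stands for "royalty", axis 1 for "gender".
toy_vectors = {
    "king": np.array([0.9, 0.8]),
    "queen": np.array([0.9, -0.8]),
    "man": np.array([0.1, 0.8]),
    "woman": np.array([0.1, -0.8]),
    "apple": np.array([-0.7, 0.0]),
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# vector('king') - vector('man') + vector('woman')
target = toy_vectors["king"] - toy_vectors["man"] + toy_vectors["woman"]

# Find the closest remaining word, excluding the three words used in the arithmetic.
candidates = [w for w in toy_vectors if w not in {"king", "man", "woman"}]
nearest = max(candidates, key=lambda w: cosine(target, toy_vectors[w]))
print(nearest)
```

With these toy values, the nearest remaining word is `queen`.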

We'll deploy a technique known as [Retrieval-Augmented Generation](https://research.ibm.com/blog/retrieval-augmented-generation-RAG). At a high level, this allows us to inject context into an LLM without training or tuning it. We use another system to locate language that likely contains the answer to our query, and then ask the model to pull it out for us.

Our high-level strategy:

1.   Calculate embeddings on the Executive Order text
2.   Calculate embeddings on a query
3.   Calculate the cosine similarity between every text embedding and the query
4.   Select the top passages that are semantically most similar to the query
5.   Pass the passages and the query to the LLM for rapid summarization
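
The first four steps can be sketched end to end with a stand-in embedding. In this sketch, `toy_embed` is a hypothetical word-count function that only captures word overlap, not meaning, and the sample passages are made-up sentences rather than quotes from the Order. The rest of this Notebook uses a real embedding model instead.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_passages(passages, query, k=3):
    passage_vectors = [toy_embed(p) for p in passages]  # step 1
    query_vector = toy_embed(query)  # step 2
    scored = [
        (cosine_similarity(query_vector, v), p)  # step 3
        for v, p in zip(passage_vectors, passages)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # step 4
    return [p for _, p in scored[:k]]


# Made-up sample sentences, not text from the Executive Order.
passages = [
    "Agencies shall report on AI safety testing.",
    "The Secretary of Defense shall assess cyber defense pilots.",
    "Grants will support AI research in healthcare.",
]
best = top_passages(passages, "cyber defense", k=1)
print(best)
# Step 5 would pass `best` and the query to a language model.
```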

## Environment

First install the dependencies, which include the [MLC LLaMA 2 model](https://mlc.ai) for summarization, the [LLM](https://llm.datasette.io/en/stable/) library, and the [E5-large-v2](https://huggingface.co/intfloat/e5-large-v2) language model for text embedding.

Note, these models are constantly changing, and getting them up and running on your system might take some independent investigation. If running in Google Colab, check [this tutorial for MLC](https://colab.research.google.com/github/mlc-ai/notebooks/blob/main/mlc-llm/tutorial_chat_module_getting_started.ipynb). If running LLaMA with the LLM library on macOS, check the [repository's instructions](https://github.com/simonw/llm-mlc).

In [1]:
%%capture
!pip install --pre -U -f https://mlc.ai/wheels mlc-chat-nightly-cu118 mlc-ai-nightly-cu118
!git lfs install
!pip install llm
!llm install llm-sentence-transformers
!llm sentence-transformers register intfloat/e5-large-v2 -a lv2
!llm install llm-mlc
!llm mlc setup
!llm mlc download-model Llama-2-7b-chat --alias llama2

**Notes:**
+ For some reason, installing git lfs via conda didn't seem to work; I had to reinstall it during the `llm install llm-mlc` step.
+ I had to run a manual step after `llm mlc setup`:

```
llm mlc pip install --pre --force-reinstall \
  mlc-ai-nightly \
  mlc-chat-nightly \
  -f https://mlc.ai/wheels
```
I thought I had already installed these packages, but the forced reinstall was still needed.



## Load Data

Before getting started, we need the Executive Order text to work against. This is probably the least interesting part of this Notebook. I simply opened the Order in Firefox reader view, copy+pasted it into VSCode, did some manual find/replace to clean up the white space, and then concatenated paragraphs to get chunks as close to 400 words as I could. I picked 400 because the embedding model truncates at 512 _tokens_, and a token is either a word or a semantically important subset of a word, so I allowed for some buffer. _This took less than half an hour._ Rather than share code to do this work, I simply provide the cleaned text here.
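
The chunking itself was done by hand, but a similar result could probably be approximated in code. The following is a minimal sketch, not the process actually used for this Notebook: `chunk_paragraphs` is a hypothetical helper that greedily concatenates paragraphs while staying within a word budget, and it counts whitespace-separated words rather than model tokens.

```python
def chunk_paragraphs(paragraphs, max_words=400):
    """Greedily concatenate paragraphs into chunks, aiming for at most max_words each."""
    chunks, current, current_words = [], [], 0
    for paragraph in paragraphs:
        n_words = len(paragraph.split())
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and current_words + n_words > max_words:
            chunks.append(" ".join(current))
            current, current_words = [], 0
        current.append(paragraph)
        current_words += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


# Tiny demonstration with a 5-word budget instead of 400.
paragraphs = ["one two three", "four five", "six seven eight nine"]
chunks = chunk_paragraphs(paragraphs, max_words=5)
print(chunks)  # ['one two three four five', 'six seven eight nine']
```

Note that a single paragraph longer than `max_words` still becomes its own oversized chunk, so the embedding model could truncate it.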

Load it into a Pandas Dataframe with a single column:



In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
%conda info


     active environment : llm_ai_eo
    active env location : /local/home/lmmcinn/anaconda3/envs/llm_ai_eo
            shell level : 2
       user config file : /local/home/lmmcinn/.condarc
 populated config files : 
          conda version : 23.7.3
    conda-build version : 3.26.0
         python version : 3.11.4.final.0
       virtual packages : __archspec=1=x86_64
                          __glibc=2.31=0
                          __linux=5.15.0=0
                          __unix=0=0
       base environment : /local/home/lmmcinn/anaconda3  (writable)
      conda av data dir : /local/home/lmmcinn/anaconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /local/home/lmmcinn/anaconda3/pkgs
        

In [22]:
import pandas as pd
import numpy as np

#df = pd.read_csv(
#    "https://raw.githubusercontent.com/hodgesmr/llm_ai_eo/main/eo.txt",
#    sep="_",  # trick to let us read the lines into a Dataframe; '_' not present
#    header=None,
#)
# df = pd.read_csv("eo.txt",
#     sep="_",  # trick to let us read the lines into a Dataframe; '_' not present
#     header=None,
# )

# df.columns = ["passage"]

# df.head()
df = pd.DataFrame({"passage":np.load("data/processed/text_for_embedding.npy")})
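# Remove the leading "passage: " prefix. str.strip() would treat its argument as a
# set of characters and trim them from both ends (requires pandas >= 1.4).
df['passage'] = df.passage.str.removeprefix("passage: ")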
df.head()

| | passage |
|---|---|
| 0 | # Executive Order on the Safe, Secure, and Tru... |
| 1 | By the authority vested in me as President by ... |
| 2 | Section 1.  Purpose.  Artificial intelligence ... |
| 3 | My Administration places the highest urgency o... |
| 4 | In the end, AI reflects the principles of the ... |


## Calculate Embeddings

Now that we have a Dataframe of chunks of the Executive Order, we can calculate embeddings of each chunk. To do this we'll use the [E5-large-v2](https://huggingface.co/intfloat/e5-large-v2) language model, which was trained to handle text prefixed with either `passage: ` or `query: `. Every chunk is considered a passage. We'll add this as another column on our Dataframe.

In [23]:
import llm

embedding_model = llm.get_embedding_model("lv2")
# text_to_embed = df.passage.to_list()

# # Our embedding model expects `passage: ` prefixes
# text_to_embed = [f'passage: {t}' for t in text_to_embed]

# df['embedding'] = list(embedding_model.embed_multi(text_to_embed))

# df.head()
df['embedding'] = [list(x) for x in np.load("data/processed/all_sentence_embeddings.npy")]
df.head()

| | passage | embedding |
|---|---|---|
| 0 | # Executive Order on the Safe, Secure, and Tru... | [0.033159319311380386, -0.0433826707303524, 0.... |
| 1 | By the authority vested in me as President by ... | [0.021663667634129524, -0.044645559042692184, ... |
| 2 | Section 1.  Purpose.  Artificial intelligence ... | [0.011593389324843884, -0.04116535559296608, 0... |
| 3 | My Administration places the highest urgency o... | [0.04392296448349953, -0.06537173688411713, 0.... |
| 4 | In the end, AI reflects the principles of the ... | [0.02728632092475891, -0.023925885558128357, 0... |


For our semantic search, we'll also need an embedding of our query, which the model expects to be prefixed with `query: `. Let's ask what the Order says about defense, security, and intelligence:

In [24]:
query = "what does it say about defense, security & intelligence?"

# Our embedding model expects `query: ` prefix for retrieval
query_to_embed = f"query: {query}"
query_vector = embedding_model.embed(query_to_embed)

print(query_vector)

[0.011814484372735023, -0.04535473510622978, 0.023898035287857056, -0.02212279848754406, -0.019057705998420715, 0.04820643737912178, -0.03767821565270424, -0.051595065742731094, -0.03109927661716938, 0.005171871744096279, 0.031261146068573, -0.025956042110919952, 0.01550287939608097, 0.044989779591560364, 0.020800543949007988, 0.018533820286393166, -0.0046169208362698555, 0.009546766057610512, 0.007762834895402193, -0.06816984713077545, 0.04740738868713379, -0.03433338552713394, -0.02944430336356163, 0.04400010406970978, -0.022500235587358475, 0.003246377920731902, -0.03938307613134384, -0.021725306287407875, -0.028905969113111496, 0.0030106347985565662, 0.03686890751123428, 0.037305280566215515, 0.04020002856850624, -0.055838581174612045, 0.03095896914601326, -0.00394625635817647, -0.04436859115958214, -0.03410295024514198, 0.02483571134507656, 0.0111059146001935, -0.01766352914273739, 0.04015745595097542, -0.03960525989532471, 0.03975656256079674, -0.014060978777706623, -0.0258172657

## Semantic Search

If we were using the LLM module's preferred structures for Collection and storing data in SQLite, we could simply use [llm similar](https://llm.datasette.io/en/stable/embeddings/cli.html#llm-similar) or its [corresponding Python API](https://llm.datasette.io/en/stable/embeddings/python-api.html#retrieving-similar-items). As far as I can tell, the API doesn't yet support other data structures of embeddings (like our Dataframe), so we'll have to calculate [cosine similarities](https://en.wikipedia.org/wiki/Cosine_similarity) ourselves. Lucky for us, we can [borrow from Simon's open source library](https://github.com/simonw/llm/blob/abcb457b20367ee56e27602e3553bb4bd6a17312/llm/__init__.py#L252):

In [25]:
def cosine_similarity(a, b):
    dot_product = sum(x * y for x, y in zip(a, b))
    magnitude_a = sum(x * x for x in a) ** 0.5
    magnitude_b = sum(x * x for x in b) ** 0.5
    return dot_product / (magnitude_a * magnitude_b)

Now, iterate over every embedding in our Dataframe and calculate the similarity score against our query embedding vector:

In [26]:
comp_df = df.copy()
comp_df['similarity'] = comp_df.apply(
    lambda row : cosine_similarity(
        query_vector,
        row.embedding,
    ),
    axis=1,
)

comp_df.head()

| | passage | embedding | similarity |
|---|---|---|---|
| 0 | # Executive Order on the Safe, Secure, and Tru... | [0.033159319311380386, -0.0433826707303524, 0.... | 0.791992 |
| 1 | By the authority vested in me as President by ... | [0.021663667634129524, -0.044645559042692184, ... | 0.760333 |
| 2 | Section 1.  Purpose.  Artificial intelligence ... | [0.011593389324843884, -0.04116535559296608, 0... | 0.781458 |
| 3 | My Administration places the highest urgency o... | [0.04392296448349953, -0.06537173688411713, 0.... | 0.789164 |
| 4 | In the end, AI reflects the principles of the ... | [0.02728632092475891, -0.023925885558128357, 0... | 0.788309 |
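
The row-by-row approach is easy to read, but for larger collections a vectorized version may be faster. The following is a sketch of one possible alternative using NumPy; `cosine_similarities` is a hypothetical helper, not part of the LLM library, and the small example matrix is made up.

```python
import numpy as np


def cosine_similarities(matrix, vector):
    """Cosine similarity between each row of `matrix` and `vector`."""
    matrix = np.asarray(matrix, dtype=float)
    vector = np.asarray(vector, dtype=float)
    row_norms = np.linalg.norm(matrix, axis=1)
    # One matrix-vector product computes every dot product at once.
    return (matrix @ vector) / (row_norms * np.linalg.norm(vector))


# Made-up 2-D "embeddings" to show the shape of the computation.
embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
scores = cosine_similarities(embeddings, [1.0, 0.0])
print(scores)
```

With a Dataframe like the one above, the matrix could be built with something like `np.array(df["embedding"].tolist())`, assuming every embedding has the same length.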


Next, select the 10 passages with the best similarity scores. We'll feed these as context to the LLaMA model.

In [27]:
best_10_matches = comp_df.sort_values("similarity", ascending = False).head(10)
context = "\n".join(best_10_matches.passage.values)

In [28]:
best_10_matches

| | passage | embedding | similarity |
|---|---|---|---|
| 647 | * Chemical, biological, radiological, and nucl... | [0.026140371337532997, -0.06312897801399231, 0... | 0.818548 |
| 97 | (iii)  As set forth in subsection 4.3(b)(i) of... | [0.036299142986536026, -0.03013344295322895, 0... | 0.813695 |
| 145 | (b)  direct continued actions, as appropriate ... | [0.013635944575071335, -0.03427845239639282, 0... | 0.811139 |
| 365 | National Security Affairs that includ | [0.028963901102542877, -0.039021048694849014, ... | 0.809883 |
| 661 | + Details of the evaluations conducted for pot... | [0.014696473255753517, -0.0647835060954094, 0.... | 0.809461 |
| 144 | (a)  provide guidance to the Department of Def... | [0.017834855243563652, -0.05159175023436546, 0... | 0.808733 |
| 6 | (a)  Artificial Intelligence must be safe and ... | [0.020928237587213516, -0.0637168437242508, 0.... | 0.807952 |
| 369 | (iv)   recommendations for the Department of D... | [-0.0004090855654794723, -0.03406615927815437,... | 0.807871 |
| 531 | #### Security | [0.019572122022509575, -0.06393688917160034, 0... | 0.807331 |
| 161 | (f)  The Secretary of State and the Secretary ... | [0.023280823603272438, -0.04131070151925087, -... | 0.806102 |


## Ask the LLM

Now that we've selected the top 10 passages, let's feed them into LLaMA 2.

In [29]:
model = llm.get_model("llama2")

Even though we're providing prefixed context to the model, it's helpful to give it a system prompt to guide how it responds. This can help it stay "focussed" on the context and respond in the voice that we expect. The system prompt is open to creativity and experimentation.

In [30]:
system = "You are an assistant. You answer questions in a single \
paragraph about the policy. The provided context \
comes directly from the policy. You MUST use the provided information \
as context. Not all provided information will be helpful, ONLY reference \
information if it is related to my query. You may quote the context \
information if helpful."
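
Since the system prompt asks the model to separate helpful context from unhelpful context, it may also be worth labelling the pieces of the user prompt explicitly. The following is a sketch of one possible layout; `build_prompt` is a hypothetical helper, and whether this format actually improves LLaMA 2's answers is an assumption to test, not a result from this Notebook.

```python
def build_prompt(passages, query):
    """Join passages and a query into one labelled prompt string."""
    numbered = [f"[Passage {i}]\n{p}" for i, p in enumerate(passages, start=1)]
    context_block = "\n\n".join(numbered)
    return f"Context:\n{context_block}\n\nQuestion: {query}"


# Placeholder passages, not text from the Executive Order.
prompt = build_prompt(["First passage.", "Second passage."], "What is covered?")
print(prompt)
```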

Now, feed the context and the query into the model.

In [31]:
from IPython.display import display, Markdown
from pprint import pprint

In [None]:
%%time

print(f"Query: {query}\nContext: {context}")
response = model.prompt(
    f'{context}\n{query}',
    system=system,
)

print("Response:\n")
# pprint() prints its argument and returns None, so it should not be wrapped in print()
pprint(response.text())

Query: what does it say about defense, security & intelligence?
Context: * Chemical, biological, radiological, and nuclear risks, such as the ways in which advanced AI systems can lower barriers to entry, including for non-state actors, for weapons development, design acquisition, or use.
(iii)  As set forth in subsection 4.3(b)(i) of this section, within 270 days of the date of this order, the Secretary of Defense and the Secretary of Homeland Security shall each provide a report to the Assistant to the President for National Security Affairs on the results of actions taken pursuant to the plans and operational pilot projects required by subsection 4.3(b)(ii) of this section, including a description of any vulnerabilities found and fixed through the development and deployment of AI capabilities and any lessons learned on how to identify, develop, test, evaluate, and deploy AI capabilities effectively for cyber defense.
(b)  direct continued actions, as appropriate and consistent with 

Overall, this looks like it does a good job!

Of course, it's extremely important to keep a human in the loop when referencing government documents. The model may still hallucinate, or it could entirely miss important context. Some of these shortcomings are baked into the model itself; others are implementation details of this Notebook.

If nothing else, this shows a fascinating interface for interacting with long, wordy documents!