In [1]:
from datasets import load_dataset
from transformers import AutoTokenizer, GPTNeoModel
import torch
import numpy as np
np.random.seed(20250410)

Let's download a text dataset on which to analyze model embeddings. The `fineweb-edu` dataset was built by scraping text from webpages and keeping only the pages that scored highly on an "educational content" metric.

In [2]:
fw = load_dataset("HuggingFaceFW/fineweb-edu", name="CC-MAIN-2024-10", split="train", streaming=True)

Resolving data files:   0%|          | 0/2110 [00:00<?, ?it/s]

Resolving data files:   0%|          | 0/50 [00:00<?, ?it/s]

To get a sense of what's in the data, let's print a few examples.

In [3]:
n_stream = 20
texts = []
for x in fw:
    texts.append(x["text"])
    if len(texts) >= n_stream: break  # stop once exactly n_stream documents are collected

[f"{s[:200]}..." for s in texts[:10]]

['- It means objects are Garbage Collected more quickly. (Incorrect).\n- Its a good way to make sure all your references are set to null. (Not necessary).\n- Its good practice to implement all the time. (...',
 'CANUSWEST and CANUSWEST North were developed to assist Federal, State/Provincial, local, and Tribal/Aboriginal responders to mitigate the effects of oil and hazardous materials spills on human health ...',
 '– Computer viruses are parasitic programs which are able to replicate themselves, attach themselves to other executables in the computer, and perform some unwanted and often malicious actions. A virus...',
 'For those unfamiliar with Cornish, it is classed as a p-Celtic member of the family of Celtic languages, which was once spoken across much of Europe, and is now restricted to the insular world and Bri...',
 'Democracy is in trouble. No matter what index you look at, the number of countries rated as being fully democratic has declined dramatically over the last twenty ye

The helper function below extracts the activations of every neuron in the layer just before the final prediction step. We could have used any layer, but this pre-prediction layer should hold the highest-level representations. We'll work with a pretrained GPT model, though conceptually nothing here is tied to that choice.

In [4]:
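def extract_embeddings(text, model, tokenizer, layer=-1):
    """Return the hidden states at `layer`, averaged over all tokens of `text`."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=False)
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    # hidden_states[layer] has shape (batch=1, n_tokens, hidden_size)
    return outputs.hidden_states[layer].mean(dim=(0, 1))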


# Load pre-trained model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125m")
model = GPTNeoModel.from_pretrained("EleutherAI/gpt-neo-125m")
model = model.eval()
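
To make the pooling step in `extract_embeddings` concrete, here is a minimal sketch that uses a random NumPy array as a stand-in for the hidden states. The array and its sizes are made up for illustration; the real tensor comes from the model.

```python
import numpy as np

# Hypothetical stand-in for outputs.hidden_states[layer]:
# shape (batch=1, n_tokens=4, hidden_size=3).
rng = np.random.default_rng(0)
hidden = rng.normal(size=(1, 4, 3))

# Averaging over the batch and token axes leaves one value per neuron.
pooled = hidden.mean(axis=(0, 1))

print(pooled.shape)  # (3,)
```

Because the batch holds a single document, this is the same as averaging that document's token vectors.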

The block below runs that helper function on the first `n_stream` text strings. Each column of `H` corresponds to a neuron, and each row to a document.

In [None]:
from tqdm import tqdm

H = []
for x in tqdm(texts[:n_stream]):
    H.append(extract_embeddings(x, model, tokenizer))
H = torch.stack(H).numpy()


100%|██████████| 20/20 [01:11<00:00,  3.55s/it]


Now, let's define some concepts. The proper CAV approach would be to train a small classifier to identify samples that represent each concept. Instead, we'll use a PCBM-style shortcut that simply uses the embeddings of strings naming each concept. In fact, since our data are already text, we don't even need a multimodal model.
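
For comparison, the classifier-based route could look roughly like the sketch below. It is not used in this notebook: the activations are synthetic, and the names `pos`, `neg`, and `true_dir` are hypothetical. The idea is to fit a linear classifier that separates concept examples from other examples and take the unit normal of its decision boundary as the concept vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical activations: 50 "concept" and 50 "other" examples in a
# 16-dimensional space. Concept examples are shifted along `true_dir`.
d = 16
true_dir = np.zeros(d)
true_dir[0] = 1.0
pos = rng.normal(size=(50, d)) + 3.0 * true_dir
neg = rng.normal(size=(50, d))
X = np.vstack([pos, neg])
y = np.concatenate([np.ones(50), np.zeros(50)])

# Fit a logistic regression by plain gradient descent.
w = np.zeros(d)
b = 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.1 * (X.T @ (p - y)) / len(y)
    b -= 0.1 * np.mean(p - y)

# The concept vector is the unit normal of the decision boundary.
cav = w / np.linalg.norm(w)

# Scoring a new activation vector against the concept is a dot product.
score = pos[0] @ cav
```

In practice `pos` and `neg` would be real model activations for texts that do and do not express the concept, and a library classifier could replace the hand-written loop.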

In [None]:
concepts_text = ["literature", "history", "education", "government", "geography", "biology", "computing", "mathematics"]

C = []
for x in tqdm(concepts_text):
    C.append(extract_embeddings(x, model, tokenizer))
C = torch.stack(C).numpy()

100%|██████████| 8/8 [00:01<00:00,  6.14it/s]


The "bottleneck" part of the PCBM trains a simple model on the coordinates of the original samples with respect to the concept basis. The block below shows how these coordinates can be extracted. To understand the equation, remember that the projection of a vector $h$ onto the column space of a matrix $X$ is given by $X \left(X^\top X\right)^{-1} X^{\top}h$. In our case, we want to project onto the span of the rows of $C$ extracted above, so we replace $X$ with $C^\top$ and consider $C^{\top}\left(C C^\top\right)^{-1} C h$ instead. Further, rather than the projected vector itself, we want the coordinates of the projection with respect to the concept basis, so we drop the leading $C^\top$. Finally, to compute this for many vectors $h$ at once, we place them side by side as the columns of the matrix $H^\top$, as done below.
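
As a sanity check on the formula, the sketch below uses small random matrices (the sizes are arbitrary stand-ins, not the notebook's real `C` and `H`). It confirms that $\left(C C^\top\right)^{-1} C H^\top$ matches a least-squares fit, and that the part of each $h$ left over after projection is orthogonal to every concept vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 concepts, 5 documents, 10 neurons.
C = rng.normal(size=(3, 10))   # rows are concept vectors
H = rng.normal(size=(5, 10))   # rows are document embeddings

# Coordinates from the formula in the text: (C C^T)^{-1} C H^T.
coords = np.linalg.inv(C @ C.T) @ C @ H.T            # shape (3, 5)

# The same coordinates, obtained by regressing each h on the columns of C^T.
coords_lstsq, *_ = np.linalg.lstsq(C.T, H.T, rcond=None)

# Multiplying back by C^T recovers the projection itself.
projection = C.T @ coords                             # shape (10, 5)
residual = H.T - projection

print(np.allclose(coords, coords_lstsq))   # True
print(np.allclose(C @ residual, 0.0))      # True
```

When the concept vectors are nearly collinear, `np.linalg.lstsq` is likely the numerically safer choice than forming the explicit inverse.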

In [7]:
import pandas as pd
pd.set_option("display.max_colwidth", 1000)

coordinates = np.linalg.inv(C @ C.T) @ C @ H.T
coordinates_df = pd.DataFrame(np.round(coordinates.T, 2), columns=concepts_text)
coordinates_df["text"] = texts[:n_stream]

coordinates_df

,literature,history,education,government,geography,biology,computing,mathematics,text
0,0.29,-0.08,0.18,0.06,-0.08,-0.1,0.48,-0.28,"- It means objects are Garbage Collected more quickly. (Incorrect).\n- Its a good way to make sure all your references are set to null. (Not necessary).\n- Its good practice to implement all the time. (Incorrect).\n- The Garbage Collector calls IDisposable.Dispose() automatically. (Incorrect).\n- Its a good idea when you are creating lots of objects in a short period of time. (There is a better way).\nWhat is it?\nThe IDisposable interface seems to be a commonly misunderstood and misused interface. Some developers like to implement it as a matter of course thinking it is good practice. But it is absolutely not necessary when dealing with managed objects (native .Net objects). In .Net the Garbage Collector (GC) is more than capable of finding and disposing all objects quickly and efficiently.\nTo quote Andrew Troelsen: ""Allocate an object onto the heap using the new keyword and forget about it"".\nThe GC stores lists of objects in generations ""young"" objects have a lower generation n..."
1,-0.39,-0.26,0.12,0.29,0.46,0.08,0.01,-0.05,"CANUSWEST and CANUSWEST North were developed to assist Federal, State/Provincial, local, and Tribal/Aboriginal responders to mitigate the effects of oil and hazardous materials spills on human health and safety, environment, and property by specifying the processes needed to facilitate an effective response to environmental emergency incidents on either side of the British Columbia, Canada/USA border. This plan was developed pursuant to the July of 1994 the Canada-United States Joint Inland Pollution Contingency Plan (the Inland Plan) signed by the Administrator of EPA and the Minister for Department of the Environment which divided the common border between the countries into five regions. EPA Region 10 and Environment Canada Pacific and Yukon Regions developed the CANUSWEST and CANUSWEST NORTH Annexes to the Inland Plan to address issues unique to the area. CANUSWEST and CANUSWEST North are based on the principle of escalation and accordingly recognizes the roles of the local, st..."
2,0.26,-0.29,0.13,0.19,-0.11,-0.04,0.6,-0.22,"– Computer viruses are parasitic programs which are able to replicate themselves, attach themselves to other executables in the computer, and perform some unwanted and often malicious actions. A virus is not able to spread itself to another computers, some user actions are needed for it to infect a new computer. Downloading and running software from untrusted sources, inserting an USB drive without a previous scan–remember always disable the AutoRun feature for the drives as CD-ROMs, DVD-ROMs– , downloading and running emails or IM attachments even from known persons, can put you in the nasty situation to have an infected computer. Always when you deal with these situations and to prevent computer infections, scan before to run.\nThe best scanners in my opinion are multi-engine online scanners like virustotal.com or novirusthanks.org. The links of these scanners and many more are on the home page.\nThe main three features of a virus are :\n– the replication mechanism search and fin..."
3,0.35,-0.21,0.06,-0.05,0.3,-0.06,-0.02,-0.05,"For those unfamiliar with Cornish, it is classed as a p-Celtic member of the family of Celtic languages, which was once spoken across much of Europe, and is now restricted to the insular world and Brittany: the only surviving languages being Cornish, Welsh and Breton (all p-Celtic), and Manx, Scots Gaelic and Irish (all q-Celtic).\nThe relationship between these two branches is illustrated by p-Celtic words such as peduar W and their q-Celtic equivalents: cethar [Ir].\nThe etymology, morphology, syntax and phonology of Cornish and the other Celtic languages ultimately derive from a putative proto-Indo European or proto-Celtic language or family of languages spoken in Britain in pre-history.\nCornish Onomastics is the study of onomastics (personal name data) and toponymics (place name data) in relation to Cornwall in the Early Medieval Period (350 CE to 1000 CE). These names are almost completely in the Cornish language (the Brittonic used in Cornwall and a relative of Welsh and Bre..."
4,0.16,-0.11,-0.06,0.42,0.21,-0.29,0.24,-0.24,"Democracy is in trouble. No matter what index you look at, the number of countries rated as being fully democratic has declined dramatically over the last twenty years. Worryingly, this trend shows no signs of abating. Some measures even suggest that a greater number of countries became more authoritarian in 2022 than in any year since 1990. If the decline of democracy continues at the present pace, less than 5% of the world’s population will live in a full democracy by 2026. This process has had tremendous consequences for those living in backsliding states, including greater censorship and human rights abuses. It also represents a challenge to countries that remain democratic, which increasingly risk finding themselves isolated in a predominantly authoritarian world. Given that autocracies are more likely to trigger conflicts, spread disinformation, and engage in cross-border cyber-attacks, this represents an existential threat to democratic life.\nUnderstanding why this process ..."
5,0.15,-0.01,0.14,0.08,0.01,-0.31,0.29,-0.05,"Our cultural identity: Experience the culture and heritage of Cyprus Course Description Culture has the power to transform entire societies, strengthen local communities and forge a sense of identity and belonging for\nOur cultural identity: Experience the culture and heritage of Cyprus\nCulture has the power to transform entire societies, strengthen local communities and forge a sense of identity and belonging for people of all ages. Youth can act as a bridge between cultures and serve as key agents in promoting peace and intercultural understanding.\nCulture is defined as the language, beliefs, values, heritage etc. for any society/country and also identifies the people of each country.\nThis course is addressed to young people and youth workers who want to discover and explore the cultural heritage of Cyprus and its hidden gems that make it a unique part of Europe. During the course you will discover Cyprus’ rich history and culture, explore the traditions and customs of the isl..."
6,0.08,0.03,0.36,0.14,-0.33,-0.21,0.16,0.06,"“The more you empower kids, the more they can do,” said one Providence actor after working with Rhode Island public school students in the Arts/ Literacy Project, based at Brown University’s education department. The following factors are fundamental to the approach, which links local artists with classroom teachers and students to create performances and boost literacy:\nLiteracy and Performance Objectives. All the work of the performance unit–writing, reading, theater activities, rehearsals, and performance–aim toward specific and clearly stated literacy and performance objectives (such as those of New Standards and the National Standards for Arts Education).\nCulminating Performance. All Arts/Literacy units culminate in a student performance in front of an audience including at least students and teachers.\nReturn to Text. At various points during the unit, Arts/Literacy classrooms return to the original text to deepen student comprehension or writing development and to evaluate..."
7,0.41,-0.02,0.23,0.18,-0.15,-0.13,0.11,-0.21,"Rhetorical analysis is not for the faint of heart. It’s for teachers and instructors who don’t mind students feeling uncomfortable enough to take a risk. Rhetorical analysis has changed everything for me since I’ve brought these concepts into the classroom.\nThe activity below is used to simply introduce the concept to students using a news article or a simple short text. Once we begin this conversation, their work gets better, they have more passion for analyzing literature, and they have the words to discuss this in-depth conversation.\nIf you like this activity, check out more of the assignments on Teacher Pay Teacher and see what else might work for your students.\nDescription of Rhetorical Appeals Activity:\nThis worksheet is meant to give you a beginner’s knowledge of how to discuss and identify rhetorical appeals in an expository text. Expository texts are any text that is non-fiction: newspaper articles, informational journals, blogs, magazine articles are just the beginnin..."
8,-0.07,-0.16,0.53,0.17,-0.03,-0.24,0.23,0.02,"Sport plays an important role in the educational process since the TPS when the child’s need for movement is answered by daily activities within the school.\nTPS and PS practice sport with their teacher during playful sessions.\nMS and GS children experience several sports during the year in order to develop physical, social and strategic skills. Those learning cycles (swimming, judo, climbing, orienteering race…) last for around 12 sessions and take place outside of the school with professionals. A lot of different domains can be approached with happiness and effectiveness as long as the teaching methods used are adapted to preschoolers."
9,0.09,-0.03,0.42,0.22,-0.39,-0.31,0.3,0.08,There are a large number of students who have difficulty learning material using traditional teaching methods. Learning disabilities vary from mild forms such as attention deficit disorder to more severe disabilities like autism and mental retardation. Incorporating art into the curriculum of students with learning disabilities can be a useful tool. Students with disabilities are not students who are incapable of learning but instead are students who may need material presented to them using alternative methods. Methods that incorporate art can be very successful for these children.Many students with disabilities are separated from regular students for either part of all of the school day. These students spend a great deal of time focusing on remedial skills and learning new skills to help them catch up with the rest of the class. For students with learning disabilities the knowledge that they are not able to function at the same level as other students can be very discouraging. In...
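
One simple way to read the table is to pick, for each document, the concept with the largest coordinate. The sketch below does this for rows 0, 1, and 4, with the values copied by hand from the table above; it is an illustration of how to read the output, not part of the PCBM procedure itself.

```python
import numpy as np

concepts_text = ["literature", "history", "education", "government",
                 "geography", "biology", "computing", "mathematics"]

# Rows 0, 1, and 4 copied from the table above.
coords = np.array([
    [0.29, -0.08, 0.18, 0.06, -0.08, -0.10, 0.48, -0.28],
    [-0.39, -0.26, 0.12, 0.29, 0.46, 0.08, 0.01, -0.05],
    [0.16, -0.11, -0.06, 0.42, 0.21, -0.29, 0.24, -0.24],
])

# Index of the largest coordinate in each row, mapped back to a concept name.
top = [concepts_text[i] for i in coords.argmax(axis=1)]
print(top)  # ['computing', 'geography', 'government']
```

These match the texts reasonably well: the garbage-collection post, the cross-border spill response plan, and the essay on democratic decline. On the full data frame, `coordinates_df[concepts_text].idxmax(axis=1)` should give the same kind of summary.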
