Before starting, run the following commands in a terminal (in a new tab):

In [None]:
module purge
module load ollama
export HUGGINGFACE_HUB_CACHE=/pbs/throng/training/universite-hiver/cache/huggingface
export OLLAMA_MODELS=/pbs/throng/training/universite-hiver/cache/ollama/models
export OLLAMA_HOST=127.0.0.1:65383
ollama serve &
# ollama pull gpt-oss:20b

# A Starting Example

Retrieval-Augmented Generation (RAG) is a technique that enhances the output of language models by grounding their responses in external, relevant documents. Instead of relying solely on model parameters, RAG systems first retrieve context passages related to a user query and then generate an answer based on that evidence. This improves factual accuracy, transparency, and adaptability to custom datasets.

In this example, we implement a simple RAG pipeline using U.S. presidential speeches as our knowledge base. We'll walk through the following steps:

- **Data Preparation**: Load and explore a dataset of presidential speech transcripts.
- **TF-IDF Retrieval**: Convert speeches into vector representations using TF-IDF (keywords) and retrieve the most relevant ones based on cosine similarity.
- **Query Answering with Ollama**: Use a locally hosted language model (via Ollama) to generate a natural language answer using the retrieved speech excerpts as context.
- **Keyword Match Exploration**: Inspect which terms from the query appear in each retrieved document to better understand the retrieval step.
- **Embedding-based Retrieval**: Retrieve semantically relevant passages with sentence embeddings instead of relying on keyword overlap.

This setup demonstrates a lightweight but effective RAG workflow suitable for text collections in the social sciences, especially where domain-specific knowledge is locked inside qualitative documents such as interviews, policy statements, or historical texts.
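The TF-IDF retrieval step listed above can be sketched without any library. The toy corpus and the helper names (`tokenize`, `tfidf_vectors`, `vectorize_query`, `cosine`) are illustrative assumptions and not part of the notebook. The weighting is a plain tf × (log-idf + 1) variant, which differs in detail from scikit-learn's `TfidfVectorizer`.

```python
import math
from collections import Counter

# Toy corpus (hypothetical, for illustration only).
docs = [
    "immigration reform and border policy",
    "oil prices and energy policy",
    "freedom of speech and the press",
]

def tokenize(text):
    """Lowercase and split on whitespace (no stop-word removal)."""
    return text.lower().split()

def tfidf_vectors(texts):
    """Return (idf weights, list of sparse TF-IDF dicts), one dict per text."""
    tokenized = [tokenize(t) for t in texts]
    n_docs = len(tokenized)
    # Document frequency: in how many texts does each term occur?
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]
    return idf, vectors

def vectorize_query(query, idf):
    """Weight query terms with the corpus idf; unknown terms are ignored."""
    counts = Counter(t for t in tokenize(query) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

idf, doc_vectors = tfidf_vectors(docs)
query_vec = vectorize_query("what about oil", idf)
scores = [cosine(query_vec, d) for d in doc_vectors]
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only documents that share at least one term with the query get a non-zero score, which is the main limitation the embedding-based step addresses.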

In [1]:
import numpy as np
import pandas as pd

from ollama import Client

from sentence_transformers import SentenceTransformer

from sklearn.metrics.pairwise import cosine_similarity
from sklearn.feature_extraction.text import TfidfVectorizer

In [2]:
data1 = pd.read_csv('data/speeches.csv')

In [4]:
data1.head()

Unnamed: 0,title,date,transcript,president
0,"July 2, 1807: Proclamation in Response to the ...",1807-07-02T13:03:58-04:56,During the wars which for some time have unhap...,Thomas Jefferson
1,"September 26, 1963: Address at the Mormon Tabe...",1963-09-26T13:00:00-04:00,"Senator Moss, my old colleague in the United S...",John F. Kennedy
2,"October 26, 2020: Swearing-In Ceremony of the ...",2020-10-26T21:00:00-04:00,THE PRESIDENT: Thank you very much. Appreciate...,Donald Trump
3,"December 3, 1900: Fourth Annual Message",1900-12-03T13:00:00-05:00,To the Senate and House of Representatives: \n...,William McKinley
4,"December 2, 1806: Sixth Annual Message",1806-12-02T13:03:58-04:56,TO THE SENATE AND HOUSE OF REPRESENTATIVES OF ...,Thomas Jefferson


In [3]:
len(data1)

1054

In [5]:
data1['transcript'].values[0]

'During the wars which for some time have unhappily prevailed among the powers of Europe the United States of America, firm in their principles of peace, have endeavored, by justice, by a regular discharge of all their national and social duties, and by every friendly office their situation has admitted, to maintain with all the belligerents their accustomed relations of friendship, hospitality, and commercial intercourse. Taking no part in the questions which animate these powers against each other, nor permitting themselves to entertain a wish but for the restoration of general peace, they have observed with good faith the neutrality they assumed, and they believe that no instance of a departure from its duties can be justly imputed to them by any nation. A free use of their harbors and waters, the means of refitting and of refreshment, of succor to their sick and suffering, have at all times and on equal principles been extended to all, and this, too, amidst a constant recurrence of

In [6]:
vectorizer = TfidfVectorizer(stop_words='english', max_df=0.5, min_df=5, max_features=1000)
tfidf_matrix = vectorizer.fit_transform(data1['transcript'])

In [7]:
tfidf_matrix.shape

(1054, 1000)
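The shape reflects 1054 speeches and a vocabulary capped at 1000 terms. With the settings above, terms occurring in more than half of the speeches (`max_df=0.5`) or in fewer than 5 speeches (`min_df=5`) are dropped, and `max_features=1000` keeps the 1000 remaining terms with the highest corpus-wide frequency. As a reference sketch, assuming scikit-learn's default options (`smooth_idf=True`, `norm='l2'`, `sublinear_tf=False`), each entry is computed as follows, where $n$ is the number of speeches, $\mathrm{df}(t)$ the number of speeches containing term $t$, and $\mathrm{tf}(t,d)$ the raw count of $t$ in speech $d$:

$$
\mathrm{idf}(t) = \ln\frac{1+n}{1+\mathrm{df}(t)} + 1,
\qquad
w(t,d) = \mathrm{tf}(t,d)\,\mathrm{idf}(t),
\qquad
\hat{w}(t,d) = \frac{w(t,d)}{\sqrt{\sum_{t'} w(t',d)^2}}
$$

Because every row is scaled to unit length, the cosine similarity between two rows reduces to their dot product.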

In [None]:
vectorizer.get_feature_names_out()

In [8]:
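def retrieve_relevant_speeches(data, tfidf_matrix, query, top_k=5):
    """Return the top_k speeches most similar to the query, with their scores."""
    # Uses the global `vectorizer` fitted above so the query shares the corpus vocabulary
    query_vec = vectorizer.transform([query])
    similarity = cosine_similarity(query_vec, tfidf_matrix).flatten()
    # argsort is ascending: take the last top_k indices and reverse them (best first)
    top_indices = np.argsort(similarity)[-top_k:][::-1]
    return data.iloc[top_indices], similarity[top_indices]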
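The top-k selection in this function always returns `top_k` rows, even when some of them have zero similarity to the query. The following is a minimal sketch of one possible refinement, written in plain Python so it runs on its own. `top_k_indices` and `min_score` are hypothetical names, not part of the notebook.

```python
import heapq

def top_k_indices(scores, k=5, min_score=0.0):
    """Indices of the k largest scores (best first), skipping scores <= min_score."""
    candidates = [i for i, s in enumerate(scores) if s > min_score]
    return heapq.nlargest(k, candidates, key=scores.__getitem__)

scores = [0.10, 0.55, 0.00, 0.42, 0.31]
print(top_k_indices(scores, k=3))           # [1, 3, 4]
print(top_k_indices([0.0, 0.2, 0.0], k=3))  # [1]
```

With fewer matching documents than `k`, the result is simply shorter, so documents with no keyword overlap never reach the prompt.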

In [9]:
def retrieve_relevant_speeches_with_matches(data, tfidf_matrix, query, top_k=5):
    """Like retrieve_relevant_speeches, but also report which query terms each speech contains."""
    query_vec = vectorizer.transform([query])
    similarity = cosine_similarity(query_vec, tfidf_matrix).flatten()
    top_indices = np.argsort(similarity)[-top_k:][::-1]
    retrieved_df = data.iloc[top_indices].copy()

    # Look up the vocabulary once instead of on every list-comprehension step
    feature_names = vectorizer.get_feature_names_out()
    query_terms = {feature_names[i] for i in query_vec.nonzero()[1]}

    # Get matched words: query terms with a non-zero TF-IDF weight in each speech
    matched_words_list = []
    for idx in top_indices:
        speech_terms = {feature_names[i] for i in tfidf_matrix[idx].nonzero()[1]}
        matched_words_list.append(sorted(query_terms & speech_terms))

    retrieved_df["matched_words"] = matched_words_list
    retrieved_df["similarity"] = similarity[top_indices]
    return retrieved_df

In [10]:
def build_context(speeches_df, max_chars=1000):
    context = ""
    for _, row in speeches_df.iterrows():
        context += f"\n---\nSpeech by {row['president']}:\n{row['transcript'][:max_chars]}...\n"
    return context
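Slicing with `[:max_chars]` can cut an excerpt in the middle of a word. A minimal sketch of a word-boundary truncation follows. `truncate_at_word` is a hypothetical helper, not part of the notebook, and it could replace the slice inside `build_context`.

```python
def truncate_at_word(text, max_chars=1000):
    """Cut text to at most max_chars characters (plus '...') without splitting a word."""
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # If the cut already falls exactly at the end of a word, keep it as is.
    if text[max_chars].isspace():
        return cut.rstrip() + "..."
    # Otherwise drop the trailing partial word, if there is an earlier space.
    head, sep, _ = cut.rpartition(" ")
    return (head if sep else cut).rstrip() + "..."

print(truncate_at_word("no real progress on immigration reform", 26))
```

The printed excerpt ends at "on..." instead of stopping partway through "immigration".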

In [3]:

# Must match the OLLAMA_HOST exported in the terminal before `ollama serve`
host = '127.0.0.1:65383'
model = "gpt-oss:20b"

# Pass the variable, not the literal string 'host'
client = Client(
    host=host,
)

In [6]:
answer = client.chat(model, messages=[
        {
            'role': 'user',
            'content': "What did president Grover Cleveland say about immigration?"
        },
    ])

print(answer.message.content)

In [12]:
def query_ollama(context, question, client):
    messages = [
        {
            'role': 'user',
            'content': f"""Answer the following question based on the context:\n\nContext:\n{context}\n\nQuestion: {question}\n\nAnswer:"""
        },
    ]

    for part in client.chat(model, messages=messages, stream=True):
        print(part.message.content, end='', flush=True)
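`query_ollama` prints the streamed answer but does not return it, so the text cannot be stored or compared later. Below is a minimal sketch of a variant that does both. `collect_stream` is a hypothetical helper, and the fake stream stands in for `client.chat(model, messages=messages, stream=True)` so the example runs without a server.

```python
from types import SimpleNamespace

def collect_stream(parts, echo=True):
    """Print streamed chunks as they arrive and return the full answer text."""
    pieces = []
    for part in parts:
        text = part.message.content
        if echo:
            print(text, end="", flush=True)
        pieces.append(text)
    return "".join(pieces)

# Stand-in objects with the same .message.content shape as the streamed parts.
fake_stream = [
    SimpleNamespace(message=SimpleNamespace(content=chunk))
    for chunk in ["Hello", ", ", "world"]
]
answer = collect_stream(fake_stream, echo=False)
print(answer)
```

In the notebook, the loop body of `query_ollama` could be replaced by `return collect_stream(client.chat(model, messages=messages, stream=True))`.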

In [4]:
query = "What did the presidents say about immigration?"

# Retrieve
retrieved, scores = retrieve_relevant_speeches(data1, tfidf_matrix, query)
context = build_context(retrieved)

In [14]:
print(context)


---
Speech by Donald Trump:
THE PRESIDENT:Just a short time ago, I had the honor of presiding over the swearing-in of five new great American citizens.It was a beautiful ceremony and a moving reminder of our nations proud history of welcoming legal immigrants from all over the world into our national family.
 I told them that the beauty and majesty of citizenship is that it draws no distinctions of race, or class, or faith, or gender or background.All Americans, whether first generation or tenth generation, are bound together in love and loyalty, friendship and affection.
 We are all equal.We are one team, and one people, proudly saluting one great American flag.We believe in a safe and lawful system of immigration, one that upholds our laws, our traditions, and our most cherished values.
 Unfortunately, our immigration system has been badly broken for a very long time.Over the decades, many Presidents and many lawmakers have come and gone, and no real progress has been made on immigr

In [15]:
# Generate answer
query_ollama(context, query, client)

**What the Presidents said about immigration**

| President | Key points from their speech |
|-----------|------------------------------|
| **Donald Trump** | • Praised the welcoming of legal immigrants, stressing that citizenship “draws no distinctions of race, class, faith, gender or background.”  <br>• Described the U.S. as “one team, one people.”  <br>• Stated that the current immigration system is “badly broken” and that a safe, lawful system that upholds U.S. laws and traditions is needed. |
| **George W. Bush** | • Emphasised compassion and humanity in reforming immigration.  <br>• Claimed that a stronger immigration framework would make America “more compassionate and more humane.”  <br>• Highlighted the need for a system that protects the nation’s borders while treating immigrants with dignity. |
| **Barack Obama** | • Reminded Americans of their long tradition of welcoming immigrants and how it has kept the nation youthful and entrepreneurial.  <br>• Declared that the immigration system is “broken” and that many families who follow the rules are harmed by those who do not.  <br>• Called for reforms that would give legal pathways for undocumented immigrants to take on responsibilities and integrate fully. |

(John Cleveland’s 1891 immigration bill is not a presidential speech; it is a legislative proposal, so it was omitted.)


In [16]:
query = "What did the presidents say about immigration?"
results_df = retrieve_relevant_speeches_with_matches(data1, tfidf_matrix, query)

for _, row in results_df.iterrows():
    print(f"\n➡️ {row['president']}:")
    print(f"Similarity: {row['similarity']:.4f}")
    print(f"Matched words: {', '.join(row['matched_words'])}")
    print(f"Excerpt: {row['transcript'][:1500]}...\n")


➡️ Donald Trump:
Similarity: 0.5530
Matched words: immigration
Excerpt: THE PRESIDENT:Just a short time ago, I had the honor of presiding over the swearing-in of five new great American citizens.It was a beautiful ceremony and a moving reminder of our nations proud history of welcoming legal immigrants from all over the world into our national family.
 I told them that the beauty and majesty of citizenship is that it draws no distinctions of race, or class, or faith, or gender or background.All Americans, whether first generation or tenth generation, are bound together in love and loyalty, friendship and affection.
 We are all equal.We are one team, and one people, proudly saluting one great American flag.We believe in a safe and lawful system of immigration, one that upholds our laws, our traditions, and our most cherished values.
 Unfortunately, our immigration system has been badly broken for a very long time.Over the decades, many Presidents and many lawmakers have come and gone, 

In [18]:
query = "What did the presidents say about oil?"
results_df = retrieve_relevant_speeches_with_matches(data1, tfidf_matrix, query)

for _, row in results_df.iterrows():
    print(f"\n➡️ {row['president']}:")
    print(f"Similarity: {row['similarity']:.4f}")
    print(f"Matched words: {', '.join(row['matched_words'])}")
    print(f"Excerpt: {row['transcript'][:1500]}...\n")


➡️ Barack Obama:
Similarity: 0.6592
Matched words: oil
Excerpt: Good evening. As we speak, our nation faces a multitude of challenges. At home, our top priority is to recover and rebuild from a recession that has touched the lives of nearly every American. Abroad, our brave men and women in uniform are taking the fight to al Qaeda wherever it exists. And tonight, Ive returned from a trip to the Gulf Coast to speak with you about the battle were waging against an oil spill that is assaulting our shores and our citizens.
 On April 20th, an explosion ripped through BP Deepwater Horizon drilling rig, about 40 miles off the coast of Louisiana. Eleven workers lost their lives. Seventeen others were injured. And soon, nearly a mile beneath the surface of the ocean, oil began spewing into the water.
 Because there has never been a leak this size at this depth, stopping it has tested the limits of human technology. Thats why just after the rig sank, I assembled a team of our nations best scien

In [19]:
query = "What did the presidents say about oil?"

# Retrieve
retrieved, scores = retrieve_relevant_speeches(data1, tfidf_matrix, query)
context = build_context(retrieved)

# Generate answer
query_ollama(context, query, client)

**Oil‑related messages from the presidents in the excerpts**

| President | What was said about oil |
|-----------|------------------------|
| **Barack Obama** | • Recounted the **Deepwater Horizon** oil spill, noting that an explosion on the rig “ripped through” it and that a “leak of unprecedented size at that depth” had begun spewing oil into the Gulf. <br>• Described the spill as an “assault” on the coast and citizens and emphasized that stopping it “has tested the limits of human technology.” <br>• Mentioned he had assembled a top‑tier scientific and engineering team to confront the spill. |
| **Jimmy Carter** | • Stated that the United States faced an “unprecedented” energy crisis that would become “the greatest challenge” in the nation’s history. <br>• Urged that the country must balance its energy demand with dwindling resources, otherwise “the future will control us.” <br>• Promised to present a comprehensive energy proposal to Congress and warned that many of the measures would be unpopular but necessary. <br>• Later, after the proposal, highlighted that the **Department of Energy** had been created, led by James Schlesinger, and that a final National Energy Plan was near completion, underscoring the political will to address oil and energy supply. |
| **Gerald Ford** | • Outlined a plan to make the United States **independent of foreign oil** by 1985, stressing that the country was “increasingly at the mercy” of foreign fuel supplies. <br>• Cited facts: the U.S. imported about 37 % of its petroleum in 1975, and without action would import more than half in ten years, all at prices set by other countries. <br>• Highlighted the growing financial drain—$25 billion per year spent on foreign oil versus $3 billion five years earlier—and warned that the cost would rise dramatically if no action were taken. |

**Bottom line:**  
- Obama focused on the emergency response to the Gulf‑Coast oil spill.  
- Carter framed oil as part of a broader national energy strategy, calling for legislation and a new Energy Department.  
- Ford warned of the risks of continued foreign‑oil dependence and called for a rapid shift toward domestic independence.

In [20]:
query = "What did the presidents say about immigration from mexico?"

# Retrieve
retrieved, scores = retrieve_relevant_speeches(data1, tfidf_matrix, query)
context = build_context(retrieved)

# Generate answer
query_ollama(context, query, client)

**Answer:**  
None of the quoted presidents spoke directly about immigration coming from Mexico in the excerpts provided.  

- **Donald Trump** talked generally about legal immigration from “all over the world” and the need for a “safe and lawful system,” but he made no reference to Mexican immigration.  

- **James K. Polk** discussed the annexation of Texas, the borders of New Mexico and California, and the treaties with Mexico, but he did not address the flow of people from Mexico into the United States.  

- **John Tyler** mentioned the Mexican government’s “highly offensive” language concerning the annexation of Texas, yet he made no comment on immigration from Mexico.

In [21]:
query = "What did the presidents say about immigration from mexico?"
results_df = retrieve_relevant_speeches_with_matches(data1, tfidf_matrix, query)

for _, row in results_df.iterrows():
    print(f"\n➡️ {row['president']}:")
    print(f"Similarity: {row['similarity']:.4f}")
    print(f"Matched words: {', '.join(row['matched_words'])}")
    print(f"Excerpt: {row['transcript'][:1500]}...\n")


➡️ Donald Trump:
Similarity: 0.4273
Matched words: immigration
Excerpt: THE PRESIDENT:Just a short time ago, I had the honor of presiding over the swearing-in of five new great American citizens.It was a beautiful ceremony and a moving reminder of our nations proud history of welcoming legal immigrants from all over the world into our national family.
 I told them that the beauty and majesty of citizenship is that it draws no distinctions of race, or class, or faith, or gender or background.All Americans, whether first generation or tenth generation, are bound together in love and loyalty, friendship and affection.
 We are all equal.We are one team, and one people, proudly saluting one great American flag.We believe in a safe and lawful system of immigration, one that upholds our laws, our traditions, and our most cherished values.
 Unfortunately, our immigration system has been badly broken for a very long time.Over the decades, many Presidents and many lawmakers have come and gone, 

In [22]:
query = "What did the presidents say about freedom of speech?"

# Retrieve
retrieved, scores = retrieve_relevant_speeches(data1, tfidf_matrix, query)
context = build_context(retrieved)

# Generate answer
query_ollama(context, query, client)

None of the four presidents’ speeches in the excerpt touch on freedom of speech.  

* Andrew Johnson’s remarks focus on the status of Arkansas and the legality of the Reconstruction Acts, not on the First‑Amendment right.  
* Ronald Reagan’s two speeches address international peace and a light‑hearted university visit, again without reference to free‑speech issues.  
* Richard Nixon’s address deals with the Watergate investigation and the press’s coverage of it, but he never speaks about the constitutional right to speak freely.  
* Lyndon B. Johnson’s civil‑rights address centers on voting rights in Selma, not on the broader freedom of expression.

So, in the context provided, no president made any statement about freedom of speech.

In [23]:
print(context)


---
Speech by Andrew Johnson:
To the House of Representatives:
I return without my signature a bill entitled "An act to admit the State of Arkansas to representation in Congress."
The approval of this bill would be an admission on the part of the Executive that the "Act for the more efficient government of the rebel States," passed March 2, 1867, and the acts supplementary thereto were proper and constitutional. My opinion, however, in reference to those measures has undergone no change, but, on the contrary, has been strengthened by the results which have attended their execution. Even were this not the case, I could not consent to a bill which is based upon the assumption either that by an act of rebellion of a portion of its people the State of Arkansas seceded from the Union, or that Congress may at its pleasure expel or exclude a State from the Union, or interrupt its relations with the Government by arbitrarily depriving it of representation in the Senate and House of Representa

# Chunking and Embeddings

In [24]:

embedder = SentenceTransformer('all-MiniLM-L6-v2')  # Small and fast

In [25]:
def chunk_speeches_by_paragraph(df):
    chunks = []
    metadata = []
    for i, row in df.iterrows():
        paragraphs = [p.strip() for p in row['transcript'].split('\n') if len(p.strip()) > 50]
        for para in paragraphs:
            chunks.append(para)
            metadata.append({
                'president': row['president'],
                'speech_id': i
            })
    return chunks, metadata

chunks, meta = chunk_speeches_by_paragraph(data1)
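To see what the paragraph filter keeps and drops, here is a small self-contained sketch on an invented transcript. `split_paragraphs` is a hypothetical helper that repeats the list comprehension from `chunk_speeches_by_paragraph`, and the sample text is made up for illustration.

```python
def split_paragraphs(text, min_chars=50):
    """Split on newlines and keep only paragraphs longer than min_chars characters."""
    return [p.strip() for p in text.split("\n") if len(p.strip()) > min_chars]

sample = (
    "Fellow citizens:\n"
    "This first paragraph is long enough to be kept as its own chunk.\n"
    "\n"
    "Thank you.\n"
    "A second paragraph that is also comfortably longer than fifty characters."
)

sample_chunks = split_paragraphs(sample)
for chunk in sample_chunks:
    print(len(chunk), chunk)
```

Short lines such as salutations and sign-offs are discarded, and each remaining paragraph becomes one retrievable unit.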

In [26]:
chunks[:5]

['During the wars which for some time have unhappily prevailed among the powers of Europe the United States of America, firm in their principles of peace, have endeavored, by justice, by a regular discharge of all their national and social duties, and by every friendly office their situation has admitted, to maintain with all the belligerents their accustomed relations of friendship, hospitality, and commercial intercourse. Taking no part in the questions which animate these powers against each other, nor permitting themselves to entertain a wish but for the restoration of general peace, they have observed with good faith the neutrality they assumed, and they believe that no instance of a departure from its duties can be justly imputed to them by any nation. A free use of their harbors and waters, the means of refitting and of refreshment, of succor to their sick and suffering, have at all times and on equal principles been extended to all, and this, too, amidst a constant recurrence o

In [27]:
len(chunks)

53843

In [28]:
chunk_embeddings = embedder.encode(chunks, convert_to_tensor=True, show_progress_bar=True)

Batches: 100%|██████████| 1683/1683 [00:09<00:00, 177.62it/s]


In [83]:
def retrieve_with_embeddings(query, chunk_embeddings, top_k=5):
    """Return the top_k chunks closest to the query in embedding space."""
    query_embedding = embedder.encode([query], convert_to_tensor=True)
    # Move tensors to the CPU so scikit-learn can treat them as arrays
    sims = cosine_similarity(query_embedding.cpu(), chunk_embeddings.cpu())[0]
    top_indices = np.argsort(sims)[-top_k:][::-1]

    # `chunks` and `meta` are the global lists built by chunk_speeches_by_paragraph
    results = []
    for idx in top_indices:
        results.append({
            "transcript": chunks[idx],
            "similarity": float(sims[idx]),
            "president": meta[idx]['president'],
            "speech_id": meta[idx]['speech_id']
        })
    return results

In [30]:
query = "How did presidents talk about freedom of speech?"
results = retrieve_with_embeddings(query, chunk_embeddings)

for res in results:
    print(f"\n➡️ {res['president']} (Similarity: {res['similarity']:.4f})")
    print(res['transcript'])


➡️ Lyndon B. Johnson (Similarity: 0.6332)
Yet we know that he did not speak only for America. We know that the four freedoms are not secure in America when they are violently denied elsewhere in the world. We know, too, that it requires more than speeches to resist the international enemies of freedom. We know that men respond to deeds when they are deaf to words. Even the precious word "freedom" may become empty to those without the means to use it.

➡️ George W. Bush (Similarity: 0.5646)
You know, Presidents can try to avoid hard decisions and therefore avoid controversy. That's just not my nature. I'm the kind of person that, you know, is willing to take on hard tasks, and in times of war people get emotional; I understand that. Never really, you know, spent that much time, frankly, worrying about the loud voices. I of course hear them, but they didn't affect my policy, nor did they affect -- affect how I made decisions.

➡️ Lyndon B. Johnson (Similarity: 0.5582)
We have freedom of

In [39]:
def build_chunck_context(speeches_df):
    context = ""
    for row in speeches_df:
        context += f"\n---\nPart of speech by {row['president']}:\n{row['transcript']}\n"
    return context

In [32]:
query = "How did presidents talk about freedom of speech?"

# Retrieve
results = retrieve_with_embeddings(query, chunk_embeddings, top_k=10)
context = build_chunck_context(results)

# Generate answer
query_ollama(context, query, client)

**Presidents’ framing of “freedom of speech”**

| President | How the speech frames the issue | Key take‑away |
|-----------|---------------------------------|---------------|
| **Lyndon B. Johnson** | *“We have freedom of speech… we have freedom of assembly… I have observed no restraints”* | Johnson presented free‑speech rights as a constitutional guarantee that is actively upheld and not subject to government limits. |
| **Franklin D. Roosevelt** | *“The first is freedom of speech and expression everywhere in the world.”* | Roosevelt positioned free speech as the foremost of the four “freedoms” he championed, an international principle that should be protected everywhere. |
| **Richard M. Nixon** | *“The principle of confidentiality is absolutely essential… we need brutal candor to bring warring factions to the peace table.”* | Nixon focused on the *confidentiality* of presidential communications, arguing that candid, unfiltered discussion inside the executive branch is essential to national security. |
| **George W. Bush** | *“I’m the kind of person that… is willing to take on hard tasks… they didn’t affect how I made decisions.”* | Bush’s remarks are more about decision‑making in wartime than about free speech itself; he stresses that public opinion or “loud voices” do not dictate policy. |
| **Joe Biden** | *“No responsible American President could remain silent when basic human rights are being so blatantly violated.”* and *“Speaks a silent universal language of hope… anyone who seeks and speaks freedom can understand.”* | Biden links free speech to the broader human‑rights agenda, suggesting that silence in the face of violations is unacceptable and that speech itself is a universal, hopeful symbol. |
| **Donald Trump** | *“Unprecedented assault on free speech… the efforts to censor, cancel, and blacklist… are wrong and dangerous.”* | Trump’s address is a defense against what he sees as rising censorship and a call to protect speech from being silenced by “cancel culture.” |

### Common themes

* **Core democratic value** – Every speaker underscores that free speech is foundational to democracy and to the American identity.  
* **Protection vs. threat** – Roosevelt, Johnson, Biden, and Trump all defend it against external threats (censorship, authoritarianism, or wartime repression).  
* **Internal vs. external focus** – Johnson and Nixon discuss how speech is treated within the government, while Roosevelt, Trump, and Biden look at its role in the broader world or in civil society.  
* **Action over rhetoric** – Johnson and Biden highlight that speech should translate into deeds, and Trump warns that rhetoric alone is insufficient if it leads to silencing.

In short, presidents have used the phrase *freedom of speech* to reaffirm its constitutional status, to warn against its suppression, and to link it to the larger mission of protecting liberty, human rights, and democratic governance.



In [53]:
query = "Which president talked about immigration and immigrants positively in their speech?"

# Retrieve
results = retrieve_with_embeddings(query, chunk_embeddings, top_k=10)
context = build_chunck_context(results)

# Generate answer
query_ollama(context, query, client)

**Donald Trump** – In his statement about the swearing‑in ceremony, Trump highlighted that the President was “welcoming legal immigrants from all over the world into our national family,” framing immigration in a positive light.

In [54]:
print(context)


---
Part of speech by Herbert Hoover:
THE PRESIDENT made the following statement:

---
Part of speech by Donald Trump:
THE PRESIDENT:Just a short time ago, I had the honor of presiding over the swearing-in of five new great American citizens.It was a beautiful ceremony and a moving reminder of our nations proud history of welcoming legal immigrants from all over the world into our national family.

---
Part of speech by Herbert Hoover:
THE PRESIDENT said:

---
Part of speech by Herbert Hoover:
THE PRESIDENT said:

---
Part of speech by Andrew Johnson:
I beg to call your attention to that correspondence, and especially to that part of it which refers to the conversation between the President and General Grant at the Cabinet meeting on Tuesday, the 14th of January, and to request you to state what was said in that conversation.
Very respectfully, yours,
ANDREW JOHNSON.
WASHINGTON, D.C., February 5, 1868.
The PRESIDENT.

---
Part of speech by Gerald Ford:
I have heard many inspiring Pres