# Text Analysis of Presidential Podcasts 

*By Max Orenstein & Julia Keswin*

<img src="images/cover_image.avif" alt="Cover Image" width="1200" />

<img src="images/cover_image3.avif" alt="Cover Image" width="1200" />

## Introduction

Explaining why podcasts played such a significant role in the 2024 election to your parents can be challenging. Not only do you have to explain what a "Call Her Daddy" is and why you know so much about it, you also need to explain why it was worth the valuable time of both presidential candidates to participate in this new influential form of media and why it was so successful. We're not even going to attempt the first explanation, but the latter is more manageable. 

In recent years, online podcasting has risen to challenge traditional news networks as arbiters of political discourse. According to survey data from Edison Research in early 2023, 75% of Americans ages 12 and older had listened to online audio in the past month, while 70% had listened in the past week. Meanwhile, a Pew Research survey from just before the 2024 election found that about one-in-five Americans (including a much higher share of adults under 30, at 37%) say they regularly get news from influencers on social media. More anecdotal claims about the influence of new media on politics have emphasized Trump's dominance among the "Joe Rogan Demographic," a younger, overwhelmingly male, and often independent-minded following that listens to the host of the world's most popular podcast, "The Joe Rogan Experience."

On the theoretical side, this claim can be further substantiated by parasocial relationship theory, which suggests that audiences develop one-sided emotional connections with media personalities, as if they were real-life acquaintances or friends. Podcasts, with their conversational tone, extended format, and intimate delivery (often directly into a listener’s headphones), are particularly well-suited to fostering these relationships. This dynamic allows podcast hosts to build trust and influence among their audiences in a way that traditional news anchors or fleeting social media posts often cannot. So, if podcasts really have become this influential in the political sphere, it's worth asking the question raised by David Dowling, author of the 2024 book *Podcast Journalism*:

>
> To what extent can podcasting perform as principled, narrative journalism capable of fulfilling media’s duty to democracy? *- Dowling (2024)*
>

In this project, we sought to provide insight into this question, and into the evolving role of podcasts in political communication, using text analysis and Natural Language Processing (NLP) techniques. Specifically, we contrasted the podcast appearances of each 2024 presidential candidate with their interviews on traditional news networks, aiming to better understand how podcasts shaped political narratives and influenced voter behavior. We posed the following research questions:

**1. Structural Differences**  
   - **Question:** How many turns/topics are covered in each format?  
   - *Initial Hypothesis:* Interviews on podcasts feature longer, fewer turns compared to traditional news formats.  

**2. Issue Focus**  
   - **Question:** What percentage of the content is focused on policy issues versus personal anecdotes?  
   - *Initial Hypothesis:* Harris is more issue-focused in her content.  

**3. Opponent Mentions**  
   - **Question:** Which candidate mentions their opponent more frequently?  
   - *Initial Hypothesis:* Trump mentions Harris more often.  

**4. Emotional Language**  
   - **Question:** In which outlet is emotional language utilized more prominently?  
   - *Initial Hypothesis:* Podcasts are more likely to feature emotional language.  

To address these questions, we conducted the project entirely in Python. We used Picovoice Falcon layered on OpenAI’s Whisper model to transcribe and diarize audio from podcasts and interviews into text files. The data was then structured into dictionaries with metadata such as transcription title, medium, and tokens. We performed frequency analysis to examine key terms and bigrams, conducted keyness analysis to identify statistically significant word usage across formats, and used Key Words in Context (KWIC) to explore how these words were used. Dispersion plots helped us visualize where specific words occurred in the timeline of interviews or podcasts, while sentiment analysis captured emotional tone.

The data for this project was sourced from YouTube videos, converted to MP3 format. Trump appeared on podcasts like The Joe Rogan Experience, Lex Fridman, and Flagrant, while Harris joined platforms such as Call Her Daddy, Breakfast Club, and Howard Stern. Traditional interviews included appearances on networks like Fox News, CNN, and NBC for Harris, and Bloomberg, NABJ, and Fox News for Trump.
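As a rough illustration of the workflow described above, the sketch below shows one way a transcript could be stored as a dictionary with metadata and tokens, and how term and bigram frequencies could then be counted. The field names, the `tokenize` helper, and the sample sentence are assumptions made for illustration; they are not taken from the project code or the transcripts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def make_record(title, speaker, medium, text):
    """Bundle one transcript with its metadata (hypothetical schema)."""
    return {
        "title": title,
        "speaker": speaker,
        "medium": medium,  # e.g. "podcast" or "interview"
        "tokens": tokenize(text),
    }


def top_terms(tokens, n=10):
    """Return the n most frequent tokens with their counts."""
    return Counter(tokens).most_common(n)


def top_bigrams(tokens, n=10):
    """Return the n most frequent adjacent token pairs with their counts."""
    return Counter(zip(tokens, tokens[1:])).most_common(n)


# Invented sample text, used only to exercise the functions.
record = make_record(
    "Example Podcast",
    "candidate",
    "podcast",
    "We will secure the border. The border is very very important.",
)
term_counts = Counter(record["tokens"])
bigrams = top_bigrams(record["tokens"], n=1)
```

In practice, stopwords would likely be filtered out before counting so that function words do not dominate the lists.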

By analyzing these contrasting formats, this project sheds light on how podcasts have transformed political messaging, offering deeper insights into their influence on voter behavior and the broader democratic process.

## Structural Differences

Podcasts provide a unique space where political discussions can transcend the rigid formats of traditional media, allowing candidates to reach audiences through a more conversational dialogue. Podcasts tend to be longer than traditional news interviews and often involve more interjections from the hosts, who are treated as personalities rather than expected to conform to a certain editorial standard. To assess this *structural* difference between the mediums, we conducted a descriptive analysis to quantify the communication patterns. We began by looking at word count per medium:

<img src="images/avg_word_count_per_title.png" alt="Avg Word Count Per Title" width="400" />

For both candidates, podcasts have significantly higher word counts than interviews. Interviews tend to be more structured and are likely constrained by time limits, reflecting their lower average word count. Across both mediums, Trump has a higher word count than Harris. This could reflect Trump's strategy to dominate the conversation and emphasize his points more thoroughly. 

We also analyzed the average number of turns per title in each medium. A turn refers to a switch in speakers (e.g., between the candidate and the host). Podcasts consistently have more turns than interviews, which reflects the medium’s suitability for facilitating conversation. The average number of turns for Harris and Trump within the same medium is similar, suggesting comparable interaction levels between the candidates and their hosts.
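To make the turn counts concrete, here is a minimal sketch of how turns and words per turn could be derived from a diarized transcript. It assumes the transcript is available as a list of `(speaker, text)` segments and treats consecutive segments by the same speaker as one turn; the segment format, speaker labels, and sample dialogue are hypothetical.

```python
from itertools import groupby


def merge_turns(segments):
    """Collapse consecutive segments by the same speaker into single turns."""
    turns = []
    for speaker, group in groupby(segments, key=lambda seg: seg[0]):
        text = " ".join(seg[1] for seg in group)
        turns.append((speaker, text))
    return turns


def turn_stats(segments, speaker):
    """Count all turns and summarize the turn lengths for one speaker."""
    turns = merge_turns(segments)
    lengths = [len(text.split()) for who, text in turns if who == speaker]
    avg = sum(lengths) / len(lengths) if lengths else 0.0
    return {
        "total_turns": len(turns),
        "speaker_turns": len(lengths),
        "avg_words_per_turn": avg,
    }


# Invented dialogue, used only to exercise the functions.
segments = [
    ("host", "Welcome to the show"),
    ("candidate", "Thank you"),
    ("candidate", "It is great to be here"),
    ("host", "Let us start with the economy"),
    ("candidate", "Sure"),
]
stats = turn_stats(segments, "candidate")
```

Merging adjacent same-speaker segments first matters because diarization output often splits one stretch of speech into several pieces, which would otherwise inflate the turn count.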

<img src="images/avg_word_count_per_turn_per_title_graph.png" alt="Avg Word Count Per Turn Per Title" width="800" />

Here we can see that there are far more turns in podcasts than interviews. This shouldn't be surprising, since the conversational format of podcasts allows for more frequent back-and-forth. This can allow for dynamic and engaging discussions, helping candidates appear relatable and offering room for nuanced storytelling. However, this lack of structure might in some cases dilute key messages and increase the risk of off-the-cuff remarks being misinterpreted or scrutinized. We also see that Trump’s average word count per turn in podcasts is larger than Harris’, which might again demonstrate his tendency to maintain conversational control.

Finally, we can look at dispersion plots to get a sense of the differing structure of the two formats. These plots mark each point in a specific podcast or interview at which the candidate mentioned a word associated with a particular topic (e.g. economy: tariff, tax, jobs). By showing how often and where different topics come up over the course of the conversation, they give us insight into the structure of each format:

<img src="images/trump_rogan_dispersion.png" alt="Trump Joe Rogan Dispersion Plot" width="1200" />
<img src="images/trump_bloomberg_dispersion.png" alt="Trump Bloomberg Dispersion Plot" width="1200" />

Trump's dispersion plots for his appearance on "The Joe Rogan Experience" and his Bloomberg interview reveal some notable patterns. While both discuss a wide variety of issues, the Bloomberg interview seems to stay a bit more on track than the Joe Rogan interview. This makes sense, seeing as Bloomberg is a publication that focuses on economic issues and likely discusses the other issues in relation to the economy. Meanwhile, Trump's appearance on Joe Rogan appears to jump from topic to topic significantly more frequently. Trump has discussed his use of a rhetorical technique he calls "the weave," in which he seems to go off track from a particular topic and then ties things back together at the end. This strategy appears to have been more prevalent in his podcast appearances, while the interview format keeps him more focused on one particular topic at a time.

<img src="images/harris_callherdaddy_dispersion.png" alt="Harris Call Her Daddy Dispersion Plot" width="1200" />
<img src="images/harris_cnn_dispersion.png" alt="Harris CNN Dispersion Plot" width="1200" />

For Harris, we see trends similar to Trump's: the interviews appear to stay more on track, covering one issue at a time. However, in both formats Harris generally does less weaving than Trump. In the Trump-Rogan podcast, for instance, between tokens 20,000 and 25,000 it becomes very difficult to even tell what they're discussing, whereas in the Harris Call Her Daddy podcast the stretch between tokens 2,000 and 3,000 seems more focused on abortion and healthcare issues.
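The data behind a dispersion plot like those above is simply the token position of every topic-word hit. The sketch below computes those positions for a small topic lexicon; the economy words come from the example given earlier, while the "immigration" entry, the function names, and the sample tokens are assumptions for illustration. The `topic_switches` helper is an additional, illustrative way to put a rough number on "weaving"; it is not a metric used in our analysis.

```python
def topic_offsets(tokens, lexicon):
    """Map each topic to the token positions where one of its words occurs."""
    word_to_topic = {
        word: topic for topic, words in lexicon.items() for word in words
    }
    offsets = {topic: [] for topic in lexicon}
    for position, token in enumerate(tokens):
        topic = word_to_topic.get(token)
        if topic is not None:
            offsets[topic].append(position)
    return offsets


def topic_switches(offsets):
    """Count how often consecutive topic hits belong to different topics."""
    hits = sorted(
        (position, topic)
        for topic, positions in offsets.items()
        for position in positions
    )
    return sum(1 for (_, a), (_, b) in zip(hits, hits[1:]) if a != b)


# Hypothetical lexicon and invented tokens, used only to exercise the functions.
lexicon = {
    "economy": {"tariff", "tax", "jobs"},
    "immigration": {"border"},
}
tokens = "the tariff and the tax plan then the border and jobs".split()
offsets = topic_offsets(tokens, lexicon)
switches = topic_switches(offsets)
```

Each list in `offsets` can be handed to a plotting library as the x-positions for one row of the plot, with one row per topic.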

Overall, the structural analysis reveals that podcasts enable a more dynamic and conversational format, with higher word counts and frequent speaker turns fostering deeper engagement and relatability. This flexibility allows candidates to weave between topics and connect on a personal level, potentially broadening their appeal to diverse audiences. However, the unstructured nature of podcasts also poses challenges, such as the risk of diluted messaging or off-script remarks being scrutinized. In contrast, the structured nature of traditional interviews keeps discussions focused, which can enhance clarity but limit opportunities for nuanced storytelling.

## Issue Focus

Next, we turn our attention to analyzing how Harris and Trump discuss key political issues. We created a list of 9 words to assess the candidates' focus on specific issues. 

In order to make accurate comparisons, we needed to normalize the data. The raw counts of words in each tokenized dataset are not directly comparable because the total number of words in each medium varies. The dataframe below accounts for this by dividing the count of each selected word by the total number of words in each category, expressed as mentions per 100,000 words.

<img src="images/Issue_Focus.png" alt="Issue Focus" width="1200" />

Both candidates discuss key political issues more in interviews than in podcasts. Harris emphasizes "border" significantly in interviews (314.40 mentions per 100,000 words), suggesting a focus on immigration-related topics, while Trump also mentions "border" relatively frequently (129.13 mentions per 100,000 words). The focus on these issues in interviews rather than podcasts demonstrates a tendency to address policy-heavy topics in formal settings, reserving podcasts for potentially less structured conversations.
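The per-100,000-word rates quoted above follow from a simple normalization: the count of a word, divided by the total number of tokens in the category, scaled by 100,000. A minimal sketch is shown here, assuming each category's tokens are available as a flat list; the function name and the toy token list are hypothetical.

```python
from collections import Counter


def rate_per_100k(tokens, words):
    """Return mentions per 100,000 tokens for each word of interest."""
    total = len(tokens)
    if total == 0:
        return {word: 0.0 for word in words}
    counts = Counter(tokens)
    return {word: counts[word] * 100_000 / total for word in words}


# Toy corpus: 3 mentions of "border" in 1,000 tokens.
toy_tokens = ["border"] * 3 + ["filler"] * 997
rates = rate_per_100k(toy_tokens, ["border", "economy"])
```

Normalizing this way lets a short interview corpus and a much longer podcast corpus be compared on the same scale.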

Given the noticeable difference between the number of times "border" is used in interviews versus podcasts for both candidates, we employed KWIC analysis to better understand the context in which the word is used.
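For readers unfamiliar with KWIC, the idea is to list every occurrence of a keyword together with a fixed window of words on either side. The following is a minimal sketch of that idea, not necessarily how our concordances were generated; the function name and window size are assumptions, and the sample phrase is one quoted later in this section.

```python
def kwic(tokens, keyword, window=5):
    """Return (left context, keyword, right context) for each keyword hit."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


sample = "she was the worst border czar in history".split()
concordance = kwic(sample, "border", window=2)
```

A wider window gives more context per line at the cost of readability when there are many hits.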

**"Border" Mentions in Trump Interviews**

<img src="images/Border_Trump_Interviews.png" alt="Border Trump Interviews" width="800" />

When Trump mentions the border in interviews, he is frequently critiquing the current administration while commenting on his own successes. He repeats "she was the worst border czar in history," demonstrating his use of intense and emotional language.

**"Border" Mentions in Trump Podcasts**

<img src="images/Border_Trump_Podcasts.png" alt="Border Trump Podcasts" width="800" />

There are many similarities between how Trump talks about the border in interviews and in podcasts. "She was the border czar" again shows Trump's tendency to talk about the other side's failures. We also again see intense language, including "very strong," "worst," "best," and "safest," demonstrating no clear difference between how Trump discusses political issues in podcasts and interviews.

**"Border" Mentions in Harris Interviews**

<img src="images/Border_Harris_Interviews.png" alt="Border Harris Interviews" width="800" />

There are clear differences between how Harris and Trump discuss the border. When Harris mentions the border in interviews, she tends to focus on legislative action and support she has already received from law enforcement. In this selection of text, there is no mention of Trump and we see Harris continue to back up her stance on border crossing, which could show her need to convey her values rather than bash her opponent, as she entered the race later.

**"Border" Mentions in Harris Podcasts**

<img src="images/Border_Harris_Podcasts.png" alt="Border Harris Podcasts" width="800" />

When Harris mentions the border in podcasts, we again see a focus on legislative action. She is defending her stance, and because of that we see less emotional and definitive language compared to Trump.

There are clear differences between how often the candidates talk about key political issues and how they address them. However, both Harris and Trump seem to communicate about these issues similarly regardless of medium.

## Opponent Mentions

We also analyzed opponent mentions to determine whether the frequency with which each candidate mentioned their opponent, and the context in which they did so, differed across formats. We started by looking at the raw frequency of the candidates' names across the different mediums:

<img src="images/trump_harris_opponent_mentions_raw.png" alt="Opponent Mentions" width="800" />

We see that Harris refers to Trump significantly more than Trump refers to her across all formats (especially considering the overall difference in word count). However, Trump does mention her by her first name significantly more in podcast appearances. He also mentions himself a lot in podcasts, which is an unexpected finding. We can use KWIC analysis to take a deeper look at Trump's podcast mentions of "kamala":

<img src="images/trump_podcast_kamala_mentions.png" alt="Opponent Mentions" width="800" />

This KWIC reveals some interesting trends in Trump's mentions of his opponent. We see his penchant for creating nicknames such as "comrade kamala," "kamala migrant crime," and "lyin kamala." We also see a ranting, less formal tone in many of these lines, which may be due to the less formal nature of podcasts. Podcasts may have given Trump more of an avenue to rant against his opponent than a more formal news interview would allow.

We can also delve deeper into the differences between formats by looking at keyness, which identifies the words or phrases that are disproportionately frequent in one text or subset of a corpus compared to another, rather than relying on raw counts. In this case, we're comparing Harris' interview appearances to her podcast appearances:

<img src="images/trump_interview_podcast_keyness.png" alt="Opponent Mentions" width="800" />

We see that Harris mentions Trump more in her interview appearances than in her podcasts. To consider why this might be happening, we can contextualize this result by looking at a KWIC plot of Trump mentions in Harris interviews:

<img src="images/Trump_Harris_Interviews.png" alt="Trump Harris Interviews" width="800" />

From this KWIC, we see that Harris's mentions of Trump in interviews are highly critical, targeting his character ("admires dictators," "tried to") and policies ("Trump abortion bans," "hand-selected three members"). This increased bashing in interviews likely reflects the formal tone of the medium, where direct opposition and accountability are expected. Traditional news audiences, often older and more politically engaged, may value assertive critiques as a demonstration of leadership and readiness to confront rivals. Additionally, the time constraints of interviews may push Harris to focus on headline-grabbing attacks to frame the narrative quickly. 
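The keyness comparison used in this section can be sketched in a few lines. The version below assumes the log-likelihood (G2) statistic, a common choice for keyness; this write-up does not state which statistic was actually used, so treat the choice, the function names, and the toy corpora as assumptions. Scores are signed so that positive values mean a word is over-represented in the target corpus.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """G2 for a word seen a times in c target tokens and b times in d reference tokens."""
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_target)
    if b > 0:
        g2 += b * math.log(b / expected_reference)
    return 2 * g2


def keyness(target_tokens, reference_tokens):
    """Rank words by signed G2, most over-represented in the target first."""
    c, d = len(target_tokens), len(reference_tokens)
    if c == 0 or d == 0:
        return []
    target, reference = Counter(target_tokens), Counter(reference_tokens)
    rows = []
    for word in set(target) | set(reference):
        a, b = target[word], reference[word]
        score = log_likelihood(a, b, c, d)
        if a / c < b / d:  # under-represented in the target corpus
            score = -score
        rows.append((word, score))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Toy corpora, used only to exercise the functions.
ranked = keyness(["trump", "trump", "the"], ["the", "the", "the"])
```

Because G2 grows with corpus size, scores are best compared within a single pair of corpora rather than across different comparisons.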

## Emotional Language

Next, we looked to analyze differences in emotional language. We chose 3 positive emotional words (good, hope, and happy) and 3 negative emotional words (bad, hate, worry). We also analyzed the intensity modifier "very". 

<img src="images/Emotional_Language.png" alt="Emotional Language" width="1200" />

Trump uses more emotionally charged language overall, with frequent use of words like "good," "bad," and "very" in both interviews and podcasts. His use of "very" stands out, with consistently high frequencies across formats (589.52 mentions per 100,000 words in interviews and 563.27 in podcasts). In contrast, Harris uses "very" more in interviews than podcasts but generally employs emotional language less frequently than Trump. Diving deeper into Trump's use of emotional language with keyness analysis, we can look at the words he uses disproportionately more in podcasts than in interviews.

<img src="images/keyness_hyberbole.png" alt="Trump Podcast vs. Interview Keyness" width="800" />

As we hypothesized, Trump's vocabulary becomes more "Trumpian" in his podcast appearances. We see him use words like "great," "important," "problem," "big," "terrible," and "fight." This increased hyperbolic language is likely due to the more informal tone of podcasts, which gives Trump more leeway to speak how he thinks, accentuating his mannerisms. We can further contextualize this use of hyperbole by looking at KWIC plots of Trump's use of the word "very" in both podcasts and interviews:

**Trump Interviews**

<img src="images/Very_Trump_Interviews.png" alt="Very Trump Interviews" width="800" />

When Trump uses the word "very" in interviews, he is often referencing specific people or specific events. He talks about specific relationships with other people and specific countries. This usage suggests that Trump has done a more formal assessment or reflection in preparation for more structured contexts. Trump uses the word "very" to speak with conviction.

**Trump Podcasts**

<img src="images/Very_Trump_Podcasts.png" alt="Very Trump Podcasts" width="800" />

When Trump uses the word "very" in podcasts, he seems to speak more generally, referencing "powerful countries," "radical left people," and "those industries." "Very" in podcasts adds emotional intensity, whereas its use in interviews is more measured and tied to diplomatic discussions. Comparing these KWICs with those from Harris' interview and podcast appearances reveals some interesting differences as well:

**Harris Interviews**

<img src="images/Very_Harris_Interviews.png" alt="Very Harris Interviews" width="800" />

In Kamala Harris’s interviews, the use of “very” often emphasizes clarity and decisiveness (“very clear about where I stand” and “very important decisions on behalf”).

**Harris Podcasts**

<img src="images/Very_Harris_Podcasts.png" alt="Very Harris Podcasts" width="800" />

In her podcasts, “very” takes on a more reflective and conversational role, highlighting personal moments, as in “very touched, I was” and “very frank and candid.” The podcast examples show a more emotional and narrative-driven use, connecting with listeners on a deeper level.

The emotional language analysis highlights distinct rhetorical strategies between the candidates across mediums. Trump’s frequent use of emotionally charged words, especially "very," underscores his focus on intensifying his messaging to energize and persuade audiences. Harris, meanwhile, adopts a more reflective tone in podcasts, leveraging personal anecdotes and narrative-driven language to foster emotional connection and relatability. These patterns illustrate how candidates strategically adapt their communication to suit the medium—Trump being able to open up and let his hyperbole go, and Harris seeking to build trust and intimacy with listeners.

## Limitations

Before we conclude, we want to highlight some limitations of our research. First, our sample size is relatively small, with only 11 podcasts featuring Trump, 5 podcasts featuring Harris, and 4 interviews for each candidate. This limited dataset may not capture the full breadth of their communication styles or account for outliers. Additionally, we focused solely on the candidates' responses without analyzing the hosts' questions, which could have significantly influenced the tone, content, and depth of their answers, especially if certain hosts asked more probing or emotionally charged questions. Furthermore, the podcast platforms themselves vary widely in format and audience demographics, which might introduce bias into the types of language and topics emphasized by the candidates. For example, Trump primarily appeared on podcasts aimed at male audiences, while Harris did the opposite. 

## Conclusion

The role of podcasts in the 2024 presidential election underscores the growing importance of alternative media platforms in shaping political discourse. As traditional news outlets face declining trust and relevance among younger, digitally native audiences, podcasts have emerged as a powerful medium for political communication. This raises a key question: To what extent can podcasting perform as principled, narrative journalism capable of fulfilling media’s duty to democracy? Our analysis sought to address this by contrasting how the 2024 presidential candidates, Donald Trump and Kamala Harris, navigated podcasts and traditional interviews, revealing how these platforms influenced their messaging strategies.

Our findings show clear distinctions between the two mediums. Podcasts allowed for more dynamic, emotionally charged, and personal communication, particularly for Trump, who used the informal setting to amplify his rhetoric. Harris, on the other hand, used podcasts to foster a reflective, relational connection with voters. The structural differences between podcasts and interviews, with longer word counts, more speaker turns, and a freer-flowing conversation in podcasts, indicated a shift toward more accessible, less structured political communication. In terms of issue focus, both candidates emphasized policy in interviews, but Trump’s podcasts were characterized by frequent topic shifts, while Harris maintained a more focused approach. Emotional language was notably more prevalent in Trump’s appearances, where hyperbole and intensity became central to his appeal, while Harris used podcasts for more personal storytelling aimed at building trust.

The implications of these findings are significant for both political strategy and media consumption. Candidates are increasingly adapting their messaging to suit the medium, with podcasts offering a space for personalized, emotional appeals that might not resonate in the more formal structure of interviews. This speaks to the potential of podcasts to foster deeper connections with voters, especially in an era where traditional media often struggles to maintain credibility.

Looking forward, future research could explore the long-term impact of podcasts on voter behavior, examining whether their more personal, informal style translates into greater political engagement or polarization. Additionally, the influence of podcast hosts and their audiences, particularly in partisan media spaces, warrants further investigation. Lastly, considering the rise of influencers and independent content creators in politics, understanding how podcasts intersect with democracy will be crucial. As podcasts continue to reshape political communication, their role in shaping public discourse and its implications for democratic processes remain vital areas for exploration.

## References

- Dowling, D. (2024). *Podcast Journalism: The Promise and Perils of Audio Reporting*. Columbia University Press.  
- Edison. (2023, March 3). *The Infinite Dial 2023 from Edison Research with Amazon Music, Wondery, and ART19*. Edison Research.  
  [https://www.edisonresearch.com/infinite-dial-2023-from-edison-research-with-amazon-music-wondery-and-art19/](https://www.edisonresearch.com/infinite-dial-2023-from-edison-research-with-amazon-music-wondery-and-art19/)  
- Stocking, G., Wang, L., Lipka, M., Matsa, K. E., Widjaya, R., Tomasik, E., & Liedke, J. (2024, November 18).  
  *America’s News Influencers*. Pew Research Center.  
  [https://www.pewresearch.org/journalism/2024/11/18/americas-news-influencers/](https://www.pewresearch.org/journalism/2024/11/18/americas-news-influencers/)  
- Schlütz, D., & Hedder, I. (2022). *Aural Parasocial Relations: Host–Listener Relationships in Podcasts*.  
  *Journal of Radio & Audio Media*, 29(2), 457–474.  
  [https://doi.org/10.1080/19376529.2020.1870467](https://doi.org/10.1080/19376529.2020.1870467)  
- *Trump’s success among young men illustrates influence of online “manosphere.”* (2024, November 25). PBS News.  
  [https://www.pbs.org/newshour/show/trumps-success-among-young-men-illustrates-influence-of-online-manosphere](https://www.pbs.org/newshour/show/trumps-success-among-young-men-illustrates-influence-of-online-manosphere)  


