# Functions

In [1]:
function tokenize(s::AbstractString)
    tokens = replace(s, r"[,\.\":;!\$\(\)\?\[\]]" => "") # strip punctuation, keeping apostrophes
    tokens = replace(tokens, r"[-—]" => " ") # replace hyphens and dashes with spaces
    tokens = replace(tokens, r"THE PRESIDENT" => "") # drop the "THE PRESIDENT" speaker label
    split(lowercase(tokens)) # lowercase, then split on whitespace
end

tokenize (generic function with 1 method)

In [61]:
function find_lines_with_word(s::AbstractString, word::AbstractString)
    for (line_num, line) in enumerate(split(s, "\n"))
        if any(x -> occursin(word, x), tokenize(line))
            println(line_num, ": ", line, "\n")
        end
    end
end

find_lines_with_word (generic function with 1 method)

In [57]:
function word_count(s::AbstractString, word::AbstractString)
    text = tokenize(s)
    # occursin matches substrings, so "hope" also counts tokens such as "hopeless"
    count(x -> occursin(word, x), text)
end

word_count (generic function with 1 method)
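
Because `word_count` relies on `occursin`, a query such as "war" also counts tokens like "toward" or "reward". The sketch below contrasts that substring behaviour with an exact-match count. The names `simple_tokens`, `substring_count`, and `exact_count` are hypothetical and not part of this notebook, and the tokenizer is a simplified stand-in for `tokenize` (lowercase plus whitespace split only).

```julia
# Hypothetical helpers for illustration; not part of the original notebook.
# Simplified tokenizer: lowercase, then split on whitespace.
simple_tokens(s::AbstractString) = split(lowercase(s))

# Substring matching, the same test word_count uses.
substring_count(s, word) = count(t -> occursin(word, t), simple_tokens(s))

# Exact matching: a token counts only if it equals the word.
exact_count(s, word) = count(t -> t == word, simple_tokens(s))

sample = "The war ended and we moved toward a warm reward"
println(substring_count(sample, "war"))  # 4: war, toward, warm, reward
println(exact_count(sample, "war"))      # 1: war
```

Which of the two is appropriate depends on the question: substring matching catches inflections such as "hopes" or "hopeful", at the cost of unrelated words that happen to contain the query.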

In [4]:
function get_unique_word_set(filepath::AbstractString)
    # read(filepath, String) opens and closes the file itself, so no handle is left open
    Set(tokenize(read(filepath, String)))
end

get_unique_word_set (generic function with 1 method)

In [28]:
function print_unique_lines(filepath::AbstractString, word_set::Set, n::Integer)
    for (line_count, line) in enumerate(eachline(filepath))
        tokens = tokenize(line)
        # print only lines with more than n tokens drawn from word_set
        if count(x -> x in word_set, tokens) > n
            println(line_count, ": ", line, " \n")
        end
    end
end

print_unique_lines (generic function with 1 method)

In [52]:
function bulk_upload(filepaths::Array{String})
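    # Concatenate the full contents of each file, in order.
    # read(fp, String) opens and closes the file itself, so no handle is left open.
    all_str = ""
    for fp in filepaths
        all_str = all_str * read(fp, String)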
    end
    all_str
end

bulk_upload (generic function with 1 method)

# Comparison of All SOTUs

In [53]:
obama = bulk_upload(["obama_2009.txt", "obama_2010.txt", "obama_2011.txt", "obama_2012.txt", 
        "obama_2013.txt", "obama_2014.txt", "obama_2015.txt", "obama_2016.txt"])
trump = bulk_upload(["trump_2017.txt", "trump_2018.txt", "trump_2019.txt"])

"Thank you very much. Mr. Speaker, Mr. Vice President, Members of Congress, the First Lady of the United States, and citizens of America: Tonight, as we mark the conclusion of our celebration of Black History Month, we are reminded of our Nation's path towards civil rights and the work that still remains to be done. Recent threats targeting Jewish community centers and vandalism of Jewish cemeteries, as well as last week's shooting in Kansas City, remind us that while we may be a nation divided on policies, we are a country that stands united in condemning hate and evil in all of its very ugly forms.\r\n\r\nEach American generation passes the torch of truth, liberty, and justice in an unbroken chain, all the way down to the present. That torch is now in our hands, and we will use it to light up the world. I am here tonight to deliver a message of unity and strength, and it is a message deeply delivered from my heart. A new chapter of American greatness is now beginning. A new national 

In [66]:
word_count(obama, "hope")

24

In [62]:
find_lines_with_word(obama, "hope")

37: Second, we have launched a housing plan that will help responsible families facing the threat of foreclosure lower their monthly payments and refinance their mortgages. It's a plan that won't help speculators or that neighbor down the street who bought a house he could never hope to afford, but it will help millions of Americans who are struggling with declining home values; Americans who will now be able to take advantage of the lower interest rates that this plan has already helped to bring about. In fact, the average family who refinances today can save nearly $2,000 per year on their mortgage.

81: This budget builds on these reforms. It includes a historic commitment to comprehensive health care reform, a down payment on the principle that we must have quality, affordable health care for every American. It's a commitment that's paid for in part by efficiencies in our system that are long overdue. And it's a step we must take if we hope to bring down our deficit in the years to


1287: Now, I understand that because it's an election season, expectations for what we will achieve this year are low. But, Mr. Speaker, I appreciate the constructive approach that you and other leaders took at the end of last year to pass a budget and make tax cuts permanent for working families. So I hope we can work together this year on some bipartisan priorities like criminal justice reform and helping people who are battling prescription drug abuse and heroin abuse. So, who knows, we might surprise the cynics again.

1403: What I'm suggesting is hard. It's a lot easier to be cynical; to accept that change is not possible and politics is hopeless and the problem is, all the folks who are elected don't care; and to believe that our voices and our actions don't matter. But if we give up now, then we forsake a better future. Those with money and power will gain greater control over the decisions that could send a young soldier to war or allow another economic disaster or roll back 

In [63]:
word_count(trump, "hope")

14

In [64]:
find_lines_with_word(trump, "hope")

17: Dying industries will come roaring back to life. Heroic veterans will get the care they so desperately need. Our military will be given the resources its brave warriors so richly deserve. Crumbling infrastructure will be replaced with new roads, bridges, tunnels, airports, and railways gleaming across our very, very beautiful land. Our terrible drug epidemic will slow down and, ultimately, stop. And our neglected inner cities will see a rebirth of hope, safety, and opportunity. Above all else, we will keep our promises to the American people. [Applause] Thank you.

99: Everything that is broken in our country can be fixed. Every problem can be solved. And every hurting family can find healing and hope.

157: America is friends today with former enemies. Some of our closest allies, decades ago, fought on the opposite side of these terrible, terrible wars. This history should give us all faith in the possibilities for a better world. Hopefully, the 250th year for America will see a

In [70]:
word_count(obama, "war") / 8

14.375

In [69]:
word_count(trump, "war") / 3

11.333333333333334

In [72]:
word_count(obama, "very") / 8

31.5

In [71]:
word_count(trump, "very") / 3

38.666666666666664

In [76]:
word_count(obama, "big") / 8

6.0

In [75]:
word_count(trump, "big") / 3

3.6666666666666665
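
The divisions by 8 and 3 above give a mean count per address. Since the addresses differ in length, a length-normalized rate is another way to compare the two speakers. The following is a minimal sketch under that assumption: `rate_per_thousand` is a hypothetical helper, not part of this notebook, and it uses a simplified tokenizer (lowercase plus whitespace split) in place of `tokenize`.

```julia
# Hypothetical helper: tokens containing `word`, per 1,000 tokens of text.
function rate_per_thousand(text::AbstractString, word::AbstractString)
    tokens = split(lowercase(text))
    isempty(tokens) && return 0.0
    # Same substring test as word_count.
    hits = count(t -> occursin(word, t), tokens)
    1000 * hits / length(tokens)
end

# Made-up sample text: 4 of its 8 tokens contain "very".
sample = "very good very big very very small plan"
println(rate_per_thousand(sample, "very"))  # 500.0
```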

# Words Specific to Obama/Trump

In [14]:
o_total_word_set = union(get_unique_word_set("obama_2009.txt"), 
    get_unique_word_set("obama_2010.txt"), 
    get_unique_word_set("obama_2011.txt"),
    get_unique_word_set("obama_2012.txt"), 
    get_unique_word_set("obama_2013.txt"),
    get_unique_word_set("obama_2014.txt"), 
    get_unique_word_set("obama_2015.txt"),
    get_unique_word_set("obama_2016.txt"))

Set(SubString{String}["enriching", "ferret", "falls", "offend", "billions", "who've", "doctor", "enjoy", "whoever", "fight"  …  "power", "summon", "prerequisite", "renew", "goes", "draw", "plotting", "forebears", "reckoning", "popular", "funds"])

In [17]:
t_total_word_set = union(get_unique_word_set("trump_2017.txt"), 
    get_unique_word_set("trump_2018.txt"), 
    get_unique_word_set("trump_2019.txt"))

Set(SubString{String}["billions", "doctor", "fight", "everywhere", "helping", "during", "whose", "prostitution", "borders", "scene"  …  "crimes", "maybe", "tax", "exhausted", "debra's", "power", "summon", "goes", "renew", "heal", "funds"])

In [18]:
length(t_total_word_set)

3101

In [19]:
o_unique = setdiff(o_total_word_set, t_total_word_set)

Set(SubString{String}["enriching", "ferret", "falls", "offend", "who've", "enjoy", "whoever", "schedule", "manufacture", "sleepless"  …  "ethnicity", "bears", "attract", "reestablishing", "woman's", "prerequisite", "draw", "plotting", "forebears", "reckoning", "popular"])

In [21]:
t_unique = setdiff(t_total_word_set, o_total_word_set)

Set(SubString{String}["prostitution", "hampshire", "dethroned", "24", "mandates", "houston", "chart", "rescues", "17", "needy"  …  "heather", "sales", "burglarized", "silenced", "celebration", "unify", "vetting", "inmates", "exhausted", "debra's", "heal"])

In [22]:
length(o_unique)

3156

In [23]:
length(t_unique)

1092
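
For reference, `setdiff(a, b)` keeps the elements of `a` that do not occur in `b`, so `o_unique` and `t_unique` hold the words used by one speaker and never by the other. A toy illustration with made-up vocabularies (not taken from the transcripts):

```julia
# Toy vocabularies, made up for illustration.
a = Set(["hope", "war", "economy", "jobs"])
b = Set(["hope", "war", "border"])

only_a = setdiff(a, b)  # words in a that never appear in b
only_b = setdiff(b, a)  # words in b that never appear in a

# Sets are unordered, so sort before printing for a stable display.
println(sort(collect(only_a)))  # ["economy", "jobs"]
println(sort(collect(only_b)))  # ["border"]
```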

In [40]:
print_unique_lines(raw"transcripts\trump_2019.txt", t_unique, 13)

15: In June, we mark 75 years since the start of what General Dwight D. Eisenhower called the "Great Crusade" — the Allied liberation of Europe in World War II. (Applause.) On D-Day, June 6th, 1944, 15,000 young American men jumped from the sky, and 60,000 more stormed in from the sea, to save our civilization from tyranny. Here with us tonight are three of those incredible heroes: Private First Class Joseph Reilly, Staff Sergeant Irving Locker, and Sergeant Herman Zeitchik. (Applause.) Please. Gentlemen, we salute you. 

75: In June, I commuted Alice's sentence. When I saw Alice's beautiful family greet her at the prison gates, hugging and kissing and crying and laughing, I knew I did something right. Alice is with us tonight, and she is a terrific woman. Terrific. Alice, please. (Applause.) 

103: One in three women is sexually assaulted on the long journey north. Smugglers use migrant children as human pawns to exploit our laws and gain access to our country. Human traffickers and s

In [42]:
print_unique_lines(raw"transcripts\trump_2018.txt", t_unique, 15)

67: Here tonight are two fathers and two mothers: Evelyn Rodriguez, Freddy Cuevas, Elizabeth Alvarado, and Robert Mickens. Their two teenage daughters—Kayla Cuevas and Nisa Mickens—were close friends on Long Island. But in September 2016, on the eve of Nisa's 16th birthday—such a happy time it should have been—neither of them came home. These two precious girls were brutally murdered while walking together in their hometown. 

69: Six members of the savage MS-13 gang have been charged with Kayla and Nisa's murders. Many of these gang members took advantage of glaring loopholes in our laws to enter the country as illegal, unaccompanied alien minors, and wound up in Kayla and Nisa's high school. Evelyn, Elizabeth, Freddy, and Robert: Tonight everyone in this chamber is praying for you. Everyone in America is grieving for you. Please stand. Thank you very much. I want you to know that 320 million hearts are right now breaking for you. We love you. Thank you. 

111: Clearing the second flo

In [43]:
print_unique_lines(raw"transcripts\trump_2017.txt", t_unique, 15)

53: I am greatly honored to have Maureen Scalia with us in the gallery tonight. Thank you, Maureen. Her late, great husband, Antonin Scalia, will forever be a symbol of American justice. To fill his seat, we have chosen Judge Neil Gorsuch, a man of incredible skill and deep devotion to the law. He was confirmed unanimously by the Court of Appeals, and I am asking the Senate to swiftly approve his nomination. 

129: Also with us are Susan Oliver and Jessica Davis. Their husbands, Deputy Sheriff Danny Oliver and Detective Michael Davis, were slain in the line of duty in California. They were pillars of their community. These brave men were viciously gunned down by an illegal immigrant with a criminal record and two prior deportations. Should have never been in our country. 

141: We are blessed to be joined tonight by Carryn Owens, the widow of U.S. Navy Special Operator, Senior Chief William "Ryan" Owens. Ryan died as he lived: a warrior and a hero, battling against terrorism and securi

In [37]:
print_unique_lines(raw"transcripts\obama_2009.txt", o_unique, 20)

13: In other words, we have lived through an era where too often short-term gains were prized over long-term prosperity, where we failed to look beyond the next payment, the next quarter, or the next election. A surplus became an excuse to transfer wealth to the wealthy instead of an opportunity to invest in our future. Regulations were gutted for the sake of a quick profit at the expense of a healthy market. People bought homes they knew they couldn't afford from banks and lenders who pushed those bad loans anyway. And all the while, critical debates and difficult decisions were put off for some other time, on some other day. Well, that day of reckoning has arrived, and the time to take charge of our future is here. 

21: Because of this plan, there are teachers who can now keep their jobs and educate our kids, health care professionals can continue caring for our sick. There are 57 police officers who are still on the streets of Minneapolis tonight because this plan prevented the lay

In [45]:
print_unique_lines(raw"transcripts\obama_2016.txt", o_unique, 25)

45: I believe a thriving private sector is the lifeblood of our economy. I think there are outdated regulations that need to be changed. There is redtape that needs to be cut. [Applause] There you go! Yes! See? But after years now of record corporate profits, working families won't get more opportunity or bigger paychecks just by letting big banks or big oil or hedge funds make their own rules at everybody else's expense. Middle class families are not going to feel more secure because we allowed attacks on collective bargaining to go unanswered. Food stamp recipients did not cause the financial crisis; recklessness on Wall Street did. Immigrants aren't the principal reason wages haven't gone up; those decisions are made in the boardrooms that all too often put quarterly earnings over long-term returns. It's sure not the average family watching tonight that avoids paying taxes through offshore accounts. [Laughter] 

59: Now, medical research is critical. We need the same level of commit