# The Philippine president's SONA

This notebook scrapes https://www.officialgazette.gov.ph/past-sona-speeches/ for copies of the State of the Nation Addresses (SONAs) of Philippine presidents from 1935 to the present.

The goal is to make the SONAs usable for text analysis. The speeches are now delivered before Congress every fourth Monday of July and are widely anticipated for setting the tone of a sitting administration.

An analysis is provided in the latter part of the notebook.

## Do your imports

In [1]:
import pandas as pd

import time
import re
import numpy as np

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select

from webdriver_manager.chrome import ChromeDriverManager

import requests
from bs4 import BeautifulSoup

## Allow Selenium to open up Chrome and automatically navigate through the website

In [2]:
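from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))  # Selenium 4 takes the driver path via a Service object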



Could not get version for google-chrome with the any command: /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --version
Current google-chrome version is UNKNOWN
Get LATEST chromedriver version for UNKNOWN google-chrome
Trying to download new driver from https://chromedriver.storage.googleapis.com/104.0.5112.79/chromedriver_mac64.zip
Driver has been saved in cache [/Users/prinzmagtulis/.wdm/drivers/chromedriver/mac64/104.0.5112.79]


In [3]:
driver.get("https://www.officialgazette.gov.ph/past-sona-speeches/")

## Scraping proper: table

The first step is to scrape the tabular information, that is, everything in the table except the contents of the pages behind the **links**.

In [4]:
rows = driver.find_elements(By.TAG_NAME, "tr")

We arrange the information into a **list of dictionaries** in preparation for transforming it into a **data frame** for pandas analysis later.

In [5]:
dataset = []
prexy = None  # most recent president; continuation rows omit this cell
for row in rows[1:]:  # skip the header row
    data = {}
    all_tds = row.find_elements(By.TAG_NAME, "td")
    links = row.find_elements(By.TAG_NAME, "a")
    if len(all_tds) == 5:
        # Full row: the first cell names the president.
        prexy = all_tds[0].text
        cells = all_tds[1:]
        # Use the second link when the row has more than one, else the first.
        link = links[1] if len(links) > 1 else links[0]
    else:
        # Continuation row: reuse the president from the row above.
        cells = all_tds
        link = links[0]
    data['president'] = prexy
    data['date'] = cells[0].text
    data['title'] = cells[1].text
    data['link'] = link.get_attribute('href')
    data['venue'] = cells[2].text
    data['session'] = cells[3].text
    dataset.append(data)
dataset

[{'president': 'Manuel L. Quezon',
  'date': 'November 25, 1935',
  'title': 'Message to the First Assembly on National Defense',
  'link': 'http://www.officialgazette.gov.ph/1935/11/25/message-of-president-quezon-to-the-first-assembly-on-national-defense-november-25-1935/',
  'venue': 'Legislative Building, Manila',
  'session': 'First National Assembly, First Session'},
 {'president': 'Manuel L. Quezon',
  'date': 'June 16, 1936',
  'title': 'On the Country’s Conditions and Problems',
  'link': 'http://www.officialgazette.gov.ph/1936/06/16/manuel-l-quezon-second-state-of-the-nation-address-june-16-1936/',
  'venue': 'Legislative Building, Manila',
  'session': 'First National Assembly, First Session'},
 {'president': 'Manuel L. Quezon',
  'date': 'October 18, 1937',
  'title': 'Improvement of Philippine Conditions, Philippine Independence, and Relations with American High Commissioner',
  'link': 'http://www.officialgazette.gov.ph/1937/10/18/manuel-l-quezon-third-state-of-the-nation-
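
The carry-forward logic in the loop above can be tried without a browser. The following is a minimal sketch, assuming each table row has already been reduced to a list of cell texts; `rows_to_records` and the sample rows are hypothetical names and data used only for illustration. A five-cell row starts a new president, while a four-cell row inherits the president from the row above it.

```python
def rows_to_records(rows):
    """Turn lists of cell texts into dicts, carrying the president forward."""
    records = []
    president = None  # most recent president seen
    for cells in rows:
        if len(cells) == 5:
            # Full row: the first cell names the president.
            president, cells = cells[0], cells[1:]
        date, title, venue, session = cells
        records.append({
            "president": president,
            "date": date,
            "title": title,
            "venue": venue,
            "session": session,
        })
    return records


# Hypothetical sample mimicking a table whose first column spans several rows.
sample_rows = [
    ["President A", "June 16, 1936", "First title", "Manila", "First Session"],
    ["October 18, 1937", "Second title", "Manila", "Second Session"],
    ["President B", "July 23, 2018", "Third title", "Quezon City", "Third Session"],
]
records = rows_to_records(sample_rows)
print([r["president"] for r in records])
```

Keeping the "last seen" value in a variable outside the loop is what lets continuation rows pick up the president's name.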

Our **first data frame**

In [6]:
df1 = pd.DataFrame(dataset)
df1.head()

| | president | date | title | link | venue | session |
|---|---|---|---|---|---|---|
| 0 | Manuel L. Quezon | November 25, 1935 | Message to the First Assembly on National Defense | http://www.officialgazette.gov.ph/1935/11/25/m... | Legislative Building, Manila | First National Assembly, First Session |
| 1 | Manuel L. Quezon | June 16, 1936 | On the Country’s Conditions and Problems | http://www.officialgazette.gov.ph/1936/06/16/m... | Legislative Building, Manila | First National Assembly, First Session |
| 2 | Manuel L. Quezon | October 18, 1937 | Improvement of Philippine Conditions, Philippi... | http://www.officialgazette.gov.ph/1937/10/18/m... | Legislative Building, Manila | First National Assembly, Second Session |
| 3 | Manuel L. Quezon | January 24, 1938 | Revision of the System of Taxation | http://www.officialgazette.gov.ph/1938/01/24/m... | Legislative Building, Manila | First National Assembly, Third Session |
| 4 | Manuel L. Quezon | January 24, 1939 | The State of the Nation and Important Economic... | http://www.officialgazette.gov.ph/1939/01/24/m... | Legislative Building, Manila | Second National Assembly, First Session |


## Scraping proper: actual speeches

We use BeautifulSoup for this part. The process is easier since the first df already holds the links, so all we have to do is **access and grab** their contents one by one.

I'm commenting out the final display line to avoid printing a wall of text, but hey, the cell runs very well so try it on your own!

In [7]:
speeches = []
for speech in dataset:
    href = speech['link']
    raw_html = requests.get(href).content
    doc = BeautifulSoup(raw_html, "html.parser")
    # The second element with this class string is taken as the speech text.
    body = doc.find_all(class_='large-9 large-centered columns')[1]
    text = {}
    text['link'] = href
    text['speech'] = body.text
    speeches.append(text)
#speeches

Each speech is stored as a **single block** of text per row to match its place in the df. This is, of course, not the ideal way and may be improved. Below is a **second data frame** containing the links and the speeches themselves.

We then **merge** this information with our earlier df.

In [8]:
df2=pd.DataFrame(speeches)
df2

| | link | speech |
|---|---|---|
| 0 | http://www.officialgazette.gov.ph/1935/11/25/m... | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| 1 | http://www.officialgazette.gov.ph/1936/06/16/m... | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| 2 | http://www.officialgazette.gov.ph/1937/10/18/m... | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| 3 | http://www.officialgazette.gov.ph/1938/01/24/m... | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| 4 | http://www.officialgazette.gov.ph/1939/01/24/m... | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| ... | ... | ... |
| 79 | https://www.officialgazette.gov.ph/2018/07/23/... | \n\n\n\nSTATE OF THE NATION ADDRESS OF \nRODRI... |
| 80 | https://www.officialgazette.gov.ph/2019/07/22/... | \n\n\n\nSTATE OF THE NATION ADDRESS OF \nRODRI... |
| 81 | https://www.officialgazette.gov.ph/2020/07/27/... | \n\n\n\n\n\n\n5TH STATE OF THE NATION ADDRESS ... |
| 82 | https://www.officialgazette.gov.ph/2021/07/26/... | \n\n\tState of the Nation Address of \n\tRodri... |
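
One possible refinement of the single-block layout is to split each speech into paragraphs before analysis. The sketch below is an assumption about how that could be done, not part of the original workflow; `split_paragraphs` is a hypothetical helper that treats every non-empty line as one paragraph, and `raw` is made-up text shaped like the scraped speeches.

```python
def split_paragraphs(speech):
    """Split a scraped speech into a list of non-empty, stripped lines."""
    return [line.strip() for line in speech.splitlines() if line.strip()]


# Hypothetical text with the leading newlines and tabs seen in the scraped data.
raw = "\n\n\tState of the Nation Address\n\nFirst paragraph.\n  \nSecond paragraph.\n"
paragraphs = split_paragraphs(raw)
print(paragraphs)
```

With pandas, a column of such lists could then be expanded into one row per paragraph using `DataFrame.explode`.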


Our final df.

In [9]:
merged = df1.merge(df2, on='link')  # 'link' is the only column the two frames share
merged


| | president | date | title | link | venue | session | speech |
|---|---|---|---|---|---|---|---|
| 0 | Manuel L. Quezon | November 25, 1935 | Message to the First Assembly on National Defense | http://www.officialgazette.gov.ph/1935/11/25/m... | Legislative Building, Manila | First National Assembly, First Session | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| 1 | Manuel L. Quezon | June 16, 1936 | On the Country’s Conditions and Problems | http://www.officialgazette.gov.ph/1936/06/16/m... | Legislative Building, Manila | First National Assembly, First Session | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| 2 | Manuel L. Quezon | October 18, 1937 | Improvement of Philippine Conditions, Philippi... | http://www.officialgazette.gov.ph/1937/10/18/m... | Legislative Building, Manila | First National Assembly, Second Session | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| 3 | Manuel L. Quezon | January 24, 1938 | Revision of the System of Taxation | http://www.officialgazette.gov.ph/1938/01/24/m... | Legislative Building, Manila | First National Assembly, Third Session | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| 4 | Manuel L. Quezon | January 24, 1939 | The State of the Nation and Important Economic... | http://www.officialgazette.gov.ph/1939/01/24/m... | Legislative Building, Manila | Second National Assembly, First Session | \nMessage\nof\nHis Excellency Manuel L. Quezon... |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 79 | Rodrigo Roa Duterte | July 23, 2018 | Third State of the Nation Address | https://www.officialgazette.gov.ph/2018/07/23/... | Batasang Pambansa, Quezon City | Seventeenth Congress, Third Session | \n\n\n\nSTATE OF THE NATION ADDRESS OF \nRODRI... |
| 80 | Rodrigo Roa Duterte | July 22, 2019 | Fourth State of the Nation Address | https://www.officialgazette.gov.ph/2019/07/22/... | Batasang Pambansa, Quezon City | Eighteenth Congress, First Session | \n\n\n\nSTATE OF THE NATION ADDRESS OF \nRODRI... |
| 81 | Rodrigo Roa Duterte | July 27, 2020 | Fifth State of the Nation Address | https://www.officialgazette.gov.ph/2020/07/27/... | Batasang Pambansa, Quezon City | Eighteenth Congress, Second Session | \n\n\n\n\n\n\n5TH STATE OF THE NATION ADDRESS ... |
| 82 | Rodrigo Roa Duterte | July 26, 2021 | Sixth State of the Nation Address | https://www.officialgazette.gov.ph/2021/07/26/... | Batasang Pambansa, Quezon City | Eighteenth Congress, Third Session | \n\n\tState of the Nation Address of \n\tRodri... |
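
Because the merge relies on every link appearing exactly once in each frame, it may be worth checking that assumption explicitly. The sketch below uses two hypothetical miniature frames (`left` and `right` stand in for df1 and df2) and pandas' `validate` and `indicator` options; it illustrates one way to audit the join and is not a step the notebook itself performs.

```python
import pandas as pd

# Hypothetical miniature versions of df1 and df2.
left = pd.DataFrame({"title": ["A", "B", "C"], "link": ["u1", "u2", "u3"]})
right = pd.DataFrame({"link": ["u1", "u2"], "speech": ["text one", "text two"]})

# how="left" keeps every table row even when no speech was scraped for it;
# validate raises a MergeError if a link is duplicated on either side;
# indicator adds a "_merge" column saying where each row came from.
checked = left.merge(right, on="link", how="left",
                     validate="one_to_one", indicator=True)
missing = checked[checked["_merge"] == "left_only"]
print(missing["link"].tolist())
```

Rows flagged `left_only` are table entries whose speech text did not come through, which the default inner merge would drop silently.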


## Save to CSV

In [165]:
#merged.to_csv('merged.csv', index=False)

# Initial analysis

## regex

We are now ready to run an **initial analysis** of the texts that we have. For this part, I provide some examples below using **regex**.

An important note on this method: **str.contains** only tells us *whether* a speech contains the word, so the tallies below count *the number of speeches* that mention it, *not how many times* it was mentioned. Likewise, **str.extractall** with the patterns used here returns each paragraph that contains the word, not each individual occurrence. We will look into actual word counts later in a deeper analysis.

The words searched here are drawn from peer-reviewed textual studies that gauge **populism.**
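
The difference between "speeches that contain the word" and "times the word is mentioned" can be shown with the standard `re` module alone. This is a minimal sketch on a made-up corpus; the `speeches_by_speaker` dictionary and the speaker names are hypothetical.

```python
import re

# Hypothetical mini-corpus: speaker name -> list of speeches.
speeches_by_speaker = {
    "Speaker A": ["The elite and the elites.", "Nothing relevant here."],
    "Speaker B": ["An elite few."],
}
pattern = re.compile(r"\belite", re.IGNORECASE)

summary = {}
for name, texts in speeches_by_speaker.items():
    containing = sum(1 for t in texts if pattern.search(t))  # speeches with the word
    mentions = sum(len(pattern.findall(t)) for t in texts)   # total occurrences
    summary[name] = (containing, mentions)
print(summary)
```

In pandas, the per-speech occurrence count corresponds to `Series.str.count` with the same pattern.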

In [12]:
# Check the data types of the columns we are dealing with.
merged.dtypes

president    object
date         object
title        object
link         object
venue        object
session      object
speech       object
dtype: object

### 'elite'

The word "elite" has been found to be used often by populist leaders. In this initial analysis, three Philippine presidents (one of whom was **dictator** Ferdinand Marcos Sr.) included the word in their SONAs.

In [13]:
merged[merged.speech.str.contains(r"\belite", case=False, regex=True)].president.value_counts()

Ferdinand E. Marcos        2
Joseph Ejercito Estrada    1
Rodrigo Roa Duterte        1
Name: president, dtype: int64

In [14]:
pd.set_option('display.max_colwidth', None)
merged.speech.str.extractall(r'(.*\belite.+)', re.IGNORECASE)

index,match,0
31,0,"It is fortunate that the nation will, just two years from now, call a constitutional convention. I leave it to the delegates of that convention to evolve a truly democratic system, one which will not merely bend, as our system does today, to the wishes of a traditional elite and perpetuate the status quo. Democratic institutions must be instruments of national advancement. Democracy must symbolize change."
37,0,"Clearly, we face here the danger that our New Society is giving birth to a new government elite, who resurrect in our midst the privileges we fought in the past, who employ the powers of high office for their personal enrichment, as well as of their business colleagues, relatives, and friends."
60,0,"Our war on poverty is in the acceleration of the land redistribution processes under the agrarian reform program. We distributed more than 266,000 hectares of land to 175,000 landless farmers, including land owned by the traditional rural elite. []"
81,0,Great wealth enables economic elites and corporations to influence public policy to their advantage. Media is a powerful tool in the hands of oligarchs like the Lopezes who used their media outlets in their battles with political figures. I am a casualty of the Lopezes during the 2016 election.
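
To see why the pattern `(.*\belite.+)` returns whole paragraphs, it helps to run it on a small string. The text below is hypothetical, with each line standing in for one paragraph of a speech.

```python
import re

# Hypothetical two-paragraph text; each line stands for one paragraph.
text = "Elites and more elites gather.\nThe nonelite majority stays home.\n"

# \b anchors the match at the start of a word, so "nonelite" is skipped.
# .* and .+ stretch the match across the rest of the line, and "." never
# crosses a newline, so each match is at most one full line.
matches = re.findall(r"(.*\belite.+)", text, re.IGNORECASE)
print(matches)
```

The first line mentions the word twice but still produces a single match, which is why the extracted rows are paragraphs rather than individual mentions.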


### 'democracy' and 'demokrasya'

Dictator Ferdinand E. Marcos mentioned the word **"democracy"** in 10 of his SONAs, followed by Gloria Macapagal-Arroyo (7 of 9 SONAs). Benigno Aquino III used the Filipino word **"demokrasya"** in two of his six speeches.

**Joseph Estrada**, whose term was cut short by a popular revolt in 2001, and **Rodrigo Duterte** each mentioned "democracy" in a single SONA.

In [15]:
merged[merged.speech.str.contains(r".*\bdemocracy.+", case=False, regex=True)].president.value_counts()


Ferdinand E. Marcos        10
Gloria Macapagal-Arroyo     7
Manuel L. Quezon            5
Corazon C. Aquino           5
Fidel V. Ramos              5
Ramon Magsaysay             4
Diosdado Macapagal          4
Manuel Roxas                3
Elpidio Quirino             3
Carlos P. Garcia            2
Joseph Ejercito Estrada     1
Rodrigo Roa Duterte         1
Name: president, dtype: int64

In [16]:
merged[merged.speech.str.contains(r".*\bdemokrasya.+", case=False, regex=True)].president.value_counts()


Benigno S. Aquino III      2
Ferdinand E. Marcos        1
Corazon C. Aquino          1
Gloria Macapagal-Arroyo    1
Name: president, dtype: int64

In [17]:
merged.speech.str.extractall(r'(.*\bdemocracy.+)', re.IGNORECASE).head(7)

index,match,0
1,0,"In our day and generation democracy, as an effective system of government, is being challenged. Let this new democracy of ours show to the world that democracy can be as efficient as a dictatorship, without trespassing upon individual liberty and the sacred rights of the people."
2,0,"Still more: The Filipino workingman has heard, if he is not able to read, of the equality before the law of the poor and the rich. He has heard of democracy, liberty, and justice, since every candidate for an elective office discourses on these topics, painting to him in glowing terms the meaning of these words."
2,1,"One of the discoveries which we have made since the establishment of the Government of the Commonwealth is that, despite the large number of children that have gone through our public schools, as shown in the reports of the Bureau of Education, the literacy of the Islands has not increased proportionally, and the knowledge of those rudimentary subjects which the citizen of a democracy should have, has not been acquired by a population corresponding to the number of children that appear to have entered the public schools. The reason for this is simple. A large proprtion of the boys and girls who have been admitted to the schools have not remained long enough to acquire any kind of useful knowledge."
2,2,"Gentlemen of the National Assembly, before closing, allow me to emphasize the need to of giving the common man in the Philippines the benefits that the citizenry of every progressive democracy is entitled to receive. I am sure that every one of you will give to this noble task the best that is in him. An opportunity has been offered us that no past or coming generation has had or will ever have –that of creating a nation where there will be no privileged class, where poverty will be unknown, where every citizen will be duly equipped with the knowledge that will enable him to perform his duties and to exercise his rights properly and conscientiously, and where every man, woman, and child his fireside will be thankful to God for living in this beautiful and blessed land."
3,0,"We are earnestly concerned with social justice. Without a strict application of social justice to all elements of the community, general satisfaction of the people with their government is impossible to achieve. Here, in the just and equitable solution of social problems, is the real test of the sufficiency of democracy to meet present-day conditions of society."
4,0,"As a final word respecting the Army, I want to urge you, once again, to give to all matters concerning our future security the earnest consideration their fundamental importance deserves. If eternal vigilance is the price of freedom, let us then be ceaselessly vigilant. Our defensive system requires no unusual sacrifice by any individual, but its success depends primarily and almost exclusively upon a unification of the efforts of all toward this common and vital purpose. To attain such unification in a democracy, the military plan must be supported by popular intelligence, confidence, and enthusiasm. It is a special function of Government to see that this confidence is fairly earned and assiduously sustained. To this end let us see to it that every law we pass and every military measure we adopt shall reflect an unselfish and national purpose, that it shall impose injustice on none, and that it shall promote the security and defend the peace, the possessions and the liberty of all."
4,1,"Gentlemen of the National Assembly, the world in which we live today is an entirely different world from that which we knew only a few years ago. Whereas before the World War, democracy was gaining ground everywhere, mankind is now divided into two great camps—those who believe in democracy and those who feel contempt for it as a completely discredited system of government. By our political education, by our convictions and by our inclinations, we are a democracy. We have established a democratic system of government and the perpetuation of this system will depend upon our ability to convince our people that democracy can be freed from those vices which have destroyed it in some countries, and that it can be made as efficient as any other system of government known to man. It behooves us; therefore, to prove that through a wise use of democratic processes, the welfare and the safety of the people can be promoted, thus contributing our share to the preservation of democracy in the world."


In [18]:
merged.speech.str.extractall(r'(.*\bdemokrasya.+)', re.IGNORECASE).head()

index,match,0
40,0,"Nasa harap ng kapulungang ito ngayon ang katipunan ng mga hamon at pagsubok sa nakalipas na mga Kongreso, at ito na sana ang pangwakas na pagsubok kung makakaya natin gamitin ang demokrasya bilang mabisang sangkap ng katatagan at kaunlarang pambansa. Bagaman at kailangan pa ring magpatuloy ang pansamantalang pamahalaan, taglay ng kapulungang ito ang binhi ng matatag at masiglang lehislaturang tutugon sa ating pangangailangan kung ihahandog natin dito ang lahat ng ating talino at kakayahan."
40,1,"Tayo ngayon ay isang bansang pinalakas ng mga pagsubok na ating pinagdaanan, higit na nagkakaisa pagkaraan ng mga sigalutang dinanas, at higit na handa sa anumang uri ng pagsubok at suliranin. Natapos nating lampasan ang mahihigpit na balakid sa nakaraang lima-at-kalahating taon. Sa liwanag ng makabuluhang yugtong ito ng ating buhay bilang bansa at lahi, magagawa natin ang ating tungkuling pagtahak sa landas ng katuparan ng ating matayog na pangarap na pag-unlad, pagkakapantay-pantay, at ng tunay na demokrasya."
51,0,Binigyang buhay ng mga Kabisig nating ito ang diwa ng ating Saligang Batas; binigyang halimbawa nila ang tunay na kahulugan ng demokrasya.
51,1,"May katiyakan ang ating tagumpay kung tayo’y magkakaisa. Kung kaya’t hinihimok ko kayo—kagalang-galang na mga Senador, Kongresista, at ang iba pang mga pinuno ng bayan—na muli tayong manumpa sa pangarap na nagbigkis sa atin noong 1986: ibalik at panatiliin ang demokrasya, kalayaan, karapatan, katatagang pangkabuhayan, at katarungang panlipunan."
65,0,"Pinapangako ko ang isang bagong direksyon: mamamayan muna. Ang taong bayan ang pinakamalaki nating yaman. Ngunit madalas, kaunti lang ang atensyon na binibigay sa kanilang pag-unlad. Di tuloy matawid ang agwat ng mayaman at mahirap. Di tuloy mapa-abot sa lahat ang biyaya ng demokrasya."


## Segregating by president

We create separate data frames for a select number of presidents so that we can analyze their speeches individually.

In [56]:
aquino = merged[(merged['president'] == 'Benigno S. Aquino III')] #Aquino
duterte = merged[(merged['president'] == 'Rodrigo Roa Duterte')] #Duterte
marcos = merged[(merged['president'] == 'Ferdinand E. Marcos')] #Marcos Sr.
erap = merged[(merged['president'] == 'Joseph Ejercito Estrada')] #Erap
marcosjr = merged[(merged['president'] == 'Ferdinand R. Marcos Jr.')] #Marcos Jr.
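
An alternative to writing one filter per president is to let pandas group the rows. The sketch below uses a hypothetical stand-in frame (`toy`) with made-up names; it is only meant to show the pattern, not to replace the cell above.

```python
import pandas as pd

# Hypothetical stand-in for the merged data frame.
toy = pd.DataFrame({
    "president": ["Leader A", "Leader A", "Leader B"],
    "speech": ["first", "second", "third"],
})

# One data frame per president, keyed by the value in the 'president' column.
by_president = {name: group for name, group in toy.groupby("president")}
print(sorted(by_president))
print(len(by_president["Leader A"]))
```

A boolean filter on a misspelled or absent name quietly returns an empty frame, whereas a dictionary lookup raises a `KeyError`, which makes such cases easier to notice.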

## Import text analysis libraries and identify parameters

We will use Python's Natural Language Toolkit, [Scikit-Learn](https://scikit-learn.org/stable/index.html), and the `stopwordsiso` package for stopword lists.

In [166]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
import stopwordsiso as stopwords

In [58]:
def preprocess_text(text):
    text = text.lower()  # lowercase everything
    text = re.sub(r'\d+', '', text)  # remove all numbers
    return text
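
To check what the preprocessor does to a piece of text, it can be called directly. The function is repeated here so the snippet runs on its own, and the sample sentence is hypothetical.

```python
import re


def preprocess_text(text):
    """Lowercase the text and strip every run of digits."""
    text = text.lower()
    return re.sub(r"\d+", "", text)


sample = "In 2022, GDP grew by 7.6 percent."  # hypothetical sentence
cleaned = preprocess_text(sample)
print(cleaned)
```

Punctuation is left in place at this stage. Note also that passing a custom `preprocessor` to scikit-learn's vectorizers overrides their built-in lowercasing step, which is why the function lowercases the text itself.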

In [168]:
y_columns = ['president', 'speeches']
BINARY = False
NGRAM_RANGE = (1, 1)
MIN_DF = 1  # keep every term; newer scikit-learn versions reject an integer below 1
STPWORDS = stopwords.stopwords(["en", "tl"])  # English and Tagalog stopwords
STPWORDS.update(['yung', 'iyan', 'yan', 'diyan', 'applause', 'laughter', 'palakpakan', 'rin', 'din', 'po',
                'pong', 'pang', 'pa', 'nang', 'ng', 'pag',
                'kapag', 'nga'])  # adds Tagalog stopwords and transcript cues not included in the package

vectorizer = CountVectorizer(
    stop_words=list(STPWORDS),  # scikit-learn expects a list here, not a set
    ngram_range=NGRAM_RANGE,
    binary=BINARY,
    min_df=MIN_DF,
    preprocessor=preprocess_text
)

## Vectorizing

This is a simple count of the words that occur in a speech. We use the latest SONA of the **current president, Ferdinand Marcos Jr.**, as an example.

In [169]:
X = vectorizer.fit_transform(marcosjr['speech'])
X



<1x1956 sparse matrix of type '<class 'numpy.int64'>'
	with 1956 stored elements in Compressed Sparse Row format>

In [170]:
marcos_vectors = pd.DataFrame(X.toarray(), columns=vectorizer.get_feature_names_out())
# [print(x) for x in marcosjr.speech]
marcos_vectors.round(2)

Unnamed: 0,aagapay,aalaga,aalis,aaral,aatasan,abandon,ability,abolition,abono,abreast,abundance,academic,accelerated,accelerating,access,accessible,accomplish,accountability,accountable,accurate,acquired,acquisition,activities,activity,actual,adapt,address,addressed,addressing,adjusted,adjustments,administrasyon,administration,admission,adopt,adoption,adopts,advance,advancement,advantage,advertising,affairs,affordability,aforementioned,afternoon,agad,agarang,age,agencies,agency,agenda,aggressive,agile,agrarian,agree,agricultural,agriculture,agrikultura,agrikultural,ahensiya,ahensya,aics,aid,aim,aims,airport,airports,akma,akong,alagaan,alam,alert,alerts,alexander,alleged,alleviate,alliances,allied,allocation,ama,ambassadors,ambiguities,amend,amended,amending,amendments,amortization,anak,anchor,annual,annually,anong,antiquated,apatnapung,apostolic,appropriation,approved,arabia,aralan,araneta,araw,archives,arena,arisen,arising,armed,arroyo,artificial,arts,aspire,assembled,assist,assistance,assumptions,atomic,attached,attain,attainment,attract,attractive,augmented,authority,automate,availability,avenues,average,aversion,awarded,ayaw,ayuda,babagay,babuyan,backbone,bagong,bahagi,bakuna,balance,balansehin,band,banda,bank,bansa,banta,barko,barrel,based,basic,bata,batas,batasang,batid,bayan,bayanihan,begun,belief,belong,beloved,beneficial,beneficiaries,benefit,benefiting,benefits,benipisyaryo,beses,beset,bibilhin,bid,biktima,blood,blueprint,bonds,boost,booster,bot,bottleneck,bottlenecks,brand,branded,breadth,breakthrough,breed,bring,broad,broadband,brought,brown,budget,budgeted,budgeting,buhay,build,building,buildings,bukas,bukod,buksan,buksang,bulto,bumaba,bumalik,burahin,burden,bureau,bureaucracy,bureaus,bus,business,buwan,cabinet,calculated,called,capacity,capita,capital,capitalizing,carbon,care,careful,cash,catch,category,caused,cavite,cbs,cdc,cebu,center,centers,certifications,chain,chains,challenges,change,channel,characterized,charles,cheap,cheapest,cheers,chief,child
,children,cities,citizen,citizens,city,civil,clarification,clarifying,class,classes,classrooms,climate,clinic,close,clouds,coherence,coinciding,cold,collaborate,collaboration,collection,college,collusion,combination,combined,command,commerce,commercial,commission,commit,commitment,committed,commodities,common,communicated,communications,communities,community,commuter,companies,compared,compete,competent,competition,competitive,complementary,complete,completed,complex,compliance,complications,comply,component,composed,comprehensive,computers,computing,concerned,concurrence,concurrent,condition,conditions,condonation,condoned,conduct,confidently,conflict,confronted,congress,connect,connecting,connectivity,consensus,consideration,considered,consistent,consolidation,consumer,consumption,context,continuation,continue,continued,continues,continuous,continuously,contracts,contributor,control,convenient,conventional,converting,cooperate,cooperation,cooperatives,coordinate,coordination,cornerstones,coronavirus,corp,corporate,corporation,corps,cost,counselling,countries,country,countrymen,courses,court,cover,coverage,covid,create,created,creates,creation,creative,creativity,credible,crisis,critical,crucial,crude,cultural,culture,current,custodians,customs,cut,daan,dagat,dagdag,dala,dalawang,dark,data,database,daunting,davao,day,dayuhang,deal,debt,decision,declarations,decongest,deeply,defense,deficit,degrees,delivered,delivery,demand,demonstrated,denr,department,departments,depensa,deploy,deployment,depth,derivations,deserve,desired,destination,determine,determining,develop,developed,developing,development,developments,devices,dict,difficult,digit,digital,digitalization,digitized,dilg,diminish,dinadagdagan,diplomatic,direct,directed,direst,disabilities,disability,disadvantaged,disaster,disbursement,disbursements,discipline,discussion,discussions,disease,disrupted,distinction,distinguished,distressed,distribution,diversifying,divide,doctors,document,doh,doktor,dollar,domestic,
doors,dost,dotr,downstream,dozens,dpwh,drive,drivers,driving,drugs,dswd,dubai,dudulog,dues,duterte,duty,earth,ease,easier,easily,ecological,economic,economy,ecozones,edifice,education,educational,edukasyon,effective,effectiveness,effects,efficiency,efficient,efforts,ejercito,ekonomiya,eksperto,el,electric,electrical,electricity,electronic,element,elevated,emancipate,emerge,emergency,emerges,emerging,empleyado,empleyo,employees,employer,employers,employment,empowerment,enable,enabling,enactment,encourage,encouraged,endure,enemy,energy,enforced,engage,engaged,engineering,english,enhance,enhanced,enhancing,enjoy,ensure,ensuring,entail,entering,enterprises,entire,environment,envoys,epira,equipped,erc,essentials,establish,established,establishes,establishment,esteemed,estrada,events,everyday,evident,examine,examined,examining,exceeds,excel,excellence,excellency,excellent,exchange,executes,executive,exert,existing,expand,expanded,expanding,expansion,expect,expected,expenditure,exploit,explore,exponential,exports,extending,extends,extension,extent,external,extreme,eye,facet,faceted,facilitate,facing,fair,fairer,faith,family,farm,farmers,farming,farms,favored,federal,feeding,feeds,fellow,ferdinand,fiber,field,fighting,filipino,filipinos,filtering,financial,financing,finding,finish,firm,fiscal,focus,follow,food,footprint,forces,forecasts,foreign,foremost,foresee,foster,foundation,fourth,framework,freelancers,fresh,friend,friends,frontliners,fuel,fukushima,functions,fundamental,funding,future,gaanong,gagabayan,gagamitan,gagampanan,gagawa,gagawing,gain,gamot,ganap,ganitong,ganun,garcia,gas,gastusin,gawing,gaya,gayon,gdp,generate,generations,generic,gentlemen,geographically,geothermal,gesmundo,gida,gigawatts,gitna,global,globally,gloria,gni,goal,goals,gobyerno,governance,government,governors,grades,graduate,graduates,grants,grassroots,grateful,gross,ground,grow,growth,guarantee,guests,guide,guiding,halaga,halimbawa,hall,halos,hamong,hanapbuhay,hanay,hand,handbook,hapon,harapin,
hard,hari,harmonized,hayop,headline,heal,health,healthcare,heard,heart,hectares,height,hemb,heritage,highly,highways,higit,hihigit,hinihikayat,hirap,history,hit,hold,holistic,honorable,horizon,horror,hospital,hospitals,hotline,hours,house,household,households,hub,human,husto,hydropower,iaayon,iaea,ibang,ibayong,ibayuhin,ict,ideas,ideklara,identify,identity,idly,ids,iibayuhin,ikinagagalak,ilalaan,ilalapit,ilang,ilocos,ilulunsad,imbakan,impact,imperative,impetus,implement,implementation,implemented,impormasyon,importantly,imported,imports,impose,imposition,improve,improved,improvements,improving,ina,inaabuso,inaapi,inang,inatasan,incentives,inch,include,included,including,income,incorporate,increase,increased,increases,incumbent,independent,individuals,industrial,industries,industry,inequalities,inflation,informed,infrastructure,infusion,initial,initiatives,innovation,innovations,inputs,inspirational,institusyon,institute,instituting,institutional,institutionalize,institutionalized,institutions,instructed,instruction,instrumentalities,insufficient,integrated,integrity,intellectual,intelligence,intend,interactions,interesadong,interim,interior,intermediary,international,internet,intranet,introduced,inutusan,inuutusan,invaluable,inventories,investment,investments,investors,involve,involved,involving,inyo,ipaprayoridad,ironed,isandaang,isinapinal,islands,isolated,issuance,issue,issues,itataas,itatatag,itong,iwrm,jealous,job,jobs,john,joseph,journey,jr,juan,july,jurisdiction,justice,justices,kababayan,kabataang,kabilang,kadiwa,kagawaran,kagawarang,kaguluhan,kagyat,kahalagahan,kailan,kailangang,kakailanganin,kakayanin,kakulangan,kalagayan,kalakal,kalamidad,kaluluwa,kalusugan,kampanya,kanlungan,kapahamakan,kapakanan,kapangyarihan,kapasidad,karahasan,karamdaman,karapat,karapatan,kartel,kasalukuyan,kasalukuyang,kasama,kasanayan,kasiguruhan,katuwang,kaugnay,kayang,key,kidney,kilometer,kinabukasan,kinakailangan,kinalaman,klasipikasyon,knowledge,komprehensibong,kongreso,kooperas
yon,koordinasyon,krisis,kultura,kumpanya,kumpetisyon,kumplikadong,kumpyansa,kundi,kunin,kwalipikadong,ladies,lady,lalawigan,lalo,lalong,land,landless,lands,lang,language,larger,lastly,law,laws,layunin,lead,leaders,leading,learned,led,left,legislation,legislative,legislators,legislature,lengthy,lessons,level,levels,lgu,lgus,liberalization,licensed,lieu,life,limited,lines,linggo,liquidity,lisanin,listahan,literacy,live,lives,loan,loans,local,location,lockdown,louise,lowering,lrt,lubos,lugar,lung,maagang,maalwan,maayos,mababa,mabatid,mabayarang,mabigyan,mabilis,mabubuksang,macapagal,macro,macroeconomic,madali,mag,magagandang,magagawa,magandang,magawa,magbibigay,magbiyahe,magbubukid,magdadagdag,magdagdag,magiging,magkaroon,maglagak,maglalagay,maglalatag,magpapagamot,magpapatuloy,magpatuloy,magsasaka,magsisiguro,magsisilbing,magtatayo,magtutungo,magtuturo,magulang,mahal,mahigpit,mahirap,maibalik,mailakbay,mailapit,mailigtas,main,maintain,maintained,maintaining,maipapasok,maiparating,maisulong,maitutok,maiwasan,makabagong,makakabalik,makakapiling,makakaya,makatiyak,makipag,makipagtulungan,makitang,malalaking,malalayong,malampaya,malayo,malilinis,maliwanag,mamamayan,mamimili,mamumuhunan,manage,management,manalasa,mananatili,mananatiling,mandanas,mandarayuhan,mandate,mandatory,mangangalaga,mangatok,manggagawang,mangingisda,mangunguna,mangyayari,manila,maninigurong,manufacturer,manufacturing,manukan,maospital,mapakinabangan,mapanatili,mapangalagaan,mapauwi,mapupunta,maramdaman,maraming,mararamdaman,marating,marcos,market,marketplace,martin,mas,masa,masasanay,massive,master,masuportahan,masusing,matagalan,matagalang,matapos,matatawag,materials,mathematics,matiyak,matter,matulungan,mawawaldas,maximize,maximizing,maximum,mayayakap,maynila,mayors,measurable,measure,measures,mechanism,media,medical,medicine,medium,medtech,mental,mergers,merkado,messages,metro,middle,midstream,midwife,migrant,miguel,military,milyong,minamahal,mind,mindanao,minds,minor,misinvoicing,missed,mission,m
ister,mitigate,mix,mobilize,mobilized,modern,modernisasyon,modernization,modernizing,modes,modular,momentum,monetary,money,monitor,monthly,months,moratorium,motivate,moves,mrc,mrt,msmes,mtff,muling,multi,multiplier,muna,mundong,mups,murang,mutually,nadidiskubreng,nagagalak,nagbibigay,nagkakasakit,nagsasakripisyong,nagsisimula,nagtatapon,nagtitipid,nahaharap,nahihirapan,nahiwalay,naiipit,naiwan,naka,nakakalimutan,nakakatayo,nakalipas,nakaraang,nakikipag,nakikipagtulungan,namagitan,namamatay,naman,namang,namimili,nanay,nangangailangan,nangangailangang,nanganganib,nangungutang,nano,napakinabangan,nararanasan,nararapat,nariyan,nasa,natin,nating,nation,national,nations,natural,naught,navigate,nawalan,ncr,neda,needless,negosasyon,negosyo,neighbor,neighborhood,network,neutral,ngrp,ngunit,nido,nitong,noong,normal,norte,north,nstp,nuclear,nuncio,nurse,nurses,objectives,obligation,observed,oec,offer,offers,office,officers,offices,offshore,ofw,ofws,oil,ongoing,online,onwards,operasyon,operate,operation,operations,ople,oportunidad,opportunities,optics,optimal,optimize,option,orcc,organization,organizations,organize,orphans,ospital,outcomes,outputs,overseas,owners,owwa,paa,paaralang,pababayaan,pacific,package,packs,pagamutan,pagbabago,pagbili,pagbubuhay,pagbubutihin,paghingi,pagka,pagkain,pagkakaibigan,pagkakilanlan,pagkalinga,paglipas,pagluluwag,pagpapakalat,pagpapalago,pagpaparating,pagpapatupad,pagrepaso,pagsasaliksik,pagsipa,pagsubok,pagsusulong,pagsusuri,pagtatanim,pagtibayin,pagtitibayin,pagtugon,pagtupad,pahina,pakikipag,pakikipagtulungan,palaisdaan,palalawakin,pamahalaan,pamahalaang,pamana,pambansa,pamilihan,pamilya,pamilyang,pamimigay,pamphlet,pamumuno,panay,pandemic,pandemya,pangalan,pangangailangan,pangarap,pangingibang,pangkalahatang,pangkalusugan,pantawid,pantay,papahirapan,papasok,papel,papeles,parents,parte,participate,participation,partners,partnership,partnerships,pasahod,pass,passed,passengers,passive,patakaran,patuloy,pautang,pay,paying,payment,payments,pdp,pe
nsion,people,percent,perform,period,perseverance,person,personal,personnel,persons,peso,pesos,pestisidyo,petsa,pharmaceutical,phenomenon,philippine,philippines,php,physical,physicians,pifita,pilipinas,pilipino,pilipinong,pinadapa,pinagdaraanan,pinakahuli,pinansiyal,pinapag,pitong,pitumpong,plan,planned,planning,planong,plans,planting,plants,platform,play,players,plays,police,policies,policy,policymaking,pondo,pool,poor,portfolio,posibilidad,post,potential,poverty,power,ppp,ppps,practicable,practical,practices,precarious,predecessor,preparation,prepare,preparedness,preparing,preservation,preserve,preside,president,presidents,presyo,prevention,previous,pribadong,price,prices,pride,primary,primordial,principles,printing,priorities,priority,privacy,private,problemang,procedures,process,processes,processing,produce,product,production,productive,productivity,produksyon,produksyong,produkto,produktong,professionals,program,programa,programang,programs,progress,progressing,project,projected,projects,promote,promoted,promotes,promoting,promotion,prone,pronged,pronounced,proof,propel,properly,property,proposals,propose,proseso,protected,protection,protektahan,protocols,provide,providers,provision,provisions,ps,public,punla,pupuntahan,purchasing,purpose,purposes,pursue,push,puwersang,qualified,quality,quantum,quarter,question,quezon,radically,rail,railway,railways,range,rankings,rapid,rapidly,rate,ratio,rational,raw,reach,reaffirm,real,realigned,reality,reasons,receive,received,recorded,records,recovery,red,redesigning,reducing,refers,refine,reform,reforms,refresher,regard,region,regionally,regions,registered,regular,regulasyon,regulation,regulations,rehabilitated,reinstitute,relation,relations,relationships,relevant,reliable,relief,remain,remaining,remains,remote,removed,renewable,renewables,reopen,reorganization,rep,repair,repatriation,represent,representatives,represents,republic,require,requirement,requires,reserve,residential,resilience,resiliency,resolution,resource,reso
urces,respect,respeto,response,responsibility,responsive,rest,restrictions,restructure,result,resumption,retired,retirees,retirement,return,revenue,reverend,review,revisions,revolution,rhu,rich,rights,rightsizing,rise,risk,risks,roa,road,roads,robotics,robust,rodrigo,role,rollout,romualdez,rooted,rotc,ruling,russia,sacrifices,safe,safety,sahod,sakahan,salamat,sapagkat,sapat,sara,sariling,satellite,saudi,scale,scarring,schemes,school,schools,science,scientific,sea,seamless,seamlessly,seaports,search,season,secretary,sector,sectors,secure,secured,security,seek,seeks,seksyon,sektor,senate,send,senior,sense,sentro,separation,series,serve,service,services,seryoso,session,set,settle,settlements,shared,shelter,shocks,shop,shore,short,shortcoming,shots,simple,simpler,simplified,simulan,single,sinimulan,sining,sistema,situation,situations,sitwasyon,siyensya,skills,smartphone,social,society,socio,solar,solid,solo,solusyon,solutions,solvency,soul,sound,sources,south,sovereignty,spark,speaker,speaking,specialty,speed,spending,spirit,splitting,sports,spots,spouse,square,stability,stabilize,stages,stakeholders,stand,standards,station,status,stay,steadily,stem,stored,stories,story,strategic,strategically,strategies,strategy,streamlined,strengthen,strengthened,strong,stronger,struck,structure,students,studies,study,subject,subjects,submarine,submit,submitted,subsidy,subway,suliraning,sumakay,sumisilong,summarize,sunlight,supervised,suplay,supplemental,supplies,supply,support,supported,supportive,supreme,surviving,susan,suspend,sustainability,sustainable,sustained,susunod,susuportahan,systems,taasan,tahanan,talent,talented,talk,talking,tamang,tangi,taon,taong,tape,target,targets,task,tasked,tatawag,tatay,tatlong,taught,taumbayan,tax,taxation,taxes,tayong,teachers,teaching,tech,technological,technologies,technology,teknikal,teknolohiya,telecommunications,temperature,term,terminal,terms,terrestrial,territorial,territory,tertiary,threats,tightening,time,timely,times,tinatawag,tinatawag
an,titiyak,titiyakin,tiyakin,tool,tools,total,tourism,tourist,tower,trabaho,trade,traditional,trafficking,train,training,trainings,transact,transaction,transactions,transaksyong,transfer,transferred,transform,transformation,transformational,transformative,transforming,transit,translate,transmission,transparency,transplant,transport,transportation,transporting,travelers,treasure,trend,tugma,tulong,tuloy,tumaas,tungo,tuntunin,turbines,turbulent,tutulong,ugnayan,ukol,ukraine,unang,unburden,uncertain,uncertainty,undeniable,underpin,undertake,undertaking,undervaluation,undimmed,undiscovered,unfortunate,unified,uniformed,unique,units,universal,universally,unpaid,unprecedented,unti,unused,upang,upcoming,upgrade,uploaded,upper,upstream,uptick,upward,urban,uri,usap,usd,utang,utilize,utilized,utmost,utos,vaccination,vaccine,valuation,values,variable,variables,variants,vehicles,verifiable,verification,veterans,viability,vice,view,violence,vip,virology,virtual,virus,visitors,vital,voluminous,vulnerability,vulnerable,wait,wala,war,warehouse,warehouses,warming,water,waver,wealth,weather,weeks,welcoming,welfare,wellness,wind,windmill,wireless,women,workers,workplace,worse,yields,zimmerman,zubiri
0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,3,1,1,1,7,1,2,2,1,1,4,1,1,2,1,1,1,3,1,1,1,1,1,1,2,2,7,2,1,1,1,11,1,6,4,2,1,4,1,1,1,1,3,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,3,4,1,2,1,1,1,1,1,1,1,2,1,1,3,1,1,1,1,2,1,1,1,1,1,1,4,2,1,2,1,1,1,2,1,1,1,3,1,2,1,2,1,2,1,1,1,1,2,1,1,1,1,2,3,8,2,1,3,2,1,1,1,1,1,3,1,1,2,2,1,2,5,2,1,1,3,2,1,1,1,5,1,1,1,1,3,1,1,1,5,1,1,1,1,3,1,1,3,1,4,1,1,3,5,5,1,1,1,1,1,1,1,1,1,1,2,1,1,2,2,3,1,1,1,6,1,3,1,1,7,1,1,2,1,1,1,2,1,2,7,6,1,2,1,5,7,1,1,1,1,1,4,1,1,6,1,1,2,1,1,1,1,1,3,1,7,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,3,1,1,1,1,2,1,3,1,4,2,2,1,1,1,2,4,1,1,1,1,1,1,1,2,1,3,1,1,2,1,1,1,2,1,1,1,1,2,1,4,2,1,3,1,1,1,1,1,3,1,1,1,11,1,1,1,1,2,1,3,1,1,1,2,1,1,1,2,1,1,1,1,1,3,3,1,1,14,1,1,1,2,1,8,6,1,1,3,4,1,1,2,1,1,1,1,2,2,1,1,1,1,1,1,1,4,1,4,1,1,1,1,1,1,2,1,1,1,1,3,1,1,1,1,1,1,1,22,2,1,1,1,1,1,2,1,1,2,1,3,1,1,18,4,1,2,1,1,9,1,1,1,1,1,2,2,1,1,1,3,1,5,1,1,2,1,1,3,1,1,1,2,3,1,1,1,1,3,1,2,2,1,2,1,1,1,1,1,1,1,1,3,1,1,1,5,1,1,2,1,1,1,13,10,1,1,1,5,1,2,1,4,2,3,3,1,1,1,1,2,2,2,1,1,1,1,1,2,1,1,1,1,1,1,1,7,1,1,2,1,3,1,1,1,17,1,1,1,1,3,2,1,2,1,6,2,1,1,3,2,5,1,1,2,1,1,1,2,1,3,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,3,1,6,2,2,1,2,1,6,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,5,3,1,1,1,1,1,1,1,2,1,1,1,11,2,1,7,1,1,1,3,10,1,1,4,1,4,2,10,1,1,4,1,1,2,1,2,1,2,1,3,1,4,1,1,4,1,1,1,1,2,1,1,5,1,2,1,1,5,1,1,2,3,9,1,1,2,1,2,1,1,1,1,1,8,1,1,1,1,2,2,3,25,1,1,1,2,1,1,1,2,3,3,14,2,1,2,1,1,1,1,1,2,4,1,1,2,1,1,1,1,1,1,2,1,10,2,1,2,2,1,1,1,2,2,3,1,1,1,3,2,3,1,4,1,1,2,2,1,1,4,1,1,1,2,1,1,2,1,11,1,1,1,1,1,1,2,1,3,1,1,1,1,1,1,1,1,2,1,1,2,9,1,1,1,1,2,1,1,8,3,1,3,1,1,1,1,1,2,1,2,1,3,4,1,8,3,1,1,1,1,1,2,9,1,3,1,8,2,1,3,2,2,3,1,1,5,1,1,1,1,2,2,1,1,1,2,1,2,1,2,1,1,1,1,1,6,7,1,1,1,1,1,1,5,4,1,1,1,1,1,1,1,1,1,2,1,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,8,1,3,1,2,1,1,1,1,1,2,1,1,1,1,1,3,1,1,1,1,1,2,1,1,1,1,1,1,1,2,1,2,1,1,1,1,1,5,1,5,1,1,1,1,1,1,1,1,2,2,1,2,2,1,1,2,1,1,1,1,1,7,2,6,4,5,2,2,1,1,10,4,1,2,1,1,1,2,2,1,1,1,1,1,1,4,1,3,2,1,2,1,1,3,2,2,1,1,3,1,2,3,1,1,3,1,2,1,1,2,2,1,1,1,1,3,1,1,1,1,2,1,1,2,4,3,1,1,2,1,
1,1,1,1,1,1,2,2,1,1,1,1,2,1,5,1,1,1,1,1,1,1,2,2,1,1,1,1,1,2,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,2,1,2,4,2,1,1,6,1,1,1,1,2,1,2,1,1,2,1,1,1,4,1,1,2,1,1,1,1,1,1,1,1,1,2,1,2,4,1,1,9,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,8,1,9,1,1,1,3,1,2,3,1,1,5,1,2,1,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,2,1,1,1,1,1,2,1,1,2,3,1,3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,5,1,1,1,1,1,1,1,1,1,1,1,2,3,32,9,3,22,3,6,1,1,1,1,1,1,1,1,1,1,2,1,2,1,1,2,1,1,1,2,1,4,1,1,2,2,1,1,1,1,1,1,1,2,1,3,3,4,2,1,2,1,1,2,3,1,1,4,1,1,1,1,1,1,1,1,1,1,2,1,4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,2,1,4,1,1,1,1,2,1,2,5,1,1,1,1,4,1,1,1,1,1,9,1,1,2,1,1,1,1,1,2,1,1,1,1,1,2,1,1,1,1,3,1,3,1,1,1,1,2,2,1,1,1,1,1,2,6,14,1,1,1,1,1,1,1,2,2,1,1,1,1,8,21,7,2,1,1,3,4,1,1,1,1,1,1,1,1,6,1,3,1,4,1,4,1,4,1,1,1,1,7,1,1,1,1,1,1,4,3,1,18,2,3,1,1,1,1,1,1,1,2,2,2,1,1,6,1,5,3,1,1,2,3,1,3,1,2,1,1,4,1,7,1,1,2,4,1,1,3,4,1,2,3,1,1,2,2,15,1,1,7,1,1,4,4,7,5,2,2,1,1,1,1,1,1,1,1,5,1,1,2,2,1,1,1,9,1,1,1,1,10,1,1,1,1,1,2,1,1,1,2,1,2,3,1,1,1,10,1,2,1,4,1,4,2,1,1,1,1,5,1,1,2,2,1,1,1,6,1,1,1,1,1,15,1,1,1,3,1,1,1,1,1,2,1,1,1,1,1,2,1,2,1,2,1,1,1,1,6,1,1,1,1,1,1,1,3,1,5,2,1,2,4,2,1,1,1,1,8,1,1,1,1,2,3,1,1,1,1,1,1,1,3,3,1,4,1,1,1,1,3,2,1,2,1,1,1,2,1,1,2,2,2,1,1,2,1,2,1,1,2,1,2,1,1,5,2,2,1,2,3,1,1,5,1,3,1,1,1,1,1,1,1,1,9,3,3,1,2,1,12,1,2,3,1,1,2,1,1,3,1,5,9,1,1,1,1,1,1,1,2,1,1,2,1,2,3,1,1,1,1,1,1,1,3,2,2,1,1,1,3,1,1,4,1,1,1,1,1,1,2,5,2,1,1,1,1,1,2,3,1,1,1,2,1,1,2,1,1,2,2,2,1,1,1,1,1,2,2,1,6,1,1,3,1,3,1,6,2,1,1,5,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,2,1,1,9,5,1,1,1,1,1,1,2,1,3,2,1,6,1,1,2,1,1,2,2,1,4,1,1,2,2,1,1,1,1,2,1,1,8,3,1,3,2,1,1,1,6,12,1,1,1,1,12,1,2,1,1,1,1,1,1,7,1,3,2,1,1,3,3,1,2,2,7,1,1,1,2,2,1,1,2,1,1,1,2,1,2,1,2,2,1,1,1,1,1,2,1,1,4,3,1,1,1,1,1,4,2,3,1,1,1,1,1,1,1,2,1,1,1,1,1,1,2,1,1,1,1,1,2,1,1,1,1,1,1,1,2,1,16,1,1,1,1,1,1,2,2,1,2,4,1,1,2,1,1,1,2,2,1,1,1,1,1,1,1,2,1,2,2,1,1,4,1,1,1,3,1,1,3,1,1,2,1,1,1,7,1,1,2,1,1,5,1,3,1,1,1,6,1,1,1,1,1


## Simple word count

In [171]:
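# Show every column when displaying wide data frames.
pd.set_option('display.max_columns', None)
# Turn the one-row count matrix into one row per word.
marcos_vectors = marcos_vectors.transpose()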

In [172]:
# Label the single column, then list the most frequent words.
marcos_vectors.columns = ['SONA1']
marcos_vectors.sort_values('SONA1', ascending=False).head(20)

word,SONA1
natin,32
government,25
department,22
national,22
philippines,21
power,18
development,18
energy,17
upang,16
reform,15
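
The transpose-then-sort step can be tried on a small frame. This is a minimal sketch, not the notebook's own data: the frame `counts` and the names `by_word` and `top` are hypothetical, and only the four counts are copied from the output above.

```python
import pandas as pd

# Toy one-row count matrix. The four counts come from the output above,
# but the frame itself is a hypothetical stand-in for `marcos_vectors`.
counts = pd.DataFrame(
    [[22, 17, 25, 32]],
    columns=["department", "energy", "government", "natin"],
)

# Transpose so that each word becomes a row, then label the single column.
by_word = counts.transpose()
by_word.columns = ["SONA1"]
by_word.index.name = "word"

# Most frequent words first.
top = by_word.sort_values("SONA1", ascending=False).head(3)
print(top)
```

Naming the index (here `word`) gives the first column a readable header when the frame is displayed or exported. Note that `transpose()` swaps rows and columns every time it runs, so re-executing a cell that reassigns the transposed frame to the same name would flip it back.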


## Marcos's first SONA
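
A note on reading the TF-IDF weights in this section: when the vectorizer is fitted on a single speech, every term has a document frequency of 1, so the inverse-document-frequency factor is the same for all terms and the weights end up proportional to the raw counts. The sketch below illustrates this in plain Python. It assumes scikit-learn's documented defaults (`smooth_idf=True`, `norm='l2'`) and non-binary counts; the function `tfidf_single_doc` and the toy tokens are hypothetical and are not part of the notebook.

```python
import math
from collections import Counter


def tfidf_single_doc(tokens):
    """TF-IDF weights for a corpus of exactly one document.

    Assumes the smoothed IDF formula idf = ln((1 + n) / (1 + df)) + 1
    followed by L2 normalisation, as in scikit-learn's defaults.
    """
    counts = Counter(tokens)
    n_docs = 1
    weights = {}
    for term, tf in counts.items():
        df = 1  # every term occurs in the only document
        idf = math.log((1 + n_docs) / (1 + df)) + 1  # always 1.0 here
        weights[term] = tf * idf
    # Scale the vector to unit length.
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {term: w / norm for term, w in weights.items()}


# Toy tokens (hypothetical): three, two and one occurrences.
tokens = ["natin"] * 3 + ["government"] * 2 + ["energy"]
weights = tfidf_single_doc(tokens)
print(weights)
```

Under these assumptions the ranking of terms is the same as in the simple word count, and only the scale differs. The rounded values in the output of this section appear consistent with that reading: a count of 32 maps to 0.25 and a count of 25 maps to 0.2, roughly the same ratio. IDF starts to separate terms only once several speeches are vectorized together.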

In [173]:
# Weight each term by TF-IDF, reusing the stop words, n-gram range and
# preprocessing settings defined earlier in the notebook.
vectorizer = TfidfVectorizer(
    stop_words=STPWORDS,
    ngram_range=NGRAM_RANGE,
    binary=BINARY,
    min_df=MIN_DF,
    preprocessor=preprocess_text
)
X = vectorizer.fit_transform(marcosjr['speech'])
# One row per speech, one column per term.
marcos_idf = pd.DataFrame(X.toarray(), columns=vectorizer.get_feature_names_out())
marcos_idf.round(2)



Unnamed: 0,aagapay,aalaga,aalis,aaral,aatasan,abandon,ability,abolition,abono,abreast,abundance,academic,accelerated,accelerating,access,accessible,accomplish,accountability,accountable,accurate,acquired,acquisition,activities,activity,actual,adapt,address,addressed,addressing,adjusted,adjustments,administrasyon,administration,admission,adopt,adoption,adopts,advance,advancement,advantage,advertising,affairs,affordability,aforementioned,afternoon,agad,agarang,age,agencies,agency,agenda,aggressive,agile,agrarian,agree,agricultural,agriculture,agrikultura,agrikultural,ahensiya,ahensya,aics,aid,aim,aims,airport,airports,akma,akong,alagaan,alam,alert,alerts,alexander,alleged,alleviate,alliances,allied,allocation,ama,ambassadors,ambiguities,amend,amended,amending,amendments,amortization,anak,anchor,annual,annually,anong,antiquated,apatnapung,apostolic,appropriation,approved,arabia,aralan,araneta,araw,archives,arena,arisen,arising,armed,arroyo,artificial,arts,aspire,assembled,assist,assistance,assumptions,atomic,attached,attain,attainment,attract,attractive,augmented,authority,automate,availability,avenues,average,aversion,awarded,ayaw,ayuda,babagay,babuyan,backbone,bagong,bahagi,bakuna,balance,balansehin,band,banda,bank,bansa,banta,barko,barrel,based,basic,bata,batas,batasang,batid,bayan,bayanihan,begun,belief,belong,beloved,beneficial,beneficiaries,benefit,benefiting,benefits,benipisyaryo,beses,beset,bibilhin,bid,biktima,blood,blueprint,bonds,boost,booster,bot,bottleneck,bottlenecks,brand,branded,breadth,breakthrough,breed,bring,broad,broadband,brought,brown,budget,budgeted,budgeting,buhay,build,building,buildings,bukas,bukod,buksan,buksang,bulto,bumaba,bumalik,burahin,burden,bureau,bureaucracy,bureaus,bus,business,buwan,cabinet,calculated,called,capacity,capita,capital,capitalizing,carbon,care,careful,cash,catch,category,caused,cavite,cbs,cdc,cebu,center,centers,certifications,chain,chains,challenges,change,channel,characterized,charles,cheap,cheapest,cheers,chief,child
,children,cities,citizen,citizens,city,civil,clarification,clarifying,class,classes,classrooms,climate,clinic,close,clouds,coherence,coinciding,cold,collaborate,collaboration,collection,college,collusion,combination,combined,command,commerce,commercial,commission,commit,commitment,committed,commodities,common,communicated,communications,communities,community,commuter,companies,compared,compete,competent,competition,competitive,complementary,complete,completed,complex,compliance,complications,comply,component,composed,comprehensive,computers,computing,concerned,concurrence,concurrent,condition,conditions,condonation,condoned,conduct,confidently,conflict,confronted,congress,connect,connecting,connectivity,consensus,consideration,considered,consistent,consolidation,consumer,consumption,context,continuation,continue,continued,continues,continuous,continuously,contracts,contributor,control,convenient,conventional,converting,cooperate,cooperation,cooperatives,coordinate,coordination,cornerstones,coronavirus,corp,corporate,corporation,corps,cost,counselling,countries,country,countrymen,courses,court,cover,coverage,covid,create,created,creates,creation,creative,creativity,credible,crisis,critical,crucial,crude,cultural,culture,current,custodians,customs,cut,daan,dagat,dagdag,dala,dalawang,dark,data,database,daunting,davao,day,dayuhang,deal,debt,decision,declarations,decongest,deeply,defense,deficit,degrees,delivered,delivery,demand,demonstrated,denr,department,departments,depensa,deploy,deployment,depth,derivations,deserve,desired,destination,determine,determining,develop,developed,developing,development,developments,devices,dict,difficult,digit,digital,digitalization,digitized,dilg,diminish,dinadagdagan,diplomatic,direct,directed,direst,disabilities,disability,disadvantaged,disaster,disbursement,disbursements,discipline,discussion,discussions,disease,disrupted,distinction,distinguished,distressed,distribution,diversifying,divide,doctors,document,doh,doktor,dollar,domestic,
doors,dost,dotr,downstream,dozens,dpwh,drive,drivers,driving,drugs,dswd,dubai,dudulog,dues,duterte,duty,earth,ease,easier,easily,ecological,economic,economy,ecozones,edifice,education,educational,edukasyon,effective,effectiveness,effects,efficiency,efficient,efforts,ejercito,ekonomiya,eksperto,el,electric,electrical,electricity,electronic,element,elevated,emancipate,emerge,emergency,emerges,emerging,empleyado,empleyo,employees,employer,employers,employment,empowerment,enable,enabling,enactment,encourage,encouraged,endure,enemy,energy,enforced,engage,engaged,engineering,english,enhance,enhanced,enhancing,enjoy,ensure,ensuring,entail,entering,enterprises,entire,environment,envoys,epira,equipped,erc,essentials,establish,established,establishes,establishment,esteemed,estrada,events,everyday,evident,examine,examined,examining,exceeds,excel,excellence,excellency,excellent,exchange,executes,executive,exert,existing,expand,expanded,expanding,expansion,expect,expected,expenditure,exploit,explore,exponential,exports,extending,extends,extension,extent,external,extreme,eye,facet,faceted,facilitate,facing,fair,fairer,faith,family,farm,farmers,farming,farms,favored,federal,feeding,feeds,fellow,ferdinand,fiber,field,fighting,filipino,filipinos,filtering,financial,financing,finding,finish,firm,fiscal,focus,follow,food,footprint,forces,forecasts,foreign,foremost,foresee,foster,foundation,fourth,framework,freelancers,fresh,friend,friends,frontliners,fuel,fukushima,functions,fundamental,funding,future,gaanong,gagabayan,gagamitan,gagampanan,gagawa,gagawing,gain,gamot,ganap,ganitong,ganun,garcia,gas,gastusin,gawing,gaya,gayon,gdp,generate,generations,generic,gentlemen,geographically,geothermal,gesmundo,gida,gigawatts,gitna,global,globally,gloria,gni,goal,goals,gobyerno,governance,government,governors,grades,graduate,graduates,grants,grassroots,grateful,gross,ground,grow,growth,guarantee,guests,guide,guiding,halaga,halimbawa,hall,halos,hamong,hanapbuhay,hanay,hand,handbook,hapon,harapin,
hard,hari,harmonized,hayop,headline,heal,health,healthcare,heard,heart,hectares,height,hemb,heritage,highly,highways,higit,hihigit,hinihikayat,hirap,history,hit,hold,holistic,honorable,horizon,horror,hospital,hospitals,hotline,hours,house,household,households,hub,human,husto,hydropower,iaayon,iaea,ibang,ibayong,ibayuhin,ict,ideas,ideklara,identify,identity,idly,ids,iibayuhin,ikinagagalak,ilalaan,ilalapit,ilang,ilocos,ilulunsad,imbakan,impact,imperative,impetus,implement,implementation,implemented,impormasyon,importantly,imported,imports,impose,imposition,improve,improved,improvements,improving,ina,inaabuso,inaapi,inang,inatasan,incentives,inch,include,included,including,income,incorporate,increase,increased,increases,incumbent,independent,individuals,industrial,industries,industry,inequalities,inflation,informed,infrastructure,infusion,initial,initiatives,innovation,innovations,inputs,inspirational,institusyon,institute,instituting,institutional,institutionalize,institutionalized,institutions,instructed,instruction,instrumentalities,insufficient,integrated,integrity,intellectual,intelligence,intend,interactions,interesadong,interim,interior,intermediary,international,internet,intranet,introduced,inutusan,inuutusan,invaluable,inventories,investment,investments,investors,involve,involved,involving,inyo,ipaprayoridad,ironed,isandaang,isinapinal,islands,isolated,issuance,issue,issues,itataas,itatatag,itong,iwrm,jealous,job,jobs,john,joseph,journey,jr,juan,july,jurisdiction,justice,justices,kababayan,kabataang,kabilang,kadiwa,kagawaran,kagawarang,kaguluhan,kagyat,kahalagahan,kailan,kailangang,kakailanganin,kakayanin,kakulangan,kalagayan,kalakal,kalamidad,kaluluwa,kalusugan,kampanya,kanlungan,kapahamakan,kapakanan,kapangyarihan,kapasidad,karahasan,karamdaman,karapat,karapatan,kartel,kasalukuyan,kasalukuyang,kasama,kasanayan,kasiguruhan,katuwang,kaugnay,kayang,key,kidney,kilometer,kinabukasan,kinakailangan,kinalaman,klasipikasyon,knowledge,komprehensibong,kongreso,kooperas
yon,koordinasyon,krisis,kultura,kumpanya,kumpetisyon,kumplikadong,kumpyansa,kundi,kunin,kwalipikadong,ladies,lady,lalawigan,lalo,lalong,land,landless,lands,lang,language,larger,lastly,law,laws,layunin,lead,leaders,leading,learned,led,left,legislation,legislative,legislators,legislature,lengthy,lessons,level,levels,lgu,lgus,liberalization,licensed,lieu,life,limited,lines,linggo,liquidity,lisanin,listahan,literacy,live,lives,loan,loans,local,location,lockdown,louise,lowering,lrt,lubos,lugar,lung,maagang,maalwan,maayos,mababa,mabatid,mabayarang,mabigyan,mabilis,mabubuksang,macapagal,macro,macroeconomic,madali,mag,magagandang,magagawa,magandang,magawa,magbibigay,magbiyahe,magbubukid,magdadagdag,magdagdag,magiging,magkaroon,maglagak,maglalagay,maglalatag,magpapagamot,magpapatuloy,magpatuloy,magsasaka,magsisiguro,magsisilbing,magtatayo,magtutungo,magtuturo,magulang,mahal,mahigpit,mahirap,maibalik,mailakbay,mailapit,mailigtas,main,maintain,maintained,maintaining,maipapasok,maiparating,maisulong,maitutok,maiwasan,makabagong,makakabalik,makakapiling,makakaya,makatiyak,makipag,makipagtulungan,makitang,malalaking,malalayong,malampaya,malayo,malilinis,maliwanag,mamamayan,mamimili,mamumuhunan,manage,management,manalasa,mananatili,mananatiling,mandanas,mandarayuhan,mandate,mandatory,mangangalaga,mangatok,manggagawang,mangingisda,mangunguna,mangyayari,manila,maninigurong,manufacturer,manufacturing,manukan,maospital,mapakinabangan,mapanatili,mapangalagaan,mapauwi,mapupunta,maramdaman,maraming,mararamdaman,marating,marcos,market,marketplace,martin,mas,masa,masasanay,massive,master,masuportahan,masusing,matagalan,matagalang,matapos,matatawag,materials,mathematics,matiyak,matter,matulungan,mawawaldas,maximize,maximizing,maximum,mayayakap,maynila,mayors,measurable,measure,measures,mechanism,media,medical,medicine,medium,medtech,mental,mergers,merkado,messages,metro,middle,midstream,midwife,migrant,miguel,military,milyong,minamahal,mind,mindanao,minds,minor,misinvoicing,missed,mission,m
ister,mitigate,mix,mobilize,mobilized,modern,modernisasyon,modernization,modernizing,modes,modular,momentum,monetary,money,monitor,monthly,months,moratorium,motivate,moves,mrc,mrt,msmes,mtff,muling,multi,multiplier,muna,mundong,mups,murang,mutually,nadidiskubreng,nagagalak,nagbibigay,nagkakasakit,nagsasakripisyong,nagsisimula,nagtatapon,nagtitipid,nahaharap,nahihirapan,nahiwalay,naiipit,naiwan,naka,nakakalimutan,nakakatayo,nakalipas,nakaraang,nakikipag,nakikipagtulungan,namagitan,namamatay,naman,namang,namimili,nanay,nangangailangan,nangangailangang,nanganganib,nangungutang,nano,napakinabangan,nararanasan,nararapat,nariyan,nasa,natin,nating,nation,national,nations,natural,naught,navigate,nawalan,ncr,neda,needless,negosasyon,negosyo,neighbor,neighborhood,network,neutral,ngrp,ngunit,nido,nitong,noong,normal,norte,north,nstp,nuclear,nuncio,nurse,nurses,objectives,obligation,observed,oec,offer,offers,office,officers,offices,offshore,ofw,ofws,oil,ongoing,online,onwards,operasyon,operate,operation,operations,ople,oportunidad,opportunities,optics,optimal,optimize,option,orcc,organization,organizations,organize,orphans,ospital,outcomes,outputs,overseas,owners,owwa,paa,paaralang,pababayaan,pacific,package,packs,pagamutan,pagbabago,pagbili,pagbubuhay,pagbubutihin,paghingi,pagka,pagkain,pagkakaibigan,pagkakilanlan,pagkalinga,paglipas,pagluluwag,pagpapakalat,pagpapalago,pagpaparating,pagpapatupad,pagrepaso,pagsasaliksik,pagsipa,pagsubok,pagsusulong,pagsusuri,pagtatanim,pagtibayin,pagtitibayin,pagtugon,pagtupad,pahina,pakikipag,pakikipagtulungan,palaisdaan,palalawakin,pamahalaan,pamahalaang,pamana,pambansa,pamilihan,pamilya,pamilyang,pamimigay,pamphlet,pamumuno,panay,pandemic,pandemya,pangalan,pangangailangan,pangarap,pangingibang,pangkalahatang,pangkalusugan,pantawid,pantay,papahirapan,papasok,papel,papeles,parents,parte,participate,participation,partners,partnership,partnerships,pasahod,pass,passed,passengers,passive,patakaran,patuloy,pautang,pay,paying,payment,payments,pdp,pe
nsion,people,percent,perform,period,perseverance,person,personal,personnel,persons,peso,pesos,pestisidyo,petsa,pharmaceutical,phenomenon,philippine,philippines,php,physical,physicians,pifita,pilipinas,pilipino,pilipinong,pinadapa,pinagdaraanan,pinakahuli,pinansiyal,pinapag,pitong,pitumpong,plan,planned,planning,planong,plans,planting,plants,platform,play,players,plays,police,policies,policy,policymaking,pondo,pool,poor,portfolio,posibilidad,post,potential,poverty,power,ppp,ppps,practicable,practical,practices,precarious,predecessor,preparation,prepare,preparedness,preparing,preservation,preserve,preside,president,presidents,presyo,prevention,previous,pribadong,price,prices,pride,primary,primordial,principles,printing,priorities,priority,privacy,private,problemang,procedures,process,processes,processing,produce,product,production,productive,productivity,produksyon,produksyong,produkto,produktong,professionals,program,programa,programang,programs,progress,progressing,project,projected,projects,promote,promoted,promotes,promoting,promotion,prone,pronged,pronounced,proof,propel,properly,property,proposals,propose,proseso,protected,protection,protektahan,protocols,provide,providers,provision,provisions,ps,public,punla,pupuntahan,purchasing,purpose,purposes,pursue,push,puwersang,qualified,quality,quantum,quarter,question,quezon,radically,rail,railway,railways,range,rankings,rapid,rapidly,rate,ratio,rational,raw,reach,reaffirm,real,realigned,reality,reasons,receive,received,recorded,records,recovery,red,redesigning,reducing,refers,refine,reform,reforms,refresher,regard,region,regionally,regions,registered,regular,regulasyon,regulation,regulations,rehabilitated,reinstitute,relation,relations,relationships,relevant,reliable,relief,remain,remaining,remains,remote,removed,renewable,renewables,reopen,reorganization,rep,repair,repatriation,represent,representatives,represents,republic,require,requirement,requires,reserve,residential,resilience,resiliency,resolution,resource,reso
urces,respect,respeto,response,responsibility,responsive,rest,restrictions,restructure,result,resumption,retired,retirees,retirement,return,revenue,reverend,review,revisions,revolution,rhu,rich,rights,rightsizing,rise,risk,risks,roa,road,roads,robotics,robust,rodrigo,role,rollout,romualdez,rooted,rotc,ruling,russia,sacrifices,safe,safety,sahod,sakahan,salamat,sapagkat,sapat,sara,sariling,satellite,saudi,scale,scarring,schemes,school,schools,science,scientific,sea,seamless,seamlessly,seaports,search,season,secretary,sector,sectors,secure,secured,security,seek,seeks,seksyon,sektor,senate,send,senior,sense,sentro,separation,series,serve,service,services,seryoso,session,set,settle,settlements,shared,shelter,shocks,shop,shore,short,shortcoming,shots,simple,simpler,simplified,simulan,single,sinimulan,sining,sistema,situation,situations,sitwasyon,siyensya,skills,smartphone,social,society,socio,solar,solid,solo,solusyon,solutions,solvency,soul,sound,sources,south,sovereignty,spark,speaker,speaking,specialty,speed,spending,spirit,splitting,sports,spots,spouse,square,stability,stabilize,stages,stakeholders,stand,standards,station,status,stay,steadily,stem,stored,stories,story,strategic,strategically,strategies,strategy,streamlined,strengthen,strengthened,strong,stronger,struck,structure,students,studies,study,subject,subjects,submarine,submit,submitted,subsidy,subway,suliraning,sumakay,sumisilong,summarize,sunlight,supervised,suplay,supplemental,supplies,supply,support,supported,supportive,supreme,surviving,susan,suspend,sustainability,sustainable,sustained,susunod,susuportahan,systems,taasan,tahanan,talent,talented,talk,talking,tamang,tangi,taon,taong,tape,target,targets,task,tasked,tatawag,tatay,tatlong,taught,taumbayan,tax,taxation,taxes,tayong,teachers,teaching,tech,technological,technologies,technology,teknikal,teknolohiya,telecommunications,temperature,term,terminal,terms,terrestrial,territorial,territory,tertiary,threats,tightening,time,timely,times,tinatawag,tinatawag
an,titiyak,titiyakin,tiyakin,tool,tools,total,tourism,tourist,tower,trabaho,trade,traditional,trafficking,train,training,trainings,transact,transaction,transactions,transaksyong,transfer,transferred,transform,transformation,transformational,transformative,transforming,transit,translate,transmission,transparency,transplant,transport,transportation,transporting,travelers,treasure,trend,tugma,tulong,tuloy,tumaas,tungo,tuntunin,turbines,turbulent,tutulong,ugnayan,ukol,ukraine,unang,unburden,uncertain,uncertainty,undeniable,underpin,undertake,undertaking,undervaluation,undimmed,undiscovered,unfortunate,unified,uniformed,unique,units,universal,universally,unpaid,unprecedented,unti,unused,upang,upcoming,upgrade,uploaded,upper,upstream,uptick,upward,urban,uri,usap,usd,utang,utilize,utilized,utmost,utos,vaccination,vaccine,valuation,values,variable,variables,variants,vehicles,verifiable,verification,veterans,viability,vice,view,violence,vip,virology,virtual,virus,visitors,vital,voluminous,vulnerability,vulnerable,wait,wala,war,warehouse,warehouses,warming,water,waver,wealth,weather,weeks,welcoming,welfare,wellness,wind,windmill,wireless,women,workers,workplace,worse,yields,zimmerman,zubiri
0,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.06,0.01,0.02,0.02,0.01,0.01,0.03,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.06,0.02,0.01,0.01,0.01,0.09,0.01,0.05,0.03,0.02,0.01,0.03,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.03,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.02,0.06,0.02,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.02,0.01,0.02,0.04,0.02,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.04,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.04,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.03,0.01,0.01,0.02,0.04,0.04,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.02,0.02,0.01,0.01,0.01,0.05,0.01,0.02,0.01,0.01,0.06,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.02,0.06,0.05,0.01,0.02,0.01,0.04,0.06,0.01,0.01,0.01,0.01,0.01,0.03,0.01,0.01,0.05,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.06,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.03,0.02,0.02,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.03,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.09,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.11,0.01,0.01,0.01,0.02,0.01,0.06,0.05,0.01,0.01,0.02,0.03,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.03,0.01,0.03,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.17,0.02,0.0
1,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.14,0.03,0.01,0.02,0.01,0.01,0.07,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.02,0.01,0.04,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.04,0.01,0.01,0.02,0.01,0.01,0.01,0.1,0.08,0.01,0.01,0.01,0.04,0.01,0.02,0.01,0.03,0.02,0.02,0.02,0.01,0.01,0.01,0.01,0.02,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.06,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.13,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.02,0.01,0.05,0.02,0.01,0.01,0.02,0.02,0.04,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.05,0.02,0.02,0.01,0.02,0.01,0.05,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.04,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.09,0.02,0.01,0.06,0.01,0.01,0.01,0.02,0.08,0.01,0.01,0.03,0.01,0.03,0.02,0.08,0.01,0.01,0.03,0.01,0.01,0.02,0.01,0.02,0.01,0.02,0.01,0.02,0.01,0.03,0.01,0.01,0.03,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.04,0.01,0.02,0.01,0.01,0.04,0.01,0.01,0.02,0.02,0.07,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.06,0.01,0.01,0.01,0.01,0.02,0.02,0.02,0.2,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.02,0.02,0.11,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.03,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.08,0.02,0.01,0.02,0.02,0.01,0.01,0.01,0.02,0.02,0.02,0.01,0.01,0.01,0.02,0.02,0.02,0.01,0.03,0.01,0.01,0.02,0.02,0.01,0.01,0.03,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.09,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.07,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.06,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.02,0.03,0.01,0.06,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.07,0.01,0.02,0.01,0.06,0.02,0.01,0.02,0.02,0.02,0.02,0.01,0.01,
0.04,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.05,0.06,0.01,0.01,0.01,0.01,0.01,0.01,0.04,0.03,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.06,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.04,0.01,0.04,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.02,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.06,0.02,0.05,0.03,0.04,0.02,0.02,0.01,0.01,0.08,0.03,0.01,0.02,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.03,0.01,0.02,0.02,0.01,0.02,0.01,0.01,0.02,0.02,0.02,0.01,0.01,0.02,0.01,0.02,0.02,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.03,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.04,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.03,0.02,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.03,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.03,0.01,0.01,0.07,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.06,0.01,0.07,0.01,0.01,0.01,0.02,0.01,0.02,0.02,0.01,0.01,0.04,0.01,0.02,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.04,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.25,
0.07,0.02,0.17,0.02,0.05,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.03,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.02,0.03,0.02,0.01,0.02,0.01,0.01,0.02,0.02,0.01,0.01,0.03,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.03,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.03,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.04,0.01,0.01,0.01,0.01,0.03,0.01,0.01,0.01,0.01,0.01,0.07,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.05,0.11,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.06,0.17,0.06,0.02,0.01,0.01,0.02,0.03,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.05,0.01,0.02,0.01,0.03,0.01,0.03,0.01,0.03,0.01,0.01,0.01,0.01,0.06,0.01,0.01,0.01,0.01,0.01,0.01,0.03,0.02,0.01,0.14,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.02,0.01,0.01,0.05,0.01,0.04,0.02,0.01,0.01,0.02,0.02,0.01,0.02,0.01,0.02,0.01,0.01,0.03,0.01,0.06,0.01,0.01,0.02,0.03,0.01,0.01,0.02,0.03,0.01,0.02,0.02,0.01,0.01,0.02,0.02,0.12,0.01,0.01,0.06,0.01,0.01,0.03,0.03,0.06,0.04,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.04,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.07,0.01,0.01,0.01,0.01,0.08,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.02,0.02,0.01,0.01,0.01,0.08,0.01,0.02,0.01,0.03,0.01,0.03,0.02,0.01,0.01,0.01,0.01,0.04,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.01,0.12,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.02,0.01,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.04,0.02,0.01,0.02,0.03,0.02,0.01,0.01,0.01,0.01,0.06,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.03,0.01,0.01,0.01,0.01,0.02,0.02,0.01,
0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.02,0.02,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.04,0.02,0.02,0.01,0.02,0.02,0.01,0.01,0.04,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.07,0.02,0.02,0.01,0.02,0.01,0.1,0.01,0.02,0.02,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.04,0.07,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.03,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.04,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.02,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.05,0.01,0.01,0.02,0.01,0.02,0.01,0.05,0.02,0.01,0.01,0.04,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.01,0.07,0.04,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.02,0.01,0.05,0.01,0.01,0.02,0.01,0.01,0.02,0.02,0.01,0.03,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.06,0.02,0.01,0.02,0.02,0.01,0.01,0.01,0.05,0.1,0.01,0.01,0.01,0.01,0.1,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.06,0.01,0.02,0.02,0.01,0.01,0.02,0.02,0.01,0.02,0.02,0.06,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.01,0.02,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.03,0.02,0.01,0.01,0.01,0.01,0.01,0.03,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.13,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.02,0.01,0.02,0.03,0.01,0.01,0.02,0.01,0.01,0.01,0.02,0.02,0.01,0.01,0.01,0.01,0.01,0.01,0.01,0.02,0.01,0.02,0.02,0.01,0.01,0.03,0.01,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.02,0.01,0.01,0.01,0.06,0.01,0.01,0.02,0.01,0.01,0.04,0.01,0.02,0.01,0.01,0.01,0.05,0.01,0.01,0.01,0.01,0.01


In [174]:
# reshape from wide (one column per term) to long (one row per SONA-term pair)
marcos_idf = marcos_idf.stack().reset_index()
marcos_idf

Unnamed: 0,level_0,level_1,0
0,0,aagapay,0.007936
1,0,aalaga,0.007936
2,0,aalis,0.007936
3,0,aaral,0.007936
4,0,aatasan,0.007936
...,...,...,...
1951,0,workplace,0.007936
1952,0,worse,0.007936
1953,0,yields,0.007936
1954,0,zimmerman,0.007936
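
To make the reshaping step concrete, here is a minimal sketch on a toy matrix. The terms and scores in `toy_wide` are made up for illustration and are not taken from the SONA data. `stack()` moves the column labels into an inner index level, and `reset_index()` then exposes both index levels as the `level_0` and `level_1` columns seen in the output above.

```python
import pandas as pd

# Toy wide matrix: one row per document, one column per term (made-up values)
toy_wide = pd.DataFrame(
    [[0.5, 0.1, 0.4]],
    columns=["bayan", "economy", "natin"],
)

# stack() pivots the term columns into an inner index level;
# reset_index() turns both index levels into ordinary columns
toy_long = toy_wide.stack().reset_index()
print(toy_long)
```

The unnamed value column comes out labelled `0`, which is why a rename is needed before the scores can be referred to by a readable name.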


In [175]:
marcos_idf = marcos_idf.rename(columns={'level_0': 'sona_no', 'level_1': 'term', 0: 'tfidf'})
marcos_idf.head()

Unnamed: 0,sona_no,term,tfidf
0,0,aagapay,0.007936
1,0,aalaga,0.007936
2,0,aalis,0.007936
3,0,aaral,0.007936
4,0,aatasan,0.007936


In [176]:
# keep the 10 highest-scoring terms for each SONA
marcos_firstsona = marcos_idf.sort_values(by=['sona_no','tfidf'], ascending=[True,False]).groupby(['sona_no']).head(10)
marcos_firstsona.head()

Unnamed: 0,sona_no,term,tfidf
1199,0,natin,0.25396
666,0,government,0.198406
397,0,department,0.174598
1202,0,national,0.174598
1369,0,philippines,0.166661
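
The sort-then-`groupby().head()` pattern used above is a common way to take the top N rows per group. A minimal sketch on made-up data (the `toy` frame and its values are hypothetical) shows how it behaves when there is more than one speech:

```python
import pandas as pd

# Hypothetical scores for two speeches and three terms each
toy = pd.DataFrame({
    "sona_no": [0, 0, 0, 1, 1, 1],
    "term": ["a", "b", "c", "a", "b", "c"],
    "tfidf": [0.1, 0.3, 0.2, 0.5, 0.4, 0.6],
})

# Sort by speech, then by score from highest to lowest,
# and keep the first two rows of each speech
top2 = (
    toy.sort_values(by=["sona_no", "tfidf"], ascending=[True, False])
       .groupby("sona_no")
       .head(2)
)
print(top2)
```

Because `head()` keeps rows in the order they appear after sorting, each group's rows come out already ranked from highest to lowest score.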


## Most relevant words from Marcos Jr.'s first SONA

In [177]:
# adding a little randomness to break ties in term ranking
marcos_firstsona_plusRand = marcos_firstsona.copy()
marcos_firstsona_plusRand['tfidf'] = marcos_firstsona_plusRand['tfidf'] + np.random.rand(marcos_firstsona.shape[0])*0.0001

# base for all visualizations, with rank calculation
base = alt.Chart(marcos_firstsona_plusRand).encode(
    x = 'rank:O',
    y = 'sona_no:N'
).transform_window(
    rank = "rank()",
    sort = [alt.SortField("tfidf", order="descending")],
    groupby = ["sona_no"],
)

# heatmap specification
heatmap = base.mark_rect().encode(
    color = 'tfidf:Q'
)

# text labels, white for darker heatmap colors
text = base.mark_text(baseline='middle').encode(
    text = 'term:N',
    color = alt.condition(alt.datum.tfidf >= 0.23, alt.value('white'), alt.value('black'))
)

# display the heatmap with the term labels layered on top
(heatmap + text).properties(width = 600, height=400)
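
The reason for the small random offset can be checked outside the chart. The window transform's `rank()` gives tied rows the same rank, so two terms with identical scores would be drawn in the same cell. The sketch below reproduces that ranking in pandas with `rank(method="min")` as a rough stand-in for the chart's window transform; the `toy` frame, its values, and the seeded generator are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up scores with a deliberate tie between "b" and "c"
toy = pd.DataFrame({
    "sona_no": [0, 0, 0, 0],
    "term": ["a", "b", "c", "d"],
    "tfidf": [0.3, 0.2, 0.2, 0.1],
})

# Rank within each speech, highest score first; tied rows share a rank
toy["rank_tied"] = (
    toy.groupby("sona_no")["tfidf"]
       .rank(method="min", ascending=False)
       .astype(int)
)

# Add a tiny offset, far smaller than the gaps between distinct scores
rng = np.random.default_rng(0)
jittered = toy["tfidf"] + rng.random(len(toy)) * 0.0001

# Rank again on the jittered scores; every term now gets its own rank
toy["rank_jittered"] = (
    jittered.groupby(toy["sona_no"])
            .rank(method="min", ascending=False)
            .astype(int)
)
print(toy)
```

The offset only decides the order among terms whose scores were equal, so the ordering of terms with genuinely different scores is left unchanged.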