# User keyword query

    Input 1: user query keywords
    Input 2: condition, either "and" or "or"

    Output: frequency and occurrence of the keywords
    (1) Frequency: how many times are the keywords mentioned?
    (2) Occurrence: how many pieces of news contain (mention) the keywords?

    First article:  ['肺炎', '疫情', '肺炎']
    Second article: ['陳時中', '指揮中心', '肺炎', '陳時中']

    肺炎:
    (1) '肺炎' is mentioned three times.  ==> frequency is 3
    (2) Two pieces of news mention '肺炎'. ==> occurrence is 2

    陳時中:
    (1) '陳時中' is mentioned two times.  ==> frequency is 2
    (2) One piece of news mentions '陳時中'. ==> occurrence is 1
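
The two counts can be reproduced in a few lines of Python. This is a minimal sketch on the two toy articles above; the function name `frequency_and_occurrence` is made up for illustration and is not part of the notebook.

```python
# The two toy articles from the example, each already split into tokens.
articles = [
    ['肺炎', '疫情', '肺炎'],
    ['陳時中', '指揮中心', '肺炎', '陳時中'],
]

def frequency_and_occurrence(keyword, articles):
    """Hypothetical helper: return (frequency, occurrence) for one keyword."""
    # frequency: total number of mentions across all articles
    frequency = sum(tokens.count(keyword) for tokens in articles)
    # occurrence: number of articles that mention the keyword at least once
    occurrence = sum(1 for tokens in articles if keyword in tokens)
    return frequency, occurrence

print(frequency_and_occurrence('肺炎', articles))    # (3, 2)
print(frequency_and_occurrence('陳時中', articles))  # (2, 1)
```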
    


# Step 0: Load preprocessed news dataset

In [1]:
import pandas as pd
from datetime import datetime, timedelta

In [2]:
df = pd.read_csv('./cna_news_preprocessed.csv',sep='|')

In [3]:
df.head(1)

Unnamed: 0,item_id,date,category,title,content,sentiment,summary,top_key_freq,tokens,tokens_v2,entities,token_pos,link,photo_link
0,aipl_20220314_1,2022-03-14,政治,外交部援烏物資已募4000箱 吳釗燮感謝捐贈民眾,民眾捐贈烏克蘭的愛心物資持續湧入外交部，截至今天傍晚累計已收到約4000箱，外交部長吳釗燮中...,0.01,"['外交部除感謝熱心民眾踴躍捐贈援助烏克蘭人道物資外', '親赴外交部捐贈物資的民眾約173...","[('外交部', 14), ('民眾', 7), ('物資', 7), ('烏克蘭', 5)...","['民眾', '捐贈', '烏克蘭', '的', '愛心', '物資', '持續', '湧入...","['民眾', '烏克蘭', '愛心', '物資', '外交部', '收到', '外交部長',...","[NerToken(word='烏克蘭', ner='GPE', idx=(4, 7)), ...","[('民眾', 'Na'), ('捐贈', 'VD'), ('烏克蘭', 'Nc'), ('...",https://www.cna.com.tw/news/aipl/202203140364....,https://imgcdn.cna.com.tw/www/WebPhotos/200/20...


# Step 1: Select news with user input keywords, category, duration

## (1) Improved search on the "content" column

In [4]:
from datetime import datetime, timedelta
# Search for keywords in the "content" column.
# This function matches substrings of df.content; the tokenized df.tokens_v2 column is not used here.
def filter_dataFrame(user_keywords, cond, cate, weeks):

    # end date: the date of the latest record of news
    end_date = df.date.max()
    
    # start date
    start_date = (datetime.strptime(end_date, '%Y-%m-%d').date() - timedelta(weeks=weeks)).strftime('%Y-%m-%d')

    # (1) Filter by time period
    period_condition = (df.date >= start_date) & (df.date <= end_date)  # boolean Series: [True, False, ...]
    
    # (2) Filter by news category
    if (cate == "全部"):
        condition = period_condition  # "全部" (all categories): no category filter needed
    else:
        # keep only the requested category
        condition = period_condition & (df.category == cate)  # boolean Series: [False, False, ...]

    # (3) Filter by keywords, combined with "and" or "or"
    if (cond == 'and'):
        # every user keyword must appear in the content
        condition = condition & df.content.apply(lambda text: all((qk in text) for qk in user_keywords))
    elif (cond == 'or'):
        # at least one user keyword must appear in the content
        condition = condition & df.content.apply(lambda text: any((qk in text) for qk in user_keywords))
    # condition is a boolean Series (True/False per row) used to select rows
    df_query = df[condition]
    df_query = df[condition]

    return df_query
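
The three filters are ordinary boolean Series combined with `&`. The sketch below replays the same steps on a made-up three-row table (`toy` and its contents are assumptions for illustration, not rows from the dataset). It also shows why comparing dates as strings works here: `YYYY-MM-DD` strings sort in the same order as the dates they represent.

```python
from datetime import datetime, timedelta
import pandas as pd

# Hypothetical stand-in for the news table.
toy = pd.DataFrame({
    'date': ['2022-03-14', '2022-03-01', '2022-01-10'],
    'category': ['國際', '政治', '國際'],
    'content': ['俄羅斯入侵烏克蘭', '烏克蘭物資', '俄羅斯與烏克蘭談判'],
})

# Time window: the latest date in the table and the date four weeks earlier.
end_date = toy.date.max()
start_date = (datetime.strptime(end_date, '%Y-%m-%d').date()
              - timedelta(weeks=4)).strftime('%Y-%m-%d')

in_period = (toy.date >= start_date) & (toy.date <= end_date)
in_category = toy.category == '國際'
has_all = toy.content.apply(
    lambda text: all(qk in text for qk in ['烏克蘭', '俄羅斯']))

# A row is kept only when all three conditions are True for it.
mask = in_period & in_category & has_all
result = toy[mask]

print(start_date)     # 2022-02-14
print(mask.tolist())  # [True, False, False]
```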


## (2) Search keywords from the "content" column (another way)

In [5]:
# Search for keywords in the "content" column.
# Same filtering as filter_dataFrame() above, written as four explicit branches.
def filter_dataFrame_v0(user_keywords, cond, cate, weeks):

    # end date: the date of the latest record of news
    end_date = df.date.max()
    
    # start date
    start_date = (datetime.strptime(end_date, '%Y-%m-%d').date() - timedelta(weeks=weeks)).strftime('%Y-%m-%d')

    # proceed filtering
    # (plain Python booleans are combined with `and`; `&` is kept for the pandas Series)
    if (cate == "全部") and (cond == 'and'):
        df_query = df[(df.date >= start_date) & (df.date <= end_date) 
            & df.content.apply(lambda text: all((qk in text) for qk in user_keywords))]
    elif (cate == "全部") and (cond == 'or'):
        df_query = df[(df.date >= start_date) & (df.date <= end_date) 
            & df.content.apply(lambda text: any((qk in text) for qk in user_keywords))]
    elif (cond == 'and'):
        df_query = df[(df.category == cate) 
            & (df.date >= start_date) & (df.date <= end_date) 
            & df.content.apply(lambda text: all((qk in text) for qk in user_keywords))]
    elif (cond == 'or'):
        df_query = df[(df.category == cate) 
            & (df.date >= start_date) & (df.date <= end_date) 
            & df.content.apply(lambda text: any((qk in text) for qk in user_keywords))]

    return df_query
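
Both versions assume that `cond` is exactly `'and'` or `'or'`. With any other value, `filter_dataFrame` skips the keyword filter silently, and `filter_dataFrame_v0` never assigns `df_query`. One possible guard is sketched below; `pick_matcher` is a hypothetical helper, not part of the notebook, that maps the condition to `all` or `any` and rejects everything else.

```python
def pick_matcher(cond):
    """Hypothetical helper: map 'and' / 'or' to all / any, reject other values."""
    matchers = {'and': all, 'or': any}
    if cond not in matchers:
        raise ValueError(f"cond must be 'and' or 'or', got {cond!r}")
    return matchers[cond]

text = '俄羅斯入侵烏克蘭'
keywords = ['烏克蘭', '台灣']

# The chosen built-in is applied to one True/False value per keyword.
print(pick_matcher('and')(qk in text for qk in keywords))  # False
print(pick_matcher('or')(qk in text for qk in keywords))   # True
```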

In [6]:
user_keywords=['烏克蘭','俄羅斯']
cond='and'
cate='全部'
weeks=4
df_query = filter_dataFrame(user_keywords, cond, cate, weeks)
len(df_query)

17

In [7]:
user_keywords=['烏克蘭','俄羅斯']
cond='or'
cate='全部'
weeks=4
df_query = filter_dataFrame(user_keywords, cond, cate, weeks)
len(df_query)

26

In [8]:
user_keywords=['烏克蘭','俄羅斯']
cond='and'
cate='國際'
weeks=4
df_query = filter_dataFrame(user_keywords, cond, cate, weeks)
len(df_query)

8

In [9]:
df_query.head()

Unnamed: 0,item_id,date,category,title,content,sentiment,summary,top_key_freq,tokens,tokens_v2,entities,token_pos,link,photo_link
140,aopl_20220314_1,2022-03-14,國際,澳洲擴大制裁俄羅斯 歐盟要凍結切爾西老闆資產,（中央社雪梨/布魯塞爾14日綜合外電報導）澳洲為懲罰俄羅斯侵略烏克蘭，今天再將33名俄國寡頭...,0.0,"['將阿布拉莫維奇加入制裁的俄羅斯富豪名單', '而歐盟各國代表將在今天開會通過這項措施',...","[('制裁', 8), ('俄羅斯', 7), ('俄國', 7), ('歐盟', 5), ...","['（', '中央社', '雪梨', '/', '布魯塞爾', '14日', '綜合', '...","['中央社', '雪梨', '布魯塞爾', '綜合', '外電', '澳洲', '懲罰', ...","[NerToken(word='中央社', ner='ORG', idx=(1, 4)), ...","[('（', 'PARENTHESISCATEGORY'), ('中央社', 'Nc'), ...",https://www.cna.com.tw/news/aopl/202203140376....,
141,aopl_20220314_2,2022-03-14,國際,俄羅斯提核協議新要求 伊朗外長將赴莫斯科討論,關於伊朗核子協議的談判，數天前因為俄羅斯提出新要求而觸礁，伊朗外交部長阿布杜拉希安15日將赴...,0.0,['他已與俄羅斯外長拉夫羅夫（Sergei Lavrov）討論了正在奧地利維也納進行的伊朗核...,"[('伊朗', 11), ('協議', 9), ('談判', 6), ('外長', 6), ...","['關於', '伊朗', '核子', '協議', '的', '談判', '，', '數', ...","['伊朗', '核子', '協議', '談判', '俄羅斯', '提出', '要求', '伊...","[NerToken(word='伊朗', ner='GPE', idx=(2, 4)), N...","[('關於', 'P'), ('伊朗', 'Nc'), ('核子', 'Na'), ('協議...",https://www.cna.com.tw/news/aopl/202203140361....,
142,aopl_20220314_3,2022-03-14,國際,2022酷寒演習展開 3萬北約兵力集結挪威,北大西洋公約組織（NATO）及夥伴國約3萬名軍人今天於挪威展開大規模演習，與此同時，西方與俄...,0.93,"['俄國駐挪威大使館上週告訴法新社：「北約在俄羅斯邊境附近的任何軍力集結', '負責「酷寒演...","[('挪威', 7), ('俄羅斯', 4), ('北約', 4), ('邊境', 3), ...","['北大西洋', '公約', '組織', '（', 'NATO', '）', '及', '夥...","['北大西洋', '公約', '組織', '夥伴國', '軍人', '挪威', '展開', ...","[NerToken(word='NATO', ner='ORG', idx=(9, 13))...","[('北大西洋', 'Nc'), ('公約', 'Na'), ('組織', 'Na'), (...",https://www.cna.com.tw/news/aopl/202203140357....,https://imgcdn.cna.com.tw/www/webphotos/WebCov...
143,aopl_20220314_4,2022-03-14,國際,俄國遭制裁降價求售石油和商品 印度考慮採購,俄羅斯入侵烏克蘭遭西方制裁，莫斯科亟欲尋找出口管道。印度官員表示，印度考慮莫斯科提出的打折價...,0.0,"['印度1名官員表示：「俄國提出的原油和商品價格折扣大', '這名要求匿名的官員並未透露俄國...","[('印度', 13), ('俄國', 9), ('官員', 7), ('原油', 7), ...","['俄羅斯', '入侵', '烏克蘭', '遭', '西方', '制裁', '，', '莫斯...","['俄羅斯', '烏克蘭', '制裁', '莫斯科', '尋找', '出口', '管道', ...","[NerToken(word='俄羅斯', ner='GPE', idx=(0, 3)), ...","[('俄羅斯', 'Nc'), ('入侵', 'VCL'), ('烏克蘭', 'Nc'), ...",https://www.cna.com.tw/news/aopl/202203140348....,
144,aopl_20220314_5,2022-03-14,國際,烏克蘭戰事中國疫情添不安 亞股多數收黑,投資人持續追蹤烏克蘭戰事發展及為化解危機所做的最新外交努力，同時焦慮等待美國央行本週可能宣布...,0.0,"['由於美國股市上週收盤再度挫跌', '香港股市今天收盤重挫達5.0%', '尤其全球最大經...","[('股市', 10), ('美國', 7), ('烏克蘭', 3), ('重挫', 3),...","['投資人', '持續', '追蹤', '烏克蘭', '戰事', '發展', '及', '為...","['投資人', '追蹤', '烏克蘭', '戰事', '發展', '化解', '危機', '...","[NerToken(word='美國央行', ner='ORG', idx=(36, 40)...","[('投資人', 'Na'), ('持續', 'VL'), ('追蹤', 'VC'), ('...",https://www.cna.com.tw/news/aopl/202203140338....,


# Step 2: calculate frequency and occurrence

* Occurrence: how many pieces of news mention the keyword(s)?

* Frequency: how many times are the keyword(s) mentioned?

In [10]:
import re

In [11]:
# For df_query, count the occurrence and frequency for each category.

# (1) cate_occurrence: how many pieces of news contain the keywords
# (2) cate_freq:       how many times the keywords are mentioned

news_categories = ['政治', '科技', '運動', '證卷', '產經', '娛樂', '生活', '國際', '社會', '文化', '兩岸', '全部']

def count_keyword(df_query, query_keywords):

    cate_occurrence = {}
    cate_freq = {}
    
    # initialize both dictionaries with zero counts
    for cate in news_categories:
        cate_occurrence[cate] = 0   # {'政治': 0, '科技': 0, ...}
        cate_freq[cate] = 0         # {'政治': 0, '科技': 0, ...}
        

    for idx, row in df_query.iterrows():
        # count the number of articles per category
        cate_occurrence[row.category] += 1
        cate_occurrence['全部'] += 1
        
        # count the keyword frequency per category:
        # how many times the keywords appear in this article's content
        # (re.escape treats each keyword as literal text; re.I ignores letter case)
        freq = sum(len(re.findall(re.escape(keyword), row.content, re.I)) for keyword in query_keywords)
        cate_freq[row.category] += freq  # accumulate within the article's category
        cate_freq['全部'] += freq         # accumulate within "全部" (all categories)

    return cate_freq, cate_occurrence
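
`re.findall` reads its first argument as a regular expression, so a keyword typed by a user is safest when passed through `re.escape` first. The sketch below uses a made-up sentence to show what can go wrong without it: a dot matches any character, and an unbalanced parenthesis is not a valid pattern at all.

```python
import re

# Made-up sentence for illustration.
text = '升息1.5碼，現場共105人'

# Unescaped: '.' matches any character, so '105' is counted as well.
raw = re.findall('1.5', text)
# Escaped: only the literal text '1.5' is matched.
escaped = re.findall(re.escape('1.5'), text)

print(raw)      # ['1.5', '105']
print(escaped)  # ['1.5']

# An unbalanced '(' is rejected as a pattern unless it is escaped.
try:
    re.findall('(中央社', text)
    pattern_ok = True
except re.error:
    pattern_ok = False

print(pattern_ok)                               # False
print(re.findall(re.escape('(中央社'), text))   # []
```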


In [12]:
user_keywords=['烏克蘭','俄羅斯']
cond='or'
cate='全部'
weeks=4
# Step 1: filtering data
df_query = filter_dataFrame(user_keywords, cond, cate, weeks)
len(df_query)

# Step 2: calculating frequency and occurrence
count_keyword(df_query, user_keywords)

({'政治': 13,
  '科技': 21,
  '運動': 4,
  '證卷': 2,
  '產經': 15,
  '娛樂': 5,
  '生活': 0,
  '國際': 47,
  '社會': 0,
  '文化': 14,
  '兩岸': 24,
  '全部': 145},
 {'政治': 3,
  '科技': 2,
  '運動': 1,
  '證卷': 2,
  '產經': 3,
  '娛樂': 2,
  '生活': 0,
  '國際': 8,
  '社會': 0,
  '文化': 3,
  '兩岸': 2,
  '全部': 26})

In [13]:
user_keywords=['柯文哲','阿伯',"柯P","柯p"]
cond='or'
cate='全部'
weeks=4
# Step 1: filtering data
df_query = filter_dataFrame(user_keywords, cond, cate, weeks)
len(df_query)

# Step 2: calculating frequency and occurrence
count_keyword(df_query, user_keywords)

({'政治': 3,
  '科技': 1,
  '運動': 0,
  '證卷': 0,
  '產經': 0,
  '娛樂': 0,
  '生活': 1,
  '國際': 0,
  '社會': 0,
  '文化': 0,
  '兩岸': 0,
  '全部': 5},
 {'政治': 1,
  '科技': 1,
  '運動': 0,
  '證卷': 0,
  '產經': 0,
  '娛樂': 0,
  '生活': 1,
  '國際': 0,
  '社會': 0,
  '文化': 0,
  '兩岸': 0,
  '全部': 3})

In [14]:
len(df_query)

3
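
One thing to keep in mind with a keyword list such as the one above: "柯P" and "柯p" differ only by letter case, and a case-insensitive search lets each variant match the same text, so a single mention may be counted once per variant. The sketch below illustrates the effect on a made-up sentence (not taken from the dataset) and shows one possible remedy, which is to drop case-duplicates before counting.

```python
import re

# Made-up sentence containing one upper-case and one lower-case spelling.
text = '柯P今天表示，柯p的政策不變。'
keywords = ['柯P', '柯p']

# With re.I each pattern matches both spellings, so 2 mentions are counted 4 times.
naive = sum(len(re.findall(re.escape(k), text, re.I)) for k in keywords)

# Assumed remedy: keep one keyword per case-insensitive spelling.
unique_keywords = {k.casefold() for k in keywords}
deduped = sum(len(re.findall(re.escape(k), text, re.I)) for k in unique_keywords)

print(naive)    # 4
print(deduped)  # 2
```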

# Demonstrate step by step

How many pieces of news are related to "烏克蘭"?

    How many pieces of news mention "烏克蘭"?
    How many pieces of news are related to "烏克蘭"?

    You can compute the answer from any of the following fields: tokens, tokens_v2, or content. The results are very similar.

    The tokens_v2 field contains only the important keywords selected in the preprocessing step, while the filter functions above search the content field.

    
A flexible approach for AND / OR conditions

        Use all() and any()
        df = pd.DataFrame({'col': ["apple is delicious",
                                "banana is delicious",
                                "apple and banana both are delicious"]})

        targets = ['apple', 'banana']

        # At least one word from `targets` is present in the sentence.
        >>> df.col.apply(lambda sentence: any(word in sentence for word in targets))
        0    True
        1    True
        2    True
        Name: col, dtype: bool

        # All words from `targets` are present in sentence.
        >>> df.col.apply(lambda sentence: all(word in sentence for word in targets))
        0    False
        1    False
        2     True
        Name: col, dtype: bool

## re.findall

Find all occurrences of a given string in an article.

re.I makes the match case-insensitive for English letters.

In [15]:
content = "民眾捐贈烏克蘭的愛心物資持續湧入外交部，截至今天傍晚累計已收到約4000箱，外交部長吳釗燮中午表示，這些物資會在明天運送到烏克蘭，並感謝民眾的愛心。"
keyword = '烏克蘭' 

re.findall(keyword, content, re.I)

['烏克蘭', '烏克蘭']

In [16]:
keyword = '吳釗燮' 

re.findall(keyword, content, re.I)

['吳釗燮']

In [17]:
keyword = '外交部' 

re.findall(keyword, content, re.I)

['外交部', '外交部']

In [18]:
keyword = '外交部長吳釗燮' 

re.findall(keyword, content, re.I)

['外交部長吳釗燮']

In [19]:
keyword = '俄羅斯' 

re.findall(keyword, content, re.I)

[]

In [20]:
content = "This is a test content with some keywords like Ukraine and Russia."
keyword = 'Ukraine' 

re.findall(keyword, content, re.I)

['Ukraine']

In [21]:
content = "This is a test content with some keywords like Ukraine and Russia."
keyword = 'ukraine' 

re.findall(keyword, content, re.I)

['Ukraine']
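
Since `re.findall` returns a list of matches, the frequency of a keyword is simply the length of that list. The sketch below, on a made-up English sentence, also compares it with `str.count`, which gives the same number only when letter case does not matter.

```python
import re

# Made-up sentence with two spellings of the same word.
text = 'Ukraine thanked donors. Aid reached ukraine today.'
keyword = 'ukraine'

matches = re.findall(keyword, text, re.I)   # case-insensitive search
frequency = len(matches)                    # number of mentions
exact_only = text.count(keyword)            # str.count is case-sensitive

print(matches)     # ['Ukraine', 'ukraine']
print(frequency)   # 2
print(exact_only)  # 1
```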

## Logical and (&), logical or (|)

In [22]:
True & True

True

In [23]:
True & False

False

In [24]:
True | True

True

In [25]:
True | False

True
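
The reason the filter functions use `&` and `|` rather than the keywords `and` and `or` is that the operators work element by element on a pandas Series, whereas `and` asks for a single truth value and fails on a Series. A minimal sketch with two made-up Series:

```python
import pandas as pd

a = pd.Series([True, True, False])
b = pd.Series([True, False, False])

both = a & b     # element-wise "and"
either = a | b   # element-wise "or"

# The `and` keyword needs one True/False for the whole Series, which is ambiguous.
try:
    a and b
    and_works = True
except ValueError:
    and_works = False

print(both.tolist())    # [True, False, False]
print(either.tolist())  # [True, True, False]
print(and_works)        # False
```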

## "in" is very powerful in Python!

#### in a string

In [26]:
text = '武漢烏克蘭疫情全球延燒，國防部2月針對29個疫情高風險國家地區勸阻官兵前往（包括過境）。'

In [27]:
'勸阻官兵' in text

True

In [28]:
'延燒，國防部' in text

True

In [29]:
'台灣' in text

False

In [30]:
'烏克蘭' in text

True

In [31]:
# & : logical "and" on two booleans
('台灣' in text)  & ('烏克蘭' in text)

False

In [32]:
# the `and` keyword gives the same result for two booleans
('台灣' in text) and ('烏克蘭' in text)

False

In [33]:
('台灣' in text)  | ('烏克蘭' in text)

True

In [34]:
('台灣' in text)  or ('烏克蘭' in text)

True
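
On two plain booleans, `&` and `and` give the same answer, but they do not evaluate their operands in the same way: `and` stops as soon as the result is known, while `&` always evaluates both sides. The sketch below makes this visible with a hypothetical `check` function that records each call.

```python
calls = []

def check(label, value):
    """Hypothetical helper: remember that it was called, then return value."""
    calls.append(label)
    return value

# `and` short-circuits: the right side is skipped because the left is False.
result_and = check('left', False) and check('right', True)
calls_with_and = list(calls)

calls.clear()

# `&` evaluates both sides before combining them.
result_amp = check('left', False) & check('right', True)
calls_with_amp = list(calls)

print(result_and, calls_with_and)  # False ['left']
print(result_amp, calls_with_amp)  # False ['left', 'right']
```

The parentheses around each `in` test in the cells above are also required with `&` and `|`, because these operators bind more tightly than `in`.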

In [35]:
# This is also a string.
text = "['武漢', '烏克蘭', '疫情', '全球', '延燒', '國防部', '疫情', '高風險', '國家', '地區', '官兵', '過境', '國防部', '政策', '全球', '國家', '地區', '轄下', '單位']"

In [36]:
'烏克蘭' in text

True

In [37]:
'台灣' in text

False

In [38]:
('台灣' in text)  & ('烏克蘭' in text)

False

In [39]:
('台灣' in text)  | ('烏克蘭' in text)

True
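
Because the text above is a string that merely looks like a list, `in` still performs a substring search, so a fragment of a token also matches. If exact token matching is wanted, one option is to parse the string back into a real list with `ast.literal_eval`. The sketch below assumes the string holds a valid Python list literal, as in the example above.

```python
import ast

# A string that looks like a list of tokens.
text = "['武漢', '烏克蘭', '疫情', '國防部']"

tokens = ast.literal_eval(text)  # now a real Python list

print('國防' in text)      # True: substring of the token '國防部'
print('國防' in tokens)    # False: no token is exactly '國防'
print('國防部' in tokens)  # True: exact token match
```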

#### in a list

In [40]:
user_keyword=['烏克蘭','台灣']
'烏克蘭' in user_keyword

True

In [41]:
# Check out the first news
df.content[0]

'民眾捐贈烏克蘭的愛心物資持續湧入外交部，截至今天傍晚累計已收到約4000箱，外交部長吳釗燮中午親自到現場為協助整理物資的志工加油，並對捐贈民眾表達感謝。外交部晚間發布新聞稿指出，外交部從7日開始向民間募集捐贈烏克蘭難民的物資，獲得熱烈響應，親赴外交部捐贈物資的民眾約1730人，加上郵寄包裹，目前約已收到4000箱物資，品項以醫療口罩、毛毯、女性衛生用品、尿片、餅乾等為主，募集活動將持續到18日。外交部表示，為了感謝捐贈民眾，與在現場辛苦分類整理的志工、外交部人員，吳釗燮今天中午特別前往外交部西側門地下停車場視察，吳釗燮與在場的慈濟等民間慈善組織志工，以及其他自發到場幫忙的善心人士親切互動，對於也有烏克蘭旅台人士自願擔任義工在現場協助，吳釗燮特別致意慰問。根據外交部提供的照片，到場幫忙的烏克蘭志工是極為關心家鄉情勢的網紅佳娜。外交部再度提醒有意捐贈物資的民眾，捐贈物品請依照外交部網站所公布的清單為限，切勿捐贈或郵寄二手物品或衣物。送到外交部的捐贈物品務必為全新物品、未拆封包裝、有效期至少6個月以上，以免造成整理及後續轉運捐贈的困擾。募集截止時間是3月18日下午5時以前，民眾可以用面送或郵寄清單所列的20類物品及 14 類藥品至外交部。外交部除感謝熱心民眾踴躍捐贈援助烏克蘭人道物資外，也感謝許多志工義務幫忙、貢獻己力。外交部對各界人士奉獻時間與精神投入國際人道援助，表達最高的敬意。'

In [42]:
qk = '烏克蘭'
text = df.content[0]
qk in text

True

In [43]:
qk = '外交部'
text = df.content[0]
qk in text

True

In [44]:
qk = '台灣'
text = df.content[0]
qk in text

False

In [45]:
text = df.content[0]
('烏克蘭' in text) & ('外交部' in text )

True

### Another "in" in Python: the one used in a "for" loop

In [46]:
user_keywords = ['烏克蘭','外交部']
text = df.content[0]
[(qk in text) for qk in user_keywords]

[True, True]

In [47]:
user_keywords = ['烏克蘭','台灣']
text = df.content[0]
[(qk in text) for qk in user_keywords]

[True, False]

### all() any() 

    How do we apply a logical operation to many conditions at once?

    all(): performs a logical "and" over all items
    any(): performs a logical "or" over all items

In [48]:
all( [True, True, True] ) # True & True & True

True

In [49]:
any( [True, True] )

True

In [50]:
all( [True, False] )

False

In [51]:
any( [True, False] )

True

In [52]:
user_keywords = ['烏克蘭','國防部']
[word for word in user_keywords]

['烏克蘭', '國防部']

In [53]:
user_keywords=['烏克蘭','國防部']
text = '武漢烏克蘭疫情全球延燒，國防部2月針對29個疫情高風險國家地區勸阻官兵前往（包括過境）。'
[(word in text) for word in user_keywords]

[True, True]

In [54]:
user_keywords=['烏克蘭','國防部']
text = '武漢烏克蘭疫情全球延燒，國防部2月針對29個疫情高風險國家地區勸阻官兵前往（包括過境）。'
[word in text for word in user_keywords] # () can be removed

[True, True]

In [55]:
user_keywords=['烏克蘭','國防部']
text = '武漢烏克蘭疫情全球延燒，國防部2月針對29個疫情高風險國家地區勸阻官兵前往（包括過境）。'
all([word in text for word in user_keywords])

True

In [56]:
user_keywords=['烏克蘭','國防部']
text = '武漢烏克蘭疫情全球延燒，國防部2月針對29個疫情高風險國家地區勸阻官兵前往（包括過境）。'
all(word in text for word in user_keywords) # square brackets [] can be removed

True

In [57]:
# Check out the first news
user_keywords = ['烏克蘭','外交部']
text = df.content[0]
print([(qk in text) for qk in user_keywords])
all((qk in text) for qk in user_keywords)

[True, True]


True

In [58]:
user_keywords = ['烏克蘭','外交部']
text = df.content[0]
any((qk in text) for qk in user_keywords)

True

In [59]:
user_keywords = ['烏克蘭','台灣']
text = df.content[0]
[(qk in text) for qk in user_keywords]

[True, False]

In [60]:
user_keywords = ['烏克蘭','台灣']
text = df.content[0]
all((qk in text) for qk in user_keywords)

False

In [61]:
user_keywords = ['烏克蘭','台灣']
text = df.content[0]
any((qk in text) for qk in user_keywords)

True
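
One edge case is worth knowing: on an empty list, `all()` returns True and `any()` returns False. With an empty keyword list, an "and" query would therefore match every article. The sketch below shows the behavior and one possible guard; `matches` is a hypothetical helper, not part of the notebook.

```python
text = '外交部感謝民眾捐贈烏克蘭物資'
no_keywords = []

all_match = all(qk in text for qk in no_keywords)  # nothing to fail
any_match = any(qk in text for qk in no_keywords)  # nothing to succeed

print(all_match)  # True
print(any_match)  # False

def matches(text, user_keywords, cond):
    """Hypothetical guard: an empty keyword list matches nothing."""
    if not user_keywords:
        return False
    combine = all if cond == 'and' else any
    return combine(qk in text for qk in user_keywords)

print(matches(text, [], 'and'))                  # False
print(matches(text, ['烏克蘭', '外交部'], 'and'))  # True
```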

## Using apply() and lambda function

How do we check keyword occurrence for every piece of news?

In [62]:
# Use apply() and a lambda function
# user_keywords = ['烏克蘭','俄羅斯']  # alternative query
user_keywords = ['烏克蘭','外交部']
df.content.apply(lambda text: all([(qk in text) for qk in user_keywords]))
# For the "or" condition, use any() instead:
# df.content.apply(lambda text: any([(qk in text) for qk in user_keywords]))

0       True
1      False
2      False
3      False
4      False
       ...  
208    False
209    False
210    False
211    False
212    False
Name: content, Length: 213, dtype: bool
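
The boolean Series returned by `apply()` can be used in two ways: summed to count the matching articles (True counts as 1), or passed to the DataFrame to select them. A minimal sketch on a made-up three-row table (`toy` is an assumption for illustration, not the news dataset):

```python
import pandas as pd

# Hypothetical stand-in for the news table.
toy = pd.DataFrame({'content': ['烏克蘭與外交部', '外交部聲明', '烏克蘭戰事']})
user_keywords = ['烏克蘭', '外交部']

mask_all = toy.content.apply(lambda text: all(qk in text for qk in user_keywords))
mask_any = toy.content.apply(lambda text: any(qk in text for qk in user_keywords))

n_all = int(mask_all.sum())  # articles containing every keyword
n_any = int(mask_any.sum())  # articles containing at least one keyword
matching = toy[mask_all]     # the rows themselves

print(n_all, n_any)               # 1 3
print(matching.content.tolist())  # ['烏克蘭與外交部']
```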

In [63]:
user_keywords = ['烏克蘭','外交部']

In [64]:
[qk for qk in user_keywords]

['烏克蘭', '外交部']

In [65]:
text = df.content[0]
text

'民眾捐贈烏克蘭的愛心物資持續湧入外交部，截至今天傍晚累計已收到約4000箱，外交部長吳釗燮中午親自到現場為協助整理物資的志工加油，並對捐贈民眾表達感謝。外交部晚間發布新聞稿指出，外交部從7日開始向民間募集捐贈烏克蘭難民的物資，獲得熱烈響應，親赴外交部捐贈物資的民眾約1730人，加上郵寄包裹，目前約已收到4000箱物資，品項以醫療口罩、毛毯、女性衛生用品、尿片、餅乾等為主，募集活動將持續到18日。外交部表示，為了感謝捐贈民眾，與在現場辛苦分類整理的志工、外交部人員，吳釗燮今天中午特別前往外交部西側門地下停車場視察，吳釗燮與在場的慈濟等民間慈善組織志工，以及其他自發到場幫忙的善心人士親切互動，對於也有烏克蘭旅台人士自願擔任義工在現場協助，吳釗燮特別致意慰問。根據外交部提供的照片，到場幫忙的烏克蘭志工是極為關心家鄉情勢的網紅佳娜。外交部再度提醒有意捐贈物資的民眾，捐贈物品請依照外交部網站所公布的清單為限，切勿捐贈或郵寄二手物品或衣物。送到外交部的捐贈物品務必為全新物品、未拆封包裝、有效期至少6個月以上，以免造成整理及後續轉運捐贈的困擾。募集截止時間是3月18日下午5時以前，民眾可以用面送或郵寄清單所列的20類物品及 14 類藥品至外交部。外交部除感謝熱心民眾踴躍捐贈援助烏克蘭人道物資外，也感謝許多志工義務幫忙、貢獻己力。外交部對各界人士奉獻時間與精神投入國際人道援助，表達最高的敬意。'

In [66]:
[(qk in text) for qk in user_keywords]

[True, True]

In [67]:
# all() is True only if every keyword is found in the text
all([(qk in text) for qk in user_keywords])

True

In [68]:
# Square brackets can be removed
all((qk in text) for qk in user_keywords)

True

## Usage of apply, map (For reference)

In [69]:
# apply usage
import pandas as pd

sample_df = pd.DataFrame({
    'Col 1': [3,4,5,6],
    'Col 2': [2,3,6,4],
    'Col 3': [8,8,9,8],

},index=["A","B","C","D"])
sample_df

,Col 1,Col 2,Col 3
A,3,2,8
B,4,3,8
C,5,6,9
D,6,4,8


In [70]:
sample_df=sample_df.apply(lambda x: x+10)
sample_df

,Col 1,Col 2,Col 3
A,13,12,18
B,14,13,18
C,15,16,19
D,16,14,18


In [71]:
sample_df["Col 1"]=sample_df["Col 1"].apply(lambda x: x-10)
sample_df

,Col 1,Col 2,Col 3
A,3,12,18
B,4,13,18
C,5,16,19
D,6,14,18
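
The cells above only show `apply`. For comparison, here is a minimal sketch of `Series.map` on a made-up Series: with a function it behaves like `apply` on a single column, and with a dictionary it looks each value up, leaving values that are not in the dictionary as NaN.

```python
import pandas as pd

s = pd.Series([3, 4, 5, 6], index=['A', 'B', 'C', 'D'])

applied = s.apply(lambda x: x + 10)     # function applied to each value
mapped = s.map(lambda x: x + 10)        # same result with map
labels = s.map({3: 'low', 6: 'high'})   # dictionary lookup; 4 and 5 become NaN

print(mapped.tolist())            # [13, 14, 15, 16]
print(applied.equals(mapped))     # True
print(int(labels.isna().sum()))   # 2
```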
