# bnrs_algorithm: recommendation

This notebook implements the recommendation steps of the BNRS algorithm.

## 0. setup

Suggested (conda) environment setup:

```bash
# Create base env with conda
conda create -n bnrs_algorithm python=3.11 numpy pandas scikit-learn networkx tqdm nltk -c conda-forge
conda activate bnrs_algorithm

# Install PyTorch — CPU-only example:
conda install pytorch -c pytorch

# Install packages from PyPI
pip install sentence-transformers keybert keyphrase-vectorizers spacy datasets

# Install spaCy models used in the notebook
python -m spacy download en_core_web_lg

# Download NLTK stopwords
python -c "import nltk; nltk.download('stopwords')"
```

In [1]:
import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
from scipy.spatial.distance import cosine

## 1. load corpus

This step takes the output of `01_preprocess.ipynb` as input. That file contains the news article content along with the pre-computed document, subject, and context embeddings.
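Because the corpus is read from CSV, each embedding arrives as a string and has to be parsed back into numbers before any similarity can be computed. The following is a minimal, dependency-free sketch of such a parser. `parse_embedding` is a hypothetical helper (not part of the notebook), and it assumes the string is a flat, bracketed list of floats separated by whitespace, commas, or both.

```python
def parse_embedding(text):
    """Parse a bracketed vector string such as '[0.1 -0.2]' or '[0.1, -0.2]'.

    Hypothetical helper: assumes a flat list of floats separated by
    whitespace and/or commas. Returns a plain list of floats.
    """
    # Drop surrounding whitespace and brackets, then treat commas as spaces
    cleaned = text.strip().strip('[]').replace(',', ' ')
    return [float(token) for token in cleaned.split()]


# Both serialisation styles yield the same vector
space_separated = parse_embedding('[ 0.5 -0.25\n 1e-3]')
comma_separated = parse_embedding('[0.5, -0.25, 1e-3]')
print(space_separated == comma_separated)
```

In the notebook, the result would presumably be wrapped with `np.asarray(...)` so that it matches the array type used downstream.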

In [2]:
#load output from 01_processing.py (news corpus)
news_df = pd.read_csv('./data/01_processing_output.csv') 

#convert dates to datetime objects
news_df['date'] = pd.to_datetime(news_df['date'])

#convert embedding strings to numpy arrays
def convert_embeddings(df, column_name):
    df[column_name] = df[column_name].apply(
        lambda x: np.fromstring(x.strip('[]'), sep=' ')
    )
    return df
embedding_columns = ['document_embedding', 'subject_embedding', 'context_embedding']
for column in embedding_columns:
    news_df = convert_embeddings(news_df, column_name=column)

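#convert all keyword columns from strings to tuples
#(ast.literal_eval only parses Python literals, so it is safer than eval on file contents)
import ast
news_df['subject_keywords'] = news_df['subject_keywords'].apply(ast.literal_eval)
news_df['context_keywords'] = news_df['context_keywords'].apply(ast.literal_eval)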

#report on the corpus size and date range
print(f"News data loaded with {news_df.shape[0]} unique articles from {news_df['date'].min()} to {news_df['date'].max()}")
news_df.head(3)

News data loaded with 1933 unique articles from 2018-04-04 00:00:00 to 2018-04-04 00:00:00


Unnamed: 0,date,publication,title,article,docs,entities,document_embedding,subject_keywords,context_keywords,subject_weights,context_weights,subject_embedding,context_embedding,clustNum
0,2018-04-04,Reuters,Brazil soy exporters set to win big from U.S.-...,SAO PAULO (Reuters) - China’s move to slap tar...,Brazil soy exporters set to win big from U.S.-...,"[('Brazil', 'GPE'), ('SAO PAULO', 'GPE'), ('Ch...","[-0.0333679169, -0.0847597122, 0.00371262198, ...","(latin america, china, brazilian, south americ...","(soybean exports, soybean prices, u.s. farm pr...","(0.3251, 0.3188, 0.3181, 0.3089, 0.2885, 0.278...","(0.596, 0.527, 0.5101, 0.5042, 0.4913)","[0.0308918837, 0.000622280951, -0.0400897287, ...","[-0.0346553768, -0.0455606684, -0.0337220982, ...",16.0
1,2018-04-04,Vox,Mueller: Trump’s not a target. 4 theories on w...,The new report that special counsel Robert Mue...,Mueller: Trump’s not a target. 4 theories on w...,"[('Mueller', 'PERSON'), ('Trump', 'ORG'), ('Ro...","[-0.0121588996, 0.0335707292, -0.0258219242, -...","(mueller, robert mueller, rosenstein, rod rose...","(dubious indictment, potential impeachment)","(0.5037, 0.402, 0.2596, 0.2294, 0.2163, 0.2017...","(0.3692, 0.3576)","[-0.0256747247, 0.00851988334, -0.0596536841, ...","[-0.104109472, 0.0411785132, 0.0161309472, -0....",12.0
2,2018-04-04,Reuters,Trump to order National Guard to protect borde...,WASHINGTON (Reuters) - President Donald Trump ...,Trump to order National Guard to protect borde...,"[('Trump', 'ORG'), ('National Guard', 'ORG'), ...","[0.0256739892, 0.0943503529, 0.0393003039, -0....","(deploy national guard, national guard, border...","(protect border, border patrol, department hom...","(0.6191, 0.5678, 0.4664, 0.4566, 0.4293, 0.324...","(0.502, 0.4664, 0.4632, 0.4566, 0.4353)","[-0.0156751699, 0.0443499305, -0.0270405, 0.00...","[-0.00284021459, 0.0566167201, -0.0340135201, ...",13.0


## 2. select reference items (`Hx`)
We take the n largest event clusters of the day and select the most representative item from each cluster as our `reference` items.

The most representative item is defined as the item closest to the cluster centroid in embedding space (by cosine similarity).

This method was used in the user study to select the N=5 starting articles for each day (also stratified by topic; see Methods).
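To make the "closest to the centroid" rule concrete, here is a small plain-Python sketch on a toy cluster. The helper names `cosine_sim` and `most_representative` and the toy vectors are illustrative assumptions; the notebook itself uses NumPy and scikit-learn for the same computation.

```python
import math


def cosine_sim(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_representative(vectors):
    """Return the position of the vector closest to the cluster centroid."""
    n = len(vectors)
    # Centroid = component-wise mean of all vectors in the cluster
    centroid = [sum(column) / n for column in zip(*vectors)]
    sims = [cosine_sim(v, centroid) for v in vectors]
    return max(range(n), key=lambda i: sims[i])


# Toy cluster (illustrative values only): the first two items point in
# nearly the same direction, the third is orthogonal to the first.
toy_cluster = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(most_representative(toy_cluster))
```

Here the item at position 1 is selected: it lies between the other two and is therefore the most aligned with their mean direction.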

In [3]:
def select_reference_articles_auto(news_df, num_clusters):
    """
    Automatically select reference articles from news clusters by choosing the most
    representative article (closest to centroid) from each of the top N largest clusters.
    
    Parameters:
    -----------
    news_df (pandas.DataFrame): DataFrame with columns for 'clustNum', 'title',
        and 'document_embedding'
    num_clusters (int): Number of top clusters to process
    
    Returns:
    --------
    pandas.DataFrame: Selected reference articles with their cluster information
    """
    #sort clusters by size in descending order
    cluster_sizes = news_df.groupby('clustNum').size().sort_values(ascending=False)
    
    #initialize list to store selected article IDs
    selected_ids = []
    
    #process only the top N clusters
    for cluster_num in cluster_sizes.index[:num_clusters]:
        #filter dataframe for current cluster
        cluster_df = news_df[news_df['clustNum'] == cluster_num]
        
        #calculate average embedding (centroid) for current cluster
        avg_embedding = np.mean(np.vstack(cluster_df['document_embedding']), axis=0)
        
        #compute cosine similarity between average embedding and all articles in cluster
        similarities = cosine_similarity(
            np.vstack(cluster_df['document_embedding']),
            avg_embedding.reshape(1, -1)
        ).flatten()
        
        #get the most similar article's ID (closest to centroid)
        most_representative_id = cluster_df.iloc[np.argmax(similarities)].name
        selected_ids.append(most_representative_id)
    
    #create dataframe of selected reference articles
    reference_df = news_df.loc[selected_ids].copy()
    
    #print summary of selected articles
    print("Selected reference articles:")
    for _, row in reference_df.iterrows():
        print(f"Cluster {row['clustNum']} (size: {cluster_sizes[row['clustNum']]}): {row['title']}")
    
    return reference_df

In [4]:
reference_df = select_reference_articles_auto(news_df, num_clusters=5)

Selected reference articles:
Cluster 11.0 (size: 69): China slaps tariffs on 106 US products, including soy, cars, chemicals
Cluster 21.0 (size: 39): Facebook says up to 87 million people affected by Cambridge Analytica scandal | TheHill
Cluster 10.0 (size: 25): The YouTube shooter was “angry” at the company for demonetizing her videos
Cluster 13.0 (size: 12): Trump, stymied on wall, to send troops to U.S.-Mexico border
Cluster 15.0 (size: 11): President Trump agrees to keep troops in Syria for now: Reports


## 3. define pipeline (`Rx`)

#### 3.1 most-similar recommendations:
* We estimate most-similar recommendations by a simple bounded cosine similarity approach.
* We first calculate the pairwise cosine similarity between each reference item's (Hx) document embedding and those of all candidates (Rx).
* We then filter these recommendation candidates:
  - To exclude articles with very high similarity (e.g., >= 0.95), which are likely to be near-duplicates.
  - To exclude articles with very low similarity (e.g., <= 0.5), which truncates the long tail of unrelated articles.
* We rank the remaining candidates by cosine similarity in decreasing order. In the user study, we randomly selected Rx from the top 5.
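The bounded ranking and the top-5 draw can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the study code: `rank_in_band`, `sample_from_top_k`, the `seed` argument, and the toy similarities are all hypothetical, and the notebook's own function stops at the ranked list without sampling.

```python
import random


def rank_in_band(similarities, min_sim=0.5, max_sim=0.95):
    """Keep (item_id, sim) pairs with min_sim < sim < max_sim, best first."""
    kept = [(item_id, sim) for item_id, sim in similarities.items()
            if min_sim < sim < max_sim]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def sample_from_top_k(ranked, k=5, seed=None):
    """Pick one (item_id, sim) pair uniformly from the top-k, or None if empty."""
    if not ranked:
        return None
    return random.Random(seed).choice(ranked[:k])


# Hypothetical similarities between one reference item and six candidates
toy_sims = {'a': 0.99, 'b': 0.90, 'c': 0.72, 'd': 0.51, 'e': 0.50, 'f': 0.10}
ranked = rank_in_band(toy_sims)
print(ranked)  # 'a' (near-duplicate), 'e' and 'f' (at or below min_sim) are dropped
print(sample_from_top_k(ranked, k=5, seed=0))
```

Passing a seed makes the draw repeatable, which helps when the same recommendation has to be reproduced later.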

In [5]:
def get_similarity_recs(recommendation_df, news_df, max_sim=0.95, min_sim=0.5):
   """
   Generate article recommendations based on embedding similarity.
   
   Parameters:
   -----------
   recommendation_df (pandas.DataFrame): DataFrame containing reference articles
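   news_df (pandas.DataFrame): DataFrame with all news articles and their embeddings
   max_sim (float): Candidates with similarity >= max_sim are dropped as likely near-duplicates (default: 0.95)
   min_sim (float): Candidates with similarity <= min_sim are dropped as unrelated (default: 0.5)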
   
   Returns:
   --------
   pandas.DataFrame: Updated recommendation_df with similarity recommendations
   """
   #initialize new column for recommendations
   recommendation_df['similarity_recs'] = None
   
   for reference_id in recommendation_df.index:
       #get the reference article's embedding
       reference_embedding = np.array(news_df.loc[reference_id]['document_embedding']).reshape(1, -1)
       
       #compute cosine similarity with all other articles 
       all_embeddings = np.vstack(news_df['document_embedding'])
       ref_to_all_sim = cosine_similarity(all_embeddings, reference_embedding).flatten()
       
       #create a Series from similarities using news_df's index to keep track of _id
       ref_to_all_sim_indexed = pd.Series(ref_to_all_sim, index=news_df.index)
       
       #exclude the chosen article's _id
       ref_to_all_sim_indexed = ref_to_all_sim_indexed.drop(index=[reference_id])
       
       #sort similarities in descending order
       ref_to_all_sim_indexed_sorted = ref_to_all_sim_indexed.sort_values(ascending=False)
       
       #drop those with greater than or equal to max_sim similarity
       ref_to_all_sim_indexed_filtered = ref_to_all_sim_indexed_sorted[ref_to_all_sim_indexed_sorted < max_sim]
       
       #drop those with less than or equal to min_sim similarity
       ref_to_all_sim_indexed_filtered = ref_to_all_sim_indexed_filtered[ref_to_all_sim_indexed_filtered > min_sim]
       
       #get all ids and corresponding cosine similarities
       candidate_ids = ref_to_all_sim_indexed_filtered.index.to_list()
       candidate_sims = ref_to_all_sim_indexed_filtered.values
       
       #pack into list of tuples
       candidates = list(zip(candidate_ids, candidate_sims))
       
       #save the candidates to a new column in the recommendation_df
       recommendation_df.at[reference_id, 'similarity_recs'] = candidates
   
   # Report
   print(f"Computed similarity recommendations for each of {len(recommendation_df)} reference articles.")
   return recommendation_df

#### 3.2 filter same-event:

* We then remove candidate items that belong to the same news event as the reference item.
* This helps eliminate large groupings of highly similar articles in the candidate list passed on to the pivot generation step (next).
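As a minimal illustration of this filter, the sketch below drops candidates that share the reference item's cluster. `drop_same_event` and the cluster assignments are hypothetical; the notebook performs the same comparison against the `clustNum` column.

```python
def drop_same_event(candidates, cluster_of, reference_id):
    """Remove (item_id, sim) pairs whose cluster matches the reference's cluster."""
    reference_cluster = cluster_of[reference_id]
    return [(item_id, sim) for item_id, sim in candidates
            if cluster_of[item_id] != reference_cluster]


# Hypothetical cluster assignments: items 1 and 2 cover the same event as item 0
cluster_of = {0: 11, 1: 11, 2: 11, 3: 16, 4: 21}
candidates = [(1, 0.86), (2, 0.85), (3, 0.70), (4, 0.55)]
print(drop_same_event(candidates, cluster_of, reference_id=0))
```

One caveat worth checking, assuming unclustered articles carry a missing (`NaN`) label: `NaN != NaN` evaluates to true, so such candidates are never treated as belonging to the same event.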

In [6]:
def get_event_removed_recs(recommendation_df, news_df):
   """
   Filter out recommendations that belong to the same event cluster as the reference article.
   
   Parameters:
   -----------
   recommendation_df (pandas.DataFrame): DataFrame containing reference articles with similarity recommendations
   news_df (pandas.DataFrame): DataFrame with all news articles and their cluster assignments
   
   Returns:
   --------
   pandas.DataFrame: Updated recommendation_df with event-filtered recommendations
   """
   #initialize new column for recommendations
   recommendation_df['event_removed_recs'] = None
   
   for reference_id in recommendation_df.index:
       #get the cluster number of the reference article
       reference_cluster = news_df.loc[reference_id]['clustNum']
       
       #get the similarity recommendations for the reference article
       similarity_recs = recommendation_df.loc[reference_id]['similarity_recs']
       
       #filter out the recommendations that are from the same cluster as the reference article
       event_removed_recs = [rec for rec in similarity_recs if news_df.loc[rec[0]]['clustNum'] != reference_cluster]
       
       #calculate the reduction in recommendations due to event removal
       #reduction_percentage = 100 * (1 - len(event_removed_recs) / len(similarity_recs))
       #print(f"Reference belongs to cluster {reference_cluster}: removed {reduction_percentage:.2f}% of candidates belonging to same news event.")
       
       #save the event removed recommendations to the new column in the df
       recommendation_df.at[reference_id, 'event_removed_recs'] = event_removed_recs
   
   print(f"Completed event removal for the recommendations of {len(recommendation_df)} reference articles.")
   return recommendation_df

#### 3.3 bisociative recommendations:

We estimate **bisociative recommendations** that preserve an *anchor* dimension while introducing separation along a *pivot* dimension.

- For each reference article we generate two ranked lists:
  - **Subject-pivoted recommendations** — high similarity in subject embeddings and high dissimilarity in context embeddings (same subject, different context).
  - **Context-pivoted recommendations** — high similarity in context embeddings and high dissimilarity in subject embeddings (same context, different subject).

- Scoring function (per candidate a):
  - Pivot strength: `Pivot(a) = M(a)^β * (1 - m(a))^γ`
    - For subject pivots: `M(a)` = subject similarity, `m(a)` = context similarity.
    - For context pivots: `M(a)` = context similarity, `m(a)` = subject similarity.
    - Weights we used in the user study: `β = 3.0`, `γ = 1.2`.
- In the user study we sampled Rx by randomly selecting from the top‑3 of each ranked list.
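A short worked example may help show how the score behaves. The sketch below is written under stated assumptions: `pivot_score` and the three toy candidates are hypothetical, and the upper clamp to 1 is an addition of this sketch (the notebook only clamps negative similarities to 0). It guards against a similarity that slightly exceeds 1 through rounding, which would make `1 - m(a)` negative.

```python
def pivot_score(anchor_sim, pivot_sim, beta=3.0, gamma=1.2):
    """Pivot(a) = M(a)^beta * (1 - m(a))^gamma, with similarities clamped to [0, 1]."""
    anchor = min(max(anchor_sim, 0.0), 1.0)  # M(a): similarity to preserve
    pivot = min(max(pivot_sim, 0.0), 1.0)    # m(a): similarity to minimise
    return (anchor ** beta) * ((1.0 - pivot) ** gamma)


# Toy candidates as (subject_similarity, context_similarity); values are illustrative
toy = {'x': (0.8, 0.3), 'y': (0.3, 0.8), 'z': (0.8, 0.8)}

# Subject pivot: anchor on subject similarity, separate on context similarity
subject_ranked = sorted(toy, key=lambda k: pivot_score(toy[k][0], toy[k][1]), reverse=True)
# Context pivot: anchor on context similarity, separate on subject similarity
context_ranked = sorted(toy, key=lambda k: pivot_score(toy[k][1], toy[k][0]), reverse=True)

print(subject_ranked)
print(context_ranked)
```

Candidate `x` (same subject, different context) leads the subject-pivoted list, `y` leads the context-pivoted list, and `z`, which is similar on both dimensions, sits in the middle of each. With `β = 3.0` and `γ = 1.2`, halving `M(a)` divides the score by 2^3 = 8, whereas halving `1 - m(a)` divides it by 2^1.2 ≈ 2.3, so the anchor dimension dominates the ranking.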



In [7]:
def get_pivot_recs(recommendation_df, news_df, beta=3.0, gamma=1.2):
   """
   Generate entity-based pivot recommendations by comparing subject and context embeddings
   to find recommendations that either share subject but differ in context or vice versa.
   
   Uses a scoring function: Score(a) = M(a)^β * (1-m(a))^γ where:
     - For subject pivots: M(a)=subject similarity, m(a)=context similarity
     - For context pivots: M(a)=context similarity, m(a)=subject similarity
     - β=3.0: Controls penalty for lower values of maximized similarity
     - γ=1.2: Controls penalty for higher values of minimized similarity
   
   Parameters:
   -----------
   recommendation_df (pandas.DataFrame): DataFrame containing reference articles with event-removed recommendations
   news_df (pandas.DataFrame): DataFrame with all news articles and their subject/context embeddings
   
   Returns:
   --------
   pandas.DataFrame: Updated recommendation_df with subject and context pivot recommendations
   """
   
   nan_counter = 0  # failed to calculate scores
   nonan_counter = 0  # successfully calculated scores
   
   #initialize new columns for recommendations
   recommendation_df['subject_pivot_recs'] = None
   recommendation_df['context_pivot_recs'] = None

   # - - - REFERENCE LEVEL - - - >>>
   for reference_id in recommendation_df.index:
       
       #initialize lists to store recommendations
       recommendation_df.at[reference_id, 'subject_pivot_recs'] = []
       recommendation_df.at[reference_id, 'context_pivot_recs'] = []
        
       #get the reference article's subject/context embeddings
       reference_subject_embedding = news_df.loc[reference_id]['subject_embedding']
       reference_context_embedding = news_df.loc[reference_id]['context_embedding']
       
       # - - - CANDIDATE LEVEL - - - >>>
       for candidate_id, candidate_similarity in recommendation_df.loc[reference_id]['event_removed_recs']:
           #get the candidate article's subject/context embeddings
           candidate_subject_embedding = news_df.loc[candidate_id]['subject_embedding']
           candidate_context_embedding = news_df.loc[candidate_id]['context_embedding']
           
           #check that all embeddings are not None and that they are not all zeros
           #(np.any is False only for an all-zero vector; a sum of 0 can also occur for non-zero vectors)
           if (reference_subject_embedding is None or candidate_subject_embedding is None or
               reference_context_embedding is None or candidate_context_embedding is None or
               not np.any(reference_subject_embedding) or not np.any(candidate_subject_embedding) or
               not np.any(reference_context_embedding) or not np.any(candidate_context_embedding)):
               
               #if invalid embeddings, skip this candidate
               nan_counter += 1
               continue
               
           else:
               #calculate subject and context similarities
               subject_similarity = 1 - cosine(reference_subject_embedding, candidate_subject_embedding)
               context_similarity = 1 - cosine(reference_context_embedding, candidate_context_embedding)
               
               #set negative similarities to 0 (safeguard)
               subject_similarity = max(0, subject_similarity)
               context_similarity = max(0, context_similarity)
               
               #calculate pivot scores using the formula:
               # Score(a) = M(a)^β * (1-m(a))^γ
               
               #for subject pivots: M(a)=subject_similarity, m(a)=context_similarity
               subject_pivot = (subject_similarity**beta) * ((1 - context_similarity)**gamma)
               
               #for context pivots: M(a)=context_similarity, m(a)=subject_similarity
               context_pivot = (context_similarity**beta) * ((1 - subject_similarity)**gamma)

               # Increment success counter   
               nonan_counter += 1
           
               #append the candidate's ID, similarity scores, and pivot score to the respective lists
               recommendation_df.at[reference_id, 'subject_pivot_recs'].append((candidate_id, candidate_similarity,
                                                                              subject_similarity, context_similarity,
                                                                              subject_pivot))
               recommendation_df.at[reference_id, 'context_pivot_recs'].append((candidate_id, candidate_similarity,
                                                                              subject_similarity, context_similarity,
                                                                              context_pivot))

       #sort the lists by pivot score in descending order    
       for pivot_type in ['subject_pivot_recs', 'context_pivot_recs']:
           recommendation_df.at[reference_id, pivot_type] = sorted(
               (rec for rec in recommendation_df.at[reference_id, pivot_type] if not np.isnan(rec[-1])), 
               key=lambda x: x[-1], 
               reverse=True)

   #report results
   print(f"Failed to calculate pivot scores for {nan_counter} candidate articles.")
   print(f"Successfully calculated pivot scores for {nonan_counter} candidate articles.")
   
   return recommendation_df

#### 3.4 random recommendations:
We draw random recommendations from the full corpus as a baseline condition:

In [8]:
def get_random_recs(recommendation_df, news_df, num_articles=5):
   """
   Generate random article recommendations as a baseline comparison.
   
   Parameters:
   -----------
   recommendation_df (pandas.DataFrame): DataFrame containing reference articles
   news_df (pandas.DataFrame): DataFrame with all news articles
   num_articles (int): Number of random articles to recommend (default: 5)
   
   Returns:
   --------
   pandas.DataFrame: Updated recommendation_df with random recommendations
   """
   #initialize new column for recommendations
   recommendation_df['random_recs'] = None
   
   for reference_id in recommendation_df.index:
       #select num_articles random articles from news_df
       random_articles = news_df.sample(n=num_articles)
       
       #get the IDs and embeddings of the random articles
       candidate_ids = random_articles.index.to_list()
       candidate_embeddings = random_articles['document_embedding'].values
       
       #put these into tuples
       candidates = list(zip(candidate_ids, candidate_embeddings))
       
       #save the candidates to the new column in the recommendation_df
       recommendation_df.at[reference_id, 'random_recs'] = candidates
   
   #report results
   print(f"Computed {num_articles} random recommendations for each of {len(recommendation_df)} reference articles.")
   return recommendation_df
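
`get_random_recs` samples from the whole corpus without a fixed seed, so a draw can in principle include the reference article itself and will differ between runs. Whether either point matters depends on how the baseline was used in the study, which the notebook does not state. As a hedged variant, the sketch below excludes the reference and accepts a seed; `sample_random_baseline` and its arguments are hypothetical names, not part of the notebook.

```python
import random


def sample_random_baseline(all_ids, reference_id, num_articles=5, seed=None):
    """Draw up to num_articles distinct ids, never returning the reference itself."""
    pool = [item_id for item_id in all_ids if item_id != reference_id]
    rng = random.Random(seed)
    return rng.sample(pool, min(num_articles, len(pool)))


ids = list(range(10))
first = sample_random_baseline(ids, reference_id=3, num_articles=5, seed=42)
second = sample_random_baseline(ids, reference_id=3, num_articles=5, seed=42)
print(first == second)  # same seed, same draw
print(3 in first)       # the reference was removed from the pool before sampling
```

With pandas, a comparable effect could be obtained by dropping the reference row before calling `DataFrame.sample` and passing its `random_state` argument.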

## 4. run pipeline

In [9]:
recommendation_df = reference_df.copy()
recommendation_df = get_similarity_recs(recommendation_df, news_df)
recommendation_df = get_event_removed_recs(recommendation_df, news_df)
recommendation_df = get_pivot_recs(recommendation_df, news_df)
recommendation_df = get_random_recs(recommendation_df, news_df, num_articles=5)

Computed similarity recommendations for each of 5 reference articles.
Completed event removal for the recommendations of 5 reference articles.
Failed to calculate pivot scores for 0 candidate articles.
Successfully calculated pivot scores for 239 candidate articles.
Computed 5 random recommendations for each of 5 reference articles.


## 5. explore top-N

In [10]:
def print_recommendations(recommendation_df, news_df, top_n=3, similar=False, random=False):
   """
   Print formatted recommendations from different strategies for all reference articles.
   
   Parameters:
   -----------
   recommendation_df (pandas.DataFrame): DataFrame containing reference articles with recommendations
   news_df (pandas.DataFrame): DataFrame with all news articles and their details
   top_n (int): Number of top recommendations to display for each strategy (default: 3)
   similar (bool): Whether to display similarity-based recommendations (default: False)
   random (bool): Whether to display random recommendations as control (default: False)
   """
   #iterate over each reference article in recommendation_df
   for selected_reference_id in recommendation_df.index:
       
       #get the reference article details
       reference_article = news_df.loc[selected_reference_id]
       
       print("\n" + "="*80)
       print(f"RECOMMENDATIONS FOR REFERENCE ARTICLE:")
       ref_title = reference_article['title']
           
       print(f"CLUSTER: \033[1;37;41m{reference_article['clustNum']}\033[0m")
       print(f"TITLE: \033[1;34;43m{ref_title}\033[0m")
       print(f"DATE: {reference_article['date']}")
       print("="*80 + "\n")

       #get the top N recommendations from each strategy
       similarity_recs = recommendation_df.loc[selected_reference_id]['similarity_recs'][:top_n]
       subject_pivot_recs = recommendation_df.loc[selected_reference_id]['subject_pivot_recs'][:top_n]
       context_pivot_recs = recommendation_df.loc[selected_reference_id]['context_pivot_recs'][:top_n]
       random_recs = recommendation_df.loc[selected_reference_id]['random_recs'][:top_n]

       #print the most similar recommendations if requested
       if similar:
           print(f"Most-similar recommendations (S):")
           for i, (rec_id, rec_sim) in enumerate(similarity_recs, start=1):
               row = news_df.loc[rec_id]
               title = row['title']
               subject_kw = row['subject_keywords']
               context_kw = row['context_keywords']
               
               article_info = f"""
               CLUSTER: \033[1;37;41m{row['clustNum']}\033[0m SIMILARITY: \033[1;37;42m{rec_sim:.2f}\033[0m
               DATE: {row['date']},
               TITLE: \033[1;34;43m{title}\033[0m
               SUBJECT KEYWORDS: {subject_kw[:10]}
               CONTEXT KEYWORDS: {context_kw[:10]}"""
               print(article_info)

       #print subject-pivoted recommendations
       print('\n')
       print(f"Subject-pivoted recommendations:")
       for i, (rec_id, rec_sim, sub_sim, con_sim, rec_piv) in enumerate(subject_pivot_recs, start=1):
           row = news_df.loc[rec_id]
           title = row['title']
           subject_kw = row['subject_keywords']
           context_kw = row['context_keywords']
           
           article_info = f"""
           CLUSTER: \033[1;37;41m{row['clustNum']}\033[0m SIMILARITY: \033[1;37;42m{rec_sim:.2f}\033[0m 
           SUBJECT: \033[1;37;44m{sub_sim:.2f}\033[0m CONTEXT: \033[1;37;44m{con_sim:.2f}\033[0m PIVOT: \033[1;37;45m{rec_piv:.2f}\033[0m
           DATE: {row['date']}, 
           TITLE: \033[1;34;43m{title}\033[0m
           SUBJECT KEYWORDS: {subject_kw[:10]}
           CONTEXT KEYWORDS: {context_kw[:10]}"""
           print(article_info)

       #print context-pivoted recommendations
       print('\n')
       print(f"Context-pivoted recommendations:")
       for i, (rec_id, rec_sim, sub_sim, con_sim, rec_piv) in enumerate(context_pivot_recs, start=1):
           row = news_df.loc[rec_id]
           title = row['title']
           subject_kw = row['subject_keywords']
           context_kw = row['context_keywords']
           
           article_info = f"""
           CLUSTER: \033[1;37;41m{row['clustNum']}\033[0m SIMILARITY: \033[1;37;42m{rec_sim:.2f}\033[0m 
           SUBJECT: \033[1;37;44m{sub_sim:.2f}\033[0m CONTEXT: \033[1;37;44m{con_sim:.2f}\033[0m PIVOT: \033[1;37;45m{rec_piv:.2f}\033[0m
           DATE: {row['date']}, 
           TITLE: \033[1;34;43m{title}\033[0m
           SUBJECT KEYWORDS: {subject_kw[:10]}
           CONTEXT KEYWORDS: {context_kw[:10]}"""
           print(article_info)

       #print random recommendations if random is True
       if random:
           print('\n')
           print(f"Random recommendations (R):")
           for i, (rec_id, _) in enumerate(random_recs, start=1):
               row = news_df.loc[rec_id]
               title = row['title']
               
               article_info = f"""
               CLUSTER: \033[1;37;41m{row['clustNum']}\033[0m
               DATE: {row['date']}, 
               TITLE: \033[1;34;43m{title}\033[0m"""
               print(article_info)
               
       # Add a separator between reference articles
       print("\n" + "-"*80 + "\n")

In [11]:
print_recommendations(recommendation_df, news_df, top_n=3, similar=True)


RECOMMENDATIONS FOR REFERENCE ARTICLE:
CLUSTER: 11.0
TITLE: China slaps tariffs on 106 US products, including soy, cars, chemicals
DATE: 2018-04-04 00:00:00

Most-similar recommendations (S):

               CLUSTER: 11.0 SIMILARITY: 0.86
               DATE: 2018-04-04 00:00:00,
               TITLE: Commerce Secretary Wilbur Ross: China tariffs amount to 0.3% of US GDP
               SUBJECT KEYWORDS: ('dow', 'china', 'commerce', 'wilbur ross', 'chinese', 'trump', 'coca cola', 'ross', 'donald trump', 'united states')
               CONTEXT KEYWORDS: ('united states tariffs', 'additional tariffs', 'new tariffs', 'announced tariffs', 'tariffs', 'heavy tariffs')

               CLUSTER: 11.0 SIMILARITY: 0.85
               DATE: 2018-04-04 00:00:00,
               TITLE: GLOBAL MARKETS-Shares recoil as China retaliates in U.S. trade war
               SUBJECT KEYWORDS: ('chin