# What is this playground for?

I've got a solid amount of data (5 years, 2001 through 2005) and want to see what kind of results the API returns as is. If they look _kino_, I'll deploy a super basic version, send it to my friends to look at, polish it, then hackernews the shit out of it!

In [19]:
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta
import os
from dotenv import load_dotenv
from pynytimes import NYTAPI
from annoy import AnnoyIndex
from tqdm import tqdm
from util import month_to_str, str_to_month
from gen import EMBED_LENGTH, get_vec
from schema import DBManager

load_dotenv()

API_KEY = os.getenv("NYT_KEY") or ""
nyt = NYTAPI(API_KEY, parse_dates=True)

In [20]:
# Get the top stories to test on
top_stories = nyt.top_stories()

In [35]:
# Get all of the annoy indexes into memory
start_date = datetime(2001, 1, 1)
end_date = datetime(2005, 12, 31)

month_to_hix = {}
date = start_date
while date < end_date:
    mstr = month_to_str(date)
    ix_file = f"models/head/h_{mstr}.ann"
    aix = AnnoyIndex(EMBED_LENGTH, "angular")
    aix.load(ix_file)
    month_to_hix[mstr] = aix
    date += relativedelta(months=1)
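
The loop above walks month by month from `start_date` to `end_date` and loads one index per month. As a quick sanity check on how many index files that implies, here is a minimal stdlib-only sketch of the same walk. `iter_months` and `month_key` are hypothetical helpers written for this illustration; in particular, the `"%Y-%m"` format is an assumption and may not match what `util.month_to_str` actually returns.

```python
from datetime import datetime


def iter_months(start, end):
    """Yield the first day of each month, from start's month while it is before end."""
    year, month = start.year, start.month
    current = datetime(year, month, 1)
    while current < end:
        yield current
        # Step to the first day of the next month, rolling the year over in December
        month += 1
        if month > 12:
            year, month = year + 1, 1
        current = datetime(year, month, 1)


def month_key(date):
    # Hypothetical key format; the real month_to_str may differ
    return date.strftime("%Y-%m")


months = [month_key(d) for d in iter_months(datetime(2001, 1, 1), datetime(2005, 12, 31))]
print(len(months), months[0], months[-1])  # 60 2001-01 2005-12
```

So the five-year range means 60 separate indexes, and every query below has to touch all of them.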


In [36]:
# Prep the metadata database for querying
dbman = DBManager("meta.db")
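
With the indexes loaded and the metadata database open, the search cell that follows treats each monthly index as a shard: it queries every shard and keeps the single closest hit overall. Item ids are only meaningful within one index, so the winning month has to be tracked alongside the id for the database lookup. Below is a rough pure-Python sketch of that reduction, assuming brute-force search in place of Annoy and made-up 3-dimensional vectors. `angular_distance` follows the definition given in Annoy's README for the `"angular"` metric, sqrt(2(1 - cos(u, v))); `nearest_across_shards` and `toy_shards` are hypothetical names used only here.

```python
import math


def angular_distance(u, v):
    """Annoy-style angular distance: sqrt(2 * (1 - cosine similarity))."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = dot / (norm_u * norm_v)
    # Clamp at zero to guard against tiny negative values from float rounding
    return math.sqrt(max(0.0, 2.0 * (1.0 - cos)))


def nearest_across_shards(query, shards):
    """Return (month, ix, dist) for the closest vector over all shards."""
    best = None
    for month, vectors in shards.items():
        for ix, vec in enumerate(vectors):
            dist = angular_distance(query, vec)
            if best is None or dist < best[2]:
                best = (month, ix, dist)
    return best


# Toy data: two "months", each holding two made-up embeddings
toy_shards = {
    "2001-01": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "2001-02": [[0.0, 0.0, 1.0], [0.9, 0.1, 0.0]],
}
month, ix, dist = nearest_across_shards([1.0, 0.1, 0.0], toy_shards)
print(month, ix, round(dist, 3))
```

Because the distance depends only on the angle between vectors, `[0.9, 0.1, 0.0]` beats `[1.0, 0.0, 0.0]` for this query even though it sits in a different shard.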

In [37]:
for story in top_stories:
    title = story["title"]
    # Embed the headline once; the same vector is reused for every monthly index
    emb = get_vec(title)
    min_ix = -1
    min_date = ""
    min_dist = float("inf")
    for date, hix in month_to_hix.items():
        # Get the top 10 results
        ixs, dists = hix.get_nns_by_vector(emb, 10, include_distances=True)
        for ix, dist in zip(ixs, dists):
            if dist > min_dist:
                continue
            min_ix = ix
            min_date = date
            min_dist = dist
    article = dbman.get_article_by_month_and_ix(min_date, min_ix)
    if article is None:
        print("WHAT! No results")
        continue
    print("------------------------------------------------")
    print(title)
    print("MAPPED TO")
    print(article.headline)
            

------------------------------------------------
The Steep Cost of Ron DeSantis’s Vaccine Turnabout
MAPPED TO
Vaccines and Privatization
------------------------------------------------
How a Drug Maker Profited by Slow-Walking a Promising H.I.V. Therapy
MAPPED TO
Drug Maker Shifts to Profit
------------------------------------------------
In Belarus, the Protests Were Three Years Ago. The Crackdown Is Never-Ending.
MAPPED TO
World Briefing | Europe: Belarus: Protesters Sentenced
------------------------------------------------
As Tensions Rise, Zelensky Pushes for Way to Ship Grain Through Black Sea
MAPPED TO
World Briefing | Europe: Polish-Belarussian Tensions Increase
------------------------------------------------
Venezuela’s Oil Industry Is Broken. Now It’s Breaking the Environment.
MAPPED TO
Venezuela Woes Worsen as State Oil Company Calls Strike
------------------------------------------------
Japan Says It Can Make Coal Cleaner. Critics Say Its Plan Is ‘Almost Impossible.’
MAP