# What is this playground for?

I've got a solid amount of data (5 years) and want to see what kind of results the API returns as is. If they look _kino_, I'll deploy a super basic version, send it to my friends to look at, polish it, then hackernews the shit out of it!

In [19]:
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta
import os
from dotenv import load_dotenv
from pynytimes import NYTAPI
from annoy import AnnoyIndex
from tqdm import tqdm
from util import month_to_str, str_to_month
from gen import EMBED_LENGTH, get_vec
from schema import DBManager

load_dotenv()

API_KEY = os.getenv("NYT_KEY") or ""
nyt = NYTAPI(API_KEY, parse_dates=True)

In [20]:
# Get the top stories to test on
top_stories = nyt.top_stories()

In [32]:
# Get all of the annoy indexes into memory
start_date = datetime(2001, 1, 1)
end_date = datetime(2001, 11, 30)  # last day of November, so the loop below still loads the November index

month_to_hix = {}
date = start_date
while date < end_date:
    mstr = month_to_str(date)
    ix_file = f"models/head/h_{mstr}.ann"
    aix = AnnoyIndex(EMBED_LENGTH, "angular")
    aix.load(ix_file)
    month_to_hix[mstr] = aix
    date += relativedelta(months=1)
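
A quick way to sanity-check which monthly index files the loop above will try to load is to walk the same range with only the standard library. This is a rough sketch, not the project's code: `month_starts` and `month_key` are hypothetical helpers, and the `"%Y-%m"` key format is an assumption standing in for whatever `month_to_str` actually returns.

```python
from datetime import datetime


def month_starts(start, end):
    """Yield the first day of each month, from start's month up to (not including) end."""
    year, month = start.year, start.month
    while datetime(year, month, 1) < end:
        yield datetime(year, month, 1)
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def month_key(date):
    # Hypothetical key format; the real month_to_str may format months differently.
    return date.strftime("%Y-%m")


keys = [month_key(d) for d in month_starts(datetime(2001, 1, 1), datetime(2001, 12, 1))]
print(keys[0], keys[-1], len(keys))
```

Passing the first day of the following month as an exclusive end avoids having to remember how many days the last month has.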


In [33]:
# Prep the metadata database for querying
dbman = DBManager("meta.db")
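
The search cell that follows keeps the single closest headline across every monthly index. To make the distances easier to interpret, here is a brute-force toy version of that selection in plain Python. It is only a sketch under a few assumptions: `best_match` and the toy vectors are made up for illustration, the exhaustive scan stands in for Annoy's approximate lookup, and `angular_distance` uses the formula Annoy documents for its `"angular"` metric, `sqrt(2 * (1 - cos(u, v)))`.

```python
import math


def angular_distance(u, v):
    """Euclidean distance between the normalized vectors: sqrt(2 * (1 - cos))."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = dot / (norm_u * norm_v)
    # Clamp at zero to guard against tiny negative values from float rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - cos)))


def best_match(query, month_to_vectors):
    """Return (month, index, distance) of the closest vector across all months."""
    best = (None, -1, float("inf"))
    for month, vectors in month_to_vectors.items():
        for ix, vec in enumerate(vectors):
            dist = angular_distance(query, vec)
            if dist < best[2]:
                best = (month, ix, dist)
    return best


# Toy 2-d "embeddings"; real headline vectors have EMBED_LENGTH dimensions.
toy = {
    "2001-01": [[1.0, 0.0], [0.0, 1.0]],
    "2001-02": [[2.0, 0.2], [-1.0, 0.0]],
}
month, ix, dist = best_match([1.0, 0.1], toy)
print(month, ix)
```

Distances under this metric range from 0 (same direction) to 2 (opposite direction), so a best match near 0 means the old headline points almost exactly the same way as the new one in embedding space.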

In [34]:
for story in top_stories:
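    title = story["title"]
    # Embed the headline once; the vector is the same for every monthly index
    emb = get_vec(title)
    min_ix = -1
    min_date = ""
    min_dist = float("inf")
    for date, hix in month_to_hix.items():
        # Get the 10 nearest headlines in this month's index
        ixs, dists = hix.get_nns_by_vector(emb, 10, include_distances=True)
        for ix, dist in zip(ixs, dists):
            # Keep only the closest match seen so far across all months
            if dist > min_dist:
                continue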
            min_ix = ix
            min_date = date
            min_dist = dist
    article = dbman.get_article_by_month_and_ix(min_date, min_ix)
    if article is None:
        print("WHAT! No results")
        continue
    print("------------------------------------------------")
    print(article.headline)
    print(article.lead_paragraph)
            

------------------------------------------------
The Latest Artillery in the War Against the Flu Virus
HEALTH officials strongly urge -- even at this late date -- that people get flu shots, especially those 65 and older or those with weakened immune systems and chronic ailments like heart disease, kidney disease and severe asthma.
------------------------------------------------
Cancer Drug Developer Gets Partners
Genentech Inc. and Roche Holding have won a bidding contest for the rights to a promising cancer drug from its developer, a small Long Island biotechnology company, the three companies announced today.
------------------------------------------------
Sex and Power vs. Law and Order; Abuse Charges Against Police Underscore a Historic Tension
It usually happens on the midnight tour, the petri dish of police misbehavior.
------------------------------------------------
In Letter to Bush, Putin Urges Wider U.S.-Russian Cooperation
President Vladimir V. Putin sent a letter to Pres