# Deck checker

This small script compares the code-generated deck with the version available on AnkiWeb. Viewing the differences between the two decks makes it easy to confirm that the only changes to the deck are the expected ones.

## Prerequisites

 - Download the deck from AnkiWeb
 - Run the corresponding Python script (`createJLPTDeck.py`) and copy the `generated` folder to this directory

Do this for the corresponding note type (`normal` or `extended`).

The generated deck should be named `generated.anki2` and stored in the same directory as this notebook.
The deck downloaded from AnkiWeb should be named `downloaded.anki2` (the name the code below loads) and stored in the same directory.
`.apkg` files are just zip files. Unzip the download and you should find the `.anki2` file inside, which should be renamed accordingly.
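
The unzip-and-rename step can also be scripted. The following is a minimal sketch using only the standard library; `extract_collection` is a hypothetical helper, not part of this repository, and the default member name `collection.anki2` is an assumption about the archive layout (some exports may store the collection under a different file name).

```python
import shutil
import zipfile
from pathlib import Path


def extract_collection(apkg_path: str, dest_path: str,
                       member: str = "collection.anki2") -> Path:
    """Copy the SQLite collection out of an .apkg archive to dest_path.

    `member` is the assumed name of the collection file inside the archive.
    """
    dest = Path(dest_path)
    with zipfile.ZipFile(apkg_path) as archive:
        if member not in archive.namelist():
            raise FileNotFoundError(f"{member} not found in {apkg_path}")
        # stream the member to its new name instead of extracting and renaming
        with archive.open(member) as src, open(dest, "wb") as out:
            shutil.copyfileobj(src, out)
    return dest
```

Pass as `dest_path` whatever file name the notebook loads for the downloaded deck, so no manual renaming is needed afterwards.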

In [2]:
import sqlite3
import os

import pandas as pd

In [3]:
cols = ["Expression", "English definition", "Reading", "Grammar", "Additional definitions", "jlpt"]

def prepare_deck(level: str, name: str) -> pd.DataFrame:
    """Load the notes of an .anki2 file into a dataframe with one column per field.

    Args:
        level: JLPT level to keep. A note is included if its tags contain this string. Use '' to get all notes.
        name: path to the .anki2 file, relative to this notebook.

    Returns:
        The matching notes, sorted by 'Expression'. The 'jlpt' column holds the note's tags.
    """
    if not os.path.isfile(name):
        raise FileNotFoundError(f'{name} does not exist.')
    
    # open the SQLite db and read the notes table into a dataframe
    conn = sqlite3.connect(name)
    try:
        df_col = pd.read_sql_query("SELECT * FROM notes", conn)
    finally:
        conn.close()  # release the file handle once the table is loaded
    
    df5 = df_col[df_col["tags"].str.contains(level)]
    fls = df5["flds"].apply( lambda x: x.split('\x1f'))

    df = pd.DataFrame()
    for i,c in enumerate(cols):
        df[c] = fls.str[i]
    df["jlpt"] = df5["tags"]
    df = df.sort_values('Expression')
    df = df.reset_index(drop=True)
    return df
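
To make the field handling in `prepare_deck` easier to follow, here is a small stdlib-only sketch of the storage format it relies on: each note keeps all of its fields in one `flds` string separated by `\x1f`, and its tags in one space-separated `tags` string. The in-memory table is a reduced stand-in for illustration (an assumption: the real `notes` table has more columns), and the two sample notes are taken from the output shown in this notebook.

```python
import sqlite3

FIELD_SEPARATOR = "\x1f"  # unit separator between the fields of one note

# Reduced, hypothetical stand-in for the notes table of an .anki2 file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, tags TEXT, flds TEXT)")
sample_notes = [
    (1, "jlpt-n2 usually_kana",
     FIELD_SEPARATOR.join(["×", "x (mark), cross", "ばつ", "Noun", "MDMA, ecstasy, molly"])),
    (2, "", FIELD_SEPARATOR.join(["Α", "alpha", "Α", "Noun", ""])),
]
conn.executemany("INSERT INTO notes VALUES (?, ?, ?)", sample_notes)
rows = conn.execute("SELECT tags, flds FROM notes ORDER BY id").fetchall()
conn.close()

parsed = []
for tags, flds in rows:
    parsed.append({
        "fields": flds.split(FIELD_SEPARATOR),  # one list entry per note field
        "tags": tags.split(),                   # whitespace-separated tag names
    })

for note in parsed:
    print(note["fields"][0], note["tags"])
```

An empty trailing field still produces a list entry, so every note here yields five fields, which is why indexing the split list by column position works.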

In [4]:
# check the generated deck

dg = prepare_deck('', 'generated.anki2')
dg

Unnamed: 0,Expression,English definition,Reading,Grammar,Additional definitions,jlpt
0,×,"x (mark), cross",ばつ,Noun,"MDMA, ecstasy, molly",jlpt-n2 usually_kana
1,Α,alpha,Α,Noun,,
2,Β,beta,Β,Noun,,
3,Γ,gamma,Γ,Noun,,
4,Δ,delta,Δ,Noun,,
...,...,...,...,...,...,...
17802,Ｔバック,"T-back, bikini thong",Ｔバック,Noun,,
17803,ＵＮＩＣＥＦ,United Nations Children's Fund (formerly Child...,ＵＮＩＣＥＦ,Noun,,
17804,ＵＳＡ,"United States of America, USA",ＵＳＡ,Noun,"United States Army, US Army, USA",
17805,Ｘ線,X-ray,Ｘ 線[ックスせん],Noun,,


In [5]:
# check the downloaded deck

dd = prepare_deck('', 'downloaded.anki2')
# dd.loc[0, "Expression"] = ("foo") # add a false value to see how it flags and behaves
dd

Unnamed: 0,Expression,English definition,Reading,Grammar,Additional definitions,jlpt
0,×,"x (mark), cross",ばつ,Noun,"MDMA, ecstasy, molly",jlpt-n2usually_kana
1,Α,alpha,Α,Noun,,
2,Β,beta,Β,Noun,,
3,Γ,gamma,Γ,Noun,,
4,Δ,delta,Δ,Noun,,
...,...,...,...,...,...,...
17700,Ｔシャツ,"T-shirt, tee shirt",Ｔシャツ,Noun,,
17701,Ｔバック,"T-back, bikini thong",Ｔバック,Noun,,
17702,ＵＮＩＣＥＦ,United Nations Children's Fund (formerly Child...,ＵＮＩＣＥＦ,Noun,,
17703,Ｘ線,X-ray,Ｘ 線[ックスせん],Noun,,


In [6]:
def compare(df_d: pd.DataFrame, df_g: pd.DataFrame) -> pd.DataFrame:
    df = df_d.merge(df_g.drop_duplicates(), on=['Expression'], 
                   how='outer', indicator=True, suffixes=('_net', '_gen'))
    df = df[~df['_merge'].isin(['both'])]
    return df
df = compare(df_d = dd, df_g = dg)
df
# left (_net) is downloaded. right (_gen) is generated

Unnamed: 0,Expression,English definition_net,Reading_net,Grammar_net,Additional definitions_net,jlpt_net,English definition_gen,Reading_gen,Grammar_gen,Additional definitions_gen,jlpt_gen,_merge
18,あの人,"he, she, that person",あの 人[ひと],Pronoun,you,,,,,,,left_only
24,あり得ない,,,,,,impossible,ありえない,I-adjective,"unthinkable, ridiculous, absurd",usually_kana,right_only
34,いいや,,,,,,"no, nope",いいや,,,,right_only
83,お会計,,,,,,"bill (at a restaurant), check",お 会計[かいけい],Noun,,polite,right_only
93,お化け,,,,,,"ghost, apparition",おばけ,Noun,"goblin, monster, demon; something unusually large",usually_kana,right_only
...,...,...,...,...,...,...,...,...,...,...,...,...
19607,鼻紙,"tissue paper, facial tissue, paper handkerchief",鼻紙[はながみ],Noun,,,,,,,,left_only
19612,０,,,,,,"zero, 0, nought, nil",０,Noun,"nothing, zilch",,right_only
19614,１対１,"one-to-one, one-on-one",１ 対[ちたいいち]１,Noun which may take the genitive case particle...,,,,,,,,left_only
19618,１００均,,,,,,"hundred-yen store, 100 yen shop",１００ 均[きん],Noun,,,right_only
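
As a reading aid for the `_merge` column, the sketch below repeats the same outer merge on two made-up miniature decks (the expressions and readings are invented for illustration; only the column names and merge arguments mirror `compare`). The frame the merge is called on is the left side, so `left_only` marks rows found only in the downloaded (`_net`) deck and `right_only` marks rows found only in the generated (`_gen`) deck.

```python
import pandas as pd

# Made-up miniature decks; only the column names mirror the notebook.
net = pd.DataFrame({"Expression": ["a", "b"], "Reading": ["a1", "b1"]})
gen = pd.DataFrame({"Expression": ["b", "c"], "Reading": ["b2", "c1"]})

# Same merge as compare(): net is the left frame, gen is the right frame.
merged = net.merge(gen, on="Expression", how="outer",
                   indicator=True, suffixes=("_net", "_gen"))

# Keep only expressions that are missing from one of the two decks.
diff = merged[merged["_merge"] != "both"]
status = dict(zip(diff["Expression"], diff["_merge"].astype(str)))
print(status)
```

Note that `b` is classified as `both` even though its reading differs between the two frames: because the merge key is `Expression` alone, this comparison reports added or removed expressions, not changes to the other fields of an expression present in both decks.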


In [7]:
#only in downloaded
dd_only = df[~df['_merge'].isin(['right_only'])]
# drop empty columns
dd_only = dd_only.loc[:, ~dd_only.columns.str.contains('_gen', case=False)]
dd_only

Unnamed: 0,Expression,English definition_net,Reading_net,Grammar_net,Additional definitions_net,jlpt_net,_merge
18,あの人,"he, she, that person",あの 人[ひと],Pronoun,you,,left_only
149,お神輿,portable shrine (carried in festivals),お 神輿[みこし],Noun,"buttocks, lower back, waist, hips",,left_only
151,お節,"osechi, food eaten during the New Year's Holidays",おせち,Noun,,usually_kana,left_only
153,お節料理,"osechi, osechi-ryōri, traditional food eaten d...",お 節料理[せちりょうり],Noun,,,left_only
210,がらんと,clanging,がらんと,"Adverb , Adverb taking the 'to' particle","empty, deserted",,left_only
...,...,...,...,...,...,...,...
19509,高齢,"advanced age, old age",高齢[こうれい],"Noun, Noun which may take the genitive case pa...",,,left_only
19543,鯖,"mackerel (esp. the chub mackerel, Scomber japo...",さば,Noun,server (in an online game),usually_kana,left_only
19586,黒っぽい,"dark, blackish",黒[くろ]っぽい,I-adjective,,,left_only
19607,鼻紙,"tissue paper, facial tissue, paper handkerchief",鼻紙[はながみ],Noun,,,left_only


In [8]:
#only in generated
dg_only = df[df['_merge'].isin(['right_only'])]
dg_only = dg_only.loc[:, ~dg_only.columns.str.contains('_net', case=False)]
dg_only

Unnamed: 0,Expression,English definition_gen,Reading_gen,Grammar_gen,Additional definitions_gen,jlpt_gen,_merge
24,あり得ない,impossible,ありえない,I-adjective,"unthinkable, ridiculous, absurd",usually_kana,right_only
34,いいや,"no, nope",いいや,,,,right_only
83,お会計,"bill (at a restaurant), check",お 会計[かいけい],Noun,,polite,right_only
93,お化け,"ghost, apparition",おばけ,Noun,"goblin, monster, demon; something unusually large",usually_kana,right_only
99,お坊さん,"Buddhist priest, monk",お 坊[ぼう]さん,Noun,son (of others),respectful,right_only
...,...,...,...,...,...,...,...
19596,黒煙,black smoke,黒煙[こくえん],Noun,,,right_only
19600,黙とう,silent prayer,黙[もく]とう,"Noun, Suru verb, Intransitive verb",,,right_only
19612,０,"zero, 0, nought, nil",０,Noun,"nothing, zilch",,right_only
19618,１００均,"hundred-yen store, 100 yen shop",１００ 均[きん],Noun,,,right_only
