# The top k approach

Another method to inspect a model's results is to look at the most and least successful examples, and try to identify patterns. This is what we will do here.

First, we load the data and the model.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path

import sys
sys.path.append("..")

from ml_editor.data_processing import format_raw_df

data_path = Path('../data/writers.csv')
df = pd.read_csv(data_path)
df = format_raw_df(df.copy())



In [2]:
from ml_editor.data_processing import get_split_by_author, get_vectorized_inputs_and_label, add_features_to_df

df = add_features_to_df(df.loc[df["is_question"]].copy(), pretrained_vectors=True)
train_df, test_df = get_split_by_author(df, test_size=0.2, random_state=40)

In [3]:
X_train, y_train = get_vectorized_inputs_and_label(train_df)
X_test, y_test = get_vectorized_inputs_and_label(test_df)

In [4]:
# sklearn.externals.joblib was removed in scikit-learn 0.23; import joblib directly
import joblib

model_path = Path("../models/model_1.pkl")
clf = joblib.load(model_path) 

y_predicted = clf.predict(X_test)
y_predicted_proba = clf.predict_proba(X_test)

Now, we'll use the top k method to look at:

- The k best performing examples for each class (answered and unanswered)
- The k worst performing examples for each class (answered and unanswered)
- The k most unsure examples, where our model's predicted probability is close to 0.5
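
The three selections listed above can be sketched on a toy DataFrame as follows. This is a minimal illustration under stated assumptions, not the implementation of `ml_editor.model_evaluation.get_top_k`: the function name `top_k_sketch`, its `threshold` parameter, and the toy data are all hypothetical.

```python
import pandas as pd


def top_k_sketch(df, proba_col, label_col, k=5, threshold=0.5):
    """Hypothetical sketch of top-k selection (not ml_editor's get_top_k)."""
    predicted_pos = df[proba_col] > threshold
    actual_pos = df[label_col].astype(bool)

    # Split examples by whether the thresholded prediction matches the label
    correct_pos = df[predicted_pos & actual_pos]
    correct_neg = df[~predicted_pos & ~actual_pos]
    missed_pos = df[~predicted_pos & actual_pos]  # positive label, predicted negative
    missed_neg = df[predicted_pos & ~actual_pos]  # negative label, predicted positive

    # Best examples: correct and far from the threshold
    top_pos = correct_pos.nlargest(k, proba_col)
    top_neg = correct_neg.nsmallest(k, proba_col)

    # Worst examples: incorrect and far from the threshold
    worst_pos = missed_pos.nsmallest(k, proba_col)
    worst_neg = missed_neg.nlargest(k, proba_col)

    # Unsure examples: predicted probability closest to the threshold
    distance = (df[proba_col] - threshold).abs()
    unsure = df.loc[distance.nsmallest(k).index]

    return top_pos, top_neg, worst_pos, worst_neg, unsure


# Toy data, for illustration only
toy = pd.DataFrame(
    {
        "predicted_proba": [0.9, 0.8, 0.2, 0.1, 0.3, 0.7, 0.51, 0.49],
        "true_label": [True, True, False, False, True, False, True, False],
    }
)
top_pos, top_neg, worst_pos, worst_neg, unsure = top_k_sketch(
    toy, "predicted_proba", "true_label", k=2
)
```

In this sketch, `worst_pos` holds examples whose true label is positive but which received a low score, which matches how the variable is used in the rest of this notebook.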

In [14]:
from ml_editor.model_evaluation import get_top_k

test_analysis_df = test_df.copy()
test_analysis_df["predicted_proba"] = y_predicted_proba[:, 1]
test_analysis_df["true_label"] = y_test

to_display = [
    "predicted_proba",
    "true_label",
    "Title",
    "body_text",
    "text_len",
    "action_verb_full",
    "question_mark_full",
    "language_question",
]
threshold = 0.5


top_pos, top_neg, worst_pos, worst_neg, unsure = get_top_k(test_analysis_df, "predicted_proba", "true_label", k=5)
pd.options.display.max_colwidth = 500

Most confident correct positive predictions

In [15]:
top_pos[to_display]

Id,predicted_proba,true_label,Title,body_text,text_len,action_verb_full,question_mark_full,language_question
39327,0.83,True,How to write female characters as a male writer?,"Yesterday I asked a question about writing a female character who has agency. Much to my surprise, it was well-received and generated a lot of great discussion. In reading that discussion, however, I realized that I asked the wrong question, at least for what I'm trying to understand.\nAs a somewhat introspective male, I have a reasonably easy time writing from the perspective of a male character. Internal monologue, or at least describing his perspective, comes easy for me. From experie...",890,True,True,False
24729,0.82,True,"Can I use ""fuck"" as a non-vulgar verb in a fantasy/steampunk world?","I've been sending my fourth-ish novel through the my writing group. It is about a trio of teenagers running away from some mercenaries. One of them (Maris) is a girl who has only had a year of formal education but grew up on a crowded lumber mill. She has a rather blunt way of speaking.\nIn the story, the POV character (Kanéko) is rescued by the other two.\n\nKanéko worried her lip. ""Why?""\n""You were in need.""\nMaris' ears drooped and she looked sad. ""And Ruben said you were in\n trouble. A...",397,True,True,False
35266,0.82,True,How to manage getting depressed by what my main character goes through?,"I'm writing a war (sci-fi) novel. The MC dies in the end. It's not as thoroughly depressing as ""All Quiet on the Western Front"", but Remarque's work is definitely one source of inspiration.\nNow, partway through writing, I find that writing is too painful for me to continue. It's not that I'm writing a particularly depressing passage - on the contrary. At this point my MC is eager and full of hope. What gets to me is the projection of the story: there are many ups, there's love and camarader...",257,True,True,False
27420,0.81,True,What's better in fiction: to make personal statements or universal statements?,"Here's an example from my own writing:\n\nWatching the ceiling fan stir my thoughts, I said, “His favorite thing\n was to tell me about his day.”\nMrs. Saeki gawked at me behind her square glasses. “Unusual from a\n husband.”\nI nodded. “And they were all about little things. You know, how he\n picked his polka tie instead of the stripped one. What mobile games he\n played on his way to work. Why he called me at four and not at five.”\n“You didn’t feel bored?”\nI shook my head. “When you...",197,True,True,False
40511,0.8,True,"Painting ritualistic murder in a ""good-guy"" light?","My good guys murder people. They slowly carve runes onto them to help defeat the bad guys. Sure they try to use ""society's worst"" people for the rituals, but realistically that doesn't always happen.\nThis is revealed to the main character and when the MC pushes, the good guys are unapologetic about all of it (similar to how the first paragraph is written). Of course MC shuns them and dissociates himself with them.\nMy goal is to make the MC come to view the good guys as Good Guys.\nEventual...",312,True,True,False


Most confident correct negative predictions

In [17]:
top_neg[to_display]

Id,predicted_proba,true_label,Title,body_text,text_len,action_verb_full,question_mark_full,language_question
7878,0.18,False,"When quoting a person's informal speech, how much liberty do you have to make changes to what they say?","Even during a formal interview for a news article, people speak informally. They say ""uhm"", they cut off sentences half-way through, they interject phrases like ""you know?"", and they make innocent grammatical mistakes.\nAs somebody who wants to fairly and accurately report the discussion that takes place in an interview, what guidelines should I use in making changes to what a person says?\nWhile the simplest solution is to write exactly what they say and [sic] any errors they make, that can...",116,True,True,False
28569,0.27,False,How to invent a new language in writing?,I'm inventing a new planet with a new species and I don't have an official language for them. I need help with how to give them speech.\n,31,False,True,False
8204,0.29,False,Separate paragraphs without line breaks,"I have a medium which does not have line breaks, and a few paragraphs of text. How can I separate the paragraphs from each other visually, clearly? I additionally cannot add information which isn't either the content or this paragraph separator, so the separator must be intuitively understood by the reader.\nThis is the best I've got so far:\n\nLorem ipsum dolor sit amet, consectetur adipiscing elit. Integer congue laoreet sapien eu sollicitudin. Vestibulum et iaculis dui, nec elementum enim...",217,True,True,False
7831,0.3,False,in text citation for handbook,"I want to do in text citation of my ""DK Handbook"", \nIn other words, I don't want put it's citation in 'work cited' section as it would be short and so obvious\nI searched internet and MLA handbook but foud nothing,\nI am wondering can someone help me to cite it in MLA format:\nit is a DK-Handbook second edition & custom edition for my college and published by Longman (I don't know whether I should mention publisher in 'in text citation') and I want to cite page 85 which is in third chapter\n",107,True,False,False
9472,0.3,False,Using emails in an autobiography,I am writing about my life experiences and I want to use email correspondence between myself and another to tell part of the story. I have not used real names in my work and have altered the emails accordingly. Do I need permission to use these emails?\n,50,False,True,False


Most of the correct negative predictions are short questions. This matches our feature importance analysis, which showed question length to be one of the most important features.
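
The length observation can be checked numerically rather than by eye. The sketch below uses only the `predicted_proba` and `text_len` values from the ten rows displayed above; the `toy` DataFrame and the `predicted` column are names introduced here for illustration.

```python
import pandas as pd

# Values copied from the top_pos and top_neg tables displayed above
toy = pd.DataFrame(
    {
        "predicted_proba": [0.83, 0.82, 0.82, 0.81, 0.80, 0.18, 0.27, 0.29, 0.30, 0.30],
        "text_len": [890, 397, 257, 197, 312, 116, 31, 217, 107, 50],
    }
)

# Label each row by the thresholded prediction
toy["predicted"] = (toy["predicted_proba"] > 0.5).map(
    {True: "answered", False: "unanswered"}
)

# Median question length for each predicted class
median_len = toy.groupby("predicted")["text_len"].median()
print(median_len)
```

Ten hand-picked extremes are not a representative sample, so the same `groupby` would need to be run on the full `test_analysis_df` before drawing conclusions.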

Let's look at the most confident incorrect negative predictions: questions that were answered, but that the model gave a low probability of being answered.

In [18]:
worst_pos[to_display]

Id,predicted_proba,true_label,Title,body_text,text_len,action_verb_full,question_mark_full,language_question
19509,0.27,True,How to copyright a book without lawyer and outside USA?,"I would like to publish an ebook with amazon and i dont have time/money to keep copyrights with a lawyer. I am in EU, are there any ways to minimally protect my copyright? \nMaybe i can send the wordfile to my email account as basic proof that i wrote the text first?\n",56,True,True,False
18613,0.28,True,"Addressing ""logo-ification"" of an organization's name in their literature","I need help finding some style rules to address an issue with a client. I'm working with an organization whose logo uses caps and italics with no spacing, like so:\nVANDELAYindustries\nKind of '90s, but whatever.\nThe problem is, every time they write out their company name on the web or printed media, they do it with caps and italics. So it looks like this:\n\nVANDELAYindustries consectetur adipiscing elit. Sed mollis lorem nisl, ac egestas odio tincidunt sed. Mauris ligula VANDELAYindustri...",230,False,True,False
36217,0.29,True,How can I write in a way that makes the book very interesting to read?,"This is a really weird question, but, for me this matters because English is not my first language.\nWhenever I read these famous novels, the authors use words which have really deep meanings and which makes the book very interesting to read. So my question is: how do I write that way?\nI really want to improve my writing as well as my English.\nAnd also I am writing a book and I want to make it as good as I can, I would love to get some tips regarding that.\n",101,True,True,False
18945,0.3,True,What makes the death of a character satisfying?,"My wife and I were watching a movie where one of the main characters died shortly after being reunited with his long lost love. His death was not meaningless -- he died defending his only son, but it was completely unsatisfying. I wasn't happy that he died. I felt torn. \nOn the other hand, there have been movies and stories where a key character dies and, although it is sad, I am good with it. It works. The sorrow and death are satisfying and add to the depth of the story rather than make m...",126,True,True,False
27609,0.31,True,How to publish a collection of short stories?,I have a collection of short stories. I want to publish them. I need help in this. Should I pay for publishing or are there any agencies that publish/encourage budding short story writers?\n,37,False,True,False


On the flip side, we find many short questions that were answered but that our model predicted would not be.

Most confident incorrect positive predictions

In [19]:
worst_neg[to_display]

Id,predicted_proba,true_label,Title,body_text,text_len,action_verb_full,question_mark_full,language_question
38996,0.79,False,What are key features and pacing in a satisfying ending to a science fiction novel?,"My novel has been through multiple drafts and beta reads, and by and large is in good shape. I've learned how to cure a saggy middle, how to stay in point-of-view, how to keep the protagonist driving the action by working toward their want. And so on. The shape of my novel is generally OK, but by the time I reach the end (climax), I'm simply ready for all the ends to be tied up.\nSo they are, all the contracts are filled, and by and large the ending does what it needs to. It solves the puzzl...",281,True,True,False
39613,0.79,False,How to write an interview-style story without it being an infodump?,"Inspired by the Underworld setting where vampires slept in steampunk-styled sarcophagi, slowly deteriorating until they were woken and fed, I wrote a story about vampires as aliens who crash-landed on Earth in the twelfth century, holed up in the hills of Eastern Europe, and somehow survived.\nIn the opening of the story, the protagonist is sent to interview one of the prominent vampires by someone in the American government. It seems that world governments have known of the vampires for so...",489,True,True,False
35061,0.77,False,"""Calm"" vs Adventurous Main Protagonist","When I first started thinking about this one particular story I wanted to write, I envisioned the main protagonist as a more ""calm"", ""reactive"" type of character. But as I spend more time building the world and fleshing out the characters, I grow convinced that such a protagonist would be boring at best and Mary Sue-ish at worst.\nI think an adventurous, explorer of a protagonist would be much more fun--proactive as opposed to reactive. However, I just cannot fit the character I'm picturing ...",666,True,False,False
31456,0.75,False,"Is the strategy described here an effective one, to distinguish character voice?","I have roughly 30 speaking characters; about ten speak often enough that their voices should be well defined. I have compiled, from web sources, various considerations when building distinct character voice. \nOriginally, they all sounded similar to each other and to my education and background. I didn't worry about this in the mad frenzy to finish the first draft. Since then, I have been using a brute force, somewhat mechanical approach to making the characters distinct from one another (an...",677,True,True,False
27851,0.74,False,"Flashback or Framing, does either work","After reading up on flashbacks, both on this site and others, I learned that flashbacks should be used sparingly since most readers enjoy a story from A to Z. I feel very strongly about having a form of a flashback, but can't decide which would be more appropriate. I want to hook the reader by displaying the danger and darkness of my world, but I don't feel that's possible starting off with a relatively safe adolescent child. \n\nFull Flashback - My protagonist starts off during some event a...",531,True,True,False


Most unsure questions

In [20]:
unsure[to_display]

Id,predicted_proba,true_label,Title,body_text,text_len,action_verb_full,question_mark_full,language_question
6545,0.5,True,"What is the ""three-asterisk-break"" section called?","What is the name for a text subdivision that is shorter than a chapter, but longer than a paragraph? They are always untitled (unlike chapters, which are at the very least called ""Chapter 1"", ""2"" etc), and denoted with either three asterisks, a longer blank space between paragraphs, or some other fancy marks.\n(specifically, I'm not asking about name for the break itself, but for the section of text delimited by such breaks.)\n",93,True,True,False
2977,0.5,False,Correct format when talking about money $$,"I'm writing a formal research paper (Highschool)\nI have lots of statistics involving money (In fact, my whole essay is about the economy). \nSo a question arises about the format of writing money. \nWould it be:\nThe company spent $4.5 billion dollars.\nor simply\nThe company spent $4.5 billion. \nDo I need to include the ""dollars"" at the end? Which one is more customary?\nThank you\n",83,False,True,False
18921,0.5,True,"Thesis: Discussion, Conclusion, Summary, Outlook,","Finally, my phd thesis in natural science is coming to an end and I am facing the problem of structuring the concluding chapters. Since my faculty/university does not dictate anything, I have to decide this on my own. Now, I am realizing a slight chaos in my thoughts which I would appreciate to reduce with your help.\nHow would you conclude the thesis? What are the differences and overlaps between summary, discussion, outlook, conclusion, etc.? May I, for example, leave out a ""Summary"" chapt...",188,True,True,False
6888,0.5,False,Too much exposition in my full-length play: how to fix it?,"Dear Writers & Playwrights;\nI'm working on my first full-length play. In workshops, the feedback is consistently, ""Too much exposition."" I agree with the criticism.\nWhat I'm struggling with is . . . now what?\nAny suggestions on how to transform exposition to action?\nI know there's no simple answer to that question but any thoughts would be appreciated.\n",73,True,True,False
38907,0.5,True,Should I avoid sex scenes / nudity in my horror game (or in general)?,"As the title might suggest, I'm working on a horror video game, which happens to have a few sex scenes and a bit of nudity here and there - at least, that's what I planned to do. Just FYI: I don't actually plan on selling it anywhere, I plan on making it freely available and self-published. I don't care about getting a rating, but I do care that nobody underage even sees the game. And no, I would probably not even publish it on Steam, even if they recently opened the floodgates for games wit...",582,True,True,False
