Fail to use Classifier class for predict politeness probability #67

Closed
jionghaolin opened this issue Sep 19, 2020 · 6 comments

@jionghaolin

Hi,

I tried to run politeness_demo.ipynb, but I got KeyError: 'politeness_strategies'. Is a file missing?

Initialized default classification model (standard scaled logistic regression).
Using corpus objects...

KeyError Traceback (most recent call last)
in
3 labeller=lambda utt: utt.meta['Binary'] == 1)
4
----> 5 clf_cv.evaluate_with_cv(binary_corpus)

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/convokit/classifier/classifier.py in evaluate_with_cv(self, corpus, objs, cv, selector)
219 if corpus:
220 print("Using corpus objects...")
--> 221 X, y = extract_feats_and_label(corpus, self.obj_type, self.pred_feats, self.labeller, selector)
222 else:
223 assert objs is not None

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/convokit/classifier/util.py in extract_feats_and_label(corpus, obj_type, pred_feats, labeller, selector)
85 :return: matrix of predictive features and numpy array of labels
86 """
---> 87 obj_id_to_feats = extract_feats_dict(corpus, obj_type, pred_feats, selector)
88 obj_id_to_label = extract_label_dict(corpus, obj_type, labeller, selector)
89

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/convokit/classifier/util.py in extract_feats_dict(corpus, obj_type, pred_feats, selector)
36 :return: dictionary mapping object id to a dictionary of predictive features
37 """
---> 38 obj_id_to_feats = {obj.id: extract_feats_from_obj(obj, pred_feats) for obj in corpus.iter_objs(obj_type, selector)}
39
40 return obj_id_to_feats

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/convokit/classifier/util.py in (.0)
36 :return: dictionary mapping object id to a dictionary of predictive features
37 """
---> 38 obj_id_to_feats = {obj.id: extract_feats_from_obj(obj, pred_feats) for obj in corpus.iter_objs(obj_type, selector)}
39
40 return obj_id_to_feats

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/convokit/classifier/util.py in extract_feats_from_obj(obj, pred_feats)
18 retval = dict()
19 for feat_name in pred_feats:
---> 20 feat_val = obj.meta[feat_name]
21 if type(feat_val) == dict:
22 retval.update(feat_val)

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/convokit/model/convoKitMeta.py in __getitem__(self, item)
16
17 def __getitem__(self, item):
---> 18 return dict.__getitem__(self, item)
19
20 @staticmethod

KeyError: 'politeness_strategies'

@calebchiam
Collaborator

calebchiam commented Sep 19, 2020

Hi @jionghaolin, I wasn't able to replicate the error. KeyError: 'politeness_strategies' means that the utterance metadata does not contain the politeness_strategies key, i.e. the utterances have likely not been annotated with their politeness strategies.

I would recommend re-running the notebook from scratch or confirming that this line wiki_corpus = ps.transform(wiki_corpus, markers=True) ran successfully. Let us know how it goes!
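A quick way to confirm that the annotation step ran is to list the utterances whose metadata lacks the key. The following is a minimal sketch, not part of ConvoKit: `ids_missing_meta_key` is a hypothetical helper that only assumes each object exposes `id` and a dict-like `meta`, and the stand-in objects let it run without downloading a corpus.

```python
from types import SimpleNamespace


def ids_missing_meta_key(objs, key):
    """Return the ids of objects whose metadata does not contain `key`."""
    return [obj.id for obj in objs if key not in obj.meta]


# Stand-in utterances (hypothetical data). With ConvoKit you would pass
# corpus.iter_utterances() instead of this list.
utterances = [
    SimpleNamespace(id="u1", meta={"Binary": 1, "politeness_strategies": {}}),
    SimpleNamespace(id="u2", meta={"Binary": 0}),
]

missing = ids_missing_meta_key(utterances, "politeness_strategies")
print(missing)
```

A non-empty result would suggest that the PolitenessStrategies transform has not been applied to those utterances yet.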

@jionghaolin
Author

Hi @calebchiam, thanks for your reply. Yes, the demo did run successfully. However, what I meant to ask about was predicting politeness scores on my own dataset. Sorry for the misleading question.

As I understand it, Classifier() takes the features named in its parameters as input and uses logistic regression as the default classifier. I obtained a test score of 0.73. However, the paper [1] reports an accuracy of 83.79% using politeness strategies and unigrams.

Therefore, my question is whether you have a script or a pretrained model (one that achieves the performance reported in [1]) that I can use to predict politeness scores on my dataset.

[1] A computational approach to politeness with application to social factors

@calebchiam
Collaborator

calebchiam commented Sep 22, 2020

The Classifier uses a standard-scaled logistic regression by default, but the paper used a different model involving an SVM, if I remember correctly, so you probably want to pass a different sklearn model to Classifier. @cristiandnm, could you advise on this?
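As an illustration of swapping in a different sklearn model, here is a minimal sketch of a scaled linear-kernel SVM pipeline. The toy data and hyperparameters are made up, and this is not the model from the paper. Because the thread is about politeness probabilities, it uses `SVC(probability=True)` so that `predict_proba` is available. The final comment shows how the pipeline might be handed to ConvoKit; the `clf` argument name is an assumption to check against the installed version.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Scaled linear-kernel SVM. probability=True enables predict_proba,
# which is needed if probability scores are wanted downstream.
svm_clf = Pipeline([
    ("scaler", StandardScaler(with_mean=False)),
    ("svm", SVC(kernel="linear", probability=True, random_state=0)),
])

# Toy binary features standing in for strategy indicators (made-up values).
X = [[1, 0, 0], [1, 1, 0]] * 3 + [[0, 0, 1], [0, 1, 1]] * 3
y = [1] * 6 + [0] * 6

svm_clf.fit(X, y)
predictions = svm_clf.predict(X)
probabilities = svm_clf.predict_proba(X)[:, 1]  # P(label == 1) per row
print(predictions)

# Assumption: the pipeline would then be passed to ConvoKit, e.g.
#   Classifier(obj_type="utterance", pred_feats=["politeness_strategies"],
#              labeller=lambda utt: utt.meta["Binary"] == 1, clf=svm_clf)
```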

@cristiandnm
Contributor

Thanks @calebchiam.

ConvoKit only provides a demonstration of a classifier based on politeness strategies alone. As @jionghaolin noted, the results in the paper refer to a classifier that uses both strategies and unigrams, which explains the difference. We do not have a pretrained model for that, but it should be relatively straightforward to extend the demo to include unigrams as well, using the new VectorClassifier class.
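To make the idea of combining strategies with unigrams concrete, here is a minimal sketch in plain Python. It does not use VectorClassifier or any ConvoKit API. `combine_features`, the `unigram_` prefix, the whitespace tokenization, and the strategy names are all hypothetical choices for illustration.

```python
def combine_features(text, strategy_feats):
    """Merge binary unigram indicators with strategy features into one dict."""
    # Naive lowercase whitespace tokenization (an assumption, not the paper's).
    unigrams = {f"unigram_{tok}": 1 for tok in text.lower().split()}
    combined = dict(strategy_feats)  # copy so the input is not mutated
    combined.update(unigrams)
    return combined


# Hypothetical strategy indicators for one utterance.
strategies = {"strategy_please": 1, "strategy_gratitude": 0}
features = combine_features("Could you please help", strategies)
print(sorted(features))
```

Feature dicts like this could then be vectorized (for example with sklearn's DictVectorizer) before being passed to a classifier.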

@BonJovi1

Hi Caleb @calebchiam
I was running into a similar issue and realized that my parser.transform isn't working. Here's what I'm doing:

from convokit import Corpus, TextParser, download
wiki_corpus = Corpus(download("wikipedia-politeness-corpus"))
parser = TextParser(verbosity=1000)

And then when I do:

wiki_corpus = parser.transform(wiki_corpus)

It gives me a weird StopIteration error. I'm trying to replicate this from the example notebook here.

I also tried with the other dataset like this:

train_corpus = Corpus(filename=download('wiki-politeness-annotated'))
parser = TextParser()
parser.transform(train_corpus)

But I run into the same error.

However, when I do the transform using PolitenessStrategies, that works!

from convokit import PolitenessStrategies
ps = PolitenessStrategies()
wiki_corpus = ps.transform(wiki_corpus, markers=True)

This one works perfectly!
Could you kindly help me out with the TextParser?

Thanks a lot,
Abhinav

@calebchiam
Collaborator

calebchiam commented Dec 13, 2021

Hi Abhinav, I wasn't able to replicate your issue. Could you elaborate on the StopIteration error thrown? That might shed some light on what the actual issue is.

Also, this seems like a distinct error from the one originally raised here, so please open a separate issue to continue this discussion, thanks!
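For anyone preparing such a report, the full traceback can be captured as text with the standard library. This is a minimal sketch: `run_and_capture` and `failing_step` are hypothetical names, and `failing_step` merely stands in for the call that fails (for example `parser.transform(corpus)`).

```python
import traceback


def run_and_capture(fn, *args, **kwargs):
    """Call fn; return (result, None), or (None, traceback text) on failure."""
    try:
        return fn(*args, **kwargs), None
    except Exception:
        return None, traceback.format_exc()


def failing_step(_corpus):
    # Hypothetical stand-in for the failing call.
    return next(iter([]))  # raises StopIteration


result, report = run_and_capture(failing_step, None)
print(report)
```

Pasting the printed report into a new issue, together with the installed package versions, gives maintainers the frames needed to locate the failure.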
