# Using a custom text classifier

Now that we've built a custom text classifier, let's try to use it!

In [None]:
pip install -q transformers sentencepiece huggingface_hub scikit-learn

In [23]:
from transformers import pipeline
import pandas as pd
import huggingface_hub

pd.options.display.max_colwidth = 500

## Load in our dataset

In [24]:
df = pd.read_csv("wapo-app-reviews-huggingface-full.csv")
df.head()

Unnamed: 0,index,Country,Date,Rating,Review,Version,source,racism,bullying,sexual
0,2,US,11/22/2019,1,Get rid of micro transactions or i will find a new app to use. Why should i have to pay for that it’s so stupid,3.8,holla,0.0,0.0,0.0
1,6,US,11/21/2019,1,This is good but most of my messages never show up. This is very crapy and needs to be fixed,3.3.1,skout,0.0,0.0,0.0
2,8,US,11/20/2019,1,I was really enjoying this app. This brought me out of the box. I’m an extremely shy person and this gave me somewhere to talk to nice people. I just got kicked of bc I’m 16 not “18” and I think that this change it kind of stupid bc yeah it’s for protection but like someone else said all you have to do is put age preferences like you do for gender not that hard I wish this wasn’t the case or this would have received a 5 star rating bc I really liked this app I would like to re download this ...,4.2.1,holla,0.0,0.0,0.0
3,13,US,11/14/2019,1,It won’t lemme go live or anything like I think you fixed it for everyone but me and now it says I’m banned for no reason I didn’t even do anything,2.9,holla,0.0,0.0,0.0
4,15,US,11/12/2019,1,No real ppl all fake or no reply,5.8.2,skout,0.0,0.0,0.0


## Use our model

We've set our model to private, so we'll need to log in to Hugging Face to be able to use it.

In [1]:
huggingface_hub.login()

Token is valid.
Your token has been saved to /Users/soma/.cache/huggingface/token
Login successful


Once we've done that, we can use the model just like we did in the sentiment analysis notebook!

**You'll need to change the `model="XXXXXX"` line to match your model's name.** Mine was something like `wendys-llc/autotrain-wapo-v3-38832102021` (I recommend using the copy button at the top of your model's web page).

In [16]:
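# "sentiment-analysis" is an alias for the generic "text-classification" task,
# so it works for any text classifier, not just sentiment models.
sentiment_pipeline = pipeline(
    "sentiment-analysis",
    # tokenizer="wendys-llc/autotrain-wapo-v3-38832102021",
    model="wendys-llc/autotrain-wapo-v3-38832102021",
    # Use the token saved by huggingface_hub.login();
    # newer transformers releases spell this token=True
    use_auth_token=True)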

In [18]:
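# Classify every review; each result is a dict with a 'label' and a 'score'
results = sentiment_pipeline(df.Review.tolist())
# Prefix the columns so they become prediction_label and prediction_score
results = pd.DataFrame(results).add_prefix('prediction_')
# Attach the predictions to the original rows (both share the default 0..n-1 index)
scored = df.join(results)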
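
If the reshaping in that cell feels a little magical, here is a minimal sketch of the same steps that runs without downloading the model. The `fake_results` list and the three-row `reviews` frame are made-up stand-ins (not real model output), shaped the way a text classification pipeline returns its predictions: one dictionary per input text.

```python
import pandas as pd

# Hypothetical stand-in for what the pipeline returns: one dict per input text
fake_results = [
    {"label": "0", "score": 0.91},
    {"label": "1", "score": 0.77},
    {"label": "0", "score": 0.64},
]

# Made-up reviews standing in for df; note the default 0, 1, 2 index
reviews = pd.DataFrame({"Review": ["fine app", "creepy users", "too many ads"]})

# A list of dicts becomes one row per dict, one column per key
predictions = pd.DataFrame(fake_results).add_prefix("prediction_")

# join matches rows by index, so row 0 of reviews gets row 0 of predictions
scored_demo = reviews.join(predictions)
print(scored_demo)
```

Because `join` lines rows up by index, this only works when the original dataframe still has its default 0, 1, 2, ... index. If you filtered or re-sorted `df` before classifying, calling `df.reset_index(drop=True)` first keeps the predictions attached to the right reviews.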

In [38]:
scored.sort_values(by=['prediction_label', 'prediction_score'], ascending=False).head(20)

Unnamed: 0,index,Country,Date,Rating,Review,Version,source,racism,bullying,sexual,prediction_label,prediction_score
305,931,US,12/06/2018,2,"All the guys just want nudes and are somehow worse than tinder. All the girls ghost you and just want you to add them on Snapchat. Why on earth would I download this app to add you on another platform?? This is a terrible app, not because of its features but because there is zero moderation. Half of the accounts are inactive and if you do get matched with someone they’ll never message you, ever.",3.33,holla,0.0,0.0,1.0,1,0.773591
38,82,US,09/24/2019,1,"The App Makes It too Easy For Creeps To Be Creeps, If You’re having a Good Time, all good Vibes and someone less social wants attention they can easily get you blocked or banned. A lot of foreign guys Constantly trying to show off their body parts, Not cool, App staff don’t care at all but will respond with Robots. And the list goes on. Good Luck is all I’m Saying smh.",5.5.2,skout,0.0,0.0,1.0,1,0.756856
227,642,US,02/14/2019,2,"It would be nice if there were actually people on the site to chat, talk to, and etc. I don't know the difference between the paid and unpaid version is. It would be nice to possibly have other ways to unlock features instead of downloading apps, buying offers, or general buying. It's hard on the phone after awhile in use. There is a lot of people on it but it is complete silent work other wise.",3.8.0,skout,0.0,0.0,0.0,1,0.737548
307,933,US,12/06/2018,1,This app is now full of crappy ppl .. The gals here r total fake I chatted with 6 gals back to back n all turned out to b asking to view their webcam. And pay by registering to a third party site. Scout Plz filter ur users well .. Not even worth putting any time in this app anymore.,4.21.2,skout,0.0,0.0,1.0,1,0.735461
260,767,US,01/08/2019,2,"Through this app, I've met interesting people from around the world. I HATE the lack of safeguards against potentially underage users though. They really need to come up with a way to better enforce the age requirement. The idea of underage people coming into contact with with so many older people, some suspiciously old, on a dating app makes me nervous. It can be difficult to discern what is legitimate information, what profile is real, what pictures are real -- even the age or sex of a use...",4.19.1,skout,0.0,0.0,0.0,1,0.727437
237,668,US,01/29/2019,2,"I have not used this app in over a year, and it's worse now than it was then. It's even less organized. It was easier to display or organized people in the meet feature before. It still doesn't do the best job or sorting people by distance. The current version is a step backwards. I've used the app in a few different cities , but only a few have active profiles from real people.",4.22.1,skout,0.0,0.0,0.0,1,0.7259
61,138,US,08/27/2019,1,I went on there to find new friends but I can’t go 3 matches without finding a creepy 30 year old or finding some guy with his d*** out!!!,3.1.3,holla,0.0,0.0,1.0,1,0.724402
172,500,US,03/26/2019,1,"Wow! I was looking for an app to try and meet some new friends in the local area and was instantly exposed to everything wrong with society today. Numerous ads popping up, fake users galore, spam emails, the list goes on. Steer clear!",5.6.1,skout,0.0,0.0,0.0,1,0.724078
234,659,US,02/03/2019,1,"Half of the people on this app are fake. Tons of people are rude and when I was rude back to those people they banned me from the site. I ""appealed"" my case and got no response. Needless to say I was far from happy with this app.",3.3.1,skout,0.0,1.0,0.0,1,0.719157
142,408,US,04/27/2019,1,"The only reason I even had this app was because I could talk to strangers and meet new people WITH a friend. By taking away two player mode, you are taking away the best part of the app. Unless it is brought back, I will not be using anymore.",4.1.8,holla,0.0,0.0,0.0,1,0.716221


In [27]:
scored.prediction_label.value_counts()

0    209
1    121
Name: prediction_label, dtype: int64

## But how did it really do?

While we have measurements like "precision," "accuracy," and "recall," looking at the actual counts in a confusion matrix (a small grid of boxes comparing what each review really is against what the model predicted) is far more useful than those abstract numbers.

In [39]:
from sklearn.metrics import confusion_matrix

# The predicted labels are the strings '0' and '1', so we convert
# the numeric 'sexual' column to matching strings before comparing
y_true = scored.sexual.replace({0: '0', 1: '1'})
y_pred = scored.prediction_label
matrix = confusion_matrix(y_true, y_pred)

label_names = pd.Series(['not creepy', 'creepy'])
pd.DataFrame(matrix,
     columns='Predicted ' + label_names,
     index='Is ' + label_names)


Unnamed: 0,Predicted not creepy,Predicted creepy
Is not creepy,209,105
Is creepy,0,16
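
For reference, the summary measurements mentioned earlier can be read straight off the grid. The sketch below just copies the four counts from the confusion matrix above and does the arithmetic by hand; the variable names are my own, and "creepy" is treated as the positive class.

```python
# Counts copied from the confusion matrix above
true_negative = 209   # is not creepy, predicted not creepy
false_positive = 105  # is not creepy, predicted creepy
false_negative = 0    # is creepy, predicted not creepy
true_positive = 16    # is creepy, predicted creepy

total = true_negative + false_positive + false_negative + true_positive

# Share of all reviews the model labeled correctly
accuracy = (true_positive + true_negative) / total
# Of the reviews predicted creepy, the share that really were
precision = true_positive / (true_positive + false_positive)
# Of the reviews that really were creepy, the share the model caught
recall = true_positive / (true_positive + false_negative)

print(f"accuracy:  {accuracy:.3f}")   # 0.682
print(f"precision: {precision:.3f}")  # 0.132
print(f"recall:    {recall:.3f}")     # 1.000
```

The model caught all 16 creepy reviews, but only about 13% of the 121 reviews it flagged were actually creepy. The grid makes that trade-off visible at a glance.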
