In [1]:
from transformers import pipeline
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

# transformers-4.6.0-py3-none-any.whl
# torch-1.8.1-cp39-cp39-win_amd64.whl

In [2]:
sequence_to_classify = "one day I will see the world"
candidate_labels = ['time to code']

classifier(sequence_to_classify, candidate_labels)

{'sequence': 'one day I will see the world',
 'labels': ['time to code'],
 'scores': [0.010450021363794804]}

Start making some test cases: what should get recognized, and what shouldn't?

In [18]:
cases = [
"time to code!",
"time to code i guess. ehhhh",
"time to code! make sure I stay off of here please",
"gotta work on my house, the roof caved in. I guess it wasn't up to code",
"gotta work on my house, the roof caved in. I thought it was up to code",
"gotta work on my house, the roof caved in - it was up to code!",
"im programming today",
"im programming today, ama",
"no programming today",
"too distracted to code rn",
"agahghaggagghhhhh",
"at a dance recital, the program has awful kerning",
"i'd really like to code today",
"dang, wanna program today",
"tried to program but kept running into tensorflow errors",
"goodbye for now!",
"please keep me off twitter, thanks",
"too much twitter for today",
"alright, let's get some work done today. see ya!",
"Time for code, if you catch me on Twitter tell me to stop"
]

candidate_labels = ['time to code', 'twitter', 'offline', 'goodbye', 'online']

thresh = 0.8

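for case in cases:
    y = classifier(case, candidate_labels, multi_label=True)
    # NOTE: y['scores'] comes back sorted by likelihood and lines up with y['labels'],
    # so the candidate_labels[l] printed on each row may not be the label that earned the score.
    for l in range(len(candidate_labels)):
        print(y['scores'][l], "\t\t", y['scores'][l] > thresh, "\t\t", candidate_labels[l], "\t\t", case)
    print('---')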

0.991550624370575 		 True 		 time to code 		 time to code!
0.5582633018493652 		 False 		 twitter 		 time to code!
0.1177329421043396 		 False 		 offline 		 time to code!
0.004087286535650492 		 False 		 goodbye 		 time to code!
0.00010145625128643587 		 False 		 online 		 time to code!
---
0.9919251799583435 		 True 		 time to code 		 time to code i guess. ehhhh
0.3530835211277008 		 False 		 twitter 		 time to code i guess. ehhhh
0.029680512845516205 		 False 		 offline 		 time to code i guess. ehhhh
0.008168640546500683 		 False 		 goodbye 		 time to code i guess. ehhhh
0.000701444165315479 		 False 		 online 		 time to code i guess. ehhhh
---
0.9928596019744873 		 True 		 time to code 		 time to code! make sure I stay off of here please
0.982445240020752 		 True 		 twitter 		 time to code! make sure I stay off of here please
0.5659012794494629 		 False 		 offline 		 time to code! make sure I stay off of here please
0.1019396036863327 		 False 		 goodbye 		 time to code! make sure I stay off of here please
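
One thing to watch when reading these columns: the pipeline documents `labels` as sorted by likelihood, with `scores` in the same order. That means `y['scores'][l]` belongs to `y['labels'][l]`, which is not necessarily `candidate_labels[l]`, so the label column above may be attached to the wrong scores. A minimal sketch of pairing them by position in the returned result (the `y` dict here is a hand-written stand-in, not real model output):

```python
thresh = 0.8

# Hand-written stand-in for one pipeline result, NOT real model output.
y = {
    "sequence": "example text",
    "labels": ["twitter", "time to code", "online"],  # already sorted by score
    "scores": [0.9, 0.4, 0.1],
}

# Pair each score with the label the pipeline returned at the same position.
rows = [(score, score > thresh, label)
        for label, score in zip(y["labels"], y["scores"])]

for score, passed, label in rows:
    print(score, "\t\t", passed, "\t\t", label, "\t\t", y["sequence"])
```

With the real classifier, swapping `candidate_labels[l]` for `y['labels'][l]` in the print loop should have the same effect.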

```
0.9943760633468628 		 True 		 time to code 		 Time for code, if you catch me on Twitter tell me to stop
0.9410650134086609 		 True 		 twitter 		 Time for code, if you catch me on Twitter tell me to stop
0.9284548759460449 		 True 		 offline 		 Time for code, if you catch me on Twitter tell me to stop
0.9129378199577332 		 True 		 goodbye 		 Time for code, if you catch me on Twitter tell me to stop
0.35043689608573914 		 False 		 online 		 Time for code, if you catch me on Twitter tell me to stop
```

This one is kinda interesting. I was expecting that 'offline' was getting tagged because of its association with 'twitter', so I added 'online' to verify. If you can't trust a single class, maybe you can trust class + anticlass consensus?
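
A rough sketch of what that class + anticlass check could look like. The function name `consensus` and the `anti_thresh` cutoff are made up for illustration, and the scores are copied from the output above, taking its label column at face value:

```python
def consensus(scores, label, anti_label, thresh=0.8, anti_thresh=0.5):
    """True only if `label` clears `thresh` AND `anti_label` stays under `anti_thresh`.

    `anti_thresh` is an assumed cutoff, not something tuned on real data.
    """
    return scores[label] > thresh and scores[anti_label] < anti_thresh


# Scores copied from the 'Time for code, ...' output above.
scores = {
    "time to code": 0.9943760633468628,
    "twitter": 0.9410650134086609,
    "offline": 0.9284548759460449,
    "goodbye": 0.9129378199577332,
    "online": 0.35043689608573914,
}

print(consensus(scores, "offline", "online"))  # True: offline is high, online is low
```

If both the class and its anticlass score high (or both low), the check returns False, which is the "don't trust it" case.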

there are some cases I can imagine where I may say something like 'too distracted to code rn' and intend this to be a trigger for berduck.

```
0.6467213034629822 		 False 		 time to code 		 too distracted to code rn
0.15865597128868103 		 False 		 twitter 		 too distracted to code rn
0.0538114532828331 		 False 		 offline 		 too distracted to code rn
0.03206020966172218 		 False 		 goodbye 		 too distracted to code rn
0.0007501693326048553 		 False 		 online 		 too distracted to code rn
```

berduck should take an action when certain criteria are met.

for instance, the 'stay off twitter' action may have multiple triggers, including 'time to code' and 'stay off twitter'
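
A minimal sketch of that idea, assuming an action fires when any one of its trigger labels clears the threshold. The `ACTION_TRIGGERS` table and `triggered_actions` name are hypothetical, and the example scores are made up:

```python
# Hypothetical action table: action name -> labels that can trigger it.
ACTION_TRIGGERS = {
    "stay off twitter": ["time to code", "stay off twitter"],
}


def triggered_actions(scores, action_triggers=ACTION_TRIGGERS, thresh=0.8):
    """Return the actions for which at least one trigger label scores above thresh."""
    fired = []
    for action, triggers in action_triggers.items():
        # Labels missing from `scores` count as 0.0, so they never fire.
        if any(scores.get(label, 0.0) > thresh for label in triggers):
            fired.append(action)
    return fired


# Made-up scores keyed by label, for illustration only.
print(triggered_actions({"time to code": 0.99, "stay off twitter": 0.01}))
print(triggered_actions({"time to code": 0.65, "stay off twitter": 0.16}))
```

Keeping the triggers in a plain dict means a new trigger can be added to an action without touching the classification loop.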

If I tweet 'too distracted to code rn' and I want berduck to change its classification, I could '@' them and tell them to add a generic classification. However, I've noticed that by simply adding more classes to the candidate label list in multi_label mode, the scores change.
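
The pipeline docs describe multi_label scores as normalized independently per candidate (entailment vs. contradiction for each label), so it may be worth checking the shift per label name rather than per row position. A sketch of that comparison, where `score_shift` is a made-up helper and `fake_classify` is a stand-in with invented numbers so the example runs without the model:

```python
def scores_by_label(result):
    """Map each returned label to its score."""
    return dict(zip(result["labels"], result["scores"]))


def score_shift(classify, text, labels, extra_labels):
    """Per-label score change after appending `extra_labels` to `labels`."""
    before = scores_by_label(classify(text, labels))
    after = scores_by_label(classify(text, labels + extra_labels))
    return {label: after[label] - before[label] for label in labels}


def fake_classify(text, labels):
    """Stand-in for the pipeline: invented scores, returned sorted like the real one."""
    made_up = {"time to code": 0.1, "stay off twitter": 0.9, "lasagna": 0.01}
    ranked = sorted(labels, key=lambda name: made_up[name], reverse=True)
    return {"sequence": text, "labels": ranked,
            "scores": [made_up[name] for name in ranked]}


shift = score_shift(fake_classify, "stay off twitter",
                    ["time to code", "stay off twitter"], ["lasagna"])
print(shift)
```

To run it against the real model, pass something like `lambda text, labels: classifier(text, labels, multi_label=True)` in place of `fake_classify`.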

In [26]:
cases = [
"stay off twitter",
"get off twitter",
"time to get off twitter",
"too much twitter",
"time to code",
"Time for code, if you catch me on Twitter tell me to stop"
]

# Label sets tried so far; the active one matches the output below.
# candidate_labels = ['time to code', 'stay off twitter', 'get off twitter', 'twitter bad', 'stop twitter', 'lasagna']
# candidate_labels = ['time to code', 'stay off twitter', 'get off twitter', 'twitter bad', 'stop twitter']
# candidate_labels = ['time to code', 'stay off twitter', 'get off twitter', 'twitter bad']
candidate_labels = ['time to code', 'stay off twitter', 'get off twitter', 'lasagna']
# candidate_labels = ['time to code', 'stay off twitter', 'get off twitter']

thresh = 0.8

for case in cases:
    y = classifier(case, candidate_labels, multi_label=True)
    print(case)
    for l in range(len(candidate_labels)):
        print(y['scores'][l], "\t\t", y['scores'][l] > thresh, "\t\t", candidate_labels[l])
    print('---')

stay off twitter
0.9973948001861572 		 True 		 time to code
0.9973940253257751 		 True 		 stay off twitter
0.0070142061449587345 		 False 		 get off twitter
0.0037266332656145096 		 False 		 lasagna
---
get off twitter
0.9982969164848328 		 True 		 time to code
0.9982510805130005 		 True 		 stay off twitter
0.005795542150735855 		 False 		 get off twitter
0.0015922728925943375 		 False 		 lasagna
---
time to get off twitter
0.9986544847488403 		 True 		 time to code
0.9986002445220947 		 True 		 stay off twitter
0.007352679967880249 		 False 		 get off twitter
0.00040462729521095753 		 False 		 lasagna
---
too much twitter
0.8323720097541809 		 True 		 time to code
0.31241679191589355 		 False 		 stay off twitter
0.0010842622723430395 		 False 		 get off twitter
0.00021289211872499436 		 False 		 lasagna
---
time to code
0.9925968647003174 		 True 		 time to code
0.007158499211072922 		 False 		 stay off twitter
0.0005604507168754935 		 False 		 get off twitter
0.0002839835942722857 		