In [3]:
import __init__

import pandas as pd

from bin.splitLogFile import extractSummaryLine

# Experiment

Using __WordNet__ as ground truth, we train a classifier to __detect antonymy relations__ between words _(small != big / good != bad)_.

To do so, we explore the __Cartesian product__ of the following options:
* __simple / bidi:__ whether or not each adjective is considered to have only one antonym
* __strict:__ try to compose missing concepts
* __randomForest / knn:__ knn lets us check whether there is anything consistent to learn; randomForest is a basic model used as a first approach to learning the function
* __feature:__ one of the features presented in the guided tour
* __postFeature:__ any extra processing applied after feature extraction (such as normalisation)
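
As a minimal sketch, the grid of configurations can be enumerated with `itertools.product`. The option values are the ones visible in this notebook (the list above and the results table). The variable names are hypothetical, the feature list contains only the one feature named here, and whether `trainAll_antoClf.py` builds its grid this way is an assumption.

```python
from itertools import product

# Option values copied from this notebook; variable names are illustrative only.
relation_modes = ["simple", "bidi"]
strict_modes = ["", "strict"]          # empty string: strict option disabled
classifiers = ["RandomForestClassifier", "KNeighborsClassifier"]
features = ["pCosSim"]                 # other guided-tour features would be added here
post_processings = ["noPost", "postAbs", "postNormalize"]

# Every combination of one value per option is one experiment to run.
configurations = list(
    product(relation_modes, strict_modes, classifiers, features, post_processings)
)

for config in configurations[:3]:
    print(config)
print(len(configurations), "configurations")
```

Each added feature multiplies the number of runs, which is why the full experiment is launched from a single script and logged to a file.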

We use 10-fold cross-validation.
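
To make the splitting scheme concrete, here is a dependency-free sketch of 10-fold index generation. It is not the project's code: the classifier names in the results suggest scikit-learn, whose `sklearn.model_selection.KFold` provides this, and the helper `k_fold_indices` below is a hypothetical stand-in written for illustration.

```python
import random

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_indices, test_indices) once per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index starting at i,
    # so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_indices in enumerate(folds):
        train_indices = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_indices, test_indices

splits = list(k_fold_indices(n_samples=25, n_folds=10))
for train_indices, test_indices in splits[:2]:
    print(len(train_indices), "train /", len(test_indices), "test")
```

Every sample is used for testing exactly once and for training in the nine other folds; the reported scores would then be aggregated over the ten folds.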

Negative samples are generated by shuffling pairs.
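
One way to read "shuffling pairs" is sketched below: keep the left-hand words, shuffle the right-hand words, and re-pair them. The function name `shuffle_pairs` and the filtering step (dropping shuffled pairs that are known antonyms or that pair a word with itself) are assumptions made for this illustration; the actual generation of `wordnetAnto_fake.txt` may differ.

```python
import random

def shuffle_pairs(positive_pairs, seed=0):
    """Build negative pairs by re-pairing left words with shuffled right words."""
    rng = random.Random(seed)
    # Known antonym pairs in both directions, so they can be excluded below.
    known = set(positive_pairs) | {(b, a) for a, b in positive_pairs}
    lefts = [a for a, _ in positive_pairs]
    rights = [b for _, b in positive_pairs]
    rng.shuffle(rights)
    # Assumption: discard accidental true antonyms and self-pairs.
    return [(a, b) for a, b in zip(lefts, rights) if (a, b) not in known and a != b]

positives = [("small", "big"), ("good", "bad"), ("hot", "cold"), ("fast", "slow")]
negatives = shuffle_pairs(positives)
print(negatives)
```

Because the negatives reuse the same vocabulary as the positives, the classifier cannot separate the classes from word identity alone and has to rely on the relation between the two words.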

_Once you have downloaded the files, you can use this script to reproduce the experiment at home_:

```
python experiment/trainAll_antoClf.py > ../data/learnedModel/anto/log.txt
```

# Results

Here is a summary of the results we gathered.
Detailed reports can be found in the logs.

In [5]:
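# Read one summary line per experiment; the context manager closes the file afterwards.
with open('../../data/learnedModel/anto/summary.txt') as summaryFile:
    summaryDf = pd.DataFrame([extractSummaryLine(line) for line in summaryFile],
                             columns=['bidirectional', 'strict', 'clf', 'feature', 'post', 'precision', 'recall', 'f1'])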

summaryDf.sort_values('f1', ascending=False)[:10]

| | bidirectional | strict | clf | feature | post | precision | recall | f1 |
|---|---|---|---|---|---|---|---|---|
| 47 | bidi | | RandomForestClassifier | pCosSim | postNormalize | 0.921 | 0.921 | 0.921 |
| 119 | bidi | strict | RandomForestClassifier | pCosSim | postNormalize | 0.917 | 0.916 | 0.916 |
| 118 | bidi | strict | RandomForestClassifier | pCosSim | postAbs | 0.915 | 0.915 | 0.915 |
| 46 | bidi | | RandomForestClassifier | pCosSim | postAbs | 0.913 | 0.912 | 0.912 |
| 45 | bidi | | RandomForestClassifier | pCosSim | noPost | 0.912 | 0.911 | 0.911 |
| 10 | bidi | | KNeighborsClassifier | pCosSim | postAbs | 0.91 | 0.91 | 0.91 |
| 117 | bidi | strict | RandomForestClassifier | pCosSim | noPost | 0.911 | 0.91 | 0.91 |
| 9 | bidi | | KNeighborsClassifier | pCosSim | noPost | 0.909 | 0.909 | 0.909 |
| 82 | bidi | strict | KNeighborsClassifier | pCosSim | postAbs | 0.91 | 0.909 | 0.909 |
| 189 | simple | | RandomForestClassifier | pCosSim | noPost | 0.906 | 0.906 | 0.906 |


We observe a good F1-score (0.921) with __RandomForest__ on the __normalised projected cosine similarity__ feature.

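Results are even better when relations are not restricted to a single antonym per word (bidi): the best bidi configuration reaches an F1 of 0.921, against 0.906 for the best simple one. This makes sense, since one word can have several antonyms: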
* small != big
* small != tall

Allowing concepts to be composed also seems to have a positive impact.

# Error analysis

Here are the details of:
* False positives, i.e. pairs considered antonyms by the classifier but not included in WordNet
* False negatives, i.e. antonyms that were not detected

The __false positives are especially interesting here...__
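
The two error types can be separated with a few lines of Python. This is only a sketch with a hypothetical helper, `split_errors`, and is not the content of `detailConceptPairClfError.py`. The labels `anto` / `notAnto` are the ones passed on the command line below; the pair `("small", "cold")` is an invented negative example.

```python
def split_errors(pairs, true_labels, predicted_labels, positive="anto"):
    """Return (false_positives, false_negatives) for the given positive label."""
    false_positives, false_negatives = [], []
    for pair, truth, predicted in zip(pairs, true_labels, predicted_labels):
        if predicted == positive and truth != positive:
            # Predicted antonyms, but not annotated as such in the ground truth.
            false_positives.append(pair)
        elif predicted != positive and truth == positive:
            # Annotated antonyms that the classifier missed.
            false_negatives.append(pair)
    return false_positives, false_negatives

pairs = [("small", "big"), ("concise", "prolix"), ("unhelpful", "useful"), ("small", "cold")]
true_labels = ["anto", "anto", "notAnto", "notAnto"]
predicted_labels = ["anto", "notAnto", "anto", "notAnto"]

false_positives, false_negatives = split_errors(pairs, true_labels, predicted_labels)
print("false positives:", false_positives)
print("false negatives:", false_negatives)
```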

In [12]:
!python ../../toolbox/script/detailConceptPairClfError.py ../../data/voc/npy/wikiEn-skipgram.npy ../../data/learnedModel/anto/bidi__RandomForestClassifier_pCosSim_postNormalize.dill ../../data/wordPair/wordnetAnto.txt anto ../../data/wordPair/wordnetAnto_fake.txt notAnto

1388424 loaded from wikiEn-skipgram
mem usage 1.6GiB
loaded time 6.17159080505 s
input: (antecedent, '!=', subsequent)  /  predicted: notAnto  /  true: anto  /  proba:[ 0.48819086  0.51180914]
input: (autogenous, '!=', heterogenous)  /  predicted: notAnto  /  true: anto  /  proba:[ 0.47802944  0.52197056]
input: (bettering, '!=', worsening)  /  predicted: notAnto  /  true: anto  /  proba:[ 0.44246362  0.55753638]
input: (faced, '!=', faceless)  /  predicted: notAnto  /  true: anto  /  proba:[ 0.4090522  0.5909478]
input: (breathing, '!=', breathless)  /  predicted: notAnto  /  true: anto  /  proba:[ 0.48782499  0.51217501]
input: (fraternal, '!=', identical)  /  predicted: notAnto  /  true: anto  /  proba:[ 0.46262573  0.53737427]
input: (comparable, '!=', incomparable)  /  predicted: notAnto  /  true: anto  /  proba:[ 0.49709918  0.50290082]
input: (concise, '!=', prolix)  /  predicted: notAnto  /  true: anto  /  proba:[ 0.46595617  0.53404383]
input: (branchy, '!=', branchless)  /  p

__Some__ false positives here __raise questions__, challenging the WordNet ground truth:
* unhelpful, '!=', useful
* unambitious, '!=', intelligent
* discouraging, '!=', ostentatious
* ...

__Considered antonyms by the classifier__, these pairs are __not antonyms__ according to the human expert annotations, __yet they could also be read as opposites__ from __a semantic point of view__.

Moreover, __different human experts would probably understand these cases differently__ and disagree on whether such examples are side effects.

# Conclusion

The __recognition rate is quite satisfactory__ here, considering the basic model we use. More __advanced techniques__ could __improve the results.__

By taking a different approach to feature extraction, we also potentially __highlight a fitted function that is able to oppose words__ from a semantic point of view.

This __learned point of view__ on how words oppose each other __depends on the corpus__ and may be controversial or __raise ethical / philosophical questions that a single human expert cannot answer__. A model with lower average performance produced results like:
* honest, '!=', social
* inorganic, '!=', inefficient

_____________________

__The question now is:__
* Did we __reach a limit of supervised classification__ (the human expert cannot decide with a yes/no answer)?

or

* Are __these results a bias__ introduced by my understanding of what an AI can do?