# spaCy WordNet

* [How to get domain of words using WordNet in Python?](https://stackoverflow.com/questions/21902411/how-to-get-domain-of-words-using-wordnet-in-python)

* [spacy-wordnet](https://spacy.io/universe/project/spacy-wordnet)

> spacy-wordnet creates annotations that easily allow the use of WordNet and [WordNet Domains](http://wndomains.fbk.eu/) by using the [NLTK WordNet interface](http://www.nltk.org/howto/wordnet.html).

* [PyPI spaCy WordNet](https://pypi.org/project/spacy-wordnet/)

> You also need to install the following NLTK wordnet data:
> ```
> python -m nltk.downloader wordnet
> python -m nltk.downloader omw
> ```

!python -m nltk.downloader wordnet
!python -m nltk.downloader omw
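The same downloads can be triggered from Python, which is convenient outside a notebook. This is a minimal sketch that assumes network access; `download_wordnet_data` is a hypothetical helper name, and newer NLTK releases may additionally ask for the `omw-1.4` package.

```python
def download_wordnet_data(resources=("wordnet", "omw")):
    """Download the NLTK packages named in `resources`; return {name: success flag}."""
    import nltk  # imported lazily so the helper can be defined without NLTK installed

    # nltk.download returns a boolean indicating whether the package is available
    return {name: nltk.download(name, quiet=True) for name in resources}


# Call once before loading the pipeline:
#     download_wordnet_data()
```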

In [1]:
import pandas as pd
import spacy
from spacy.symbols import nsubj, dobj, iobj, VERB
import spacy_wordnet
# Importing WordnetAnnotator makes the "spacy_wordnet" component available to nlp.add_pipe
from spacy_wordnet.wordnet_annotator import WordnetAnnotator

In [2]:
def _to_text(tokens, sep: str = ',') -> str:
    """Join the string form of each item in `tokens` with `sep`."""
    return sep.join(map(str, tokens))

# Model

In [3]:
nlp = spacy.load("en_core_web_sm")
nlp.add_pipe("spacy_wordnet", after='tagger')

<spacy_wordnet.wordnet_annotator.WordnetAnnotator at 0x103684280>
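To confirm where the component landed, the pipeline order can be read from `nlp.pipe_names`. The check below is a small sketch; `runs_after` is a hypothetical helper, and the example list is illustrative rather than real output from this notebook.

```python
def runs_after(pipe_names, component, dependency):
    """Return True if `component` appears later in the pipeline than `dependency`."""
    names = list(pipe_names)
    return (
        component in names
        and dependency in names
        and names.index(component) > names.index(dependency)
    )


# With the pipeline from this notebook one would call:
#     runs_after(nlp.pipe_names, "spacy_wordnet", "tagger")
example_order = ["tok2vec", "tagger", "spacy_wordnet", "parser"]  # illustrative only
print(runs_after(example_order, "spacy_wordnet", "tagger"))  # True
```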

# Document

In [41]:
text = """
Post Traumatic Stress Disorder symptoms (PTSD) often co-exist with 
other conditions such as substance use disorders, depression and anxiety. 
A comprehensive medical evaluation resulting in an individualized treatment plan is optimal.
"""
# Collapse line breaks and repeated whitespace into single spaces before parsing
doc = nlp(' '.join(text.split()))

In [53]:
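from nltk.corpus import wordnet

# Look up each noun chunk in WordNet and print its first noun synset, if any
for noun in doc.noun_chunks:
    synsets = wordnet.synsets(noun.text, pos=wordnet.NOUN)
    if synsets:
        print(f"{noun.text:50} {synsets[0]}")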

PTSD                                               Synset('posttraumatic_stress_disorder.n.01')
depression                                         Synset('depression.n.01')
anxiety                                            Synset('anxiety.n.01')
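Only three chunks matched above, presumably because longer chunks rarely appear in WordNet verbatim. WordNet writes multiword lemmas with underscores, as the `posttraumatic_stress_disorder` synset shows. One possible workaround, sketched below, is to try progressively shorter suffixes of a chunk, on the assumption that the head noun comes last. `candidate_keys` and `first_match` are hypothetical helpers, and `lookup` stands in for something like `lambda key: wordnet.synsets(key, pos=wordnet.NOUN)`.

```python
def candidate_keys(chunk_text):
    """Yield WordNet-style keys for a chunk, longest suffix first."""
    words = chunk_text.lower().split()
    for start in range(len(words)):
        yield "_".join(words[start:])


def first_match(chunk_text, lookup):
    """Return (key, results) for the first key with a non-empty lookup, else None."""
    for key in candidate_keys(chunk_text):
        results = lookup(key)
        if results:
            return key, results
    return None


# Stand-in lookup table so the sketch runs without WordNet data (placeholder contents)
fake_index = {"plan": ["plan.n.01"]}
print(list(candidate_keys("an individualized treatment plan")))
print(first_match("an individualized treatment plan", fake_index.get))
```

The backoff trades precision for coverage: a shorter suffix can match a more general sense than the full phrase intended, so the matched key is returned alongside the results for inspection.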


# Finding Subject Noun Phrase

In [31]:
from nltk.corpus import (
    wordnet,
    words
)


In [7]:
# The `_.wordnet` extension links a spaCy token to the NLTK WordNet interface,
# giving access to the token's synsets and lemmas
token = doc[0]
print(token._.wordnet.synsets())
print(token._.wordnet.lemmas())

# It also exposes the WordNet domains associated with the token
token._.wordnet.wordnet_domains()

[]
[Lemma('autonomous.s.01.autonomous'), Lemma('autonomous.s.01.independent'), Lemma('autonomous.s.01.self-governing'), Lemma('autonomous.s.01.sovereign'), Lemma('autonomous.s.02.autonomous'), Lemma('autonomous.s.03.autonomous'), Lemma('autonomous.s.03.self-directed'), Lemma('autonomous.s.03.self-reliant')]


['politics']
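The domain labels can be used to pick out the tokens that touch a given topic. The sketch below relies only on the `token._.wordnet.wordnet_domains()` call shown above; `tokens_in_domains` is a hypothetical helper, and the fake tokens and their domain labels are made up so the example runs without spaCy.

```python
from types import SimpleNamespace


def tokens_in_domains(doc, wanted):
    """Return (token text, matched domains) for tokens tagged with any wanted domain."""
    wanted = set(wanted)
    hits = []
    for token in doc:
        matched = wanted & set(token._.wordnet.wordnet_domains())
        if matched:
            hits.append((token.text, sorted(matched)))
    return hits


def _fake_token(text, domains):
    """Stand-in that mimics token.text and token._.wordnet.wordnet_domains()."""
    wordnet = SimpleNamespace(wordnet_domains=lambda: list(domains))
    return SimpleNamespace(text=text, _=SimpleNamespace(wordnet=wordnet))


# Illustrative data only; real labels come from the annotated spaCy doc
fake_doc = [_fake_token("depression", ["psychology", "medicine"]), _fake_token("often", [])]
print(tokens_in_domains(fake_doc, ["medicine"]))  # [('depression', ['medicine'])]
```

With the real pipeline, the call would be `tokens_in_domains(doc, [...])` on the document parsed earlier.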