# Types of Knowledge (Knowledge Acquisition 1) - Solutions

In this exercise you familiarise yourself with different knowledge sources that can help you gather the knowledge needed for your group project. We start by taking a closer, more practical look at the sources WordNet, DBpedia and ConceptNet that you heard about in the lecture. Afterwards you will come together in your group to think about the knowledge types and resources that suit your domain.

## 01 - Using WordNet

One powerful resource for collecting lexical knowledge and connecting words with their meanings is WordNet. It can be accessed either through a [web interface](http://wordnetweb.princeton.edu/perl/webwn) or through a Python library. A brief overview of the library and its most important functions is given below:

```Python
import nltk # import the natural language toolkit
nltk.download('wordnet') # download the wordnet corpus used for accessing wordnet
nltk.download('punkt') # download the Punkt tokenizer models needed when running the Lesk algorithm
nltk.download('averaged_perceptron_tagger') # download the part-of-speech tagger used by the Lesk algorithm

from nltk.corpus import wordnet as wn # import the wordnet library 
synset = wn.synset('movie.n.01') # get a specific WordNet synset
synset.lemma_names() # get all synonyms that are part of this synset
synset.hyponyms() # get the hyponyms for the synset
synset.hypernyms() # get the hypernyms for the synset

import pywsd.lesk as lesk # import the Lesk algorithm (from the separate pywsd package) used for automatically getting the synset
example_sentence = 'She cut the bread with a knife.' # placeholder sentence, for illustration only
word = 'cut' # placeholder target word, for illustration only
synset = lesk.simple_lesk(example_sentence, word, pos='v') # returns the most probable synset based on the usage of 'word' in 'example_sentence'; pos can be provided to specify the part of speech of the word (e.g. 'v' for verb or 'n' for noun)
```

**Tasks:**

In [None]:
### a) What are the synonyms for 'Movie'?
from nltk.corpus import wordnet as wn
synset = wn.synset('movie.n.01')
print(synset.lemma_names())

In [None]:
### b) Write a function that prints all hypernyms for a given synset (direct as well as indirect hypernyms).
### You can verify your code using the 'fruit.n.01' synset. 
### The hypernyms are: 'reproductive_structure.n.01', 'plant_organ.n.01', 'plant_part.n.01', 'natural_object.n.01', 'whole.n.02', 'object.n.01', 'physical_entity.n.01', 'entity.n.01'

def get_hypernyms(synset):
    curr_hyps = synset.hypernyms()
    for hyp in curr_hyps:
        print(hyp)
        if hyp.hypernyms():
            get_hypernyms(hyp)

from nltk.corpus import wordnet as wn
fruits = wn.synset('fruit.n.01')
get_hypernyms(fruits)
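
The recursive function above prints a hypernym once for every path that leads to it, so synsets with several direct hypernyms can produce duplicates. The following is a minimal sketch of a variant that returns each hypernym only once. It relies only on the `hypernyms()` and `name()` methods; `ToySynset` is a hypothetical stand-in class (not part of NLTK) so that the sketch runs without the WordNet corpus.

```python
from collections import deque


def collect_hypernyms(synset):
    """Return all direct and indirect hypernyms of `synset`, each listed once."""
    seen = []
    queue = deque(synset.hypernyms())
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already reached via another path
        seen.append(current)
        queue.extend(current.hypernyms())
    return seen


class ToySynset:
    """Hypothetical stand-in that mimics the two Synset methods used above."""

    def __init__(self, name, parents=()):
        self._name = name
        self._parents = list(parents)

    def name(self):
        return self._name

    def hypernyms(self):
        return self._parents


# Small diamond-shaped hierarchy: 'c' reaches 'entity' via both 'a' and 'b'.
entity = ToySynset("entity")
a = ToySynset("a", [entity])
b = ToySynset("b", [entity])
c = ToySynset("c", [a, b])

print([s.name() for s in collect_hypernyms(c)])  # → ['a', 'b', 'entity']
```

With NLTK installed, the same function should also accept a real synset, for example `collect_hypernyms(wn.synset('fruit.n.01'))`.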

In [None]:
### c) Take the following text and count how many different meanings of the word 'cut' occur:
txt = 'Sarah attempted to cut through the dense forest with her trusty machete. She couldn\'t help but notice a cut on her finger from the sharp blade. The thick foliage seemed to resist her every effort, making each cut through the branches a struggle. Suddenly, she stumbled upon a clearing where sunlight cut through the trees, illuminating the vibrant colors of the wildflowers scattered around. Taking a moment to catch her breath, Sarah realized she needed to cut her losses and find another path. Her initial plan had been cut short by the unforgiving terrain. With determination, she adjusted her course, ready to cut across the forest in a new direction, hoping to reach her destination before nightfall.'

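import pywsd.lesk as lesk

# Split the text into sentences and keep only those that mention 'cut';
# this also drops the empty string left over after the final full stop.
sentences = [s.strip() for s in txt.split(".") if "cut" in s]
mean = {}
for s in sentences:
    syn = lesk.simple_lesk(s, 'cut')  # most probable synset of 'cut' in this sentence
    mean[syn] = mean.get(syn, 0) + 1  # count how often each synset is chosen

print(f"{len(mean)} different meanings occur:")
for m in mean:
    print(f"{m} - {mean[m]}")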
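
The tally above can also be written with `collections.Counter`. The sketch below is one possible formulation, not the reference solution: the disambiguation function is passed in as a parameter, and `toy_disambiguate` is a hypothetical stand-in for `lesk.simple_lesk` so that the example runs without pywsd.

```python
from collections import Counter


def count_senses(sentences, target, disambiguate):
    """Count how often each sense of `target` is chosen across `sentences`."""
    senses = Counter()
    for sentence in sentences:
        if target not in sentence:
            continue  # skip empty fragments and sentences without the word
        senses[disambiguate(sentence, target)] += 1
    return senses


def toy_disambiguate(sentence, word):
    # Hypothetical stand-in for lesk.simple_lesk: picks a sense by keyword only.
    return "injury" if "finger" in sentence else "slice"


demo = ["She cut the bread", "A cut on her finger", "He cut the rope", ""]
counts = count_senses(demo, "cut", toy_disambiguate)
print(len(counts), dict(counts))  # → 2 {'slice': 2, 'injury': 1}
```

With pywsd installed, `count_senses(sentences, 'cut', lesk.simple_lesk)` should give the same kind of tally for the exercise text.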

## 02 - Navigating DBpedia

Because DBpedia is extracted from Wikipedia, its resources are linked in much the same way as Wikipedia articles, so the [Wikipedia Race](https://en.wikipedia.org/wiki/Wikipedia:Wiki_Game) can be played on it as well. For this, you start on a specific page and try to navigate to another, pre-defined page.

Starting Page: [Africa](https://dbpedia.org/page/Africa)

Destination: [Freie Hansestadt Bremen](https://dbpedia.org/page/Bremen)

Please note down the relations you chose so that you can reconstruct each step of your navigation path. Is it possible to use the same relations to navigate back?

**Example:**

_(dbr:Africa dbo:wikiPageWikiLink dbr:Togo)_

_(dbr:Togo dbo:language dbr:German_language)_

_(dbr:German_language dbp:commonLanguages dbr:Bremen-Verden)_

_(dbr:Bremen-Verden dbo:wikiPageWikiLink dbr:Bremen)_
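
Regarding the way back: each step is a directed triple, so the same relation can only be used in the opposite direction if DBpedia also stores the inverse triple. A minimal sketch of how this could be checked is shown below. It only builds SPARQL `ASK` queries for the inverse of each step in the example path; the function name `inverse_ask_query` is made up for this illustration, and the queries are not sent anywhere by the code.

```python
# Standard DBpedia namespace prefixes used in the example path.
PREFIXES = (
    "PREFIX dbr: <http://dbpedia.org/resource/>\n"
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX dbp: <http://dbpedia.org/property/>\n"
)


def inverse_ask_query(subject, predicate, obj):
    """Build an ASK query testing whether the triple also holds in reverse."""
    return f"{PREFIXES}ASK {{ {obj} {predicate} {subject} }}"


# The steps of the example path, written as (subject, predicate, object).
path = [
    ("dbr:Africa", "dbo:wikiPageWikiLink", "dbr:Togo"),
    ("dbr:Togo", "dbo:language", "dbr:German_language"),
    ("dbr:German_language", "dbp:commonLanguages", "dbr:Bremen-Verden"),
    ("dbr:Bremen-Verden", "dbo:wikiPageWikiLink", "dbr:Bremen"),
]

for s, p, o in path:
    print(inverse_ask_query(s, p, o))
    print()
```

Each printed query can be pasted into the public DBpedia SPARQL endpoint (https://dbpedia.org/sparql). A result of `true` means that the step can be taken backwards with the same relation; `false` means that a different relation is needed for the way back.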

## 03 - Extracting Related Concepts from ConceptNet

As explained in the lecture, [ConceptNet](https://conceptnet.io/) is a semantic network focused on modelling commonsense knowledge. In this task you will query ConceptNet to retrieve a list of concepts related to a given word. To access ConceptNet, we will use the `requests` package as shown below. In addition to querying ConceptNet, you need to decide on a suitable relation and consider whether it is bidirectional.

```Python
import requests # import the requests library
url = f'http://api.conceptnet.io/c/en/{concept}' # base url appended with the concept to query for
response = requests.get(url) # requesting the ConceptNet API
data = response.json() # accessing the data in the JSON format
```

To check your solution, you can use the concept *fruit*, for which you should find the following related concepts: *['apple', 'grape', 'produce', 'orange', 'lemon', 'lime', 'raspberry', 'peach']*.

In [None]:
import requests

concept = "fruit"
url = f'http://api.conceptnet.io/c/en/{concept}' 
response = requests.get(url)
data = response.json()
related_concepts = []

for edge in data['edges']:
    if edge['rel']['label'] == 'RelatedTo':
        if edge['start']['label'] == concept:
            related_concepts.append(edge['end']['label'])
        else:
            related_concepts.append(edge['start']['label'])

print(f"Related concepts for '{concept}': {related_concepts}")
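
The filtering step can also be separated from the network request, which makes it easier to test. The sketch below is one way to do this, assuming the edge structure used above (`rel`, `start` and `end`, each with a `label`). The function name `related_concepts` and the `sample_edges` list are made up for illustration and are not real ConceptNet output. Unlike the solution above, the sketch ignores edges that do not touch the concept at all and drops duplicates.

```python
def related_concepts(edges, concept, relation="RelatedTo"):
    """Collect the labels on the other side of every `relation` edge touching `concept`."""
    found = []
    for edge in edges:
        if edge["rel"]["label"] != relation:
            continue
        start, end = edge["start"]["label"], edge["end"]["label"]
        # The relation is treated as bidirectional: the concept may be on either side.
        if start.lower() == concept.lower():
            other = end
        elif end.lower() == concept.lower():
            other = start
        else:
            continue
        if other not in found:
            found.append(other)
    return found


# Hand-made edges that only mimic the structure of the API response.
sample_edges = [
    {"rel": {"label": "RelatedTo"}, "start": {"label": "fruit"}, "end": {"label": "apple"}},
    {"rel": {"label": "RelatedTo"}, "start": {"label": "grape"}, "end": {"label": "fruit"}},
    {"rel": {"label": "IsA"}, "start": {"label": "banana"}, "end": {"label": "fruit"}},
    {"rel": {"label": "RelatedTo"}, "start": {"label": "fruit"}, "end": {"label": "apple"}},
]

print(related_concepts(sample_edges, "fruit"))  # → ['apple', 'grape']
```

With a real response, the call would be `related_concepts(data['edges'], concept)`.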