# Part B - Syntactic and Semantic Feature Extraction

## Imports

In [12]:
from nltk import sent_tokenize, word_tokenize, pos_tag, ne_chunk

## Text of the story

In [13]:
story = '''Chandaraka was a jackal living in a forest. One day, driven by hunger, he came to a nearby town in search of
food. Seeing him, a group of mongrels began chasing and attacking him whenever possible. The jackal fled
in panic and entering the house of a washer man hid in a vat full of blue used for bleaching clothes. When he
came out, he became a blue animal. Thinking that he was not the jackal they chased, the mongrels dispersed.

The jackal came back to the forest with his body dyed in blue. When the lions, tigers, panthers, wolves and
other animals in the forest saw him, they took fright and ran in all directions. They thought to themselves,
“We do not know his power and strength. It is better we keep a distance from him. Haven't the elders warned
not to trust him whose conduct, caste and courage are not known.”

Seeing them scared, the dyed jackal said, “Why do you run away like that. There is no need to fear. I am a
special creation of God. He told me that the animals in the jungle here had no ruler and that he was
nominating me as your king. He named me as Kakudruma and told me to rule all of you. Therefore all of you
can live safely under the umbrella of my protection.”

All the animals in the jungle accepted him as the king. He in turn appointed the lion as his minister, the tiger
as his chamberlain and the wolf as the gatekeeper. After distributing office to the animals, the new king
Kakudruma banished all the jackals in the forest. The lions, tigers and the wolves killed other animals and
brought them as food for the king. Taking his share, Kakudruma would distribute the rest of the kill among
his subjects.

One day when the blue jackal was holding court, he heard a gang of jackals howling. Thrilled by the sound of
his own ilk,Kakudruma began loudly responding in his natural voice. The lions and other animals
immediately recognized that their king was after all a jackal and not a Godsend. They at once pounced on the
blue jackal and killed him.

“The moral is,” Damanaka said, “he who abandons his own folk will perish.”

“But how do I believe that Sanjeevaka has evil intentions,” asked Pingalaka.

“He told me today that he would kill you tomorrow. If you notice him carefully tomorrow, you will find him
red-eyed and occupying a seat he does not deserve. He would stare at you angrily. If what I say comes true, it
is for you what to do with Pingalaka,” said Damanaka.

After this meeting with the lion king, Damanaka went to meet Pingalaka. The bullock received him with
courtesy and said, “We are meeting after a long time. What can I do for you? They are the blessed who are
visited by friends.”

“Your are right, sir. But where is rest for servants. They have lost their freedom for the sake of money. They
know no sleep, no interest in food nor can they speak without fear. Yet they live. Somebody has rightly
compared service to a dog's life,” said Damanaka.

“Come to the point, my friend” The bullock was now impatient.

Damanaka said, “Sir, a minister is not supposed to give bad advice. He cannot also disclose state secrets. If
he does, he will go to hell after his death. But in the cause of your friendship, I have revealed a secret. It is on
my suggestion that you have taken up service in the royal household. Pingalaka has evil designs against you.
When we were alone, he told me he would kill you and bring happiness to everyone in the palace.

“I told the king that this was stabbing a friend in the back,” Damanaka continued. “The king was angry and
said that you were a vegetarian and he lived on a diet of meat and so there was natural discord between you
and him. He said that this was enough reason for him to kill you. This is a secret I have kept to myself for a
long time. It is now for you to do what is necessary.”

Sanjeevaka fainted on hearing these words. Recovering after some time, he said, “It is truly said that a person
who serves the king is like a bullock without horns. It is difficult to know the mind of a king who has
different ideas. It is not easy to serve a king. Even sages could not read the minds of kings. I think some
servants who were jealous of my friendship with the king must have poisoned his mind.”

“Don't worry,” Damanaka said. “Forget what tales the servants carried to the king. You can still win his
favour by your sweet words.”

“That is not true. It is impossible to live with wicked people, however small they are. They can always think
of a hundred ways to get you in the same manner the jackal and crow trapped the camel.”

“Sounds interesting. Let me know what happened to the camel,” asked Damanaka.

Sanjeevaka began to tell him the story.'''



## Tokenization
Perform sentence tokenization and word tokenization on the story, then print the number of sentences and the number of word tokens.

In [14]:
sentences = sent_tokenize(story)
words = word_tokenize(story)

print("number of sentences:%d" %len(sentences))
print("number of words:%d" %len(words))

number of sentences:62
number of words:1006
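
The word count above includes punctuation, because `word_tokenize` returns marks such as `.` and `,` as separate tokens. A minimal sketch of counting only the tokens that contain at least one letter or digit is given here; the helper name `count_word_tokens` is hypothetical and not part of NLTK, and the sample list simply mirrors the shape of tokenizer output for the first sentence.

```python
def count_word_tokens(tokens):
    """Count tokens that contain at least one alphanumeric character."""
    return sum(1 for tok in tokens if any(ch.isalnum() for ch in tok))


# Sample tokens in the shape word_tokenize returns for the first sentence
sample = ['Chandaraka', 'was', 'a', 'jackal', 'living', 'in', 'a', 'forest', '.']

# The final '.' is skipped, so 8 of the 9 tokens are counted
print(count_word_tokens(sample))
```

In the notebook, `count_word_tokens(words)` would give the word count with pure punctuation tokens excluded.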


In [15]:
sentences[0]

'Chandaraka was a jackal living in a forest.'

## POS tagging of words

In [16]:
pos = pos_tag(words)
# Penn Treebank tag sets for nouns and verbs
noun_tags = ['NN', 'NNS', 'NNP', 'NNPS']
verb_tags = ['VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ']
nounAndVerb = [i for i in pos if i[1] in noun_tags + verb_tags]
noun = [i for i in pos if i[1] in noun_tags]
verb = [i for i in pos if i[1] in verb_tags]

print("number of POS tagged unique nouns and verbs in the story: ", len(set(nounAndVerb)))
print("number of POS tagged unique nouns in the story: ", len(set(noun)))
print("number of POS tagged unique verbs in the story: ", len(set(verb)))

number of POS tagged unique nouns and verbs in the story:  253
number of POS tagged unique nouns in the story:  136
number of POS tagged unique verbs in the story:  117
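
These counts are taken over unique `(word, tag)` pairs, so a word that appears with two different tags, or in two capitalizations, is counted more than once. If unique words are wanted instead, one option is to lower-case the word and match on the tag prefix, since the Penn Treebank noun tags all start with `NN` and the verb tags with `VB`. The sketch below assumes that reading of "unique"; `unique_words_by_prefix` is a hypothetical helper and the sample pairs are illustrative, not taken from the tagger output.

```python
def unique_words_by_prefix(tagged, prefix):
    """Return the set of lower-cased words whose tag starts with `prefix`."""
    return {word.lower() for word, tag in tagged if tag.startswith(prefix)}


# Illustrative (word, tag) pairs in the shape pos_tag returns
sample = [('Chandaraka', 'NNP'), ('was', 'VBD'), ('a', 'DT'),
          ('jackal', 'NN'), ('jackals', 'NNS'), ('Jackal', 'NN'),
          ('came', 'VBD'), ('was', 'VBD')]

print(len(unique_words_by_prefix(sample, 'NN')))  # unique nouns
print(len(unique_words_by_prefix(sample, 'VB')))  # unique verbs
```

Applied to the notebook's `pos` list, this would likely give smaller numbers than the pair-based counts above.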


In [31]:
# Show the first 20 (word, tag) pairs
pos[:20]

[('Chandaraka', 'NNP'),
 ('was', 'VBD'),
 ('a', 'DT'),
 ('jackal', 'NN'),
 ('living', 'NN'),
 ('in', 'IN'),
 ('a', 'DT'),
 ('forest', 'NN'),
 ('.', '.'),
 ('One', 'CD'),
 ('day', 'NN'),
 (',', ','),
 ('driven', 'VBN'),
 ('by', 'IN'),
 ('hunger', 'NN'),
 (',', ','),
 ('he', 'PRP'),
 ('came', 'VBD'),
 ('to', 'TO'),
 ('a', 'DT')]

## NER tagging
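Running NER chunking on the POS-tagged tokens and extracting the labelled chunks gives six unique entities: five labelled PERSON and one labelled ORGANIZATION. The names 'Pingalaka', 'Sanjeevaka', 'Damanaka', 'Chandaraka' and 'Kakudruma' are correctly classified as PERSON. The word 'Come' is wrongly classified as ORGANIZATION; it is only the capitalized first word of the sentence “Come to the point, my friend”.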

In [7]:
NER_pos = ne_chunk(pos)
NER_chunk = []
for chunk in NER_pos:
    # Named-entity chunks are subtrees with a label; plain tokens are tuples
    if hasattr(chunk, 'label'):
        NER_chunk.append((chunk.label(), ' '.join(c[0] for c in chunk)))
print('unique NER extracted: ')
list(set(NER_chunk))

unique NER extracted: 


[('PERSON', 'Damanaka'),
 ('PERSON', 'Chandaraka'),
 ('PERSON', 'Kakudruma'),
 ('PERSON', 'Sanjeevaka'),
 ('PERSON', 'Pingalaka'),
 ('ORGANIZATION', 'Come')]
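
To check the per-label counts, the extracted pairs can be grouped by label. The following is a minimal sketch using only the standard library; `group_by_label` is a hypothetical helper, and the input is copied from the output above.

```python
from collections import defaultdict


def group_by_label(entities):
    """Map each NER label to a sorted list of its unique entity strings."""
    grouped = defaultdict(set)
    for label, text in entities:
        grouped[label].add(text)
    return {label: sorted(texts) for label, texts in grouped.items()}


# The (label, text) pairs from the output above
entities = [('PERSON', 'Damanaka'), ('PERSON', 'Chandaraka'),
            ('PERSON', 'Kakudruma'), ('PERSON', 'Sanjeevaka'),
            ('PERSON', 'Pingalaka'), ('ORGANIZATION', 'Come')]

for label, names in group_by_label(entities).items():
    print(label, len(names), names)
```

In the notebook, `group_by_label(NER_chunk)` could be called directly, since `NER_chunk` already holds `(label, text)` tuples.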

In [10]:
noun1 = [i for i in pos if i[1] in ['NNS']]
noun1

[('mongrels', 'NNS'),
 ('clothes', 'NNS'),
 ('mongrels', 'NNS'),
 ('lions', 'NNS'),
 ('tigers', 'NNS'),
 ('panthers', 'NNS'),
 ('wolves', 'NNS'),
 ('animals', 'NNS'),
 ('directions', 'NNS'),
 ('elders', 'NNS'),
 ('”', 'NNS'),
 ('animals', 'NNS'),
 ('animals', 'NNS'),
 ('animals', 'NNS'),
 ('jackals', 'NNS'),
 ('lions', 'NNS'),
 ('tigers', 'NNS'),
 ('wolves', 'NNS'),
 ('animals', 'NNS'),
 ('subjects', 'NNS'),
 ('jackals', 'NNS'),
 ('lions', 'NNS'),
 ('animals', 'NNS'),
 ('“', 'NNS'),
 ('intentions', 'NNS'),
 ('servants', 'NNS'),
 ('secrets', 'NNS'),
 ('designs', 'NNS'),
 ('words', 'NNS'),
 ('horns', 'NNS'),
 ('ideas', 'NNS'),
 ('sages', 'NNS'),
 ('minds', 'NNS'),
 ('kings', 'NNS'),
 ('servants', 'NNS'),
 ('servants', 'NNS'),
 ('people', 'NNS'),
 ('ways', 'NNS')]
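
The list above includes the curly quotation marks `“` and `”` tagged as `NNS`; these are punctuation, not plural nouns. One possible clean-up, sketched below, keeps only tokens made entirely of letters and counts how often each plural noun occurs. `plural_noun_counts` is a hypothetical helper, and `isalpha()` would also drop hyphenated nouns, which may or may not be wanted.

```python
from collections import Counter


def plural_noun_counts(tagged):
    """Count NNS-tagged tokens, skipping tokens that are not purely alphabetic."""
    return Counter(word.lower() for word, tag in tagged
                   if tag == 'NNS' and word.isalpha())


# A few (word, tag) pairs taken from the output above
sample = [('mongrels', 'NNS'), ('clothes', 'NNS'), ('mongrels', 'NNS'),
          ('”', 'NNS'), ('animals', 'NNS'), ('“', 'NNS')]

print(plural_noun_counts(sample).most_common())
```

In the notebook, `plural_noun_counts(pos)` would apply the same filter to the full tagged story.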