## Chapter 5

### 1. What are the most common adverbs in the Brown corpus (categories="news")? Please sort all the adverbs by frequency, with the most frequent ones first. (Please use the universal tagset)

In [1]:
import nltk
from nltk.corpus import brown
from nltk import FreqDist

# get the tagged words for news category
brown_news_tagged = brown.tagged_words(categories='news', tagset='universal')
adverbs = [w[0] for w in brown_news_tagged if w[1] == 'ADV']

# printing only the top 25
print(FreqDist(adverbs).most_common(25))

[('not', 254), ('when', 128), ('also', 120), ('now', 76), ('as', 75), ('here', 67), ('where', 58), ('then', 56), ('back', 55), ('about', 49), ('more', 49), ('only', 48), ('even', 48), ('so', 47), ('most', 45), ('well', 41), ('When', 41), ('just', 41), ('never', 38), ('p.m.', 38), ('however', 37), ('too', 37), ('how', 37), ('still', 36), ('ago', 36)]
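The list above counts `when` and `When` as separate entries because words are compared case-sensitively. A minimal sketch of merging them, assuming case should be ignored: the `sample` list is invented for illustration (it is not Brown data), and `collections.Counter` stands in for `FreqDist`.

```python
from collections import Counter

# Hypothetical (word, tag) pairs standing in for brown.tagged_words(...)
sample = [("When", "ADV"), ("he", "PRON"), ("left", "VERB"), ("when", "ADV"),
          ("not", "ADV"), ("Not", "ADV"), ("now", "ADV"), ("when", "ADV")]

def adverb_counts(tagged_words):
    """Count adverbs after lowercasing, so 'When' and 'when' merge."""
    return Counter(w.lower() for (w, t) in tagged_words if t == "ADV")

counts = adverb_counts(sample)
print(counts.most_common())
```

The same generator expression can be passed to `FreqDist` with the real tagged words if case-insensitive counts are wanted.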


### 2. What are the part-of-speech tags before the word “news” in the Brown corpus (categories="news")? (Please use the universal tagset)

In [2]:
from nltk import bigrams

# get the tagged words for news category
brown_news_tagged = brown.tagged_words(categories='news', tagset='universal')

# pair of (word, tag) that comes before the word news
print(sorted(set(a for (a, b) in bigrams(brown_news_tagged) if b[0] == 'news')))

[('League', 'NOUN'), ('Romantic', 'ADJ'), ('The', 'DET'), ('a', 'DET'), ('first', 'ADJ'), ('foreign', 'ADJ'), ('good', 'ADJ'), ('made', 'VERB'), ('startling', 'ADJ'), ('the', 'DET')]
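The output above lists (word, tag) pairs. If only the tags and their frequencies are wanted, a sketch along these lines may help. The `sample` data and the helper name `tags_before` are made up for illustration, and plain `zip` is used in place of `nltk.bigrams`.

```python
from collections import Counter

# Hypothetical tagged tokens standing in for brown.tagged_words(...)
sample = [("The", "DET"), ("news", "NOUN"), ("was", "VERB"), ("good", "ADJ"),
          ("news", "NOUN"), ("and", "CONJ"), ("the", "DET"), ("news", "NOUN")]

def tags_before(target, tagged_words):
    """Count the tags of tokens immediately preceding `target`."""
    pairs = zip(tagged_words, tagged_words[1:])
    return Counter(prev_tag for (_, prev_tag), (word, _) in pairs if word == target)

tag_counts = tags_before("news", sample)
print(tag_counts.most_common())
```

With the real corpus, `nltk.bigrams(brown_news_tagged)` can replace the `zip` call so the token sequence is not sliced.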


### 3. What are the words that are highly ambiguous as to their part-of-speech tags (i.e. the word has more than 3 POS tags) in the Brown corpus (categories="reviews")? (Please use the universal tagset)

In [3]:
from nltk import ConditionalFreqDist

# get the tagged words for reviews category from brown corpus
brown_reviews_tagged = brown.tagged_words(categories='reviews', tagset='universal')
data = ConditionalFreqDist((w.lower(), t) for (w, t) in brown_reviews_tagged)

# keep only the words observed with more than 3 distinct POS tags
for word in sorted(data.conditions()):
    if len(data[word]) > 3:
        print(word, data[word].most_common(10))

close [('ADJ', 6), ('ADV', 4), ('VERB', 1), ('NOUN', 1)]
that [('ADP', 167), ('PRON', 115), ('DET', 65), ('ADV', 1)]


### 4. Train a unigram tagger on the Brown corpus (categories="humor"). a) Split the data into training and testing datasets: train on 95% of the data and test on the remaining 5%. b) Evaluate the performance of this tagger. c) Use this tagger to tag some new text ['this','is','a','NLP','class']. d) Observe that some words are not assigned a tag. Explain why not. (Please do not use the universal tagset)

In [4]:
from nltk.tag import UnigramTagger

# get the tagged sentences for the humor category
brown_tagged_sents = brown.tagged_sents(categories='humor')

# a) split the data
limit = round(0.95 * len(brown_tagged_sents))
training_data, test_data = brown_tagged_sents[:limit], brown_tagged_sents[limit:]
unigram_tagger = UnigramTagger(training_data)

# b) evaluate the performance of the tagger
print("Performance of the tagger is", round((unigram_tagger.evaluate(test_data)*100),2), "%")

Performance of the tagger is 70.33 %


In [5]:
# c) tag a new text with the trained tagger
unigram_tagger.tag(['this','is','a','NLP','class'])

# d) Explanation of the output in part c)
# Ans: 'NLP' never appeared in the training sentences, so the unigram tagger has no
# most-likely tag stored for it. With no backoff tagger configured, it returns None.

[('this', 'DT'), ('is', 'BEZ'), ('a', 'AT'), ('NLP', None), ('class', 'NN')]
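The `None` tag for `NLP` follows from how a unigram tagger works: it stores the most frequent tag for each training word and has nothing to return for unseen words. The following is a toy re-implementation in plain Python, not NLTK's actual code. The training sentences, the helper names `train_unigram` and `tag`, and the choice of `'NN'` as a default are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical training sentences with Brown-style tags
train = [[("this", "DT"), ("is", "BEZ"), ("a", "AT"), ("class", "NN")],
         [("a", "AT"), ("class", "NN"), ("is", "BEZ"), ("fun", "JJ")]]

def train_unigram(tagged_sents):
    """Map each word to the tag it carried most often in training."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag_ in sent:
            counts[word][tag_] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag(words, model, default=None):
    """Look each word up; unseen words get `default` (None mimics no backoff)."""
    return [(w, model.get(w, default)) for w in words]

model = train_unigram(train)
no_backoff = tag(["this", "is", "a", "NLP", "class"], model)
with_default = tag(["this", "is", "a", "NLP", "class"], model, default="NN")
print(no_backoff)
print(with_default)
```

In NLTK itself, the comparable option is to pass a backoff tagger, for example `UnigramTagger(training_data, backoff=nltk.DefaultTagger('NN'))`, so that unseen words receive a fallback tag instead of `None`.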

### 5. Explore the nps_chat corpus and find out what part-of-speech tags occur before a noun, with the most frequent ones first. (Please use the universal tagset)

In [6]:
# build bigrams over the tagged words of the nps_chat corpus
nps_chat_tagged = nltk.bigrams(nltk.corpus.nps_chat.tagged_words(tagset='universal'))

# collect the POS tag of every token that immediately precedes a noun
pos_noun = [a[1] for (a, b) in nps_chat_tagged if b[1] == 'NOUN']

# print results
print(FreqDist(pos_noun).most_common())

[('X', 2558), ('DET', 1308), ('NOUN', 1262), ('VERB', 947), ('ADJ', 823), ('PRON', 676), ('.', 636), ('ADP', 567), ('CONJ', 238), ('NUM', 223), ('ADV', 218), ('PRT', 189)]


### 6. Explore the Brown corpus (categories="romance") to find all tags starting with VB and their associated (word, frequency) pairs (no more than 6 pairs). (Please do not use the universal tagset)
### For example, one of the outputs should look like:
### VBG [('going', 59), ('looking', 36), ('trying', 23), ('thinking', 21), ('watching', 20), ('taking', 19)]

In [7]:
# function that collects, for each tag starting with the given prefix, its 6 most common words
def findtags(tag_prefix, tagged_text):
    cfd = nltk.ConditionalFreqDist((t, w) for (w, t) in tagged_text if t.startswith(tag_prefix))
    return dict((t, cfd[t].most_common(6)) for t in cfd.conditions())

# build the dictionary of VB* tags from the tagged words of the romance category
tagdict = findtags('VB', brown.tagged_words(categories='romance'))

# print the values
for tag in sorted(tagdict):
    print(tag, tagdict[tag])

VB [('get', 92), ('know', 88), ('go', 76), ('see', 74), ('take', 62), ('say', 59)]
VB+PPO [("Let's", 10), ("let's", 5)]
VBD [('said', 318), ('went', 82), ('thought', 80), ('came', 75), ('knew', 69), ('looked', 68)]
VBG [('going', 59), ('looking', 36), ('trying', 23), ('thinking', 21), ('watching', 20), ('taking', 19)]
VBG+TO [('gonna', 4)]
VBG-TL [("Racin'", 1), ('Dancing', 1), ('Surviving', 1)]
VBN [('got', 36), ('come', 29), ('done', 29), ('gone', 25), ('seen', 20), ('made', 20)]
VBN+TO [('gotta', 1)]
VBN-TL [('United', 3), ('Armed', 1), ('Forked', 1)]
VBZ [('says', 7), ('wants', 7), ('goes', 5), ('gets', 4), ('thinks', 4), ('makes', 4)]


### 7. Write programs to process the Brown Corpus and find answers to the following questions (Please do not use the universal tagset):
### a. Which nouns are more common in their plural form (e.g. tag='NNS'), rather than their singular form (e.g. tag='NN')? (Only consider regular plurals, formed with the -s suffix.)
### b. What do the 10 most frequent tags represent in the Brown Corpus? Please output the tags and explain.


In [8]:
# a)

# sets of lowercased words tagged NN (singular) and NNS (plural)
singular = set([w.lower() for (w, t) in brown.tagged_words() if t == 'NN'])
plurals = set([w.lower() for (w, t) in brown.tagged_words() if t == 'NNS'])

# singular nouns whose regular -s plural also occurs
common = [a for a in singular if a + "s" in plurals]

# count every occurrence of each word form, whatever its tag; tokens are matched
# case-sensitively against the lowercased sets, so capitalized tokens are skipped
singular_fd = FreqDist(w.lower() for (w, _) in brown.tagged_words() if w in singular)
plurals_fd = FreqDist(w.lower() for (w, _) in brown.tagged_words() if w in plurals)

# find out which words are more common in the plural form
common_plurals = [(plurals_fd[i + 's'], i + 's', singular_fd[i], i) for i in common if plurals_fd[i + 's'] > singular_fd[i]]

# show the top 25, sorted by plural count
sorted(common_plurals, reverse=True)[:25]
# each tuple is (plural count, plural form, singular count, singular form)

[(943, 'years', 649, 'year'),
 (391, 'eyes', 119, 'eye'),
 (361, 'things', 331, 'thing'),
 (312, 'members', 133, 'member'),
 (291, 'means', 199, 'mean'),
 (269, 'words', 261, 'word'),
 (204, 'students', 109, 'student'),
 (193, 'minutes', 54, 'minute'),
 (188, 'months', 130, 'month'),
 (179, 'conditions', 89, 'condition'),
 (173, 'hours', 145, 'hour'),
 (169, 'miles', 42, 'mile'),
 (160, 'terms', 79, 'term'),
 (150, 'friends', 125, 'friend'),
 (138, 'methods', 137, 'method'),
 (125, 'sales', 44, 'sale'),
 (115, 'arms', 91, 'arm'),
 (106, 'leaders', 69, 'leader'),
 (103, 'elements', 52, 'element'),
 (102, 'factors', 71, 'factor'),
 (99, 'events', 81, 'event'),
 (98, 'techniques', 58, 'technique'),
 (97, 'dollars', 43, 'dollar'),
 (96, 'institutions', 38, 'institution'),
 (95, 'trees', 56, 'tree')]

In [9]:
# Sanity check: 'days' does not show up in the list above because the singular form
# 'day' is more common.

print(singular_fd['day'])
print(plurals_fd['days'])

# The singular form appears 623 times versus 377 for the plural, so the pair is excluded.

623
377
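The counts in part a) include every occurrence of a word form, regardless of tag, so a verb use such as `means` (VBZ) adds to the plural total. A minimal sketch of a tag-restricted variant follows, assuming only NN and NNS tokens should be counted. The `sample` data and the helper name `plural_dominant` are invented for illustration. Like the original, it only considers nouns that occur at least once as NN.

```python
from collections import Counter

# Hypothetical tagged tokens; 'means' appears once as a verb (VBZ)
sample = [("eyes", "NNS"), ("eyes", "NNS"), ("eye", "NN"),
          ("day", "NN"), ("day", "NN"), ("days", "NNS"),
          ("means", "VBZ"), ("means", "NNS"), ("mean", "NN"), ("mean", "NN")]

def plural_dominant(tagged_words):
    """Return (plural, n_plural, singular, n_singular) rows where NNS outnumbers NN."""
    nn = Counter(w.lower() for (w, t) in tagged_words if t == "NN")
    nns = Counter(w.lower() for (w, t) in tagged_words if t == "NNS")
    rows = [(s + "s", nns[s + "s"], s, nn[s]) for s in nn if nns[s + "s"] > nn[s]]
    return sorted(rows, key=lambda r: r[1], reverse=True)

result = plural_dominant(sample)
print(result)
```

On the real corpus this variant may produce a different list from the one shown in part a), since untagged-count effects such as verb uses no longer contribute.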


In [10]:
# b)

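# count every tag in the Brown corpus, then print the 10 most common
fd = nltk.FreqDist([t for (_, t) in brown.tagged_words()])
print(fd.most_common(10))

# Brown tagset meanings: NN = singular or mass noun, IN = preposition, AT = article,
# JJ = adjective, . = sentence-final punctuation, , = comma, NNS = plural noun,
# CC = coordinating conjunction, RB = adverb, NP = singular proper noun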

[('NN', 152470), ('IN', 120557), ('AT', 97959), ('JJ', 64028), ('.', 60638), (',', 58156), ('NNS', 55110), ('CC', 37718), ('RB', 36464), ('NP', 34476)]


### 8. Write code to search the Brown Corpus for particular words and phrases according to tags, to answer the following questions (please do not use the universal tagset):
### a. Produce an alphabetically sorted list of the distinct words tagged as MD.
### b. Identify three-word prepositional phrases of the form IN + AT + NN (e.g. in the lab).

In [11]:
# a)

# distinct words tagged MD, sorted alphabetically
print(sorted(set([w[0] for w in brown.tagged_words() if w[1] == 'MD'])))

['Can', 'Could', 'May', 'Might', 'Must', 'Ought', 'Shall', 'Should', 'Will', 'Would', "c'n", 'can', 'colde', 'could', 'dare', 'kin', 'maht', 'mai', 'may', 'maye', 'mayst', 'might', 'must', 'need', 'ought', 'shall', 'should', 'shuld', 'shulde', 'wil', 'will', 'wilt', 'wod', 'wold', 'wolde', 'would']


In [12]:
# b)

# three-word prepositional phrases of the form IN + AT + NN
def process(sentence):
    for (w1, t1), (w2, t2), (w3, t3) in nltk.trigrams(sentence):
        if t1 == 'IN' and t2 == 'AT' and t3 == 'NN':
            print(w1, w2, w3)

In [13]:
# print the phrases
for tagged_sent in brown.tagged_sents():
    process(tagged_sent)

of the election
for the manner
in the election
to the end
on a number
of the law
through the welfare
in the state
with the exception
in the future
in the appointment
in a manner
for the purpose
at the jail
for the mayor
than a year
on the petition
from the audience
for a state
under the county
of the highway
of the number
in the session
to the state
of the act
with a battle
against the issuance
on the increase
of the action
by a vote
of a resolution
in the past
in a privilege
in the event
with a pistol
to the election
in a dispute
with the county
During the election
for the measure
to a subcommittee
at the end
to the state
to the economy
of the bill
on the hearing
to the state
for the day
in the state
on the calendar
in the future
of a school
in the meantime
of a site
of a series
under a bill
by a board
of the governor
of the poll
in the committee
of the committee
of the fire
in the subject
of the proposal
at the close
upon the recommendation
in a report
on the aid
of the ADC
in the co

to the public
by the section
in the exhibition
of the family
of the person
at a luncheon
of a daughter
at a luncheon
at the family
down the back
with a skirt
among the court
to the queen
with an edging
on the bodice
in the movement
in the salary
in the union
on the job
in the face
with a pistol
after the robbery
of the cab
of a food
from the cash
in a fire
to the hospital
of the face
to the practice
of the superintendent
For a number
of the school
at the hospital
of a barber
of the shop
in a letter
on the job
in the snow
through the week
in the back
to the hospital
of the lack
on the bay
to a call
on the first-floor
on the kitchen
from the house
of the car
into the post
From the outset
of a hardware
of the terrace
on the ground
to a study
from the reactor
at the end
on a roll
from the mystery
of the week-end
under the kitchen
for the attorney
in the spy
to the cabinet
about the ship
to a jury
in the name
With the machinery
for the hull
of the hunter-killer
for the future
to the transmi

against the union
with the movement
of an understatement
in a week
after a parade
to a total
with the breakoff
to the break
to a cry
by a swarm
in the world
to the setting
in the establishment
to the extension
of a piece
Considering the state
for a finish
to the floor
under the name
In the heat
with the necessity
on the floor
in the prospect
at the beginning
on the filibuster
to a settlement
with the convening
of the election
of the man
for the presidency
of an election
for the victory
through the tournament
of the tournament
off the course
by a stroke
in the afternoon
up the fairway
into a bunker
with a sand
over the green
down the slope
toward a TV
for a tie
for a second-place
of the field
of the draw
at the time
of the tour
around the course
at the flag-stick
toward the hole
of the tournament
in the day
by the time
on the course
for the rest
of the weekend
around the course
in a twosome
of the gallery
for a contender
with a birdie
On the back
of a gallery
of the tournament
by the ro

beneath the hood
over the country
At the head
of the CDC
into an array
to the Aj
to the county
on the Aj
with the Aj
of the GOP
of the world
to the USSR
over the status
of the fire
from the war
in the USSR
of a newspaper
In the middle
of the century
with a circulation
in the country
of the century
of the country
on the road
of a century
over the ceiling
with a leader
of the nation
of a deadlock
for a lawyer
to the prison
of an iron
of the cell
at the death
into a baseball
for a mechanic
of the painting
in a market
of the man
in the administration
of the signing
of the treaty
on the job
to the comment
for the rest
from a youngster
of the freshman
on the tour
with the publishing
in the spring
in the light
under the chairmanship
in the publishing
with the dignity
of the water
of the water
at the beginning
without the necessity
To a novice
of the church
of the meaning
for a restudy
with the rapidity
to the public
of the work
of the project
of a heel
with a lustre
in a variety
including the

in a world
in a tempo
of the dialogue
in the film
of the drama
of the wonderfulness
of the court
in the program
on the bill
to the medium
of the family
of the NBC
on the number
of the hour
In an hour
of a mood
on the music
on the choice
of the in-person
of the past
with the word
in the title
on the highway
of an interruption
considering the talent
with the help
of the composer
at the moment
of a voice
at the start
In the fullness
into the score
of the cast
in a concert
for the star
to the possibility
of the time
to the music
with the difference
on a number
on the record
on the master
to the character
from the fact
in a room
on the wall
of a tennis
During the making
of the motion
to an industry
of the globe
of the globe
of the program
in the case
by the directness
to the nickname
by the composer
for the keyboard
in the treble
for the benefit
in the field
in a series
of the war
by the courage
by the ability
in the way
by the name
for the production
in the interpretation
of the music
afte

from the discussion
to the event
on the turn
of the day
in a column
on the bonfire
of the story
of the increase
in the chapel
to a newspaper
in the air
in the city
in the mass
against the portrait
during the sickness
of a man
in the life
with a biography
for a while
in the crucifixion
with the history
by a spiral
of the spiral
on the basis
on the basis
of the story
in the sense
in the sense
with the fall
in the similitude
in the statement
from the soul
of an animal
through the fall
of the dominion
about the image
between the image
with the capacity
under the pretext
of the beginning
on the basis
of the end
to the word
in the beginning
into the world
with the inception
on the spiral
At the nadir
with the fruit
under the power
to the death
of the tree
to the shape
at the nadir
of the circuit
from the tree
by the dissolution
of the flesh
in the earth
into the scheme
for the passage
by the absence
in the achievement
of the work
within the process
of the process
for the creation
into the li

of a bull
in a china
in the course
at the outset
by the individuality
into the spirit
to the performance
with the disc
with a group
by the magnificence
by no means
by the fact
From the beginning
of the original
On the basis
to the degree
on the basis
than a copy
of the recording
with the frequency
to the score
on the intention
of the symphony
of the scherzo
of the finale
In the end
to a melody
to the depth
of the orchestration
into the ensemble
in the score
in the history
at the show
from the dog
for a number
of the ring
into the ring
in the ring
on the subject
at the show
around the ring
over the public-address
for the year
from the group
in the center
of the ring
of the year
for the founder-originator
throughout the year
in the horse
in the horse
in the dog
in a dog
in the dog
for a room
of the way
in the room
In the morning
to the fact
near a body
in every state
of the government
over the country
to the water
to a lake
into the car
to the water
to the boat
to the gear
at the door
at

of the house
in a closet
From the coil
in the yard
in a mild-winter
above the cost
in the basement
in the attic
to a point
than the price
on a variety
besides the nature
from the outside
of a conditioner
in an hour
to the cooling
for the horsepower
of the compressor
of the unit
With a unit
on the outside
of the house
in the roof
of the house
of a gas
to the moisture
in the ceiling
in the side
to a minimum
in the installation
on the basis
above a bedroom
of a site
through the work
of the site
from the county
in the field
during the time
of the year
of the climate
to the sun
in the field
at the site
at the office
in the area
regarding the site
to the site
by a group
on the character
of the site
of the investigator
of the area
for the future
to the public
of the recreation
at the site
on a body
for a park
of a valley
on the coast
of the surf
on a beach
through the beauty
for the public
for a recreation
on an ocean
of the water
in the springtime
on a reservoir
of the water
of the water
to 

to the rest
of the world
in the sound
At a minimum
by the therapist
by a battery
for the therapist
of the therapist
by the linguist
of the voice
to the therapist
to the patient
from the book
by the use
of the triad
to the linguist
to the therapist
to the therapist
of a divorce
in the voice
of the temperament
of the real-life
to the point
up the calf
of the boot
on the mole
by a pair
in a world
of the mushroom
from a variety
to the use
to the stockpiling
into the hole
of the nitrogen-mustard
in the way
to the attack
at a convoy
over the city
of the war
in the city
in the hold
in the earth
at the delay
at the chance
down the gangplank
of the crew
with a cargo
from the chocolate
of the city
in the light
in a lake
for a raft
over the side
in the merriment
in the past
for the fact
with a friend
upon an event
by the arm
in a dream
to the understanding
of the dream
of the dream
on the subject
in the fact
of a dream
through the horror
from the hodge-podge
of the future
for a theory
in the theo

at the outset
in the past
on the frontier
in a country
of the twentieth-century
for the historian
in the street
for the enemy
with the front
of a complex
to the man
through the history
in an age
for the amateur
about the dollar-sign
to the collector
to the field
with the meaning
of the material
at the thought
beyond the sewage
with the amateur
of the analyst
of the university
to the collector
with a superstition
by the amateur
at the college
of the folk
unlike the union
with an awareness
of the folk
in the spirit
with a kind
in the arrangement
by a psychologist
of an immigrant
with the child
of the neighbourhood
At the age
by a streetcar
with the police
On the occasion
by an alderman
with a tool
in a labour
at a cost
by the opposition
of the state
of the banquet
for an explanation
into the detective
during the ceremony
with the presentation
for a tip
with a platinum
in the relationship
by a clergyman
At the end
of the performance
among a surge
in the stomach
of the theatre
in the view


on the prod
on the peck
by the cowhand
of an animal
by the tail
of a rope
through a turn
of the tail
'bout the saddle
in the case
of a steer
for the rest
of the day
From a pen
by the tail
into the ground
with a stripe
from the rest
down the back
of the longhorn
of a color
up the trail
with a lineback
by the owner
of the rustler
on the range
in a pasture
for the purpose
through the pasture
at the request
in the presence
of a representative
of the bank
against the herd
to the profit
of the seller
durin' the boom
to a saloonkeeper
by the name
from a wilderness
of the word
for the hoss
to a cow
of the maverick
near the carcass
of a zebra
to the ground
with the discovery
in the shadow
on the plain
in a heap
of the party
from a pack
to the pile
of a lion
toward the sound
from the top
At the base
toward the sound
of the barking
of the lion
to the crest
of a bush
with a start
to the top
at the rock
from the run
In the graveyard
of a bush
on the rock
on the back
of a chance
on the back
to the r

at the station
on the trip
by the war
of the loveliness
than a year
of the science
at the place
After a supper
at the foot
for the condition
at a loss
to the museum
on the train
on the way
at the hotel
from the town
In the evening
on the train
on the train
of the river
In the midst
to the hotel
at the train
in a class
for the use
about the value
into the community
on the campus
in the world
of the family
with a model
for the storage
in a classroom
in the cafeteria
of the USIS
for a visit
with a professor
of the desegregation
of the force
of a boycott
throughout the nation
between the practice
for the sake
of the policy
in the matter
within the company
of the country
of the country
over the slavery
in a situation
of the matter
in every case
of the fact
by the use
by the use
of the action
of the use
in the way
of the sit-in
from the top
to the bottom
in the world
in the order
in the way
in the choice
of the victim
by the user
of the fact
over the justice
of the cause
in the use
in the fa

in the end
on the way
to the churchyard
from the highroad
of a conflict
of a proverb
of a conflict
against the background
at the turn
of the century
in the art-shop
of the shop
by the porter
On the street
to the witness
of the world
in the form
of the artist
in a world
by the march
Without the decay
of a sense
into the nature
in the philosophy
from the solipsism
by the truth
in the education
to the power
upon a recognition
unlike the philosopher
of the imagination
in the absence
of the force
without the incitement
of the vitality
to a sense
in the presentation
with the force
into an attitude
upon the door
upon the self
by the persistence
concerning the nature
to the attitude
of the world
of a self
of the environment
of the mind
through the present
toward the future
of the nature
to the world
in the mode
in the mode
of the quality
in the way
in the mode
from the past
of a chain
into the past
unlike the entity
during an automobile
in every instance
of the sort
with the renewal
of the lan

at the end
to the stream-of-consciousness
to an independence
from the pattern
of the region
of the planter
to a society
of the war
toward the end
of the century
on the planter
in the character
with a class
in the writing
of the frontier
of the village
to an audience
during the ice
upon the idea
to a conception
in the view
from the fact
with a plan
with a child
round the world
with a stake
with the town
of the status
of the age
with a gun
on a desk
of the day
around the world
to a temper
for a way
from the lesson
at the moment
by the extent
of the world
at the response
by the fact
to an understanding
of the marketplace
in a society
into the ground
to the situation
for the reason
from a knowledge
of the present
in the bottle
by a corporation
for the present
of the host
for the advent
of the situation
around the pole
on the horizon
during the course
from the contour
of the shadow
by a gnomon
over the horizon
at the center
of the universe
in a jar
at the bottom
in the air
to the center
of 

toward the bridge
from the bicycle
beyond the town
on the line
on the line
of a visa
From the crowd
on the line
to the side
of the road
on the sidewalk
on the back
of the bridge
of the situation
at the bridge
of the girl
from the bridge
to the side
with a toy
by a neighborhood
with the soup
at the post
of the tent
around the fire
of the tent
In the mail
by a world
above the patter
of the rain
On the evening
to the hospital
of the situation
with the enemy
of the war
after the war
to a gallop
for the story
in the record
of the campaign
to a counterattack
of the mayor
On the morning
with the enemy
of the enemy
of the gain
from the balance
of the line
with the note
in the campaign
of a part
in the air
In the midst
of the war
on an average
In the summary
of the campaign
near the letter
in the word
of the day
for the movement
of the skirmish
in the way
in the way
in the cavalry
into a prejudice
off the country
for a delay
in a flank
from the sight
of the enemy
for the foot
for the test
into 

on the basis
to the amount
upon the soundness
in the amateur
of the historian
concerning the meaning
between the leader
at the expense
of the propagandist
on the suppression
for a synthesis
on the writing
to a profession
of an hypothesis
to the test
without the statement
in the value
to the woolen
to the lifting
of a fog
of the author
despite the insight
of the nature
for a law
of the unknown
on the flow
with the divine
through the miracle
of the supernatural
in the influence
on the miracle
concerning the nature
to the understanding
to the status
to the fact
of the race
for a museum
during a cholera
into the emerald
in the forest
in the house
of the telephone
of a convict
during the depression
round the house
of a man
to no good
on the telephone
in the army
on the train
for a while
in a movie
of the house
with the feeling
with a comparison
about a dinner
at the title
of the book
in a rainstorm
of a taxi
at the time
at the apex
in the spirit
in the end
at the age
in the church
in the ho

for the presentation
in the year
For the purpose
throughout the world
throughout the land
in the year
by the choreographer
in the middle
for the purpose
of the art
in the year
of the world
with the oppression
with the conviction
in the future
in the future
in the past
in the year
for the development
for the large-scale
of a quality
of the research
to the point
in the construction
of a demonstration
on the basis
by a report
on the size
for the recovery
from the conversion
in the case
with a view
to the end
to the defense
in the interest
in the interest
to the ownership
at the beginning
of the action
beyond the end
beyond the end
to the approval
in the development
to the program
for the use
throughout the world
for the use
of the joint
upon the expiration
after the date
Upon the expiration
of a period
of the sale
from the operator
for the purpose
for the study
after the date
of the sale
to the metal
of the difference
for the month
of the difference
for the month
during the calendar
durin

During the period
upon the operation
at the location
of the class
in the case
by an analysis
of the time
of the time
to the extent
to the degree
In the allocation
in the allocation
On the remainder
In the daytime
to the class
on the basis
in the light
of the proceeding
after the accumulation
in the number
to the service
from the ionosphere
within the time
across the top
after the close
for the calendar
for the calendar
for the calendar
for the calendar
to the withholding
after the close
after the close
for the decedent
on the basis
of a calendar
for the period
for a decedent
of an overpayment
to the return
after the close
of the tax
in a letter
for the year
for an extension
at the rate
for a receipt
to the tax
to a refund
on the return
in the place
for the district
for the tax
to a trade
for the production
to the property
with the performance
with the performance
by the estate
of the excess
in the return
to a refund
by the withholding
of the mind
on the basis
of a corps
of the subject


to the force
in the fluid
of an ellipsoid
in the ab
at the end
of the axis
in the literature
on the shape
in a paper
of the material
on the photograph
from the angle
of the material
over the range
to the relation
on the particle
of the hydrogen
of the Af
for the study
on the chromium
of the hydrogen
of the interaction
by the electron
of the hydrogen
in the absence
in the solid
to the conclusion
with the Af
in the unit
in the unit
to a structure
of the structure
of the structure
of a center
of the Af
of a packing
of a center
for the hydrogen
on a sheet
between the oxygen
on a sheet
of the absence
of the sample
in a bomb
from the atmosphere
of the material
with the pattern
of the Af
on the basis
of the shape
with an amplitude
with the absorption
of the spin
of the resonance
over the temperature
For the low-temperature
for the period
to the atmosphere
of the Af
of the crystal
of the sample
from a sample
on the end
of a capillary
to the top
in the capillary
at a time
In the household
of a 

In an attempt
in the weakness
for the patient
in the muscle
over the thyroid
with a conversion
of the anemia
of the demineralization
at the outflow
of the heart
of the aorta
in the apex
of the stomach
of the stomach
of the jejunum
of the colon
of the mucosa
from the jejunum
On the surface
of the gastrocnemius
of the muscle
in the gastrocnemius
in the pectoralis
In the gastrocnemius
of the length
in the muscle
throughout the fiber
of the fiber
with an 80%
through a diethylaminoethyl
in a cryostat
under the slide
under a fan
with a drop
of the saline
of the liquid
under the fan
with a mercury
from the lamp
In the eyepiece
with the code
of the pseudophloem
by the Af
with the staining
through a DEAE-cellulose
through a column
of the DEAE-cellulose-treated
through a DEAE-cellulose
with the conjugate
of the titer
of the conjugate
under the microscope
of the xylem
in the pseudophloem
in the xylem
Within the pseudophloem
in the cytoplasm
of the stem
from the fact
for the failure
through a DEAE

in the unconscious
between the self
for the reason
in the session
before the end
of the session
into a suspicion
in a form
in a world
in the air
in the air
in the voume
between the occurrence
in the dictionary
into a description
in the dictionary
to the problem
in a language
by a space
of a form
by the text
by the computer
on the spelling
in the text
in the computer
about the form
of the form
from the dictionary
of the translation
in the sense
with the information
to the form
in the W-region
in the text-form
in the X-region
of a cell
in the X-region
from the form
of a form
in the text-form
of the W-region
to the form
of an occurrence
in the text
with the form
in the cell
from the form
in the X-region
in the chain
to an address
to the address
in the Y-region
in the chain
in the chain
of the Y-cell
in the chain
of the Y-cell
of the chain
of the chain
in the form
to the form
in the text-form
in the text-form
by the dictionary
in the text-form
by a text
with a dictionary
in the dictionary


in a reorganization
of a type
by a transferor
of a corporation
in the trade
of a transferor
in the business
of a group
following a reorganization
of the transferor
of the transferee
for a person
to a corporation
over the dissent
in the statute
in the statute
of a mail
of the opportunity
to a mail
in the question
for the benefit
for the benefit
for the benefit
of the defense
for the benefit
for the benefit
of a mail
In the preparation
of the questionnaire
of the questionnaire
to the following
of the questionnaire
of the questionnaire
of the questionnaire
of the questionnaire
of the questionnaire
of the questionnaire
of the questionnaire
for the name
of the respondent
of the respondent
over a period
of the questionnaire
with a cover
in the state
of the research
of the questionnaire
in the state
throughout the country
of the questionaire
from a list
by the research
by the respondent
of the study
in the study
in the state
to the research
in the compilation
from a group
to the research
in t

of the word
in the midst
on the grave
to the point
in the service
to the fact
around a series
over the world
in the half-light
in a variety
After a while
of the behavior
concerning the motion-pattern
with a number
in a series
to the occasion
by a number
with a challenge
in the room
for the future
of the world
of the intellect
in every respect
with a minimum
from the outset
of the book
to the question
on the problem
by the use
of the feeling-state
of the population
of the malaise
of the intellectual
of the malaise
of the intellectual
with the evidence
in the sphere
of the bureaucratization
of the corporation
from the point
in the history
of the world
from a writer
in the factory
to the proposition
Considering the nature
to the reader
of the pseudo-happiness
of the automaton
of a work-satisfaction
in a system
at the moment
in the example
on the premise
on the validity
of the failure
by a number
at the list
of the use
of the term
on the stage
including the chorus
in the music
to the orche

upon the dose
of the food
of a source
in the development
in the range
for the future
on a megawatt
in the neighborhood
in a reactor
in the cost
of the development
with the radiopasteurization
in the absence
within the package
in the range
in the area
for the cesium-137
in the range
of an electron
with a minimum
to a month
from the seacoast
of the removal
along the substrate
into the instrument
by a repeat
over a period
from a substrate
by an instrument
with a knife
upon the thickness
of the coating
during the course
of the knife
of a coating
of the coating
to the substrate
of the coating
along a plane
at the tip
of the knife
to the coating
in a region
to the surface
of the substrate
of the removal
of a coating
by a knife
through a coating
of the removal
from the reaction
against the face
of the knife
along the shear
with the substrate
from the substrate
along the shear
on the shear
on the face
of the knife
of the knife
with the substrate
with the plane
of the substrate
of the vector
of

in a mountain
on the patrol
of the cave
of a gold
around the neck
with a knife
from the knife
for the DSM
in the war
by the window
over the river
on a day
into the bone
of the spirit
up the ladder
across the campus
By the time
over a youth
about the visit
at every piece
in the hall
in the library
in the library
of the lawyer
in the shadow
in a machine
of the beauty
near the river
in a region
with no sewage
save the river
on the hill
In the midst
with the support
to a sermon
of the audience
to the audience
on the loss
of the audience
at the time
with no plumbing
in the home
of the church
from the path
of the pastor
to the city
from the church
with a side
of the church
of the church
in the center
in every condition
across the river
of a man
of the church
across the river
across the river
on the subject
on the part
of the night
From the outside
of the gentry
up the stoop
For a moment
for a hotel
with a sort
At the top
through a pantomime
at the time
for the bed
in a nightdress
into the ha

on the chair
on the table
on the bed
from the pillow
for the coffeepot
in the apartment
on the back
into a fight
with the percolator
without an expression
in the world
from the bed
on the sidewalk
in the kitchenette
on the telephone
on the telephone
of a situation
in the hall
at the head
on the street
in the evening
on the table
of the water
for the christening
towards a layette
with a job
over the labor
in the port
off a fellow
in the movement
for a woman
in the mission
for a position
in the hierarchy
of the church
for the bishopry
to the colonel
to the church
in the village
in the triumph
in the world
in an area
with the thought
to the village
in the print
in the village
at the prospect
at an inn
near the hospital
with the enemy
of a child
on a holiday
in the mission
From the moment
into the mission
for a job
on a piece
for a time
of a biography
on the coast
between the land
on the beach
at the age
into the mission
from the village
on the day
to the village
in the village
of the line

in a container
by the office
of the motel
over the outside
against the wall
on the side
With the aid
for the night
down the road
of a mile
on the car
against the wall
by the man
in the car
to the drive-in
on a woman
in a hurry
in a hurry
in the evening
in a hurry
on a diet
on the device
on the door
toward the door
around the corner
At the door
for the latch
to the back
In a flash
to the back
from the male
from the restaurant
at the counter
at the counter
to the door
at a time
to the safety
with the certainty
to the phone
from the phone
to the front
of the restaurant
at the garage
with the zipper
in a while
With a curse
against the wall
with a hiss
in the silence
over the profanity
in the back
of the car
to a box
on the floor
behind the counter
under the counter
in the tool
at the clock
near the border
In the tool
in a smile
against the blanket
in the back
outside a restaurant
at the window
on the counter
on a tissue
to the inside
of the door-frame
of a car
at a point
with the sole
on a

on the plot
of the champagne
into the house
up the marble
At the top
of a room
of the corridor
to the captain
in a plot
to the slaughter
over the laundry
along the corridor
into the building
in a sari
to the street
at the building
of the party
from the window
In a minute
out the window
across the street
across the street
with the rifle
of the block
on a shelf
from the building
from a roof
along the roof
of a policeman
With a cop
inside a building
into the hallway
to the vestibule
from a glance
at the hall
of a man
of a row
to the lore
over the city
on the landing
by the light
with a thrill
to the street
of a rifle
of the door
against the window
on a kitchen
to the sill
to the source
of the disturbance
to the window
for the moment
to the wall
for the switch
in the dark
at the instant
from the table
from the table
with the floor
into the hall
against the wall
to the gum
inside the safe
against the gun
at the clerk
in the mirror
over the washbowl
for the son
of a bitch
in a gesture
behind

on the brain
of a play
from the half-man
in the face
against the cartilage
after the torso
in a ground-truck
in the truck
to the door
of the cabin
of the doorway
into the cabin
by a family
to the group
across the plain
to the herd
for the wisdom
from the field
on a rock
at the end
toward the place
in the middle
of the night
for the bitterness
against the fence
toward the house
of the rest
for a time
against the fence
toward the house
to the porch
In the kitchen
against the front
of the house
to the girl
in the house
by the sun
of the skin
into the house
on the couch
into the kitchen
into a chair
beside the table
from the water
on the shelf
of the dipper
for a moment
through the country
to the kitchen
from the spring
into the copper
on the stove
from the front
of the house
of the line
about the couple
in the house
of a killer
about the boy
in the morning
in a mine
on a ranch
in a restaurant
of the house
to the table
by an arm
into the kitchen
at the table
of a meal
of a meal
from the sp

in the house
about a piece
on the trigger
into the holster
into a claw
for the gun
on a man
with a club
in the distance
into the stall
With a roar
to the shoulder
into the wall
with a roundhouse
for a moment
with a pitchfork
of the fork
into the floor
in an attempt
by the hair
into the wall
from the house
from the wall
on the nose
with a blow
at the floor
into a corner
of the shirt
to the floor
to the stall
of the stall
of the action
by the shoulder
into the barn
for a while
toward the stall
through the door
for a walk
on the haggle
by the river
for the evening
in the sleeve
From the way
in the road
of the time
at the foot
down the grade
with a glory
over the wheel
in the heat
of the wagon
of the day
on the news
on the fiddle
on the shoulder
in the circle
of a hand
on the handhold
on the face
of the cliff
on the edge
of the grub
in the midst
in the dell
in the dell
in the dell
At the sight
with the memory
in the center
of the circle
to the line
of the circle
in the circle
in the train


with a lot
by a dream
for an instant
to the toilet
behind the house
in the flesh
by the way
of the year
through a window
on the floor
for a moment
out the window
down the hill
from the school
of the schoolhouse
to a walk
in the act
in the act
with the solicitude
in the kitchen
from the quarry
from the table
out the kitchen
up the hill
through a corn
to the school
by the quarry
into the profanity
in a while
of the schoolhouse
into the schoolhouse
across the aisle
in the sixth-grade
to the schoolhouse
out the road
in the well-house
of the schoolhouse
of the schoolhouse
on the ground
on the end
of the line
in the middle
of the line
out the road
toward the field
to the schoolhouse
of the girl
in a corner
to the plan
near a window
On the fringe
with the snake
to the woman
with the slave
for the dance
to the center
against the gourd
after a fashion
for the girl
for a spell
with the king
with the avidity
of the snake
on the floor
in a torpor
about the price
in the back
of the carriage
on the 

in the hall
of the tension
to the airport
for a couple
For a moment
on the receiver
up the path
at the moment
on the sand
around the curve
in the wind
on the beach
about a suit
on the moon
to the rock
on a business
of the nightmare
about the beach
in the back
In the light
from the bedside
on the beach
into a ball
on the rock
by the ocean
of the nightmare
of the rock
into the water
into the air
around the neck
with the crook
on the floor
of a tail
on a ham
in the morning
with a fork
to the beach
on the sand
of the encyclopedia
to the dentist
on the porch
on the ham
in the lagoon
on the stage
to the rock
in the outfield
on the rock
in the back
on the one-o'clock
with a chill
along the horizon
into the water
about the nightmare
in an iron
of the lung
for a second
of the community
of a disappointment
on the rebound
of the stag
from the stag
from a business
at the church
during the ceremony
at the pool
from the gang
at the steel
to the grindstone
into a crowd
for the church
of the church
du

of a slob
in the picture
in the world
with a lot
in the world
in the course
of a day
in the middle
of the night
in the world
in the evening
through the end
of the chapter
with a compulsion
in the sun
in a world
down the valley
of the dawn
with a friend
in the village
to the cellar
without a barrage
in a corner
of the cellar
at the blank
in the world
to a stone
to a stone
After the pegboard
on the problem
in the stone
to the workbench
on a scrap
into a workbench
on the cellar
with an assortment
to a stone
of a problem
of the worktable
on the school
at the moment
on the school
of the fact
at a time
in a cellar
in the wall
of the cellar
with a pile
over the end
with the paneling
over the workbench
By the way
'ceptin' the light
of the power
up the cellar
in the midst
of a confusion
for a shelf
above the kitchen
over a century
of the jest
on the question
in the observance
in the breach
in a sentence
with the doctor
to the car
in the gagline
outside a theater
with the head
of the family
in a

### 9. Use a default dictionary and itemgetter(n) to sort the most frequent tags used in the brown corpus (categories="reviews"). Please first convert the tags into the universal tags.

In [14]:
from collections import defaultdict
from operator import itemgetter

# collect the universal tag of every token in the reviews category
brown_reviews_tags = [t for (w, t) in brown.tagged_words(categories='reviews', tagset='universal')]

# count each tag with a default dictionary (unseen keys start at 0)
counts = defaultdict(int)
for t in brown_reviews_tags:
    counts[t] += 1

# sort the (tag, count) pairs by count, i.e. itemgetter(1), most frequent first
print(sorted(counts.items(), key=itemgetter(1), reverse=True))

[('NOUN', 10528), ('VERB', 5478), ('.', 5354), ('ADP', 4832), ('DET', 4720), ('ADJ', 3554), ('ADV', 2083), ('CONJ', 1453), ('PRON', 1246), ('PRT', 870), ('NUM', 477), ('X', 109)]
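
The role of `itemgetter(n)` in the sort above can be seen on a much smaller input. The following is a minimal sketch using a hypothetical hand-written tag list (`toy_tags` is an assumption for illustration, not corpus data): `itemgetter(1)` selects the count from each `(tag, count)` pair, while `itemgetter(0)` would select the tag itself.

```python
from collections import defaultdict
from operator import itemgetter

# Hypothetical toy tag sequence standing in for the corpus tags
toy_tags = ['NOUN', 'VERB', 'NOUN', 'DET', 'NOUN', 'VERB', 'ADJ']

# Count each tag; missing keys default to 0
counts = defaultdict(int)
for tag in toy_tags:
    counts[tag] += 1

# Sort (tag, count) pairs by count, most frequent first
by_count = sorted(counts.items(), key=itemgetter(1), reverse=True)

# Sort the same pairs alphabetically by tag instead
by_tag = sorted(counts.items(), key=itemgetter(0))

print(by_count)
print(by_tag)
```

Because `sorted` is stable, tags with equal counts keep the order in which they were first seen.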


### 10. Explore the brown corpus (categories="learned") to find the 200 most frequent words and store their most likely tags. We can then use this information as the model for a "lookup tagger" (an NLTK UnigramTagger). If a word is not among the 200 most frequent words, we would like to assign it the default tag "NN". Then use this lookup tagger to tag a new sentence of your own.

In [15]:
# frequency distribution of words and conditional frequency distribution of tags per word (learned category)
fd  = FreqDist(brown.words(categories='learned'))
cfd = ConditionalFreqDist(brown.tagged_words(categories='learned'))

# map each of the 200 most common words to its most likely tag
likely_tags = dict((word, cfd[word].max()) for (word, _) in fd.most_common(200))

# lookup tagger that backs off to the default tag NN for every word outside the model
baseline_tagger = nltk.UnigramTagger(model=likely_tags, backoff=nltk.DefaultTagger('NN'))

In [16]:
# running the experiment for a new example
sent = 'Today is such a nice sunny day'
print(baseline_tagger.tag(sent.split()))

[('Today', 'NN'), ('is', 'BEZ'), ('such', 'JJ'), ('a', 'AT'), ('nice', 'NN'), ('sunny', 'NN'), ('day', 'NN')]
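
In the output above, words such as 'nice' and 'sunny' receive 'NN' because they are not in the 200-word model, so the tagger falls back to the default tag. The sketch below imitates that behavior in plain Python on a hypothetical hand-made sample; `tagged`, `build_lookup`, and `lookup_tag` are illustrative names and not part of NLTK, and this is not how `UnigramTagger` is implemented internally.

```python
from collections import Counter, defaultdict

# Hypothetical hand-made tagged sample (Brown-style tags), not real corpus data
tagged = [('the', 'AT'), ('day', 'NN'), ('is', 'BEZ'), ('a', 'AT'),
          ('nice', 'JJ'), ('day', 'NN'), ('the', 'AT'), ('is', 'BEZ')]

def build_lookup(tagged_words, n):
    """Map each of the n most frequent words to its most frequent tag."""
    word_freq = Counter(w for w, _ in tagged_words)
    tag_freq = defaultdict(Counter)
    for w, t in tagged_words:
        tag_freq[w][t] += 1
    return {w: tag_freq[w].most_common(1)[0][0]
            for w, _ in word_freq.most_common(n)}

def lookup_tag(tokens, model, default='NN'):
    """Tag each token from the model, falling back to the default tag."""
    return [(tok, model.get(tok, default)) for tok in tokens]

# Keep only the 3 most frequent words, so 'a' and 'nice' fall outside the model
model = build_lookup(tagged, n=3)
result = lookup_tag('the day is nice'.split(), model)
print(result)
```

Here 'nice' appears in the sample as 'JJ' but is outside the top-3 model, so it is tagged with the default, which mirrors what happens to 'nice' in the real run.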
