# Chinking in NLP

<b>Chinking</b> is used together with chunking, but while <b>chunking</b> is used to include a pattern, <b>chinking</b> is used to exclude a pattern.

You can also define a pattern of words that must not be part of a chunk; such words are known as chinks. In NLTK, a **ChunkRule** specifies what words or patterns to include in a chunk, and a **ChinkRule** specifies what to exclude from it. Chinking is a lot like chunking: it is basically a way for you to remove a chunk from a chunk. The chunk that you remove from your chunk is your chink.
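The idea can be sketched in plain Python before bringing in NLTK. This is only an illustration of the concept, not how NLTK implements it: `chink_by_tag` is a hypothetical helper, and the three tagged tokens are a shortened sample chosen for the example.

```python
def chink_by_tag(tagged_tokens, excluded_tag):
    """Split (word, tag) pairs into chunks, leaving out tokens with excluded_tag."""
    chunks, current = [], []
    for word, tag in tagged_tokens:
        if tag == excluded_tag:
            # A chink closes the current chunk (if any) and is itself left out.
            if current:
                chunks.append(current)
                current = []
        else:
            current.append((word, tag))
    if current:
        chunks.append(current)
    return chunks


tagged = [("a", "DT"), ("dangerous", "JJ"), ("business", "NN")]
print(chink_by_tag(tagged, "JJ"))
# [[('a', 'DT')], [('business', 'NN')]]
```

Removing the adjective from the middle leaves one chunk on each side of it, which is the behavior the rest of this section reproduces with NLTK.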

Let’s reuse the quote from the section on chunking. You need a list of tuples containing each of the words in the quote along with its part-of-speech tag, so start from the sentence itself:

In [7]:
sentence = "It's a dangerous business, Frodo, going out your door."

Now tokenize the sentence:

In [8]:
import nltk
from nltk.tokenize import word_tokenize 

In [9]:
# Despite its name, pos_tagger holds the list of word tokens, not a tagger.
pos_tagger = word_tokenize(sentence)

In [10]:
pos_tagger


['It',
 "'s",
 'a',
 'dangerous',
 'business',
 ',',
 'Frodo',
 ',',
 'going',
 'out',
 'your',
 'door',
 '.']

Now tag each word with its part of speech:

In [11]:
lotr_pos_tags = nltk.pos_tag(pos_tagger)


In [12]:
lotr_pos_tags

[('It', 'PRP'),
 ("'s", 'VBZ'),
 ('a', 'DT'),
 ('dangerous', 'JJ'),
 ('business', 'NN'),
 (',', ','),
 ('Frodo', 'NNP'),
 (',', ','),
 ('going', 'VBG'),
 ('out', 'RP'),
 ('your', 'PRP$'),
 ('door', 'NN'),
 ('.', '.')]

The next step is to create a grammar to determine what you want to **include and exclude in your chunks**.

Before doing that, you need to specify the rules. You’re going to use more than one line because you’re going to have more than one rule. Because you’re using more than one line for the grammar, you’ll be using **triple quotes ("""):**


In [13]:
grammar = """
Chunk: {<.*>+}
       }<JJ>{"""

First rule

The first rule of your grammar is {<.*>+}.
This rule has curly braces that face inward ({}) because it’s used to determine what patterns you want to include in your chunks. In this case, you want to include everything: <.*>+.

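Second rule

The second rule of your grammar is }<JJ>{.
This rule has curly braces that face outward (}{) because it’s used to determine what patterns you want to exclude from your chunks. In this case, you want to exclude adjectives: <JJ>.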
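To see what the chink rule }<JJ>{ adds, it can help to compare a grammar that has only the chunk rule with one that has both rules. The sketch below assumes NLTK is installed and hard-codes the tags of the first five tokens from the output above, so no tagger data is needed; `count_chunks` is a hypothetical helper written for this comparison.

```python
import nltk

# First five tagged tokens of the quote, copied from the pos_tag output above.
tagged = [("It", "PRP"), ("'s", "VBZ"), ("a", "DT"),
          ("dangerous", "JJ"), ("business", "NN")]

# Only the chunk rule: every token ends up in one chunk.
chunk_only = nltk.RegexpParser("Chunk: {<.*>+}")

# Chunk rule followed by the chink rule: adjectives are removed again.
chunk_and_chink = nltk.RegexpParser("""
Chunk: {<.*>+}
       }<JJ>{""")


def count_chunks(tree):
    """Count the subtrees labelled 'Chunk' in a parse result."""
    return sum(1 for subtree in tree.subtrees() if subtree.label() == "Chunk")


print(count_chunks(chunk_only.parse(tagged)))       # 1
print(count_chunks(chunk_and_chink.parse(tagged)))  # 2
```

The rules are applied in order, so the chink rule splits the single chunk produced by the first rule at each adjective.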

Create a chunk parser with this grammar:

In [18]:
chunk_parser = nltk.RegexpParser(grammar)

In [20]:
chunk_parser

<chunk.RegexpParser with 1 stages>

Now chunk your sentence with the chink you specified:

In [21]:
tree = chunk_parser.parse(lotr_pos_tags)

The tree shows that the adjective is excluded from the chunks:

In [None]:
tree.draw()

You excluded the adjective 'dangerous' from your chunks and are left with two chunks containing everything else. The first chunk has all the text that appeared before the excluded adjective. The second chunk contains everything after it.
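`tree.draw()` opens a separate window and needs a graphical display, so in a headless environment it may be easier to walk the tree and print its parts. The following is a minimal sketch, assuming NLTK is installed; it hard-codes the tags from the output above so that it runs without tagger data, and the `summary` list is just a name chosen for this example.

```python
import nltk

# Tagged tokens copied from the pos_tag output above.
lotr_pos_tags = [
    ("It", "PRP"), ("'s", "VBZ"), ("a", "DT"), ("dangerous", "JJ"),
    ("business", "NN"), (",", ","), ("Frodo", "NNP"), (",", ","),
    ("going", "VBG"), ("out", "RP"), ("your", "PRP$"), ("door", "NN"),
    (".", "."),
]

grammar = """
Chunk: {<.*>+}
       }<JJ>{"""

tree = nltk.RegexpParser(grammar).parse(lotr_pos_tags)

summary = []
for node in tree:
    if isinstance(node, nltk.Tree):
        # A subtree is a chunk: join the words of its leaves.
        summary.append("chunk: " + " ".join(word for word, tag in node.leaves()))
    else:
        # A bare (word, tag) tuple at the top level was chinked out.
        word, tag = node
        summary.append(f"chink: {word}/{tag}")

print("\n".join(summary))
```

With the tags above, this should list one chunk before 'dangerous', the chink itself, and one chunk after it.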