# Chunking in NLTK 

Text chunking, also referred to as shallow parsing, is a task that follows part-of-speech (POS) tagging and adds more structure to the sentence. The result is a grouping of the words into “chunks”.

In shallow parsing, there is at most one level between the root and the leaves, whereas deep parsing comprises more than one level. Shallow parsing is also called light parsing or chunking.
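To make the "one level" idea concrete, here is a minimal sketch that counts the levels between the root and the deepest leaf. The nested-tuple encoding, the two example trees, and the helper names `depth` and `levels_below_root` are assumptions made for illustration; they are not part of NLTK.

```python
# Hypothetical encoding: (label, child, child, ...); leaves are "word/TAG" strings.
shallow = ("S", "This/DT", "is/VBZ",
           ("mychunk", "NLP/NNP", "Chunking/NNP", "NoteBook/NNP"))
deep = ("S",
        ("NP", "the/DT", "dog/NN"),
        ("VP", "chased/VBD", ("NP", "a/DT", "cat/NN")))


def depth(node):
    """Number of edges from this node down to its deepest leaf."""
    if isinstance(node, str):  # a leaf
        return 0
    return 1 + max(depth(child) for child in node[1:])


def levels_below_root(tree):
    """Count the levels of nodes strictly between the root and the deepest leaf."""
    return depth(tree) - 1


shallow_levels = levels_below_root(shallow)
deep_levels = levels_below_root(deep)
print(shallow_levels, deep_levels)  # 1 2
```

The chunked sentence has a single level of chunk nodes under the root, while the nested phrase structure has two.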

The primary use of chunking is to group words into "noun phrases." The parts of speech are combined with regular expressions.

### Rules of Chunking:

For example, suppose we need to chunk nouns, past-tense verbs, adjectives, and coordinating conjunctions in a sentence. The rule can be written as:

`chunk:{<NN.?>*<VBD.?>*<JJ.?>*<CC>?}`

![image.png](attachment:image.png)

### Description:

| Name of symbol | Description |
|---|---|
| `.` | Any character except a newline |
| `*` | Match 0 or more repetitions |
| `?` | Match 0 or 1 repetitions |
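As a rough illustration of how these symbols behave inside a tag pattern, the sketch below applies `NN.?` to individual POS tags with Python's standard `re` module. The helper `tag_matches` is a hypothetical name introduced here, and matching one tag at a time is a simplification of what a chunk parser does over a whole tag sequence.

```python
import re


def tag_matches(tag_pattern: str, tag: str) -> bool:
    """Return True if the whole tag matches the pattern (hypothetical helper)."""
    return re.fullmatch(tag_pattern, tag) is not None


tags = ["NN", "NNS", "NNP", "NNPS", "VBD", "JJ"]

# "NN" followed by at most one more character
matched = [t for t in tags if tag_matches("NN.?", t)]
print(matched)  # ['NN', 'NNS', 'NNP']
```

Note that `NNPS` is not matched, because `.?` allows at most one character after `NN`.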

In [6]:
import nltk
from nltk import pos_tag
from nltk import RegexpParser

# Input text, split into tokens on whitespace
txt = "This is NLP Chunking NoteBook"
text = txt.split()
print("After Split:", text)

After Split: ['This', 'is', 'NLP', 'Chunking', 'NoteBook']


In [7]:
# Assign a part-of-speech tag to each token
POS_tag = pos_tag(text)

print("After POS tags:", POS_tag)


After POS tags: [('This', 'DT'), ('is', 'VBZ'), ('NLP', 'NNP'), ('Chunking', 'NNP'), ('NoteBook', 'NNP')]


In [8]:
# Define the chunk grammar and build the parser

patterns = """mychunk:{<NN.?>*<VBD.?>*<JJ.?>*<CC>?}"""
chunker = RegexpParser(patterns)
print("After Regex:", chunker)

After Regex: chunk.RegexpParser with 1 stages:
RegexpChunkParser with 1 rules:
       <ChunkRule: '<NN.?>*<VBD.?>*<JJ.?>*<CC>?'>


In [9]:
# Apply the chunk grammar to the tagged tokens
output = chunker.parse(POS_tag)
print("After Chunking", output)

After Chunking (S This/DT is/VBZ (mychunk NLP/NNP Chunking/NNP NoteBook/NNP))
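To see why only the three `NNP` tokens end up in the chunk, the rule can be mimicked with the standard `re` module alone. This is a simplified sketch of the idea, not NLTK's actual implementation: the `<TAG>` string encoding and the variable names are assumptions, and the tagged list is copied from the POS tagging output above.

```python
import re

tagged = [("This", "DT"), ("is", "VBZ"), ("NLP", "NNP"),
          ("Chunking", "NNP"), ("NoteBook", "NNP")]

# Encode the tag sequence as one string: "<DT><VBZ><NNP><NNP><NNP>"
encoded = "".join(f"<{tag}>" for _, tag in tagged)

# The same rule, with "." narrowed to "[^<>]" so it cannot cross a tag boundary
rule = re.compile(r"(?:<NN[^<>]?>)*(?:<VBD[^<>]?>)*(?:<JJ[^<>]?>)*(?:<CC>)?")

chunks = []
for m in rule.finditer(encoded):
    if m.start() == m.end():
        continue  # every part of the rule is optional, so skip empty matches
    first = encoded[:m.start()].count("<")  # index of the first token in the chunk
    size = m.group().count("<")             # number of tokens in the chunk
    chunks.append([word for word, _ in tagged[first:first + size]])

print(chunks)  # [['NLP', 'Chunking', 'NoteBook']]
```

`DT` and `VBZ` match none of the rule's parts, so "This" and "is" stay outside the chunk.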


In [None]:
# Uncomment to open a window that draws the chunk tree
# output.draw()

![image.png](attachment:image.png)
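Besides drawing the tree, the chunks can be pulled out of the result as plain strings. The sketch below assumes NLTK is installed and hard-codes the tagged tokens from the output above, so no tagger model is needed; the variable names `tagged`, `tree`, and `phrases` are illustrative.

```python
from nltk import RegexpParser

# Tagged tokens copied from the POS tagging output above
tagged = [("This", "DT"), ("is", "VBZ"), ("NLP", "NNP"),
          ("Chunking", "NNP"), ("NoteBook", "NNP")]

chunker = RegexpParser("mychunk:{<NN.?>*<VBD.?>*<JJ.?>*<CC>?}")
tree = chunker.parse(tagged)

# Keep only the subtrees labelled "mychunk" and join their words
phrases = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees(filter=lambda t: t.label() == "mychunk")
]
print(phrases)  # ['NLP Chunking NoteBook']
```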