## Shallow Parsing
Shallow parsing, also known as chunking or light parsing, is a Natural Language Processing (NLP) technique that groups the words of a sentence into short phrases (chunks) that are larger than individual words but smaller than the whole sentence. Unlike full parsing, which produces a complete grammatical analysis of a sentence, shallow parsing only identifies phrases such as noun phrases and verb phrases, without resolving the detailed syntactic relationships between them.

In practice, chunking extracts phrases from unstructured text by grouping adjacent tokens on the basis of their POS tags. The most common chunk types are noun phrases, verb phrases, and prepositional phrases.

Key aspects of shallow parsing:

- Phrase Identification: Shallow parsing aims to identify phrases in a sentence. For example, in the sentence "The quick brown fox jumps over the lazy dog", a shallow parser might identify "The quick brown fox" and "the lazy dog" as noun phrases and "jumps" as a verb phrase.
- Simpler than Full Parsing: Shallow parsing is less complex than full parsing (dependency or constituency parsing). It doesn't provide a full syntactic structure but gives useful information about the general structure of sentences.
- Use of POS Tags: Shallow parsing typically builds upon the results of part-of-speech tagging. It uses these POS tags to group words into phrases.
- Applications: Shallow parsing is useful in various NLP tasks such as information extraction, question answering, and text summarization where detailed syntactic parsing might be unnecessary.
- Chunking: In shallow parsing, chunking is the process of extracting phrases. A chunk is a contiguous sequence of tokens whose POS tags match a phrase pattern. The patterns can be hand-written rules or can be learned by a machine learning model.
- Named Entity Recognition (NER): Although NER is a related task, it's more specific than shallow parsing. NER focuses on identifying and classifying named entities (like names of people, organizations, locations) in the text.

Shallow parsing strikes a balance between the depth of full parsing and the speed of basic POS tagging, which makes it a valuable tool in many NLP applications.
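
As a rough illustration of the POS-based grouping described above, the following minimal sketch chunks noun phrases from hand-tagged tokens. The function name `chunk_noun_phrases`, the rule it applies (an optional determiner, any number of adjectives, then one or more nouns), and the tags themselves are assumptions made for illustration. Real chunkers use richer grammars or trained models.

```python
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (word, coarse POS tag)


def chunk_noun_phrases(tagged: List[TaggedToken]) -> List[str]:
    """Collect chunks matching the illustrative rule: DET? ADJ* (NOUN|PROPN)+."""
    chunks = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        # Optional determiner
        if tagged[j][1] == "DET":
            j += 1
        # Any number of adjectives
        while j < n and tagged[j][1] == "ADJ":
            j += 1
        # At least one noun is required for a chunk
        noun_start = j
        while j < n and tagged[j][1] in ("NOUN", "PROPN"):
            j += 1
        if j > noun_start:
            chunks.append(" ".join(word for word, _ in tagged[i:j]))
            i = j
        else:
            i += 1  # no chunk starts here, move on
    return chunks


# Hand-tagged input (assumed tags, no tagger is run here)
tagged_sentence = [
    ("The", "DET"), ("quick", "ADJ"), ("brown", "ADJ"), ("fox", "NOUN"),
    ("jumps", "VERB"), ("over", "ADP"),
    ("the", "DET"), ("lazy", "ADJ"), ("dog", "NOUN"),
]
noun_phrases = chunk_noun_phrases(tagged_sentence)
print(noun_phrases)  # ['The quick brown fox', 'the lazy dog']
```

The sketch never looks beyond adjacent POS tags, which is what keeps shallow parsing cheap compared with building a full parse tree.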


## Noun Phrase Detection
A noun phrase is a phrase that has a noun as its head. It could also include other kinds of words, such as adjectives, ordinals, and determiners. Noun phrases are useful for explaining the context of the sentence. They help you understand what the sentence is about.

spaCy exposes noun phrases through the `noun_chunks` property of the `Doc` object, which yields one `Span` per base noun phrase.

```python
import spacy

nlp = spacy.load("en_core_web_sm")

conference_text = "There is a developer conference happening on 21 July 2019 in London."
conference_doc = nlp(conference_text)

# Extract noun phrases
for chunk in conference_doc.noun_chunks:
    print(chunk)
```

Output:

```text
a developer conference
21 July
London
```


## Verb Phrase Detection
A verb phrase is a syntactic unit composed of at least one verb. This verb can be joined by other chunks, such as noun phrases. Verb phrases are useful for understanding the actions that nouns are involved in.

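spaCy has no built-in equivalent of `.noun_chunks` for verb phrases, so this example uses a library called textacy, which you can install with `pip install textacy`. The pattern in the code matches an auxiliary immediately followed by a verb: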

```python
import textacy

about_talk_text = (
    "The talk will introduce reader about use"
    " cases of Natural Language Processing in"
    " Fintech, making use of"
    " interesting examples along the way."
)

# Match an auxiliary immediately followed by a verb
patterns = [{"POS": "AUX"}, {"POS": "VERB"}]
about_talk_doc = textacy.make_spacy_doc(about_talk_text, lang="en_core_web_sm")
verb_phrases = textacy.extract.token_matches(about_talk_doc, patterns=patterns)

# Print all verb phrases
for chunk in verb_phrases:
    print(chunk.text)

# Extract noun phrases to see which nouns are involved
for chunk in about_talk_doc.noun_chunks:
    print(chunk)
```

Output:

```text
will introduce
The talk
reader
use cases
Natural Language Processing
Fintech
use
interesting examples
the way
```
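
Although spaCy has no dedicated verb-phrase property, a similar token pattern can also be expressed with its rule-based `Matcher`. The sketch below is illustrative only: the sentence is shortened, its POS tags are supplied by hand (an assumption, so that the example runs without a trained model), and the label `VERB_PHRASE` is an arbitrary name. The pattern also differs from the one above in that the auxiliary is optional.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc
from spacy.util import filter_spans

nlp = spacy.blank("en")  # blank pipeline: tokenizer and vocab only, no model download

# Hand-assigned coarse POS tags (assumed, not produced by a tagger)
words = ["The", "talk", "will", "introduce", "use", "cases", ",",
         "making", "use", "of", "examples", "."]
pos = ["DET", "NOUN", "AUX", "VERB", "NOUN", "NOUN", "PUNCT",
       "VERB", "NOUN", "ADP", "NOUN", "PUNCT"]
doc = Doc(nlp.vocab, words=words, pos=pos)

# Zero or more auxiliaries followed by one or more verbs
pattern = [{"POS": "AUX", "OP": "*"}, {"POS": "VERB", "OP": "+"}]
matcher = Matcher(nlp.vocab)
matcher.add("VERB_PHRASE", [pattern])

# The matcher returns overlapping candidates; keep the longest non-overlapping spans
spans = filter_spans(matcher(doc, as_spans=True))
verb_phrases = [span.text for span in spans]
print(verb_phrases)  # ['will introduce', 'making']
```

With a trained pipeline such as `en_core_web_sm`, the `Doc` would come from `nlp(text)` instead of being built by hand, and the tags would depend on the model's predictions.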
