# **Implement POS tagging on a text corpus using an NLP library (like NLTK or spaCy). Analyze the grammatical structure of sentences using syntactic/dependency parsing.**

In [1]:
import spacy

In [2]:
nlp = spacy.load("en_core_web_sm")

In [3]:
text = """Uttarakhand's history dates back to prehistoric times with archaeological evidence showcasing human habitation. It was part of the ancient Kuru and the Panchal kingdoms during the Vedic age and later saw the rise of dynasties like the Kunindas and influence of Buddhism as evidenced by Ashokan edicts.
Though primarily driven by agriculture and hydropower the state's economy is now dominated by the service industry. The service sector comprises primarily travel, tourism, and hotel industry. The state contributes five seats to the lower house Lok Sabha and three seats to the upper house Rajya Sabha. Inhabitants of the state are called either Garhwali or Kumaoni depending on their region of origin. Hinduism is practiced by more than three-fourths of the population with Islam being the next-largest religious group. Hindi is the most widely spoken language and is also the official language of the state, along with native regional languages include Garhwali, Jaunsari, Gurjari and Kumaoni. The state is often referred to as the Devabhumi lit. Land of the Gods due to its religious significance and numerous Hindu temples and pilgrimage centres found throughout the state. Along with several historical, natural and religious tourist destinations, including Char Dham, Haridwar, Rishikesh, Panch Kedar, Himalayas, and Sapta Badri. Uttarakhand is also home to two World Heritage sites.
"""

In [4]:
# Run the spaCy pipeline (tokenization, tagging, parsing) over the corpus
doc = nlp(text)

## POS tagging

In [5]:
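# pos_ is the coarse Universal POS tag, tag_ the fine-grained tag,
# and dep_ the dependency relation between the token and its head
for token in doc:
    print(f"{token.text:15}  POS: {token.pos_:10}  TAG: {token.tag_:10}  DEP: {token.dep_}  Description: {spacy.explain(token.tag_)}")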

Uttarakhand      POS: PROPN       TAG: NNP         DEP: poss  Description: noun, proper singular
's               POS: PART        TAG: POS         DEP: case  Description: possessive ending
history          POS: NOUN        TAG: NN          DEP: nsubj  Description: noun, singular or mass
dates            POS: VERB        TAG: VBZ         DEP: ROOT  Description: verb, 3rd person singular present
back             POS: ADV         TAG: RB          DEP: advmod  Description: adverb
to               POS: ADP         TAG: IN          DEP: prep  Description: conjunction, subordinating or preposition
prehistoric      POS: ADJ         TAG: JJ          DEP: amod  Description: adjective (English), other noun-modifier (Chinese)
times            POS: NOUN        TAG: NNS         DEP: pobj  Description: noun, plural
with             POS: ADP         TAG: IN          DEP: prep  Description: conjunction, subordinating or preposition
archaeological   POS: ADJ         TAG: JJ          DEP: amod  Description: adjective (English), other noun-modifier (Chinese)
...
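With one tag per token available, a natural follow-up is to summarise how often each coarse POS category occurs. The sketch below is one possible way to do that. `pos_frequencies` is a hypothetical helper name, not part of spaCy, and it only assumes that each item exposes a `pos_` attribute, as spaCy tokens do. The stand-in tokens are copied from the first rows of the output above so that the block runs without a model.

```python
from collections import Counter
from types import SimpleNamespace


def pos_frequencies(tokens):
    """Count coarse POS tags for any iterable of objects with a `pos_` attribute."""
    return Counter(token.pos_ for token in tokens)


# Stand-in tokens taken from the first rows of the tagging output,
# used here only so the sketch is self-contained.
sample = [
    SimpleNamespace(text="Uttarakhand", pos_="PROPN"),
    SimpleNamespace(text="'s", pos_="PART"),
    SimpleNamespace(text="history", pos_="NOUN"),
    SimpleNamespace(text="dates", pos_="VERB"),
    SimpleNamespace(text="back", pos_="ADV"),
    SimpleNamespace(text="to", pos_="ADP"),
    SimpleNamespace(text="prehistoric", pos_="ADJ"),
    SimpleNamespace(text="times", pos_="NOUN"),
    SimpleNamespace(text="with", pos_="ADP"),
]

sample_counts = pos_frequencies(sample)
for pos, count in sample_counts.most_common():
    print(f"{pos:6} {count}")
```

In the notebook itself, `pos_frequencies(doc)` should give the same kind of summary for the whole corpus.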

## Dependency parsing

In [6]:
# Each token points to its syntactic head; the ROOT token is its own head
for token in doc:
    print(f"{token.text:12} <--{token.dep_:10}-- {token.head.text}")

Uttarakhand  <--poss      -- history
's           <--case      -- Uttarakhand
history      <--nsubj     -- dates
dates        <--ROOT      -- dates
back         <--advmod    -- dates
to           <--prep      -- back
prehistoric  <--amod      -- times
times        <--pobj      -- to
with         <--prep      -- dates
archaeological <--amod      -- evidence
evidence     <--pobj      -- with
showcasing   <--acl       -- evidence
human        <--amod      -- habitation
habitation   <--dobj      -- showcasing
.            <--punct     -- dates
It           <--nsubj     -- was
was          <--ROOT      -- was
part         <--attr      -- was
of           <--prep      -- part
the          <--det       -- Kuru
ancient      <--amod      -- Kuru
Kuru         <--pobj      -- of
and          <--cc        -- Kuru
the          <--det       -- kingdoms
Panchal      <--compound  -- kingdoms
kingdoms     <--conj      -- Kuru
during       <--prep      -- was
the          <--det       -- age
...
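The flat `token <-- relation -- head` listing encodes a tree, which is easier to see once dependents are grouped under their heads. The following is a minimal sketch of that idea using the rows printed for the first sentence. `build_tree` and `format_tree` are hypothetical helpers, and keying by token text is an assumption that only holds when no word repeats within the sentence. With real spaCy tokens, `token.children` or the token index `token.i` would be the safer choice.

```python
from collections import defaultdict

# (token, relation, head) rows copied from the first sentence of the output above.
edges = [
    ("Uttarakhand", "poss", "history"),
    ("'s", "case", "Uttarakhand"),
    ("history", "nsubj", "dates"),
    ("dates", "ROOT", "dates"),
    ("back", "advmod", "dates"),
    ("to", "prep", "back"),
    ("prehistoric", "amod", "times"),
    ("times", "pobj", "to"),
    ("with", "prep", "dates"),
    ("archaeological", "amod", "evidence"),
    ("evidence", "pobj", "with"),
    ("showcasing", "acl", "evidence"),
    ("human", "amod", "habitation"),
    ("habitation", "dobj", "showcasing"),
    (".", "punct", "dates"),
]


def build_tree(rows):
    """Return (root, children); children maps a head to its (relation, dependent) pairs."""
    root = None
    children = defaultdict(list)
    for token, relation, head in rows:
        if relation == "ROOT":
            root = token  # the root is listed as its own head
        else:
            children[head].append((relation, token))
    return root, children


def format_tree(node, children, relation="ROOT", depth=0):
    """Render the tree as indented lines, one token per line."""
    lines = [f"{'  ' * depth}{node} ({relation})"]
    for child_relation, child in children.get(node, []):
        lines.extend(format_tree(child, children, child_relation, depth + 1))
    return lines


root, children = build_tree(edges)
tree_lines = format_tree(root, children)
print("\n".join(tree_lines))
```

Read this way, `dates` has four direct dependents (`history`, `back`, `with`, and the final period), and every other token hangs below one of them.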

## Sentence-wise grammar parsing

In [7]:
# doc.sents yields one Span per sentence, based on the dependency parse
for sent in doc.sents:
    print("Sentence:", sent.text)
    for token in sent:
        print(f"  {token.text:15} {token.pos_:10} {token.dep_:10} --> {token.head.text}")
    print()

Sentence: Uttarakhand's history dates back to prehistoric times with archaeological evidence showcasing human habitation.
  Uttarakhand     PROPN      poss       --> history
  's              PART       case       --> Uttarakhand
  history         NOUN       nsubj      --> dates
  dates           VERB       ROOT       --> dates
  back            ADV        advmod     --> dates
  to              ADP        prep       --> back
  prehistoric     ADJ        amod       --> times
  times           NOUN       pobj       --> to
  with            ADP        prep       --> dates
  archaeological  ADJ        amod       --> evidence
  evidence        NOUN       pobj       --> with
  showcasing      VERB       acl        --> evidence
  human           ADJ        amod       --> habitation
  habitation      NOUN       dobj       --> showcasing
  .               PUNCT      punct      --> dates

Sentence: It was part of the ancient Kuru and the Panchal kingdoms during the Vedic age and later saw the rise of dynasties like the Kunindas and influence of Buddhism as evidenced by Ashokan edicts.
  ...
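A per-sentence listing like the one above can be condensed into the main predicate and its subject. The sketch below shows one way this might be done. `root_and_subjects` and `make_sentence` are hypothetical helpers, and the stand-in tokens (built from rows shown in the outputs above) only mimic the `text`, `dep_`, and `head` attributes of spaCy tokens.

```python
from types import SimpleNamespace


def root_and_subjects(sentence):
    """Return the ROOT token's text and the texts of subjects attached to it."""
    root = next(tok for tok in sentence if tok.dep_ == "ROOT")
    subjects = [
        tok.text
        for tok in sentence
        if tok.dep_ in ("nsubj", "nsubjpass") and tok.head.dep_ == "ROOT"
    ]
    return root.text, subjects


def make_sentence(rows):
    """Build stand-in tokens from (text, dep, head_text) rows; assumes unique words."""
    tokens = {
        text: SimpleNamespace(text=text, dep_=dep, head=None)
        for text, dep, _ in rows
    }
    for text, _, head_text in rows:
        tokens[text].head = tokens[head_text]
    return list(tokens.values())


# Rows taken from the parser output for the first two sentences.
first = make_sentence([
    ("Uttarakhand", "poss", "history"),
    ("'s", "case", "Uttarakhand"),
    ("history", "nsubj", "dates"),
    ("dates", "ROOT", "dates"),
    ("back", "advmod", "dates"),
])
second = make_sentence([
    ("It", "nsubj", "was"),
    ("was", "ROOT", "was"),
    ("part", "attr", "was"),
])

first_summary = root_and_subjects(first)
second_summary = root_and_subjects(second)
print(first_summary)
print(second_summary)
```

In the notebook, the same function could be applied inside `for sent in doc.sents:` to get one compact line per sentence.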

## Visualization

In [8]:
from spacy import displacy

In [9]:
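# Render one diagram per sentence; a single diagram for the whole text is too wide to read
displacy.render(list(doc.sents), style="dep", jupyter=True)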