### **POS Tagging**

---

### **What is POS Tagging?**

**POS (Part-of-Speech) tagging** means labeling each word (token) in a sentence with its **grammatical category**, such as noun, verb, or adjective.

🔸 Example:

```text
The dog barked loudly.
```

* The → Determiner
* dog → Noun
* barked → Verb
* loudly → Adverb

---
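A tag belongs to a word in its context, not to the word form itself. The sketch below stores two hand-tagged sentences as (word, tag) pairs. The tags are written by hand for illustration (they are not model output), and the helper name `tags_for` is made up for this example. It shows "book" as a noun in one sentence and as a verb in the other.

```python
# Hand-annotated (word, tag) pairs; these labels are illustrative assumptions,
# not the output of any tagger.
sentence_a = [("I", "Pronoun"), ("read", "Verb"), ("a", "Determiner"), ("book", "Noun")]
sentence_b = [("Book", "Verb"), ("a", "Determiner"), ("flight", "Noun")]


def tags_for(word, tagged_sentence):
    """Return every tag assigned to `word` (case-insensitive) in one sentence."""
    return [tag for token, tag in tagged_sentence if token.lower() == word.lower()]


print(tags_for("book", sentence_a))  # ['Noun']
print(tags_for("book", sentence_b))  # ['Verb']
```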

### **Why is it Needed?**

* Helps NLP models **understand the structure** of a sentence
* Improves **lemmatization**, since the right lemma can depend on the tag ("meeting" stays "meeting" as a noun but becomes "meet" as a verb)
* Useful in **named entity recognition**, **sentiment analysis**, and **question answering**

---
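The lemmatization point can be made concrete with a toy lookup. The table `LEMMAS` and the function `lemmatize` are hypothetical and hand-written for this sketch (this is not how spaCy's lemmatizer is implemented). They only show that one surface form can need different lemmas depending on its POS tag.

```python
# Toy lemma table keyed by (lowercased word, POS tag). Hand-written for
# illustration only; a real lemmatizer uses much larger rules and lexicons.
LEMMAS = {
    ("saw", "VERB"): "see",       # "I saw a man"
    ("saw", "NOUN"): "saw",       # "a rusty saw"
    ("leaves", "VERB"): "leave",  # "she leaves early"
    ("leaves", "NOUN"): "leaf",   # "autumn leaves"
}


def lemmatize(word, pos):
    """Look up the lemma for (word, pos); fall back to the lowercased word."""
    return LEMMAS.get((word.lower(), pos), word.lower())


print(lemmatize("saw", "VERB"))     # see
print(lemmatize("saw", "NOUN"))     # saw
print(lemmatize("leaves", "NOUN"))  # leaf
```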

### **Common POS Tags:**

| Tag   | Meaning                  | Example       |
| ----- | ------------------------ | ------------- |
| NOUN  | Noun                     | book, dog     |
| VERB  | Verb                     | run, eat      |
| ADJ   | Adjective                | big, fast     |
| ADV   | Adverb                   | quickly, very |
| PRON  | Pronoun                  | he, they      |
| AUX   | Auxiliary verb           | is, was       |
| ADP   | Adposition (preposition) | in, on, at    |
| DET   | Determiner               | the, a, this  |
| PUNCT | Punctuation              | . , ? !       |

---
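The table above lists only the most common tags. As a reference sketch, the dictionary below holds the 17 tags of the Universal POS tag set, which spaCy's coarse-grained `pos_` attribute follows. The one-line glosses are paraphrased, and the names `UPOS_TAGS` and `table_tags` are just labels for this example. spaCy additionally emits `SPACE` for whitespace tokens, which shows up later in this notebook.

```python
# The 17 Universal POS tags with short glosses (paraphrased for this sketch).
UPOS_TAGS = {
    "ADJ": "adjective", "ADP": "adposition", "ADV": "adverb", "AUX": "auxiliary",
    "CCONJ": "coordinating conjunction", "DET": "determiner", "INTJ": "interjection",
    "NOUN": "noun", "NUM": "numeral", "PART": "particle", "PRON": "pronoun",
    "PROPN": "proper noun", "PUNCT": "punctuation",
    "SCONJ": "subordinating conjunction", "SYM": "symbol", "VERB": "verb", "X": "other",
}

# Tags that appear in the table above.
table_tags = {"NOUN", "VERB", "ADJ", "ADV", "PRON", "AUX", "ADP", "DET", "PUNCT"}

# Tags the table leaves out.
not_in_table = sorted(set(UPOS_TAGS) - table_tags)
print(len(UPOS_TAGS))  # 17
print(not_in_table)
```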



### **How to Do It with spaCy:**

In [1]:
import spacy

# Load the model
nlp = spacy.load("en_core_web_sm")

# Input text
text = "The children are playing in the park."

# Process the text
doc = nlp(text)

In [2]:
doc

The children are playing in the park.

In [3]:
type(doc)

spacy.tokens.doc.Doc

In [4]:
doc.text

'The children are playing in the park.'

In [5]:
type(doc.text)

str

In [8]:
doc[2]

are

In [12]:
doc[0].pos_

'DET'

In [13]:
doc[2].pos_

'AUX'

In [15]:
doc[7].pos_

'PUNCT'

In [4]:
doc[2].tag_

'VBP'

In [5]:
spacy.explain('VBP')

'verb, non-3rd person singular present'

In [6]:
doc

The children are playing in the park.

In [10]:
for word in doc:
    print(word, "->", word.pos_)

The -> DET
children -> NOUN
are -> AUX
playing -> VERB
in -> ADP
the -> DET
park -> NOUN
. -> PUNCT
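
To summarize a tagged sentence, the tags can be counted. The sketch below hard-codes the (token, tag) pairs printed above so that it runs without loading a model; with a live `Doc`, the same count is `Counter(token.pos_ for token in doc)`.

```python
from collections import Counter

# (token, POS) pairs copied from the output above.
tagged = [
    ("The", "DET"), ("children", "NOUN"), ("are", "AUX"), ("playing", "VERB"),
    ("in", "ADP"), ("the", "DET"), ("park", "NOUN"), (".", "PUNCT"),
]

# Count how often each coarse-grained tag occurs.
pos_counts = Counter(pos for _, pos in tagged)
for pos, count in pos_counts.most_common():
    print(f"{pos}: {count}")
```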


In [11]:
spacy.explain('ADP')

'adposition'

In [13]:
for word in doc:
    print(word, "->", word.pos_, word.tag_, spacy.explain(word.tag_))

The -> DET DT determiner
children -> NOUN NNS noun, plural
are -> AUX VBP verb, non-3rd person singular present
playing -> VERB VBG verb, gerund or present participle
in -> ADP IN conjunction, subordinating or preposition
the -> DET DT determiner
park -> NOUN NN noun, singular or mass
. -> PUNCT . punctuation mark, sentence closer
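
`pos_` holds the coarse-grained universal tag and `tag_` the fine-grained one (Penn Treebank style for the English models). One way to see how they relate is to group the fine-grained tags under each coarse tag. The triples below are copied from the output above, and the variable names are only for this sketch.

```python
from collections import defaultdict

# (token, pos_, tag_) triples copied from the output above.
rows = [
    ("The", "DET", "DT"), ("children", "NOUN", "NNS"), ("are", "AUX", "VBP"),
    ("playing", "VERB", "VBG"), ("in", "ADP", "IN"), ("the", "DET", "DT"),
    ("park", "NOUN", "NN"), (".", "PUNCT", "."),
]

# Collect the set of fine-grained tags seen under each coarse-grained tag.
fine_by_coarse = defaultdict(set)
for _, coarse, fine in rows:
    fine_by_coarse[coarse].add(fine)

for coarse, fine_tags in fine_by_coarse.items():
    print(coarse, "->", sorted(fine_tags))
```

Here `NOUN` covers both `NN` and `NNS`: the fine-grained tag also records number, which the coarse tag drops.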


In [14]:
from spacy import displacy

In [None]:
displacy.render(doc)

In [19]:
# Input text
text = """She said, "I am tired." Then she left. Prof. Ahmed works at A.I. Lab."""

# Process the text
doc1 = nlp(text)

In [None]:
displacy.render(doc1)

**More Practice**

In [21]:
import spacy

# Load the model
nlp = spacy.load("en_core_web_sm")

# Input text
text1 = "I saw a man with a telescope."
text2 = "The telescope was used by the man I saw."

# Process the text
doc1 = nlp(text1)
doc2 = nlp(text2)

In [24]:
for word in doc1:
    print(word, "-->" , word.pos_ )

I --> PRON
saw --> VERB
a --> DET
man --> NOUN
with --> ADP
a --> DET
telescope --> NOUN
. --> PUNCT


In [27]:
from spacy import displacy

In [None]:
displacy.render(doc1)


In [29]:
displacy.render(doc2)

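> `dep_` gives each token's syntactic dependency label: the grammatical relation (for example, subject or object) it has to its head word. These are the arc labels in the displaCy diagrams above.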

In [31]:
for word in doc1:
    print(word, "-->" , word.pos_ , "|" , word.tag_ , "-->" , spacy.explain(word.tag_))

I --> PRON | PRP --> pronoun, personal
saw --> VERB | VBD --> verb, past tense
a --> DET | DT --> determiner
man --> NOUN | NN --> noun, singular or mass
with --> ADP | IN --> conjunction, subordinating or preposition
a --> DET | DT --> determiner
telescope --> NOUN | NN --> noun, singular or mass
. --> PUNCT | . --> punctuation mark, sentence closer


In [44]:
paragraph = """Data Solution-360 is a consulting company and online learning platform focused on data science, 
artificial intelligence, machine learning, and data analytics etc . 
They offer online courses, consulting services, and job-ready bootcamps to help individuals gain skills and knowledge in 
these areas. Their goal is to make data science accessible and empower individuals to leverage data for better decisions 
and careers. """

In [45]:
docx = nlp(paragraph)

In [46]:
for words in docx:
    print(words, "-->" , words.pos_ , "|" , words.tag_ ,"-->" , spacy.explain(words.tag_))

Data --> NOUN | NNS --> noun, plural
Solution-360 --> VERB | VBP --> verb, non-3rd person singular present
is --> AUX | VBZ --> verb, 3rd person singular present
a --> DET | DT --> determiner
consulting --> VERB | VBG --> verb, gerund or present participle
company --> NOUN | NN --> noun, singular or mass
and --> CCONJ | CC --> conjunction, coordinating
online --> ADJ | JJ --> adjective (English), other noun-modifier (Chinese)
learning --> VERB | VBG --> verb, gerund or present participle
platform --> NOUN | NN --> noun, singular or mass
focused --> VERB | VBN --> verb, past participle
on --> ADP | IN --> conjunction, subordinating or preposition
data --> NOUN | NN --> noun, singular or mass
science --> NOUN | NN --> noun, singular or mass
, --> PUNCT | , --> punctuation mark, comma

 --> SPACE | _SP --> whitespace
artificial --> ADJ | JJ --> adjective (English), other noun-modifier (Chinese)
intelligence --> NOUN | NN --> noun, singular or mass
, --> PUNCT | , --> punctuation mark, comma
...

In [47]:
for words in docx:
    if words.pos_ not in ["SPACE","PUNCT","X"]:
        print(words, "-->" , words.pos_ , "|" , words.tag_ ,"-->" , spacy.explain(words.tag_))

Data --> NOUN | NNS --> noun, plural
Solution-360 --> VERB | VBP --> verb, non-3rd person singular present
is --> AUX | VBZ --> verb, 3rd person singular present
a --> DET | DT --> determiner
consulting --> VERB | VBG --> verb, gerund or present participle
company --> NOUN | NN --> noun, singular or mass
and --> CCONJ | CC --> conjunction, coordinating
online --> ADJ | JJ --> adjective (English), other noun-modifier (Chinese)
learning --> VERB | VBG --> verb, gerund or present participle
platform --> NOUN | NN --> noun, singular or mass
focused --> VERB | VBN --> verb, past participle
on --> ADP | IN --> conjunction, subordinating or preposition
data --> NOUN | NN --> noun, singular or mass
science --> NOUN | NN --> noun, singular or mass
artificial --> ADJ | JJ --> adjective (English), other noun-modifier (Chinese)
intelligence --> NOUN | NN --> noun, singular or mass
machine --> NOUN | NN --> noun, singular or mass
learning --> NOUN | NN --> noun, singular or mass
and --> CCONJ | CC --> conjunction, coordinating
...
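
The same membership test can keep tags instead of excluding them, for example to pull out only the nouns. The pairs below are copied from the visible (truncated) part of the output above, so the list is incomplete, and `keep_pos` is a helper name made up for this sketch. With a live `Doc`, pass `(token.text, token.pos_)` pairs instead.

```python
# (token, POS) pairs copied from the visible part of the output above.
tagged = [
    ("Data", "NOUN"), ("Solution-360", "VERB"), ("is", "AUX"), ("a", "DET"),
    ("consulting", "VERB"), ("company", "NOUN"), ("and", "CCONJ"),
    ("online", "ADJ"), ("learning", "VERB"), ("platform", "NOUN"),
    ("focused", "VERB"), ("on", "ADP"), ("data", "NOUN"), ("science", "NOUN"),
    ("artificial", "ADJ"), ("intelligence", "NOUN"), ("machine", "NOUN"),
    ("learning", "NOUN"),
]


def keep_pos(pairs, wanted):
    """Return the tokens whose POS tag is in the set `wanted`."""
    return [token for token, pos in pairs if pos in wanted]


nouns = keep_pos(tagged, {"NOUN"})
print(nouns)
```

In this output the model tags `Solution-360` as a verb, so a noun filter misses the company name. Tagger output on unusual tokens is worth spot-checking before filtering on it.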

**After Basic Preprocessing: Removing Punctuation**

In [48]:
import string

In [49]:
exclude_Punctuation=string.punctuation

In [50]:
def rmv_Punctuation(text):
    # Delete every character listed in string.punctuation in a single pass
    return text.translate(str.maketrans("", "", exclude_Punctuation))

In [52]:
paragraph1 = """Data Solution-360 is a consulting company and online learning platform focused on data science, 
artificial intelligence, machine learning, and data analytics etc . 
They offer online courses, consulting services, and job-ready bootcamps to help individuals gain skills and knowledge in 
these areas. Their goal is to make data science accessible and empower individuals to leverage data for better decisions 
and careers. """

In [56]:
new_doc = rmv_Punctuation(paragraph1)

In [57]:
new_docx=nlp(new_doc)

In [58]:
for words in new_docx:
    print(words, "-->" , words.pos_ , "|" , words.tag_ ,"-->" , spacy.explain(words.tag_))

Data --> NOUN | NNS --> noun, plural
Solution360 --> NUM | CD --> cardinal number
is --> AUX | VBZ --> verb, 3rd person singular present
a --> DET | DT --> determiner
consulting --> VERB | VBG --> verb, gerund or present participle
company --> NOUN | NN --> noun, singular or mass
and --> CCONJ | CC --> conjunction, coordinating
online --> ADJ | JJ --> adjective (English), other noun-modifier (Chinese)
learning --> VERB | VBG --> verb, gerund or present participle
platform --> NOUN | NN --> noun, singular or mass
focused --> VERB | VBN --> verb, past participle
on --> ADP | IN --> conjunction, subordinating or preposition
data --> NOUN | NN --> noun, singular or mass
science --> NOUN | NN --> noun, singular or mass

 --> SPACE | _SP --> whitespace
artificial --> ADJ | JJ --> adjective (English), other noun-modifier (Chinese)
intelligence --> NOUN | NN --> noun, singular or mass
machine --> NOUN | NN --> noun, singular or mass
learning --> NOUN | NN --> noun, singular or mass
and --> CCONJ | CC --> conjunction, coordinating
...
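
Comparing the two runs, stripping every punctuation character turns `Solution-360` into `Solution360`, and the model's tag for it changes to `NUM`. If hyphenated tokens matter, one option is to drop punctuation only where it does not sit between two alphanumeric characters. This is a sketch of that idea, not part of the preprocessing used above, and the function name `strip_outer_punctuation` is made up here.

```python
import re
import string

_PUNCT = re.escape(string.punctuation)
# Match a punctuation character that is NOT surrounded by alphanumerics on
# both sides, so "job-ready" keeps its hyphen but "courses," loses its comma.
_OUTER_PUNCT = re.compile(
    rf"(?<![A-Za-z0-9])[{_PUNCT}]|[{_PUNCT}](?![A-Za-z0-9])"
)


def strip_outer_punctuation(text):
    """Remove punctuation except where it joins two alphanumeric characters."""
    return _OUTER_PUNCT.sub("", text)


sample = "Data Solution-360 offers job-ready bootcamps, courses, and consulting."
print(strip_outer_punctuation(sample))
# Data Solution-360 offers job-ready bootcamps courses and consulting
```

The same rule also keeps word-internal characters such as the apostrophe in "don't" or the point in "3.5", which may or may not be wanted depending on the task.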