<img src='https://www.di.uniroma1.it/sites/all/themes/sapienza_bootstrap/logo.png' width="200"/>  

# Part_1_8_Part_of_Speech_Tagging  

In Natural Language Processing (`NLP`), tagging is a crucial process for annotating text with meaningful labels that aid in linguistic and semantic analysis. Among these, **Part-of-Speech (`POS`) tagging** plays a foundational role in identifying the grammatical roles of words in a sentence, such as noun, verb, adjective, or adverb. This understanding is critical for tasks like syntactic parsing, named entity recognition, machine translation, and text-to-speech systems.  

`POS` tagging methods have evolved from rule-based systems to statistical models such as **Hidden Markov Models (`HMMs`)** and **Conditional Random Fields (CRFs)**, which estimate tag probabilities from labeled data and the surrounding context. More recently, **neural network-based models** built on word embeddings and deep learning architectures have reached state-of-the-art performance.  

### **Objectives:**  
In this notebook, Parham provides an overview of Part-of-Speech tagging, its significance in `NLP`, and the algorithms behind it, including Hidden Markov Models (`HMMs`) and neural networks. Through practical exercises, Parham will train a neural network for `POS` tagging and use `NLTK` to run the Stanford `POS` Tagger.  

### **References:**  
- [https://www.nltk.org/book/ch05.html](https://www.nltk.org/book/ch05.html)  
- [https://web.stanford.edu/~jurafsky/slp3/old_oct19/8.pdf](https://web.stanford.edu/~jurafsky/slp3/old_oct19/8.pdf)  
- [https://www.linguisticsweb.org/doku.php?id=linguisticsweb:tutorials:linguistics_tutorials:automaticannotation:stanford_pos_tagger_python](https://www.linguisticsweb.org/doku.php?id=linguisticsweb:tutorials:linguistics_tutorials:automaticannotation:stanford_pos_tagger_python)  
- [https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html](https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html)

### **Contributors:**  
- Parham Membari  
    - <img src="https://upload.wikimedia.org/wikipedia/commons/7/7e/Gmail_icon_%282020%29.svg" alt="Logo" width="20" height="20"> **Email**: p.membari96@gmail.com  
    - <img src="https://www.iconsdb.com/icons/preview/red/linkedin-6-xxl.png" alt="Logo" width="20" height="20"> **LinkedIn**: [LinkedIn](https://www.linkedin.com/in/p-mem/)  
    - <img src="https://upload.wikimedia.org/wikipedia/commons/a/ae/Github-desktop-logo-symbol.svg" alt="Logo" width="20" height="20"> **GitHub**: [GitHub](https://github.com/parham075)  
    - <img src="https://upload.wikimedia.org/wikipedia/commons/e/ec/Medium_logo_Monogram.svg" alt="Logo" width="20" height="20"> **Medium**: [Medium](https://medium.com/@p.membari96)  

**Table of Contents:**  
1. Import Libraries
2. Introduction to Tagging in NLP  
3. Algorithms Behind `POS` Tagging (Rule-Based, HMM, Neural Networks)  
4. Fine-tuning of a Neural Network for `POS` Tagging  
5. Using NLTK to Handle Stanford POS Tagger  
6. Closing Thoughts  

# 1. Import Libraries

In [1]:
import os
import nltk
import numpy as np
import spacy
import torch

# 2. Introduction to Tagging in NLP  



In Natural Language Processing (NLP), **tagging** involves assigning meaningful labels to elements of text, such as words, phrases, or sentences. These labels capture linguistic or semantic information that is essential for various NLP applications. For example:  
- **Part-of-Speech (POS) Tagging:** Assigns grammatical roles (e.g., noun, verb, adjective).  

- **Named Entity Recognition (NER):** Identifies named entities such as people, locations, and organizations.  

- **Semantic Role Labeling (SRL):** Describes the roles words play in the semantic structure of a sentence.  

Each tagging approach serves a unique purpose, contributing to tasks like text parsing, translation, summarization, and information extraction. Techniques for tagging range from traditional rule-based systems to modern neural network-based methods:  
- **Rule-Based Tagging:** Relies on linguistic rules and patterns. It works well for predictable structures but struggles with ambiguity and language variability.  
- **Statistical Tagging:** Algorithms like Hidden Markov Models (HMMs) and Conditional Random Fields (CRFs) use probabilistic methods to predict tags based on contextual patterns in labeled data.  
- **Neural Network-Based Tagging:** Leverages word embeddings and deep learning architectures like BiLSTMs and Transformers to achieve state-of-the-art performance by capturing complex patterns in language.  
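
To make the rule-based approach concrete, here is a minimal sketch of a suffix-based tagger written with the standard library only. The rules, the `rule_based_tag` helper, and the default tag are illustrative assumptions, not a rule set taken from any published tagger.

```python
import re

# Hypothetical hand-written rules as (pattern, Penn Treebank tag) pairs.
# Rules are tried in order and the first match wins.
RULES = [
    (re.compile(r"^(a|an|the)$", re.IGNORECASE), "DT"),  # determiners
    (re.compile(r"^\d+(\.\d+)?$"), "CD"),                # cardinal numbers
    (re.compile(r".+ly$"), "RB"),                        # adverbs
    (re.compile(r".+ing$"), "VBG"),                      # gerunds
    (re.compile(r".+ed$"), "VBD"),                       # past-tense verbs
    (re.compile(r".+s$"), "NNS"),                        # plural nouns
]


def rule_based_tag(tokens, default="NN"):
    """Tag each token with the first matching rule, else the default tag."""
    tagged = []
    for token in tokens:
        tag = default
        for pattern, candidate in RULES:
            if pattern.match(token):
                tag = candidate
                break
        tagged.append((token, tag))
    return tagged


print(rule_based_tag("The students quickly finished 3 exercises".split()))
print(rule_based_tag(["lively", "is"]))
```

The second call shows why such rules struggle with ambiguity: *lively* is tagged as an adverb because of its `-ly` suffix, and *is* is tagged as a plural noun because it ends in `s`.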

### 2.1. Part-of-Speech Tagging: A Closer Look  

Among these approaches, **Part-of-Speech (POS) tagging** is a foundational task in NLP. It identifies the grammatical role of each word in a sentence, helping to structure raw text for downstream tasks. Consider the sentence:  

_"Computer Science department of Sapienza University of Rome is intellectually lively and reputed for its research outcome."_  

POS tagging identifies:  
- Computer      → Proper Noun (NNP)  
- Science       → Proper Noun (NNP)  
- department    → Noun (NN)  
- of            → Preposition (IN)  
- Sapienza      → Proper Noun (NNP)  
- University    → Proper Noun (NNP)  
- of            → Preposition (IN)  
- Rome          → Proper Noun (NNP)  
- is            → Verb (VBZ)  
- intellectually → Adverb (RB)  
- lively        → Adjective (JJ)  
- and           → Coordinating Conjunction (CC)  
- reputed       → Verb, Past Participle (VBN)  
- for           → Preposition (IN)  
- its           → Possessive Pronoun (PRP$)  
- research      → Noun (NN)  
- outcome       → Noun (NN)  

> Note: For the full list of tags, see the [Penn Treebank tag set documentation](https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html).

By providing information about grammatical structure, this tagging helps machines understand not just individual words, but also the connections between them within a sentence.
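
A tagged sentence is commonly represented as a list of `(token, tag)` pairs. The sketch below stores the example sentence in that form and uses the tags to count categories and to group adjacent proper nouns. The helper name `proper_noun_runs` is an assumption made for illustration.

```python
from collections import Counter
from itertools import groupby

# The example sentence as (token, Penn Treebank tag) pairs.
tagged_sentence = [
    ("Computer", "NNP"), ("Science", "NNP"), ("department", "NN"),
    ("of", "IN"), ("Sapienza", "NNP"), ("University", "NNP"),
    ("of", "IN"), ("Rome", "NNP"), ("is", "VBZ"),
    ("intellectually", "RB"), ("lively", "JJ"), ("and", "CC"),
    ("reputed", "VBN"), ("for", "IN"), ("its", "PRP$"),
    ("research", "NN"), ("outcome", "NN"),
]

# How often each tag occurs in the sentence.
tag_counts = Counter(tag for _, tag in tagged_sentence)


def proper_noun_runs(tagged):
    """Join consecutive NNP tokens into multi-word proper-noun candidates."""
    runs = []
    for is_nnp, group in groupby(tagged, key=lambda pair: pair[1] == "NNP"):
        if is_nnp:
            runs.append(" ".join(token for token, _ in group))
    return runs


print(tag_counts.most_common(3))
print(proper_noun_runs(tagged_sentence))
```

Grouping adjacent `NNP` tokens is a simple illustration of how tags expose connections between words, since the runs recover *Computer Science* and *Sapienza University* as units.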

### 2.2. Two classes of words: **Open** vs. **Closed**:
- Closed class words
    - Relatively fixed membership
    - Usually function words: short, frequent words with a grammatical function
        - Determiners: a, an, the
        - Pronouns: she, he, I
        - Prepositions: on, under, over, near, by, …
- Open class words
    - Usually content words: nouns, verbs, adjectives, adverbs
    - Also interjections: oh, ouch, uh-huh, yes, hello
    - Membership keeps growing: new nouns and verbs are coined all the time, such as *iPhone* or *to fax*
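
Because closed classes have a relatively fixed membership, they can be enumerated in a lookup table, while open-class words cannot. The sketch below assumes a tiny lexicon built only from the examples listed above. The `CLOSED_CLASS` table and the `word_class` helper are hypothetical and far from exhaustive.

```python
# Small illustrative closed-class lexicon (not exhaustive).
CLOSED_CLASS = {
    "determiner": {"a", "an", "the"},
    "pronoun": {"she", "he", "i"},
    "preposition": {"on", "under", "over", "near", "by"},
}


def word_class(word):
    """Return the closed-class category of a word, or 'open' if it is not listed."""
    lowered = word.lower()
    for category, members in CLOSED_CLASS.items():
        if lowered in members:
            return category
    return "open"


for word in ["the", "iPhone", "under", "fax", "She"]:
    print(f"{word:>6} -> {word_class(word)}")
```

A closed-class word missing from this small table would be reported as `open`, so the fallback is only a simplification for the example.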



### 2.3. Why Part-of-Speech Tagging?  

Here’s why POS tagging is so valuable:  

- **Supports Other NLP Tasks**: POS tagging provides crucial insights for tasks like syntactic parsing, sentiment analysis, and text-to-speech systems.  
- **Parsing**: Knowing POS tags can improve syntactic parsing accuracy, which is vital for machine translation and language understanding.  
- **Machine Translation (MT)**: POS tags help reorder structures, such as adjective-noun pairs, when translating between languages like Spanish and English.  
- **Sentiment Analysis**: Distinguishing adjectives or verbs can reveal sentiment or emotional tone in text.  
- **Text-to-Speech**: Pronunciation ambiguity, as seen with words like *lead* or *object*, can be resolved using POS tags.  
- **Linguistic Analysis**: POS tags make it possible to control for word category when studying linguistic change, such as meaning shifts or the creation of new words.  

In short, POS tagging acts as a bridge, enabling both practical NLP tasks and linguistic research to benefit from accurate syntactic understanding.  
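
The text-to-speech case can be sketched as a lookup keyed by both the word and its tag. The pronunciation table below is a hypothetical toy with informal respellings, not data from a real pronunciation lexicon.

```python
# Hypothetical pronunciation table keyed by (word, coarse POS tag).
# Respellings are informal; capitals mark the stressed syllable.
PRONUNCIATIONS = {
    ("object", "NOUN"): "OB-ject",
    ("object", "VERB"): "ob-JECT",
    ("lead", "NOUN"): "led",   # the metal; the noun in "take the lead" rhymes with the verb
    ("lead", "VERB"): "leed",
}


def pronounce(word, pos):
    """Look up a POS-specific pronunciation, falling back to the spelling itself."""
    return PRONUNCIATIONS.get((word.lower(), pos), word)


tagged = [("They", "PRON"), ("object", "VERB"), ("to", "ADP"),
          ("the", "DET"), ("object", "NOUN")]
print(" ".join(pronounce(word, tag) for word, tag in tagged))
```

The same spelling *object* receives two different pronunciations in one sentence, and only the tag distinguishes them.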


### 2.4. How Difficult is POS Tagging in English?  

Although English `POS` tagging has achieved high accuracy, it is not without challenges. Ambiguity is a major issue:  

- About **15% of word types** in English are ambiguous (e.g., *back* can be a noun, verb, adjective, or adverb).  
- However, **85% of word types are unambiguous** (e.g., *Sapienza* is always a proper noun, and *intellectually* is always an adverb).  
- The ambiguous 15% are highly frequent in text, meaning **~60% of word tokens** in actual usage are ambiguous.  

Here are examples of how the word *back* varies based on context:  

- **Adjective (ADJ)**: _Earnings growth took a **back** seat._  
- **Noun (NOUN)**: _A small building in the **back**._  
- **Verb (VERB)**: _A clear majority of senators **back** the bill._  
- **Particle (PART)**: _Enable the country to buy **back** debt._  
- **Adverb (ADV)**: _I was twenty-one **back** then._  
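
The gap between type-level and token-level ambiguity can be checked on a toy corpus. The sketch below uses a handful of hand-tagged tokens built around *back*. The corpus and the `ambiguity_rates` helper are assumptions for illustration, and the resulting numbers describe only this toy data, not English as a whole.

```python
from collections import defaultdict

# Tiny hand-tagged corpus (illustrative only).
toy_corpus = [
    ("senators", "NOUN"), ("back", "VERB"), ("the", "DET"), ("bill", "NOUN"),
    ("a", "DET"), ("building", "NOUN"), ("in", "ADP"), ("the", "DET"), ("back", "NOUN"),
    ("i", "PRON"), ("was", "AUX"), ("twenty-one", "NUM"), ("back", "ADV"), ("then", "ADV"),
]


def ambiguity_rates(tagged_tokens):
    """Return (share of ambiguous word types, share of ambiguous word tokens)."""
    tags_by_type = defaultdict(set)
    for word, tag in tagged_tokens:
        tags_by_type[word.lower()].add(tag)

    # A word type is ambiguous if it was seen with more than one tag.
    ambiguous = {word for word, tags in tags_by_type.items() if len(tags) > 1}

    type_rate = len(ambiguous) / len(tags_by_type)
    ambiguous_tokens = sum(1 for word, _ in tagged_tokens if word.lower() in ambiguous)
    token_rate = ambiguous_tokens / len(tagged_tokens)
    return type_rate, token_rate


type_rate, token_rate = ambiguity_rates(toy_corpus)
print(f"ambiguous types:  {type_rate:.1%}")
print(f"ambiguous tokens: {token_rate:.1%}")
```

Only one of the eleven word types (*back*) is ambiguous, yet it accounts for three of the fourteen tokens, so the token-level rate is higher than the type-level rate.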


### 2.5. POS Tagging Performance  

How accurate is POS tagging? Modern methods have achieved impressive results:  

- **Tagging Accuracy**: About **97%**, which hasn't changed much in the last decade. Hidden Markov Models (HMMs), Conditional Random Fields (CRFs), and neural network-based approaches like BERT perform similarly.  
- **Baseline Accuracy**: Even a "stupid" baseline that tags every known word with its most frequent tag and every unknown word as a noun achieves about **92%** accuracy.  

The high accuracy is partly because many words are unambiguous. However, improving the remaining 3% can be difficult due to rare and ambiguous cases.  
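
The most-frequent-tag baseline is short enough to sketch in full. The code below assumes a tiny hand-made training set and held-out sentence, and the helper names `train_baseline`, `tag_baseline`, and `accuracy` are hypothetical. The accuracy it prints reflects this toy data only and is unrelated to the 92% figure reported for real corpora.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_baseline(model, tokens, unknown_tag="NOUN"):
    """Tag known words with their most frequent tag and unknown words as nouns."""
    return [(token, model.get(token.lower(), unknown_tag)) for token in tokens]


def accuracy(model, tagged_sentences):
    """Share of tokens whose predicted tag equals the gold tag."""
    correct = total = 0
    for sentence in tagged_sentences:
        predicted = tag_baseline(model, [word for word, _ in sentence])
        for (_, gold), (_, guess) in zip(sentence, predicted):
            correct += gold == guess
            total += 1
    return correct / total


train = [
    [("the", "DET"), ("senators", "NOUN"), ("back", "VERB"), ("the", "DET"), ("bill", "NOUN")],
    [("they", "PRON"), ("back", "VERB"), ("the", "DET"), ("plan", "NOUN")],
    [("a", "DET"), ("building", "NOUN"), ("in", "ADP"), ("the", "DET"), ("back", "NOUN")],
]
held_out = [
    [("voters", "NOUN"), ("sit", "VERB"), ("in", "ADP"), ("the", "DET"), ("back", "NOUN")],
]

model = train_baseline(train)
print(tag_baseline(model, ["voters", "sit", "in", "the", "back"]))
print(f"toy accuracy: {accuracy(model, held_out):.0%}")
```

The two errors on the held-out sentence are typical of this baseline: the unknown verb *sit* is guessed as a noun, and *back* always receives its most frequent training tag regardless of context.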

# 3. Algorithms Behind `POS` Tagging (Rule-Based, HMM, Neural Networks) 

# 4. Fine-tuning of a Neural Network for `POS` Tagging  

# 5. Using NLTK to Handle Stanford POS Tagger  

# 6. Closing Thoughts  