## What is POS tagging:
Part-of-speech (POS) tagging is the process of assigning each word in a text a grammatical category, such as noun, verb, adjective, or preposition.<br><br>
To keep things simple, think of a POS tagger as a black box: you input a sentence and get back the same sentence with a POS tag attached to each word.<br><br>
For example, consider the following sentence:<br><br>
"A quick brown fox jumps over a lazy dog"<br><br>
This is the same sentence with POS tags:<br><br>
A: DT (Determiner)<br>
quick: JJ (Adjective)<br>
brown: JJ (Adjective)<br>
fox: NN (Noun)<br>
jumps: VBZ (Verb, 3rd person singular present)<br>
over: IN (Preposition)<br>
a: DT (Determiner)<br>
lazy: JJ (Adjective)<br>
dog: NN (Noun)<br>
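
To make the black-box picture concrete, here is a minimal sketch of the input/output shape of a tagger. It is only a toy: `TOY_LEXICON` and `toy_pos_tagger` are hypothetical names introduced for illustration, and a hand-written lookup table is not how real taggers decide on a tag.

```python
# Toy illustration only: a hand-written lookup table, not a real tagging model.
TOY_LEXICON = {
    "a": "DT",        # determiner
    "quick": "JJ",    # adjective
    "brown": "JJ",
    "fox": "NN",      # noun, singular
    "jumps": "VBZ",   # verb, 3rd person singular present
    "over": "IN",     # preposition
    "lazy": "JJ",
    "dog": "NN",
}

def toy_pos_tagger(sentence, default_tag="NN"):
    """Return (word, tag) pairs; unknown words fall back to default_tag."""
    return [(word, TOY_LEXICON.get(word.lower(), default_tag))
            for word in sentence.split()]

tagged = toy_pos_tagger("A quick brown fox jumps over a lazy dog")
print(tagged)
```

A real tagger returns the same kind of list of (word, tag) pairs, but chooses each tag from the word and its context rather than from a fixed table.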

### POS tagging with NLTK:
The following code implements a function that takes in a sentence and returns POS tags for every word.

In [1]:
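import nltk
from nltk.tokenize import word_tokenize

# One-time setup: the tokenizer and tagger data must be fetched with
# nltk.download(...); the exact resource names depend on the NLTK version.

def pos_tagger(sentence):
    # Split the sentence into word tokens, then tag each token.
    words = word_tokenize(sentence)
    return nltk.pos_tag(words)

sentence = "Cooking is delightful"
pos_tagger(sentence)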

[('Cooking', 'NN'), ('is', 'VBZ'), ('delightful', 'JJ')]

All the magic happens inside the pos_tag() function.<br><br>
By default it uses a PerceptronTagger, an averaged-perceptron model. This falls under the category of stochastic (statistical) POS tagging.<br><br>
The tagger is trained on a large corpus of pre-tagged text.<br><br>
During inference, it uses features of the current word and its context, such as neighboring words and previously predicted tags, to predict the most likely part-of-speech tag.<br><br>
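
The idea of predicting a tag from features of the word and its context can be sketched as follows. This is a simplified illustration, not NLTK's actual implementation: the feature set, the function names `extract_features`, `predict_tag` and `greedy_tag`, and the hand-set weights are all assumptions made for the example. A trained tagger learns its weights from the tagged corpus.

```python
def extract_features(words, i, prev_tag):
    """Build a feature dict for the word at position i (simplified feature set)."""
    word = words[i]
    prev_word = words[i - 1].lower() if i > 0 else "<START>"
    next_word = words[i + 1].lower() if i < len(words) - 1 else "<END>"
    return {
        "bias": 1,
        "word=" + word.lower(): 1,
        "suffix3=" + word[-3:].lower(): 1,
        "is_capitalized": int(word[0].isupper()),
        "prev_tag=" + prev_tag: 1,
        "prev_word=" + prev_word: 1,
        "next_word=" + next_word: 1,
    }

def predict_tag(features, weights, tags):
    """Score each tag as a sum of feature weights and return the highest-scoring tag."""
    scores = {tag: 0.0 for tag in tags}
    for feat, value in features.items():
        for tag, weight in weights.get(feat, {}).items():
            scores[tag] += value * weight
    return max(tags, key=lambda t: scores[t])

def greedy_tag(words, weights, tags):
    """Tag left to right, feeding each predicted tag into the next word's features."""
    prev_tag, result = "<START>", []
    for i, word in enumerate(words):
        prev_tag = predict_tag(extract_features(words, i, prev_tag), weights, tags)
        result.append((word, prev_tag))
    return result

# Hand-set weights for demonstration only; real weights are learned in training.
WEIGHTS = {
    "word=is": {"VBZ": 3.0},
    "suffix3=ing": {"VBG": 1.0, "NN": 1.5},
    "suffix3=ful": {"JJ": 2.0},
    "prev_tag=VBZ": {"JJ": 1.0, "NN": 0.5},
}
TAGS = ["NN", "VBZ", "JJ", "VBG"]

result = greedy_tag(["Cooking", "is", "delightful"], WEIGHTS, TAGS)
print(result)
```

With these toy weights, the "-ful" suffix and the preceding VBZ tag together push "delightful" toward JJ, which shows how word-level and context-level features combine into one score per tag.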

### POS tagging with spaCy:
The following code implements a function that takes in a sentence and returns the sentence with POS tags.

In [2]:
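import spacy

# Load the small English pipeline once; reloading it on every call is slow.
nlp = spacy.load("en_core_web_sm")

def pos_tagger(sentence):
    # Running the pipeline tokenizes the text and tags each token.
    doc = nlp(sentence)
    return [(token.text, token.pos_) for token in doc]

sentence = "Cooking is delightful"
pos_tagger(sentence)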

[('Cooking', 'NOUN'), ('is', 'AUX'), ('delightful', 'ADJ')]

We first import spaCy and load the English language model "en_core_web_sm".<br><br>
We then process the sentence with spaCy's nlp pipeline, which tokenizes it and assigns a POS tag to each token, and return the (token, tag) pairs in the format shown above. Here token.pos_ holds the coarse-grained Universal POS tag; the fine-grained tag is available as token.tag_.
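
To compare the NLTK and spaCy outputs above side by side, it can help to collapse the Penn Treebank style tags (NN, VBZ, JJ) into coarse tags (NOUN, VERB, ADJ). The sketch below uses a hand-written, partial mapping; `PTB_TO_COARSE` and `to_coarse` are hypothetical names for this example, not part of either library.

```python
# Hand-written, partial mapping for illustration; not an official table.
PTB_TO_COARSE = {
    "NN": "NOUN", "NNS": "NOUN",
    "VB": "VERB", "VBZ": "VERB", "VBG": "VERB",
    "JJ": "ADJ",
    "DT": "DET",
    "IN": "ADP",
}

def to_coarse(tagged_words, unknown="X"):
    """Map (word, fine_tag) pairs to (word, coarse_tag); unmapped tags become `unknown`."""
    return [(word, PTB_TO_COARSE.get(tag, unknown)) for word, tag in tagged_words]

nltk_style = [("Cooking", "NN"), ("is", "VBZ"), ("delightful", "JJ")]
coarse = to_coarse(nltk_style)
print(coarse)
```

A tag-only mapping like this sends "is" to VERB, while spaCy reports AUX, because the Universal scheme labels the copula "be" as an auxiliary and that distinction cannot be recovered from the VBZ tag alone.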