In [89]:
import spacy

# Load English NLP pipeline
nlp = spacy.load("en_core_web_sm")

text = "Omg!!! I just tried the new burger at BurgerZone 😍🍔... soooo delicious!!! You guys HAVE to try it out!! 😭😭 It's like heaven in a bun. 😂😂 Check it out here: https://burgerzone.com/new-burger-deal 🔥🔥!!! Alsooo... I emailed them at foodlover@burgerzone.com for coupons 😏 #foodie #yum #bestburgerever 100/10 would recommend!!! 😋😋"
doc = nlp(text)

In [90]:
for token in doc:
    print(f"{token.text}| {token.pos_} | {spacy.explain(token.pos_)} | {spacy.explain(token.tag_)} | {token.tag_}")


Omg| PROPN | proper noun | noun, proper singular | NNP
!| PUNCT | punctuation | punctuation mark, sentence closer | .
!| PUNCT | punctuation | punctuation mark, sentence closer | .
!| PUNCT | punctuation | punctuation mark, sentence closer | .
I| PRON | pronoun | pronoun, personal | PRP
just| ADV | adverb | adverb | RB
tried| VERB | verb | verb, past tense | VBD
the| DET | determiner | determiner | DT
new| ADJ | adjective | adjective (English), other noun-modifier (Chinese) | JJ
burger| NOUN | noun | noun, singular or mass | NN
at| ADP | adposition | conjunction, subordinating or preposition | IN
BurgerZone| PROPN | proper noun | noun, proper singular | NNP
😍| PROPN | proper noun | noun, proper singular | NNP
🍔| PROPN | proper noun | noun, proper singular | NNP
...| PUNCT | punctuation | punctuation mark, colon or ellipsis | :
soooo| NOUN | noun | noun, singular or mass | NN
delicious| PROPN | proper noun | noun, proper singular | NNP
!| PUNCT | punctuation | punctuation mark, sentence

In [91]:
import emoji

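# Return True if the token's text contains at least one emoji character
# (character-by-character lookup in the emoji package's EMOJI_DATA table)
def is_emoji(token):
    return any(char in emoji.EMOJI_DATA for char in token.text)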

# Remove emojis
tokens_without_emojis = [token.text for token in doc if not is_emoji(token)]

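# Join back into text; tokens are separated by single spaces, so the
# original spacing is not restored (e.g. "It's" comes out as "It 's")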
cleaned_text1 = " ".join(tokens_without_emojis)

print(cleaned_text1)

Omg ! ! ! I just tried the new burger at BurgerZone ... soooo delicious ! ! ! You guys HAVE to try it out ! ! It 's like heaven in a bun . Check it out here : https://burgerzone.com/new-burger-deal ! ! ! Alsooo ... I emailed them at foodlover@burgerzone.com for coupons # foodie # yum # bestburgerever 100/10 would recommend ! ! !
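
The output above shows that joining tokens with spaces changes the spacing ("It 's", "! ! !"). As a rough alternative, emoji can be stripped from the raw string before tokenizing. The sketch below uses only the standard library and assumes that Unicode category "So" (Symbol, other) is a good enough stand-in for "emoji"; that assumption also removes symbols such as © or °, so it is an approximation and not a replacement for the emoji package. The helper name strip_emoji_like is hypothetical.

```python
import unicodedata

def strip_emoji_like(text: str) -> str:
    """Drop emoji-like characters from a raw string (approximation).

    Assumption: anything in Unicode category 'So' is treated as an emoji.
    The zero-width joiner and variation selector-16 are dropped as well,
    since they only occur as glue inside emoji sequences.
    """
    kept = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            continue
        if ch in ("\u200d", "\ufe0f"):
            continue
        kept.append(ch)
    # Collapse the runs of whitespace left behind by removed characters
    return " ".join("".join(kept).split())

sample = "It's like heaven in a bun. 😂😂 Check it out here 🔥🔥!!!"
print(strip_emoji_like(sample))
```

Because the string is never split into tokens, contractions and punctuation runs keep their original form, and the cleaned string can then be passed to nlp(...) once.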


In [92]:
# Re-run the pipeline on the emoji-free text
doc2 = nlp(cleaned_text1)

In [93]:
# Fine-grained (Penn Treebank) tags to drop:
# "." = sentence closer, ":" = colon or ellipsis
remove_tags = {".", ":"}

# Remove tokens with unwanted tags
cleaned_tokens = [token.text for token in doc2 if token.tag_ not in remove_tags]

# Join tokens into cleaned text
cleaned_text2 = " ".join(cleaned_tokens)

print(cleaned_text2)

Omg I just tried the new burger at BurgerZone soooo delicious You guys HAVE to try it out It 's like heaven in a bun Check it out here https://burgerzone.com/new-burger-deal Alsooo I emailed them at foodlover@burgerzone.com for coupons # foodie # yum # bestburgerever 100/10 would recommend
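
The tag-based filter above removes sentence closers and ellipses but keeps the "#" tokens. For comparison, here is a minimal standard-library sketch that drops every token made up solely of punctuation characters. It assumes the input was produced with " ".join(...), as cleaned_text1 was, so that splitting on whitespace recovers the tokens. The helper names is_all_punct and drop_punct_tokens are hypothetical, and unlike the tag filter this version also removes "#".

```python
import unicodedata

def is_all_punct(tok: str) -> bool:
    """True when every character of tok is in a Unicode punctuation category (P*)."""
    return bool(tok) and all(unicodedata.category(ch).startswith("P") for ch in tok)

def drop_punct_tokens(spaced_text: str) -> str:
    # Whitespace splitting recovers the tokens only because the text
    # was built by joining tokens with single spaces
    return " ".join(tok for tok in spaced_text.split() if not is_all_punct(tok))

sample = ("It 's like heaven in a bun . Check it out here : "
          "https://burgerzone.com/new-burger-deal ! ! ! # foodie 100/10")
print(drop_punct_tokens(sample))
```

Tokens that mix punctuation with letters or digits, such as "'s", the URL, and "100/10", are kept because not all of their characters are punctuation.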


In [94]:
# Re-run the pipeline on the text with punctuation tokens removed
doc3 = nlp(cleaned_text2)

In [95]:
for token in doc3:
    print(f"{token.text}| {token.pos_} | {spacy.explain(token.pos_)} | {spacy.explain(token.tag_)} | {token.tag_}")


Omg| PROPN | proper noun | noun, proper singular | NNP
I| PRON | pronoun | pronoun, personal | PRP
just| ADV | adverb | adverb | RB
tried| VERB | verb | verb, past tense | VBD
the| DET | determiner | determiner | DT
new| ADJ | adjective | adjective (English), other noun-modifier (Chinese) | JJ
burger| NOUN | noun | noun, singular or mass | NN
at| ADP | adposition | conjunction, subordinating or preposition | IN
BurgerZone| NOUN | noun | noun, singular or mass | NN
soooo| NOUN | noun | noun, singular or mass | NN
delicious| NOUN | noun | noun, singular or mass | NN
You| PRON | pronoun | pronoun, personal | PRP
guys| NOUN | noun | noun, plural | NNS
HAVE| VERB | verb | verb, non-3rd person singular present | VBP
to| PART | particle | infinitival "to" | TO
try| VERB | verb | verb, base form | VB
it| PRON | pronoun | pronoun, personal | PRP
out| ADP | adposition | adverb, particle | RP
It| PRON | pronoun | pronoun, personal | PRP
's| AUX | auxiliary | verb, 3rd person singular present | VB