# Introduction to NLP Libraries: NLTK & spaCy

In this notebook, we explore two popular Python libraries for Natural Language Processing (NLP): NLTK and spaCy.
We look at what each one is good at and walk through simple examples of using them to tokenize text.

## NLTK vs spaCy

<div style="display: grid; grid-template-columns: 1fr 1fr; gap: 20px;">
  <div style="background: #e8f4fd; padding: 20px; border-radius: 10px;">
    <h4>📚 NLTK (Natural Language Toolkit)</h4>
    <ul>
      <li>Educational and research-focused</li>
      <li>Lots of algorithms to choose from</li>
      <li>Great for learning NLP concepts</li>
      <li>First released in 2001</li>
    </ul>
  </div>
  <div style="background: #f0f8e8; padding: 20px; border-radius: 10px;">
    <h4>⚡ spaCy</h4>
    <ul>
      <li>Production-ready and fast</li>
      <li>Modern and industrial-strength</li>
      <li>Great for real applications</li>
      <li>Easy to use out-of-the-box</li>
    </ul>
  </div>
</div>

## Installation & Setup

In [None]:
# Install the libraries
!pip install nltk spacy

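# Download NLTK data
import nltk
nltk.download('punkt')      # tokenizer models used by older NLTK releases
nltk.download('punkt_tab')  # tokenizer data that newer NLTK releases look for instead
nltk.download('stopwords')  # stop-word lists for several languages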

# Download spaCy model
!python -m spacy download en_core_web_sm
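
Before moving on, it can help to confirm that both packages are importable in the current kernel. The following is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper written for this notebook, not part of NLTK or spaCy.

```python
from importlib.util import find_spec


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found by the import system."""
    return find_spec(package_name) is not None


# Report the status of the two libraries installed above
for name in ("nltk", "spacy"):
    status = "found" if is_installed(name) else "missing (re-run the pip install cell)"
    print(f"{name}: {status}")
```

This only checks that the packages themselves are present. It does not verify that the NLTK data or the `en_core_web_sm` model finished downloading.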

## NLTK in Action

In [None]:
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize

text = "Hello world! How are you today? I'm learning NLP."

# Word tokenization
words = word_tokenize(text)
print("Words:", words)

# Sentence tokenization
sentences = sent_tokenize(text)
print("Sentences:", sentences)
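
To see why a dedicated tokenizer is useful, here is a rough comparison with two naive approaches that use only the Python standard library. This is an illustrative sketch, not how NLTK works internally; the regular expression is an assumption chosen for simplicity.

```python
import re

text = "Hello world! How are you today? I'm learning NLP."

# Naive approach 1: split on whitespace, so punctuation stays glued to words
whitespace_tokens = text.split()

# Naive approach 2: match runs of word characters, or single punctuation marks
regex_tokens = re.findall(r"\w+|[^\w\s]", text)

print("Whitespace:", whitespace_tokens)
print("Regex:     ", regex_tokens)
```

Whitespace splitting leaves `world!` and `NLP.` as single tokens, while the simple regex breaks `I'm` into three pieces (`I`, `'`, `m`). NLTK's `word_tokenize` separates punctuation and is designed to handle contractions more sensibly, typically splitting `I'm` into `I` and `'m`.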

## spaCy in Action

In [None]:
import spacy

# Load the English model
nlp = spacy.load("en_core_web_sm")

text = "Apple Inc. is looking at buying U.K. startup for $1 billion"
doc = nlp(text)

# Print each token with its lemma (base form) and part-of-speech tag
for token in doc:
    print(f"{token.text} -> {token.lemma_} ({token.pos_})")

[Try Both Libraries in Colab](https://colab.research.google.com/github/Roopesht/codeexamples/blob/main/genai/python_easy/1/concept_4.ipynb)

## NLTK & spaCy Made Simple

**Think of them as different types of cutting tools:**

<div style="font-size: 1.1em;">
  <p>✂️ <strong>NLTK:</strong> Swiss Army knife (many tools, learn how each works)</p>
  <p>🔧 <strong>spaCy:</strong> Electric cutter (fast, professional, just works)</p>
  <p>📝 <strong>Both:</strong> Cut text into tokens (words, punctuation, sentences)</p>
  <p>🎯 <strong>Result:</strong> Clean tokens ready for further analysis, such as lemmas and part-of-speech tags!</p>
</div>

## Libraries from a Different Angle

Imagine NLTK and spaCy as different restaurants:

<ul>
  <li>🍳 <strong>NLTK Cafe:</strong> Teaches you to cook and shows you all the ingredients and techniques</li>
  <li>🍽️ <strong>spaCy Restaurant:</strong> Serves perfect dishes fast, no fuss</li>
  <li>👨‍🍳 <strong>Both:</strong> Serve delicious tokenized text</li>
  <li>🎓 <strong>You choose:</strong> Based on your needs and goals!</li>
</ul>
<p><em>Hopefully the restaurant analogy makes the choice between the two libraries clearer!</em></p>

## Quick Reflection

Now you've seen both NLTK and spaCy in action...
💭 For your first NLP project, would you prefer learning-focused NLTK or production-ready spaCy?