# Experiment 7
## Perform Stemming.

Stemming is a text normalization technique that reduces words to a base form (the stem) by stripping affixes, most often suffixes. It is widely used in Natural Language Processing (NLP) to collapse variations of a word into a single form. Unlike lemmatization, stemming is less precise: it does not always produce valid words, but it is computationally cheap.

In [None]:
# Importing Library
import nltk

# Importing Stemmer
from nltk.stem import PorterStemmer

# Importing tokenizer
from nltk.tokenize import word_tokenize

nltk.download('punkt')

# Initializing Stemmer Object
ps = PorterStemmer()

# List of words
words = ['program', 'programs', 'programmer', 'programming', 'programmers']

print("Words after Stemming:")

# Iterating over the list and converting each word to its stem
for word in words:

    # Converting to stem
    print(f"{word} -> {ps.stem(word)}")

Words after Stemming:
program -> program
programs -> program
programmer -> programm
programming -> program
programmers -> programm


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
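
The cell above stems isolated words. As a minimal sketch of the same idea applied to running text, the snippet below splits a sentence on whitespace and stems each token. The sentence and the whitespace split are assumptions made for illustration; `word_tokenize` could be used instead once the tokenizer data has been downloaded.

```python
from nltk.stem import PorterStemmer

ps = PorterStemmer()

# Illustrative sentence (not from the experiment); split on whitespace for simplicity
sentence = "The programmers were programming new programs"
tokens = sentence.split()

# Stem each token; PorterStemmer lowercases its input by default
stems = [ps.stem(token) for token in tokens]

for token, stem in zip(tokens, stems):
    print(f"{token} -> {stem}")
```

The three "program" variants should reduce to the same stems shown in the output above, which is what makes stemmed text easier to match and count.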


## Differences between Lemmatization and Stemming:

### Lemmatization:

- **Definition:**
  - Lemmatization is a linguistic process that involves reducing words to their base or root form, known as the "lemma."

- **Goal:**
  - The goal of lemmatization is to convert words to a common base form to capture their inherent meaning.

- **Precision:**
  - Lemmatization tends to be more precise than stemming because it considers the context of the word and its part of speech.

- **Example:**
  - For the word "running," lemmatization returns "run" when the word is treated as a verb.

### Stemming:

- **Definition:**
  - Stemming is a more heuristic process that involves removing prefixes or suffixes from words to obtain a common base or root form, known as the "stem."

- **Goal:**
  - The goal of stemming is to reduce words to a common form, even if the result is not a valid word.

- **Simplicity:**
  - Stemming is simpler and faster than lemmatization but may sacrifice precision.

- **Example:**
  - For the word "running," stemming might return "run" without considering the grammatical context.
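
To make the heuristic nature of stemming concrete, here is a deliberately naive suffix stripper. It is not the Porter algorithm: the rule list `SUFFIX_RULES`, the function `toy_stem`, and the `min_stem_len` parameter are all hypothetical names invented for this sketch.

```python
# Hypothetical suffix rules, tried in order. This is NOT the Porter algorithm.
SUFFIX_RULES = ["ing", "ers", "er", "s"]


def toy_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIX_RULES:
        remaining = len(word) - len(suffix)
        if word.endswith(suffix) and remaining >= min_stem_len:
            return word[:remaining]
    return word


for w in ["running", "programs", "programmer", "sing"]:
    print(f"{w} -> {toy_stem(w)}")
# running -> runn
# programs -> program
# programmer -> programm
# sing -> sing
```

The result "runn" is not a valid word, which illustrates the trade-off described above: the rules are cheap to apply but know nothing about the language. Real stemmers such as Porter add further rules (for example, handling doubled consonants) to reduce such cases.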

### Comparison:

- **Precision vs. Speed:**
  - Lemmatization is generally more precise but slower than stemming.

- **Context Awareness:**
  - Lemmatization considers the context and part of speech, while stemming is more rule-based and may not consider linguistic context.

- **Use Cases:**
  - Lemmatization is often preferred in applications where valid, meaningful word forms are crucial (e.g., chatbots, question answering, machine translation), while stemming is more suitable where speed matters more than exact word forms (e.g., search indexing in information retrieval systems, text classification).

In summary, lemmatization aims for precision by considering the context and part of speech, while stemming is a faster but less precise process that focuses on removing prefixes and suffixes to obtain a common base form.
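
The role of part of speech can be sketched with a tiny dictionary lookup. This is only an illustration of the idea, not how a real lemmatizer is implemented: the table `LEMMA_TABLE`, its entries, the tag names, and the function `toy_lemmatize` are hypothetical.

```python
# Hypothetical lemma table keyed by (word, part of speech); a real lemmatizer
# relies on a full lexical database instead of a hand-written dictionary.
LEMMA_TABLE = {
    ("better", "adj"): "good",
    ("better", "verb"): "better",
    ("running", "verb"): "run",
    ("running", "noun"): "running",
    ("studies", "noun"): "study",
    ("studies", "verb"): "study",
}


def toy_lemmatize(word: str, pos: str) -> str:
    """Return the lemma for (word, pos), or the lowercased word if it is unknown."""
    key = (word.lower(), pos)
    return LEMMA_TABLE.get(key, word.lower())


print(toy_lemmatize("better", "adj"))    # good
print(toy_lemmatize("running", "verb"))  # run
print(toy_lemmatize("running", "noun"))  # running
```

The same surface form maps to different lemmas depending on the tag, which a suffix-stripping stemmer cannot do because it only sees the characters of the word.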


### Examples:

### Lemmatization Examples:

1. **Word: "Studies"**
   - **Stemming Result:** "Studi" (may not be a valid word)
   - **Lemmatization Result:** "Study"

2. **Word: "Better"**
   - **Stemming Result:** "Better" (no change, because no suffix rule applies)
   - **Lemmatization Result:** "Good" (the base form of "better" when it is treated as an adjective)

3. **Word: "Caring"**
   - **Stemming Result:** "Care" with NLTK's Porter stemmer; a more aggressive stemmer may cut it to "Car"
   - **Lemmatization Result:** "Care"

### Stemming Examples:

1. **Word: "Happiness"**
   - **Stemming Result:** "Happi"
   - **Lemmatization Result:** "Happiness" (no change as lemmatization retains valid words)

2. **Word: "Running"**
   - **Stemming Result:** "Run"
   - **Lemmatization Result:** "Run" (same result in this case, but lemmatization considers context)

3. **Word: "Believable"**
   - **Stemming Result:** "Believ"
   - **Lemmatization Result:** "Believable" (lemmatization retains valid words)
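
The stemming results listed above can be checked with a short sketch using the same `PorterStemmer` as in the experiment. Exact stems depend on the stemming algorithm, so other stemmers may produce different output for some of these words; note also that the Porter stemmer in NLTK returns lowercase stems.

```python
from nltk.stem import PorterStemmer

ps = PorterStemmer()

# Words taken from the examples above
example_words = ["Studies", "Better", "Caring", "Happiness", "Running", "Believable"]

# Map each word to its Porter stem
porter_stems = {word: ps.stem(word) for word in example_words}

for word, stem in porter_stems.items():
    print(f"{word} -> {stem}")
```

Running a check like this is a quick way to confirm what a particular stemmer does before relying on hand-written examples.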

### Additional Note:

- **Context Consideration:**
  - Lemmatization considers the context and part of speech, so it may provide more accurate results in some cases where stemming might yield an invalid or less meaningful base form.

These examples demonstrate how lemmatization produces valid words and considers linguistic context, while stemming may result in non-words or less meaningful base forms.