- Install the NLTK library if you haven't already.
- Import the necessary modules from NLTK.
- Download the NLTK data files that the following subtasks depend on (the Gutenberg corpus, the sentence tokenizer models, the POS tagger model, and WordNet).
- Load the Gutenberg corpus through NLTK's corpus reader (nltk.corpus.gutenberg).
- Choose a specific file (e.g., 'austen-emma.txt') and tokenize it into sentences.
- Print the total number of sentences and the first sentence.
- Tokenize the text into words and print the tokens.
- Generate bigrams and trigrams from the word tokens and print the first 10 of each.
- Perform POS tagging on the word tokens and print the first 10 tokens with their POS tags.
- Stem each word token and print the original token, its POS tag, and its stem.
- Lemmatize each word token and print the original token and its lemma.
- Create a frequency distribution of the word tokens and plot the top 20 words.
- Python Programming
- Data Analysis
- Documentation
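
The NLTK subtasks listed above could be sketched roughly as follows, assuming NLTK is already installed (for example with `pip install nltk`) and matplotlib is available for the plot. The helper names (`first_ngrams`, `stem_tagged`, `word_frequencies`, `run_pipeline`) and the `RESOURCES` list are hypothetical and not part of the task description. The resource identifiers are an assumption that may need adjusting for your NLTK release. The sketch uses the Porter stemmer as one possible stemmer, and it prints only short previews rather than every token, since the chosen file is a full novel.

```python
import nltk
from nltk import FreqDist, bigrams, trigrams
from nltk.corpus import gutenberg
from nltk.stem import PorterStemmer, WordNetLemmatizer

# Assumed resource identifiers; they differ between NLTK releases,
# so adjust this list to match the installed version.
RESOURCES = ["gutenberg", "punkt", "punkt_tab", "wordnet",
             "averaged_perceptron_tagger", "averaged_perceptron_tagger_eng"]


def first_ngrams(tokens, count=10):
    """Return the first `count` bigrams and trigrams of a token list."""
    return list(bigrams(tokens))[:count], list(trigrams(tokens))[:count]


def stem_tagged(tagged_tokens):
    """Map (token, tag) pairs to (token, tag, Porter stem) triples."""
    stemmer = PorterStemmer()
    return [(tok, tag, stemmer.stem(tok)) for tok, tag in tagged_tokens]


def word_frequencies(tokens):
    """Build a frequency distribution over the raw word tokens."""
    return FreqDist(tokens)


def run_pipeline(file_id="austen-emma.txt", preview=10):
    """Walk through the subtasks for one Gutenberg file, printing previews."""
    for name in RESOURCES:
        nltk.download(name, quiet=True)

    # Sentence tokenization: total count and first sentence.
    raw = gutenberg.raw(file_id)
    sentences = nltk.sent_tokenize(raw)
    print(f"{len(sentences)} sentences; first: {sentences[0]!r}")

    # Word tokenization (preview only; the full list is very long).
    tokens = nltk.word_tokenize(raw)
    print(tokens[:preview])

    # Bigrams and trigrams.
    first_bigrams, first_trigrams = first_ngrams(tokens, preview)
    print(first_bigrams)
    print(first_trigrams)

    # POS tagging over all tokens; this can take a while on a whole novel.
    tagged = nltk.pos_tag(tokens)
    print(tagged[:preview])

    # Stemming: original token, POS tag, stem.
    for token, tag, stem in stem_tagged(tagged[:preview]):
        print(token, tag, stem)

    # Lemmatization with the default (noun) part of speech.
    lemmatizer = WordNetLemmatizer()
    for token in tokens[:preview]:
        print(token, lemmatizer.lemmatize(token))

    # Frequency distribution and plot of the 20 most common tokens.
    word_frequencies(tokens).plot(20)
```

Calling `run_pipeline()` runs every step on 'austen-emma.txt'. Pass another file ID from `gutenberg.fileids()` to analyze a different text. The three small helpers need no downloaded data, so they can be tried on a hand-written token list first.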

