🧠 NLP Mini Assignment – Python

This project contains three simple NLP exercises implemented in Python for learning and demonstration purposes.

📘 Exercises

1. Regex for Email Extraction

Uses Python's re module to find email addresses in free text with a regular expression.

Example Output: Extracted emails: ['info@university.edu', 'contact@datasciencehub.org', 'laiba.azhar01@gmail.com']
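
A minimal sketch of how such an extraction could look. The pattern, the `extract_emails` function name, and the sample sentence are assumptions for illustration and are not taken from extract_emails.py; the pattern is a pragmatic approximation, not full RFC 5322 validation.

```python
import re

# Assumed pattern: local part, "@", domain labels, then a TLD of 2+ letters.
# It is deliberately loose and will accept some strings that are not valid addresses.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return every substring of `text` that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical input text built around the addresses shown in the example output.
sample = (
    "Write to info@university.edu or contact@datasciencehub.org; "
    "personal mail goes to laiba.azhar01@gmail.com."
)
print("Extracted emails:", extract_emails(sample))
```

Because the final part of the pattern requires letters after the last dot, the sentence-ending period after `gmail.com` is not captured as part of the address.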

2. Compute Edit Distance

Implements Levenshtein edit distance: the minimum number of single-character edits (insertions, deletions, substitutions) required to convert one string into another.

Example:
Edit distance between 'kitten' and 'sitting': 3
Edit distance between 'flaw' and 'lawn': 2
Edit distance between 'intention' and 'execution': 5
Edit distance between '' and 'abc': 3
Edit distance between 'abc' and 'abc': 0
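
The distances above are consistent with the standard dynamic-programming formulation in which every insertion, deletion, and substitution costs 1. The sketch below is one way to write it, keeping only two rows of the table; the `edit_distance` function name and signature are assumptions and may differ from edit_distance.py.

```python
def edit_distance(source, target):
    """Levenshtein distance with unit cost for insert, delete, and substitute."""
    # previous[j] = distance between the first i-1 chars of source
    # and the first j chars of target (row 0: j insertions from empty).
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning i chars of source into the empty string takes i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


for a, b in [("kitten", "sitting"), ("flaw", "lawn"), ("intention", "execution")]:
    print(f"Edit distance between {a!r} and {b!r}: {edit_distance(a, b)}")
```

Some textbook variants charge 2 for a substitution, which would give different totals (for example, 8 instead of 5 for 'intention' and 'execution'), so the unit-cost choice matters when comparing results.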

3. Normalize and Tokenize Tweets

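Performs basic tweet cleaning with regular expressions and NLTK: lowercases the text and removes mentions, URLs, emojis, punctuation (including the # of hashtags), and stopwords before tokenizing. Note that the sample run below still contains common words such as "the" and "it", so stopword filtering does not appear to be applied in that run.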

Example Output:
Original: RT @JohnDoe: Loving the new AI model! 😍🔥 Check it out: https://t.co/xyz123 #AI #NLP
Clean Tweet: rt loving the new ai model check it out ai nlp
Tokens: ['rt', 'loving', 'the', 'new', 'ai', 'model', 'check', 'it', 'out', 'ai', 'nlp']
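
The cleaning steps can be sketched with the standard library alone. This is an illustration under stated assumptions, not the contents of tweet_normalize.py: the `normalize_tweet` function and its `stopwords` parameter are hypothetical, emojis are dropped by discarding all non-ASCII characters, and whitespace splitting stands in for an NLTK tokenizer.

```python
import re
import string


def normalize_tweet(tweet, stopwords=frozenset()):
    """Return (clean_text, tokens) for a raw tweet.

    `stopwords` is a hypothetical optional set of words to drop; it is empty
    by default, which keeps words such as "the" and "it".
    """
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # strip URLs before punctuation
    text = re.sub(r"@\w+", " ", text)          # strip @mentions
    # Assumption: treat every non-ASCII character as an emoji and drop it.
    text = text.encode("ascii", "ignore").decode("ascii")
    # Remove punctuation, which also removes the "#" in front of hashtags.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Whitespace split stands in for an NLTK tokenizer in this sketch.
    tokens = [tok for tok in text.split() if tok not in stopwords]
    return " ".join(tokens), tokens


raw = "RT @JohnDoe: Loving the new AI model! 😍🔥 Check it out: https://t.co/xyz123 #AI #NLP"
clean, tokens = normalize_tweet(raw)
print("Clean Tweet:", clean)
print("Tokens:", tokens)
```

URLs are removed before punctuation so that stripping ":" and "/" does not leave URL fragments behind. Passing a set such as `{"the", "it", "out"}` as `stopwords` would filter those tokens as well.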

🧰 Tools Used

Python 3.x
re (Regular Expressions)
nltk
string

✨ Author

Laiba Azhar
MS Data Science | NLP Enthusiast
LinkedIn Profile | GitHub Profile
