This project contains three simple NLP exercises implemented in Python for learning and demonstration purposes.
Used Python’s re module to extract valid email addresses from text using regular expressions.
Example Output: Extracted emails: ['info@university.edu', 'contact@datasciencehub.org', 'laiba.azhar01@gmail.com']
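The extraction step can be sketched roughly as follows. The pattern below is a simplified assumption, not necessarily the regular expression used in this project, and the names `EMAIL_PATTERN` and `extract_emails` are hypothetical. It covers common address shapes rather than the full email grammar.

```python
import re
from typing import List

# Simplified, illustrative pattern (an assumption, not the project's exact regex):
# local part, "@", one or more dot-separated domain labels, then an alphabetic TLD.
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def extract_emails(text: str) -> List[str]:
    """Return every substring of `text` that matches EMAIL_PATTERN, in order."""
    return EMAIL_PATTERN.findall(text)


sample = (
    "Write to info@university.edu or contact@datasciencehub.org. "
    "Personal: laiba.azhar01@gmail.com. Not valid: @nouser.com"
)
print("Extracted emails:", extract_emails(sample))
```

Because the only group in the pattern is non-capturing, `findall` returns whole matches rather than group contents.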
Implemented Levenshtein edit distance, which measures the minimum number of single-character edits (insertions, deletions, substitutions) required to convert one string into another.
Example:
- Edit distance between 'kitten' and 'sitting': 3
- Edit distance between 'flaw' and 'lawn': 2
- Edit distance between 'intention' and 'execution': 5
- Edit distance between '' and 'abc': 3
- Edit distance between 'abc' and 'abc': 0
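For reference, a minimal sketch of the standard dynamic-programming formulation, assuming a unit cost for every insertion, deletion, and substitution. The function name `edit_distance` is hypothetical and the project's own implementation may be structured differently. Only two rows of the table are kept in memory at a time.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance with unit costs, computed row by row."""
    # previous[j] = distance between the empty prefix of source and target[:j]
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # current[0] = distance between source[:i] and the empty string
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution_cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,                      # delete s_char
                current[j - 1] + 1,                   # insert t_char
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


for a, b in [("kitten", "sitting"), ("flaw", "lawn"), ("intention", "execution")]:
    print(f"Edit distance between {a!r} and {b!r}: {edit_distance(a, b)}")
```

The running time is proportional to the product of the two string lengths, and the extra memory is proportional to the length of `target`.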
Performed basic tweet cleaning using regular expressions and NLTK: lowercased the text and removed mentions, URLs, emojis, punctuation, and stopwords.
Example Output:
- Original: RT @JohnDoe: Loving the new AI model! 😍🔥 Check it out: https://t.co/xyz123 #AI #NLP
- Clean Tweet: rt loving the new ai model check it out ai nlp
- Tokens: ['rt', 'loving', 'the', 'new', 'ai', 'model', 'check', 'it', 'out', 'ai', 'nlp']
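One way the cleaning steps could be arranged is sketched below. It is an illustration under stated assumptions, not the project's actual code: the names `clean_tweet`, `URL_RE`, and `MENTION_RE` are hypothetical, emojis are removed by dropping all non-ASCII characters (which would also drop accented letters), and the stopword list is passed in as a plain set so the sketch runs without downloading NLTK data.

```python
import re
import string
from typing import FrozenSet, List

# Illustrative patterns (assumptions, not the project's exact regexes).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(tweet: str, stopword_set: FrozenSet[str] = frozenset()) -> List[str]:
    """Lowercase a tweet, strip URLs, mentions, emojis, and punctuation, then tokenize."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)        # remove links before punctuation is stripped
    text = MENTION_RE.sub(" ", text)    # remove @user mentions
    # Blunt emoji removal: drop every non-ASCII character.
    text = text.encode("ascii", "ignore").decode("ascii")
    text = text.translate(PUNCT_TABLE)  # also strips the '#' from hashtags
    return [token for token in text.split() if token not in stopword_set]


tweet = "RT @JohnDoe: Loving the new AI model! 😍🔥 Check it out: https://t.co/xyz123 #AI #NLP"
tokens = clean_tweet(tweet)
print("Clean Tweet:", " ".join(tokens))
print("Tokens:", tokens)
```

With the default empty `stopword_set`, the sketch yields the tokens listed in the example output. To filter stopwords with NLTK, a set built from `nltk.corpus.stopwords.words("english")` could be passed instead, which requires running `nltk.download("stopwords")` once beforehand.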
- Python 3.x
- re (Regular Expressions)
- nltk
- string
Laiba Azhar
MS Data Science | NLP Enthusiast
LinkedIn Profile | GitHub Profile