varunkapoor/Natural-Language-Processing
This repository contains a collection of Natural Language Processing tools.
1. Email extraction: extracts links present on web pages using efficient regular expressions.
2. Spelling autocorrection: implements the following language models:
• Laplace Unigram Language Model
• Laplace Bigram Language Model
• Stupid Backoff Language Model
• Custom Language Model
3. Sentiment Analysis: trains a Naïve Bayes classifier on the IMDB dataset for binary sentiment classification of movie reviews.
4. Named Entity Recognition: builds a maximum entropy Markov model for identifying person names in newswire text.
5. Semantics: defines a probabilistic context-free grammar for a small subset of English.
6. CKY Parsing Algorithm: builds a probabilistic parser by implementing the CKY algorithm.
7. Information Retrieval: builds an inverted index to quickly retrieve documents.
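
As an illustration of the Stupid Backoff model listed under spelling autocorrection, here is a minimal Python sketch. It is not the repository's code: the function names `train` and `stupid_backoff_score`, the toy corpus, and the add-one smoothing on the unigram fallback are assumptions made for the example. The back-off weight of 0.4 is the value commonly used with this method.

```python
from collections import Counter


def train(sentences):
    """Count unigrams and bigrams over tokenized sentences (hypothetical helper)."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def stupid_backoff_score(prev, word, unigrams, bigrams, alpha=0.4):
    """Return a relative-frequency score (not a normalized probability)."""
    if bigrams[(prev, word)] > 0:
        # Seen bigram: plain relative frequency, no discounting.
        return bigrams[(prev, word)] / unigrams[prev]
    # Unseen bigram: back off to a unigram estimate scaled by alpha.
    # Add-one smoothing here is an assumption so unseen words score above zero.
    total = sum(unigrams.values())
    return alpha * (unigrams[word] + 1) / (total + len(unigrams))


# Toy corpus, for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
unigrams, bigrams = train(corpus)
seen = stupid_backoff_score("the", "cat", unigrams, bigrams)
unseen = stupid_backoff_score("cat", "dog", unigrams, bigrams)
```

A spelling corrector would typically score each candidate sentence by summing the logs of these scores and keep the highest-scoring candidate.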
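
The sentiment analysis item can be sketched as a multinomial Naïve Bayes classifier with add-one smoothing. The two-document training set below is a made-up stand-in for the IMDB data, and `train_nb` and `classify` are hypothetical names, not functions from this repository.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """docs is a list of (tokens, label) pairs. Returns counts and the vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(tokens, class_counts, word_counts, vocab):
    """Pick the label with the highest log posterior under add-one smoothing."""
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label in class_counts:
        logp = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                logp += math.log((word_counts[label][tok] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Toy training data, for illustration only.
training = [
    ("great fun film".split(), "pos"),
    ("boring dull film".split(), "neg"),
]
model = train_nb(training)
prediction = classify("great film".split(), *model)
```

Working in log space avoids numeric underflow when reviews contain many tokens.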
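
For the CKY parsing item, the following sketch computes the probability of the best parse of a sentence under a tiny grammar in Chomsky normal form. The grammar, its probabilities, and the function name `cky` are invented for the example and are not taken from the repository.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form.
# Binary rules: (parent, left child, right child) -> probability.
BINARY = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
# Lexical rules: (tag, word) -> probability.
LEXICAL = {("NP", "she"): 0.5, ("NP", "fish"): 0.5, ("V", "eats"): 1.0}


def cky(words):
    """Return the probability of the best S parse of words, or 0.0 if none exists."""
    n = len(words)
    # table[i][j] maps a nonterminal to its best probability over words[i:j].
    table = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    # Fill the diagonal from the lexical rules.
    for i, w in enumerate(words):
        for (tag, word), p in LEXICAL.items():
            if word == w:
                table[i][i + 1][tag] = max(table[i][i + 1][tag], p)
    # Combine adjacent spans bottom-up, from short spans to long ones.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for (parent, left, right), p in BINARY.items():
                    score = p * table[i][k][left] * table[k][j][right]
                    if score > table[i][j][parent]:
                        table[i][j][parent] = score
    return table[0][n]["S"]


best = cky(["she", "eats", "fish"])
```

A full parser would also store back-pointers in each cell so the best tree can be recovered, not just its probability.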
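
The information retrieval item rests on an inverted index, which maps each term to the set of documents containing it. Below is a minimal sketch assuming whitespace tokenization and Boolean AND queries. The helper names `build_index` and `search` and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """docs maps a document id to its text. Returns term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Sample documents, for illustration only.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick brown dog",
}
index = build_index(docs)
hits = search(index, "quick brown")
```

Lookups touch only the postings of the query terms, which is why retrieval is fast compared with scanning every document.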