
This repository contains project work from a natural language processing course taken at The Cooper Union. Project work includes using Naive Bayes to perform text categorization and logistic regression (with TF-IDF feature vectors) for sentiment analysis.


FrankLongueira/Natural-Language-Processing


Natural-Language-Processing

Course: Natural Language Processing (ECE-467)

This repository contains project work from a natural language processing course taken at The Cooper Union.

One project was to build a text categorizer program. The following is a description of what was done:

• Built a text categorization system in Python using a Naïve Bayes approach on unigrams
• Used Python’s NLTK package for tokenization and stemming
• Used a unique variation of Laplace smoothing that was a function of the test sets
• Achieved the highest average accuracy score during system testing in this project's history for Cooper Union's NLP course
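The approach described above can be illustrated with a minimal sketch of a unigram Naive Bayes classifier. This is not the repository's code: it assumes a plain whitespace split in place of NLTK tokenization and stemming, and it uses standard add-k (Laplace) smoothing rather than the test-set-dependent variation mentioned above. The function names `train_naive_bayes` and `classify`, the parameter `k`, and the toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Count unigrams per category from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> unigram counts
    doc_counts = Counter()              # category -> number of documents
    vocab = set()
    for text, category in labeled_docs:
        # Assumption: whitespace split stands in for NLTK tokenization + stemming.
        tokens = text.lower().split()
        word_counts[category].update(tokens)
        doc_counts[category] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def classify(text, word_counts, doc_counts, vocab, k=1.0):
    """Return the category with the highest smoothed log-posterior."""
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category, n_docs in doc_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_tokens = sum(word_counts[category].values())
        denom = total_tokens + k * len(vocab)
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Standard add-k smoothing (assumed; not the author's variation).
            score += math.log((word_counts[category][token] + k) / denom)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical toy data, for illustration only.
toy_training = [
    ("the team won the match", "sports"),
    ("a late goal won the game", "sports"),
    ("the senate passed the bill", "politics"),
    ("voters backed the new bill", "politics"),
]
model = train_naive_bayes(toy_training)
prediction = classify("the team scored a goal", *model)
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together, which matters once documents are longer than these toy sentences.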

The final project of the course involved sentiment analysis on Amazon fine food reviews. The following is a description of what was done:

• Performed polarity (positive / negative) sentiment analysis on an Amazon fine food reviews dataset distributed by Kaggle
• The main components of the project include pre-processing the reviews (lowercasing / stemming), using bigram tokenization to create TF-IDF feature vectors for the dataset, and using a logistic regression classifier
• Achieved average precision, average recall, and average F1 scores of 97% during testing
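The pipeline described above (lowercasing, bigram TF-IDF features, logistic regression) can be sketched using only the standard library. This is an illustrative assumption, not the repository's implementation: stemming is omitted, the IDF formula is one common smoothed variant, training uses plain stochastic gradient descent, and every function name, hyperparameter, and review below is hypothetical. A real project would more likely rely on an established machine learning library.

```python
import math
from collections import Counter


def bigrams(text):
    """Lowercase a review and return its word bigrams (no stemming here)."""
    words = text.lower().split()
    return [f"{a} {b}" for a, b in zip(words, words[1:])]


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for each bigram."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(bigrams(doc)))
    n = len(docs)
    return {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}


def tfidf(doc, idf):
    """Return an L2-normalized sparse TF-IDF vector as a dict."""
    counts = Counter(t for t in bigrams(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(vectors, labels, epochs=200, lr=0.5):
    """Fit weights and a bias with stochastic gradient descent on log loss."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            p = sigmoid(bias + sum(weights[t] * v for t, v in vec.items()))
            error = p - y  # gradient of the log loss with respect to z
            for t, v in vec.items():
                weights[t] -= lr * error * v
            bias -= lr * error
    return weights, bias


def predict(doc, idf, weights, bias):
    """Return 1 for positive polarity, 0 for negative."""
    vec = tfidf(doc, idf)
    z = bias + sum(weights[t] * v for t, v in vec.items())
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical toy reviews (1 = positive, 0 = negative), for illustration only.
reviews = [
    "really good coffee would buy again",
    "tastes great and really good value",
    "very bad taste would not buy",
    "stale chips and very bad packaging",
]
labels = [1, 1, 0, 0]

idf = fit_idf(reviews)
weights, bias = train_logreg([tfidf(r, idf) for r in reviews], labels)
positive_guess = predict("really good snack", idf, weights, bias)
negative_guess = predict("very bad snack", idf, weights, bias)
```

Bigrams let the classifier weight phrases such as "really good" or "very bad" as single features, which single words alone cannot capture.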
