Contains coursework, assignments and projects from UT's Unstructured Data Processing course.
Scraped user comments from the Edmunds website, then analyzed how users perceive each brand.
- Data scraper built with Selenium
- Validating Zipf's law
- Text cleaning and frequency based data-cuts
- Tf-Idf and Lift analysis
- Multidimensional scaling (MDS) plots to visualize brand differences
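
As a rough illustration of the lift analysis listed above, the sketch below computes lift between two brands from co-mentions in posts. It assumes each post has already been reduced to a set of tokens; the function name `lift` and the sample posts are hypothetical and are not taken from the project code.

```python
def lift(posts, a, b):
    """Lift(a, b) = P(a and b) / (P(a) * P(b)), with one post as the unit.

    posts: list of sets of tokens (an assumed input format).
    Values above 1 suggest the two terms are mentioned together more often
    than chance; values below 1 suggest they are mentioned together less often.
    """
    n = len(posts)
    n_a = sum(1 for p in posts if a in p)
    n_b = sum(1 for p in posts if b in p)
    n_ab = sum(1 for p in posts if a in p and b in p)
    if n_a == 0 or n_b == 0:
        return 0.0  # undefined when a term never appears; reported as 0 here
    return n * n_ab / (n_a * n_b)


# Hypothetical, made-up posts for illustration only.
sample_posts = [
    {"honda", "toyota", "reliable"},
    {"honda", "toyota"},
    {"bmw", "performance"},
    {"bmw", "audi"},
]

print(lift(sample_posts, "honda", "toyota"))
print(lift(sample_posts, "honda", "bmw"))
```

One common way to feed such values into a multidimensional scaling plot is to use the inverse of lift as a dissimilarity, though whether this project does so is an assumption.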
Scraped data from beer review websites.
- Tokenization, POS tagging and cleaning text data
- Cosine distances using CountVectorizer and spaCy word embeddings
- Tf-Idf and Lift analysis
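
The cosine comparison listed above can be sketched with the standard library alone. This is a stand-in for a count-vectorizer approach, not the project's code: it uses whitespace tokenization, and the function name `cosine_similarity` and the sample reviews are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw word counts.

    Tokenization is a plain lowercase whitespace split (an assumption);
    cosine distance is 1 minus the returned value.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical review snippets for illustration only.
review_1 = "hoppy bitter ale with citrus notes"
review_2 = "bitter hoppy ale"
review_3 = "sweet dark stout"

print(cosine_similarity(review_1, review_2))
print(cosine_similarity(review_1, review_3))
```

With word embeddings, the same formula applies to the dense vectors in place of the count vectors.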
Scraped both image and text data from Instagram and analyzed it as follows to compare Adidas' and Nike's social media engagement.
- Lemmatization, Tf-Idf and Wordclouds to understand text
- Topic Modelling to cluster documents into topics
- Named Entity Recognition to peel out additional details from text
- Google analytics to tag images
- Extract facial expressions to get a general sense of sentiment, if possible
- Analyze differences between Nike and Adidas; results are detailed in Brand-Analytics.pdf
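
For readers unfamiliar with the Tf-Idf step listed above, here is a minimal sketch of the plain, unsmoothed form. It assumes captions are already tokenized and lemmatized; the function name `tf_idf` and the sample captions are hypothetical, and library implementations (for example scikit-learn's default settings) use a smoothed variant that gives different numbers.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    docs: list of token lists (assumed already cleaned and lemmatized).
    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document

    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return scores


# Hypothetical, made-up captions for illustration only.
captions = [
    ["run", "shoe", "run"],
    ["shoe", "style"],
]

for doc_scores in tf_idf(captions):
    print(doc_scores)
```

Terms that appear in every caption score zero, so the highest-scoring terms are the ones that set a caption, or a brand's set of captions, apart.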