code and other relevant documents for Unstructured Data Analytics course

vicgpt/Unstructured-Data-Analytics

Unstructured Data Processing - NLP, Image analytics

Contains coursework, assignments, and projects from UT's Unstructured Data Processing course.

Car Brands Text Analytics

Scraped user comments from the Edmunds website, followed by an analysis of brand perceptions among users.

  • Data scraper using Selenium
  • Validating Zipf's law
  • Text cleaning and frequency based data-cuts
  • Tf-Idf and Lift analysis
  • Multi-dimensional Scaled plots to visualize brand differences
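
As an illustration of the lift analysis listed above, here is a minimal sketch that is not taken from the repository's notebooks. It assumes lift between two brands is defined as N × (comments mentioning both) / (comments mentioning A × comments mentioning B), where N is the total number of comments. The function name `lift_scores` and the sample comments are made up for the example.

```python
from itertools import combinations


def lift_scores(comments, brands):
    """Return the lift for every brand pair, based on co-mentions in comments."""
    n = len(comments)

    # Record which brands appear in each comment (lowercased whole-word match).
    mentions = []
    for text in comments:
        words = set(text.lower().split())
        mentions.append({b for b in brands if b in words})

    # Number of comments that mention each brand at least once.
    counts = {b: sum(b in m for m in mentions) for b in brands}

    scores = {}
    for a, b in combinations(sorted(brands), 2):
        both = sum(a in m and b in m for m in mentions)
        # Skip pairs where a brand never appears, to avoid dividing by zero.
        if counts[a] and counts[b]:
            scores[(a, b)] = n * both / (counts[a] * counts[b])
    return scores


# Hypothetical comments, invented purely for illustration.
comments = [
    "honda beats toyota on price",
    "toyota and honda are both reliable",
    "bmw feels sportier than audi",
    "audi interior is nicer",
]
scores = lift_scores(comments, ["honda", "toyota", "bmw", "audi"])
```

A lift above 1 suggests two brands are mentioned together more often than chance would predict, while a lift near 0 suggests they are rarely compared. One common choice is to feed the inverse of lift into multi-dimensional scaling as a dissimilarity measure.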

Beer Review

Scraped data from relevant beer review websites.

  • Tokenization, POS tagging and cleaning text data
  • Cosine distances using CountVectorizers and Spacy word-embeddings
  • Tf-Idf and Lift analysis
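
The count-based cosine comparison listed above can be sketched with the standard library alone. This is a simplified stand-in for a CountVectorizer pipeline, assuming whitespace tokenization and raw term counts. The function name `cosine_similarity` and the sample reviews are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw word counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())

    # Dot product over the words the two texts share.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))

    # An empty text has no direction, so treat its similarity as zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical reviews, invented purely for illustration.
wanted = "crisp hoppy citrus"
reviews = {
    "beer_a": "very hoppy with a crisp citrus finish",
    "beer_b": "dark roasted malt and chocolate notes",
}
ranked = sorted(reviews, key=lambda k: cosine_similarity(wanted, reviews[k]), reverse=True)
```

Cosine distance is one minus this similarity. Replacing the count vectors with averaged word embeddings would let reviews match on related words instead of exact tokens.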

Project

Scraped both image and text data from Instagram and analyzed them as described below to differentiate the social media engagement of Adidas and Nike.

  • Lemmatization, Tf-Idf and Wordclouds to understand text
  • Topic Modelling to cluster documents into topics
  • Named Entity Recognition to peel out additional details from text
  • Google analytics to tag images
  • Facial expression extraction to get a general sense of sentiment, where possible
  • Analysis of the differences between Nike and Adidas, with results detailed in Brand-Analytics.pdf
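
To show how Tf-Idf can surface the terms that set one brand's captions apart from another's, here is a minimal sketch under stated assumptions: term frequency is count divided by document length, and inverse document frequency is the unsmoothed log(N / df). Libraries such as scikit-learn apply smoothed variants by default, so their values would differ. The function name `tfidf` and the captions are made up, and no lemmatization is applied.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using unsmoothed Tf-Idf."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


# Hypothetical captions, one per brand, invented purely for illustration.
captions = [
    "new running shoes drop today",
    "new football boots drop today",
]
weights = tfidf(captions)
# Terms with the highest weight in each caption are the ones unique to it.
top_terms = [max(w, key=w.get) for w in weights]
```

Words shared by every document, such as "new" here, receive a weight of zero, which is why the remaining high-weight terms are a reasonable input for word clouds or brand comparisons.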
