bharatvem/TED-ANLP

Introduction

It is a human tendency to label the things we encounter. The internet, along with its other advantages, has made data abundant and widely available. The larger the data gets, the greater the need to divide it into smaller chunks so that it can be accessed and used more effectively. Continuously trained machine learning models offer a way to label content automatically. The goal is to design a model that trains on a corpus of text files to generate a finite bag of words, which can then be used to predict a label for an unknown/unlabelled text document. For this project we designed and built a topic prediction model that predicts topic labels for TED Talks.

Methods used:

  1. LDA Model
  2. TF-IDF Weight Ranking Model
  3. k-NN Model
  4. Word2Vec Model
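
As a rough illustration of the TF-IDF weight ranking idea listed above, the sketch below scores each word in a document by its term frequency multiplied by its inverse document frequency, then keeps the highest-scoring words as candidate topic labels. It is a minimal standard-library sketch, not the code from this repository: the function names `tokenize` and `tfidf_top_terms`, the parameter `k`, and the toy strings are all hypothetical, and real transcripts would also need stop-word removal and more careful tokenization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_top_terms(docs, k=3):
    """Return the k highest TF-IDF terms for each document (illustrative only)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    ranked = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # TF-IDF = (term count / document length) * log(N / document frequency).
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending score; break ties alphabetically for stable output.
        top = sorted(scores, key=lambda term: (-scores[term], term))[:k]
        ranked.append(top)
    return ranked


# Toy stand-ins for transcripts; these are assumptions, not TED data.
toy_docs = [
    "the brain learns and the brain adapts",
    "the ocean runs deep and the ocean stays cold",
    "the city grows and the city changes",
]
top_terms = tfidf_top_terms(toy_docs, k=2)
```

Words that occur in every document, such as "the" and "and" in the toy strings, receive a score of zero because their inverse document frequency is log(1), so they never outrank words that are distinctive to a single document.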

About

Perform topic modelling on the transcripts of the TED Talks
