Create Pipeline to Classify Tweets by Topic #77

Open
pahdo opened this issue Mar 22, 2019 · 0 comments
pahdo (Collaborator) commented Mar 22, 2019

Key topics

  • Machine learning
  • Human-in-the-loop
  • Predictive modeling

Objective
We want to develop a pipeline that tags tweets by topic. In this project, a topic is something that Muni riders are talking about, such as bus bunching or bike safety. If we can tag tweets by topic automatically, we will have a data-backed understanding of what riders care about at a given moment.
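
To make the end goal concrete, here is a rough sketch of how tagged tweets could be rolled up into a per-day view of what riders are talking about. The `TAGGED` records and the `topics_by_day` helper are made up for illustration; the real tags would come from the pipeline once it exists.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical (timestamp, topic) pairs standing in for pipeline output.
TAGGED = [
    ("2019-03-22T08:15:00", "bus bunching"),
    ("2019-03-22T08:40:00", "bus bunching"),
    ("2019-03-22T17:05:00", "bike safety"),
    ("2019-03-23T09:30:00", "bike safety"),
]


def topics_by_day(tagged):
    """Count how many tweets were tagged with each topic on each calendar day."""
    counts = defaultdict(Counter)
    for timestamp, topic in tagged:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        counts[day][topic] += 1
    return dict(counts)


daily = topics_by_day(TAGGED)
for day in sorted(daily):
    # most_common() lists the day's topics from most to least mentioned.
    print(day, daily[day].most_common())
```

Grouping by day is only one possible choice; a shorter window may fit better if "a given moment" should mean hours rather than days.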

First steps
We have an initial set of manually labeled tweets located in this Google Sheet. Also, it may be valuable to look at the ClassificationExperiment CodeLab, located here.

Useful tools
Classification - scikit-learn
Text preprocessing - spaCy
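
As a possible starting point, here is a minimal sketch of a topic classifier built with scikit-learn. Everything in it is an assumption for illustration: the example tweets and labels are invented (the real ones are in the Google Sheet), TF-IDF plus logistic regression is just one reasonable baseline, and `needs_review` with its 0.6 threshold is a hypothetical helper for the human-in-the-loop part. Text preprocessing here relies on the vectorizer's built-in tokenizer rather than spaCy.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Invented examples; replace with the manually labeled tweets.
TWEETS = [
    "Three 38 buses showed up at once after a 25 minute wait",
    "Why do the N trains always arrive bunched together",
    "Two 14 buses back to back and then nothing for half an hour",
    "Bunched buses again on Mission, three in a row",
    "Car parked in the bike lane on Market again",
    "Almost got doored riding in the bike lane on Valencia",
    "We need protected bike lanes on Folsom",
    "Bike lane blocked by a delivery truck, not safe for cyclists",
]
LABELS = ["bus bunching"] * 4 + ["bike safety"] * 4


def build_topic_classifier():
    """Baseline: word and bigram TF-IDF features into logistic regression."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


def needs_review(model, texts, threshold=0.6):
    """Return the texts whose top predicted probability is below threshold.

    These are candidates to send back to a person for manual labeling.
    """
    confidence = model.predict_proba(texts).max(axis=1)
    return [text for text, c in zip(texts, confidence) if c < threshold]


model = build_topic_classifier()
model.fit(TWEETS, LABELS)

new_tweets = ["Bike lane blocked by cars again", "Great weather today"]
predictions = model.predict(new_tweets)
for tweet, topic in zip(new_tweets, predictions):
    print(f"{topic}: {tweet}")
print("Send for manual review:", needs_review(model, new_tweets))
```

With only a handful of examples the predictions are not meaningful; the point is the shape of the pipeline. A real run would hold out part of the labeled set for evaluation, and tweets can carry more than one topic, which this single-label sketch does not handle.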

nathanhc self-assigned this Mar 27, 2019