"Our corpus is tweets."



Project Intro/Objective

See the Wiki! This project is a part of the Data Science Working Group at Code for San Francisco. Other DSWG projects can be found at the main GitHub repo.


Please refer to this article for how these folders should work together.

The "/main" folder is for production code and has four subfolders:

  • /data
  • /code
  • /pipeline
  • /output

Use the "/sandbox" folder for experiments and exploratory work. The "/outreach" folder is for organizing materials used to produce presentations.
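
As a rough illustration of the layout described above, the folders could be created locally with a short script. This is only a sketch: the `scaffold` helper is hypothetical and not part of the project, and the `hold.keep` placeholder file name is an assumption (Git does not track empty directories, so some placeholder is needed for the folders to show up).

```python
from pathlib import Path

# Folder names come from the README; everything else here is illustrative.
MAIN_SUBFOLDERS = ("data", "code", "pipeline", "output")
TOP_LEVEL_FOLDERS = ("sandbox", "outreach")


def scaffold(root):
    """Create the README's folder layout under `root` and return the paths."""
    root = Path(root)
    folders = [root / "main" / name for name in MAIN_SUBFOLDERS]
    folders += [root / name for name in TOP_LEVEL_FOLDERS]
    for folder in folders:
        # parents=True also creates "main"; exist_ok=True makes reruns safe.
        folder.mkdir(parents=True, exist_ok=True)
        # Assumed placeholder so Git keeps the otherwise empty folder.
        (folder / "hold.keep").touch()
    return folders
```

Calling `scaffold(".")` from the repository root would create any missing folders and leave existing ones untouched.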

-- Project Status: [In Discovery]

Methods Used


  • Python
  • spaCy
  • scikit-learn
  • gensim


Contributing NLTweets Members

Name            Slack Handle
Daniel Zou      @daniel.zou
Josh Freivogel  @Josh Freivogel
Nathan Chau     @Nathan Chau


  • If you haven't joined the SF Brigade Slack, you can do that here.
  • Our Slack channel is #nltweets
  • Feel free to contact team leads with any questions or if you are interested in contributing!