Fast, simple identification of codeswitching in Tweets and other short messages.

Codeswitchador was developed as part of the SCALE 2012 summer workshop at the Johns Hopkins Human Language Technology Center of Excellence.

Runnable shell scripts

  • qsub-able wrapper for
  • qsub-able wrapper for

Runnable Python scripts

  • A sample of using the codeswitchador API.
  • Evaluate the performance of codeswitching models 0, 1.0, 1.5.
  • Create a frequency ratio list from two wordlists.
  • Create a wordlist from a corpus.
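
To make the last two items concrete, here is a minimal sketch of what building a wordlist from a corpus and deriving a frequency ratio list from two wordlists could look like. It is not the code of the scripts listed above: the function names `build_wordlist` and `frequency_ratios`, the tokenizer regex, and the add-one smoothing are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical tokenizer: runs of Unicode letters only (an assumption,
# not necessarily how Codeswitchador tokenizes tweets).
TOKEN_RE = re.compile(r"[^\W\d_]+", re.UNICODE)


def build_wordlist(lines):
    """Count lowercased word tokens over an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(TOKEN_RE.findall(line.lower()))
    return counts


def frequency_ratios(counts_a, counts_b, smoothing=1.0):
    """Map each word to its smoothed relative frequency in A divided by that in B.

    Ratios above 1 mean the word is relatively more frequent in A;
    ratios below 1 mean it is relatively more frequent in B.
    """
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + smoothing * len(vocab)
    total_b = sum(counts_b.values()) + smoothing * len(vocab)
    return {
        word: ((counts_a[word] + smoothing) / total_a)
        / ((counts_b[word] + smoothing) / total_b)
        for word in vocab
    }


# Toy corpora, invented for this example.
english = build_wordlist(["the cat sat on the mat", "the dog ran"])
spanish = build_wordlist(["el gato se sentó", "el perro corrió"])
ratios = frequency_ratios(english, spanish)
```

In this sketch, words with a ratio well above or well below 1 are the ones that discriminate between the two languages; the smoothing keeps words seen in only one corpus from producing a division by zero.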


  • Support for codeswitching detection.
  • Constants used by many files.
  • Wordlists and paths used by idiotLID and wordlist-based models.
  • Support for reading from common SCALE file formats (e.g., Jerboa output).
  • Wordlist-based LID/CS models.
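
As a rough illustration of the general idea behind a wordlist-based LID/CS model, the sketch below tags each token with a language when it appears in exactly one wordlist, then flags a message as codeswitched when at least two languages are represented. This is an assumption-laden toy, not the codeswitchador API and not any of models 0, 1.0, or 1.5: the names `tag_tokens`, `is_codeswitched`, `min_per_lang`, and the `WORDLISTS` contents are all hypothetical.

```python
def tag_tokens(tokens, wordlists):
    """Label each token with the one language whose wordlist contains it.

    Tokens found in no wordlist, or in more than one, are tagged None.
    """
    tags = []
    for token in tokens:
        matches = [
            lang for lang, words in wordlists.items() if token.lower() in words
        ]
        tags.append(matches[0] if len(matches) == 1 else None)
    return tags


def is_codeswitched(tokens, wordlists, min_per_lang=1):
    """Return True if at least two languages each have min_per_lang tagged tokens."""
    tags = tag_tokens(tokens, wordlists)
    hits = {lang: tags.count(lang) for lang in wordlists}
    return sum(1 for count in hits.values() if count >= min_per_lang) >= 2


# Tiny hypothetical wordlists; real ones would be built from corpora.
WORDLISTS = {
    "en": {"the", "is", "very", "today"},
    "es": {"el", "está", "muy", "hoy"},
}

mixed = "the weather está muy nice today".split()
plain = "the weather is nice".split()
```

With these toy wordlists, `is_codeswitched(mixed, WORDLISTS)` is true and `is_codeswitched(plain, WORDLISTS)` is false. Raising `min_per_lang` trades recall for precision, since a single ambiguous or borrowed word no longer triggers a detection.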

The Basics

The most common things you'll need to do:

  1. Create wordlists. See:
    • tools/
    • tools/

TODO: More things here!


Codeswitchador is distributed under the Simplified BSD License. See LICENSE for more information.