Fast, simple identification of codeswitching in Tweets and other short messages.

Codeswitchador was developed as part of the SCALE 2012 summer workshop at the Johns Hopkins Human Language Technology Center of Excellence.

Runnable shell scripts

  • qsub-able wrapper for
  • qsub-able wrapper for

Runnable Python scripts

  • A sample of using the codeswitchador API.
  • Evaluate the performance of codeswitching models 0, 1.0, 1.5.
  • Create a frequency ratio list from two wordlists.
  • Create a wordlist from a corpus.
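As an illustration of what a frequency ratio list built from two wordlists might look like, here is a minimal sketch. It is not the script shipped in this repository: the function name `frequency_ratios`, the add-one smoothing, and the toy counts are all assumptions made for the example.

```python
from collections import Counter


def frequency_ratios(counts_a, counts_b, smoothing=1.0):
    """Return {word: ratio} for every word seen in either wordlist.

    A ratio above 1 means the word is relatively more frequent in list A,
    below 1 means it is relatively more frequent in list B.
    NOTE: hypothetical helper; the smoothing scheme is an assumption.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    vocab = set(counts_a) | set(counts_b)
    vocab_size = len(vocab)

    ratios = {}
    for word in vocab:
        # Smoothed relative frequency in each list, so that words missing
        # from one list do not cause a division by zero.
        p_a = (counts_a.get(word, 0) + smoothing) / (total_a + smoothing * vocab_size)
        p_b = (counts_b.get(word, 0) + smoothing) / (total_b + smoothing * vocab_size)
        ratios[word] = p_a / p_b
    return ratios


# Toy wordlists (word -> count); real lists would come from large corpora.
english = Counter({"the": 50, "no": 10, "house": 5})
spanish = Counter({"la": 40, "no": 20, "casa": 5})

ratios = frequency_ratios(english, spanish)
for word, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}\t{ratio:.3f}")
```

Words shared by both lists, such as "no" here, end up with ratios near 1, which is what makes a ratio list more useful than two separate wordlists for telling the languages apart.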


  • Support for codeswitching detection.
  • Constants used by many files.
  • Wordlists and paths used by idiotLID and wordlist-based models.
  • Support for reading from common SCALE file formats (e.g., Jerboa output).
  • Wordlist-based LID/CS models.
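To make the idea of a wordlist-based LID/CS model concrete, the sketch below labels each token by looking it up in a frequency ratio table and flags a message as codeswitched when both languages are represented. This is an assumed simplification, not the codeswitchador API: the names `label_tokens` and `is_codeswitched`, the threshold of 2.0, and the toy ratios are hypothetical.

```python
import re


def label_tokens(text, ratios, threshold=2.0):
    """Label each token 'A', 'B', or None (unknown or ambiguous).

    `ratios` maps word -> frequency ratio (language A over language B).
    NOTE: hypothetical helper; the threshold value is an assumption.
    """
    labels = []
    for token in re.findall(r"\w+", text.lower()):
        ratio = ratios.get(token)
        if ratio is None:
            labels.append((token, None))   # not in the wordlists
        elif ratio >= threshold:
            labels.append((token, "A"))    # clearly language A
        elif ratio <= 1.0 / threshold:
            labels.append((token, "B"))    # clearly language B
        else:
            labels.append((token, None))   # too ambiguous to call
    return labels


def is_codeswitched(text, ratios, threshold=2.0, min_tokens=1):
    """Return True if at least `min_tokens` tokens are found for each language."""
    labels = [label for _, label in label_tokens(text, ratios, threshold)]
    return labels.count("A") >= min_tokens and labels.count("B") >= min_tokens


# Toy ratio table: A = English, B = Spanish.
toy_ratios = {"the": 20.0, "house": 8.0, "la": 0.05, "casa": 0.1, "no": 0.9}

print(is_codeswitched("I love la casa and the house", toy_ratios))
print(is_codeswitched("the house", toy_ratios))
```

Raising `min_tokens` trades recall for precision: requiring two or more confident tokens per language reduces false alarms caused by a single borrowed word or proper name.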

The Basics

The most common things you'll need to do:

  1. Create wordlists. See:
    • tools/
    • tools/
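For orientation, creating a wordlist from a corpus can be sketched roughly as follows. This is not the tool referenced above; `build_wordlist`, the letters-only tokenizer, and the two-line corpus are assumptions for illustration. Real Tweets would also need handling for URLs, hashtags, and @-mentions.

```python
import re
from collections import Counter

# Letters only: drops digits, underscores, and punctuation (an assumed choice).
TOKEN_RE = re.compile(r"[^\W\d_]+")


def build_wordlist(lines, min_count=1):
    """Count lowercased word tokens across `lines`, keeping frequent ones.

    NOTE: hypothetical helper, not part of the repository's tools.
    """
    counts = Counter()
    for line in lines:
        counts.update(TOKEN_RE.findall(line.lower()))
    return Counter({w: c for w, c in counts.items() if c >= min_count})


# Toy corpus; in practice `lines` could be an open file object.
corpus = ["La casa es grande", "la casa no es pequeña"]

wordlist = build_wordlist(corpus)
for word, count in wordlist.most_common():
    print(f"{word}\t{count}")
```

Because the function accepts any iterable of strings, the same code works on a list in memory or on a large file read line by line.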

TODO: More things here!


License

Codeswitchador is distributed under the Simplified BSD License. See LICENSE for more information.
