Crashsimilarity is a tool for clustering crash reports that differ slightly but describe the same problem. It supports crashes reported to the Mozilla Socorro crash reporting system.
It uses doc2vec for word embedding and Word Mover's Distance (WMD) as the distance metric between crash reports.
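To give a feel for the distance step, here is a minimal sketch of relaxed WMD, a cheaper lower bound on WMD, computed over hand-made toy embeddings. It is not Crashsimilarity's actual implementation: the function names (`nbow`, `one_sided`, `relaxed_wmd`), the stack-frame tokens, and the 2-D vectors are all hypothetical, and in practice the vectors would come from a trained embedding model rather than a hard-coded dictionary.

```python
import math
from collections import Counter

# Hypothetical 2-D embeddings for stack-frame tokens (illustration only).
TOY_EMBEDDINGS = {
    "malloc": (0.0, 0.0),
    "calloc": (0.1, 0.0),
    "gc_mark": (1.0, 1.0),
    "gc_sweep": (1.0, 1.1),
    "paint": (5.0, 5.0),
}


def nbow(tokens):
    """Normalized bag-of-words: token -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def one_sided(src, dst, emb):
    """Move each source token's weight to its nearest destination token."""
    targets = set(dst)
    return sum(
        weight * min(euclidean(emb[tok], emb[other]) for other in targets)
        for tok, weight in nbow(src).items()
    )


def relaxed_wmd(a, b, emb):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    return max(one_sided(a, b, emb), one_sided(b, a, emb))


trace_a = ["malloc", "gc_mark"]
trace_b = ["calloc", "gc_sweep"]   # different frames, nearby embeddings
trace_c = ["paint", "paint"]       # unrelated frames

near = relaxed_wmd(trace_a, trace_b, TOY_EMBEDDINGS)
far = relaxed_wmd(trace_a, trace_c, TOY_EMBEDDINGS)
print(near < far)
```

Two traces that share no tokens can still come out close when their tokens sit near each other in the embedding space, which is the property that makes this family of distances useful for grouping slightly different reports of the same crash.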
How to run
For now, only a command-line interface is supported. Have a look at the cli directory.
    # Run all tests
    py.test tests

    # Run a specific test
    py.test tests/test_whatever.py