From JOSS review (suggestion, not requirement):
I've uploaded logs of my pytest runs (again, on Windows) here: https://gist.github.com/vaneseltine/d493a9741d8319b6bec7ddc94b19e112
1. I'm not sure what the issue with dataloading_tests is -- it fails 1 test on an assertion about receiving the correct files.
2. analyzer_tests looks like it's getting tripped up on exact float requirements, and so fails 3 tests on the score due to inequalities like 2.3758572130853186 != 2.375857213085318.
3. No problems with jurisdiction_prepper_tests.
Nothing too alarming, but I wanted to provide you these logs for reference. Item #2 (analyzer_tests) is probably another issue that boils down to a difference in platform/OS, but it's worth considering a test strategy that makes a "close enough" comparison. pytest.approx() is designed to do this, but it can't work on large mixed-type dicts like the ones being compared there, so it's not a drop-in solution for you.
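One possible way to get a "close enough" comparison on mixed-type dicts is a small recursive helper that applies a tolerance only to floats and compares everything else exactly. The sketch below is an illustration, not the project's actual test code: the helper name `approx_equal`, the tolerance defaults, and the example dict keys are all hypothetical. It relies only on `math.isclose` from the standard library.

```python
import math
from collections.abc import Mapping, Sequence


def approx_equal(actual, expected, rel_tol=1e-9, abs_tol=1e-12):
    """Hypothetical helper: compare nested structures, tolerating float noise."""
    # bool is a subclass of int, so handle it before the numeric branch.
    if isinstance(expected, bool) or isinstance(actual, bool):
        return actual is expected
    # If either side is a float, compare numerically with a tolerance.
    if isinstance(expected, float) or isinstance(actual, float):
        if not isinstance(actual, (int, float)):
            return False
        if not isinstance(expected, (int, float)):
            return False
        return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)
    # Dicts: same keys, and every value approximately equal.
    if isinstance(expected, Mapping):
        return (
            isinstance(actual, Mapping)
            and actual.keys() == expected.keys()
            and all(
                approx_equal(actual[k], expected[k], rel_tol, abs_tol)
                for k in expected
            )
        )
    # Lists and tuples (but not strings): same length, element-wise check.
    if isinstance(expected, Sequence) and not isinstance(expected, (str, bytes)):
        return (
            isinstance(actual, Sequence)
            and not isinstance(actual, (str, bytes))
            and len(actual) == len(expected)
            and all(
                approx_equal(a, e, rel_tol, abs_tol)
                for a, e in zip(actual, expected)
            )
        )
    # Everything else (str, int, None, ...) must match exactly.
    return actual == expected


# Example with made-up keys; the float values are the ones from the log.
got = {"name": "district_1", "score": 2.3758572130853186, "parts": [1, 0.5]}
want = {"name": "district_1", "score": 2.375857213085318, "parts": [1, 0.5]}
assert approx_equal(got, want)
```

In a pytest test this would be used as `assert approx_equal(result, expected)`. The trade-off is that a failing assertion only reports True/False rather than which key differed, so the helper may need extending to report the offending path.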