We currently use a naive strategy that compares output line by line, so when a single line is added or removed, every line after it is reported as a mismatch and a silly number of mismatching lines gets printed to stdout. It would be better to use a proper diff algorithm, for example from this crate. As it stands, we often end up diffing the files manually instead. That isn't the end of the world, but it's annoying when the failure is specifically on CI or something similar.
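To make the difference concrete, here is a minimal sketch that contrasts a position-by-position comparison with a textbook longest-common-subsequence (LCS) diff. It uses only the standard library and is not the API of the crate mentioned above. The names `naive_mismatches`, `lcs_diff`, and `DiffLine` are hypothetical, and the naive function is only an assumption about what the current comparison roughly does.

```rust
/// One entry in a line-oriented diff (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum DiffLine<'a> {
    Same(&'a str),
    Removed(&'a str), // present only in `expected`
    Added(&'a str),   // present only in `actual`
}

/// Naive strategy (assumed): compare line i of `expected` with line i of
/// `actual` and count the positions that differ.
fn naive_mismatches(expected: &str, actual: &str) -> usize {
    let a: Vec<&str> = expected.lines().collect();
    let b: Vec<&str> = actual.lines().collect();
    let common = a.len().min(b.len());
    let differing = (0..common).filter(|&i| a[i] != b[i]).count();
    // Lines past the end of the shorter input have no counterpart.
    differing + (a.len().max(b.len()) - common)
}

/// LCS-based diff: lines on a longest common subsequence are `Same`;
/// everything else is reported as removed or added.
fn lcs_diff<'a>(expected: &'a str, actual: &'a str) -> Vec<DiffLine<'a>> {
    let a: Vec<&str> = expected.lines().collect();
    let b: Vec<&str> = actual.lines().collect();

    // lcs[i][j] = length of the LCS of a[i..] and b[j..].
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk the table from the start, emitting one entry per step.
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            out.push(DiffLine::Same(a[i]));
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            out.push(DiffLine::Removed(a[i]));
            i += 1;
        } else {
            out.push(DiffLine::Added(b[j]));
            j += 1;
        }
    }
    // Whatever remains on either side has no match.
    out.extend(a[i..].iter().map(|&l| DiffLine::Removed(l)));
    out.extend(b[j..].iter().map(|&l| DiffLine::Added(l)));
    out
}
```

For an expected output of `a`, `b`, `c`, `d` and an actual output of `a`, `x`, `b`, `c`, `d`, `naive_mismatches` counts 4 mismatching lines, while `lcs_diff` reports a single `Added("x")` with the other four lines marked `Same`. The table costs O(n·m) time and memory, so a dedicated diff crate is likely the better choice for large outputs.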