Takes hours to finish a 300 KB diff file #67
Comments
@escitalopram I did some debugging and the
I'll have a look
The problem seems to be triggered by large blocks of changes, like OASIS.csproj having 2.2k lines added and removed in one block. The algorithm is O(nm) time with n lines added and m lines removed in a single block, which here starts almost 5 million Levenshtein distance calculations, each of which is in turn O(op) time with o, p being the line lengths. I'd suggest we just disable the line matching on blocks larger than, say, n*m=2500 (and maybe make that limit configurable). The memory hunger will probably go away with that, too, because there is a cache for distance function results. If that isn't enough, I could also introduce a hash function for the cache keys.
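To make the suggestion concrete, here is a rough sketch of such a size guard. It is not diff2html's actual code: `pairLines`, `levenshtein`, and `MAX_MATCH_COMPARISONS` are hypothetical names, the greedy matcher is a simplified stand-in for the real line matching, and 2500 is just the limit floated above.

```typescript
// Hypothetical sketch, not diff2html's real API.
// Assumed limit on n * m (removed lines * added lines) per block.
const MAX_MATCH_COMPARISONS = 2500;

// Classic two-row Levenshtein distance: O(a.length * b.length) time.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return prev[b.length];
}

// A pair of [removed line, added line]; null marks a line with no partner.
type LinePair = [string | null, string | null];

function pairLines(
  removed: string[],
  added: string[],
  maxComparisons: number = MAX_MATCH_COMPARISONS
): LinePair[] {
  const pairs: LinePair[] = [];

  // Block too large: skip similarity matching and pair lines by position.
  if (removed.length * added.length > maxComparisons) {
    const len = Math.max(removed.length, added.length);
    for (let i = 0; i < len; i++) {
      pairs.push([
        i < removed.length ? removed[i] : null,
        i < added.length ? added[i] : null,
      ]);
    }
    return pairs;
  }

  // Small block: greedily pair each removed line with its closest unused added line.
  const used: boolean[] = added.map(() => false);
  for (const r of removed) {
    let best = -1;
    let bestDist = Infinity;
    for (let j = 0; j < added.length; j++) {
      if (used[j]) continue;
      const d = levenshtein(r, added[j]);
      if (d < bestDist) {
        bestDist = d;
        best = j;
      }
    }
    if (best >= 0) {
      used[best] = true;
      pairs.push([r, added[best]]);
    } else {
      pairs.push([r, null]);
    }
  }
  // Added lines that found no partner.
  for (let j = 0; j < added.length; j++) {
    if (!used[j]) pairs.push([null, added[j]]);
  }
  return pairs;
}
```

With a block of 2.2k removed and 2.2k added lines, n*m is about 4.84 million, so the guard takes the positional branch and no distance is ever computed. Passing the limit as a parameter is one way to keep it configurable.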
I think that is a great idea. Can you make a PR?
Which branch should I base it on?
master
Fixed by #68 in release 2.0.0-beta10
MOVED FROM diff2html-cli#17
Hi,
I really love the tool and am currently running it under Windows.
However, my git diff file is around 300 KB, and the tool takes 3 hours to finish without producing any output file (I am using the -F option). Memory usage is around 800 MB.
Just wondering whether you have encountered the same issue before?
I tried diffy.org with no issues at all.
https://diffy.org/diff/4wng00ndqz7iudi
Thanks.
Travis
diffReport.txt