Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad space/time complexity on non-trivial files #26

Closed
GoogleCodeExporter opened this issue Mar 27, 2015 · 5 comments
Closed

Bad space/time complexity on non-trivial files #26

GoogleCodeExporter opened this issue Mar 27, 2015 · 5 comments

Comments

@GoogleCodeExporter
Copy link

What steps will reproduce the problem?
1. Download the attached sample files (two whitespace-simplified versions
of a generated source file in the Hadoop project).
2. Use the library to compute the diff. The diff timeout would have to be
increased significantly or set to 0.0f.

What is the expected output? What do you see instead?
GNU diff computes a diff with about 1260 edit steps in 0.125s on my
machine. diff-match-patch with the diff timeout removed fails to terminate
in both its C++ and Java versions, consuming all available system memory.

What version of the product are you using? On what operating system?
Latest svn trunk on GNU/Linux amd64.

Please provide any additional information below.
The attached files are just an extreme example; I have also found it
infeasible to compute a diff between two files of about 2000 lines each in
Java with a 2G heap. The timeout prevents all memory from being used, but
results in a trivial "delete A, insert B" diff, which is not useful.

Original issue reported on code.google.com by pavgusti...@gmail.com on 20 Jan 2010 at 12:03

Attachments:

@GoogleCodeExporter
Copy link
Author

I've also encountered this problem. Are there factors which I can change to 
increase performance?

Original comment by Nathanae...@gmail.com on 3 Feb 2010 at 5:19

  • Added labels: ****
  • Removed labels: ****

@GoogleCodeExporter
Copy link
Author

Just released a new version.  The Java and Python versions have been tripled in 
speed, and the JavaScript version has been doubled in speed (for Chrome).

Performance improvements are ongoing.  The C++ version is known to be extremely 
bad due to its use of the QT library and complete lack of pointer math.

Original comment by neil.fra...@gmail.com on 26 Aug 2010 at 4:10

  • Changed state: Started
  • Added labels: ****
  • Removed labels: ****

@GoogleCodeExporter
Copy link
Author

Issue 20 has been merged into this issue.

Original comment by neil.fra...@gmail.com on 26 Aug 2010 at 4:12

  • Added labels: ****
  • Removed labels: ****

@GoogleCodeExporter
Copy link
Author

Issue 29 has been merged into this issue.

Original comment by neil.fra...@gmail.com on 26 Aug 2010 at 4:40

  • Added labels: ****
  • Removed labels: ****

@GoogleCodeExporter
Copy link
Author

A huge update landed today that should resolve all the out of memory errors and 
greatly improve performance.  Take a look.

Over the past year and a half the Java and C# versions have improved in speed 
by more than a factor of twenty.

I'm going to close this bug, because although speed improvements will continue 
to be made, I've certainly addressed the examples in this bug and the ones 
merged with it.

Original comment by neil.fra...@gmail.com on 17 Dec 2010 at 4:52

  • Changed state: Fixed
  • Added labels: Performance
  • Removed labels: ****

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant