Skip to content
This repository has been archived by the owner on Jun 27, 2023. It is now read-only.

Only rewrite output files if changes are needed. #604

Open
sodul opened this issue Dec 10, 2021 · 6 comments
Open

Only rewrite output files if changes are needed. #604

sodul opened this issue Dec 10, 2021 · 6 comments

Comments

@sodul
Copy link
Contributor

sodul commented Dec 10, 2021

Actual behavior
We have a fairly large project and running mockgen takes several minutes. During that time all of the generated mock files get deleted and rewritten. This causes 2 issues:

  • if we run the mocks in parallel mockgen can try to read the files while they are being rewritten, causing mockgen to fail as it was trying to read the file. We worked around this by targeting directories in parallel but run mockgen sequentially within a directory.
  • our IDEs see the files being modified and re-index the codebase over and over, making the IDE mostly unusable while mockgen is running.

Expected behavior
If the output file already exist and has the same content mockgen should not rewrite it and leave it untouched.

To Reproduce Steps to reproduce the behavior

Just run mockgen as usual.

Additional Information

  • gomock 1.6.0
  • golang 1.17.3
@codyoss
Copy link
Member

codyoss commented Dec 30, 2021

This seems like a reasonable feature request to me. I would be happy to accept a PR for such a change.

kozmod added a commit to kozmod/mock that referenced this issue Feb 8, 2022
sodul added a commit to clumio-oss/mock that referenced this issue Feb 22, 2022
When running mockgen with the output option this checks if the existing file content already exists and skips writing if there is nothing to change.

This will help reduce i/o when changing lots of files, but also reduce the re-indexing triggering in IDEs.

Solves golang#604
sodul added a commit to clumio-oss/mock that referenced this issue Feb 22, 2022
When running mockgen with the output option this checks if the existing file content already exists and skips writing if there is nothing to change.

This will help reduce i/o when changing lots of files, but also reduce the re-indexing triggering in IDEs.

Solves golang#604
sodul added a commit to clumio-oss/mock that referenced this issue Feb 22, 2022
When running mockgen with the output option this checks if the existing file content already exists and skips writing if there is nothing to change.

This will help reduce i/o when changing lots of files, but also reduce the re-indexing triggering in IDEs.

Solves golang#604
sodul added a commit to clumio-oss/mock that referenced this issue Feb 23, 2022
When running mockgen with the output option this checks if the existing file content already exists and skips writing if there is nothing to change.

This will help reduce i/o when changing lots of files, but also reduce the re-indexing triggering in IDEs.

Solves golang#604
@sodul
Copy link
Contributor Author

sodul commented Mar 4, 2022

@codyoss gentle ping. There are 2 potential PRs, small ones, that can address this issue.

FYI in our situation we have dozens, if not hundreds of generate mock files. So this change really help with the machines load since it reduce the re-indexing from IDEs, and reduces the background antivirus workload as well.

@sodul
Copy link
Contributor Author

sodul commented Mar 28, 2022

@codyoss @sanposhiho gentle ping. I'm not sure who have merge permission rights on this repository.

sodul added a commit to clumio-oss/mock that referenced this issue Aug 11, 2022
When running mockgen with the output option this checks if the existing file content already exists and skips writing if there is nothing to change.

This will help reduce i/o when changing lots of files, but also reduce the re-indexing triggering in IDEs.

Solves golang#604
codyoss pushed a commit that referenced this issue Aug 12, 2022
When running mockgen with the output option this checks if the existing file content already exists and skips writing if there is nothing to change.

This will help reduce i/o when changing lots of files, but also reduce the re-indexing triggering in IDEs.

Solves #604
@sodul
Copy link
Contributor Author

sodul commented Sep 16, 2022

This was merged last month and should be in the final 1.7 release. @codyoss do you have a timeline for that?

@amarjeetanandsingh
Copy link

Currently we are generating the mock and then compare if the generated content is already in the mock file. We are saving only the write efforts and it’s implications and not generation effort. I think there is a scope of improvement. I understand there will be challenges but can we consider below 2 approaches for optimisation?

  • We can store the last modified UTC time of the actual file inside the generated file and use it to compare it on next mockgen trigger.
  • Another approach can be to store a fast hash of the actual file content inside the mock file and compare it on next mockgen trigger

@codyoss @sodul
If considerable, I’ll be happy to implement.

@sodul
Copy link
Contributor Author

sodul commented Mar 28, 2023

@amarjeetanandsingh I'm not an official maintainer, I just contributed a PR here.

That said something to consider about adding timestamps inside files is that this makes them mutable which breaks build reproducibility and breaks tools such as Bazel which expect that same input results in identical output. Timestamps, or git hashes are really bringing unwanted side effects. Furthermore our own internal CI pipeline run mockgen on every PR and file the pipeline if we detect a diff, so we definitely would not want to introduce 'random' data (time is random) in the generated files.

While a hash of the content has the advantage of being more consistent when re-run, I'm not sure that would help much. You want a way to uniquely identify the input data, which would take a while to compute, by that point generating the output in memory is probably cheap to do. The current code will now check if the content of the generated file is different before reading it do do a in memory diff, so there is already an optimization on writing here.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants