Add option to run coala only on changed files #1991

Closed
abhay-raizada opened this issue Apr 1, 2016 · 11 comments

Comments

@abhay-raizada
Member

Right now coala runs project-wide. I think it would be a neat idea for projects already using coala to run the analysis only on files that have changed, since that would save a lot of redundant work.

@gitmate-bot
Collaborator

Thanks for reporting this issue!

Your aid is required, fellow coalaian. Help us triage and solve this issue!

CC @sils1297, @AbdealiJK

@adtac
Member

adtac commented Apr 3, 2016

I think we can do this using the last modified tag on the files (and directories). At least that's how git does it. But then we need to keep track of the last time coala was run to compute the list of changed files.

We could improve it further by keeping a hash of each file, in addition to the last modified tag, and only running coala on files whose hash differs. But this would slow down coala unnecessarily, and the case it covers is rare: it only happens when you edit and save a file, then revert the change and save again.

But at that point we are just reinventing git :P
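The last-modified idea above could be sketched roughly as follows. This is only an illustration, not coala's implementation: the function name `files_modified_since` and the `last_run` parameter are made up for the example, and it assumes the last run time is available as an epoch timestamp.

```python
import os


def files_modified_since(paths, last_run):
    """Return the paths whose modification time is newer than last_run.

    last_run is an epoch timestamp in seconds (hypothetical: where it
    is stored is left open). Paths that no longer exist are skipped.
    """
    changed = []
    for path in paths:
        try:
            mtime = os.path.getmtime(path)
        except OSError:
            # The file was deleted or is unreadable, nothing to analyse.
            continue
        if mtime > last_run:
            changed.append(path)
    return changed
```

Only the files returned here would need to be analysed again. A file that is saved and then reverted still shows up as changed, which is the rare case the hash would catch.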

@sils
Member

sils commented Apr 3, 2016

Ahm, isn't this a dupe of #18?

@adtac
Member

adtac commented Apr 3, 2016

Two points:

  • What did you mean by cache? I didn't fully understand what you wanted to cache.
  • Also live reload might be annoying: imagine writing something and saving it. Suddenly the file gets altered by coala - this might mess with the concentration.

@abhay-raizada
Member Author

@hypothesist yes, reinventing git is not what we want. But how about just running coala on `git diff HEAD` or `git diff HEAD^`? We could have a starting point, something like coala-commit, and from there run the specified bears only on the changed files (changed since the previous commit, at least). This would be really helpful for projects that make sure all files are coala compliant, since they wouldn't have to run coala on the entire project.
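A rough sketch of collecting the changed files from git, assuming git is on the PATH and the project is a git repository. The helper names `parse_name_only` and `changed_files` are hypothetical and not part of coala; `--name-only` is the standard `git diff` option that prints just the paths.

```python
import subprocess


def parse_name_only(output):
    """Turn the output of `git diff --name-only` into a list of paths."""
    return [line for line in output.splitlines() if line.strip()]


def changed_files(rev="HEAD", cwd=None):
    """List files that differ between the working tree and rev.

    Requires git on the PATH and cwd (or the current directory) to be
    inside a repository; raises CalledProcessError otherwise.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", rev],
        cwd=cwd,
        check=True,
        stdout=subprocess.PIPE,
        text=True,
    )
    return parse_name_only(result.stdout)
```

Calling `changed_files("HEAD^")` would also include what changed in the last commit. Note that git quotes paths containing unusual characters unless `-z` is used, which this sketch does not handle.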

@Makman2
Member

Makman2 commented Apr 3, 2016

This would be easier to implement once the next-gen-bear design is there (which processes tasks inside a thread pool), as we can filter tasks depending on the parallelization level of the bear. For that purpose we could even introduce linewise parallelization (that would improve performance even more; editor plugins especially would work much faster). The only thing still to design is the "cache" :)

@abhsag24 I'm not sure we want to force the user to use git to make use of this feature^^ Maybe we can use some of git's functionality without having an actual repo?

@Makman2 Makman2 modified the milestone: 0.5.0 Apr 7, 2016
@Makman2 Makman2 added this to the 0.6.0 milestone Apr 13, 2016
@adtac
Member

adtac commented Apr 13, 2016

I'm working on how to implement this and I can't think of a way without storing the last time coala was run in a file on disk. My current idea is to store it in a file after every run. At every run, get the list of files to run coala against and run coala on a file only if it has been modified after the last run time. But this would mean another file. I can think of two solutions for this:

  • either store the file in the project directory - so this would make it two coala-related files per project
  • store all the last run information somewhere else, like ~/.config/coala/, as JSON such as {"/path/to/project1/coafile/": 1460542013, "/path/to/project2/coafile/": 1460542007}, where the key is the path to the project and the value is the epoch time of the last run.

I'm personally leaning towards the second option as it seems much cleaner. Also a major drawback with the first one is that when two people are working from the same starting point, and one finishes first and pushes, the second person would have his last run date overwritten. We could prevent this by adding that file to .gitignore but it doesn't feel right. Let me know what you think. Open to any other solution too :)
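For illustration, the second option could look something like the sketch below. The function names and the `store_path` parameter are hypothetical; the JSON layout (project path as key, epoch time as value) follows the example above.

```python
import json
import os
import time


def load_last_runs(store_path):
    """Read the {project_path: epoch_seconds} mapping, or {} if absent."""
    if not os.path.exists(store_path):
        return {}
    with open(store_path) as f:
        return json.load(f)


def record_run(store_path, project_path, now=None):
    """Store the time of this run for project_path and return the mapping.

    now defaults to the current time; it is a parameter only so the
    sketch can be exercised with fixed timestamps.
    """
    runs = load_last_runs(store_path)
    runs[project_path] = int(time.time() if now is None else now)
    os.makedirs(os.path.dirname(store_path) or ".", exist_ok=True)
    with open(store_path, "w") as f:
        json.dump(runs, f)
    return runs
```

Because the store lives outside the project directory, nothing needs to be added to .gitignore and two people working on the same project never overwrite each other's last run time.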

@sils
Member

sils commented Apr 13, 2016

I'd rather pickle the data than use JSON. I like the idea of storing it in the config directory; use appdirs for this, though.
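A minimal sketch of the pickle variant, assuming the cache directory is passed in by the caller. With appdirs it could come from something like `appdirs.user_data_dir("coala")`, where the app name is an assumption. `save_cache` and `load_cache` are hypothetical names.

```python
import os
import pickle


def cache_path(cache_dir, name):
    """Build the path of the pickle file for the given cache name."""
    return os.path.join(cache_dir, name + ".pickle")


def save_cache(cache_dir, name, data):
    """Pickle data into cache_dir, creating the directory if needed."""
    os.makedirs(cache_dir, exist_ok=True)
    path = cache_path(cache_dir, name)
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return path


def load_cache(cache_dir, name):
    """Return the unpickled data, or {} if the file is missing or corrupt."""
    try:
        with open(cache_path(cache_dir, name), "rb") as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        return {}
```

Pickle files should only be loaded from locations the user controls, since unpickling untrusted data can execute arbitrary code.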

@adtac
Member

adtac commented Apr 13, 2016

@sils1297 why pickle the data? What advantage would that give? It would store it in a less readable format (in case the user wants to manually change the last run date or something).

Also agree with using appdirs. I just used ~/.config/coala as an example ^^

@sils
Member

sils commented Apr 13, 2016 via email

@Makman2
Member

Makman2 commented Apr 13, 2016

Where's the problem with using hashes? Just hash every file and store that value; if the hash changed, run the analysis^^
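The hash approach might be sketched like this. SHA-256 and the names `file_hash` and `changed_by_hash` are choices made for the example, not something decided in this thread.

```python
import hashlib


def file_hash(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_by_hash(paths, stored_hashes):
    """Compare current file hashes with the stored ones.

    Returns (changed, new_hashes): the paths whose hash is new or
    different, and the mapping to store for the next run.
    """
    new_hashes = {}
    changed = []
    for path in paths:
        current = file_hash(path)
        new_hashes[path] = current
        if stored_hashes.get(path) != current:
            changed.append(path)
    return changed, new_hashes
```

Every file still has to be read on each run to compute its hash, which is the cost mentioned earlier in the thread, but a file that was edited and then reverted is correctly treated as unchanged.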

adtac added a commit that referenced this issue Apr 16, 2016
With `--caching` the user can run coala only on those files that
had changed since the last time coala was run. This should improve
the running time of coala.

Fixes #1991
adtac added a commit that referenced this issue Apr 23, 2016
@sils sils added this to the 0.7 milestone Apr 25, 2016
adtac added a commit that referenced this issue May 23, 2016
adtac added a commit that referenced this issue May 25, 2016
sils pushed a commit that referenced this issue May 27, 2016
adtac added a commit that referenced this issue Jun 1, 2016
adtac added a commit that referenced this issue Jun 10, 2016
@rultor rultor closed this as completed in 91c109d Jun 12, 2016