Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coala: Add file caching and remove Tagging #2016

Merged
merged 5 commits into from Jun 12, 2016
Merged

coala: Add file caching and remove Tagging #2016

merged 5 commits into from Jun 12, 2016

Conversation

adtac
Copy link
Member

@adtac adtac commented Apr 14, 2016

With --caching the user can run coala only on those files that
had changed since the last time coala was run. This should improve
the running time of coala.

@adtac adtac force-pushed the hypothesist/wip branch 2 times, most recently from 6d68870 to bac2c36 Compare April 14, 2016 19:30
@sils
Copy link
Member

sils commented Apr 14, 2016

@fneu please take a look ASAP

import calendar
import time
import appdirs
import pickle
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'm not completely sold on pickle, yet. It's nice that until now we have resided with human-readable files, although I get that pickle makes things a little easier in this case. And this is something that peolpe would most likely not interfere with besides deleting said file in case they suspect it leading to errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hypothesist what happens if coala is interrupted while writing to the pickled file? Will this break things (and therefore permanently break coala until a human intervenes?)

^@sils1297

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On CTRL+C pickle ignores it and writes everything to file. If you kill the process its mostly corrupted^^
In this case we just need to catch UnpicklingError^^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fneu I think it will, at worst, corrupt the file. I didn't think about the possibility of that happening.

So I think, as @Makman2 said, adding a UnpickingError should suffice. The worst that will happen in that case is that every project will need to be cached again (maybe we can display an error saying caching broke, so every project needs to be cached again). I don't think that's too bad.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think that's the way forward. Post a warning message, delete the
corrupt file, re-cache.

All in all I like this very much :)

Am 15. April 2016 9:02:27 vorm. schrieb Adhityaa Chandrasekar
notifications@github.com:

@@ -0,0 +1,142 @@
+import os
+import calendar
+import time
+import appdirs
+import pickle

@fneu I think it will, at worst, corrupt the file. I didn't think about the
possibility of that happening.

So I think, as @Makman2 said, adding a UnpickingError should suffice. The
worst that will happen in that case is that every project will need to be
cached again (maybe we can display an error saying caching broke, so every
project needs to be cached again). I don't think that's too bad.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/coala-analyzer/coala/pull/2016/files/621fc8d315bd5a587846946475b6ce9370489585#r59833966

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fneu I've added a verify_file_integrity function that will check if all files are proper. It will delete all of them if even one of them is corrupted (doesn't make sense to have the others if they are all connected). This is done at the start of each coala execution (before the first time any pickling is done).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds fine to me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for me too 👍

@fneu
Copy link
Contributor

fneu commented Apr 14, 2016

this actually does look rather sane!

there are still code paths not covered by tests, please complete them

@fneu
Copy link
Contributor

fneu commented Apr 14, 2016

@hypothesist you write: "This should improve the running time of coala". I'm on windows right now and cannot check: Is there a run time decrease and how big is the improvement?

@Makman2
Copy link
Member

Makman2 commented Apr 14, 2016

@fneu depending on how many files were changed. Last time @hypothesist told me he was 66% faster^^

@adtac
Copy link
Member Author

adtac commented Apr 15, 2016

@fneu @Makman2 is right, it was 3x faster.

@adtac adtac force-pushed the hypothesist/wip branch 2 times, most recently from 914e6b3 to dd21d95 Compare April 15, 2016 08:06
@@ -0,0 +1,171 @@
import os
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hypothesist Perhaps we should have it alphabetically sorted.

@tushar-rishav
Copy link
Member

@hypothesist 1c45bd7 needs minor rework

@@ -423,6 +436,22 @@ def yield_ignore_ranges(file_dict):
len(file[line_number])))


def get_file_list(results):
"""
Gets the list of files that are affected in the given results.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hypothesist You just missed this 😉

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed it. In my defense, the whole file uses imperative! :P

user's config directory.
"""
path = appdirs.user_config_dir("coala")
return os.path.join(path, filename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we have a function like this in tagging somewhere?

@@ -132,6 +132,7 @@ def Analyze(self):
global_bear_list=global_bears[section_name],
local_bear_list=local_bears[section_name],
print_results=lambda *args: True,
cache=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should at least add a TODO or something

@sils
Copy link
Member

sils commented Jun 12, 2016

unack 611bee3

@sils
Copy link
Member

sils commented Jun 12, 2016

ack 7abfbbb

@@ -470,6 +509,8 @@ def process_queues(processes,
filename as keys.
:param print_results: Prints all given results appropriate to the
output medium.
:param cache: An instance of ``misc.Caching.FileCache`` to use
as a file cache buffer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file a newcomer issue about this being allowed None

@sils
Copy link
Member

sils commented Jun 12, 2016

@sils
Copy link
Member

sils commented Jun 12, 2016

ack 7aa7c3c

This commit introduces helper functions that will be used in a later
commit for caching (running coala only on changed files).
In this commit, coalib.misc.Caching is introduced which contains a
class Cache which contains methods to act as a caching mechanism
that will be useful to determine which files are new or changed and
run coala only on those files to improve coala run time speed.
With `--changed-files` the user can run coala only on those files that
had changed since the last time coala was run. This should improve
the running time of coala.

Fixes #1991
@adtac
Copy link
Member Author

adtac commented Jun 12, 2016

@sils1297 sorry, I amended the previous commits by habit.

I'll detail the changes I made so that it's easier to review:

@sils
Copy link
Member

sils commented Jun 12, 2016

ack 622a3e5

@sils
Copy link
Member

sils commented Jun 12, 2016

@sils
Copy link
Member

sils commented Jun 12, 2016

ack 91c109d

@sils
Copy link
Member

sils commented Jun 12, 2016

ack ad3ec72

@adtac
Copy link
Member Author

adtac commented Jun 12, 2016

@rultor merge

@rultor
Copy link
Contributor

rultor commented Jun 12, 2016

@rultor merge

@hypothesist OK, I'll try to merge now. You can check the progress of the merge here

@sils sils added process/approved The PR is approved and will be merged soon and removed process/pending review labels Jun 12, 2016
@rultor rultor merged commit ad3ec72 into master Jun 12, 2016
@rultor
Copy link
Contributor

rultor commented Jun 12, 2016

@rultor merge

@hypothesist Done! FYI, the full log is here (took me 1min)

@sils sils deleted the hypothesist/wip branch June 12, 2016 13:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

None yet