Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Credo runs really slowly on my project #84

Closed
myronmarston opened this issue May 24, 2016 · 24 comments
Closed

Credo runs really slowly on my project #84

myronmarston opened this issue May 24, 2016 · 24 comments
Assignees

Comments

@myronmarston
Copy link

We've got a pretty large umbrella project (with 41 sub-apps), and I'm considering adding credo to the project and using a particular strict configuration (for the rules we care most about) as part of our travis CI build process. However, it's running reaaaaally slow:

Analysis took 404.3 seconds (1.2s to load, 403.1s running checks)
2432 mods/funs, found 7 consistency issues, 37 warnings, 1 code readability issue.

That's nearly 7 minutes. Our Travis build currently has a few matrix entries that each run in the 2-4 minute range, so I'm not too keen to add another entry that takes 7 minutes as it'll effectively double our build times.

Is there a way to configure things to speed credo up? Or are there plans to improve its perf?

I already tried decreasing the number of enabled checks (commenting out most of them from my .credo.exs file) but that didn't appear to help.

@rrrene
Copy link
Owner

rrrene commented May 24, 2016

This is a valid concern (and true: Credo is not fast, yet), but I haven't gotten around to address this (mostly due to time constraints).

A couple of thoughts:

  1. The slowest checks are the consistency checks. You can use mix credo --ignore Consistency to exclude all of those on CI (a quick run on Phoenix shows a decrease from 17 to 4 seconds).
  2. We do not cache anything. I could imagine there are clean ways to cache the results of untouched files, but I have made the decision to delay this feature until post-1.0 or whenever somebody voluntarily wants to implement it.
  3. We could certainly do more work in parrallel. Same point as above: I decided that I have to convince people of the approach first, with speed being a post-1.0-concern.

Please understand that this does not mean that I do not take speed seriously. I do, there is the plan to tackle it after 1.0 is out the door, and if anyone would like to look into this, I am very likely to merge those changes even before 1.0 - it's just that I have finite time for this project 😞

Any thoughts on this subject would be greatly appreciated! 👍

@myronmarston
Copy link
Author

--ignore Consistency is a great tip -- it dropped it down to about 2 seconds. IOW, including consistency checks in my project makes credo about 20x slower!

Is the slow nature of the Consistency checks due to the need to infer the established style in the project in existing code? I really like the consistency checks (in fact, it's one of the main reasons I'm trying credo: on our project we've been using a home-grown grep-based lint check for some style consistency enforcement, and credo's approach seems better) and it would be a bummer to do without them entirely. I wonder -- could it improve the perf of the consistency checks if we could explicitly configure it with the style choices we've made on the project instead of it spending time trying to infer those choices?

BTW, as someone who has maintained several open source projects, I completely understand the time crunch and choosing carefully what to focus on now :).

@rrrene
Copy link
Owner

rrrene commented May 27, 2016

Is the slow nature of the Consistency checks due to the need to infer the established style in the project in existing code?

Precisely. I'm sure there is some potential here, since my approach was a bit pragmatic.

could it improve the perf of the consistency checks if we could explicitly configure it with the style choices we've made on the project instead of it spending time trying to infer those choices?

Huh, I will have to think about that. The point is in some cases we could possibly use regexes instead of the elixir_tokenizer if Credo knows what you want.

I just released 0.4.0-beta3 which should give a significant (albeit not sufficient, don't get your hopes up ^^) speed boost. I did this by parallelizing the trivial parts of the consistency checks. Could you check if you experience a difference?

@myronmarston
Copy link
Author

I gave 0.4.0-beta3 a try and now I get a timeout:

Checking 602 source files (this might take a while) ...
** (exit) exited in: Task.await(%Task{owner: #PID<0.47.0>, pid: #PID<0.619.0>, ref: #Reference<0.0.1.1853>}, 30000)
    ** (EXIT) time out
    (elixir) lib/task.ex:336: Task.await/2
    (elixir) lib/enum.ex:604: Enum."-each/2-lists^foreach/1-0-"/2
    (elixir) lib/enum.ex:604: Enum.each/2
    lib/credo/check/runner.ex:66: Credo.Check.Runner.run_checks_that_run_on_all/2
    (stdlib) timer.erl:166: :timer.tc/1
    lib/credo/check/runner.ex:12: Credo.Check.Runner.run/2
    (stdlib) timer.erl:166: :timer.tc/1
    lib/credo/cli/command/suggest.ex:21: Credo.CLI.Command.Suggest.run/2

@rrrene
Copy link
Owner

rrrene commented May 31, 2016

I upped the timeout limit from 30 seconds to 10 minutes. 0.4.0-beta4 contains a fix.

@myronmarston
Copy link
Author

I gave 0.4.0-beta4 a try but it doesn't seem to help:

Analysis took 423.4 seconds (1.2s to load, 422.2s running checks)

If anything, that's slower :(.

@dpehrson
Copy link

If it would solve this issue, I would wholeheartedly support changing the consistency checks to not infer and decide which form is correct but rather be explicitly told. "This project uses 2 spaces"

The speed issues with credo are sadly piling up fast and it's almost entirely due to consistency checks. The worst part is, the consistency checks are some of the checks I want most.

@rrrene rrrene self-assigned this Jun 17, 2016
@olafura
Copy link
Contributor

olafura commented Jan 3, 2017

In our code base it's these two checks that give us 10x slowdown:
Credo.Check.Consistency.TabsOrSpaces adds 60s
Credo.Check.Consistency.LineEndings adds 110s

Without these two it runs in 17s

@jvf
Copy link

jvf commented Jul 10, 2017

I tried credo 0.8.2 on phoenixframework, took 12 seconds compared to dogmas 3 seconds. I tried disabling checks with --ignore or only running some tests with --only. Both did not change the performance, even with mix credo --ignore consistency,readability,refactor,design,warning. Seems like all checks are executed regardless, only presentation is influenced. Can that be right?

@devonestes
Copy link
Contributor

@myronmarston The recent release of 0.8.2 includes some significant optimizations to make those consistency checks run much faster. I'd be curious to see how that release works for you!

@myronmarston
Copy link
Author

I sadly don't have the time to try that right now. I'm leaving on a long vacation tomorrow and have some important things to get done first. I'm looking forward to trying it in the future, though!

@jvf
Copy link

jvf commented Jul 11, 2017

@devonestes I repeated my test from #84 (comment) with 0.8.1. Takes ~22 seconds compared to the ~12 seconds with 0.8.2. For the consistency checks only (mix credo --only consistency) the reported time to run the checks is 0.5 seconds in 0.8.2 compared to 11.8 seconds in 0.8.1.

(As discussed above and reported in #404, the whole command still runs for the whole 12 and 22 seconds respectively).

@devonestes
Copy link
Contributor

@jvf Nice!! Well, that other bug aside, I think this is some great progress!

@jeremyowensboggs
Copy link
Contributor

jeremyowensboggs commented May 29, 2018

I'm running into this issue with 0.9.2, however, running with mix credo --ignore readability is what allows it to run in a reasonable amount of time. Otherwise it is 500+ seconds

@falti
Copy link

falti commented Jun 6, 2018

Confirmed what @jeremyowensboggs noticed.

elixir -v
Erlang/OTP 20 [erts-9.3.1] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Elixir 1.6.4 (compiled with OTP 20)

Running on Mac.

@TheFirstAvenger
Copy link
Contributor

This issue has been open for almost 4 years now, with the last comment almost 2 years ago. Given the amount of improvements made since then, I would be interested to see where everyone stands with the latest as far as speed.

I also would recommend we close this issue and if there are specific action items that remain that should be taken from it, individual issues be created for them so we can have actionable tasks to work on.

@jared-mackey
Copy link

@TheFirstAvenger Hi, would love to jump into this thread. We are also experiencing fairly slow credo times for our large umbrella application.

Firstly, I would love a debug flag to credo to see the times of each individual check. This would significantly help track down which ones are slow for our application.

Secondly, given the first one is available, I’d love to help track down and make improvements to the slower checks. I’d also be interested in contributing performance improvements, is there a getting started guide for contributors?

Thanks!

@rrrene
Copy link
Owner

rrrene commented Apr 25, 2020

@mackeyja92 Could you try the current master?

You can do this by setting the Credo dep to

{:credo, github: "rrrene/credo"}

Please report back if this improves your Credo experience! 👍

@jared-mackey
Copy link

@rrrene We are currently locked to 1.0 from hex. Here is the results of my run using time.

Results:
current:mix credo --strict 205.97s user 15.01s system 192% cpu 1:54.71 total
master: mix credo --strict 203.88s user 13.56s system 200% cpu 1:48.19 total

However, we have had such issues running it from the root we typically run it more like this.
current: mix cmd mix credo --strict 1.75s user 0.61s system 2% cpu 1:18.80 total
master: mix cmd mix credo --strict 1.89s user 0.65s system 3% cpu 1:20.43 total

These numbers vary from run to run, typically master is faster but not by a significant margin.

@rrrene
Copy link
Owner

rrrene commented Apr 27, 2020

@mackeyja92 Mmmh, this is surprising. I had hoped for a better speed-up.

To test this with a large umbrella, I created an umbrella from these 10 repos consisting of 1,500 Elixir files containing 226,000 lines of Elixir code.

I executed the (unscientific) test runs on an Ubuntu 19.10 VM with 8 CPUs and 24 GB RAM. This VM was not tailored to this test, it was just around 😉

Analysed Codebase

* ecto
* elixir
* ex_doc
* gen_stage
* hex
* hexpm
* jason
* myxql
* phoenix
* plug
github.com/AlDanial/cloc v 1.82  T=2.28 s (1126.8 files/s, 266458.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Elixir                        1515          68608          53072         225999

Credo v1.0.5

Analysis took 1018.0 seconds (0.7s to load, 1017.3s running checks)
20766 mods/funs, found 60 consistency issues, 65 warnings, 1412 refactoring opportunities, 4353 code readability issues, 1410 software design suggestions.

Credo v1.4.0

Analysis took 441.6 seconds (1.3s to load, 440.3s running 54 checks on 1234 files)
20154 mods/funs, found 50 consistency issues, 55 warnings, 242 refactoring opportunities, 4205 code readability issues, 1345 software design suggestions.

Credo v1.5.0-dev (master)

Analysis took 144.3 seconds (1.1s to load, 143.2s running 55 checks on 1234 files)
20154 mods/funs, found 50 consistency issues, 55 warnings, 242 refactoring opportunities, 4223 code readability issues, 1345 software design suggestions.


One thing that is strange to me is that your codebase does not see any speed increase worth mentioning. Could you paste the summary of one of the runs? ("summary" being the snippets I posted above containing the times, file count, etc.)

@esvinson
Copy link

With the DuplicatedCode check enabled we weren't seeing any differences in speed between 1.4.0 and the master branch.

We saw a large reduction in run time by disabling the duplicate code check.
{Credo.Check.Design.DuplicatedCode, false},

We're seeing significant decreases when using the newer branch with the DuplicatedCode check disabled.

1.4.0

1.4.0: mix credo --strict with {Credo.Check.Design.DuplicatedCode, excluded_macros: []},
Analysis took 93.9 seconds (1.8s to load, 92.1s running 55 checks on 1290 files)
10580 mods/funs, found 83 refactoring opportunities, 31 code readability issues, 67 software design suggestions.

1.4.0: mix credo --strict with {Credo.Check.Design.DuplicatedCode, false},
Analysis took 34.9 seconds (1.8s to load, 33.0s running 54 checks on 1290 files)
10580 mods/funs, found 83 refactoring opportunities, 31 code readability issues.

1.4.0: mix credo --strict --ignore Consistency with {Credo.Check.Design.DuplicatedCode, false},
Analysis took 13.8 seconds (2.2s to load, 11.6s running 48 checks on 1290 files)
10580 mods/funs, found 83 refactoring opportunities, 31 code readability issues.

rrrene/credo master

git master: mix credo --strict with {Credo.Check.Design.DuplicatedCode, excluded_macros: []},
Analysis took 91.9 seconds (1.8s to load, 90.1s running 56 checks on 1290 files)
10580 mods/funs, found 83 refactoring opportunities, 154 code readability issues, 67 software design suggestions.

git master: mix credo --strict with {Credo.Check.Design.DuplicatedCode, false},
Analysis took 13.0 seconds (1.9s to load, 11.1s running 55 checks on 1290 files)
10580 mods/funs, found 83 refactoring opportunities, 154 code readability issues.

git master: mix credo --strict --ignore Consistency with {Credo.Check.Design.DuplicatedCode, false},
Analysis took 9.6 seconds (1.9s to load, 7.6s running 49 checks on 1290 files)
10580 mods/funs, found 83 refactoring opportunities, 154 code readability issues.

@rrrene
Copy link
Owner

rrrene commented Apr 28, 2020

@esvinson @mackeyja92 That makes sense, unfortunately. DuplicatedCode is a controversial legacy check and I did forget to check its performance when making the mentioned improvements.

I did tune its workloads in 178ba9c - would you mind giving it another spin with master?

@esvinson
Copy link

That makes sense, unfortunately. DuplicatedCode is a controversial legacy check and I did forget to check its performance when making the mentioned improvements.

I did tune its workloads in 178ba9c - would you mind giving it another spin with master?

Yeah I was upgrading from 0.10.0 yesterday, and we had that enabled in our old config. It's not critical, and I disabled it after regenerating the config and re-applying our customizations.

With that said, your latest changes seemed to speed it up a lot. 👍

Both using: mix credo --strict

DuplicatedCode Disabled

Analysis took 11.2 seconds (0.4s to load, 10.8s running 55 checks on 1294 files)
10657 mods/funs, found 123 code readability issues.

DuplicatedCode Enabled (Previously 91.9 seconds)

Analysis took 18.4 seconds (0.4s to load, 18.0s running 56 checks on 1294 files)
10657 mods/funs, found 123 code readability issues, 67 software design suggestions.

@rrrene
Copy link
Owner

rrrene commented Oct 29, 2020

I believe this is no longer an issue with the newest version. Thanks to anyone who contributed datapoints, insights and pull requests to speed up Credo! 🎉

If I am mistaken, please feel free to open another issue!

@rrrene rrrene closed this as completed Oct 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests