Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add concurrency #12

Merged
merged 4 commits into from
Nov 28, 2019
Merged

Add concurrency #12

merged 4 commits into from
Nov 28, 2019

Conversation

dennwc
Copy link

@dennwc dennwc commented Nov 27, 2019

This change actually uses the -n parameter to increase concurrency for CLI. Also, reduce the sample size and add a temporary workaround for the parser lifetime bug in the bindings.


This change is Reviewable

Denys Smirnov added 2 commits November 27, 2019 17:52
Signed-off-by: Denys Smirnov <denys@sourced.tech>
Signed-off-by: Denys Smirnov <denys@sourced.tech>
@dennwc dennwc requested a review from kuba-- November 27, 2019 15:57
@dennwc dennwc self-assigned this Nov 27, 2019
cmd/sourced-imports/main.go Outdated Show resolved Hide resolved
extractor.go Outdated Show resolved Hide resolved
extractor.go Outdated
Comment on lines 110 to 121
jobs = make(chan *extractJob, e.num)
errc = make(chan error, e.num)
for i := 0; i < e.num; i++ {
wg.Add(1)
go func() {
defer wg.Done()
e.worker(jobs, errc)
}()
}
// no sample buffer, it's per-routine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets move it to some function.
Btw. I think error channel is not closed.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I don't think it will make the code smaller.

If you propose to move the worker creation, it will require to pass wg to a function, which is considered a bad practice (you should see all usages of wg in one place, if possible).

Same for other things: creating a function will require passing other state like channels to them. Having in mind the size of the code inlining it here may still be acceptable.

But if you have some specific suggestion, please elaborate a bit more.

Re: not closing the channel - it's not necessary. GC will take care of that anyway. It's not a resource leak.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my intention was not make the code shorter but more readable (less statements from different contexts, like here we start workers, there we handle channels plus ifs single threaded or multi threaded.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said, if you have any suggestions - please elaborate. Moving things will complicate code or will require passing all the state as arguments.

extractor.go Outdated Show resolved Hide resolved
extractor.go Outdated Show resolved Hide resolved
@kuba--
Copy link
Contributor

kuba-- commented Nov 28, 2019

here are some tests with ~3K C# files:
master:
single thread: 0m7.632s

concurrent:
n=1: 0m12.913s
n=2: 0m7.767s
n=3: 0m7.020s
n=4: 0m6.683s

OK, it's better if you have (4CPUs), but otherwise it's not a huge performance improvement. I wonder if it's a json IO + mutex or maybe communication over channels.

@dennwc
Copy link
Author

dennwc commented Nov 28, 2019

@kuba-- I don't think you are testing the same thing, though. master skips files larger than N bytes, while this branch still process those but takes only first N bytes (sampling). So it's not 7 vs 12 vs 6, it's 12 vs 12 vs 6 in case you change master to process the larger files as well.

@kuba--
Copy link
Contributor

kuba-- commented Nov 28, 2019

@dennwc it's not really because of sampling.
I tested agains EntityFrameworkCore-master repo. There are no big files. Moreover I've changed maxsize:
c.MaxSize = 1024 * 1024 * 1024
So it should not be the case, but still I get: 0m7.689s on single thread

Denys Smirnov added 2 commits November 28, 2019 17:17
Signed-off-by: Denys Smirnov <denys@sourced.tech>
Signed-off-by: Denys Smirnov <denys@sourced.tech>
@kuba-- kuba-- merged commit bf22b73 into master Nov 28, 2019
@kuba-- kuba-- deleted the concurrent branch November 28, 2019 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants