Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate Dir.chdir to a new process, or mutex #372

Closed
wants to merge 15 commits into from
Closed

Isolate Dir.chdir to a new process, or mutex #372

wants to merge 15 commits into from

Conversation

taquitos
Copy link
Contributor

@taquitos taquitos commented Jun 25, 2018

Instead of using chdir blocks, we should use absolute paths so that we can enable threaded usage of the library

If Process.fork() is available, we'll use that for git.chdir actions, otherwise, we'll use a global mutex to prevent threads from calling git.chdir before the previous execution has completed.

Fixes #355

Note: if you do async work in git.chdir all bets are off.

Signed-off-by: Joshua Liebowitz

  • Ensure all commits include DCO sign-off.
  • Ensure that your contributions pass unit testing.
  • Ensure that your contributions contain documentation if applicable.

Instead of using chdir blocks, we should use absolute paths so that we can enable threaded usage of the library

Fixes #355
lib/git/lib.rb Outdated
if @git_dir
Dir.chdir(@git_dir, &do_get)
else
build_list.call
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a typo in the original code, submitting a pr for this

@perlun
Copy link
Contributor

perlun commented Jun 25, 2018

@taquitos Much appreciated! 👏 Let me know once you have this under control.

Once this is sorted out, we'll probably get #358 fixed for free.

taquitos and others added 2 commits June 25, 2018 13:50
@taquitos
Copy link
Contributor Author

I am now testing with the latest change. Will update after we have it running for a bit.

taquitos pushed a commit to fastlane/ci that referenced this pull request Jun 25, 2018
Now that we’ve made progress on ruby-git/ruby-git#372
We can limit mutexes to be individual repo-scoped instead of 1 giant mutex for all.
Signed-off-by: Joshua Liebowitz <taquitos@google.com>
@taquitos taquitos changed the title [WIP]Start replacing Dir.chdir with paths Isolate Dir.chdir to a new process, or mutex Jun 25, 2018
lib/git/lib.rb Outdated
else
do_get.call
end
do_get.call(@git_dir)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not 💯 this will work, but tests pass. Seems like we might need to pass the path to the config command.

lib/git/lib.rb Outdated
else
build_list.call
end
build_list.call(@git_dir)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, I think we need to utilize the path variable in the config command if it exists

@taquitos taquitos changed the title Isolate Dir.chdir to a new process, or mutex [WIP]Isolate Dir.chdir to a new process, or mutex Jun 25, 2018
Signed-off-by: Joshua Liebowitz <taquitos@google.com>
@@ -8,6 +8,7 @@ class GitExecuteError < StandardError
class Lib

@@semaphore = Mutex.new
@@config_semaphore = Mutex.new
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calls to config also should be protected, but it might be heavy handed to fork

@taquitos
Copy link
Contributor Author

@perlun I'd love to get your thoughts. The changes don't make everything thread-safe, but it takes care of a lot of issues. There are a few places where paths can't be passed in $git config is one of those, so we must Dir.chdir 🙄when we're dealing with local config.

@taquitos taquitos changed the title [WIP]Isolate Dir.chdir to a new process, or mutex Isolate Dir.chdir to a new process, or mutex Jun 27, 2018
@perlun
Copy link
Contributor

perlun commented Jul 10, 2018

@taquitos Thanks, will try to get this reviewed ASAP. Two minor things:

  1. Could you rebase/merge in master into the branch?
  2. Could you add the DCO?

Thanks in advance.

@taquitos
Copy link
Contributor Author

@perlun when you get a moment 👍

Copy link
Contributor

@perlun perlun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some very good fixes here, bravo! 👏

I gave some notes on things I think could be addressed before merge. Please let me know what you think about these.

lib/git/base.rb Outdated
class << self
attr_accessor :chdir_semaphore
end
Git::Base.chdir_semaphore = Mutex.new
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works perfectly fine, but maybe the approach below would be even better. It shouldn't really be an accessor but more like a reader since it's an immutable data structure that shouldn't be tampered with elsewhere, right?

class << self
  def chdir_semaphore
    @chdir_semaphore ||= Mutex.new
  end
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, but are you sure that's threadsafe? If we have 2 threads that call chdir_semaphore before it is set, is it possible they both end up calling Mutex.new?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. So maybe something like this then:

@chdir_semaphore = Mutex.new

class << self
  attr_reader :chdir_semaphore
end

(if I remember the context right - the `@chdir_semaphore call in normal class context will assign it to the class-level variable. Please try this out before trusting me blindly on this.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@perlun I tested with your suggestion, it works 👍

lib/git/base.rb Outdated
end
Process.wait(pid)
else
# Windows and NetBSD 4 don't support fork()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting anecdote, I wasn't aware of NetBSD 4 here. But OTOH, it's past EOL already. Maybe we can drop that reference and instead mention JRuby (which is actively used by many), since it also has problems with fork() on the JVM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True dat

Dir.chdir(repo.path) do
return `du -s`.chomp.split.first.to_i
end
return `du -s #{repo.path}`.chomp.split.first.to_i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good fix, absolutely no reason to chdir in this case. 🤦‍♂️

lib/git/lib.rb Outdated
@@ -310,7 +311,7 @@ def branches_all
def list_files(ref_dir)
dir = File.join(@git_dir, 'refs', ref_dir)
files = []
Dir.chdir(dir) { files = Dir.glob('**/*').select { |f| File.file?(f) } } rescue nil
files = Dir.glob(File.join(dir, '**/*')).select { |f| File.file?(f) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the rescue nil removal can be dangerous? Unfortunately not aware of its root cause of being there though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was thinking about that too. I wasn't convinced it was needed, but it's probably safer since passing nil into File.file? or Dir.glob can raise an error

@perlun
Copy link
Contributor

perlun commented Aug 13, 2018

@taquitos Is this reviewable now? If so, please rebase and I'll give it a look ASAP.

taquitos and others added 2 commits August 13, 2018 10:54
Instead of using chdir blocks, we should use absolute paths so that we can enable threaded usage of the library

If Process.fork() is available, we'll use that for git.chdir actions, otherwise, we'll use a global mutex to
prevent threads from calling git.chdir before the previous execution has completed.

Fixes #355

Note: if you do async work in git.chdir all bets are off.

- Reduce and isolate usage of Dir.chdir
- fork() when using Dir.chdir, otherwise, use a mutex
- wrap calls to in a semaphore
- replace Dir.chdir with paths

Signed-off-by: Joshua Liebowitz <taquitos@google.com>
@taquitos
Copy link
Contributor Author

Ugh. Making new PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants