Too Many Open Files Error #1789

Closed
dshauver opened this issue May 6, 2020 · 2 comments · Fixed by #1825
Labels
Bug Bug reports and fixes.

Comments

dshauver commented May 6, 2020

Describe the Bug

Executed bolt command run 'hostname' against a set of 75 Windows servers. The run failed on some, but not all, of the servers with the large message attached, followed by a series of

Failed on hostname.removed:
Too many open files

and one error of

Failed on hostname.removed
Failed to connect to : Too many open files @ rb_sysopen - /opt/puppetlabs/bolt/lib/ruby/gems/2.5.0/gems/winrm-2.3.4/lib/winrm/psrp/create_pipeline.xml.erb

Subsequent runs showed only the simple "Too many open files" failure on 2-3 targets - different ones each time - per run.
too_many_files_error.txt

Expected Behavior

Bolt executes the "hostname" command successfully on 75 or more remote Windows servers without this error.

Steps to Reproduce

Steps to reproduce the behavior:

  1. Create an inventory file with 75 Windows hosts
  2. Execute bolt command run 'hostname' -t windows

Environment

macOS Catalina 10.15.4
ulimit -n = 256
bolt 2.7.0

Additional Context

The problem went away after adding ulimit -n 1024 to .bash_profile and launching a new terminal window. I expect this did not actually resolve it but only raised the threshold; I think it could be triggered again under the right conditions. Nick L/Lucy W suggested this fix.
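For context, the value that ulimit -n changes is the soft per-process limit on open file descriptors. A minimal sketch of reading it from inside a Ruby process follows, using the standard Process.getrlimit call. The helper name fd_limits is hypothetical and is not part of Bolt; Process.getrlimit is not implemented on Windows, so this assumes a macOS or Linux controller like the one in this report.

```ruby
# Hypothetical helper (not Bolt code): report the per-process file
# descriptor limits that `ulimit -n` adjusts for the current shell.
def fd_limits
  # getrlimit returns [soft_limit, hard_limit] for the given resource.
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  { soft: soft, hard: hard }
end

limits = fd_limits
puts "soft limit: #{limits[:soft]}, hard limit: #{limits[:hard]}"
```

The soft limit is the one a process actually runs into; an unprivileged user can raise it up to, but not beyond, the hard limit.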

Additional commentary from the discussion thread:

Well, I at least understand what's raising the message, but not why. The WinRM gem seems to use the 'PowerShell Remoting Protocol' to encode messages to send over WinRM, and it does so by using a template that it reads into memory and populates. It's trying to do that when it fails saying too many files are open. It also sounds like "too many" is usually 1024, and the fact that the error is Errno::EMFILE indicates that the limit being hit is the one on the Ruby process itself, not a system-wide one.

That problem could have been made worse by the recent addition of a lot more internal pipes.

We might be using 5+ fds for each target
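A back-of-the-envelope check of that estimate against the reporter's environment, as a sketch only. The figure of 5 descriptors per target is the thread's rough guess, not a measurement, and the calculation assumes all 75 targets are being worked on at once.

```ruby
# Rough estimate using the figures quoted in this issue.
targets        = 75   # hosts in the reported run
fds_per_target = 5    # "5+" is a guess from the thread, not a measurement
fd_limit       = 256  # the reporter's `ulimit -n`

estimated_fds = targets * fds_per_target
puts "estimated fds: #{estimated_fds} (limit: #{fd_limit})"

if estimated_fds > fd_limit
  puts "over the limit by at least #{estimated_fds - fd_limit}"
end
```

Under these assumptions the run needs at least 375 descriptors against a limit of 256, which is consistent with only some of the targets failing.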

@dshauver dshauver added the Bug Bug reports and fixes. label May 6, 2020

lucywyman commented May 6, 2020

  • We should catch the EMFILE error and have a message linking to the doc to instruct people on how to increase their file limit.
  • We should document the issue in our 'known issues' page with more information about why you need to set the limit, how to do it on various platforms, universal vs. user limits, and any other information that's relevant.
  • We should also spike on whether there are alternatives we can use to have fewer fds open.
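The first bullet could look roughly like the sketch below: rescue Errno::EMFILE and re-raise it with a hint about the file limit. The error class, method name, and message wording are hypothetical and are not taken from Bolt's implementation; the failure is simulated rather than produced by actually exhausting descriptors.

```ruby
# Hypothetical error class and wrapper; the names are illustrative only.
class TooManyFilesError < StandardError; end

# Run the block and translate EMFILE into an error that tells the user
# what to do about it.
def with_fd_limit_hint
  yield
rescue Errno::EMFILE => e
  raise TooManyFilesError,
        "#{e.message}. The process hit its open file limit; " \
        "consider raising it with `ulimit -n` or lowering concurrency."
end

# Simulate the failure instead of opening hundreds of files.
begin
  with_fd_limit_hint { raise Errno::EMFILE, 'rb_sysopen - template.xml.erb' }
rescue TooManyFilesError => e
  puts e.message
end
```

Wrapping the original message keeps the failing path visible while adding the remediation hint.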

@lucywyman lucywyman added this to 📝To Do in DEPRECATED: old Bolt Kanban via automation May 6, 2020
nicklewis commented

We have some info about tuning ulimit for PE:
https://puppet.com/docs/pe/latest/config_ulimit.html

@lucywyman lucywyman moved this from 📝To Do to ⚡️Doing in DEPRECATED: old Bolt Kanban May 15, 2020
lucywyman added a commit to lucywyman/bolt that referenced this issue May 15, 2020
Since splitting the shell implementation from the transport connecting
to the shell, we now open a new thread with input, output, and error
streams for every connection. With concurrency defaulting to 100 this
often means 300+ file descriptors are open in Bolt's ruby process when
running against many targets, which exceeds the common FD limit of 256.
This adds a helpful error message when EMFILE is thrown instructing the
user on how to increase their FD limit. It also adds a section in the
known issues documentation about the error, also with instructions on
increasing the FD limit and on dialing back concurrency.

Closes puppetlabs#1789

!no-release-note
lucywyman added a commit to lucywyman/bolt that referenced this issue May 18, 2020
Since splitting the shell implementation from the transport connecting
to the shell, we now open a new thread with input, output, and error
streams for every connection. With concurrency defaulting to 100 this
often means 300+ file descriptors are open in Bolt's ruby process when
running against many targets, which exceeds the common FD limit of 256.
We now set concurrency to 1/3 the file descriptor limit if the limit is
below 300, otherwise it's set to 100. We also warn the user that
concurrency is set low, and provide instructions on how to increase the
ulimit or concurrency. We also raise a helpful error message when the
EMFILE error is thrown instructing the user on how to increase their FD
limit. Lastly this adds a section in the known issues documentation about
the error, also with instructions on increasing the FD limit and on
dialing back concurrency.

Closes puppetlabs#1789

!feature

* **Lower default concurrency when ulimit is low** ([1789](puppetlabs#1789))

  Concurrency defaults to 1/3 the ulimit if ulimit is below 300, and
  warns if lowered concurrency is used.
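
The rule described in this commit message can be sketched as follows. The constant and method names are hypothetical, and rounding down via integer division is an assumption; the commit message only says "1/3 the file descriptor limit".

```ruby
# Default concurrency quoted in the commit message.
DEFAULT_CONCURRENCY = 100

# Hypothetical helper mirroring the described rule: use one third of the
# descriptor limit when the limit is below 300, otherwise the default.
def default_concurrency(fd_limit)
  return DEFAULT_CONCURRENCY if fd_limit >= 300

  fd_limit / 3 # integer division rounds down (an assumption)
end

puts default_concurrency(256)   # the reporter's original limit
puts default_concurrency(1024)  # the limit after the workaround
```

With the reporter's original limit of 256 this gives a concurrency of 85, and with the raised limit of 1024 it stays at the default of 100.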
@lucywyman lucywyman moved this from ⚡️Doing to 🚧 Reviewing in DEPRECATED: old Bolt Kanban May 18, 2020
beechtom added a commit that referenced this issue May 21, 2020
(GH-1789) Rescue EMFILE error with helpful message
@beechtom beechtom removed this from 🚧 Reviewing in DEPRECATED: old Bolt Kanban May 22, 2020