New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor encoding for multiple nailgun messages, refactor logging on exit #6388

Merged
merged 32 commits into from Sep 28, 2018

Conversation

Projects
None yet
5 participants
@cosmicexplorer
Copy link
Contributor

cosmicexplorer commented Aug 23, 2018

Problem

Internally, we're seeing this error message occur intermittently while running a ./pants --changed-parent=master list or while running an internal goal which accepts the output of that list command and filters it down to some subset of those targets. This error message occurs much more reliably on very large diffs -- I have been reliably reproing it with diffs that affect several thousand BUILD files. The error message goes like:

++ ./pants list ::
CRITICAL] 
CRITICAL] lost active connection to pantsd!
Exception caught: (<class 'pants.bin.remote_pants_runner.Terminated'>)
  File ".bootstrap/_pex/pex.py", line 367, in execute
    self._wrap_coverage(self._wrap_profiling, self._execute)
  File ".bootstrap/_pex/pex.py", line 293, in _wrap_coverage
    runner(*args)
  File ".bootstrap/_pex/pex.py", line 325, in _wrap_profiling
    runner(*args)
  File ".bootstrap/_pex/pex.py", line 410, in _execute
    return self.execute_entry(self._pex_info.entry_point)
  File ".bootstrap/_pex/pex.py", line 468, in execute_entry
    return runner(entry_point)
  File ".bootstrap/_pex/pex.py", line 486, in execute_pkg_resources
    return runner()
  File "/redacted/location/of/unzipped/pants/wheel/pants/bin/pants_loader.py", line 68, in main
    PantsLoader.run()
  File "/redacted/location/of/unzipped/pants/wheel/pants/bin/pants_loader.py", line 64, in run
    cls.load_and_execute(entrypoint)
  File "/redacted/location/of/unzipped/pants/wheel/pants/bin/pants_loader.py", line 57, in load_and_execute
    entrypoint_main()
  File "/redacted/location/of/unzipped/pants/wheel/pants/bin/pants_exe.py", line 38, in main
    PantsRunner(exiter, start_time=start_time).run()
  File "/redacted/location/of/unzipped/pants/wheel/pants/bin/pants_runner.py", line 38, in run
    return RemotePantsRunner(self._exiter, self._args, self._env, bootstrap_options).run()
  File "/redacted/location/of/unzipped/pants/wheel/pants/bin/remote_pants_runner.py", line 161, in run
    self._run_pants_with_retry(port)
  File "/redacted/location/of/unzipped/pants/wheel/pants/java/nailgun_client.py", line 216, in execute
    return self._session.execute(cwd, main_class, *args, **environment)
  File "/redacted/location/of/unzipped/pants/wheel/pants/java/nailgun_client.py", line 89, in execute
    return self._process_session()
  File "/redacted/location/of/unzipped/pants/wheel/pants/java/nailgun_client.py", line 64, in _process_session
    for chunk_type, payload in self.iter_chunks(self._sock, return_bytes=True):
  File "/redacted/location/of/unzipped/pants/wheel/pants/java/nailgun_protocol.py", line 207, in iter_chunks
    chunk_type, payload = cls.read_chunk(sock, return_bytes)
  File "/redacted/location/of/unzipped/pants/wheel/pants/java/nailgun_protocol.py", line 183, in read_chunk
    raise cls.TruncatedHeaderError('Failed to read nailgun chunk header ({!r}).'.format(e))

Exception message: abruptly lost active connection to pantsd runner: NailgunError(u'Problem talking to nailgun server (address: 127.0.0.1:37215): TruncatedHeaderError(u"Failed to read nailgun chunk header (TruncatedRead(u\'Expected 5 bytes before socket shutdown, instead received 0\',)).",)',)

We see this error repro with 1.8.x, 1.9.x, and 1.10.x, with approximately equal frequency.

Solution

  • Centralize the logic of encoding integers to send over the nailgun connection.
  • Ensure we use that logic to encode the exit code correctly in pailgun_server.py. I have no clue if this change is the reason for this diff fixing the intermittent issue we're seeing.

Result

It's a little more clear how our nailgun chunks are encoded, and there's a little more test coverage.

@stuhood
Copy link
Member

stuhood left a comment

It would be great to add a test if possible, although I'm not sure that it is.

self._exiter.exit(0)
try:
# Setup the Exiter's finalizer.
self._exiter.set_finalizer(finalizer)

This comment has been minimized.

@stuhood

stuhood Aug 23, 2018

Member

The finalizer doesn't exist at this point: it comes from self.nailgunned_stdio in the context manager. So this needs to stay inside the block.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Aug 23, 2018

Contributor

Fixed in 29be02b!

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented Aug 23, 2018

I think a test can be made and I will look into doing that now.

@ity

ity approved these changes Aug 23, 2018

Copy link
Member

ity left a comment

looks good - feel free to follow up in a separate PR with the test.

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:add-pantsd-runner-dead-logging branch from d36b4b9 to 83f6f49 Aug 31, 2018

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:add-pantsd-runner-dead-logging branch from 83f6f49 to 764c134 Sep 14, 2018

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented Sep 14, 2018

I just updated the diff with a fix that has removed all instances of a nondeterministic nailgun connection error that reproes internally. I need to update the OP, and I would really love it if someone had a thought about how to repro the error this diff fixes, or how to test that this is really a fix in this diff.

@stuhood
Copy link
Member

stuhood left a comment

There is one correctness issue here, but once the tests are passing, feel free to merge.

self._exiter.handle_unhandled_exception(add_newline=True)
else:
self._exiter.exit(0)
except Exception:

This comment has been minimized.

@stuhood

stuhood Sep 15, 2018

Member

After logging it, this swallows the exception...it should probably re-raise it a bare raise. ie:

except Exception:
  do_stuff()
  raise

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 15, 2018

Contributor

Done in 02d0fa1 by adding a re_raise parameter to handle_unhandled_exception()! I thought this was a reasonable extension, let me know if that's ridiculous.

This comment has been minimized.

@stuhood

stuhood Sep 15, 2018

Member

It doesn't look like many callsites actually pass the exc parameter, and re_raise doesn't check that it has an exc to raise.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 15, 2018

Contributor

You're right, I didn't realize the sys.exc_info()[1] could return None. Will revise.

@classmethod
def encode_env_var_value(cls, obj):
"""Encode the input so that it can be used as the value of an environment variable."""
return binary_type(str(obj))

This comment has been minimized.

@stuhood

stuhood Sep 15, 2018

Member

The method name is a bit misleading, I think... because this is going to go on the wire via nailgun, it's not immediately going to be used as an env var: instead, it must always output bytes. So maybe encode_str_value?

But regardless: this should probably be explicitly encoding as utf-8:

str(obj).encode('utf-8')

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 15, 2018

Contributor

This was indeed the error in the py3 shard -- I agree that it's not immediately going to be used as an env var, but I also want to make sure that everything that wants to be an environment variable's value goes through this function. Given that it is defined on the NailgunProtocol object, I think the intent could be made more clear by just editing the docstring to clarify that this is purely for the purposes of a nailgun env chunk (or just make the method to send those chunks call this one).

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 15, 2018

Contributor

I updated the docstring to clarify how the environment variables should be used.

@@ -51,6 +51,8 @@ def test_pantsd_run(self):
'GLOBAL': {
# Muddies the logs with warnings: once all of the warnings in the repository
# are fixed, this can be removed.
# TODO: some testprojects are made to intentionally create these warnings -- these could
# potentially be tagged as such so we don't have to have this setting on.

This comment has been minimized.

@stuhood

stuhood Sep 15, 2018

Member

Should probably drop this.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 15, 2018

Contributor

Done!

@stuhood stuhood requested a review from Eric-Arellano Sep 15, 2018

@Eric-Arellano
Copy link
Contributor

Eric-Arellano left a comment

Awesome! This code was so hard for me to reason about. Thanks for cleaning it up.

if not isinstance(obj, int):
raise TypeError("cannot encode non-integer object in encode_int(): object was {} (type '{}')."
.format(obj, type(obj)))
return binary_type(str(obj).encode('ascii'))

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Sep 15, 2018

Contributor

binary_type() is unnecessary, because str() will get unicode and encode() will get bytes.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 15, 2018

Contributor

Totally right! I misinterpreted a python 3 test failure in the exact opposite way I should have, leading to very helpful errors in both shards.

@classmethod
def encode_env_var_value(cls, obj):
"""Encode the input so that it can be used as the value of an environment variable."""
return binary_type(str(obj).encode('utf-8'))

This comment has been minimized.

@Eric-Arellano

Eric-Arellano Sep 15, 2018

Contributor

Ditto with not needing binary_type

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 15, 2018

Contributor

Also fixed!

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:add-pantsd-runner-dead-logging branch 2 times, most recently from e33fc77 to 92e4fd8 Sep 15, 2018

@cosmicexplorer cosmicexplorer changed the title Add pantsd-runner death logging and exception wrapper around call to self.nailgunned_stdio() refactor encoding for multiple nailgun messages, refactor logging on exit Sep 16, 2018

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented Sep 16, 2018

Doing some further vetting of this internally (making some more huge diffs) which requires waiting until at least Monday, but we should be able to confirm this works, after which I will merge this, by ~the end of the day. This "internal vetting" could potentially be replaced with generated huge projects or something, just haven't figured out how to do that yet (looking into fuzzing techniques).

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented Sep 16, 2018

Hm. Actually, could anyone else let me know if the following test passes locally (this is on master too)? Because I'm seeing an error and it seems like it could be because travis isn't running it with a tty or something?

> ./pants test tests/python/pants_test/java: -- -vs -k test_scala_repl_helloworld_input
# ...
                     ============== test session starts ===============
                     platform linux2 -- Python 2.7.15, pytest-3.6.4, py-1.6.0, pluggy-0.7.1 -- /home/cosmicexplorer/tools/pants/build-support/pants_dev_deps.venv/bin/python2.7
                     cachedir: .pants.d/.pytest_cache
                     rootdir: /home/cosmicexplorer/tools/pants/.pants.d, inifile: /home/cosmicexplorer/tools/pants/.pants.d/test/pytest-prep/CPython-2.7.15/bd2ad0f4126bd8c3f3bb996d9eeb4e239cc321ec/pytest.ini
                     plugins: timeout-1.2.1, cov-2.4.0
                     collecting ... collected 58 items / 57 deselected
                     
                     tests/python/pants_test/java/test_nailgun_integration.py::TestNailgunIntegration::test_scala_repl_helloworld_input <- pyprep/sources/2457a9a63e7d368ce7a222995bbb27265d02f627/pants_test/java/test_nailgun_integration.py FAILED
                     
                     ==================== FAILURES ====================
                      TestNailgunIntegration.test_scala_repl_helloworld_input 
                     
                     self = <pants_test.java.test_nailgun_integration.TestNailgunIntegration testMethod=test_scala_repl_helloworld_input>
                     
                         def test_scala_repl_helloworld_input(self):
                           """Integration test to exercise possible closed-loop breakages in NailgunClient, NailgunSession
                             and InputReader.
                             """
                           target = 'examples/src/scala/org/pantsbuild/example/hello/welcome'
                           pants_run = self.run_pants(
                             command=['repl', target, '--quiet'],
                             stdin_data=(
                               'import org.pantsbuild.example.hello.welcome.WelcomeEverybody\n'
                               'println(WelcomeEverybody("World" :: Nil).head)\n'
                             ),
                             # Override the PANTS_CONFIG_FILES="pants.travis-ci.ini" used within TravisCI to enable
                             # nailgun usage for the purpose of exercising that stack in the integration test.
                             config={'DEFAULT': {'execution_strategy': 'nailgun'}}
                           )
                           self.assert_success(pants_run)
                     >     self.assertIn('Hello, World!', pants_run.stdout_data.splitlines())
                     E     AssertionError: u'Hello, World!' not found in [u'', u'Welcome to Scala 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_181).', u'Type in expressions for evaluation. Or try :help.', u'', u'scala> import org.pantsbuild.example.hello.welcome.WelcomeEverybody', u'', u'scala> Hello, World!', u'', u'scala> :quit']
                     
                     .pants.d/pyprep/sources/2457a9a63e7d368ce7a222995bbb27265d02f627/pants_test/java/test_nailgun_integration.py:27: AssertionError
                      generated xml file: /home/cosmicexplorer/tools/pants/.pants.d/test/pytest/57ea125b7d9be409b74c1a04584c4d44d63b50cb/junitxml/TEST-57ea125b7d9be409b74c1a04584c4d44d63b50cb.xml 
                     ==== 1 failed, 57 deselected in 16.28 seconds ====
# ...
@jsirois

This comment has been minimized.

Copy link
Member

jsirois commented Sep 16, 2018

The error is: u'Hello, World!' not found in (list of strings that includes u'scala> Hello, World!').
So the scala> prompt prefix foils you here. Should fail in the same way on your machine.

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented Sep 16, 2018

Sorry, yes, that makes sense, and running the command the test is running locally produces the same output. I'm wondering why the scala> prompt is in the output here, but not in the output that travis presumably gets.

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented Sep 16, 2018

(I'll make a separate issue/PR about the above)

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:add-pantsd-runner-dead-logging branch from dcd8328 to dd3f4bf Sep 17, 2018

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented Sep 18, 2018

There's a failure in the py2 contrib shard which is definitely "spurious" but is the same NailgunError as shown in the OP. I'm thinking the pantsd connection error message could try to show some of the contents of at least the pantsd log (if it can be located, since this is an exit) -- would that make any sense? It seems like it might have made the linked failure more debuggable.

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Sep 18, 2018

@cosmicexplorer : That is probably not spurious. That test was added as part of #6505.

But yes: attempting to include some structured information from a log would be a good idea. Let's pair on this later today.

cosmicexplorer added a commit to cosmicexplorer/pants that referenced this pull request Sep 27, 2018

move remote log collection into RemoteExiter
i believe this needs pantsbuild#6388 to continue.

the nailgun client reads the pid chunk incorrectly, or something else

if you run the previously-xfailed test with `./pants test
tests/python/pants_test/pantsd:pantsd_integration -- -vs -k test_pantsd_killed_run`, you'll see that
the output from the thin client in the test is:

```
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file

Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Signal 15 was raised. Exiting with failure.
(backtrace omitted)

Additional error killing nailgun client upon exit: A pid was required for some operation, but the session was not initialized yet. (address: 127.0.0.1:57072): ValueError(u'the pid was not yet initialized',)

```

I believe the pid should have been initialized by now, and I believe this may be due to the encoding
issue fixed by pantsbuild#6388. I will verify this now.

cosmicexplorer added some commits Aug 23, 2018

cosmicexplorer added some commits Sep 7, 2018

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:add-pantsd-runner-dead-logging branch from dd3f4bf to 02edca7 Sep 27, 2018

cosmicexplorer added a commit to cosmicexplorer/pants that referenced this pull request Sep 28, 2018

move remote log collection into RemoteExiter
i believe this needs pantsbuild#6388 to continue.

the nailgun client reads the pid chunk incorrectly, or something else

if you run the previously-xfailed test with `./pants test
tests/python/pants_test/pantsd:pantsd_integration -- -vs -k test_pantsd_killed_run`, you'll see that
the output from the thin client in the test is:

```
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file

Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Signal 15 was raised. Exiting with failure.
(backtrace omitted)

Additional error killing nailgun client upon exit: A pid was required for some operation, but the session was not initialized yet. (address: 127.0.0.1:57072): ValueError(u'the pid was not yet initialized',)

```

I believe the pid should have been initialized by now, and I believe this may be due to the encoding
issue fixed by pantsbuild#6388. I will verify this now.

@cosmicexplorer cosmicexplorer merged commit ad42c25 into pantsbuild:master Sep 28, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

cosmicexplorer added a commit to cosmicexplorer/pants that referenced this pull request Sep 28, 2018

move remote log collection into RemoteExiter
i believe this needs pantsbuild#6388 to continue.

the nailgun client reads the pid chunk incorrectly, or something else

if you run the previously-xfailed test with `./pants test
tests/python/pants_test/pantsd:pantsd_integration -- -vs -k test_pantsd_killed_run`, you'll see that
the output from the thin client in the test is:

```
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file

Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Signal 15 was raised. Exiting with failure.
(backtrace omitted)

Additional error killing nailgun client upon exit: A pid was required for some operation, but the session was not initialized yet. (address: 127.0.0.1:57072): ValueError(u'the pid was not yet initialized',)

```

I believe the pid should have been initialized by now, and I believe this may be due to the encoding
issue fixed by pantsbuild#6388. I will verify this now.

cosmicexplorer added a commit to cosmicexplorer/pants that referenced this pull request Oct 2, 2018

move remote log collection into RemoteExiter
i believe this needs pantsbuild#6388 to continue.

the nailgun client reads the pid chunk incorrectly, or something else

if you run the previously-xfailed test with `./pants test
tests/python/pants_test/pantsd:pantsd_integration -- -vs -k test_pantsd_killed_run`, you'll see that
the output from the thin client in the test is:

```
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file

Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Signal 15 was raised. Exiting with failure.
(backtrace omitted)

Additional error killing nailgun client upon exit: A pid was required for some operation, but the session was not initialized yet. (address: 127.0.0.1:57072): ValueError(u'the pid was not yet initialized',)

```

I believe the pid should have been initialized by now, and I believe this may be due to the encoding
issue fixed by pantsbuild#6388. I will verify this now.

cosmicexplorer added a commit that referenced this pull request Oct 3, 2018

move remote log collection into RemoteExiter
i believe this needs #6388 to continue.

the nailgun client reads the pid chunk incorrectly, or something else

if you run the previously-xfailed test with `./pants test
tests/python/pants_test/pantsd:pantsd_integration -- -vs -k test_pantsd_killed_run`, you'll see that
the output from the thin client in the test is:

```
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file

Waiting for file /Users/dmcclanahan/tools/pants/.pants.d/tmp/tmpQbL8l2.pants.d/.workdir.pants.d/some_magic_file
Signal 15 was raised. Exiting with failure.
(backtrace omitted)

Additional error killing nailgun client upon exit: A pid was required for some operation, but the session was not initialized yet. (address: 127.0.0.1:57072): ValueError(u'the pid was not yet initialized',)

```

I believe the pid should have been initialized by now, and I believe this may be due to the encoding
issue fixed by #6388. I will verify this now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment