Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

./pants does not work out of the box - no requests-sessions #1

Closed
jsirois opened this issue Jan 13, 2016 · 2 comments
Closed

./pants does not work out of the box - no requests-sessions #1

jsirois opened this issue Jan 13, 2016 · 2 comments

Comments

@jsirois
Copy link

jsirois commented Jan 13, 2016

Looks like so on a fresh clone:

jsirois@gill ~/dev/3rdparty/fsqio (master) $ git log -1
commit 837a4965444334a849f539ed20f8b6a7f2d46af9
Author: John Gallagher <john@foursquare.com>
Date:   Fri Jan 8 14:41:08 2016 -0500

    Update fhttp readme for fsq.io

    (sapling split of 15944090750f351ba99b4bee763c84019a33affc)
jsirois@gill ~/dev/3rdparty/fsqio (master) $ git clean -fdx && ./pants pom-resolve
++++ which python2.7
+++ PYTHON=/home/jsirois/.pyenv/shims/python2.7
+++ PANTS_HOME=/home/jsirois/.cache/pants/setup
+++ PANTS_BOOTSTRAP=/home/jsirois/.cache/pants/setup/bootstrap
+++ VENV_VERSION=13.1.2
+++ VENV_PACKAGE=virtualenv-13.1.2
+++ VENV_TARBALL=virtualenv-13.1.2.tar.gz
+++ FOURSQUARE_REQUIREMENTS=3rdparty/python/requirements.txt
++++ bootstrap_pants
++++ pants_requirement=pantsbuild.pants
+++++ grep -E '^[[:space:]]*pants_version' pants.ini
+++++ sed -E 's|^[[:space:]]*pants_version[:=][[:space:]]*([^[:space:]]+)|\1|'
++++ pants_version=0.0.65
++++ [[ -n 0.0.65 ]]
++++ pants_requirement=pantsbuild.pants==0.0.65
++++ [[ ! -d /home/jsirois/.cache/pants/setup/bootstrap/0.0.65 ]]
++++ echo /home/jsirois/.cache/pants/setup/bootstrap/0.0.65
+++ pants_dir=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65
+++ export PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+++ PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+ '[' '' == --help ']'
+ '[' '' == force ']'
+ '[' '' '!=' '' ']'
+ exec ./scripts/upkeep/check.sh
++++ PYTHON=/home/jsirois/.pyenv/shims/python2.7
++++ PANTS_HOME=/home/jsirois/.cache/pants/setup
++++ PANTS_BOOTSTRAP=/home/jsirois/.cache/pants/setup/bootstrap
++++ VENV_VERSION=13.1.2
++++ VENV_PACKAGE=virtualenv-13.1.2
++++ VENV_TARBALL=virtualenv-13.1.2.tar.gz
++++ FOURSQUARE_REQUIREMENTS=3rdparty/python/requirements.txt
+++++ bootstrap_pants
+++++ pants_requirement=pantsbuild.pants
++++++ grep -E '^[[:space:]]*pants_version' pants.ini
++++++ sed -E 's|^[[:space:]]*pants_version[:=][[:space:]]*([^[:space:]]+)|\1|'
+++++ pants_version=0.0.65
+++++ [[ -n 0.0.65 ]]
+++++ pants_requirement=pantsbuild.pants==0.0.65
+++++ [[ ! -d /home/jsirois/.cache/pants/setup/bootstrap/0.0.65 ]]
+++++ echo /home/jsirois/.cache/pants/setup/bootstrap/0.0.65
++++ pants_dir=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65
++++ export PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
++++ PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+ set -e
+ shopt -s nullglob
+ ran=
+ '[' '' '!=' '' ']'
+ '[' '' '!=' '' ']'
++++ which python2.7
+++ PYTHON=/home/jsirois/.pyenv/shims/python2.7
+++ PANTS_HOME=/home/jsirois/.cache/pants/setup
+++ PANTS_BOOTSTRAP=/home/jsirois/.cache/pants/setup/bootstrap
+++ VENV_VERSION=13.1.2
+++ VENV_PACKAGE=virtualenv-13.1.2
+++ VENV_TARBALL=virtualenv-13.1.2.tar.gz
+++ FOURSQUARE_REQUIREMENTS=3rdparty/python/requirements.txt
++++ bootstrap_pants
++++ pants_requirement=pantsbuild.pants
+++++ grep -E '^[[:space:]]*pants_version' pants.ini
+++++ sed -E 's|^[[:space:]]*pants_version[:=][[:space:]]*([^[:space:]]+)|\1|'
++++ pants_version=0.0.65
++++ [[ -n 0.0.65 ]]
++++ pants_requirement=pantsbuild.pants==0.0.65
++++ [[ ! -d /home/jsirois/.cache/pants/setup/bootstrap/0.0.65 ]]
++++ echo /home/jsirois/.cache/pants/setup/bootstrap/0.0.65
+++ pants_dir=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65
+++ export PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+++ PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+ '[' -n '' ']'
+ '[' -z /home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants ']'
+ export PYTHONPATH=src/python
+ PYTHONPATH=src/python
+ exec /home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants --print-exception-stacktrace pom-resolve
Traceback (most recent call last):
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 140, in load_backend
    module = importlib.import_module(backend_module)
  File "/usr/lib64/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/home/jsirois/dev/3rdparty/fsqio/src/python/fsqio/pants/register.py", line 15, in <module>
    from fsqio.pants.pom.pom_resolve import PomResolve
  File "/home/jsirois/dev/3rdparty/fsqio/src/python/fsqio/pants/pom/pom_resolve.py", line 39, in <module>
    from requests_futures.sessions import FuturesSession
ImportError: No module named requests_futures.sessions

Exception caught: (<class 'pants.base.exceptions.BackendConfigurationError'>)
  File "/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants", line 11, in <module>
    sys.exit(main())
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/pants_exe.py", line 26, in main
    LocalPantsRunner(exiter).run()
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 60, in run
    self._maybe_profiled(self._run)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 57, in _maybe_profiled
    runner()
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 65, in _run
    options, build_config = OptionsInitializer(options_bootstrapper, exiter=self.exiter).setup()
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/goal_runner.py", line 128, in setup
    return self._setup_options(self._options_bootstrapper, self._working_set)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/goal_runner.py", line 102, in _setup_options
    build_configuration = load_plugins_and_backends(plugins, working_set, backend_packages)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 36, in load_plugins_and_backends
    load_build_configuration_from_source(build_configuration, additional_backends=backends or [])
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 126, in load_build_configuration_from_source
    load_backend(build_configuration, backend_package)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 144, in load_backend
    .format(backend=backend_module, error=e))

Exception message: Failed to load the fsqio.pants.register backend: No module named requests_futures.sessions

So the build script either needs to bootstrap a venv with 3rdparty/python/requirements.txt or else instructions need to be added to the README to do some out of band manual setup.

@jsirois jsirois changed the title ./pants does not work out of the box - no requests-sessions ./pants does not work out of the box - no requests-sessions Jan 13, 2016
@mateor
Copy link
Contributor

mateor commented Jan 13, 2016

Hello, John, how's things : )

Sooo...I had this whole thing where I was asking you what OS you were running and all that....but I see the issue. We thank you in our bootstrap script because that script is basically identical to your pants setup script you made. Since you already have ran that script for 0065, you already have a 0065 virtualenv in your home directory so it never installs the FOURSQUARE_REQUIREMENTS. I will push an update to the bootstrap scripts soon but the fix is generally to just differentiate upstream pants virtualenv from the fsqio one.

A random contributor to Fsqio (patrick) feels pretty strongly we should keep the virtualenv under the repo root, I guess this is an argument in his favor : )

On the plus side, you inspired me to go out and test this on some Linux images. I will update the README but you will need python-dev, JDK8 and libpq-dev and perhaps build-essential to run on linux. I will ping/close this issue when I push the fix, but you can probably get around this by passing PANTS_HOME to a location under the buildroot.

@mateor
Copy link
Contributor

mateor commented Jan 14, 2016

I think we have the fix for this pushed...thanks for the report!

d9396ca

@mateor mateor closed this as completed Jan 14, 2016
mateor added a commit that referenced this issue Dec 17, 2017
This was a colloboration with @abe-winter-4s, who debugged some tricky failure
case that ended up being due to the caching process starting before
the copy to results_dir had completed.

This commit finishes the migration of all our custom codegen back onto
the upstream base class, and leaves the artifact cache and synthetic
target construction to be identical to every target.

This is has been a long-standing problem for us, and the
effected areas are discussed below, but there is a cost for the
correctness:

Regenerating all of our ~500 targets jumps from 30s to about 3 mins
That is super rarely required. In most cases, folks will be rarely
require more than double digit thrift targets at once. This should
safely be a perf win across the board, when incremental compile and
invalidation correctness are factored in. But fyi.

Ivy and javac correctness/speed
-------------------------------
This has always been an issue and has bitten us regularly. But now that
we are experimenting with Ivy resolver, this has the potential to be
very disruptive, because we are currently using the cache_key to keep
the produce artifacts of each resolver separate from the other.

During Pants upgrades we see things like missing analysis files and
ClassNotFound errors, which are generally rooted in the nonexistant
spindle namespacing.

As we move forwad, I intend to more thoroughly convert to the Ivy backend
in order to leverage incremental compile support. One of the biggest
wins on the table there is allowing CI-constructed targets to serve
as the basis of incremental compilation. But that cannot be relied upon
when this munging of output is live, because the things that the incremental
compiler is at risk of doing poorly is in the same scope and therefore
cannot be fairly evaluated with this unfixed.

The speed of Ivy-resolved compiles + warm incremental cache has been
oberved as on the order of 4x what we see in an equivalent pom-resolve
and no incremental run. So this is not academic benefit - getting a warm
cache from our CIs would be one of the biggest wins available.

Pants and Codegen correctness/speed
-----------------------------------

TLDR: This class was coupled to the conventions of Pants 0.40 instead
of modern day Pants. This meant that things that were supposed to
safely create their own namespaces (due to Task version, cache_key value,
changes in Pants/Zinc javac analysis files, JDK version ,etc, etc, xyz)
were not respected.

This included the artifacts being served by the cache. We could clean and
bump the cache_key, but compile nondeterminism would return because the
cache_key of the generated scala stayed the same.

This is a bit of a compromise, in that this is not a pure implementer of
SimpleCodegen. As Pants evolved, it became clear that the need for isolation
and its correctness trumped the need for speed in almost every case.
Before that was fully accepted, codegen tasks were often 1-many as a perf
hack.

As folks wearied of non-determinism, almost every codegen task has moved
away from the "1 -> many" model and onto "1 -> 1". My first pass at this
change enforced that on Spindle - but the perf hit was severe to the
point of near-hilarity.

Spindle was just not written with isolation in mind, and to that matter,
neither was Thrift. Passing a single target to Spindle still requires
gathering all the scala_record_library targets it depends on, and the
code generator will have to parse and regenerate all  of them, no
matter how recently it just did identical work.

Spindle also maintains its own cache, a scalate directory that holds class
files generated from the ssps and internal state. So isolating the ~500 spindle
targets meant that those files were regenerated ~500 times.

The compromise here is to treat the isolation of each Pants invocation as a
sizable enough win. Each invocation that requires executing Spindle creates
a scalate and output directory in a temp dir. Each target is generated from
that location, and a heuristic is used to gather any files that correspond
to the declared namespaces found in the target's direct source files.

Intellij Stubs
-------------
This refactor now allows the stubs to be included in the artifact cache.
It now only regenerates the files that have actually changed, and will
repopulate the output path should any expected files go missing. We
could consider turning this on for global runs now - it would be served
by the cache and silently update to whereever you are checked out to.

The only reason not to turn it on universally is that people who
regularly change thrift files would have to wait through those same
changes to regenerate the spindle and the stubs.

Fully regenerating both sets takes about 4.5 mins, if there are no cache
hits.

Before this change, each invocation of ide-gen took about 30s, no matter
the file state and with no cache support. After this lands, it will
generally noop. If a thrift file changes, it will only regenerate the
required file's worth - a single thrift file takes about 5s, but the
first target is the slowest.

A full rebuild of every stub now takes 60s or so, which is the pathological
case but the one we are worried about.

For now, I have turned on stub generation for the CIs, so generally speaking
users should be able to download them with a manual Pants run.

If you run Pants from inside Intellij, you should add the
--no-ide-gen-spindle-stubs-skip to your invocation, which will always
keep those stubs up to date.

To continue to generate on demand, run:
   ./pants ide-gen --no-skip src/thrift::

and checkout your scripts/Intellij/spindle-stubs/scala_record directory.

Closes: #1

* Fix the --skip option.

I have removed the custom product, so this task will now
be scheduled everytime regular SpindleGen is run. But it
is set to short circuit by default, it runs only when passing

  ./pants ide-gen --no-skip <targets>
        or
  ./pants <goal> --no-ide-gen-spindle-stubs-skip <targets>

I did it this way so that the CI would populate the cache
and so that Intellij users could add that to their IDE
run command and get the stubs populated on as as-needed basis
in the background.

The shortcircuit doesn't happen until after invalidation so
that the CI's tiny artifacts still get pulled down

* fixlint: print path and line

* spindle size checks & PrintWriter.checkError

* unicode fix for linter

* bump spindlegen version to trigger build issues on CI

* bump SpindleGen version again to force rebuild

* bump spindle version to force build

* experiment: disable caching in CI to fix gen-spindle

* restore jenkins caching

* bump version again to inspect jenkins local cache

* bump version (again) to inspect .pants.d/gen in CI

* reorder copy2 & vt.update to solve incomplete artifacts

* remove size_checks (not needed w/out nondeterminism)

* bump version (last time) to lucky number 13

* rm extra imports

* Bump the global cache key and final polish on spindle changes.

* add buildgen

(sapling split of dbb275e02839447ce1796ab9828dcc2ef50f9628)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants