Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add awkward1 and uproot4 to python tools #6119

Merged
merged 2 commits into from Aug 12, 2020

Conversation

mrodozov
Copy link
Contributor

please test
Resolves: #6117
@ianna

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 28, 2020

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @mrodozov (Mircho Rodozov) for branch IB/CMSSW_11_2_X/master.

@cmsbuild, @smuzaffar, @mrodozov, @tulamor can you please review it and eventually sign? Thanks.
cms-bot commands are listed here

@davidlange6
Copy link
Contributor

we really need two uproots and two awkwards?

@mrodozov
Copy link
Contributor Author

that's what @ianna asked for in #6117 .

@ianna
Copy link

ianna commented Jul 28, 2020

we really need two uproots and two awkwards?

The first ones were added on your request. If you think, it’s ok to upgrade, I’d happy to have only the latest versions.

@cmsbuild
Copy link
Contributor

+1
Tested at: db4db68
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-09f0de/8368/summary.html
CMSSW: CMSSW_11_2_X_2020-07-28-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-09f0de/8368/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2525444
  • DQMHistoTests: Total failures: 11
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2525385
  • DQMHistoTests: Total skipped: 47
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 33 files compared)
  • DQMHistoSizes: changed ( 10224.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 144 log files, 17 edm output root files, 34 DQM output files

@davidlange6
Copy link
Contributor

davidlange6 commented Jul 29, 2020 via email

@mrodozov
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 29, 2020

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

Pull request #6119 was updated.

@cmsbuild
Copy link
Contributor

-1

Tested at: 1b514a2

  • Build:

I found compilation warning when building: See details on the summary page.

You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-09f0de/8385/summary.html

@cmsbuild
Copy link
Contributor

Pull request #6119 was updated.

@cmsbuild
Copy link
Contributor

+1
Tested at: 3d95401
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-09f0de/8719/summary.html
CMSSW: CMSSW_11_2_X_2020-08-11-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-09f0de/8719/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 35
  • DQMHistoTests: Total histograms compared: 2612401
  • DQMHistoTests: Total failures: 5
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2612347
  • DQMHistoTests: Total skipped: 48
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 34 files compared)
  • DQMHistoSizes: changed ( 10224.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 149 log files, 22 edm output root files, 35 DQM output files

@mrodozov mrodozov merged commit 065df4b into IB/CMSSW_11_2_X/master Aug 12, 2020
@mrodozov mrodozov deleted the mrodozov-patch-2 branch August 12, 2020 12:10
@smuzaffar
Copy link
Contributor

@mrodozov , this is failing to download the souerces for GCC10. Please check with @ianna and update the version.
@ianna , do we still need two version of awkward ( awkward and awkward1)?

@davidlange6
Copy link
Contributor

davidlange6 commented Dec 8, 2020 via email

@ianna
Copy link

ianna commented Dec 8, 2020

@mrodozov , this is failing to download the souerces for GCC10. Please check with @ianna and update the version.
@ianna , do we still need two version of awkward ( awkward and awkward1)?

We ought to switch the names: awkward to awkward0 and awkward1 to awkward
The same for uproot: uproot to uproot3 and uproot4 to uproot

@ianna
Copy link

ianna commented Dec 8, 2020

Also be aware that the default branch is named main, not master

@davidlange6
Copy link
Contributor

davidlange6 commented Dec 8, 2020 via email

@mrodozov
Copy link
Contributor Author

mrodozov commented Dec 8, 2020

@smuzaffar I see it just failed again. checking the new names

@davidlange6
Copy link
Contributor

davidlange6 commented Dec 8, 2020 via email

@mrodozov
Copy link
Contributor Author

mrodozov commented Dec 8, 2020

Why was the special source location for awkward1 needed?

On Dec 8, 2020, at 1:41 PM, Mircho Rodozov @.***> wrote: @smuzaffar I see it just failed again. checking the new names — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

because of the recursive submodules awkward is downloading to build itself.
at the time I was checking, submodules were downloaded and used for awkward to build, but not installed with it. so that either tho we use a submodule that is available as external this submodule is not later installed in the awkward installation and there will be no duplication in externals. that was one way of doing it, and it worked. It was either this solution or patches for the externals awkward needs that will also install the missing cmake files and find them to configure itself. something of that sort

note the submodules=1 in this line:
https://github.com/cms-sw/cmsdist/pull/6119/files#diff-c406bd44958425c52e021c208a1e0829141232bdd0f5655f683ac31e148fdde0R2

@davidlange6
Copy link
Contributor

davidlange6 commented Dec 8, 2020 via email

@davidlange6
Copy link
Contributor

davidlange6 commented Dec 8, 2020 via email

@jpivarski
Copy link

Is it intentional that Awkward Array is being compiled from source, rather than using the wheels (sdist vs bdist)?

The new structure of Awkward Array and Uproot PyPI names is:

  • awkward0 and uproot3 are the pure Python old versions with the old interfaces, intended for keeping old scripts working (by adding import awkward0 as awkward; import uproot3 as uproot).
  • awkward and uproot (unqualified) are the new versions with new interfaces. The awkward package has CMake + compiler dependencies if compiled from source. The sdist tarball should include the pybind11 and RapidJSON header-only dependencies so that pip install . in the source directory should fully compile and install.
  • awkward1 and uproot4 are now thin-wrappers (pure Python) that depend on awkward and uproot and pass through their symbols. Thus, pip install awkward1 will download and try to install awkward, so that import awkward1 creates a module full of references to the functions and classes that are actually defined in awkward.

This is the end of the name-change process, which started the last week of November. We're not planning on changing the names again: the old awkward0 and uproot3 libraries will always be available, under these names, but not actively maintained. Similarly, the awkward1 and uproot4 names exist for legacy, but as simple pointers.

And yes, if you're getting things from the GitHub repos, the default branches of scikit-hep/awkward-1.0 and scikit-hep/uproot4 are now named "main", rather than "master". (GitHub names have not changed yet, but when they do, GitHub will make redirect URLs.)

@davidlange6
Copy link
Contributor

davidlange6 commented Dec 8, 2020 via email

@jpivarski
Copy link

From the current diff for this PR:

+ awkward1==0.2.27
+ uproot4==0.0.16

Okay, well, these versions of awkward1 and uproot4 are not thin wrappers pointing to awkward and uproot. If you need to pin versions, this is what you should pin them to:

  • awkward0==0.15.1 (all module references "awkward" renamed to "awkward1" so that users can import awkward0 as awkward)
  • uproot3==3.14.1 (all module references "uproot" renamed to "uproot3" so that users can import uproot3 as uproot)
  • to be explicit, uproot3 depends on uproot3-methods==0.10.0
  • awkward==1.0.0 the first new version, which will have a regular cadence from now on, either fortnightly or monthly (TBD, motivated by the fact that binary distributions are large and we'll need to delete release candidates from PyPI, so the non-release candidates have to be regular milestones)
  • uproot==4.0.0 the first new version, which will still be updated as-needed, in response to bug reports
  • awkward1==1.0.0 points to awkward>=1.0.0, so this will be valid forever
  • uproot4==4.0.0 points to uproot>=4.0.0, so this will be valid forever

The versions you've pinned to, awkward1==0.2.27 (July 23, 2020) and uproot4==0.0.16 (July 19, 2020), are quite old. A lot of work has been done since then.

@mrodozov
Copy link
Contributor Author

mrodozov commented Dec 8, 2020

Since then we updated it to https://github.com/cms-sw/cmsdist/blob/IB/CMSSW_11_3_X/master/pip/requirements.txt#L34
So now it's on 0.4.4 and uproot is 0.14
Anyway, should we get the latest and change the names again ?

@jpivarski
Copy link

Yes. We try to compile everything from source.

I'm going to try

pip install awkward --no-binary :all:

to make sure that this works in principle. I'm realizing now that in the checks for the new versions, I hadn't explicitly tried the source distribution, but of course that's important.

It finished while writing this comment:

% pip uninstall awkward
WARNING: Skipping awkward as it is not installed.
% pip install awkward --no-binary :all:
Collecting awkward
  Downloading awkward-1.0.0.tar.gz (804 kB)
     |████████████████████████████████| 804 kB 23.9 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: numpy>=1.13.1 in ./miniconda3/lib/python3.8/site-packages (from awkward) (1.19.1)
Building wheels for collected packages: awkward
  Building wheel for awkward (PEP 517) ... done
  Created wheel for awkward: filename=awkward-1.0.0-cp38-cp38-linux_x86_64.whl size=5996163 sha256=cbf5889858d9fe2e31f0fa060bb39c2db04d4a62cc0221fa35e6c98b7dd5f1b7
  Stored in directory: /home/jpivarski/.cache/pip/wheels/13/ff/9d/acf323e172097f553a0eb203a6c14d41cad375685679dbd2eb
Successfully built awkward
Installing collected packages: awkward
Successfully installed awkward-1.0.0
% python
Python 3.8.5 | packaged by conda-forge | (default, Jul 31 2020, 02:39:48) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import awkward
>>> awkward.Array([[1, 2, 3], [], [4, 5]])
<Array [[1, 2, 3], [], [4, 5]] type='3 * var * int64'>

It looks like compilation from source is, in principle, not broken. CMake, make, and a compiler are required—I don't believe there are other compilation dependencies. (NumPy is a Python dependency, as you can see above.)

We should instead add dependencies to tell awkward about them. It looks like we have one of the dependencies (pybind11) and not the other (rapidjson)

In Awkward Array, pybind11 is part of its public API (its only reason for being) but RapidJSON is not even "leaked" into the public API. None of the include/awkward/**/*.h reveal any references to RapidJSON. Also, we statically compile RapidJSON into Awkward Array (using its header-only library interface) and we do not expose symbols used internally (that's actually a pybind11 requirement). So RapidJSON can be considered a completely internal aspect of Awkward Array compilation—there's no reason to share versions with other libraries or avoid conflicting symbols (there are none).

As for pybind11, this is going to be tricky. I've done tests in which two libraries both compile with different versions of pybind11. They can both be loaded into Python without errors (because pybind11 requires projects to hide symbols by default), but it wasn't possible to pass Awkward Arrays from one of these libraries to the other—it just didn't recognize the types. When both libraries are compiled with the same version of pybind11, there's no issue. So libraries that depend on Awkward Array at the C++ level (such as what we're planning for pyjet: scikit-hep/pyjet#31) will depend on Awkward with a particular version of pybind11. I'll probably need to document which versions of Awkward depend on which versions of pybind11 and change it infrequently. If you're compiling Awkward Array and injecting a different pybind11, that can make the Awkward-version/pybind11-version not match the table. We should at least talk about that to come up with a good plan.

Right now, Awkward Array is pinned to pybind11==2.6.0 because this is the first version that supports Python 3.9. There's also a pybind11=2.6.1, but Awkward Array doesn't use it yet.

@jpivarski
Copy link

Since then we updated it to https://github.com/cms-sw/cmsdist/blob/IB/CMSSW_11_3_X/master/pip/requirements.txt#L34
So now it's on 0.4.4 and uproot is 0.14
Anyway, should we get the latest and change the names again ?

awkward==0.14.0
awkward1==0.4.4

uproot==3.13.0
uproot-methods==0.8.0
uproot4==0.1.2

These are nearly the latest versions before the name change. It would be a good idea to cross the boundary to the new names if this is a good time to make such a change. The new names and versions should be taken all at once, not piecemeal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New external tools: Awkward Array 1.0 and uproot4
6 participants