Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NETBEANS-1428] Fix for profiler crash - with CI/CD workflow #2700

Merged
merged 7 commits into from Apr 1, 2021
Merged

[NETBEANS-1428] Fix for profiler crash - with CI/CD workflow #2700

merged 7 commits into from Apr 1, 2021

Conversation

lbruun
Copy link
Contributor

@lbruun lbruun commented Jan 21, 2021

This is a fix for [NETBEANS-1428]

This PR replaces #2021.
This PR builds upon the fixes made by Peter Hull (I've cherry-picked commits from PR-2021) but corrects a number of issues, changes copyright from Oracle to ASF and finally adds a GitHub Actions workflow which builds the native binary on Windows, Linux and MacOS in a manner which is consistent, transparent and reproducible. The core component of the fix is still Peter's. Peter analyzed the bug, diagnosed it and came up with a solution (partly based on discussion on the OpenJDK mailing list) so all credit to Peter!!.

Background

The reason for the bug is that the original C code in lib.profiler, which acts as JNI interface, made an incorrect assumption about how to convert from jint to jmethodID and in doing so effectively relied on an undocumented JVM internal. This assumption was broken with Java 9 on Windows, hence the crash. Peter's fix is to have a HashMap in the C code which can translate from jint to jmethodID but only do so on platforms where the two are of different implementation and therefore need conversion.

It can be said that the bug existed on all platforms, but because of some JVM internals only manifests itself on the Windows platform. Peter's fix targets all platforms and this is important as we are then immune to the bug if the JVM in future changes its (undocumented) internals for Linux or MacOS. Therefore it is important to re-build binaries not only for Windows, but for other platforms too.

Produced artifacts

The GitHub Workflow added in this PR produce binaries for

  • Linux x86_64 , 32-bit and 64-bit
  • Windows x86_64, 32-bit and 64-bit
  • MacOS x86_64, 64-bit only

as all other platforms are now considered unsupported from NetBeans Profiler point of view.

The GitHub workflow produces a ZIP file, profiler-external-binaries-ASF.zip which can be downloaded from the "Artifacts" section of a particular run in the GitHub Actions UI. This ZIP is suitable for upload to https://netbeans.osuosl.org/binaries/.

Why another PR?

Although the base of the fix in #2021, the C code, was/is fully correct, I felt there were too many changes required to make the PR work (reproducible build, rather than as a one-time lucky punch) and only the PR creator would have permission to make those changes. It could potentially take many ping-pong discussions. Also this PR has a clear view of how the fix can be incorporated into the NetBeans distribution following acceptance of the PR. See below.

Testing this PR

or rather : testing the new binaries which are a result of this PR.

Provoke the crash

  1. Be sure that your NetBeans IDE uses a 64-bit JDK. In order to actually see the bug you need to be on Windows and your IDE must be using Java 9 or later.
  2. Start Netbeans IDE
  3. Select File|New Project|Samples|Java with Ant|Anagram Game .. and create the project.
  4. On the "AnagramGame" project: Go project properties|Libraries and make sure a JDK platform 9 or later is selected.
  5. On the "AnagramGame" project: Right click project and select "Profile".
  6. Click Configure Session and select "Methods"
  7. Start profiling session
  8. Anagram Game's window will briefly appear and then disappear. The Output Window will record a JVM crash.

Replace with new binaries

  1. Exit NetBeans IDE
  2. Download the profiler-external-binaries-ASF.zip from the "Artifacts" sections of the GitHub Actions UI and explode it onto your NetBeans IDE installation directory, folder netbeans/profiler/, so that files from the ZIP overwrite what you already have on your harddisk. (you should make a backup prior to this of the files that will be overwritten).
  3. Start the IDE.
  4. Do steps 1-8 again from the "Provoke the crash" section. The crash should now no longer happen.

For MacOS and Linux there's no crash, so the test should be that the new binaries (also) work.

Next steps

This is essentially a two-step process which requires not one, but two, PRs. This is the first of those PRs.

Once this PR is accepted the GitHub workflow will run and produce artifact. Someone with proper permission must then upload this to
https://netbeans.osuosl.org/binaries/. (naming the file profiler-external-binaries-ASF.zip)

I'll then produce a follow-up PR which incorporates the new ZIP in the Ant build scripts like this;

Currently the Ant build script looks like this:

    <target name="-process.release.files">
        <unzip src="external/profiler-external-binaries-8.2.zip" 
               overwrite="false"
               dest="${profiler.cluster}"/>
    </target>

in the future it will look like this:

    <target name="-process.release.files">
        <!-- CDDL licensed binaries, from the Oracle days -->
        <unzip src="external/profiler-external-binaries-8.2.zip" 
               overwrite="false"
               dest="${profiler.cluster}"/>
    </target>
        <!-- AL2 licensed binaries, build from ASF code -->
    <target name="-process.release.files">
        <unzip src="external/profiler-external-binaries-ASF.zip" 
               overwrite="true"
               dest="${profiler.cluster}"/>
    </target>

In other words: We retain all esoteric binaries from the CDDL licensed bundle but those are overlayed with ones created in ASF (meaning those that are AL2 licensed). The new ASF bundle only include those platforms listed at the top and which are probably the only platforms we can truly support going forward. So for the Windows, MacOS and Linux platforms it will be the binaries from the ASF build which are effectively being used ... and since these are the only ones which has been rebuild it is also the only ones which includes the bug fix for [NETBEANS-1428] but I think that is acceptable as we have no knowledge that the bug will manifest itself on those esoteric platforms. Nor do we have any means of re-building for those platforms. So we leave them in the dust.

pedro-w and others added 6 commits January 21, 2021 11:52
* Stacks.c: corrected some incorrect casts for 64-bit windows. Implemented
  mapping from methodIds to ints using a hash table for 64-bit platforms.
* Updated build scripts for Windows, Linux and Mac.
* Moved scripts out of the build subdirectory.
Build scripts for legacy platforms moved to 'legacy' folder to signal
that these scripts are no longer being maintained .. nor used.
As of now we focus on the following platforms only:

Linux X64  (64-bit)
Linux X86  (32-bit)
Windows X64  (64-bit)
Windows X86  (32-bit)
MacOS X64  (64-bit)

Additonal note: Generating JNI headers for Java classes is now part
of the Ant build so the 'generate-headers-15.bat' file is also
no longer used.
Windows only: This information shows up if you right-click the DLL
in Windows File Explorer and choose 'Properties' --> 'Details'.
Properly very few people would do that. All information here,
including the version information, is irrelevant. The version number
is just an arbitrary string. There is no ambition that this
version number follows the general NetBeans versioning.
Since the build scripts will from now on be used in a CI/CD pipeline
they need to be strictly correct, not have bugs and not require manual
intervention. Therefore...

Fixes:
- The Windows build scripts did not incorporate Peter's fix, i.e. creating
  the 'config.h' file as a pre-cursor to the actual build.
  This is now fixed. (compare to build scripts for MacOS and Linux)
- The Windows build script did not include headers from '..\build'
  when compiling. They therefore did not work. This is now fixed.
  (this is a consequence of the JNI headers having been removed from
   source control in commit f395ace;
   the change was correctly implemented at the time in build scripts
   for Linux and MacOS but for some reason never in build scripts for
   Windows)
- The Windows build scripts will now stop on first error. (checks for
  exit code and abends immediately). The Linux and MacOS scripts do not
  need the same logic as such script can simply be executed with
  'bash -x <script>'.
- The content of the 'config.h' file is now echoed to the console
  as part of the build. This file is crucial to understand the fix
  for the bug reported in NETBEANS-1428.
- Since we will be creating 32-bit binaries on 64-bit platforms
  (because we can no longer find 32-bit hosts to build on):
  The Linux 32-bit script did not create a 32-bit binary for the
  'config' application, hence it produced an incorrect 'config.h' file.
- The Windows build scripts could not handle path names with spaces
  in them as is typical in Windows world. Hence the script did not
  work when executed from a GitHub Action workflow. This is now fixed.
  (too lazy to fix the same problem in the MacOS and Linux scripts;
   on those platforms the problem never manifests itself)
- MacOS script: was hardwired to use the installed JDK rather
  than using the JDK_HOME environment variable. This is now fixed.
- MacOS script: added comments and transparency about how the Universal
  Library mechanism works. (the result of the mechanism is that
  binaries for legacy platforms PowerPC and i386 architectures
  still exist in the bundle but do not receive bug fixes .. meaning
  they'll not as an example receive bug fix for NETBEANS-1428)
- The Windows build scripts used Unix line endings. Although Windows
  doesn't mind this when executing such script it has now been corrected.
@errael
Copy link
Contributor

errael commented Jan 21, 2021

A cause for celebration.

Someone with proper permission must then upload this

So should the internals change in the future, the manual step is required to upload?

if the JVM in future changes its (undocumented) internals for Linux or MacOS ... as all other platforms are now considered unsupported from NetBeans Profiler

It it correct that these other platforms will continue to work (assuming the internals aren't changed)?

Can you get any reviewers assigned?

@pedro-w
Copy link
Contributor

pedro-w commented Jan 21, 2021

Big thumbs up @lbruun, I'm very glad you've been able to finish this off properly.

@lbruun
Copy link
Contributor Author

lbruun commented Jan 21, 2021

A cause for celebration.

Someone with proper permission must then upload this

So should the internals change in the future, the manual step is required to upload?

if the JVM in future changes its (undocumented) internals for Linux or MacOS ... as all other platforms are now considered unsupported from NetBeans Profiler

It it correct that these other platforms will continue to work (assuming the internals aren't changed)?

Can you get any reviewers assigned?

…filer

The workflow builds 'profilerinterface' library on the currently
maintained platforms (see file header) and produces a single ZIP file
which is suitable for upload to https://netbeans.osuosl.org/binaries/.
@lbruun lbruun closed this Mar 21, 2021
@lbruun lbruun reopened this Mar 21, 2021
@lbruun
Copy link
Contributor Author

lbruun commented Apr 1, 2021

Merging as per the discussion on the dev mailing list.

@lbruun lbruun merged commit 5273a10 into apache:master Apr 1, 2021
@pedro-w
Copy link
Contributor

pedro-w commented Apr 1, 2021

This is great. Thanks @lbruun !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants