8267213: cpuinfo_segv is incorrectly triaged as execution protection violation on x86_32 #4044

Closed
DamonFool
Member

@DamonFool DamonFool commented May 17, 2021

Hi all,

This is a follow-up of JDK-8260046.
It can be reproduced by running java -XX:UnguardOnExecutionViolation=1 on x86_32.
Let's fix it.

Thanks.
Best regards,
Jie


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8267213: cpuinfo_segv is incorrectly triaged as execution protection violation on x86_32

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4044/head:pull/4044
$ git checkout pull/4044

Update a local copy of the PR:
$ git checkout pull/4044
$ git pull https://git.openjdk.java.net/jdk pull/4044/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4044

View PR using the GUI difftool:
$ git pr show -t 4044

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4044.diff

@bridgekeeper

bridgekeeper bot commented May 17, 2021

👋 Welcome back jiefu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@DamonFool
Member Author

/test
/label add hotspot
/cc hotspot

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label May 17, 2021
@openjdk

openjdk bot commented May 17, 2021

@DamonFool
The hotspot label was successfully added.

@openjdk

openjdk bot commented May 17, 2021

@DamonFool The hotspot label was already applied.

@DamonFool DamonFool marked this pull request as ready for review May 17, 2021 06:06
@openjdk openjdk bot added the rfr Pull request is ready for review label May 17, 2021
Member

@dholmes-ora dholmes-ora left a comment


Hi Jie,

I'm not sure this is the right fix. It seems to me from the comments about having a sane addr and pc that the basic assumption/premise is that addr > pc as it is being checked when expected to be part of the current instruction. So if addr < pc I would think pc_is_near_addr should be false.

David

@DamonFool
Member Author

Hi Jie,

I'm not sure this is the right fix. It seems to me from the comments about having a sane addr and pc that the basic assumption/premise is that addr > pc as it is being checked when expected to be part of the current instruction. So if addr < pc I would think pc_is_near_addr should be false.

David

Thanks @dholmes-ora for your review.

Okay, I think your suggestion is good since it just works as before and won't make things worse.

But I'm not sure whether the basic assumption/premise that addr > pc is always right.

Let's consider the following case

...
    inst1
    inst2
    jmp L1  <--- sizeof(jmp) = 5 bytes, the first byte is on Page k, the other 4 bytes on Page k+1
    ...
L1:

In this case, I think addr would point to some part of the jmp instruction and pc would point to L1, which means addr < pc.
What do you think?

Thanks.
Best regards,
Jie

@albertnetymk
Member

How about removing the if?

bool pc_is_near_addr = (addr >= pc) && 
                       pointer_delta((void*) addr, (void*) pc, sizeof(char)) < 15;
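
A minimal, self-contained sketch of the one-sided check suggested above, assuming plain integer addresses. The function below is a hypothetical stand-in and does not call HotSpot's pointer_delta; it only illustrates that testing addr >= pc first keeps the subtraction from ever running with addr < pc.

```cpp
#include <cstdint>

// Hypothetical stand-in for the suggested expression, using integer addresses.
// The subtraction is only evaluated when addr >= pc, so it cannot wrap around.
static bool pc_is_near_addr(std::uintptr_t addr, std::uintptr_t pc) {
  return addr >= pc && (addr - pc) < 15;
}
```

With this form, a faulting address 14 bytes past pc counts as near, one 15 bytes past does not, and any address below pc is rejected before the distance is computed.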

@DamonFool
Member Author

How about removing the if?

bool pc_is_near_addr = (addr >= pc) && 
                       pointer_delta((void*) addr, (void*) pc, sizeof(char)) < 15;

Good suggestion.
Updated.
Thanks.

@mlbridge

mlbridge bot commented May 17, 2021

Mailing list message from David Holmes on hotspot-dev:

On 17/05/2021 5:54 pm, Jie Fu wrote:

On Mon, 17 May 2021 06:45:45 GMT, David Holmes <dholmes at openjdk.org> wrote:

Hi Jie,

I'm not sure this is the right fix. It seems to me from the comments about having a sane addr and pc that the basic assumption/premise is that addr > pc as it is being checked when expected to be part of the current instruction. So if addr < pc I would think pc_is_near_addr should be false.

David

Thanks @dholmes-ora for your review.

Okay, I think your suggestion is good since it just works as before and won't make things worse.

But I'm not sure whether the basic assumption/premise that `addr > pc` is always right.

Let's consider the following case

...
inst1
inst2
jmp L1 <--- sizeof(jmp) = 5 bytes, the first byte is on Page k, the other 4 bytes on Page k+1
...
L1:

In this case, I think `addr` would point to some part of the jmp instruction and `pc` would point to L1, which means addr < pc.
What do you think?

I would expect pc to point to start of jmp instruction and addr to point
later.

David
-----

@DamonFool
Member Author

I would expect pc to point to start of jmp instruction and addr to point
later.

OK.
If so, the original code is right, and the latest fix just keeps its previous behavior.
Thanks.

@mlbridge

mlbridge bot commented May 18, 2021

Mailing list message from David Holmes on hotspot-dev:

Hi Jie,

On 18/05/2021 9:14 am, Jie Fu wrote:

On Mon, 17 May 2021 22:27:16 GMT, David Holmes <david.holmes at oracle.com> wrote:

I would expect pc to point to start of jmp instruction and addr to point
later.

OK.
If so, the original code is right, and the latest fix just keeps its previous behavior.
Thanks.

I don't know how this "pc is near addr" check ends up affecting the
assert(left >= right) but presumably we are hitting a case where the
addr is in fact < pc. So the question is then whether that should be
considered "near" or not. Your original fix decided "near" means within
15 in either direction; while under an expectation that addr >= pc, the
current fix only considers it near within +15.

Have you analysed the original crash to check what the actual pc and
addr values were?

Thanks,
David

@DamonFool
Member Author

The crash case should not be considered "near" since addr = 0x0, pc = 0xe66095e6.
It seems like a harmless false-positive check [1].

And the stack is like this:

Current thread (0xf5817218):  JavaThread "Unknown thread" [_thread_in_vm, id=41005, stack(0xf5907000,0xf5958000)]

Stack: [0xf5907000,0xf5958000],  sp=0xf59559f0,  free space=314k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x1103c98]  PosixSignals::pd_hotspot_signal_handler(int, siginfo_t*, ucontext_t*, JavaThread*)+0x588
V  [libjvm.so+0x137209e]  JVM_handle_linux_signal+0x14e
V  [libjvm.so+0x1372313]  javaSignalHandler(int, siginfo_t*, void*)+0x23

Registers:
EAX=0xf779b000, EBX=0xf74c5ae8, ECX=0xf5817218, EDX=0xf753791c
ESP=0xf59559f0, EBP=0xf5955a58, ESI=0xe66095e6, EDI=0x00000000
EIP=0xf6b67c98, EFLAGS=0x00210213, CR2=0x00000000f779b000

Thanks.

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp#L344
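
To make the numbers concrete, here is a small sketch using the addr and pc values reported above, treated as 32-bit unsigned integers. The helper names are hypothetical and are not part of HotSpot; the sketch only shows what an unchecked subtraction yields when addr < pc, and that a symmetric "within 15 bytes in either direction" test also rejects this pair.

```cpp
#include <cstdint>

// Unchecked 32-bit subtraction: wraps modulo 2^32 when addr < pc.
static std::uint32_t wrapped_delta(std::uint32_t addr, std::uint32_t pc) {
  return static_cast<std::uint32_t>(addr - pc);
}

// "Near" measured in either direction, computed without wrapping.
static bool near_either_direction(std::uint32_t addr, std::uint32_t pc) {
  const std::uint32_t distance = (addr >= pc) ? (addr - pc) : (pc - addr);
  return distance < 15;
}
```

For addr = 0x0 and pc = 0xe66095e6, wrapped_delta gives 0x199f6a1a and near_either_direction returns false, so the reported pair is not near under either reading. What fails is the subtraction with addr < pc, not the nearness result.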

@DamonFool
Member Author

Have you analysed the original crash to check what the actual pc and
addr values were?

Hi @dholmes-ora ,

After more investigation, I believe this is a signal handling bug.

In this case, cpuinfo_segv is incorrectly triaged as execution protection violation on Linux/x86_32.

During VM initialization, cpuinfo_segv [1] will be triggered (by accessing addr=0) on purpose.

#0  VM_Version::get_processor_features () at /home/jdk/src/hotspot/cpu/x86/vm_version_x86.cpp:630
#1  0xf720cc21 in VM_Version::initialize () at /home/jdk/src/hotspot/cpu/x86/vm_version_x86.cpp:1890
#2  0xf7206d85 in VM_Version_init () at /home/jdk/src/hotspot/share/runtime/vm_version.cpp:32
#3  0xf6b72e4f in init_globals () at /home/jdk/src/hotspot/share/runtime/init.cpp:119
#4  0xf71500c6 in Threads::create_vm (args=0xf621a26c, canTryAgain=0xf621a1d3) at /home/jdk/src/hotspot/share/runtime/thread.cpp:2854
#5  0xf6c6b167 in JNI_CreateJavaVM_inner (vm=0xf621a2bc, penv=0xf621a2c0, args=0xf621a26c) at /home/jdk/src/hotspot/share/prims/jni.cpp:3592
#6  0xf6c6b35c in JNI_CreateJavaVM (vm=0xf621a2bc, penv=0xf621a2c0, args=0xf621a26c) at /home/jdk/src/hotspot/share/prims/jni.cpp:3680
#7  0xf7fbe61f in InitializeJVM (pvm=0xf621a2bc, penv=0xf621a2c0, ifn=0xf621a300) at /home/jdk/src/java.base/share/native/libjli/java.c:1539
#8  0xf7fbb283 in JavaMain (_args=0xffffa484) at /home/jdk/src/java.base/share/native/libjli/java.c:415
#9  0xf7fc1bed in ThreadJavaMain (args=0xffffa484) at /home/jdk/src/java.base/unix/native/libjli/java_md.c:651
#10 0xf7d983bd in start_thread (arg=0xf621ab40) at pthread_create.c:463

The VM recognizes it as cpuinfo_segv [2] here and assigns the stub.
Unfortunately, it is then re-triaged as an execution protection violation on x86_32 when UnguardOnExecutionViolation > 0, which shouldn't happen.

To avoid this kind of false positive, one more condition, stub == NULL, is added.

Note: we don't need to change Windows since there is a special signal for this condition [3].

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/vm_version_x86.cpp#L466
[2] https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_x86/os_linux_x86.cpp#L246
[3] https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L2449
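
The effect of the extra stub == NULL condition can be modeled in a few lines. This is a simplified sketch and not the HotSpot signal handler: the function name, its parameters, and the with_fix switch are hypothetical, and the real code checks further conditions that are omitted here.

```cpp
// Simplified model of the triage order described above (all names hypothetical).
// stub_assigned: an earlier check (for example the cpuinfo_segv one) already
//                recognized the fault and chose a continuation stub.
// unguard_flag:  value of UnguardOnExecutionViolation.
// with_fix:      whether the additional "stub == NULL" condition is applied.
static bool enters_exec_violation_check(bool stub_assigned,
                                        int unguard_flag,
                                        bool with_fix) {
  if (unguard_flag <= 0) {
    return false;  // the check is disabled entirely
  }
  if (with_fix && stub_assigned) {
    return false;  // fault already triaged, do not re-triage it
  }
  return true;
}
```

In this model, a fault that already has a stub still reaches the execution protection check without the fix when the flag is 1, and skips it with the fix. Faults without a stub behave the same either way.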

@DamonFool DamonFool changed the title 8267213: assert(left >= right) failed: avoid underflow 8267213: cpuinfo_segv is incorrectly triaged as execution protection violation on x86_32 May 23, 2021
@DamonFool
Member Author

/summary 8267213: cpuinfo_segv is incorrectly triaged as execution protection violation on x86_32

@openjdk

openjdk bot commented May 23, 2021

@DamonFool Setting summary to 8267213: cpuinfo_segv is incorrectly triaged as execution protection violation on x86_32

@DamonFool
Member Author

May I get reviews for this small fix?
Thanks.

Member

@dholmes-ora dholmes-ora left a comment


Sorry for the delay.

The revised analysis and fix seem reasonable.

Thanks,
David

@openjdk

openjdk bot commented May 24, 2021

@DamonFool This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8267213: cpuinfo_segv is incorrectly triaged as execution protection violation on x86_32

Reviewed-by: dholmes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 103 new commits pushed to the master branch:

  • 3113910: 8267553: Extra JavaThread assignment in ClassLoader::create_class_path_entry()
  • 4d26f22: 8264304: Create implementation for NSAccessibilityToolbar protocol peer
  • 6288a99: 8267404: vmTestbase/vm/mlvm/anonloader/stress/oome/metaspace/Test.java failed with OutOfMemoryError
  • 71e2fa2: 8267531: [x86] Assembler::andb(Address,Register) encoding is incorrect
  • 4023646: 8266528: Optimize C2 VerifyIterativeGVN execution time
  • 2462316: 8261354: SIGSEGV at MethodIteratorHost
  • 72c9567: 8263486: Clean up MTLSurfaceDataBase.h
  • fe33343: 8256304: should MonitorUsedDeflationThreshold be experimental or diagnostic
  • 8f10c5a: 8267190: Optimize Vector API test operations
  • 94cfeb9: 8256155: Allow multiple large page sizes to be used on Linux
  • ... and 93 more: https://git.openjdk.java.net/jdk/compare/2066f497b9677971ece0b8a4d855f87a2f4c4018...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 24, 2021
@DamonFool
Member Author

The revised analysis and fix seems reasonable.

Thanks @dholmes-ora .

@DamonFool
Member Author

/summary

@openjdk

openjdk bot commented May 24, 2021

@DamonFool Removing existing summary

@DamonFool
Member Author

/integrate

@openjdk openjdk bot closed this May 25, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 25, 2021
@openjdk

openjdk bot commented May 25, 2021

@DamonFool Since your change was applied there have been 128 commits pushed to the master branch:

  • 66b190e: 8267612: Declare package-private VarHandle.AccessMode/AccessType counts
  • a52c4ed: 8267110: Update java.util to use instanceof pattern variable
  • 0a03fc8: 8255674: SSLEngine class description is missing "case" in switch statement
  • d86f916: 8267066: New NSAccessibility peers should return they roles and subroles directly
  • 31d0f0d: 8248843: java in source-file mode suggests javac-only options
  • 2e8812d: 8265129: Add intrinsic support for JVM.getClassId
  • 123cdd1: 8264973: AArch64: Optimize vector max/min/add reduction of two integers with NEON pairwise instructions
  • b4d4884: 8267126: javadoc should show "line and caret" for diagnostics.
  • 461a3fe: 8261478: InstanceKlass::set_classpath_index does not match comments
  • de27da7: 8267431: Rename InstanceKlass::has_old_class_version to can_be_verified_at_dumptime
  • ... and 118 more: https://git.openjdk.java.net/jdk/compare/2066f497b9677971ece0b8a4d855f87a2f4c4018...master

Your commit was automatically rebased without conflicts.

Pushed as commit b403d39.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@DamonFool DamonFool deleted the JDK-8267213 branch May 25, 2021 11:57