Libhermit's use of halt and the panic_handler #45

Closed
jounathaen opened this issue May 6, 2020 · 10 comments

@jounathaen
Member

jounathaen commented May 6, 2020

When panicking, RustyHermit loops on arch::processor::halt() forever. Uhyve handles hlt instructions by shutting down the VM with exit code 0. (hermit-os/uhyve#7)

Related Issue: hermit-os/uhyve#11

RustyHermit should not use arch::processor::halt when running in a VM but use the proper shutdown command.

  • It should probably keep using a hlt loop when running on real hardware so that the error message can be observed on e.g. the monitor.
  • It could write to the shutdown port first, so that uhyve shuts down before the hlt loop is reached. I don't know how real hardware handles out instructions to a non-existent port.
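
The two options above can be sketched as a small decision function. This is only an illustration under stated assumptions: `Environment`, `PanicAction`, `panic_action` and the exit code 1 are hypothetical names and values, not RustyHermit's or uhyve's actual API, and how the kernel detects that it runs under uhyve is left out.

```rust
/// Where the kernel is running (hypothetical; detection is not shown here).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Environment {
    Uhyve,
    BareMetal,
}

/// What the panic handler does after printing the panic message.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PanicAction {
    /// Ask the hypervisor to shut down with a non-zero exit code.
    Shutdown { exit_code: i32 },
    /// Stay in a `hlt` loop so the message remains visible on the monitor.
    HaltLoop,
}

/// Picks the panic behaviour for the given environment.
/// The exit code 1 is an assumption made for this sketch.
pub fn panic_action(env: Environment) -> PanicAction {
    match env {
        Environment::Uhyve => PanicAction::Shutdown { exit_code: 1 },
        Environment::BareMetal => PanicAction::HaltLoop,
    }
}
```

In a real kernel the `Shutdown` arm would issue the actual shutdown request and the `HaltLoop` arm would keep calling arch::processor::halt(). Returning a value instead keeps the sketch testable on a host machine.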
@stlankes
Contributor

Doesn't the patch solve this issue?

@jschwe
Contributor

jschwe commented May 15, 2020

Is this issue completely solved?

I'd like to link to the Travis CI build for commit 4a2381ccc192b3d699fd4b54fb007e297bac9e2d. In release mode, it seems a race condition occurred that caused a panic due to a BorrowError: [0][!!!PANIC!!!] /home/travis/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/cell.rs:798: already mutably borrowed: BorrowError
Given the output, I assume that the BorrowError occurred in the kernel or the demo and not in uhyve.
What makes this related to this issue is that uhyve doesn't exit after the panic.

I've saved the log output here so it doesn't get lost if someone reruns the build.
log.txt

Originally posted by @jschwe in hermit-os/uhyve#7 (comment)

@stlankes
Contributor

stlankes commented May 15, 2020

I guess that an interrupt handler is trying to borrow something that is already borrowed. I will check it. In my opinion, it is a completely different problem and has nothing to do with this issue.
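
For illustration, the suspected failure mode can be reproduced on a host with a plain `RefCell`. This is a minimal sketch, not kernel code: the "interrupt handler" is only simulated by a second access inside the same function, and both function names are made up for this example.

```rust
use std::cell::RefCell;

/// Simulates a handler reading state while other code still holds a
/// mutable borrow. Returns whether the shared borrow succeeded.
pub fn shared_borrow_during_mutable_borrow() -> bool {
    let state = RefCell::new(0u32);
    let _guard = state.borrow_mut(); // "normal" code holds a mutable borrow
    // `borrow()` would panic here with
    // "already mutably borrowed: BorrowError";
    // `try_borrow()` reports the same condition as an `Err` instead.
    let ok = state.try_borrow().is_ok();
    ok
}

/// The same access succeeds once the mutable borrow has been released.
pub fn shared_borrow_after_release() -> bool {
    let state = RefCell::new(0u32);
    {
        let mut guard = state.borrow_mut();
        *guard += 1;
    } // the mutable borrow ends here
    let ok = state.try_borrow().is_ok();
    ok
}
```

The first function returns `false` and the second returns `true`. In a kernel, the second access would come from an interrupt that fires while the borrow is live, which would explain why the failure only shows up occasionally.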

@jschwe
Contributor

jschwe commented May 15, 2020

The reason I'm mentioning this here is that uhyve doesn't exit after the panic.

@stlankes
Contributor

stlankes commented May 15, 2020

Ah... You tested with the current patch?

@jschwe
Contributor

jschwe commented May 15, 2020

The Travis CI build was for hermit-os/uhyve@4a2381c and tested against hermit-os/hermit-rs@18a0235 (the state of the master branch as of 14 hours ago).

For uhyve, this means that the halt state is ignored, since hermit-os/uhyve#14 is already merged. So if the panic handler still enters the halt state, that would explain why uhyve doesn't exit.

@stlankes
Contributor

I hope that PR #51 helps to fix the issue.

@stlankes stlankes reopened this May 15, 2020
@jschwe
Contributor

jschwe commented May 20, 2020

The Travis CI job for fc5d048 failed and entered the idle loop. Before the branch was merged, all tests passed.
This indicates that the issue is not yet solved, or that the failure is flaky.
The relevant part of the log:

Backtrace

$ sudo -E sudo -u $USER -E bash -c "HERMIT_VERBOSE=1 HERMIT_CPUS=2 $HOME/.cargo/bin/uhyve target/x86_64-unknown-hermit/debug/rusty_demo"
[0][INFO] Welcome to HermitCore-rs 0.3.30
[0][INFO] Kernel starts at 0x200000
[0][INFO] BSS starts at 0x47d680
[0][INFO] TLS starts at 0x47b5c8 (size 304 Bytes)
[0][INFO] Total memory size: 64 MB
[0][INFO] A pure Rust application is running on top of HermitCore!
[0][INFO] Heap: size 50 MB, start address 0x600000
[0][INFO] Heap is located at 0x600000 -- 0x3800000 (0 Bytes unmapped)
[0][INFO] 
[0][INFO] ===================== PHYSICAL MEMORY FREE LIST ======================
[0][INFO] 0x00000003800000 - 0x00000004000000
[0][INFO] ======================================================================
[0][INFO] 
[0][INFO] 
[0][INFO] ================== KERNEL VIRTUAL MEMORY FREE LIST ===================
[0][INFO] 0x00000003800000 - 0x00800000000000
[0][INFO] ======================================================================
[0][INFO] 
[0][INFO] 
[0][INFO] ========================== CPU INFORMATION ===========================
[0][INFO] Model:                   uhyve - unikernel hypervisor
[0][INFO] Frequency:               2800 MHz (from Hypervisor)
[0][INFO] SpeedStep Technology:    Not Available
[0][INFO] Features:                MMX SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 AVX AESNI RDRAND FMA MOVBE MCE FXSR XSAVE RDTSCP CLFLUSH TSC-DEADLINE X2APIC AVX2 AVX512F AVX512DQ AVX512CD AVX512BW AVX512VL BMI1 BMI2 RTM HLE FSGSBASE 
[0][INFO] Physical Address Width:  46 bits
[0][INFO] Linear Address Width:    48 bits
[0][INFO] Supports 1GiB Pages:     No
[0][INFO] ======================================================================
[0][INFO] 
[0][INFO] HermitCore-rs booted on 2020-05-20 at 09:54:51
[0][INFO] 
[0][INFO] ======================== PCI BUS INFORMATION =========================
[0][INFO] 00:00 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1AF4:1000], MemoryBar: 0xc000 (size 0x10)
[0][INFO] ======================================================================
[0][INFO] 
[0][INFO] IOAPIC v17 has 24 entries
[0][INFO] Disable IOAPIC timer
[1][INFO] Entering idle loop for application processor
[0][ERROR] Invalid Opcode (#UD) Exception: ExceptionStackFrame {
    instruction_pointer: 0x1ff56a,
    code_segment: 0x8,
    cpu_flags: 0x10206,
    stack_pointer: 0x1feea0,
    stack_segment: 0x10,
}
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received
The build has been terminated

@stlankes stlankes closed this as completed Jul 1, 2020
@jounathaen
Member Author

Further locations where spin_loop_hint has to be replaced by panic!:

  • src/arch/x86_64/kernel/processor.rs:795
  • src/mm/mod.rs:114
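
As a rough sketch of this kind of replacement: the helper below is hypothetical and does not reproduce the code at either location. It only shows the pattern of failing loudly through `panic!` where the code would otherwise spin forever on an unrecoverable condition.

```rust
/// Hypothetical helper: returns the value, or panics if there is none.
///
/// Before the change, the `None` arm would have looked like
/// `loop { spin_loop_hint(); }`, which hangs silently and never
/// reaches the panic handler.
pub fn take_or_panic<T>(value: Option<T>, what: &str) -> T {
    match value {
        Some(v) => v,
        // Panicking routes the failure through the panic handler,
        // which can then print a message and shut the VM down.
        None => panic!("unrecoverable: {}", what),
    }
}
```

With this pattern, a failing CI run ends with a panic message instead of waiting for the "no output received" timeout.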

@jounathaen jounathaen reopened this Jul 1, 2020
@jounathaen
Member Author

Closed by 7decb24
