X86 KVM sometimes fails with timer error #1195

Open
Harshil2107 opened this issue Jun 3, 2024 · 0 comments

When running a full-system (FS) workload, the X86 KVM CPU sometimes fails with a timer error on an Intel host CPU (tested on a 13th-gen Intel CPU). The error occurs most frequently when running configs/example/gpufs/mi200.py. I have tested the same script on a machine with an AMD CPU, where the error does not occur. The failure is intermittent, so the simulation does not always fail.

The error looks like the following:

./build/VEGA_X86/gem5.opt configs/example/gpufs/mi200.py --disk-image /home/harshilp/gem5-resources-worktrees/gem5-resources-repo/src/x86-ubuntu-gpu-ml/disk-image/x86-ubuntu-gpu-ml --kernel  /home/harshilp/gem5-resources-worktrees/gem5-resources-repo/src/x86-ubuntu-gpu-ml/vmlinux-gpu-ml --app ./pytorch_test.py
gem5 Simulator System.  https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 version DEVELOP-FOR-24.0
gem5 compiled May 23 2024 13:08:48
gem5 started May 24 2024 14:15:07
gem5 executing on COE-CS-sterling, pid 1157351
command line: ./build/VEGA_X86/gem5.opt configs/example/gpufs/mi200.py --disk-image /home/harshilp/gem5-resources-worktrees/gem5-resources-repo/src/x86-ubuntu-gpu-ml/disk-image/x86-ubuntu-gpu-ml --kernel /home/harshilp/gem5-resources-worktrees/gem5-resources-repo/src/x86-ubuntu-gpu-ml/vmlinux-gpu-ml --app ./pytorch_test.py

warn: Physical memory size specified is 8GB which is greater than 3GB.  Twice the number of memory controllers would be created.
Global frequency set at 1000000000000 ticks per second
warn: system.workload.acpi_description_table_pointer.rsdt adopting orphan SimObject param 'entries'
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
src/mem/dram_interface.cc:690: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (4096 Mbytes)
src/sim/kernel_workload.cc:46: info: kernel located at: /home/harshilp/gem5-resources-worktrees/gem5-resources-repo/src/x86-ubuntu-gpu-ml/vmlinux-gpu-ml
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
src/mem/dram_interface.cc:690: warn: DRAM device capacity (128 Mbytes) does not match the address range assigned (16384 Mbytes)
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
      0: system.pc.south_bridge.cmos.rtc: Real-time clock set to Sun Jan  1 00:00:00 2012
system.pc.com_1.device: Listening for connections on port 3456
src/base/statistics.hh:279: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
system.remote_gdb: Listening for connections on port 7000
src/dev/intel_8254_timer.cc:128: warn: Reading current count from inactive timer.
Running the simulation
src/cpu/kvm/base.cc:169: info: KVM: Coalesced MMIO disabled by config.
src/sim/simulate.cc:199: info: Entering event queue @ 0.  Starting simulation...
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x3a) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x48) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0xe1) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x12) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x11) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4b564d01) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4b564d00) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000000) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000001) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000020) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000021) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000022) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000023) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000100) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000101) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000102) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000103) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000104) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000105) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000003) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000002) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000010) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000080) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x400000b0) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000073) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000106) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000107) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000108) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x40000118) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x400000ff) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x400000f1) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x400000f2) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x400000f3) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x400000f4) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x400000f5) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4b564d02) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4b564d03) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4b564d04) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4b564d06) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4b564d07) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x3b) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x6e0) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x10a) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x345) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x1a0) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4d0) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x9e) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x34) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0xce) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x140) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x1fc) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x8b) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x480) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x48d) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x48e) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x48f) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x490) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x485) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x486) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x488) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x48a) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x48b) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x48c) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x491) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0xc0010015) unsupported by gem5. Skipping.
src/arch/x86/kvm/x86_cpu.cc:1688: warn: kvm-x86: MSR (0x4b564d05) unsupported by gem5. Skipping.
src/dev/x86/pc.cc:117: warn: Don't know what interrupt to clear for console.
src/arch/x86/interrupts.cc:530: hack: Assuming logical destinations are 1 << id.
src/dev/intel_8254_timer.cc:215: panic: PIT mode 0x4 is not implemented: 
Memory Usage: 29050468 KBytes
Program aborted at tick 2383316357000
--- BEGIN LIBC BACKTRACE ---
./build/VEGA_X86/gem5.opt(_ZN4gem515print_backtraceEv+0x30)[0x5fbc06fc1190]
./build/VEGA_X86/gem5.opt(_ZN4gem512abortHandlerEi+0x4c)[0x5fbc06fe5adc]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x71035ba42520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x71035ba969fc]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x71035ba42476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x71035ba287f3]
./build/VEGA_X86/gem5.opt(+0xdcf1b5)[0x5fbc064a11b5]
./build/VEGA_X86/gem5.opt(_ZN4gem514Intel8254Timer12writeControlENS_16bitfield_backend17BitUnionOperatorsINS0_32BitfieldUnderlyingClassesCtrlRegEEE+0x83c)[0x5fbc07e6f4cc]
./build/VEGA_X86/gem5.opt(_ZN4gem56X86ISA5I82545writeEPNS_6PacketE+0x7a)[0x5fbc07ec908a]
./build/VEGA_X86/gem5.opt(_ZThn64_N4gem57PioPortINS_9PioDeviceEE10recvAtomicEPNS_6PacketE+0x76)[0x5fbc07e50ea6]
./build/VEGA_X86/gem5.opt(_ZN4gem515NoncoherentXBar18recvAtomicBackdoorEPNS_6PacketEsPPNS_11MemBackdoorE+0x39c)[0x5fbc06a101cc]
./build/VEGA_X86/gem5.opt(_ZN4gem54ruby8RubyPort15MemResponsePort10recvAtomicEPNS_6PacketE+0x26d)[0x5fbc06bca8fd]
./build/VEGA_X86/gem5.opt(_ZN4gem510BaseKvmCPU10KVMCpuPort8submitIOEPNS_6PacketE+0xeb)[0x5fbc07bf6dfb]
./build/VEGA_X86/gem5.opt(_ZN4gem59X86KvmCPU15handleKvmExitIOEv+0x4af)[0x5fbc0798f2af]
./build/VEGA_X86/gem5.opt(_ZN4gem510BaseKvmCPU13handleKvmExitEv+0x11b)[0x5fbc07bf792b]
./build/VEGA_X86/gem5.opt(_ZN4gem510BaseKvmCPU4tickEv+0x99)[0x5fbc07bf62d9]
./build/VEGA_X86/gem5.opt(_ZN4gem510EventQueue10serviceOneEv+0xc2)[0x5fbc06fd59f2]
./build/VEGA_X86/gem5.opt(_ZN4gem59doSimLoopEPNS_10EventQueueE+0x68)[0x5fbc06fff5b8]
./build/VEGA_X86/gem5.opt(_ZN4gem58simulateEm+0x283)[0x5fbc06fffbb3]
./build/VEGA_X86/gem5.opt(+0x121b5e0)[0x5fbc068ed5e0]
./build/VEGA_X86/gem5.opt(+0xdb8334)[0x5fbc0648a334]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x128023)[0x71035c928023]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(_PyObject_Call+0x5c)[0x71035c8e1fec]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4b16)[0x71035c876776]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c23af)[0x71035c9c23af]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x9d68)[0x71035c87b9c8]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c23af)[0x71035c9c23af]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x9d68)[0x71035c87b9c8]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c23af)[0x71035c9c23af]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x69de)[0x71035c87863e]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x1c23af)[0x71035c9c23af]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(PyEval_EvalCode+0xbe)[0x71035c9bd3de]
--- END LIBC BACKTRACE ---
For more info on how to address this issue, please visit https://www.gem5.org/documentation/general_docs/common-errors/ 

Aborted (core dumped)
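
The backtrace shows the panic being raised from Intel8254Timer::writeControl, that is, while the guest writes to the PIT control register. As a reading aid, here is a minimal sketch of how an 8254 control word is laid out and where the mode field sits. The function name and the example byte values are illustrative assumptions; the log does not show which control byte the guest actually wrote, and this is not gem5's code.

```python
def decode_pit_control(byte):
    """Split an 8254 control word into its fields.

    Bit layout (8254 datasheet):
      bits 7-6: counter select (3 is the read-back command)
      bits 5-4: access mode (0 = latch, 1 = LSB, 2 = MSB, 3 = LSB then MSB)
      bits 3-1: operating mode (0b110 and 0b111 alias modes 2 and 3)
      bit 0:    BCD counting when set
    """
    mode = (byte >> 1) & 0x7
    if mode >= 6:
        # The top mode bit is "don't care" for modes 2 and 3.
        mode -= 4
    return {
        "counter": (byte >> 6) & 0x3,
        "access": (byte >> 4) & 0x3,
        "mode": mode,
        "bcd": bool(byte & 0x1),
    }


# Illustrative values only: 0x38 selects counter 0, LSB-then-MSB access, mode 4.
example = decode_pit_control(0x38)
print(example)
```

A control byte such as 0x34 decodes to mode 2 (rate generator) with the same function, which may help when comparing what different guest code paths program into the timer.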

Affects version
Develop

To Reproduce
Steps to reproduce the behavior, starting from a clean repository:

  1. Compile gem5 for VEGA_X86 with scons build/VEGA_X86/gem5.opt -j $(nproc)
  2. Get the disk image and kernel from gem5 resources
    - Kernel link: https://storage.googleapis.com/dist.gem5.org/dist/v24-0/gpu-fs/kernel/vmlinux-gpu-ml.gz
    - Disk image link: https://storage.googleapis.com/dist.gem5.org/dist/v24-0/gpu-fs/diskimage/x86-ubuntu-gpu-ml.gz
  3. Create the run script passed to --app (pytorch_test.py in the command below), as shown in the README for the disk image at gem5-resources/src/x86-ubuntu-gpu-ml
  4. Run mi200.py with the following command. The failure is intermittent, so several runs may be needed:
./build/VEGA_X86/gem5.opt configs/example/gpufs/mi200.py --disk-image [disk image path] --kernel [Kernel Path] --app ./pytorch_test.py
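
Because the failure does not occur on every run, it can help to repeat the command until the panic shows up. The following is a minimal sketch of such a loop; the helper names (has_pit_panic, gem5_runner, run_until_panic) and the retry limit are assumptions made for illustration, and the marker string is taken from the panic line in the log above.

```python
import subprocess

# Substring of the panic line shown in the log above.
PANIC_MARKER = "panic: PIT mode"


def has_pit_panic(log_text):
    """Return True if the captured gem5 output contains the PIT panic."""
    return PANIC_MARKER in log_text


def gem5_runner(cmd):
    """Build a callable that runs cmd once and returns its combined output."""
    def run_once():
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # gem5 warnings and panics go to stderr
            text=True,
            errors="replace",
            check=False,  # a panic exits non-zero; that is expected here
        )
        return proc.stdout
    return run_once


def run_until_panic(run_once, max_runs=10):
    """Call run_once up to max_runs times.

    Returns the 1-based attempt number on which the panic appeared,
    or None if it never appeared.
    """
    for attempt in range(1, max_runs + 1):
        if has_pit_panic(run_once()):
            return attempt
    return None
```

To use it, pass the command from step 4 as a list of arguments to gem5_runner and hand the result to run_until_panic. Each attempt is a full simulation run, so a small max_runs is advisable.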

The root cause is that the guest requests PIT mode 4 (software-triggered strobe), which gem5's Intel 8254 timer model does not implement, so the best solution would be to implement this mode. More detail can be found here: https://wiki.osdev.org/Programmable_Interval_Timer#Mode_4_.E2.80.93_Software_Triggered_Strobe
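
To make the requested behaviour concrete, here is a simplified behavioural sketch of mode 4 as described for the 8254: OUT idles high, writing a count arms the counter, and OUT strobes low for exactly one clock when the count expires, with no automatic retrigger. The class and method names are hypothetical, GATE, BCD counting and byte-wise access are deliberately left out, and this is not the gem5 implementation.

```python
class Mode4Counter:
    """Toy model of one 8254 counter in mode 4 (software-triggered strobe)."""

    def __init__(self):
        self.out = True        # OUT is high after the control word is written
        self.pending = None    # count written but not yet loaded
        self.count = None      # current counting element value
        self.armed = False     # True until the strobe for this count fires

    def write_count(self, n):
        """Software trigger: a new count is loaded on the next clock pulse."""
        n &= 0xFFFF
        self.pending = n if n != 0 else 0x10000  # a count of 0 means 65536

    def clock(self):
        """Advance one CLK pulse and return the OUT level afterwards."""
        self.out = True  # any previous strobe lasts exactly one clock
        if self.pending is not None:
            self.count = self.pending
            self.pending = None
            self.armed = True
        elif self.count is not None:
            self.count = (self.count - 1) & 0xFFFF
            if self.count == 0 and self.armed:
                self.out = False   # one-clock low strobe on expiry
                self.armed = False # no retrigger until a new count is written
        return self.out


timer = Mode4Counter()
timer.write_count(3)
levels = [timer.clock() for _ in range(6)]
print(levels)  # [True, True, True, False, True, True]
```

In this sketch the strobe appears on the fourth clock after a count of 3 is written (one clock to load, three to count down), and the counter then keeps wrapping without strobing again. An actual gem5 implementation would need to map this onto the timer model's event scheduling rather than per-clock stepping.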
