complete system freeze/bork #1060

Closed
empijei opened this issue Aug 29, 2016 · 9 comments

@empijei

empijei commented Aug 29, 2016

Output of awesome --version:
awesome v3.5.9 (Mighty Ravendark)
• Build: Mar 12 2016 01:11:40 for x86_64 by gcc version 5.3.0 (builduser@rw)
• Compiled against Lua 5.3.2 (running with Lua 5.3)
• D-Bus support: ✔

How to reproduce the issue:
Install Arch Linux on a ThinkPad Carbon X1, install awesome, and wait. (It can take some hours before the freeze.)

Actual result:
The freeze is random and does not seem to be related to any event.
When the computer freezes nothing works. The computer stops responding to pings, there is no "Ctrl+Alt+F2", no "Ctrl+Alt+Whatever".
The only way to recover is a hard reboot holding down the power button.
Any file that was being written during the crash (including logs) contains nothing but zero bytes.

Expected result:
Everything to keep working

Notes
The laptop works perfectly with Arch and any WM that is not awesome (I tried LXDE, Xfce and GNOME).
The laptop worked fine at least until the last days of May.
I have tried downgrading all of awesome's dependencies to a date when everything worked fine, with no success.
I have tried with both the default awesome rc.lua and a custom one.
Please, I really love Arch+Awesome and I need both of them for my workflow. I have the same setup on all my PCs, which are 3 laptops and a desktop computer, and it only freezes on the ThinkPad. (I sync all the packages and versions across all the PCs.)

I am available to provide help, intel or anything you need.

@Elv13
Member

Elv13 commented Aug 29, 2016

Hello.

First of all, Awesome can't freeze your computer that way. However, the Intel graphics driver can (or NVIDIA, if you have Optimus enabled). If you run an SSH server on your X1, you might be able to log in after it is frozen (or maybe not, if it's a full kernel panic). If you can log in, get the dmesg output and contact the right dev team.

Now, you can try to mitigate the issue. First, try adding the --no-argb Awesome command-line option and see if it helps. If it doesn't and you are using the Intel graphics driver, you can try to turn off some of its features:

acpi_backlight=vendor
i915.allow_pc8=0
i915.enable_psr=0
i915.powersave=0
i915.enable_rc6=0
i915.enable_fbc=0
i915.lvds_downclock=0
i915.semaphores=0

Add these parameters one by one to the kernel line in your grub.cfg until you figure out which feature is causing the lockup. Once done, get in touch with the Intel devs. You might want to try the latest kernel just in case. I had your issue with 4.4.x, but it is now fixed (for me). Given that awesome sometimes pushes things a little hard by abusing ARGB surfaces, it would not be the first time this kind of issue has arisen.
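For illustration, the one-by-one approach could be sketched roughly like this in Python. This is only a sketch: `BASE_CMDLINE` and the helper name `cumulative_cmdlines` are made up for the example, the base kernel line is a placeholder, and whether each parameter exists depends on the kernel version in use.

```python
# Parameters copied from the list above; availability depends on the kernel version.
PARAMS = [
    "acpi_backlight=vendor",
    "i915.allow_pc8=0",
    "i915.enable_psr=0",
    "i915.powersave=0",
    "i915.enable_rc6=0",
    "i915.enable_fbc=0",
    "i915.lvds_downclock=0",
    "i915.semaphores=0",
]

# Hypothetical base kernel line (an assumption); replace it with your own.
BASE_CMDLINE = "root=/dev/sda2 rw quiet"


def cumulative_cmdlines(base, params):
    """Return one kernel command line per step, adding one more parameter each time."""
    lines = []
    current = base
    for param in params:
        current = f"{current} {param}"
        lines.append(current)
    return lines


if __name__ == "__main__":
    for step, line in enumerate(cumulative_cmdlines(BASE_CMDLINE, PARAMS), start=1):
        print(f"step {step}: {line}")
```

Each printed line is a candidate kernel line to boot with; the first step at which the freezes stop points at the parameter that was added in that step.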

@empijei
Author

empijei commented Aug 30, 2016

First of all, thank you for the reply.

I will try what you suggested, but I can't do the SSH thing because the system is completely unresponsive when the freeze happens; there is not even an ARP response from the computer.

I am currently on kernel 4.7.2-1-ARCH.

I'll try to find the cause using your pointers and will report it to the Intel devs if I can pin it on the driver.

@empijei
Author

empijei commented Aug 30, 2016

--no-argb did not help. I am now running with the optional kernel parameters you suggested.

@blueyed
Member

blueyed commented Aug 31, 2016

@empijei
Which one? Could you narrow it down to a single one?

@empijei
Author

empijei commented Aug 31, 2016

I am currently testing with all of them, since the bork can take tens of hours. If I don't get one in the next 30 hours of use, I'm going to start bisecting them.
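A rough sketch of what bisecting the flags could look like, in Python. It assumes exactly one flag is responsible for the stability, and `is_stable` is a hypothetical callback standing in for "boot with only these flags and use the machine long enough without a freeze"; neither the function names nor the simulated result come from an actual test run.

```python
def bisect_flags(flags, is_stable):
    """Find the single flag that keeps the system stable.

    Assumes exactly one flag in `flags` matters, and that
    `is_stable(subset)` reports whether a boot using only `subset` survived.
    """
    candidates = list(flags)
    if not candidates:
        raise ValueError("need at least one flag to bisect")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if is_stable(half):
            # The responsible flag is in the half that was just tested.
            candidates = half
        else:
            # Otherwise it must be in the untested remainder.
            candidates = candidates[len(half):]
    return candidates[0]


if __name__ == "__main__":
    flags = [
        "acpi_backlight=vendor", "i915.allow_pc8=0", "i915.enable_psr=0",
        "i915.powersave=0", "i915.enable_rc6=0", "i915.enable_fbc=0",
        "i915.lvds_downclock=0", "i915.semaphores=0",
    ]
    # Simulated test only: the responsible flag is hard-coded to exercise the search.
    simulated = lambda subset: "i915.enable_rc6=0" in subset
    print(bisect_flags(flags, simulated))
```

Under the single-flag assumption, eight flags need three test boots instead of eight, which matters when each round can take tens of hours.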

@osleg
Contributor

osleg commented Sep 1, 2016

Can you please give an update on this and also tell me what PC (laptop/desktop, brand, CPU model) you have?

@empijei
Author

empijei commented Sep 1, 2016

So, it has been 2 days since the last bork. The system looks stable and hasn't frozen since I added the following options to the systemd-boot entry:

acpi_backlight=vendor i915.allow_pc8=0 i915.enable_psr=0 i915.powersave=0 i915.enable_rc6=0 i915.enable_fbc=0 i915.lvds_downclock=0 i915.semaphores=0

My PC is a Lenovo ThinkPad Carbon X1. During the weekend I am going to remove some of the flags and see which one is the cause of the problem. (I use the ThinkPad at work and I don't want it to freeze while working.)

Before reporting this issue I already tried loading the intel microcode (intel-ucode.img) before the kernel, but without success.

An extract of /proc/cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 61
model name      : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
stepping        : 4
microcode       : 0x24
cpu MHz         : 3169.702
cache size      : 4096 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 20
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt dtherm ida arat pln pts
bugs            :
bogomips        : 5189.84
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

@empijei
Author

empijei commented Sep 4, 2016

Update: So, the system freezes if and only if this flag is not specified:
i915.enable_rc6=0
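
For anyone wanting to double-check that a flag like this really reached the running kernel, here is a minimal Python sketch. The helper names `has_param` and `read_cmdline` are made up for the example; `/proc/cmdline` is the standard Linux file holding the booted kernel's command line, and the string in the demo is only a sample.

```python
def has_param(cmdline, name, value):
    """Return True if `name=value` appears as a whole token in a kernel command line."""
    return f"{name}={value}" in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as handle:
        return handle.read().strip()


if __name__ == "__main__":
    # Sample string for demonstration; on a Linux system use read_cmdline() instead.
    sample = "root=/dev/sda2 rw acpi_backlight=vendor i915.enable_rc6=0"
    print(has_param(sample, "i915.enable_rc6", "0"))
```

Matching whole tokens rather than substrings means a mistyped module prefix or a different value is reported as missing instead of slipping through.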

@Elv13
Member

Elv13 commented Sep 4, 2016

Great, thanks! I am closing this, please report your issue to the Intel devs.
