The server is always down while installing QAT driver #104

Open
onioner opened this issue Mar 13, 2019 · 3 comments

onioner commented Mar 13, 2019

Hi Team,

I'm not sure whether this is the right place to ask, but this problem has been bothering me for a long time.
While I am installing the QAT driver, the server always goes down after a certain step. I have to ask a field engineer to restart the server. When I then reinstall the QAT driver, the problem does not happen again.

make[3]: Leaving directory `/usr/src/kernels/2.6.32-642.el6.x86_64'
make[2]: Leaving directory `/usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/qat'
Creating startup and kill scripts
Copying libqat_s.so to /usr/local/QAT_driver/lib
Copying libusdm_drv_s.so to /usr/local/QAT_driver/lib
Copying usdm module to system drivers
Creating udev rules
Creating module.dep file for QAT released kernel object
This will take a few moments
Starting QAT service
Stopping all devices.
Failed to stop device

Connection closed by foreign host.

Disconnected from remote host at 12:31:37.

OS: CentOS 6.8
kernel: 2.6.32-642.el6.x86_64
driver version: qat1.7.l.4.4.0-00023

Thanks in advance!

@Yogaraj-Alamenda
Contributor

Hi @onioner

I suspect there is a problem disabling the kernel QAT driver that is included with your CentOS distribution.
Could you please try blacklisting the in-kernel QAT driver modules using the commands below, then reboot and try the driver installation again?

echo "blacklist intel_qat" > /etc/modprobe.d/blacklist-intel_qat.conf
echo "blacklist qat_dh895xcc" > /etc/modprobe.d/blacklist-qat_dh895xcc.conf 
echo "blacklist qat_c6xx" > /etc/modprobe.d/blacklist-qat_c6xx.conf

This prevents the intel_qat, qat_dh895xcc and qat_c6xx driver modules from being loaded by default during system boot.
If you still see the issue, could you please share InstallerLog.txt along with the dmesg log for further investigation?
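
After the reboot, one way to confirm that the blacklist entries took effect is sketched below. This is only an illustration: `is_blacklisted` is a hypothetical helper name, and the script assumes the entries live in `*.conf` files under /etc/modprobe.d and that /proc/modules is readable.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if module $1 has a "blacklist" line
# in any *.conf file under directory $2 (default: /etc/modprobe.d).
is_blacklisted() {
    mod=$1
    dir=${2:-/etc/modprobe.d}
    grep -qsE "^[[:space:]]*blacklist[[:space:]]+${mod}[[:space:]]*\$" "$dir"/*.conf
}

for m in intel_qat qat_dh895xcc qat_c6xx; do
    if is_blacklisted "$m"; then
        echo "$m: blacklisted"
    else
        echo "$m: NOT blacklisted"
    fi
    # /proc/modules lists the currently loaded modules, one per line.
    if grep -qs "^${m} " /proc/modules; then
        echo "$m: still loaded (reboot or unload it before installing)"
    fi
done
```

If a module is reported as blacklisted but still loaded, the reboot probably has not happened yet, or something loaded the module explicitly by name, which a blacklist entry does not prevent.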

onioner commented Mar 15, 2019

Hi @Yogaraj-Alamenda

Thanks very much. You are right, the problem didn't happen again.

make[3]: Leaving directory `/usr/src/kernels/2.6.32-642.el6.x86_64'
make[2]: Leaving directory `/usr/local/src/ssl/qat1.7.l.4.4.0-00023/quickassist/qat'
Creating startup and kill scripts
Copying libqat_s.so to /usr/local/QAT_driver/lib
Copying libusdm_drv_s.so to /usr/local/QAT_driver/lib
Copying usdm module to system drivers
Creating udev rules
Creating module.dep file for QAT released kernel object
This will take a few moments
Starting QAT service
Can not open /dev/qat_adf_ctl
Stopping all devices.
Restarting all devices.
Processing /etc/dh895xcc_dev0.conf
Checking status of all devices.
There is 1 QAT acceleration device(s) in the system:
 qat_dev0 - type: dh895xcc,  inst_id: 0,  node_id: 1,  bsf: 0000:86:00.0,  #accel: 6 #engines: 12 state: up
make[1]: Nothing to be done for `install-data-am'.
make[1]: Leaving directory `/usr/local/src/ssl/qat1.7.l.4.4.0-00023'

I have another question: lspci never shows any information about the QAT device unless I reinstall the OS. Can you tell me the probable cause?

[root@localhost ~]# lspci |grep QAT
86:00.0 Co-processor: Intel Corporation DH895XCC Series QAT

Thanks again !

@Yogaraj-Alamenda
Contributor

Hi @onioner
It is strange that lspci only works after reinstalling the OS. I need a little more information to understand this better:

  1. What does the output show when lspci |grep QAT doesn't list any device?
    If it shows "command not found", you may need to install pciutils.
  2. Does plain lspci (without |grep QAT) show any output?
    Also check whether any rules have been added in the directory "/etc/udev/rules.d" that remove that PCI device.
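
The two checks above can be run together with a short script along these lines. It is a rough sketch, not an official diagnostic: `find_qat` is a hypothetical helper name, and searching the udev rules for the word "remove" is only a heuristic for spotting a rule that detaches the device.

```shell
#!/bin/sh
# Hypothetical helper: filters lspci-style text on stdin down to QAT lines.
find_qat() {
    grep -i 'QAT'
}

if ! command -v lspci >/dev/null 2>&1; then
    echo "lspci not found: install the pciutils package"
elif ! lspci | find_qat; then
    echo "lspci ran, but no QAT device is listed"
fi

# Heuristic only: list udev rules that mention "remove".
rules_dir=/etc/udev/rules.d
grep -rns 'remove' "$rules_dir" || echo "no rules mentioning 'remove' under $rules_dir"
```

If lspci itself prints nothing at all, the problem is likely below the driver level (for example, how the PCI bus is being enumerated), and the full lspci output and dmesg log would be the next things to share.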
