build intel pcm
huataihuang committed Jul 16, 2023
1 parent eb1e792 commit d7149e3
Showing 15 changed files with 197 additions and 1 deletion.
2 changes: 2 additions & 0 deletions source/clang/index.rst
@@ -8,6 +8,8 @@ C Atlas
   :maxdepth: 1

   cppflags_ldflags.rst
   upgrade_cmake_on_centos7.rst
   upgrade_gcc_on_centos7.rst

.. only:: subproject and html

20 changes: 20 additions & 0 deletions source/clang/upgrade_cmake_on_centos7.rst
@@ -0,0 +1,20 @@
.. _upgrade_cmake_on_centos7:

===========================
Upgrading CMake on CentOS 7
===========================

While working through :ref:`build_pcm` I ran into a problem: the build requires ``cmake`` 3.5 or later, whereas the whole CentOS 7 series ships cmake 2.8.

Build
=====

- Prepare for the build (install the build dependencies):

.. literalinclude:: upgrade_cmake_on_centos7/prepare_build_cmake
   :caption: Preparing to build cmake

- Build cmake 3.26.4:

.. literalinclude:: upgrade_cmake_on_centos7/build_cmake
   :caption: Building cmake
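
After installing, it can be worth confirming that the ``cmake`` found on ``PATH`` really satisfies the 3.5 minimum mentioned above. The following is a minimal sketch, not part of the original notes: ``version_ge`` and ``cmake_version`` are hypothetical helper names, and the comparison assumes GNU ``sort`` with the ``-V`` (version sort) option is available.

```shell
#!/usr/bin/env bash

# Succeed when version $1 is greater than or equal to version $2.
# Assumption: GNU sort with -V (version sort) is installed.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: read the version from the first line of
# `cmake --version`, which looks like "cmake version 3.26.4".
# Prints an empty string when cmake is not installed.
cmake_version() {
    cmake --version 2>/dev/null | awk 'NR == 1 { print $3 }'
}
```

A possible use is ``version_ge "$(cmake_version)" 3.5 || echo "cmake is too old"``. When ``cmake`` is missing, the empty version string compares as older than any minimum, so the check fails safely.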
10 changes: 10 additions & 0 deletions source/clang/upgrade_cmake_on_centos7/build_cmake
@@ -0,0 +1,10 @@
version=3.26.4

wget https://github.com/Kitware/CMake/releases/download/v${version}/cmake-${version}.tar.gz

tar xfz cmake-${version}.tar.gz
cd cmake-${version}

./configure
make
sudo make install
1 change: 1 addition & 0 deletions source/clang/upgrade_cmake_on_centos7/prepare_build_cmake
@@ -0,0 +1 @@
sudo yum install openssl-devel
24 changes: 24 additions & 0 deletions source/clang/upgrade_gcc_on_centos7.rst
@@ -0,0 +1,24 @@
.. _upgrade_gcc_on_centos7:

=========================
Upgrading GCC on CentOS 7
=========================

:ref:`build_pcm` uses `simdjson (GitHub) <https://github.com/simdjson/simdjson>`_ , and ``simdjson`` requires a modern compiler (LLVM clang 6 or better, GNU GCC 7.4 or better, Xcode 11 or better). On CentOS 7 the default gcc 4.8.5 cannot complete :ref:`build_pcm` , so GCC has to be upgraded.

Pick the nearest mirror from the `gcc mirror sites <https://gcc.gnu.org/mirrors.html>`_ and download version 10.5.

- Prepare for the build:

.. literalinclude:: upgrade_gcc_on_centos7/prepare_build_gcc
   :caption: Preparing to build gcc (installing the build dependencies)

- Build and install gcc:

.. literalinclude:: upgrade_gcc_on_centos7/build_gcc
   :caption: Building gcc
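
Once the new compiler is installed, later builds still have to pick it up instead of the system gcc 4.8.5. The sketch below is an illustration rather than something taken from the original notes: ``gcc_toolchain_env`` is a hypothetical helper, ``/usr/local`` is assumed because the ``./configure`` call sets no ``--prefix``, and the ``lib64`` directory is assumed to hold the new ``libstdc++`` on a typical x86_64 build.

```shell
#!/usr/bin/env bash

# Print export statements that point a build at a GCC installed under
# the given prefix (default: /usr/local, an assumption).
# CC and CXX are read by CMake on the first configure run;
# LD_LIBRARY_PATH lets the resulting binaries find the newer libstdc++.
gcc_toolchain_env() {
    local prefix="${1:-/usr/local}"
    printf 'export CC=%s/bin/gcc\n' "$prefix"
    printf 'export CXX=%s/bin/g++\n' "$prefix"
    # The ${VAR:+...} part is printed literally and expanded later by eval.
    printf 'export LD_LIBRARY_PATH=%s/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}\n' "$prefix"
}
```

Running ``eval "$(gcc_toolchain_env)"`` in the shell before calling ``cmake ..`` applies the settings to that session only, which leaves the system compiler untouched for everything else.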

References
==========

- `build gcc from source on centos 7 <https://www.jwillikers.com/build-gcc-from-source-on-centos-7>`_
12 changes: 12 additions & 0 deletions source/clang/upgrade_gcc_on_centos7/build_gcc
@@ -0,0 +1,12 @@
version=10.5.0

wget http://ftp.tsukuba.wide.ad.jp/software/gcc/releases/gcc-${version}/gcc-${version}.tar.gz
tar xfz gcc-${version}.tar.gz
cd gcc-${version}

# A 64-bit OS usually has no 32-bit development libraries installed (and rarely needs them), so pass --disable-multilib
# Build C/C++ support only
# To choose the install directory, add an option such as --prefix=$HOME/.gcc/10.5.0
./configure --disable-multilib --enable-languages=c,c++
make
sudo make install
1 change: 1 addition & 0 deletions source/clang/upgrade_gcc_on_centos7/prepare_build_gcc
@@ -0,0 +1 @@
sudo yum -y install bzip2 wget gcc gcc-c++ gmp-devel mpfr-devel libmpc-devel make
52 changes: 52 additions & 0 deletions source/performance/intel_pcm/build_pcm.rst
@@ -0,0 +1,52 @@
.. _build_pcm:

==================
Building Intel PCM
==================

Build
=====

- Clone the PCM repository together with its submodules, then build:

.. literalinclude:: build_pcm/build_pcm
   :caption: Simple steps to build PCM

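Building on CentOS 7.2
======================

Production runs an ancient CentOS 7.2 (or compatible) environment, and building on such an old OS is a real chore.

- CentOS 7 ships CMake 2.8.12.2 by default, and the PCM build reports that it requires CMake 3.5 or later, so download the latest release from the `official cmake download page <https://cmake.org/download/>`_ and follow :ref:`upgrade_cmake_on_centos7`

- Intel PCM uses `simdjson (GitHub) <https://github.com/simdjson/simdjson>`_ , and ``simdjson`` requires a modern compiler (LLVM clang 6 or better, GNU GCC 7.4 or better, Xcode 11 or better), so :ref:`upgrade_gcc_on_centos7` is needed as well

- The build steps are then the same as above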

Packaging the files
===================

To make installation easier, list the files installed by ``make install`` , record them in ``files.txt`` , and then run the following command to pack them into ``pcm.tar.gz`` (see `Tar archiving that takes input from a list of files <https://stackoverflow.com/questions/8033857/tar-archiving-that-takes-input-from-a-list-of-files>`_ ; the archive also includes the ``/etc/systemd/system/pcm-exporter.service`` unit from :ref:`pcm-exporter` ):

.. literalinclude:: build_pcm/tar_pcm_tar
   :caption: Packing the pcm installation files from a file list
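
The text does not say how ``files.txt`` is produced. One option, offered here only as a sketch, is to start from the ``install_manifest.txt`` that CMake-generated install targets normally write into the build directory, and to append any extra paths such as the service unit. Whether this PCM build writes that manifest is an assumption, and ``make_file_list`` and ``pack_from_list`` are hypothetical helper names.

```shell
#!/usr/bin/env bash

# Write a file list made of the manifest entries plus any extra paths.
# Arguments: <manifest> <list-file> [extra-path ...]
make_file_list() {
    local manifest="$1" list="$2" extra
    shift 2
    {
        # awk 'NF' drops empty lines and terminates every entry with a
        # newline, even if the manifest lacks a trailing one.
        awk 'NF' "$manifest"
        for extra in "$@"; do
            printf '%s\n' "$extra"
        done
    } > "$list"
}

# Pack every path named in <list-file> into a gzip-compressed archive.
# Arguments: <list-file> <archive>
pack_from_list() {
    tar -czf "$2" -T "$1"
}
```

For example, ``make_file_list build/install_manifest.txt files.txt /etc/systemd/system/pcm-exporter.service`` followed by ``pack_from_list files.txt pcm.tar.gz`` . GNU tar stores absolute paths without the leading ``/`` and prints a notice about it, which is harmless.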

On the target server, the following commands are then enough to get the ``pcm-exporter`` service running quickly:

.. literalinclude:: build_pcm/deploy_pcm
   :caption: Quickly deploying the self-built pcm-exporter

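Troubleshooting
===============

One problem came up: starting ``pcm-sensor-server`` with the :ref:`systemd` configuration from :ref:`pcm-exporter` failed:

.. literalinclude:: build_pcm/systemd_start_pcm-server_fail
   :caption: Starting the self-built pcm-sensor-server through systemd fails

I traced the failure to the ``--real-time`` option. The cause is still unknown; removing the option restores normal startup.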
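
The workaround can be scripted. This is a minimal sketch assuming the unit file has an ``ExecStart=`` line like the one in the failure output; ``drop_real_time_flag`` is a hypothetical helper name and not part of PCM or systemd.

```shell
#!/usr/bin/env bash

# Remove the --real-time flag from the ExecStart line of a unit file.
# Argument: <unit-file>
drop_real_time_flag() {
    local tmp
    tmp="$(mktemp)"
    # Edit only ExecStart lines; write through a temp file and copy the
    # result back so the unit file keeps its owner and permissions.
    sed '/^ExecStart=/ s/[[:space:]]*--real-time//' "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}
```

After running it as root on ``/etc/systemd/system/pcm-exporter.service`` , a ``systemctl daemon-reload`` followed by ``systemctl restart pcm-exporter`` applies the change.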

References
==========

- `Intel Performance Counter Monitor (Intel PCM) (GitHub) <https://github.com/intel/pcm>`_
11 changes: 11 additions & 0 deletions source/performance/intel_pcm/build_pcm/build_pcm
@@ -0,0 +1,11 @@
git clone --recursive https://github.com/opcm/pcm.git
# Alternatively, run git submodule update --init --recursive after a plain clone
cd pcm

mkdir build
cd build
cmake ..
# The --parallel option speeds up the build
cmake --build . --parallel

# All built binaries end up in build/bin; make install then installs them into /usr/local/sbin
sudo make install
3 changes: 3 additions & 0 deletions source/performance/intel_pcm/build_pcm/deploy_pcm
@@ -0,0 +1,3 @@
# tar stores member names without the leading /, so extract relative to the root directory
tar xfz pcm.tar.gz -C /
systemctl daemon-reload
systemctl enable --now pcm-exporter
@@ -0,0 +1,8 @@
● pcm-exporter.service - pcm-exporter
Loaded: loaded (/etc/systemd/system/pcm-exporter.service; disabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Sun 2023-07-16 16:00:41 CST; 864ms ago
Process: 373807 ExecStart=/usr/local/sbin/pcm-sensor-server -p 9738 --real-time (code=exited, status=6)
Main PID: 373807 (code=exited, status=6)

Jul 16 16:00:41 sqaappxdn006002124157.sa127 systemd[1]: Unit pcm-exporter.service entered failed state.
Jul 16 16:00:41 sqaappxdn006002124157.sa127 systemd[1]: pcm-exporter.service failed.
3 changes: 3 additions & 0 deletions source/performance/intel_pcm/build_pcm/tar_pcm_tar
@@ -0,0 +1,3 @@
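# -T is short for --files-from: read the list of files to archive from a file
# -z compresses with gzip so that the result matches the pcm.tar.gz used for deployment
tar -czvf pcm.tar.gz -T files.txt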

1 change: 1 addition & 0 deletions source/performance/intel_pcm/index.rst
@@ -10,6 +10,7 @@ Intel® Performance Counter Monitor (Intel® PCM)
   intro_intel_pcm.rst
   pcm-exporter.rst
   pcm-grafana.rst
   build_pcm.rst

.. only:: subproject and html

5 changes: 4 additions & 1 deletion source/performance/intel_pcm/pcm-exporter.rst
@@ -44,7 +44,10 @@ Intel's PCM (Performance Counter Monitor) provides :ref:`prometheus_expo
   :caption: ``pcm-exporter`` service status
   :emphasize-lines: 11,16

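Note that the available functionality is limited by hardware support. For example, :ref:`xeon_e5-2600_v3` cannot support :ref:`intel_rdt` (that requires the next generation, ``v4`` ), nor does it support `Intel QuickPath Interconnect <https://www.intel.com/content/www/us/en/io/quickpath-technology/quickpath-technology-general.html>`_
Note that the available functionality is limited by hardware support. For example, :ref:`xeon_e5-2600_v3` cannot support :ref:`intel_rdt` (that requires the next generation, ``v4`` ), nor does it support `Intel QuickPath Interconnect <https://www.intel.com/content/www/us/en/io/quickpath-technology/quickpath-technology-general.html>`_ . On a ``Xeon Platinum 8163 CPU @ 2.50GHz`` (Skylake), however, the complete, normal output looks like this: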

.. literalinclude:: pcm-exporter/pcm-sensor-server_output_skylake
   :caption: ``pcm-exporter`` service output (Skylake processor)

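At this point, opening http://192.168.6.200:9738 (my server's address) in a browser shows the ``PCM Sensor Server`` introduction page, which provides configuration examples for :ref:`influxdb` and for :ref:`prometheus` combined with :ref:`grafana` .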

@@ -0,0 +1,45 @@
Scheduler changed to SCHED_RR and priority to 1

===== Processor information =====
Linux arch_perfmon flag : yes
Hybrid processor : no
IBRS and IBPB supported : yes
STIBP supported : yes
Spec arch caps supported : no
Max CPUID level : 22
CPU model number : 85
Number of physical cores: 48
Number of logical cores: 96
Number of online logical cores: 96
Threads (logical cores) per physical core: 2
Num sockets: 2
Physical cores per socket: 24
Last level cache slices per socket: 24
Core PMU (perfmon) version: 4
Number of core PMU generic (programmable) counters: 3
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2500000000 Hz
IBRS enabled in the kernel : no
STIBP enabled in the kernel : no
Package thermal spec power: 165 Watt; Package minimum power: 87 Watt; Package maximum power: 363 Watt;

INFO: Linux perf interface to program uncore PMUs is present
Socket 0: 2 memory controllers detected with total number of 6 channels. 3 UPI ports detected. 2 M2M (mesh to memory) blocks detected. 0 HBM M2M blocks detected. 0 EDC/HBM channels detected. 0 Home Agents detected. 3 M3UPI blocks detected.
Socket 1: 2 memory controllers detected with total number of 6 channels. 3 UPI ports detected. 2 M2M (mesh to memory) blocks detected. 0 HBM M2M blocks detected. 0 EDC/HBM channels detected. 0 Home Agents detected. 3 M3UPI blocks detected.
INFO: using Linux resctrl driver for RDT metrics (L3OCC, LMB, RMB) because resctrl driver is mounted.

Disabling NMI watchdog since it consumes one hw-PMU counter. To keep NMI watchdog set environment variable PCM_KEEP_NMI_WATCHDOG=1 (this reduces the core metrics set)
Closed perf event handles
Trying to use Linux perf events...
Successfully programmed on-core PMU using Linux perf
Socket 0
Max UPI link 0 speed: 23.3 GBytes/second (10.4 GT/second)
Max UPI link 1 speed: 23.3 GBytes/second (10.4 GT/second)
Max UPI link 2 speed: 23.3 GBytes/second (10.4 GT/second)
Socket 1
Max UPI link 0 speed: 23.3 GBytes/second (10.4 GT/second)
Max UPI link 1 speed: 23.3 GBytes/second (10.4 GT/second)
Max UPI link 2 speed: 23.3 GBytes/second (10.4 GT/second)
Starting plain HTTP server on http://localhost:9738/
