softraid and some update
huataihuang committed Oct 30, 2023
1 parent 212aa59 commit a3d35d9
Showing 17 changed files with 178 additions and 0 deletions.
43 changes: 43 additions & 0 deletions source/android/build/build_lineageos_20_pixel_4.rst
@@ -24,6 +24,49 @@

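- Configure ``git``:

.. literalinclude:: build_lineageos_20_pixel_4/git
:caption: Configure git

- Initialize the Android repository and fetch the source code:

.. literalinclude:: build_lineageos_20_pixel_4/repo_sync
:caption: Initialize the Android repository and fetch the source code

At this point I ran into the following notice:

.. literalinclude:: build_lineageos_20_pixel_4/repo_sync_err
:caption: Notice that a newer version of ``repo`` is available

Following the notice, upgrade ``repo`` by copying the newer launcher::

cp /home/admin/android/lineage/.repo/repo/repo /home/admin/bin/repo

Then re-initialize the repository before syncing again::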

repo init -u https://github.com/LineageOS/android.git -b lineage-20.0
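
The launcher upgrade above can be made repeatable. The following is a minimal sketch, assuming a hypothetical helper named ``update_repo_launcher`` (it is not part of ``repo``); it copies the launcher only when the bundled copy differs from the installed one.

```shell
# Hypothetical helper (not part of repo itself): refresh the launcher only
# when the copy bundled in the checkout differs from the installed one.
update_repo_launcher() {
    src="$1"   # e.g. /home/admin/android/lineage/.repo/repo/repo
    dst="$2"   # e.g. /home/admin/bin/repo
    if [ ! -f "$src" ]; then
        echo "source launcher not found: $src" >&2
        return 1
    fi
    if cmp -s "$src" "$dst" 2>/dev/null; then
        echo "already up to date"
    else
        # Copy the newer launcher and keep it executable.
        cp "$src" "$dst" && chmod a+x "$dst" && echo "updated"
    fi
}
```

With the paths from the notice, the call would be ``update_repo_launcher /home/admin/android/lineage/.repo/repo/repo /home/admin/bin/repo``.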

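.. note::

Access to GitHub repositories may be disrupted by the GFW, so a :ref:`git_proxy` is needed.

.. note::

The default ``repo sync`` arguments, as set by the LineageOS manifests, are ``-j 4 -c``:

- ``-j 4`` runs 4 sync threads (connections) in parallel
- ``-c`` makes ``repo`` sync only the current branch rather than every branch of each repository on GitHub

LineageOS recommends the default configuration. However, because my connection through the proxy was very slow, I found that moderately raising the parallelism speeds up the sync, for example ``repo sync -j 20``.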
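
The parallelism tweak from the note can be tried as a dry run first. This is only a sketch: the ``JOBS`` and ``SYNC_CMD`` variable names are illustrative, and 20 is simply the value quoted in the note, not a recommendation.

```shell
# Sketch only: JOBS=20 mirrors the value quoted in the note; tune it for your own link.
JOBS="${JOBS:-20}"
SYNC_CMD="repo sync -c -j ${JOBS}"

# Print the command instead of running it, so the job count can be reviewed first.
echo "Would run: ${SYNC_CMD}"

# To actually sync, run the command from the repo client directory
# (~/android/lineage in this walkthrough):
# ${SYNC_CMD}
```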

- Prepare the device-specific code:

.. note::

The source is built here for the :ref:`pixel_4`, whose device codename is ``flame``.

.. literalinclude:: build_lineageos_20_pixel_4/breakfast
:caption: Sync the device-specific code

References
==========

3 changes: 3 additions & 0 deletions source/android/build/build_lineageos_20_pixel_4/breakfast
@@ -0,0 +1,3 @@
source build/envsetup.sh
# Sync the device code for the Pixel 4 (codename flame)
breakfast flame
3 changes: 3 additions & 0 deletions source/android/build/build_lineageos_20_pixel_4/git
@@ -0,0 +1,3 @@
# Configure git with your own GitHub account details so that the code repositories can be synced from GitHub
git config --global user.email user@localhost
git config --global user.name user
9 changes: 9 additions & 0 deletions source/android/build/build_lineageos_20_pixel_4/repo_sync
@@ -0,0 +1,9 @@
# Create the source directory
mkdir -p ~/android/lineage/
cd ~/android/lineage/

# Initialize the repository (choose the branch that matches your device):
repo init -u https://github.com/LineageOS/android.git -b lineage-20.0 --git-lfs

# Sync the source code
repo sync
12 changes: 12 additions & 0 deletions source/android/build/build_lineageos_20_pixel_4/repo_sync_err
@@ -0,0 +1,12 @@
Downloading Repo source from https://gerrit.googlesource.com/git-repo
repo: Updating release signing keys to keyset ver 2.3
/home/admin/android/lineage/.repo/repo/main.py:569: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn("\n... A new version of repo (%s) is available.", exp_str)

... A new version of repo (2.37) is available.
/home/admin/android/lineage/.repo/repo/main.py:571: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn(
... You should upgrade soon:
cp /home/admin/android/lineage/.repo/repo/repo /home/admin/bin/repo

...
2 changes: 2 additions & 0 deletions source/linux/storage/software_raid/index.rst
@@ -11,6 +11,8 @@ Linux Software RAID
mdadm_partion_vs_disk.rst
mdadm.rst
mdadm_raid10.rst
raid-check.rst
md_sync_speed.rst
mdadm_remove_md.rst
speed_up_mdadm_rebuild_resyn.rst
../../../devops/ansible/ansible_config_raid.rst
60 changes: 60 additions & 0 deletions source/linux/storage/software_raid/md_sync_speed.rst
@@ -0,0 +1,60 @@
.. _md_sync_speed:

=====================
``mdadm`` sync speed
=====================

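.. warning::

Before changing a system default, make sure you fully understand what the parameter means and what it affects, and keep detailed records. This article is a summary of **a lesson I learned the hard way**.

While working through :ref:`mdadm_raid10` on a very high-spec server with 4TB :ref:`nvme` drives, the initial ``sync`` of the newly built ``RAID10`` array took a very long time. The cause is the default sync speed limit of ``200MB/s``: on storage this large, the first full sync can take days.

For example, in my :ref:`mdadm_raid10` setup, checking ``mdstat`` right after the ``RAID10`` array was built:

.. literalinclude:: mdadm_raid10/mdstat
:caption: Check md status

The sync speed is ``207272K/sec``, roughly ``200MB/s``, with an estimated ``1803min`` (about 30 hours) to completion:

.. literalinclude:: mdadm_raid10/mdstat_output
:caption: The md status shows the RAID array is still being built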
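
The arithmetic behind those two figures can be checked with shell integer math. This is a minimal sketch: the helper names are illustrative, and the remaining size of ``22422684960`` KiB is a hypothetical value back-computed from the quoted speed and estimate, not a number taken from the ``mdstat`` output.

```shell
# Convert a speed reported by /proc/mdstat (K/sec) into whole MiB/s.
kib_to_mib() {
    echo $(( $1 / 1024 ))
}

# Estimate the remaining minutes from the remaining size (KiB) and the speed (KiB/s).
sync_eta_minutes() {
    echo $(( $1 / $2 / 60 ))
}

kib_to_mib 207272                     # prints 202
# Hypothetical remaining size, back-computed from the figures quoted above.
sync_eta_minutes 22422684960 207272   # prints 1803
```

Integer division rounds down, so the results are approximations suited to a quick sanity check.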

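For the groundwork underlying :ref:`deploy_lvm_mdadm_raid10` this has no significant impact (the array can still be read and written while it syncs), but it is still inconvenient, mainly because I wanted to finish the deployment quickly and verify :ref:`gluster` performance.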

Checking the sync speed
=======================

The main parameter limiting the sync is the ``md`` device's ``sync_speed_max``, which can be inspected through ``/sys/block/md10/md/sync_speed_max``:

- Check the ``md`` device's sync speed limit:

.. literalinclude:: md_sync_speed/md_sync_speed_max
:caption: Check the ``sync_speed_max`` limit of ``md`` device ``md10``

The default is ``200000`` (in K/sec), i.e. about ``200MB/s``:

.. literalinclude:: md_sync_speed/md_sync_speed_max_output
:caption: The default ``sync_speed_max`` limit of an ``md`` device is ``200MB/s``

- There is also a default ``sync_speed_min``:

.. literalinclude:: md_sync_speed/md_sync_speed_min
:caption: Check the ``sync_speed_min`` minimum sync speed (lower bound) of ``md`` device ``md10``

The default is ``1000`` (in K/sec), i.e. about ``1MB/s``:

.. literalinclude:: md_sync_speed/md_sync_speed_min_output
:caption: The default ``sync_speed_min`` of an ``md`` device is ``1MB/s``
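
Both limits can be read for every array in one pass. The following is a minimal sketch, assuming a hypothetical helper named ``show_md_sync_limits``; the optional argument lets it point at a different sysfs root, which is only useful for testing.

```shell
# Hypothetical helper: print sync_speed_min and sync_speed_max for each md array.
show_md_sync_limits() {
    root="${1:-/sys/block}"
    for md in "$root"/md*/md; do
        [ -d "$md" ] || continue              # skip when the glob matched nothing
        dev="$(basename "$(dirname "$md")")"  # e.g. md10
        printf '%s min=%s max=%s\n' "$dev" \
            "$(cat "$md/sync_speed_min")" "$(cat "$md/sync_speed_max")"
    done
}
```

A value followed by ``(system)`` means the array has no limit of its own and follows the system-wide setting.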

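Adjusting the sync speed
========================

- The sync speed can be adjusted online:

.. literalinclude:: md_sync_speed/adjust_md_sync_speed
:caption: Adjust the sync rate of ``md`` device ``md10`` online

.. warning::

The default ``md`` sync speed of ``200MB/s`` exists for good reason; I fell into a pit here (see below).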
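
The ``sysctl`` settings used in this article are system-wide. The limit can also be set for a single array through its ``sync_speed_max`` sysfs attribute, and writing the word ``system`` returns the array to the system-wide value. The following is a minimal sketch, assuming a hypothetical helper named ``set_md_sync_speed_max``; the optional third argument (an alternative sysfs root) exists only for testing.

```shell
# Hypothetical helper: set the per-array sync speed ceiling (K/sec), or "system" to reset.
set_md_sync_speed_max() {
    dev="$1"                      # array name, e.g. md10
    value="$2"                    # K/sec, or the word "system"
    root="${3:-/sys/block}"
    target="$root/$dev/md/sync_speed_max"
    if [ ! -w "$target" ]; then
        echo "cannot write $target (need root, or no such array)" >&2
        return 1
    fi
    echo "$value" > "$target"
}
```

For example, ``set_md_sync_speed_max md10 2000000`` raises the ceiling for ``md10`` only, and ``set_md_sync_speed_max md10 system`` reverts it. The warning above applies equally to per-array changes.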
@@ -0,0 +1,3 @@
# While the RAID was initializing, I relaxed the sync rate limits by 10x
sysctl -w dev.raid.speed_limit_max=2000000
sysctl -w dev.raid.speed_limit_min=10000
@@ -0,0 +1 @@
cat /sys/block/md10/md/sync_speed_max
@@ -0,0 +1 @@
200000 (system)
@@ -0,0 +1 @@
cat /sys/block/md10/md/sync_speed_min
@@ -0,0 +1 @@
1000 (system)
21 changes: 21 additions & 0 deletions source/linux/storage/software_raid/raid-check.rst
@@ -0,0 +1,21 @@
.. _raid-check:

==================
raid-check
==================

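After finishing :ref:`mdadm_raid10`, I noticed during operations that the system ships with a weekly ``raid-check`` by default, configured in :ref:`cron` as ``/etc/cron.d/raid-check``:

.. literalinclude:: raid-check/cron
:caption: By default ``raid-check`` runs at 1am every Sunday

In real production environments, modern storage is so large (a single :ref:`nvme` drive reaches 4T, and a :ref:`mdadm_raid10` array built from them reaches tens of T) that this ``mdadm`` check takes a very long time:

- The default sync speed limit is ``200MB/sec`` (it can be changed, but I hit a pitfall when adjusting :ref:`md_sync_speed`)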
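
To see whether a check is currently running on an array, the md ``sync_action`` sysfs attribute can be read; it reports values such as ``idle``, ``check``, ``resync``, ``recover`` or ``repair``. The following is a minimal sketch, assuming a hypothetical helper named ``md_sync_action`` and the array name ``md10`` used elsewhere in these notes; the optional second argument (an alternative sysfs root) exists only for testing.

```shell
# Hypothetical helper: report what the array is currently doing.
md_sync_action() {
    dev="$1"                      # array name, e.g. md10
    root="${2:-/sys/block}"
    file="$root/$dev/md/sync_action"
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 1
    fi
    cat "$file"
}
```

Running ``md_sync_action md10`` while the weekly job is active would be expected to print ``check``, assuming the job works through this attribute.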

References
==========

- `Weekly RAID check affecting my system - any way to mitigate? <https://serverfault.com/questions/1100760/weekly-raid-check-affecting-my-system-any-way-to-mitigate>`_
- `mdadm RAID5 RAID6 how to check consistency on running array <https://serverfault.com/questions/1064838/mdadm-raid5-raid6-how-to-check-consistency-on-running-array>`_
- `Check RAID software: my status <https://serverfault.com/questions/721364/check-raid-software-my-status>`_
2 changes: 2 additions & 0 deletions source/linux/storage/software_raid/raid-check/cron
@@ -0,0 +1,2 @@
# Run system wide raid-check once a week on Sunday at 1am by default
0 1 * * Sun root /usr/sbin/raid-check
1 change: 1 addition & 0 deletions source/machine_learning/fauxpilot/index.rst
@@ -7,6 +7,7 @@ FauxPilot
.. toctree::
:maxdepth: 1

opensource_ai_coding_assistant.rst
intro_fauxpilot.rst
codegen.rst
nvidia_triton.rst
6 changes: 6 additions & 0 deletions source/machine_learning/fauxpilot/intro_fauxpilot.rst
@@ -6,6 +6,12 @@ Introduction to FauxPilot

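`FauxPilot(GitHub) <https://github.com/fauxpilot/fauxpilot>`_ is an open-source, locally hosted AI coding assistant server, intended as a locally running alternative to GitHub :ref:`copilot`. It uses the SalesForce :ref:`codegen` models running inside an :ref:`nvidia_triton` server combined with :ref:`fastertransformer`.

.. note::

The key is to understand how an AI coding assistant works and to reproduce its architecture. There are quite a few comparable open-source tools worth studying:

- `Tabby(GitHub) <https://github.com/TabbyML/tabby>`_

Requirements
============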

@@ -0,0 +1,9 @@
.. _opensource_ai_coding_assistant:

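================================
Open-source AI coding assistants
================================

With the rapid development of :ref:`llm`, open-source implementations similar to :ref:`gpt` and ``GitHub Copilot`` keep emerging. Here I try to catalogue the relevant open-source projects and pick the more complete and well-designed architectures to try out in practice.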

- :ref:`intro_fauxpilot`
