This repository has been archived by the owner on May 25, 2022. It is now read-only.

Design a structured scanner: auto attach language, dynamic load libs, nested scanner #3

Closed
Anddd7 opened this issue Apr 17, 2022 · 13 comments
Assignees
Labels
enhancement New feature or request

Comments


Anddd7 commented Apr 17, 2022

Is your feature request related to a problem? Please describe.
No

Describe the solution you'd like
Probes + scanners

Describe alternatives you've considered
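A probe scans the language and project structure to decide whether a given scanner applies
The scanner then performs the scan (interface-based)
Scanners can be nested, e.g. kotlin nests spring, which nests jpa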
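
To make the probe-plus-nested-scanner idea concrete, here is a minimal sketch. All names (`Project`, `NestedScanner`, the file-name checks used as probes) are hypothetical and chosen only for illustration; they are not taken from the ArchGuard code base.

```kotlin
// Hypothetical names throughout; none of these types come from ArchGuard.

/** A project as seen by a probe: just the relative file paths it contains. */
data class Project(val files: List<String>)

/** A scanner with a probe in front of it and optional nested scanners below it. */
class NestedScanner(
    val name: String,
    val applies: (Project) -> Boolean,                // the probe
    val children: List<NestedScanner> = emptyList(),  // nested scanners
) {
    /** Returns the names of all scanners that ran, parents before children. */
    fun run(project: Project): List<String> {
        if (!applies(project)) return emptyList() // probe rejected: skip the whole subtree
        return listOf(name) + children.flatMap { it.run(project) }
    }
}

fun main() {
    val jpa = NestedScanner("jpa", { p -> p.files.any { it.endsWith("Entity.kt") } })
    val spring = NestedScanner("spring", { p -> p.files.any { it.endsWith("application.yml") } }, listOf(jpa))
    val kotlin = NestedScanner("kotlin", { p -> p.files.any { it.endsWith(".kt") } }, listOf(spring))

    val project = Project(listOf("src/main/kotlin/App.kt", "src/main/resources/application.yml"))
    println(kotlin.run(project)) // [kotlin, spring]
}
```

In this sketch a nested scanner only runs when its parent's probe has accepted the project, so jpa is never probed for a project that is not a Spring project.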

Additional context
Plugin-based, using the pipeline-filter pattern

Member

phodal commented Apr 17, 2022

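Design of the existing rules

  1. Rule is the basic rule class.
  2. Rulesets is a collection of rules.
  3. RuleSetProvider returns the corresponding rules for different situations, such as Standard or Recommend.
  4. RuleVisitor provides a Visitor interface for each rule type (e.g. API, database, test bad smells).

For an example, see: tbs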
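
A rough sketch of how those four pieces could fit together. The signatures below, the `Issue` type, and `LowercaseUrlRule` are assumptions made up for illustration; they are not the actual ArchGuard rule API, for which the tbs example is the reference.

```kotlin
// Illustrative only: these signatures are assumptions, not the real ArchGuard rule API.

data class Issue(val ruleId: String, val message: String)

/** Rule: the basic unit; here it evaluates a single item of type T. */
abstract class Rule<T>(val id: String) {
    abstract fun evaluate(item: T): Issue?
}

/** RuleSet: a named collection of rules. */
class RuleSet<T>(val name: String, val rules: List<Rule<T>>)

/** RuleSetProvider: returns the rule set for a given profile, e.g. "standard". */
fun interface RuleSetProvider<T> {
    fun get(profile: String): RuleSet<T>
}

/** RuleVisitor: walks the items of one rule type and applies every rule to each item. */
class RuleVisitor<T>(private val items: List<T>) {
    fun visit(ruleSet: RuleSet<T>): List<Issue> =
        items.flatMap { item -> ruleSet.rules.mapNotNull { it.evaluate(item) } }
}

/** Hypothetical API rule: URL paths should not contain upper-case letters. */
class LowercaseUrlRule : Rule<String>("lowercase-url") {
    override fun evaluate(item: String): Issue? =
        if (item.any { it.isUpperCase() }) Issue(id, "URL '$item' contains upper-case letters") else null
}

val apiRules = RuleSetProvider<String> { profile ->
    RuleSet(profile, listOf(LowercaseUrlRule()))
}

fun main() {
    val issues = RuleVisitor(listOf("/users", "/getUser")).visit(apiRules.get("standard"))
    println(issues.map { it.ruleId }) // [lowercase-url]
}
```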

Author

Anddd7 commented Apr 17, 2022

8E6451C5-AABA-4EAE-B989-1CCC48AB4049

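Split scanners into language/entry scanners and feature scanners

  • Input: source (git, source code); processed by a language scanner; output: a data structure
  • Input: a data structure; processed by a feature scanner; output: a data structure (or a special scheme such as db or api)
  • Scanners are connected as a tree and executed one after another by in-order traversal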
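
The data flow above could be sketched as follows, assuming a two-level tree (one language scanner with feature scanners below it). `CodeUnit`, `ScannerTree`, and the toy scanners are hypothetical stand-ins; the real data structures are far richer.

```kotlin
// Hypothetical types for illustration; not the real ArchGuard or Chapi models.
data class CodeUnit(val name: String, val annotations: List<String>)

/** Language (entry) scanner: source in, data structures out. */
fun interface LanguageScanner {
    fun scan(sources: Map<String, String>): List<CodeUnit>
}

/** Feature scanner: data structures in, derived data out (here just names). */
fun interface FeatureScanner {
    fun scan(units: List<CodeUnit>): List<String>
}

/** A two-level scanner tree: one language scanner feeding several feature scanners. */
class ScannerTree(
    private val language: LanguageScanner,
    private val features: Map<String, FeatureScanner>,
) {
    fun run(sources: Map<String, String>): Map<String, List<String>> {
        val units = language.scan(sources) // the source is parsed once
        return features.mapValues { (_, feature) -> feature.scan(units) }
    }
}

// Toy language scanner: one unit per file, annotations are the lines starting with '@'.
val toyLanguageScanner = LanguageScanner { sources ->
    sources.map { (file, text) ->
        CodeUnit(
            name = file.substringAfterLast('/').substringBeforeLast('.'),
            annotations = text.lines().map { it.trim() }
                .filter { it.startsWith("@") }
                .map { it.removePrefix("@") },
        )
    }
}

val apiScanner = FeatureScanner { units ->
    units.filter { "RestController" in it.annotations }.map { it.name }
}
val dbScanner = FeatureScanner { units ->
    units.filter { "Entity" in it.annotations }.map { it.name }
}

fun main() {
    val tree = ScannerTree(toyLanguageScanner, mapOf("api" to apiScanner, "db" to dbScanner))
    val result = tree.run(
        mapOf(
            "src/UserApi.kt" to "@RestController\nclass UserApi",
            "src/User.kt" to "@Entity\nclass User",
        )
    )
    println(result) // {api=[UserApi], db=[User]}
}
```

The point of the split is that every feature scanner consumes the same language-scanner output, so adding a feature does not require touching the parsing step.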

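Scanners are configuration-driven, and the target URI (file path or URL) can be redirected

  • (Later) provide a scanner project template so that anyone can quickly create a language or feature scanner
  • Each scanner is a function with its own identifier and can be packaged separately; a configuration class declares the scanner tree
  • Use Class.forName or SPI to dynamically load the specified scanner, and make sure the chain still works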
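
A minimal sketch of the `Class.forName` route, assuming every scanner has a public no-argument constructor. `PluggableScanner`, `LineCountScanner`, and `loadScanner` are hypothetical names; in a real setup the class name would come from the scanner configuration and the class from a separately packaged jar on the classpath.

```kotlin
// Hypothetical interface and implementation, for illustration only.
interface PluggableScanner {
    val id: String
    fun scan(input: String): String
}

class LineCountScanner : PluggableScanner {
    override val id = "line-count"
    override fun scan(input: String) = "lines=${input.lines().size}"
}

/** Loads a scanner by fully qualified class name and verifies that it fits the chain. */
fun loadScanner(className: String): PluggableScanner {
    val instance = Class.forName(className).getDeclaredConstructor().newInstance()
    return instance as? PluggableScanner
        ?: throw IllegalArgumentException("$className does not implement PluggableScanner")
}

fun main() {
    // Stand-in for the configured scanner list.
    val configured = listOf(LineCountScanner::class.java.name)
    for (className in configured) {
        val scanner = loadScanner(className)
        println("${scanner.id}: ${scanner.scan("a\nb\nc")}") // line-count: lines=3
    }
}
```

The SPI alternative would use `java.util.ServiceLoader.load(PluggableScanner::class.java)`, which discovers implementations listed in a `META-INF/services` file inside each scanner jar, so no class names need to appear in the configuration.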

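Problems this solves

The largest scanner, chapi, currently covers all languages and is 40 MB. There is no rush to split it by language; the main goal is to arrange the scanner structure sensibly through refactoring and prevent unbounded growth.

  • A single scanner growing too large (it can be split at any analysis stage of the data structure)
  • Custom extensions (by overwriting the configuration, or by appending to it)

5.0? For large systems (similar to Jenkins X)

  • Run scanners as processes (similar to MapReduce): each scan request creates a pod
  • The pod contains the corresponding scanner environment
  • Use the scaling capabilities of k8s to schedule scanner execution (or a custom Operator)
  • Write the results to etcd, or write them back to archguard through hooks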

Member

phodal commented Apr 17, 2022

In theory, the large-system case could be considered together with archguard/archguard#43

Author

Anddd7 commented Apr 17, 2022

archguard/archguard#30 (comment)

class DubboScanner:FeatureScanner

language:kotlin // kotlin sourcecode
features:
- api
- db
- dubbo:git@xxx.xxx.dubbo-scanner.git

Author

Anddd7 commented Apr 17, 2022

In theory, the large-system case could be considered together with archguard/archguard#43

We can treat this as something to look forward to for now, haha

Author

Anddd7 commented Apr 18, 2022

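Design of the existing rules

  1. Rule is the basic rule class.
  2. Rulesets is a collection of rules.
  3. RuleSetProvider returns the corresponding rules for different situations, such as Standard or Recommend.
  4. RuleVisitor provides a Visitor interface for each rule type (e.g. API, database, test bad smells).

For an example, see: tbs

I took a look, and the implementation would probably be quite similar, but it is still better to keep the scanner separate (the perspective is different).
The scanner only does data cleaning, i.e. it converts the input into standard models (CodeDataStructure, ApiCodeCallMap, ContainerService...).
The input of a rule should be the output of a scanner.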

Member

phodal commented Apr 18, 2022

Right, they are quite similar, but they are really two different things.

Author

Anddd7 commented Apr 18, 2022

image

This diagram is quite clear

AST -> Chapi
Model Construction+Extraction -> Scanner
Analysis with Patterns -> Rule

Author

Anddd7 commented Apr 20, 2022

Update latest design for scanner

image

Anddd7 self-assigned this Apr 23, 2022
Anddd7 added the enhancement label Apr 23, 2022
Author

Anddd7 commented Apr 23, 2022

https://github.com/archguard/scanner/tree/master/scanner_cli
https://github.com/archguard/scanner/tree/master/analyser_sourcecode/lang_kotlin

TODO:

  • migrate all language analysers with new structure (e.g. lang_kotlin)
  • migrate API/DB analysers
  • test scanner cli to dispatch tasks to different analysers
  • create scanner-client/http api in ArchGuard Backend
  • test whole workflow
  • create analyser template as a standard project
  • migrate other analysers (git, bytecode ...)

Member

phodal commented Apr 25, 2022

Finally, we will probably also need a scanner_output, for example to support inserting directly into the database or generating JSON and CSV

Author

Anddd7 commented Apr 25, 2022

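Finally, we will probably also need a scanner_output, for example to support inserting directly into the database or generating JSON and CSV

https://github.com/archguard/scanner/blob/master/scanner_cli/src/main/kotlin/org/archguard/scanner/ctl/client/ArchGuardHttpClient.kt

Implement other ArchGuardClient variants that print the data as JSON or CSV once they receive it, and use a CLI parameter to choose between the HTTP client and the JSON client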
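
A sketch of that idea under stated assumptions: the `OutputClient` interface, the `ApiRecord` type, and the `--output=` flag are hypothetical and do not reproduce the methods of the real ArchGuardClient interface or the real CLI options. The JSON rendering is a toy with no string escaping.

```kotlin
// Hypothetical stand-ins; the real ArchGuardClient interface is not reproduced here.
data class ApiRecord(val path: String, val method: String)

interface OutputClient {
    fun render(records: List<ApiRecord>): String
}

/** Toy JSON output: no escaping, only suitable for this illustration. */
class JsonClient : OutputClient {
    override fun render(records: List<ApiRecord>): String =
        records.joinToString(",", "[", "]") { """{"path":"${it.path}","method":"${it.method}"}""" }
}

/** CSV output with a header row. */
class CsvClient : OutputClient {
    override fun render(records: List<ApiRecord>): String =
        (listOf("path,method") + records.map { "${it.path},${it.method}" }).joinToString("\n")
}

/** Picks the client from a CLI value; an HTTP client would be one more branch. */
fun clientFor(output: String): OutputClient = when (output) {
    "json" -> JsonClient()
    "csv" -> CsvClient()
    else -> throw IllegalArgumentException("unknown output: $output")
}

fun main(args: Array<String>) {
    // Hypothetical flag: --output=json or --output=csv, defaulting to json.
    val output = args.firstOrNull { it.startsWith("--output=") }?.substringAfter("=") ?: "json"
    val records = listOf(ApiRecord("/users", "GET"))
    println(clientFor(output).render(records))
}
```

Because the analysers only see the client interface, switching the destination is a matter of choosing a different implementation at startup.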

Author

Anddd7 commented May 17, 2022

Design and implementation are done; closing this issue.

Anddd7 closed this as completed May 17, 2022
2 participants