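This tutorial shows how to analyze an XML project with CodeFuse-Query. You will use the command-line tool to extract a code repository into a database, and then analyze that repository with the Godel language.

Check that the CLI is ready.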

In [1]:
!which sparrow

/sparrow-cli/sparrow



STEP 0: Clone the repository to analyze. We use the [sofa-boot](https://github.com/sofastack/sofa-boot.git) project as the example.

In [2]:
!git clone https://github.com/sofastack/sofa-boot.git -q

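STEP 1: Extract the code into data. Use the `sparrow database create` command to create a db file, specifying the repository to analyze (the `sofa-boot` subdirectory of the current directory), the language to analyze (`xml`), and where to store the db file (`./db/sofa-boot` under the current directory). Running the command generates a db file that holds the repository's structured data; all later analysis runs against this data.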

In [3]:
!sparrow database create --source-root sofa-boot --data-language-type xml --output ./db/sofa-boot --overwrite > /dev/null
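
When the same extraction has to be repeated for several repositories, the command above can be assembled from Python instead of typed by hand. The following is a minimal sketch that only reuses the flags already shown in this step; the helper names `build_create_command` and `create_database` are hypothetical, and running the extraction assumes `sparrow` is on `PATH`.

```python
import subprocess


def build_create_command(source_root, output_dir, language="xml", overwrite=True):
    """Return the argv list for `sparrow database create` (hypothetical helper)."""
    cmd = [
        "sparrow", "database", "create",
        "--source-root", source_root,
        "--data-language-type", language,
        "--output", output_dir,
    ]
    if overwrite:
        # Same flag as in the notebook cell: replace an existing db.
        cmd.append("--overwrite")
    return cmd


def create_database(source_root, output_dir):
    """Run the extraction; raises CalledProcessError if sparrow exits non-zero."""
    subprocess.run(
        build_create_command(source_root, output_dir),
        check=True,
        stdout=subprocess.DEVNULL,  # mirrors the `> /dev/null` redirect
    )
```

Calling `create_database("sofa-boot", "./db/sofa-boot")` would issue the same command as the cell above.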

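STEP 2: Analyze the db file with the Godel language. In this tutorial you can click the run button to the left of a code cell, or press `Shift+Enter`, to run the analysis script directly. The `%db /path/to/db` magic command sets the COREF db path; the kernel reads this value when it executes a query.

<b>Example</b> Query the POM information of [sofa-boot](https://github.com/sofastack/sofa-boot.git) (file path, referenced jar artifacts, and version information).

The first line uses the kernel magic command to set the db path to analyze; the Godel script that queries the POM information follows it.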

In [4]:
%db ./db/sofa-boot
// script
use coref::xml::*

schema DependencyElement extends XmlElement {}

impl DependencyElement {
    @data_constraint
    pub fn __all__(db: XmlDB) -> *DependencyElement {
        for(e in XmlElement(db)) {
            if (e.getElementName() = "dependency") {
                yield DependencyElement {
                    id: e.id,
                    location_id: e.location_id,
                    parent_id: e.parent_id,
                    index_order: e.index_order
                }
            }
        }
    }
}

schema GroupElement extends XmlElement {}

impl GroupElement {
    @data_constraint
    pub fn __all__(db: XmlDB) -> *GroupElement {
        for(e in XmlElement(db)) {
            if (e.getElementName() = "groupId") {
                yield GroupElement {
                    id: e.id,
                    location_id: e.location_id,
                    parent_id: e.parent_id,
                    index_order: e.index_order
                }
            }
        }
    }
}

schema VersionElement extends XmlElement {}

impl VersionElement {
    @data_constraint
    pub fn __all__(db: XmlDB) -> *VersionElement {
        for(e in XmlElement(db)) {
            if (e.getElementName() = "version") {
                yield VersionElement {
                    id: e.id,
                    location_id: e.location_id,
                    parent_id: e.parent_id,
                    index_order: e.index_order
                }
            }
        }
    }
}

schema ArtifactElement extends XmlElement {}

impl ArtifactElement {
    @data_constraint
    pub fn __all__(db: XmlDB) -> *ArtifactElement {
        for(e in XmlElement(db)) {
            if (e.getElementName() = "artifactId") {
                yield ArtifactElement {
                    id: e.id,
                    location_id: e.location_id,
                    parent_id: e.parent_id,
                    index_order: e.index_order
                }
            }
        }
    }
}

schema PomFile extends XmlFile {}

impl PomFile {
    @data_constraint
    pub fn __all__(db: XmlDB) -> *PomFile {
        for(f in XmlFile(db)) {
            if (f.getFileName() = "pom.xml") {
                yield PomFile {
                    id: f.id,
                    file_name: f.file_name,
                    relative_path: f.relative_path
                }
            }
        }
    }
}

// Output, for every <dependency> in a pom.xml: the file's relative path,
// the groupId (m1), the version (m2), and the artifactId (m3).
fn out(fileName: string, m1: string, m2: string, m3: string) -> bool {
    let (db = XmlDB::load("coref_xml_src.db")) {
        for (f in PomFile(db),
            e1 in GroupElement(db),
            e2 in VersionElement(db),
            e3 in ArtifactElement(db),
            c1 in XmlCharacter(db),
            c2 in XmlCharacter(db),
            c3 in XmlCharacter(db),
            p in DependencyElement(db)) {
            if (f.key_eq(p.getLocation().getFile()) &&
                fileName = f.getRelativePath() &&
                p.key_eq(e1.getParent()) &&
                e1.key_eq(c1.getBelongedElement()) &&
                m1 = c1.getText() &&
                p.key_eq(e2.getParent()) &&
                e2.key_eq(c2.getBelongedElement()) &&
                m2 = c2.getText() &&
                p.key_eq(e3.getParent()) &&
                e3.key_eq(c3.getBelongedElement()) &&
                m3 = c3.getText()) {
                return true
            }
        }
    }
}

fn main() {
    output(out())
}

/workspaces/CodeFuse-Query/tutorial/notebook/db/sofa-boot


Sparrow database is set to: /workspaces/CodeFuse-Query/tutorial/notebook/db/sofa-boot

2023-12-06 07:49:13,344 INFO: sparrow 2.0.0 will start
2023-12-06 07:49:13,345 INFO: database /workspaces/CodeFuse-Query/tutorial/notebook/db/sofa-boot/coref_xml_src.db size: 908.00 KB
2023-12-06 07:49:13,345 INFO: execute : /sparrow-cli/godel-script/usr/bin/godel /tmp/godel-jupyter-q_z2i2by/query.gdl -p /sparrow-cli/lib-1.0 -o /tmp/tmph3yjtf_3.gdl
2023-12-06 07:49:13,361 INFO: godel-script compile time: 0.02s
2023-12-06 07:49:13,362 INFO: execute : /sparrow-cli/godel-1.0/usr/bin/godel /tmp/tmph3yjtf_3.gdl --run-souffle-directly --package-path /sparrow-cli/lib-1.0 --souffle-fact-dir /workspaces/CodeFuse-Query/tutorial/notebook/db/sofa-boot --souffle-output-format json --souffle-output-path /tmp/godel-jupyter-q_z2i2by/query.json
2023-12-06 07:49:13,678 INFO: Task /tmp/godel-jupyter-q_z2i2by/query.gdl is success, result is NOT-EMPTY, execution time is  0.33s.
2023-12-06 07:49:13,678 INFO: run success

Total results: 105


Unnamed: 0,fileName,m1,m2,m3
0,sofa-boot-project/sofa-boot-parent/pom.xml,com.puppycrawl.tools,8.42,checkstyle
1,sofa-boot-project/sofa-boot-tools/sofa-boot-gr...,org.springframework.boot,3.1.2,spring-boot-gradle-plugin
2,sofa-boot-project/sofa-boot-tools/sofa-boot-gr...,org.apache.commons,1.19,commons-compress
3,sofa-boot-project/sofa-boot-tools/sofa-boot-gr...,io.spring.gradle,1.1.0,dependency-management-plugin
4,sofa-boot-project/sofaboot-dependencies/pom.xml,com.alipay.sofa,${sofa.ark.version},sofa-ark-springboot-starter
...,...,...,...,...
100,sofa-boot-tests/pom.xml,com.alipay.sofa,${sofa.boot.version},sofa-boot-smoke-tests-boot
101,sofa-boot-tests/pom.xml,com.alipay.sofa,${sofa.boot.version},sofa-boot-smoke-tests-ark
102,sofa-boot-tests/pom.xml,com.alipay.sofa,${sofa.boot.version},sofa-boot-smoke-tests-runtime
103,sofa-boot-tests/pom.xml,com.alipay.sofa,${sofa.boot.version},sofa-boot-smoke-tests-tracer


Save the result of the previous query run to a JSON or CSV file.

In [5]:
%%save_to ./query.csv

Query result saved to /workspaces/CodeFuse-Query/tutorial/notebook/query.csv
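
Once the result is on disk it can be read back with ordinary tooling. The sketch below counts result rows per POM file using only the Python standard library. It assumes the saved CSV has a header row containing the `fileName`, `m1`, `m2`, and `m3` columns shown in the result table above; the function names are hypothetical.

```python
import csv
from collections import Counter


def load_rows(csv_path):
    """Read the saved query result into a list of dicts keyed by column name."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def count_dependencies_per_pom(rows):
    """Count result rows per POM file.

    `rows` is any iterable of dicts that have a `fileName` key
    (assumed to match the query's first output column).
    """
    return Counter(row["fileName"] for row in rows)
```

For example, `count_dependencies_per_pom(load_rows("./query.csv")).most_common(5)` would list the five POM files with the most matched dependencies.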


STEP 3: That's it. You can now run further code analysis on the generated results.

Enjoy!
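
As one possible follow-up analysis, the sketch below separates dependencies whose version is a literal (such as `8.42`) from those whose version is a Maven property reference (such as `${sofa.boot.version}`). It assumes each result row is a dict whose `m2` key holds the version text, as in the query output above; `split_by_version_style` is a hypothetical helper name.

```python
import re

# Matches a value that consists entirely of one ${...} property reference.
PROPERTY_REF = re.compile(r"^\$\{[^}]+\}$")


def split_by_version_style(rows):
    """Split rows into (pinned, via_property) based on the `m2` version column."""
    pinned, via_property = [], []
    for row in rows:
        if PROPERTY_REF.match(row["m2"]):
            via_property.append(row)
        else:
            pinned.append(row)
    return pinned, via_property
```

Rows in the second list take their version from a property defined elsewhere in the build, so the query result alone does not tell you the resolved version for them.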