# ST Generator

- Summarize the syntax of the ST (Structured Text) language
- Automatically generate ST code
- Use the generated code to test iec2c, the translator used by openplc


In [285]:
from fuzzingbook.Grammars import EXPR_EBNF_GRAMMAR, srange, convert_ebnf_grammar, Grammar, Expansion, CHARACTERS_WITHOUT_QUOTE
from fuzzingbook.Grammars import is_valid_grammar, exp_string
from fuzzingbook.GrammarFuzzer import EvenFasterGrammarFuzzer
from fuzzingbook.GrammarCoverageFuzzer import GrammarCoverageFuzzer
from fuzzingbook.bookutils import print_file
import string
import subprocess
import os
import tempfile


## ST Syntax

[ST syntax reference](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.49.2016&rep=rep1&type=pdf)


The following ST code was written and generated with openplc Editor; it can be imported into openplc Runtime and run.

```iecst
PROGRAM program0
  VAR
    PB1 : BOOL;
    PB2 : BOOL;
    LED : BOOL;
  END_VAR

  IF PB1 THEN
    LED := TRUE;
  END_IF;

  IF PB2 THEN
    LED := FALSE;
  END_IF;
END_PROGRAM


CONFIGURATION Config0

  RESOURCE Res0 ON PLC
    TASK task0(INTERVAL := T#20ms,PRIORITY := 0);
    PROGRAM instance0 WITH task0 : program0;
  END_RESOURCE
END_CONFIGURATION
```
The PROGRAM section is our main focus.


### Data types

https://en.wikipedia.org/wiki/IEC_61131-3

#### Elementary data types

- Integer
- Floating point
- Time
- String
- Boolean




#### Composite data types


- ARRAY
- STRUCT
- UNION
- Sub-range

```iecst
// Structure
TYPE Rectangle :
    STRUCT
    TopLeft : Point;
    Height : INT;
    Width : INT;
    END_STRUCT;
END_TYPE

// Enumeration
TYPE Color :
    (Red, White, Blue);
END_TYPE

// Sub-ranges
TYPE Angle :
    INT(-180..+180);
END_TYPE


// Array
TYPE Display :
    ARRAY[1..768, 1..1024] OF Color;
END_TYPE

```




### Assignment statements

```iecst
VarA := (VarB * 24 MOD 2 = 1) XOR (VarC <= VarD + 34);

(* AnArray is an array containing twenty integers *)
AnArray := 10(1), 5(2), 5(3);

VarA := AND(Var1, Var2, ..., VarN);
```
Some operators accept any number of arguments, provided they all have the same type.


In [286]:
# A.3
ST_ASSIGNMENTS_EBNF: Grammar = {
    "<AssignmentStatement>":
        ["<Variable> := <Expression>\n"],
}
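
To make concrete how a grammar dictionary like this turns into text, here is a minimal sketch of random expansion that uses only the standard library. It is not how fuzzingbook's fuzzers are implemented; `TOY_GRAMMAR`, `expand`, and the `max_depth` cut-off are assumptions made for illustration.

```python
import random
import re

# Matches nonterminals such as <Variable> (same bracket convention as the notebook).
NONTERMINAL = re.compile(r"<[^<> ]+>")

# Hypothetical toy grammar: one assignment rule plus a small Boolean expression.
TOY_GRAMMAR = {
    "<AssignmentStatement>": ["<Variable> := <Expression>"],
    "<Expression>": ["<Variable>",
                     "<Variable> AND <Expression>",
                     "<Variable> OR <Expression>"],
    "<Variable>": ["Var1", "Var2"],
}


def expand(grammar, symbol, rng, depth=0, max_depth=6):
    """Randomly expand `symbol` until no nonterminal is left."""
    options = grammar[symbol]
    if depth >= max_depth:
        # Past the cut-off, take the expansion with the fewest nonterminals
        # so that the recursion is guaranteed to stop.
        choice = min(options, key=lambda e: len(NONTERMINAL.findall(e)))
    else:
        choice = rng.choice(options)
    return NONTERMINAL.sub(
        lambda m: expand(grammar, m.group(0), rng, depth + 1, max_depth),
        choice)


statement = expand(TOY_GRAMMAR, "<AssignmentStatement>", random.Random(0))
print(statement)
```

The cut-off plays a role similar to the `max_nonterminals` limit passed to the fuzzer later in this notebook: without some bound, a recursive rule such as `<Expression>` could keep growing.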

### Function call statements


*Not considered for now.*

Note that ST does not allow recursive function calls.

> Note that, strictly speaking, the FUNCTION … END_FUNCTION and the VAR … END_VAR constructs mentioned in this paragraph do not belong to the Structured text programming language, as discussed earlier.

```iecst
FUNCTION_BLOCK FB_Timed_Counter
    VAR_INPUT
        // ...
    END_VAR
    
    VAR_OUTPUT
        // ...
    END_VAR
    
    VAR
        // ...
    END_VAR
        
    // Start of Function Block programming
    
END_FUNCTION_BLOCK

// Another way to write it?
FUNCTION Ave_REAL : REAL
    VAR_INPUT
        Input1, Input2 : REAL;
    END_VAR
    Ave_REAL := (Input1 + Input2) / 2;
END_FUNCTION
Average1 := Ave_REAL(5.0, 4.0);
Average2 := Ave_REAL(Input2 := 6.0);
(* Value 3.0 assigned to Average2 *)


```


In [287]:
# A.4
# TODO

### Conditional statements

Note the difference between the equality operator `=` and the assignment operator `:=`.


```iecst
IF VarB > 0 THEN
  IF VarC = 3 THEN
    VarA := TRUE;
  ELSE
    VarC := 3;
    VarA := FALSE;
  END_IF;
ELSIF VarB < 0 THEN
  VarA := FALSE;
ELSE
  VarC := 3;
  VarA := TRUE;
END_IF;


CASE Var233 OF
  0: 
  1:
ELSE
  
END_CASE;

```

In [288]:
# A.5
_ST_IF_S = ["""
IF <Expression> THEN  
    <StatementList>
(ELSIF <Expression> THEN
    <StatementList>)*
(ELSE
    <StatementList>)?
END_IF"""]


_ST_CASE_S = ["""
CASE <Expression> OF
    <CaseElement>+
(ELSE
    <StatementList>)?
END_CASE"""]

ST_SELECTION_STATEMENT_EBNF: Grammar = {
    "<SelectionStatement>":
        ["<IfStatement>", "<CaseStatement>"],
    "<IfStatement>": 
        _ST_IF_S,
    "<CaseStatement>": 
        _ST_CASE_S,
    "<CaseElement>": 
        ["<CaseList> : <StatementList>\n"],
    "<CaseList>":
        ["<CaseListElement> (, <CaseListElement>)*"],
    "<CaseListElement>":
        # ["<Subrange>", "<SignedInteger>"], TODO
        ["<SignedInteger>"],
}



### Iteration statements

ST has three kinds of iteration statement: `FOR`, `WHILE`, and `REPEAT`.

```iecst
FOR VarB := 1 TO VarC DO
  VarD := VarD + 1;
  VarE := 2 * VarE;
END_FOR;

FOR VarB := 100 TO 0 BY -2 DO
  VarD := VarD + VarB;
END_FOR;

WHILE VarA AND (VarD <= VarC) DO
  VarD := VarD + 1;
  VarE := 2 * VarE;
  VarA := VarE < 100;
END_WHILE;

REPEAT
  VarD := VarD + 1;
  VarE := 2 * VarE;
  VarA := VarE < 100;
UNTIL NOT(VarA AND (VarD <= VarC))
END_REPEAT;

```


In [289]:
# A.6

_ST_FOR_S = ["""
FOR <ControlVariable> := <ForList> DO
    <StatementList>
END_FOR"""]

_ST_WHILE_S = ["""
WHILE <Expression> DO
    <StatementList>
END_WHILE"""]

_ST_REPEAT_S = ["""
REPEAT
    <StatementList>
UNTIL <Expression>
END_REPEAT"""]

_ST_EXIT_S = ["EXIT"]

ST_ITERATION_STATEMENT_EBNF: Grammar = {
    "<IterationStatement>":
        ["<ForStatement>", 
         "<WhileStatement>", 
         "<RepeatStatement>",
         "<ExitStatement>",],
    "<ForStatement>":
        _ST_FOR_S,
    "<ControlVariable>":
        # ["<Identifier>"], TODO
        ["<Variable>"],
    "<ForList>":
        ["<Expression> TO <Expression> (BY <Expression>)?"],
    "<WhileStatement>":
        _ST_WHILE_S,
    "<RepeatStatement>":
        _ST_REPEAT_S,
    "<ExitStatement>":
        _ST_EXIT_S,
}



### Other program structures


#### Basic expressions


In [290]:
# See A.1
# TODO
ST_EXPRESSIONS_EBNF: Grammar = {
    "<Expression>":
        ["<XOR_Expression> (OR <XOR_Expression>)*"],
    "<XOR_Expression>":
        ["<AND_Expression> (XOR <AND_Expression>)*"],
    "<AND_Expression>":
        ["<Comparison> (AND <Comparison>)*"],
    "<Comparison>":
        ["<EquExpression> (= <EquExpression>)*"],
        #  "<EquExpression> (<> <EquExpression>)*",], TODO: escape the `<>` operator
    "<EquExpression>":
        ["<AddExpression> (<ComparisonOperator> <AddExpression>)*"],
    "<ComparisonOperator>":
        ["<", ">", "<=", ">="],
    "<AddExpression>":
        ["<term> (<AddOperator> <term>)*"],
    "<AddOperator>":
        ["+", "-"],
    
    # The rules below are slightly modified from fuzzingbook. TODO
    "<expr>":
        ["<term> + <expr>", "<term> - <expr>", "<term>"],

    "<term>":
        ["<factor> * <term>", "<factor> / <term>", "<factor>"],

    "<factor>":
        ["(<expr>)", "<SignedInteger>(.<SignedInteger>)?", "<Variable>"],

    "<sign>":
        [" NOT ", " -"],

    "<SignedInteger>":
        ["<sign>?<digit>+"],

    "<digit>":
        srange(string.digits),
        
    # Typed variables
    "<Variable>":
        ["VarWillBeReplace"],  # placeholder; overwritten with the generated variable names before fuzzing

}



#### Overall structure

In [291]:
# See A.2
ST_STATEMENTS_EBNF: Grammar = {
    "<StatementList>":
        ["(<Statement>;\n)+"],
    "<Statement>":
        ["<AssignmentStatement>\n",  # assignment statement
         # "<Function-statements>",  # function statement
         "<SelectionStatement>",  # selection statement
         "<IterationStatement>"],  # iteration statement
}

## ST EBNF

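### A first attempt based on Pascal

> Square brackets delimit optional constructs; `{}` denotes zero or more repetitions of the enclosed construct; `()` denotes simple grouping of constructs; `|` denotes a choice of one among several; literals in definitions are shown in bold or in double quotes.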

> EBNF operators – `?` becomes (0,1), `*` becomes (0,), and `+` becomes (1,). 
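
Each of these operators can also be rewritten into plain BNF rules, which is the form a grammar fuzzer ultimately expands. A minimal sketch of that rewrite follows; the helper `desugar` and the `-opt`/`-star`/`-plus` symbol names are assumptions made for illustration, and fuzzingbook's `convert_ebnf_grammar` names its generated symbols differently.

```python
def desugar(symbol, op):
    """Return (new_symbol, rules) replacing `symbol` followed by an EBNF operator."""
    name = symbol[1:-1]  # strip the angle brackets
    if op == "?":
        new = f"<{name}-opt>"
        return new, {new: ["", symbol]}             # zero or one
    if op == "*":
        new = f"<{name}-star>"
        return new, {new: ["", symbol + new]}       # zero or more
    if op == "+":
        new = f"<{name}-plus>"
        return new, {new: [symbol, symbol + new]}   # one or more
    raise ValueError(f"unknown EBNF operator: {op!r}")


new_symbol, rules = desugar("<digit>", "+")
print(new_symbol, rules)
```

With this rewrite, an expansion such as `"Var<digit>+"` would become `"Var<digit-plus>"` together with the extra rule for `<digit-plus>`.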


In [292]:
LOW_LEVEL_EBNF: Grammar = {
    "<variable-list>":
        ["<variable>", "<variable-list> , <variable>"],
    "<identifier-list>":
        ["<identifier>", "<identifier-list> , <identifier>"],
    "<expression-list>":
        ["<expression>", "<expression-list> , <expression>"],
    "<number>":
        ['<integer_number>', '<real_number>'],
    "<integer_number>":
        ['<digit_sequence>'],
    "<real_number>":
        ['<digit_sequence>(.<digit_sequence>)?<scale_factor>?'],
    "<scale_factor>":
        ["E<sign>?<digit_sequence>","e<sign>?<digit_sequence>"],  
    "<unsigned_digit_sequence>":
        ["<digit>+"],
    "<digit_sequence>":
        ["<sign>?<unsigned_digit_sequence>"],
    "<sign>":
        [" +", " -"],
    "<letter>":
        srange(string.ascii_letters),
    "<digit>":
        srange(string.digits),
    "<string>":
        ["'<string_character>+'"],
    "<string_character>":
        srange(CHARACTERS_WITHOUT_QUOTE) + ["''"],  # ?
    "<label>":
        ["<integer_number>"],
    "<constant>":
        [""],  # TODO: [ sign ] (constant_identifier | number) | string
}

VARIABLE_IDENTIFIER_EBNF: Grammar = {
    "<identifier>": # TODO = letter { letter | digit } 
        ["<letter>+"],
    "<file-variable>":
        ["<variable>"],
    "<referenced-variable>":
        ["<pointer-variable>^"], # ?
    "<record-variable>":
        ["<variable>"],
    "<pointer-variable>":
        ["<variable>"],
    "<actual-variable>":
        ["<variable>"],
    "<array-variable>":
        ["<variable>"],
    "<field-identifier>":
        ["<identifier>"],
    "<constant-identifier>":
        ["<identifier>"],
    "<variable-identifier>":
        ["<identifier>"],
    "<type-identifier>":
        ["<identifier>"],
    "<procedure-identifier>":
        ["<identifier>"],
    "<function-identifier>":
        ["<identifier>"],
    "<bound-identifier>":
        ["<identifier>"],
}

# INPUT_OUTPUT_EBNF 
# Record Fields
# Types

PASCAL_EBNF_GRAMMAR: Grammar = {
    "<start>":
        ["<variable-list>"],
    **LOW_LEVEL_EBNF,
}
# Validation is disabled: symbols such as <variable> and <expression> are not defined yet.
# assert is_valid_grammar(PASCAL_EBNF_GRAMMAR)
pascal_grammar = convert_ebnf_grammar(PASCAL_EBNF_GRAMMAR)
# assert is_valid_grammar(pascal_grammar)


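### Building the ST EBNF top-down

1. First generate the variable list.
2. Then generate the program body from the variable list.

*Because time was limited, only a subset of the ST language is modeled here.*

Template: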

```iecst
PROGRAM program0
  VAR
    // generated variable list goes here
  END_VAR
  // generated program body goes here
END_PROGRAM


CONFIGURATION Config0

  RESOURCE Res0 ON PLC
    TASK task0(INTERVAL := T#20ms,PRIORITY := 0);
    PROGRAM instance0 WITH task0 : program0;
  END_RESOURCE
END_CONFIGURATION
```
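
The two steps above can be sketched with the standard library alone. The helper names `make_var_block` and `render_program` are hypothetical, and the body is passed in as a plain string here rather than produced by a grammar fuzzer.

```python
import random

ST_TYPES = ["INT", "REAL", "STRING", "BOOL"]


def make_var_block(rng, count):
    """Step 1: pick variable names and types (duplicate names collapse in the dict)."""
    st_vars = {}
    for _ in range(count):
        st_vars[f"Var{rng.randrange(100)}"] = rng.choice(ST_TYPES)
    return st_vars


def render_program(st_vars, body):
    """Step 2: place the declarations and the body into the PROGRAM template."""
    decls = "".join(f"    {name} : {typ};\n" for name, typ in st_vars.items())
    return f"PROGRAM program0\n  VAR\n{decls}  END_VAR\n{body}\nEND_PROGRAM\n"


rng = random.Random(1)
st_vars = make_var_block(rng, 3)
first = next(iter(st_vars))
# A trivial self-assignment stands in for the generated body.
program = render_program(st_vars, f"  {first} := {first};")
print(program)
```

Keeping the variable table as a dictionary makes it easy to hand the same names to the body grammar afterwards, which is what the notebook does with `ST_BODY_EBNF_GRAMMAR["<Variable>"]`.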


#### Variable list EBNF

In [293]:
ST_VAR_EBNF_GRAMMAR: Grammar = {
    "<start>":
        ["<变量声明语句>+"],
    "<变量声明语句>":  # variable declaration statement
        ["<variable-name> : <variable-type>;\n"],
    "<variable-name>":  # e.g. Var233
        ["Var<digit>+"],
    "<variable-type>":  # e.g. INT, REAL
        ["INT", "REAL", "STRING", "BOOL"],
    "<digit>":
        srange(string.digits)
}  # Later replaced by picking variable names and types at random

#### Program body EBNF

In [294]:
ST_BODY_EBNF_GRAMMAR: Grammar = {
    "<start>":
        ["<StatementList>"],
    **ST_STATEMENTS_EBNF,
    **ST_EXPRESSIONS_EBNF,
    **ST_ASSIGNMENTS_EBNF,
    **ST_ITERATION_STATEMENT_EBNF,
    **ST_SELECTION_STATEMENT_EBNF,
}
assert is_valid_grammar(ST_BODY_EBNF_GRAMMAR)

## Generating ST

The program body is generated with fuzzingbook's `GrammarCoverageFuzzer`:

> Subsequent calls to `fuzz()` will go for further coverage (i.e., covering the other area code digits, for example); a call to `reset()` clears the recorded coverage, starting anew.
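
The idea behind coverage-guided choice can be illustrated with a much simplified sketch: prefer expansions that have not been used yet, and forget everything on `reset()`. The class `CoverageChooser` is hypothetical and is not fuzzingbook's implementation, which is more elaborate.

```python
import random


class CoverageChooser:
    """Pick expansions so that unused ones are tried before repeats."""

    def __init__(self, rng):
        self.rng = rng
        self.covered = set()  # (symbol, expansion) pairs used so far

    def choose(self, symbol, expansions):
        uncovered = [e for e in expansions if (symbol, e) not in self.covered]
        # Fall back to any expansion once everything has been covered.
        pick = self.rng.choice(uncovered or expansions)
        self.covered.add((symbol, pick))
        return pick

    def reset(self):
        """Clear the recorded coverage and start anew."""
        self.covered.clear()


chooser = CoverageChooser(random.Random(0))
types = ["INT", "REAL", "STRING", "BOOL"]
picks = [chooser.choose("<variable-type>", types) for _ in range(4)]
print(picks)
```

After four calls every type has been chosen exactly once, whereas four purely random picks would often repeat one type and miss another.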

In [308]:
import numpy as np


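ST_DATA_TYPES = ["INT", "REAL", "STRING", "BOOL"]
# Sampling weight of each type; only BOOL is enabled for now.
ST_DATA_TYPES_W = [0,0,0,1]
assert sum(ST_DATA_TYPES_W) == 1


# Pick random variable names and types.
# randint(1, 10) guarantees at least one variable, so "<Variable>" is never empty.
st_vars = {}
for _ in range(np.random.randint(1, 10)):
    st_vars[f'Var{np.random.randint(100)}'] = np.random.choice(ST_DATA_TYPES, p=ST_DATA_TYPES_W)
# Render the declarations that go between VAR and END_VAR.
st_vars_s = ""
for name in st_vars:
    st_vars_s += f"        {name} : {st_vars[name]};\n"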
    
# print(st_vars_s)
# print([*st_vars])

In [309]:
ST_BODY_EBNF_GRAMMAR["<Variable>"] = [*st_vars]
fbody = GrammarCoverageFuzzer(convert_ebnf_grammar(ST_BODY_EBNF_GRAMMAR),start_symbol="<start>",max_nonterminals=20)
body_s = fbody.fuzz()
# print(body_s)

In [310]:
# Assemble the program
MODEL_STR = f"""
PROGRAM program0
  VAR
{st_vars_s}
  END_VAR
  
{body_s}

END_PROGRAM
"""

# Write to file
with open("st_gen.st", "w") as f:
    f.write(MODEL_STR)
print(MODEL_STR)


PROGRAM program0
  VAR
        Var30 : BOOL;
        Var10 : BOOL;
        Var86 : BOOL;
        Var67 : BOOL;
        Var61 : BOOL;

  END_VAR
  
Var86 := Var30    AND Var67     OR Var61     XOR Var30    XOR Var61    

;

CASE Var10       OF
    6  : EXIT;

5  : EXIT;


ELSE
    EXIT;
EXIT;
EXIT;

END_CASE;


END_PROGRAM



## Testing

### Single test

In [314]:
program = "/home/hxn/下载/OpenPLC_Editor/matiec/iec2c"
FILE = "st_gen.st"
arg = [program,
       "-f",
       "-l",
       "-p",
       "-I", "/home/hxn/下载/OpenPLC_Editor/matiec/lib",
       "-T", "/home/hxn/桌面/ST_study/openPLC_project_test/build",
       FILE]
# -f // show full locations in error messages
# -l // use a relaxed data type equivalence model   (a non-standard extension?)
# -p // allow forward references                    (a non-standard extension?)
# -I // include_directory
# -T // target_directory
result = subprocess.run(arg,
                        stdin=subprocess.DEVNULL,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE,
                        universal_newlines=True)  # Will be "text" in Python 3.7

print(">>>stdout")
print(result.stdout)
print(">>>stderr")
print(result.stderr)
print(">>>returncode")
print(result.returncode)

>>>stdout
POUS.c
POUS.h
LOCATED_VARIABLES.h
VARIABLES.csv

>>>stderr

>>>returncode
0
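
When fuzzing a translator it helps to separate inputs that are merely rejected from inputs that crash the process. A minimal sketch follows, assuming a hypothetical helper `classify_run`. On POSIX systems `subprocess` reports a process killed by a signal as a negative return code; treating every positive exit code as "rejected" is an assumption about iec2c, based on the exit code 1 it returns together with error messages in the repeated test in this notebook.

```python
import subprocess
import sys


def classify_run(argv, timeout=10):
    """Run a command and classify the outcome for fuzzing purposes."""
    try:
        result = subprocess.run(argv,
                                stdin=subprocess.DEVNULL,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True,
                                timeout=timeout)
    except subprocess.TimeoutExpired:
        return "timeout"    # possible hang
    if result.returncode == 0:
        return "accepted"
    if result.returncode < 0:
        return "crash"      # killed by a signal (POSIX), e.g. a segmentation fault
    return "rejected"       # assumed: ordinary diagnostics and a nonzero exit


# Stand-in commands; in the notebook, argv would be the iec2c argument list.
print(classify_run([sys.executable, "-c", "pass"]))
print(classify_run([sys.executable, "-c", "raise SystemExit(1)"]))
```

Only the "crash" and "timeout" outcomes point at a defect in the tool itself; "rejected" inputs mostly reflect limits of the generator.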


### Repeated tests

In [312]:
basename = "st_gen.st"
tempdir = tempfile.mkdtemp()
FILE = os.path.join(tempdir, basename)
program = "/home/hxn/下载/OpenPLC_Editor/matiec/iec2c"
arg = [program,
       "-f",
       "-l",
       "-p",
       "-I", "/home/hxn/下载/OpenPLC_Editor/matiec/lib",
       "-T", tempdir,
       FILE]
# -f // show full locations in error messages
# -l // use a relaxed data type equivalence model   (a non-standard extension?)
# -p // allow forward references                    (a non-standard extension?)
# -I // include_directory
# -T // target_directory

REPEAT_TIMES = 100
returncodes = []

for i in range(REPEAT_TIMES):
    
    # Generate random variables
    st_vars = {}
    for j in range(np.random.randint(1,20)):
        st_vars[f'Var{np.random.randint(100)}'] = np.random.choice(ST_DATA_TYPES, p=ST_DATA_TYPES_W)
    st_vars_s = ""
    for j in st_vars:
        st_vars_s += f"        {j} : {st_vars[j]};\n"
    
    ST_BODY_EBNF_GRAMMAR["<Variable>"] = [*st_vars]
    fbody = GrammarCoverageFuzzer(convert_ebnf_grammar(ST_BODY_EBNF_GRAMMAR),start_symbol="<start>",max_nonterminals=20)

    # Assemble the program
    MODEL_STR = f"""
PROGRAM program0
    VAR
{st_vars_s}
    END_VAR

{fbody.fuzz()}

END_PROGRAM
"""
    # Write to file
    with open(FILE, "w") as f:
        f.write(MODEL_STR)
    
    result = subprocess.run(arg,
                            stdin=subprocess.DEVNULL,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)  # Will be "text" in Python 3.7

    # print(result.returncode)
    returncodes.append(result.returncode)
    # Report every input that iec2c did not accept
    if result.returncode != 0:
        print(">>>stdout")
        print(result.stdout)
        print(">>>stderr")
        print(result.stderr)
        print(">>>returncode")
        print(result.returncode)
        print(">>>FILE")
        print(MODEL_STR)
    if i % 50 == 0:
        print(f"{i}/{REPEAT_TIMES} done")
print("DONE, pass rate: ", returncodes.count(0)/REPEAT_TIMES)


>>>stdout

>>>stderr
/tmp/tmp4n2cor7_/st_gen.st:21-6..21-10: error: Invalid data type for 'FOR' control variable.
/tmp/tmp4n2cor7_/st_gen.st:21-19..21-51: error: Invalid data type for 'FOR' begin expression.
/tmp/tmp4n2cor7_/st_gen.st:21-61..21-79: error: Invalid data type for 'FOR' end expression.
/tmp/tmp4n2cor7_/st_gen.st:21-88..21-120: error: Invalid data type for 'FOR' by expression.
/tmp/tmp4n2cor7_/st_gen.st:26-13..26-16: error: 'CASE' quantity not an integer or enumerated.
5 error(s) found. Bailing out!

>>>returncode
1
>>>FILE

PROGRAM program0
    VAR
        Var15 : BOOL;
        Var68 : BOOL;
        Var62 : BOOL;
        Var60 : BOOL;
        Var28 : BOOL;
        Var41 : BOOL;
        Var57 : BOOL;
        Var22 : BOOL;
        Var3 : BOOL;
        Var44 : BOOL;
        Var40 : BOOL;
        Var56 : BOOL;
        Var19 : BOOL;

    END_VAR


FOR Var19 := Var15     XOR Var68    XOR Var56      TO Var41      OR Var44      BY Var40     XOR Var62     OR Var28      DO
    Var22
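
The errors reported above are type errors rather than syntax errors: every variable was declared `BOOL`, yet the grammar placed variables in `FOR` and `CASE` positions that require integers. One way to reduce such rejections is to choose variables by their declared type. A minimal sketch, assuming a hypothetical helper `pick_variable` that is not part of this notebook:

```python
import random


def pick_variable(st_vars, wanted_type, rng):
    """Return a random variable name of the wanted type, or None if there is none."""
    candidates = sorted(name for name, typ in st_vars.items() if typ == wanted_type)
    return rng.choice(candidates) if candidates else None


rng = random.Random(0)
st_vars = {"Var1": "BOOL", "Var2": "INT", "Var3": "INT"}

control_variable = pick_variable(st_vars, "INT", rng)   # usable in a FOR header
condition = pick_variable(st_vars, "BOOL", rng)         # usable in an IF condition
missing = pick_variable(st_vars, "REAL", rng)           # no REAL variable declared
print(control_variable, condition, missing)
```

A generator could skip a statement kind whenever the helper returns `None`, instead of emitting code that is certain to be rejected.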

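## Conclusions and future work

Inputs generated from the grammar rarely expose bugs in the iec2c translator: because the grammar rules themselves are correct, the generated programs seldom trigger crashes or other abnormal behavior.

Future work would combine crossover and mutation with source-code instrumentation to fuzz openplc in more depth.
Because time was limited, this was not completed.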
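
As a starting point for the mutation step mentioned above, a character-level mutator over generated ST source could look like the following sketch. The function `mutate` and its three operations are assumptions chosen for illustration; the notebook does not implement them.

```python
import random
import string


def mutate(source, rng):
    """Apply one random character-level edit: delete, insert, or replace."""
    if not source:
        return rng.choice(string.printable)
    pos = rng.randrange(len(source))
    op = rng.choice(["delete", "insert", "replace"])
    if op == "delete":
        return source[:pos] + source[pos + 1:]
    ch = rng.choice(string.printable)
    if op == "insert":
        return source[:pos] + ch + source[pos:]
    return source[:pos] + ch + source[pos + 1:]


seed = "IF PB1 THEN\n  LED := TRUE;\nEND_IF;\n"
rng = random.Random(0)
mutants = [mutate(seed, rng) for _ in range(5)]
for m in mutants:
    print(repr(m))
```

Unlike grammar-based generation, such mutants are frequently invalid ST, which exercises the error-handling paths of the translator.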


## References

[EBNF for Pascal](http://www.cs.kent.edu/~durand/CS43101Fall2004/resources/Pascal-EBNF.html)