week3.md

參考 1、2

語法理論

高階語言的核心式 " 語法理論 "
- 利用生成規則 BNF、EBNF描述程式的語法，生成的是一顆樹
- 對語法樹進行編譯或是解譯(直譯) 的動作，編譯比直譯還要快
- 詞彙層次: 使用正規表達式(RE: Regular Expression)
- 語句層次: 使用Context-free Grammar(CFG)
宣告變數有給初始值放在data段
宣告變數沒有給初始值放在bss段
malloc動態矩陣放在heap
區域變數，函數返回點放在stack

正規表達式

字元	解說
`\`	反斜線放在非特殊符號前面，使非特殊符號不會被逐字譯出，代表特殊作用。例如：'b' 如果沒有 '' 在前頭，功能是找出小寫 b；若改為 '\b' 則代表的是邊界功能，block 用意。 /\bter\b/.test("interest") //false /\bter\b/.test("in ter est") //true `/a/` 找出0個或是1個以上的a；而 /a/ 找出 'a' 這個字串 /aaaa/g.test("caaady") //true /a*/.test("caaady") //false '' 也能使自身表現出來，表現 '' ，必須以 '\' 表達。 /[\]/.test(">\<") //true
`^`	匹配輸入的開頭，如果 multiline flag 被設為 true，則會匹配換行字元後。例如：`/^A/` 不會匹配「an A」的 A，但會匹配「An E」中的 A。「`^`」出現在字元集模式的字首中有不同的意思，詳見補字元集。
`$`	匹配輸入的結尾，如果 multiline flag 被設為 true，則會匹配換行字元。例如：`/t$/` 不會匹配「eater」中的 t，卻會匹配「eat」中的 t。
`*`	匹配前一字元 0 至多次。下面舉例要求基本 'aaaa' ，'a' 後面有0個或多個a的意思 /aaaaa/g.test("caaady") //false例如：`/bo*/` 匹配「A ghost booooed」中的 boooo、「A bird warbled」中的 b，但在「A goat grunted」中不會匹配任何字串。
`+`	匹配前一字元 1 至多次，等同於 `{1,}`。例如：`/a+/` 匹配「candy」中的 a，以及所有「caaaaaaandy」中的 a。
`?`	匹配前一字元 0 至 1 次，等同於 `{0,1}`。例如：`/e?le?/` 匹配「angel」中的 el、「angle」中的 le、以及「oslo」中的 l。如果是使用在 `*`、`+`、`?` 或 `{}` 等 quantifier 之後，將會使這些 quantifier non-greedy（也就是儘可能匹配最少的字元），與此相對的是 greedy（匹配儘可能多的字元）。例如：在「123abc」中應用 `/\d+/` 可匹配「123」，但使用 `/\d+?/` 在相同字串上只能匹配「1」。 Also used in lookahead assertions, as described in the `x(?=y)` and `x(?!y)` entries of this table.
`.`	（小數點）匹配除了換行符號之外的單一字元。例如：/.n/ 匹配「nay, an apple is on the tree」中的 an 和 on，但在「nay」中沒有匹配。
`(x)`	Capturing Parentheses匹配 'x' 並記住此次的匹配，如下面的範例所示。在正則表達示 /(foo) (bar) \1 \2/ 中的 (foo) 與 (bar) 可匹配了 "foo bar foo bar" 這段文字中的前兩個字，而 \1 與 \2 則匹配了後面的兩個字。注意， \1, \2, ..., \n 代表的就是前面的pattern，以本範例來說，/(foo) (bar) \1 \2/ 等同於 /(foo) (bar) (foo) (bar)/。用於取代(replace)的話，則是用 $1, $2,...,$n。如 'bar boo'.replace(/(...) (...)/, '$2 $1'). `$&` 表示已匹配的字串
`(?:x)`	Non-Capturing Parentheses找出 'x'，這動作不會記憶 `()`為 group 的意思，檢查時會再 wrap 一次，若有 `g` flag 會無效， `?:` 代表只要 group 就好，不要 wrap有無 `()` 差別？ `'foo'.match(/(foo)/)` `// ['foo', 'foo', index: 0, input: 'foo' ]'foo'.match(/foo/)// [ 'foo', index: 0, input: 'foo' ]` 有無`?:`差別？ `'foo'.match(/(foo){1,2}/)// [ 'foo', 'foo', index: 0, input: 'foo' ]'foo'.match(/(?:foo){1,2}/)[ 'foo', index: 0, input: 'foo' ]` 連`()`都沒有，則`{1,2}`只是適用在`foo`的第二個`o`的條件而已。更多資訊詳見 Using parentheses 。
`x(?=y)`	符合'x'，且後接的是'y'。'y'為'x'存在的意義。例如：`/Jack(?=Sprat)/，`在後面是Sprat的存在下，Jack才有意義。 `/Jack(?=Sprat
`x(?!y)`	符合'x'，且後接的不是'y'。'y'為否定'x'存在的意義，後面不行前功盡棄(negated lookahead)。例如： `/\d+(?!\.)/` ，要找一個或多個數字時，在後面接的不是「點」的情況下成立。 `var result = /\d+(?!\.)/.exec("3.141")` ， result執行出來為[ '141', index: 2, input: '3.141']， index:2，代表141從index = 2開始。
`x\|y`	符合「x」或「y」。舉例來說，`/green
`{n}`	規定符號確切發生的次數，n為正整數例如：`/a{2}/`無法在 "candy" 找到、但 "caandy" 行；若字串擁有2個以上 "caaandy" 還是只會認前面2個。
`{n,m}`	搜尋條件：n為至少、m為至多，其n、m皆為正整數。若把m設定為0，則為Invalid regular expression。例如：`/a{1,3}/` 無法在 "cndy" 匹配到；而在 "candy" 中的第1個"a"符合；在 "caaaaaaandy" 中的前3個 "aaa" 符合，雖然此串有許多a，但只認前面3個。
`[xyz\]`	字元的集合。此格式會匹配中括號內所有字元, including escape sequences。特殊字元，例如點（`.`）和米字號（`*`），在字元集合中不具特殊意義，所以不需轉換。若要設一個字元範圍的集合，可以使用橫線 `"-"` ，如下例所示： `[a-d]` 等同於 `[abcd]。`會匹配 "brisket" 的 "b" 、"city" 的 'c' ……等。而`/[a-z.]+/` 和 `/[\w.]+/` 均可匹配字串 "test.i.ng" 。
`[^xyz\]`	bracket中寫入的字元將被否定，匹配非出現在bracket中的符號。可用 '-' 來界定字元的範圍。一般直接表達的符號都可以使用這種方式。`[^abc]`可以寫作[^`a-c]`. "brisket" 中找到 'r' 、"chop."中找到 'h'。
`[\b\]`	吻合倒退字元 (U+0008). (不會跟 \b 混淆)
`\b`	吻合文字邊界。A word boundary matches the position where a word character is not followed or preceded by another word-character. Note that a matched word boundary is not included in the match. In other words, the length of a matched word boundary is zero. (Not to be confused with `[\b]`.)Examples: `/\bm/` matches the 'm' in "moon" ; `/oo\b/` does not match the 'oo' in "moon", because 'oo' is followed by 'n' which is a word character; `/oon\b/` matches the 'oon' in "moon", because 'oon' is the end of the string, thus not followed by a word character; `/\w\b\w/` will never match anything, because a word character can never be followed by both a non-word and a word character.Note: JavaScript's regular expression engine defines a specific set of characters to be "word" characters. Any character not in that set is considered a word break. This set of characters is fairly limited: it consists solely of the Roman alphabet in both upper- and lower-case, decimal digits, and the underscore character. Accented characters, such as "é" or "ü" are, unfortunately, treated as word breaks.
`\B`	吻合非文字邊界。This matches a position where the previous and next character are of the same type: Either both must be words, or both must be non-words. The beginning and end of a string are considered non-words.For example, `/\B../` matches 'oo' in "noonday", and `/y\B./` matches 'ye' in "possibly yesterday."
`\cX`	Where X is a character ranging from A to Z. Matches a control character in a string.For example, `/\cM/` matches control-M (U+000D) in a string.
`\d`	吻合數字，寫法等同於 `[0-9] 。`例如：`/\d/` 或 `/[0-9]/` 在 "B2 is the suite number." 中找到 '2'
`\D`	吻合非數字，寫法等同於 `[^0-9]。`例如：`/\D/` 或`/[^0-9]/` 在 "B2 is the suite number." 中找到 'B' 。
`\f`	Matches a form feed (U+000C).
`\n`	Matches a line feed (U+000A).
`\r`	Matches a carriage return (U+000D).
`\s`	Matches a single white space character, including space, tab, form feed, line feed. Equivalent to `[ \f\n\r\t\v\u00a0\u1680\u180e\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]`.For example, `/\s\w*/` matches ' bar' in "foo bar."
`\S`	Matches a single character other than white space. Equivalent to `[^ \f\n\r\t\v\u00a0\\u1680u180e\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u2028\u2029\u202f\u205f\u3000]`.For example, `/\S\w*/` matches 'foo' in "foo bar."
`\t`	Matches a tab (U+0009).
`\v`	Matches a vertical tab (U+000B).
`\w`	包含數字字母與底線，等同於`[A-Za-z0-9_]`。例如： `/\w/` 符合 'apple'中的 'a' 、'$5.28中的 '5' 以及 '3D' 中的 '3'。For example, `/\w/` matches 'a' in "apple," '5' in "$5.28," and '3' in "3D."
`\W`	包含"非"數字字母與底線，等同於`[^A-Za-z0-9_]`。例如： `/\W/` 或是 `/[^A-Za-z0-9_]/` 符合 "50%." 中的 '%'For example, `/\W/` or `/[^A-Za-z0-9_]/` matches '%' in "50%."
`\n`	其中 n 是一個正整數，表示第 n 個括號中的子字串匹配（包含括號中的所有的字串匹配）例如： `/apple(,)\sorange\1/` 符合 "apple, orange, cherry, peach." 的 'apple, orange,' 。（ `\1` 表示第一個 partten ，也就是 `(,)`）For example, `/apple(,)\sorange\1/` matches 'apple, orange,' in "apple, orange, cherry, peach."
`\0`	Matches a NULL (U+0000) character. Do not follow this with another digit, because `\0<digits>` is an octal escape sequence. Instead use `\x00`.
`\xhh`	Matches the character with the code hh (two hexadecimal digits)

C0語言

編譯器的六大階段

詞彙掃描(Scan | Lexical Analysis)
- 將整個程式碼分成一個一個的基本詞彙 (token)
語法剖析(Parsing | Syntax Analysis)
- 剖析器利用語法規則進行比對，逐步建立語法樹
- 研究所會考，這邊只講LL的遞迴下降法，就是上面寫的EBNF
語意分析(Semantic Analysis)
- 為語法數加註節點型態，並檢查型態是否相容，然後輸出語意樹，告訴系統誰先做誰後做
中間碼產生(Pcode Generator)
- 語意樹被轉換成中間碼
最佳化(Optimization)
- 考慮暫存器的配置問題，降低指令數，增加效率，但有時候增加速度代碼反而變多
- 這個是每個工程師最需要投入的地方，很多東西都需要提升效能
組合語言產生(Assembly Code Generator)
- 將中間碼轉換為組合語言輸出

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

week3.md

語法理論

正規表達式

C0語言

Clone this wiki locally