## Python Regular Expression Rules

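| Symbol     | Description                                           | Example    | Sample matches  |
|------------|-------------------------------------------------------|------------|-----------------|
| `.`        | Matches any character except a newline                | `a.c`      | 'abc', 'a9c'    |
| `^`        | Matches the start of the string                       | `^abc`     | 'abc'           |
| `$`        | Matches the end of the string                         | `abc$`     | 'abc'           |
| `*`        | Matches the preceding element 0 or more times         | `ab*c`     | 'ac', 'abc'     |
| `+`        | Matches the preceding element 1 or more times         | `ab+c`     | 'abc', 'abbc'   |
| `?`        | Matches the preceding element 0 or 1 times            | `ab?c`     | 'ac', 'abc'     |
| `{m}`      | Matches the preceding element exactly m times         | `a{2}b`    | 'aab'           |
| `{m,}`     | Matches the preceding element at least m times        | `a{2,}b`   | 'aab', 'aaab'   |
| `{m,n}`    | Matches the preceding element m to n times            | `a{2,3}b`  | 'aab', 'aaab'   |
| `\`       | Escapes a special character                           | `a\.c`    | 'a.c'           |
| `[abc]`    | Matches any one character listed in the brackets      | `a[bc]d`   | 'abd', 'acd'    |
| `[^abc]`   | Matches any one character not listed in the brackets  | `a[^bc]d`  | 'aad', 'aed'    |
| `a\|b`      | Matches a or b                                        | `a\|b`      | 'a', 'b'        |
| `(abc)`    | Groups and captures the enclosed expression           | `a(abc)d`  | 'aabcd'         |
| `(a\|b)c`   | Matches a or b, followed by c                         | `(a\|b)c`   | 'ac', 'bc'      |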

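The rows in the table can be checked with `re.fullmatch`, which succeeds only when the entire string matches the pattern. This is a minimal sketch: the `cases` list and its sample strings are illustrative choices, not part of the table itself.

```python
import re

# (pattern, string that should match, string that should not match)
cases = [
    (r"a.c", "a9c", "a\nc"),       # "." does not match a newline by default
    (r"ab*c", "ac", "adc"),        # "*" allows zero or more "b"
    (r"ab+c", "abbc", "ac"),       # "+" requires at least one "b"
    (r"ab?c", "abc", "abbc"),      # "?" allows at most one "b"
    (r"a{2,3}b", "aaab", "ab"),    # two or three "a", then "b"
    (r"a\.c", "a.c", "abc"),       # an escaped "." is a literal dot
    (r"a[^bc]d", "aed", "abd"),    # negated character class
    (r"(a|b)c", "bc", "cc"),       # alternation inside a group
]

# One (matches_good, matches_bad) pair per pattern.
results = [
    (bool(re.fullmatch(p, good)), bool(re.fullmatch(p, bad)))
    for p, good, bad in cases
]
print(results)
```

Every pair should come out as `(True, False)`.
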
### Special Sequences

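| Symbol     | Description                                                 | Example     | Sample match |
|------------|-------------------------------------------------------------|-------------|--------------|
| `\d`       | Matches any digit                                           | `\d\d`      | '12'         |
| `\D`       | Matches any non-digit                                       | `\D\D`      | 'ab'         |
| `\s`       | Matches any whitespace character                            | `a\sb`      | 'a b'        |
| `\S`       | Matches any non-whitespace character                        | `a\Sb`      | 'aab'        |
| `\w`       | Matches any word character (letter, digit, or underscore)   | `\w\w`      | 'a1'         |
| `\W`       | Matches any non-word character                              | `\W\W`      | '@!'         |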

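A short sketch of the special sequences on a single string. The `sample` text is made up for illustration; note that `\w` also accepts the underscore.

```python
import re

sample = "Room 42_b\tis open!"

digits = re.findall(r"\d+", sample)      # runs of digits
words = re.findall(r"\w+", sample)       # letters, digits, and underscore
spaces = re.findall(r"\s", sample)       # the two spaces and the tab
non_words = re.findall(r"\W", sample)    # everything \w does not match

print(digits, words, spaces, non_words)
```
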


In [1]:
import re

In [17]:
pattern = re.compile(r"a?\d+\+?")
str1 = "123 a4567 4567+"
result = pattern.search(str1)
result.group()

'123'

In [14]:
# re.search(pattern, s) is equivalent to pattern.search(s), so the result is the same
result2 = re.search(pattern, str1)
result2.group()

'123'

In [125]:
pattern = re.compile(r"(\d+) (\d+)")
str1 = "123 4567 789"
result = pattern.search(str1)
result.group()

'123 4567'

In [129]:
result.group(0)

'123 4567'

In [127]:
result.group(1)

'123'

In [128]:
result.group(2)

'4567'
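
Besides `group(n)`, a match object can return all captured groups at once and report where each one sits in the string. A minimal sketch using the same pattern and text as above; the variable names are illustrative.

```python
import re

match = re.search(r"(\d+) (\d+)", "123 4567 789")

all_groups = match.groups()    # every captured group as a tuple
first_span = match.span(1)     # (start, end) indices of group 1
whole_span = match.span()      # indices of the whole match (group 0)

print(all_groups, first_span, whole_span)
# → ('123', '4567') (0, 3) (0, 8)
```
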

In [34]:
pattern = re.compile(r"(\d+) (\d+)")
str1 = "123 4567 789"
re.match(pattern, str1).group()

'123 4567'

In [37]:
pattern = re.compile(r"(\d{4}) (\d+)")
str1 = "123 4567 789"
result = re.match(pattern, str1)
print(result)
print()

None
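
The `None` above comes from `re.match` only trying the pattern at position 0. The sketch below compares it with `search` and `fullmatch` on the same pattern; the variable names are illustrative.

```python
import re

pattern = re.compile(r"(\d{4}) (\d+)")
text = "123 4567 789"

at_start = pattern.match(text)           # anchored at position 0: "123 " is not four digits
anywhere = pattern.search(text)          # scans forward and finds "4567 789"
whole = pattern.fullmatch("4567 789")    # the entire string must match

print(at_start, anywhere.group(), whole.groups())
# → None 4567 789 ('4567', '789')
```

Since `match` and `search` return `None` on failure, check the result before calling `.group()`.
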



In [64]:
pattern_g = re.compile(r"\d+ \d+")
str1 = "123 4567 789 45"
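# Without capturing groups, findall returns each full match as a string
result = re.findall(pattern_g, str1)
print(result[0][0])  # first character of the first match
print(result)

1
['123 4567', '789 45']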


In [69]:
pattern = re.compile(r"(\d+?) (\d+?)")
str1 = "123 4567 789 45"
result = re.findall(pattern, str1)
print(result[0][0])
print(result)

123
[('123', '4'), ('567', '7'), ('89', '4')]
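
To see what the `?` after `+` changes, the sketch below runs the greedy and lazy versions of the pattern on the same text. The first `\d+?` still has to reach the space, so it captures the whole run of digits; the second stops after one digit because nothing after it forces it to continue.

```python
import re

text = "123 4567 789 45"

greedy = re.findall(r"(\d+) (\d+)", text)      # each \d+ takes as many digits as it can
lazy = re.findall(r"(\d+?) (\d+?)", text)      # each \d+? stops as early as the match allows

print(greedy)  # → [('123', '4567'), ('789', '45')]
print(lazy)    # → [('123', '4'), ('567', '7'), ('89', '4')]
```
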


In [4]:
import re

In [1]:
str2 = "Hello, My name is Steve. I am a programmer."

In [2]:
str2.split()

['Hello,', 'My', 'name', 'is', 'Steve.', 'I', 'am', 'a', 'programmer.']

In [36]:
str2 = "Hello, My name is Steve. I am a programmer."
re.split(r"[.,\s]+", str2)

['Hello', 'My', 'name', 'is', 'Steve', 'I', 'am', 'a', 'programmer', '']

In [35]:
re.findall(r"[^.,\s]+", str2)

['Hello', 'My', 'name', 'is', 'Steve', 'I', 'am', 'a', 'programmer']
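
`re.split` leaves a trailing `''` because the final `.` is a delimiter with nothing after it, while `re.findall` on the negated class never produces empty items. A minimal sketch of filtering the split result so both approaches agree:

```python
import re

str2 = "Hello, My name is Steve. I am a programmer."

parts = re.split(r"[.,\s]+", str2)
tokens = [p for p in parts if p]    # drop the empty string left by the final "."

print(tokens == re.findall(r"[^.,\s]+", str2))  # → True
```
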

In [24]:
str2 = "Hello, My name is Steve. I am a programmer"
re.split(r"([A-H])", str2)

['', 'H', 'ello, My name is Steve. I am a programmer']
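
The `'H'` stays in the result because the split pattern contains a capturing group: `re.split` then also returns the captured text of each delimiter. A small sketch of the difference, using a made-up string:

```python
import re

text = "one, two; three"

without_group = re.split(r"[,;] ", text)      # delimiters are dropped
with_group = re.split(r"([,;]) ", text)       # the captured part of each delimiter is kept

print(without_group)  # → ['one', 'two', 'three']
print(with_group)     # → ['one', ',', 'two', ';', 'three']
```
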

In [102]:
text = "The stock code for Apple is AAPL and price is NT$ 120.15, and for Microsoft is MSFT."
re.findall(r"NT\$ \d+\.?\d*", text)

['NT$ 120.15']
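
With a capturing group around the number, `findall` returns just the numeric part, which can be converted to `float`. The ticker pattern below is a sketch that assumes a stock code is exactly four uppercase letters, which holds for this sentence but not in general.

```python
import re

text = "The stock code for Apple is AAPL and price is NT$ 120.15, and for Microsoft is MSFT."

# With one capturing group, findall returns only the captured number.
prices = [float(p) for p in re.findall(r"NT\$ (\d+\.?\d*)", text)]
# \b keeps the match from starting or ending inside a longer word.
tickers = re.findall(r"\b[A-Z]{4}\b", text)

print(prices, tickers)
# → [120.15] ['AAPL', 'MSFT']
```
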

In [106]:
text1 = "Today is 2023-08-31. And tomorrow is 2023-9-1"
re.findall(r"\d{4}-\d{1,2}-\d{1,2}", text1)

['2023-08-31', '2023-9-1']
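
The pattern above only checks the shape of a date, so a string such as `2023-99-99` would also match. One possible way to validate the matches is to capture the three parts and pass them to `datetime.date`, sketched here:

```python
import re
from datetime import date

text1 = "Today is 2023-08-31. And tomorrow is 2023-9-1"

dates = [
    date(int(y), int(m), int(d))    # raises ValueError for impossible dates
    for y, m, d in re.findall(r"(\d{4})-(\d{1,2})-(\d{1,2})", text1)
]

print(dates)
# → [datetime.date(2023, 8, 31), datetime.date(2023, 9, 1)]
```
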

In [108]:
ord('a')  # Unicode code point; identical to the ASCII code for 'a'

97

In [109]:
ord("哈")

21704

In [110]:
chr(21704)

'哈'

In [114]:
ord("😂")

128514

In [115]:
chr(128514)

'😂'
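
The integers returned by `ord` are the same code points that `\u` escapes use, only written in decimal. A minimal sketch connecting the two notations:

```python
code_point = ord("哈")

as_hex = f"{code_point:04x}"               # 21704 written in hexadecimal
same_char = "\u54c8" == chr(code_point)    # \uXXXX takes exactly four hex digits
emoji = "\U0001f602"                       # code points above U+FFFF need \UXXXXXXXX

print(as_hex, same_char, ord(emoji))
# → 54c8 True 128514
```
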

In [119]:
str1 = "你好，我叫做Steve"
re.findall(r"[\u4e00-\u9fa5，。]+", str1)

['你好，我叫做']

In [122]:
chr(int("9fa5", 16))

'龥'
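
The range `\u4e00-\u9fa5` is a common shorthand for Chinese characters, but the CJK Unified Ideographs block itself runs to U+9FFF, and rarer characters live in separate extension blocks. The sketch below shows a code point just past U+9FA5 falling outside the narrower range; whether the wider range is needed depends on the text being processed.

```python
import re

narrow = re.compile(r"[\u4e00-\u9fa5]+")    # range used above
wide = re.compile(r"[\u4e00-\u9fff]+")      # full CJK Unified Ideographs block

later_char = chr(0x9FA6)                    # first code point after U+9FA5

print(narrow.findall(later_char), wide.findall(later_char))
```
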