Regex truncation problem (garbled text) #1135
Comments
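-- Add utf8.sub() directly. Usage matches string.sub(str, si, ei), except that
-- si and ei count UTF-8 characters rather than bytes (requires Lua 5.3+).
function utf8.sub(str, si, ei)
  -- Map a character index to a byte offset, clamping out-of-range indices.
  local function index(ustr, i)
    return i >= 0 and (utf8.offset(ustr, i) or ustr:len() + 1)
      or (utf8.offset(ustr, i) or 1)
  end
  local u_si = index(str, si)
  -- The end byte is one before the start of the character after ei.
  ei = (ei or utf8.len(str)) + 1
  -- ei == 0 here means the caller passed -1, i.e. through the last character.
  local u_ei = ei == 0 and str:len() or index(str, ei) - 1
  return str:sub(u_si, u_ei)
end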
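For the original goal of keeping only the first 50 characters, a smaller helper may be enough. The sketch below assumes Lua 5.3 or later (for the built-in utf8 library) and valid UTF-8 input; utf8_prefix is a hypothetical name, not something from this thread.

```lua
-- Hypothetical helper: keep the first n UTF-8 characters of s.
local function utf8_prefix(s, n)
  if n <= 0 then return "" end
  -- utf8.offset(s, n + 1) is the byte position where character n + 1 starts.
  -- It returns nil when s has fewer than n characters.
  local next_start = utf8.offset(s, n + 1)
  if not next_start then return s end
  return s:sub(1, next_start - 1)
end

print(utf8_prefix("你好世界abc", 2))  -- cuts on a character boundary
print(utf8_prefix("你好", 50))        -- shorter than n: returned unchanged
```

Cutting at the byte offset of the next character means a multi-byte character is never split.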
You can use
I tested both approaches, shewer's and Ace-Who's, and both work 👍🏻
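Thanks, that fixed most cases, but I still see some garbled text. My guess is that the pattern does not cover all of UTF-8. After reading the UTF-8 encoding rules, I found a more complete regex match: encoding table
Base conversion reference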
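On the Lua side, the standard library already ships a pattern for one UTF-8 byte sequence, utf8.charpattern, which avoids hand-writing byte ranges. The following is a minimal sketch, assuming Lua 5.3 or later and valid UTF-8 input; take_chars is a hypothetical helper, and this is not the pattern posted in the comment above.

```lua
-- Hypothetical helper: collect at most n characters from s,
-- matching one UTF-8 byte sequence at a time.
local function take_chars(s, n)
  local out = {}
  for ch in s:gmatch(utf8.charpattern) do
    if #out >= n then break end
    out[#out + 1] = ch
  end
  return table.concat(out)
end

print(take_chars("帕金森abc", 2))
```

utf8.charpattern only matches correctly when the subject is valid UTF-8, so input of unknown origin could be checked with utf8.len first.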
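The 「帕」 example is probably not a problem with the UTF-8 character pattern; it is that in the expression
Still has garbled-text problems:
No garbled-text problems found so far: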
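I want to use a regex to keep the first 50 characters, but it keeps the first 50 bytes instead. The last UTF-8 Chinese character (3 bytes) is cut in half and shows up garbled.
Appending /u at the end of the pattern should make it match as Unicode (UTF-8), but I don't know how to add that in the configuration; adding it directly at the end has no effect.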
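The symptom can be reproduced in a few lines of Lua. This is only an illustration, assuming Lua 5.3 or later and a UTF-8 encoded source file; the sample string is made up.

```lua
local s = "中文字"        -- 3 characters, 9 bytes in UTF-8
print(#s)                 -- byte length: 9
print(utf8.len(s))        -- character length: 3

-- Cutting by bytes keeps all of "中" plus only the first byte of "文".
local cut = s:sub(1, 4)
-- On invalid UTF-8, utf8.len returns a false value plus the position
-- of the first invalid byte.
local n, bad_pos = utf8.len(cut)
print(n, bad_pos)
```

The dangling lead byte at position 4 is what renders as a garbled character.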