Prevent non-Korean contents from being processed #10

Closed
dahlia opened this issue May 26, 2020 · 1 comment
dahlia commented May 26, 2020

An HTML element can carry a lang attribute that indicates the language of its content. If an element's content is guaranteed not to be Korean, i.e., the element has a lang attribute whose language tag does not start with ko, Seonbi should not process the element or its contents.

For example, the following input:

<p>小說 <cite lang="ja">徳川家康</cite>는 韓國에서 <cite>大望</cite>이라는
海賊版으로 옮겨진 것이 더 有名했다.</p>

… should be processed like:

<p>소설 <cite lang="ja">徳川家康</cite>는 한국에서 <cite>대망</cite>이라는
해적판으로 옮겨진 것이 더 유명했다.</p>
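
The rule above can be sketched in a few lines of Python. This is only an illustration of the proposed behavior, not Seonbi's actual implementation: the names `is_non_korean`, `LangAwareSplitter`, and `split_by_language` are hypothetical, and the sketch additionally assumes that lang is inherited by child elements and that an empty lang value means "unknown" rather than "non-Korean".

```python
from html.parser import HTMLParser

# Void elements never get a matching end tag, so they must not touch the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


def is_non_korean(lang):
    """True only when a non-empty language tag is present and does not start with "ko"."""
    tag = (lang or "").strip().lower()
    return bool(tag) and not tag.startswith("ko")


class LangAwareSplitter(HTMLParser):
    """Hypothetical helper: splits text into (text, should_process) pairs."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lang_stack = []  # effective lang of each currently open element
        self.segments = []    # collected (text, should_process) pairs

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        own = dict(attrs).get("lang")
        inherited = self.lang_stack[-1] if self.lang_stack else None
        # Assumption: an element without its own lang inherits its parent's.
        self.lang_stack.append(own if own is not None else inherited)

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS and self.lang_stack:
            self.lang_stack.pop()

    def handle_data(self, data):
        lang = self.lang_stack[-1] if self.lang_stack else None
        self.segments.append((data, not is_non_korean(lang)))


def split_by_language(html):
    """Return the text of an HTML fragment as (text, should_process) pairs."""
    parser = LangAwareSplitter()
    parser.feed(html)
    parser.close()
    return parser.segments


if __name__ == "__main__":
    sample = '<p>小說 <cite lang="ja">徳川家康</cite>는 <cite>大望</cite></p>'
    for text, should_process in split_by_language(sample):
        print(repr(text), "process" if should_process else "leave as-is")
```

With the input from this issue, only the text inside `<cite lang="ja">` would be flagged to be left untouched; everything else, including the second `<cite>` without a lang attribute, stays eligible for processing.
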

dahlia added the "enhancement" (New feature or request) label on May 26, 2020
dahlia changed the title from "Ignore preprocessing non-Korean contents" to "Prevent non-Korean contents from being processed" on May 26, 2020
dahlia self-assigned this on May 29, 2021

dahlia commented Nov 6, 2021

Chinese characters in non-Korean content are no longer phoneticized into hangul. However, some inappropriate transformations are still applied to such content. For example, Chinese/Japanese stops (、 & 。) are still transformed into Western stops (, & .).

dahlia closed this as completed in 0216f46 on Nov 6, 2021