Multiple consecutive hyphens should be replaced by a proper Unicode dash or bar #27

Open
DavidHaslam opened this issue Dec 26, 2016 · 2 comments

DavidHaslam commented Dec 26, 2016

In the concatenated USFM file there are 113 matches for the regular expression -{2,} (two or more consecutive hyphens).

I think each match should be replaced by one of the following Unicode characters,
U+2013 – EN DASH
U+2014 — EM DASH
U+2015 ― HORIZONTAL BAR
as appropriate for each context.

Check the spacing before and after each instance too! Aim for consistency, rather than the present varying number of consecutive hyphens, which were keyed for convenience.
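
For anyone wanting to review the matches one by one, here is a rough Python sketch that lists each run of hyphens with some surrounding text, so the spacing on either side can be checked. It assumes the concatenated USFM text has already been read into a string. The function names and the context width are illustrative choices only, not part of any existing tooling for this project.

```python
import re
from collections import Counter

# Same pattern as quoted above: two or more consecutive hyphens.
HYPHEN_RUN = re.compile(r"-{2,}")


def find_hyphen_runs(text, context=20):
    """Yield (line_number, run, snippet) for every run of 2+ hyphens.

    The snippet keeps up to `context` characters on each side of the run,
    which makes the spacing before and after it easy to inspect.
    """
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in HYPHEN_RUN.finditer(line):
            start = max(match.start() - context, 0)
            end = match.end() + context
            yield line_number, match.group(), line[start:end]


def run_length_histogram(text):
    """Count how many runs there are of each length (2, 3, 4, ...)."""
    return Counter(len(m.group()) for m in HYPHEN_RUN.finditer(text))
```

The histogram is there only to show how inconsistent the keyed hyphen counts are. Choosing between EN DASH, EM DASH and HORIZONTAL BAR still needs a human decision for each context.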

DavidHaslam changed the title from "Multiple hyphens should be replaced by a proper Unicode dash or bar" to "Multiple consecutive hyphens should be replaced by a proper Unicode dash or bar" on Dec 26, 2016

DavidHaslam commented

There are also 8 instances of code point U+005F _ LOW LINE, 6 of which are single occurrences.
I take the view that these 6 should also be replaced by a proper Unicode dash or bar as appropriate.

The remaining two form a double __ at the end of Acts 28:25:

\v 25 ਜਾਂ ਉਹ ਆਪਸ ਵਿੱਚ ਇੱਕ ਜ਼ਬਾਨ ਨਾ ਹੋਏ ਤਾਂ ਪੌਲੁਸ ਦੇ ਇਹ ਇੱਕ ਗੱਲ ਕਹਿੰਦੇ ਹੀ ਓਹ ਚੱਲੇ ਗਏ ਕਿ ਪਵਿੱਤ੍ਰ ਆਤਮਾ ਨੇ ਤੁਹਾਡੇ ਵੱਡਿਆਂ ਨੂੰ ਯਸਾਯਾਹ ਨਬੀ ਦੀ ਜ਼ਬਾਨੀ ਠੀਕ ਆਖਿਆ ਸੀ, *__

The asterisk just before the __ might be the residue of a note marker?
This instance needs to be more carefully reviewed.
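
A similar sketch could separate the single low lines from the doubled one and flag any run that directly follows an asterisk, as in Acts 28:25. As before, it assumes the USFM text is available as a Python string, and the function name is hypothetical.

```python
import re

# One or more consecutive U+005F LOW LINE characters.
LOW_LINE_RUN = re.compile(r"_+")


def classify_low_lines(text):
    """Return (singles, multiples) lists of (line_number, run, after_asterisk).

    `after_asterisk` is True when the last non-space character before the
    run is '*', which may indicate the residue of a note marker.
    """
    singles, multiples = [], []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in LOW_LINE_RUN.finditer(line):
            before = line[:match.start()].rstrip()
            entry = (line_number, match.group(), before.endswith("*"))
            if len(match.group()) == 1:
                singles.append(entry)
            else:
                multiples.append(entry)
    return singles, multiples
```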

DavidHaslam commented Jan 17, 2017

There are in fact 76 asterisks in the complete text! See also issue #94
