Add enum/of_enum and implode/explode to BatUTF8 #547

Open
Contributor

pminten commented Mar 8, 2014

I'm not quite sure if UTF-8 in these source files (in the test comments) is good style.

Member

c-cube commented Mar 8, 2014

I'm not sure, but I think the high-level text interface is BatText, which builds a rope structure on top of UTF-8 strings, and BatText already provides iterators.

Contributor

pminten commented Mar 8, 2014

If that's the case I'm a bit confused as to why BatUTF8 exists as a public module. Also BatText has the "scary" description "Heavyweight strings ("ropes")" which to me implies that in most cases I'll want to use BatString/BatUTF8 instead. That might be me misinterpreting the docs, or the docs needing a bit of clarification as to what the "best" module to use is.

Member

c-cube commented Mar 8, 2014

I don't think "heavyweight" means slower in this context. Ropes are trees of strings (not mere arrays of bytes) and actually make some operations, such as concatenation, faster. I hope someone corrects me if I'm wrong, but I think that for Unicode text processing (not low-level things like reading/writing bytes) Text should be the default choice.

Simon
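
To make the concatenation point concrete, here is a minimal toy rope in plain OCaml. It is only a sketch of the general idea, not BatText's actual implementation: the names `rope`, `length`, `concat` and `to_string` are made up for illustration, lengths are counted in bytes rather than Unicode characters, and no rebalancing is done.

```ocaml
(* A toy rope: a leaf holds a flat string; a node joins two ropes
   and caches their combined length (in bytes, for simplicity). *)
type rope =
  | Leaf of string
  | Node of rope * rope * int

let length = function
  | Leaf s -> String.length s
  | Node (_, _, n) -> n

(* Concatenation allocates a single node and copies neither operand,
   so its cost does not depend on how long the two texts are. *)
let concat a b = Node (a, b, length a + length b)

(* Flattening back to a plain string walks the tree once, left to right. *)
let to_string rope =
  let buf = Buffer.create (length rope) in
  let rec go = function
    | Leaf s -> Buffer.add_string buf s
    | Node (l, r, _) -> go l; go r
  in
  go rope;
  Buffer.contents buf
```

By contrast, `^` on plain strings has to copy both operands into a fresh buffer every time, which is why repeated concatenation is where a tree of strings tends to pay off.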

Owner

thelema commented Mar 9, 2014

Yes, Text should be the default choice for Unicode text processing. BatUTF8 exists as a library for Ropes to use, and maybe as a buffer type for conversion between UTF-8 and Latin-1.

Member

c-cube commented Mar 9, 2014

Then I agree with @pminten: BatUTF8 should be a BatInnerUTF8 module or something like that. I think String is still useful as a low-level buffer (think IO buffers, Unix functions, etc.), but for representing and handling text exposed to the user, exposing both BatUTF8 and BatText is confusing.
