# Consider adding formatted string output #108

Closed
opened this Issue Nov 25, 2017 · 14 comments

Projects
None yet
4 participants

### NOP0 commented Nov 25, 2017

 Gitter discussion suggests that possible solution could be printf macro.

### hellerve added a commit to hellerve/Carp that referenced this issue Nov 29, 2017

 format: added simple versions of format (work on issue carp-lang#108) 
 354d251 
Contributor

### hellerve commented Nov 30, 2017 • edited

 I have a proposal for this in a feature branch that works with interfaces. Let me describe it and see whether it does what we need, is okay to look at, and makes sense. It registers an interface format of type (λ [String a] String) that specializes on the second argument. A motivating example: > (Float.format @"%.4d" 10.0f) => "10.0000" > (Int.format @"%x" 255) => "ff" Why do we need to define this interface for any type when it is only a thin wrapper around snprintf? Well, I’m glad you asked. This is needed to preserve type hygiene; otherwise we’d have to introduce some sort of catch-all type into the interop, and that doesn’t sound like a great idea. There are some open questions and concerns from my side (feel free to add to the list): is this good enough? While I feel like this interface is nice to have, it might become tedious to print more complicated expressions. This can be partially solved by a simple macro, if we made, say, an interface like this: (fmt "%x" 255 " %.4d" 100.0f). I didn’t add that macro, because I couldn’t settle on a good API. having user-defined strings in *printf functions is a common security hole. We’ll have to make sure that we document that having a potential attacker define the format string is insecure. what about compound types? Since there is no ready-made solution from *printf for structs, I’m not sure how to best design such an interface. Keep in mind that we already have str, and we are able to format the properties of a type by themselves, but maybe a syntax for structs would be interesting? That’s kind of a moonshot, though, I think, because it would complicate the implementation a lot. I think that’s all! Cheers

Contributor

### hellerve commented Nov 30, 2017

 The problem is that in this case that wouldn’t help, because printf is not only variadic, but also has a list of catch-all types as parameters. Unless I’m missing something profound, there is no valid signature for printf that doesn’t use the varargs ellipsis.

### jiegec commented Nov 30, 2017 • edited

 What about implement a printf-like macro that convert to (concat (Int.fmt "%d" a) (Char.fmt "%c" b))? Macro can have varargs (and maybe compile-time type check, which some c/c++ compilers already does and shows a warning when type mismatched).
Contributor

### hellerve commented Nov 30, 2017

 I think we talked about that idea on the chat! I do agree that this sounds good, but have one major concern: It would require introspection and parsing of the string, which is not currently possible inside the macro. We could do it at runtime, but that would be a terribly costly idea. Why would we need that? Because it needs to split the format strings on the % characters; and it’d also have to take care of the %% special case and such. That’s why I talked about an already-split fmt macro in my first comment. I would also try to optimize it, but bear in mind that it would generate a string for every argument and then merge them, which sounds like a fairly slow operation, even without benchmarking it. As I said, I’m not sure the format interface solves our problems. But I think it is a step in the right direction.

### jiegec commented Nov 30, 2017

 Oh, I didn't join the gitter chat and missed those conversations.
Contributor

### hellerve commented Nov 30, 2017

 That’s alright, it’s good that we document it here. To recap, a macro that analyzes the string would be awesome, but sounds like a lot of work. I’m not opposed to work per se, but I would like to make sure we do not find a simpler version that is satisfying our requirements—which also includes a good, clean, intuitive API of course. In general, I also think that while it’s important to think of a solid system to implement, it is empowering to realize that the system might change again before 1.0 if we find a better way to do things.
Collaborator

### eriksvedang commented Nov 30, 2017

 What (dynamic) functions would a macro that analyses the format string need access to? (roughly) It's probably pretty handy functions in general, so might be worth adding those and give it a try.
Contributor

### hellerve commented Nov 30, 2017

 We would have to traverse the string and be able to split and merge it. I can sketch a function that does what we want and see what I’ll need for it. In general, I’d opt for the functions being consistent for the ones we’ll implement in #94, to reduce confusion, even if the internal implementation will probably be quite different.
Collaborator

### eriksvedang commented Nov 30, 2017

 Yes agreed. I'll create a module, something like Dynamic.String so they can avoid clashing with the existing ones.
Contributor

### hellerve commented Nov 30, 2017 • edited

 I’ve attached an obviously untested prototype. This might work, but mostly it highlights which functions we need: (defdynamic fmt-internal [s args] (let [idx (String.index-of s \%) len (String.count s)] (cond (= idx -1) s ; no more splits found, just return string (= \% (String.char-at s (inc idx))) ; this is an escaped % (list 'String.append (String.substring s 0 (+ idx 2)) (fmt-internal (String.substring (+ idx 2) len))) (= 0 (count args)) ; we need to insert something, but have nothing (macro-error "error in format string: not enough arguments for format string") ; okay, this is the meat: ; get the next % after our escaper (let [next (String.index-of (String.substring s (inc idx) len) \%)] (if (= 0 next) (list 'fmt slice (car args)) (let [offs (+ (inc idx) 1) slice (String.substring s 0 offs)] (list 'String.append (list 'fmt slice (car args)) (fmt-internal (String.substring s offs len) (cdr args))))))))) (defmacro fmt [s :rest args] (fmt-internal s args)) The functions we need are thus (substring str from to), (index-of str char), and (char-at str index). One thing about this is problematic: the appending actually happens at runtime—because of the fmt, which might not be evalable at macro expansion time—, thus leading to a fairly high number of appends and possibly tiny strings in memory. I’m also not at all sure about the soundness of the algorithm, but it might work. I also attached a proof of concept/hopefully faithful transcription in Python for people to run if they want to test the behaviour: def fmt(s, args): idx = s.find("%") if idx == -1: return s if s[idx+1] == "%": return s[:idx+2] + fmt(s[idx+2:], args) if not args: raise Exception("error in format string: not enough arguments for format string") nxt = s[idx+1:].find("%") if nxt == -1: return s + "(arg={})".format(args[0]) slc = s[:nxt+1+idx] return slc + "(arg={})".format(args[0]) + fmt(s[nxt+1+idx:], args[1:]) The args are interspersed differently, of course, but it should work.
Collaborator

### eriksvedang commented Nov 30, 2017

 Excellent, that's a very good start! I'll try to make this convenient to write using the dynamic String functions. Will write here again when those are in place and works well enough.

### jiegec commented Dec 1, 2017 • edited

 We can force the format string to be a string literal as in Rust std::format!. And we can add 'varargs of same type' function like: // Force the type check on Carp side // (register concat (\lambda [:rest String] String) (omitting the ref for now) // (String.concat str1 str2 str3) // expands to // (String.concat_internal [str1 str2 str3]) void String_concat_internal(Array args) { ... } // then (format "%d %s" 6 s) can be expanded to // (String.append (Int.to_string 6) " " s) Alternatively, we can use a precompiled version of format string and its arguments as in Rust std::fmt::Arguments and pass it to internal functions to avoid additional costs.

### hellerve added a commit to hellerve/Carp that referenced this issue Dec 26, 2017

 core & compiler: initial work on carp-lang#108 
 13d2969 

### hellerve added a commit to hellerve/Carp that referenced this issue Dec 30, 2017

 format: added simple versions of format (work on issue carp-lang#108) 
 49f2bc1 

Merged

### hellerve added a commit to hellerve/Carp that referenced this issue Dec 30, 2017

 core & compiler: initial work on carp-lang#108 
 85ce16d 

### hellerve added a commit to hellerve/Carp that referenced this issue Dec 30, 2017

 format: added simple versions of format (work on issue carp-lang#108) 
 eccddd0 
Contributor

### hellerve commented Dec 30, 2017 • edited

 I’ve added #154 which puportedly fixes this. I hope this is what we want.