Bring more consistency to nmodl::stringutils utility functions #1066

tristan0x · 2023-09-06T11:34:50Z

Status

Here are some of the utility functions in src/utils/string_utils.hpp:

std::string& ltrim(std::string& s)
std::string& rtrim(std::string& s)
std::string& trim(std::string& s)
void remove_character(std::string& str, const char c)
std::string& trim_newline(std::string& s)
std::string escape_quotes(const std::string& before)
std::string align_text(const std::string& text, int width, text_alignment type)
std::string tolower(std::string text)

We can distinguish 2 categories:

the functions doing a side effect by accepting a reference to a std::string. Most of these functions also return the reference, probably to use them with a functional style (std::cout << stringutils::trim(my_str)). Note that the function remove_character does not follow that idiom by returning void.
The others returning a new instance of std::string with no side effect. Besides, tolower should accept a r-value, not an l-value :)

Situation

The functions of the first category are a bit misleading, because the fact that they return a std::string may suggest that the parameter is not modified, when it actually is.

Target

2 solutions:

either functions accept an l-value, do in-place edition of the string, and return nothing.
or adopt a full functional style by returning a copy of the input string with no side-effect

I prefer to sacrifice the functional style of (2) in the altar of performance by doing in-place edition of the strings with solution (1)

What do you think @ohm314 @pramodk ?

The text was updated successfully, but these errors were encountered:

pramodk · 2023-09-06T12:54:04Z

I didn’t check all usages but just want to say that « performance » aspect is not much a concern in the context of nmodl (as its translation program and strings that we manipulate here are hundreds of chars long).

So I don’t have strong: something that make usage simpler, nicer, intuitive is fine.

tagging @1uc as well as he has been looking at the code in recent weeks and would have opinion.

Edit: Copying @1uc's comment from the Slack:

In Option 1 how does one chain operations: ltrim(rtrim(s));

ohm314 · 2023-09-06T19:11:29Z

uff, I'm happy to see that @pramodk had the same feeling as me:

I don't think either that performance is an issue here. The nmodl codegen is not so performance critical that I would be focusing too much on that
I much prefer to have an API to use in the codegen code that makes it easy and elegant to write

so my conclusion is that ~~option 1~~ option 2 is preferable, since functional style allows you to do more and with simpler/cleaner code. Having said this, I absolutely agree with @tristan0x that these functions need to be cleaned up and made consistent.
(edit: I mistakenly wrote option 1, but meant option 2, clearly I should get some more rest before I continue here...)

1uc · 2023-09-06T20:25:22Z

Even in Option 2 it's not entirely obvious that it's a large performance penalty, due to move semantics. For example:

auto str = trim("   fooo   "); // Create one string, and copy elision.
str = ltrim(rtrim(str)); // 1 copy, 2 moves.
str = ltrim(rtrim(std::move(str))); // 3 moves

Everyone's favourite source suggests that sizeof(std::string) == 3 * sizeof(void*). Hence, a move should copy the name number of bytes for both long and short strings. The source:
https://stackoverflow.com/a/21710033

If you're doing something like stripping whitespace from the front, you end up copying the size of the string (minus the leading whitespace). So you don't have to incur a penalty that larger than the cost of the function.

Finally, there's to_upper which from it's signature is likely used in contexts where one doesn't want to destroy the original. In these cases the copy is unavoidable; and there's no difference in cost between Option 1 and Option 2 in terms of speed.

Which brings us to the final reason why Option 1 is a bit clunky:

trim(str + "foo");  // Option 1 => compilation error
auto str_foo = trim(str + "foo"); // Option 2 => just fine.

1uc · 2023-09-07T06:52:08Z

One observation, I can't remember seeing any of these functions and I don't immediately see a use for any of them, except tolower / toupper, because they're rather low level. NMODL has stronger abstractions, that I'd expect to deal with padding/alignment issues.

ltrim: 1 use
rtrim: 0 uses + 1 in trim
trim: approx. 16 uses
remove_character: 3 uses
align_text: 3 uses

ohm314 · 2023-09-07T07:04:57Z

ah you're making a good point - but we can clean them up nevertheless.

1uc · 2023-09-07T07:34:32Z

It's not an argument against cleaning it up. Rather it's an argument that a super ergonomic API maybe isn't as important as we initially thought. Also the cost of changing this if we notice that it's too slow, isn't prohibitive.

* The prototypes of these functions now follow these 2 principles: 1. the input string is kept intact 2. the transformed text is returned in a new `std::string` instance * Improve the Doxygen documentation Also removed some useless work along the way while updating the calls to these functions. Fixes #1066

* The prototypes of these functions now follow these 2 principles: 1. the input string is kept intact 2. the transformed text is returned in a new `std::string` instance * Improve the Doxygen documentation Also removed some useless work along the way while updating the calls to these functions. Fixes #1066 Co-authored-by: Luc Grosheintz <luc.grosheintz-laval@epfl.ch>

tristan0x added the proposal New suggestions label Sep 6, 2023

tristan0x assigned ohm314 and pramodk Sep 6, 2023

tristan0x assigned tristan0x and unassigned ohm314 and pramodk Sep 8, 2023

tristan0x mentioned this issue Sep 8, 2023

make nmodl::stringutils methods more consistent #1071

Merged

tristan0x closed this as completed in #1071 Sep 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bring more consistency to nmodl::stringutils utility functions #1066

Bring more consistency to nmodl::stringutils utility functions #1066

tristan0x commented Sep 6, 2023

pramodk commented Sep 6, 2023 •

edited

Loading

ohm314 commented Sep 6, 2023 •

edited

Loading

1uc commented Sep 6, 2023

1uc commented Sep 7, 2023

ohm314 commented Sep 7, 2023

1uc commented Sep 7, 2023

Bring more consistency to nmodl::stringutils utility functions #1066

Bring more consistency to nmodl::stringutils utility functions #1066

Comments

tristan0x commented Sep 6, 2023

Status

Situation

Target

pramodk commented Sep 6, 2023 • edited Loading

ohm314 commented Sep 6, 2023 • edited Loading

1uc commented Sep 6, 2023

1uc commented Sep 7, 2023

ohm314 commented Sep 7, 2023

1uc commented Sep 7, 2023

pramodk commented Sep 6, 2023 •

edited

Loading

ohm314 commented Sep 6, 2023 •

edited

Loading