change `env_var_t` type to an actual array/vector of values (was "converting between a list and a flat string is very error prone") #4200
Comments
I just merged commit a9aa234 that takes a baby step to improving the situation. The next step is to replace the flat string representation with a class based on an actual array (i.e., std::vector). We'll still need to use \x1D and \x1E when serializing the vars into the environment and uvar storage to maintain compatibility with current fish versions. But internally we'll always use an actual array representation. I'm betting that will make benchmarks like this one noticeably faster:
We should consider using a different representation when serializing vars into the environment for fish 3.0. JSON, XML, and google protobufs are the obvious choices. I'll repurpose issue #3341 to discuss that. |
This is the first step in implementing a better abstraction for handling fish script vars in the C++ code. It implements a new function (with two signatures) to provide a standard method for construct the flag string representation of a fish script array. Partial fix for #4200
This is the first step in implementing a better abstraction for handling fish script vars in the C++ code. It implements a new function (with two signatures) to provide a standard method for construct the flag string representation of a fish script array. Partial fix for fish-shell#4200
Changing this from "cleanup" to "enhancement" because I'm going to implement proper arrays for fish script vars in the core C++ code. Both to simplify the code and improve performance. By "simplify" I mean remove the need for functions like This is also an important first step for implementing issue #3341. |
Also, it's pretty obvious why the current implementation uses a flat string internally to represent an array of strings now that I've thought about it. It is a consequence of the original fish implementation having been written in C which does not have the C++ vector class. It still strikes me as weird since even in C it would have been easy to use dynamically resized arrays managed via malloc/free given an appropriate interface for managing fish script vars. |
This is the first step to implementing issue #4200 is to stop subclassing env_var_t from wcstring. Not too surprisingly doing this identified several places that were incorrectly treating env_var_t and wcstring as interchangeable types. I'm not talking about those places that passed an env_var_t instance to a function that takes a wcstring. I'm talking about doing things like assigning the former to the latter type, relying on the implicit conversion, and thus losing information. We also rename `env_get_string()` to `env_get()` for symmetry with `env_set()` and to make it clear the function does not return a string.
Another step towards implementing issue #4200 is to make the `tokenize_variable_array()` function private to the env.cpp module.
My previous change to eliminate `class var_entry_t` caused me to notice that `env_get()` turned a set but empty var into a missing var. Which is wrong. Fixing that brought to light several other pieces of code that were wrong as a consequence of the aforementioned bug. Another step to fixing issue #4200.
My previous change to eliminate `class var_entry_t` caused me to notice that `env_get()` turned a set but empty var into a missing var. Which is wrong. Fixing that brought to light several other pieces of code that were wrong as a consequence of the aforementioned bug. Another step to fixing issue #4200.
While I have resolved this issue I make no claims my change is optimal. Just changing from subclassing wcstring to using wcstring_list_t in the class was a huge undertaking. I have no doubt that someone like @ridiculousfish or @mqudsi can make additional performance improvements by avoiding copying values and the like. |
While working on implementing a fix for issue #4190 I decided it would be useful to define a couple of new helper functions for converting from a list of strings to a flat string that represents the internal representation of a fish script array var. As part of that work I converted several places that used the symbols
ARRAY_SEP
,ARRAY_SEP_STR
, orENV_NULL
, to use the new helper functions. I was surprised when several unit tests failed. Mostly because the hand crafted code I replaced with the helper functions explicitly dealt with empty value lists.That brought to my attention that this mechanism does not have an appropriate abstraction since concepts like the var being set but having zero elements or a single element that is an empty string is ill-defined in the C++ code. There shouldn't be any place in the code that calls
tokenize_variable_array()
other than the implementation of theenv_var_t
class. Similarly, there are too many places in the code that explicitly convert between the list and flat representations by referencing the aforementioned symbols.The concept of a fish script variable in the C++ code needs an overhaul with a cleaner abstraction.
The text was updated successfully, but these errors were encountered: