-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document and implement doubly-linked implementation of prop (see #1040) #1059
Conversation
Any news on this: I think 'large' props would be a great addition (cf. work on tagging for example). |
What about "props as hash-table" (since this seems to be in use nearly in the kernel by now)?? |
ef0d119
to
e9f327c
Compare
I decided to rebase locally on the |
% \cs[no-index]{prop_map_\ldots{}}. | ||
% | ||
% \item | ||
% The \enquote{linked} storage method is meant for property lists with a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
% The \enquote{linked} storage method is meant for property lists with a | |
% The \enquote{linked} storage method is meant for property lists with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me at the code level - lets see what everyone thinks interface-wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but I'm not through reviewing. Any, here are some first comments on the interface section
l3kernel/l3prop.dtx
Outdated
% using \cs{prop_new:N} (\enquote{flat} storage) or \cs{prop_new_large:N} | ||
% (\enquote{linked} storage). Once a property list is declared with | ||
% \cs{prop_new:N} or \cs{prop_new_large:N}, the type of internal data storage | ||
% can no longer be changed. All other \pkg{l3prop} functions transparently |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While the current implementation does not over a conversion from flat to linked it should be technically simple to provide, and we should consider to offer it. Use case: property lists that are define as flat in the kernel or in a package but that in certain circumstances, e.g., when a special package is loaded, grow a lot compared to the default case. In that case it would be nice if there is a mechanism to convert from one form to another even if the original declaration happened elsewhere and the list has already entries.
More or less syntactic sugar given that the prop_set_eq:NN
will basically do the work, but I think it deserves a defined name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Easy to provide but (1) what name? (2) for local props it's probably annoying if people try to change the storage type within a group, I'll have to see how much bookkeeping is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree that changing this “type” only locally does not sound like a good idea.
Name: \prop_make_large:N
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that \prop_set_eq:NN will do the conversion if needed (both ways).
So it just needs a small extension to cope with the name staying the same??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\prop_make_...:N
is a good name. I also thought of \prop_renew_...:N
but it might suggest that the process would give an empty prop like \prop_new_...:N
. Regardless of name we need both directions I suppose.
There is a bit more work than for \prop_set_eq:NN
because some auxiliary structures have to be added / deleted.
l3kernel/l3prop.dtx
Outdated
% | ||
% \begin{function}[added = 2024-01-15]{\prop_new_large:N, \prop_new_large:c} | ||
% \begin{syntax} | ||
% \cs{prop_new_large:N} \meta{property list} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One can say that "linked" is an implementation detail and "large" is the intended use, but initially linked lists might always stay small and flat ones might grew very large. And given that "linked" also shows up in the documentation and in error messages (I think) I wonder if it would be better to call this \prop_new_linked:N
and perhaps even offer \prop_new_flat:N
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You correctly pinpointed my reason to prefer "large" in function names (and my inconsistency in using a different word in error messages), which is that linked is an implementation detail. In fact, the new props are both doubly-linked and implemented using the hash table, and the latter point is an important part of why they end up faster for accessing entries. Is "linked" really a good word for that? "hashed"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that names that refer to the implementation are not good (since there may be further changes to this in another decade or so).
I also understand @FrankMittelbach ‘s problem with labelling them as “large”.
Needs further thought!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further: in some of the documentation, it is probably best to use both names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a name that does not imply any “ordering” on the type of implementation, and says nothing about the expected contents, but is also independent of any details of the implementation.
l3kernel/l3prop.dtx
Outdated
% \prop_gclear_new_large:N, \prop_gclear_new_large:c | ||
% } | ||
% \begin{syntax} | ||
% \cs{prop_clear_new_large:N} \meta{property list} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here \prop_clear_new_linked:N
reads a tiny bit better to me. Pity we save an "or" in the name
l3kernel/l3prop.dtx
Outdated
% \pkg{l3keys}), each key here \emph{must} be followed with an \texttt{=} | ||
% sign. | ||
% \begin{function}[added = 2024-01-15] | ||
% {\prop_const_large_from_keyval:Nn, \prop_const_large_from_keyval:cn} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again "linked" feels more natural to me
The diff here is "too big" for my github app! It refuses to load it. |
% \end{function} | ||
% | ||
% \begin{function}[added = 2014-08-12, updated = 2021-04-29]{\prop_log:N, \prop_log:c} | ||
% \begin{syntax} | ||
% \cs{prop_log:N} \meta{property list} | ||
% \end{syntax} | ||
% Writes the entries in the \meta{property list} in the log file. | ||
% Writes the entries in the \meta{property list} in the log file, | ||
% and specifies its storage type. | ||
% \end{function} | ||
% | ||
% \section{Scratch property lists} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is probably sensible to add a note here that the are no scratch “large” pls.
But I can get the diff using Chrome. But response is achingly slow: not sure if this is due to the size of the diff or just my connection? So I have now added a few responses and comments on the documentation. Not yet tackled the code. |
A suggested stress test for the implementation: force all property lists to be"large" in the test suite by making |
Indeed Frank that would be a good test, but I'll need some time to set it up (pretty busy work week). I'm worried because the order of prop items is probably different with the new prop implementation than the old one, which may show up in various test logs. |
In the explanation of the "calculation" of the prefix, add something like this: The aim here is to make this string as short as we can, given the range of distinct characters available. (Maybe also explain why it should be as short as possible, possibly by reference to earlier where the reason is fully explained?) |
l3kernel/l3prop.dtx
Outdated
% | ||
% \item | ||
% The \enquote{linked} storage method is meant for property lists with a | ||
% large numbers of entries. It has more memory overhead, but is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho "memory overhead" should be clarified. Half of the user will probably think that that means computer memory.
At a later stage, the code for the production of short, unique strings should probably be moved to somewhere that makes it more widely available. |
I think I addressed all of the comments. I also found a few bugs in my code, and minor ones elsewhere (see recent issues). Most importantly, I renamed the functions as Frank suggested, The small merge conflict with main is trivial to fix: keep my additional |
Looks good to me: lets see what everyone else thinks! |
Ah, one check that Frank suggested was to run the LaTeX2e suite with an additional line making all props linked. I didn't figure out how to do that, but in case someone wants to try, one has to add |
I managed to use the new linked prop in the latex2e test suite, which uncovers a problem with treatment of hashes. I'm investigating. I also forgot to add the suggested |
Presumably we can apply the 'make flat' function in 2e rollback if available: either the |
Yes. Anyways, props are flat by default, so this branch doesn't change anything in LaTeX2e. I just wanted to test whether the LaTeX2e test suite runs correctly with props being linked props. Overall it slows down the test suite slightly there, presumably because there are not that many very long props. Despite the failing tests, I think the branch can be merged/rebased into main. Last I checked (before updating l3build with the latest normalizations), the LaTeX3 tests succeeded. |
I would think that there are no "very long props" there (depending on where "very" starts!), or some such have been explicitly and purposefully added. |
The hard part was surprisingly to implement \prop_show:N and log, which have to tediously check every single detail of the internal structure of the linked property list, in a way that should be fully robust to a broken data structure. The order in which functions are defined has been modified, making the diff somewhat ugly, sorry. The new test is basically a concatenation of all the other tests, but with linked property lists instead of flat ones.
This helps get the same order of items in both implementations.
For linked props this would leave some material in the input stream. Removing \begin{document} ensures that left-over material leads to a Missing \begin{document} error.
Since some l3prop functions such as \prop_put_from_keyval:Nn use others repeatedly, one has to make it so that none of the underlying code runs any debug check (by using lower-level \cs_set_nopar:Npe etc), and add a lot of prop functions to the list of patched commands in l3debug.
cb90d95
to
ddd32a4
Compare
There remains \prop_show:N and log, which means that we cannot really
test the code seriously yet. However, it is already exercised as
part of the implementation of \prop_concat:NNN and the from_keyval
functions.