[prakriya] Optimize the `tripadi` module #12
Labels
performance
Enhancement that improves performance without changing functionality
Profiling indicates that the `tripadi` module is slow.

Many of the rules in the `tripadi` need to iterate over every character in the string so that they can apply various sandhi changes. Currently, we create a new `CompactString` for each of these rules. My rough guess is that we create a dozen such strings for each word we derive, even if none of the rules have scope to apply.

`CompactString` shouldn't heap allocate in most cases, but the copy work required here is still slow.

Once we confirm with profiling that this is the problem, we should avoid the extra copies here. Two approaches come to mind:

1. Instead of creating a new string, iterate over the `Term` strings and manage indices carefully.
2. Store one copy of the string and rebuild it only if a rule applies. The code would follow the basic pattern of `ItPrakriya`, e.g., by extending the `Prakriya` struct with new data and helper methods.

I think (2) is generally cleaner, and it has the side effect of improving our APIs.