Better way to store formulas internally #815
Comments
@Pankraty I notice that the current |
Yes, that makes perfect sense. You once mentioned that you tried to bolt XLParser onto ClosedXML. Are you still going to use it? Maybe it can make parsing formulae faster? And maybe we can benefit from parsing multiple formulae in parallel (no guarantee, of course) |
Yes, I was delaying really looking into XLParser until we released the netstandard2.0 build. Now I can continue with it again. Unfortunately, XLParser is a bit abandoned. We'll have to take that into account. Luckily its dependency, Irony, seems to have been revived. This switch isn't something we should take lightly, and I don't even know abstract syntax trees that well, but XLParser does look very powerful in terms of formula parsing. |
Hmm, but Irony has split a bit. It used to be a project on Codeplex, by Roman Ivantsov, but now there are 2 forks: https://github.com/daxnet/irony and https://github.com/IronyProject/Irony . I'm trying to see if we can consolidate the efforts. See IronyProject/Irony#4 |
I have given it some thought and I am leaning toward not representing formulas as an AST and just parsing the formula each time, as long as I can use IAstFactory for evaluation without materializing the AST (possibly even if materialization turns out to be necessary).
Basically, I might only need to parse a formula on load, when I need to build the dependency tree, and during evaluation, which is limited thanks to dirty tracking. Keeping the AST costs memory that might or might not be used (likely won't be). All that just to avoid parsing that happens about once or zero times in the classic load, change, save use case. XLParser was kind of slow, so it made sense to keep the AST. I don't think that's true anymore. I took a sample of formulas from the Enron dataset (1000 files) and the average length of a formula is 35 chars. |
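To illustrate what evaluating without materializing an AST could look like, here is a minimal sketch in Python. It is not ClosedXML, XLParser, or the IAstFactory interface mentioned above; the `evaluate` function, its tiny grammar (numbers, simple cell references, `+ - * /`, parentheses) and the `cells` dictionary are all assumptions made for illustration. The parser computes values as it goes, so nothing has to be kept in memory between evaluations.

```python
import re

# One token per match: a number, a simple cell reference like A1, or any single character.
_TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z]+\d+)|(.))")


def evaluate(formula, cells):
    """Evaluate a formula directly while parsing it; no tree is built or stored.

    `cells` is a hypothetical mapping from upper-case cell address to a numeric value.
    """
    tokens = _TOKEN.findall(formula.strip().lstrip("="))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("", "", "")

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def factor():
        number, ref, op = advance()
        if number:
            return float(number)
        if ref:
            return cells[ref.upper()]
        if op == "(":
            value = expression()
            if advance()[2] != ")":
                raise ValueError("expected ')'")
            return value
        if op == "-":
            return -factor()
        raise ValueError(f"unexpected token {op!r}")

    def term():
        value = factor()
        while peek()[2] in ("*", "/"):
            op = advance()[2]
            rhs = factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def expression():
        value = term()
        while peek()[2] in ("+", "-"):
            op = advance()[2]
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    result = expression()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return result
```

For example, `evaluate("=A1*2+(B2-1)/4", {"A1": 3.0, "B2": 9.0})` returns `8.0`. With formulas averaging a few dozen characters, re-running a parse like this on each (dirty) evaluation may well be cheaper than holding a tree per cell, though that would need to be measured.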
Currently, formulas are stored internally as strings, either in A1 format or R1C1 format. This requires a lot of parsing, especially when ranges are copied / moved / deleted.
A better approach would be to parse the formula, when it is set, into some internal structure that can be manipulated more easily and converted back to a string when needed.
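As a rough illustration of the idea, the sketch below (Python, not ClosedXML code) splits an A1 formula into plain text and structured cell references, adjusts the references for a row insertion, and writes the formula back out. The names `CellRef`, `parse`, `shift_rows` and `to_a1` are hypothetical, and the regex is deliberately simplified: it ignores string literals, sheet-qualified references, R1C1 notation and whole-row or whole-column ranges.

```python
import re
from dataclasses import dataclass, replace

# Simple A1 references such as B2, $A$1 or AA10.
# The lookbehind rejects identifiers, the lookahead rejects function names like LOG10(.
_REF = re.compile(r"(?<![A-Za-z0-9_.])(\$?)([A-Za-z]{1,3})(\$?)([0-9]+)(?![A-Za-z0-9_(])")


@dataclass(frozen=True)
class CellRef:
    col: int        # 1-based column index
    row: int        # 1-based row index
    col_abs: bool   # True when the column is written with a leading $
    row_abs: bool   # True when the row is written with a leading $


def col_to_index(letters):
    index = 0
    for ch in letters.upper():
        index = index * 26 + ord(ch) - ord("A") + 1
    return index


def index_to_col(index):
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def parse(formula):
    """Split a formula into a list of literal text pieces and CellRef objects."""
    parts, pos = [], 0
    for m in _REF.finditer(formula):
        if m.start() > pos:
            parts.append(formula[pos:m.start()])
        parts.append(CellRef(col_to_index(m.group(2)), int(m.group(4)),
                             bool(m.group(1)), bool(m.group(3))))
        pos = m.end()
    if pos < len(formula):
        parts.append(formula[pos:])
    return parts


def shift_rows(parts, at_row, count):
    """Adjust references as if `count` rows were inserted above row `at_row`."""
    return [replace(p, row=p.row + count)
            if isinstance(p, CellRef) and p.row >= at_row else p
            for p in parts]


def to_a1(parts):
    """Convert the parsed parts back to an A1-style formula string."""
    out = []
    for p in parts:
        if isinstance(p, CellRef):
            out.append(("$" if p.col_abs else "") + index_to_col(p.col)
                       + ("$" if p.row_abs else "") + str(p.row))
        else:
            out.append(p)
    return "".join(out)
```

With this, `to_a1(shift_rows(parse("=SUM(A1:B2)+$C$3*LOG10(D4)"), at_row=2, count=1))` gives `=SUM(A1:B3)+$C$4*LOG10(D5)`: the formula text is scanned once, and the row insertion becomes a change to the structured references rather than another round of string parsing.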