Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formula helpers #24

Closed
wojciechczerniak opened this issue Nov 17, 2019 · 15 comments
Closed

Formula helpers #24

wojciechczerniak opened this issue Nov 17, 2019 · 15 comments
Labels
API Public methods and properties Feature Something we can add later on without introducing a breaking change

Comments

@wojciechczerniak
Copy link
Contributor

wojciechczerniak commented Nov 17, 2019

Description

The point is to provide public API for the methods we already have internally:

 public calculateFormula(formula: string, cellPosition: SimpleCellAddress): any;
 public validateFormula(formula: string): boolean; 
 public normalizeFormula(formula: string): string;

While normalize might be a static, validate and calculate should be considered with the context of the workbook.

Use case

A full use case is described in #17 but lets link to the screenshot once again:

obraz

Events

Extracted to #135

Problem?

The formula is stored outside of the graph. We won't know if it should be updated. We would have to recalculate each time something changes in the workbook.

Is there any alternative? Should we register the formula in HyperFormula so it would become a part of the graph?

Solved by @bardek8! Formulas will be added to sheet that is indexed at -1. It will be a part of the graph and change lists.

This was referenced Nov 17, 2019
Closed
@wojciechczerniak wojciechczerniak added API Public methods and properties Feature Something we can add later on without introducing a breaking change Integration labels Dec 6, 2019
@wojciechczerniak wojciechczerniak added this to the January 2020 milestone Dec 6, 2019
@wojciechczerniak
Copy link
Contributor Author

wojciechczerniak commented Feb 5, 2020

Once registered, the formula should return an address so we can watch for #135 that are related to it.
And we need an API to remove this formula from the graph when the external widget is destroyed.

@swistak35 swistak35 self-assigned this Feb 7, 2020
@swistak35
Copy link
Contributor

I assume that normalizing formula means returning it unparsed from AST (so, for example, with all formula names capitalized). I also assume that validation means validating syntax of the entered formula.

I think that using multiple sheets (each sheet for each formula) would be better, because it will work with matrices. If we would add one temporary formula in A1, and then another one in A2, but A1 would be a matrix, then probably things will start breaking :) So in the end, maybe only single identifier will be needed (a negative number representing a sheet), instead of whole address (because address would be always A1). But in the end I guess that it does not matter much to you (whether it's in one sheet, or multiple ones).

@swistak35
Copy link
Contributor

We've discussed it, and maybe indeed one worksheet would be better, if we only would assume that we do not approve matrix formulas in these temporary formulas. Is that ok for you?

swistak35 added a commit that referenced this issue Feb 7, 2020
swistak35 added a commit that referenced this issue Feb 12, 2020
swistak35 added a commit that referenced this issue Feb 12, 2020
swistak35 added a commit that referenced this issue Feb 14, 2020
swistak35 added a commit that referenced this issue Feb 14, 2020
* Add initial version of calculateFormula()

Issue: #24

* Add test which checks whether external formula is recomputed

* Add possibility to change contents of external formula

* Add possibility to normalize formula

* Refactoring: Introduce ChangeList type

* Add test ensuring that it works for more formulas

* Refactoring: Split test suite into two

* Add basic validation of external formulas

* Add documentation

* External formula is not valid if it has lexical errors

* Literal error is ok (if that's what user want)

* Extract NamedReferences module

* Refactoring: Rename HyperFormula#calculateFormula -> HyperFormula#addNamedExpression

* Add ability to retrieve value of named expression by it's name

Temporarily changing named expression formula is forbidden, need to do
new API for that which does not use -1 sheet internal.

* Allow to add only one expression for given name

* Add assertion ensuring that NamedExpression won't contain duplicates

* Add basic validation of named expressions

* Refactoring: Rename NamedExpressionAlreadyTaken -> NamedExpressionNameIsAlreadyTaken

* Refactoring: Extract error classes out to separate file

They still belongs to the responsibility of the HyperFormula unit, but
majority of this code is just formatting the error. So I think it's
cleaner when they are in separate file.

* Refactoring: Remove unused return value

* Refactoring: Rename ivar NamedExpressions#nextExternalFormulaId -> nextNamedExpressionRow

* Make it possible to retrieve value of non existing named expression

* Add API for removing named expressions

* Refactoring: extract NamedExpressions#buildAddress method

* Refactoring: Remove magic constant

* SparseStrategy will not leak memory in the long run for named expressions

* Allow again to change formulas in named expressions

* Refactoring

* Does not allow to change not existing named expression

* Add possibility to list existing named expressions

* Prerefactoring: Extract class NamedExpression

* Remove unnecessary bang

* Add assertion in NamedExpressions#changeNamedExpressionFormula

* Make named expressions names to be case insensitive

* Refactoring: Extract a NamedExpressionsStore

* Refactoring: Extract method NamedExpressions#storeFormulaInCell

* No longer workarounds for keeping sheet = -1

* Refactoring: Use ChangeList instead of CellValueChange[]

Because in a moment I want to add different kind of element to
ChangeList -- the one which will tell that named expression has changed.

* lint-fix

* Add documentation

* Fixes after merge conflict

* Resolving conflicts is so. much. fun.

Co-authored-by: Jakub Kowalski <kowalski_jakub@hotmail.com>
@wojciechczerniak
Copy link
Contributor Author

wojciechczerniak commented Feb 19, 2020

@swistak35, I know we spoke that those are basically the same as named expressions but I would keep the calculateFormula method in a lot simpler form. Without registering it, tracing changes or being able to reference it. Calculate the result of =A1+1 and forget. It will be useful for status bar SUM or formula editor live calculations.

Also, I didn't had a chance to do a review so here are my comments:


Current addNamedExpression implementation does not support additional arguments

engine.addNamedExpression('myName', '=Sheet1!A1+10')

We know that we will need more details about the expression: { value, scope, comment, visibility } so we should be ready for the object and avoid breaking changes in the near future. Or we can add config after the value: addNamedExpression(name, value, { scope, comment, visibility })


expect(() => {
engine.addNamedExpression('myName', '=Sheet1!A1+10')
}).toThrowError(`Name of Named Expression 'myName' is already present in the workbook`)

This depends on the scope. We can have one global name but each sheet can have its own.


engine.changeNamedExpressionFormula('myName', '=Sheet1!A1+11')

We can edit much more than formula. IMO name changeNamedExpressionFormula is misleading, the value can be a number, boolean or string. I hope, I don't see a test for that.

When we will have scope and other metadata it will be much more useful to have updateNamedExpression(name, { scope, ... }) method.


What happens when we remove a named expression? From what I see its address is lost forever. We always increment it and will never reuse a row. Can it create a problem for us when multiple named expressions are created and removed over time?


Just a random thought: I would move

this.namedExpressions = new NamedExpressions(this.cellContentParser, this.dependencyGraph, this.parser)
this.addressMapping.addSheet(-1, new SparseStrategy(0, 0))

To a helper private method initializeNamedExpressions so someone in the future would know straight away that the -1 sheet is created for them and they have to go in pair in case of refactoring.


Do we have the name of the named expression in the changes list? or is it only { sheet: -1, ... }. I think it would be helpful to have it as property there

@wojciechczerniak
Copy link
Contributor Author

Note to self (named expressions are not yet available within syntax so this have to wait)

This is awesome 🏆 as it solves #27 and #5 at the same time. We should add a test with

    const engine = HyperFormula.buildFromArray([
      ['=TRUE()', '=TRUE', '=A1=B1'],
      ['=FALSE()', '=FALSE', '=A2=B2'],
      ['=FALSE', '=TRUE', '=A3=B3'],
    ])

    engine.addNamedExpression('TRUE', '=True()')
    engine.addNamedExpression('FALSE', '=False()')

    expect(engine.getCellValue('C1')).toEqual(true)
    expect(engine.getCellValue('C2')).toEqual(true)
    expect(engine.getCellValue('C3')).toEqual(false)

So we can confirm it and close both issues.

@swistak35
Copy link
Contributor

PRs: #167 #192

As discussed, calculateFormula became calculating "fire and forget", without any notifying of changes and storing it anywhere.
Questions:

  1. calculateFormula now takes only formula string. Wouldn't we like to pass there also a sheet? I imagine that we may want to calculate formula in context of specific worksheet, so =A1 is obvious in context of specific worksheet, but it's not clear without it. Currently, the behaviour is undefined if someone will use a reference without a specific sheet. We definitely want to do something about it, but I don't know what:
  • either always pass some sheet (I'd like that one)
  • or go through the whole AST, check whether there are some references without sheets and say that if so, the formula is invalid
    I'd prefer this option, because second is more work and less flexible at the same time :)
  1. validateFormula / normalizeFormula -- it wasn't exactly specified what these do so I've done what I've told in a comment above ( Formula helpers #24 (comment) )

@wojciechczerniak
Copy link
Contributor Author

  1. calculateFormula now takes only formula string. Wouldn't we like to pass there also a sheet? I imagine that we may want to calculate formula in context of specific worksheet, so =A1 is obvious in context of specific worksheet, but it's not clear without it. Currently, the behaviour is undefined if someone will use a reference without a specific sheet. We definitely want to do something about it, but I don't know what:

For that reason I've proposed an additional argument to the API:

 public calculateFormula(formula: string, cellPosition: SimpleCellAddress): any;

where cellPosition is the "context" where theoretically is the formula located and all relative references could be calculated in this context. We had a short discussion on slack https://handsoncode.slack.com/archives/CP3FX2DBM/p1579187598009700 but later the subject shifted a little bit.

I'd prefer this option, because second is more work and less flexible at the same time :)

Yup, I agree. We should give flexibility but avoid any magic assumptions.

  1. validateFormula / normalizeFormula -- it wasn't exactly specified what these do so I've done what I've told in a comment above ( #24 (comment) )

I think that we understood those the same. But here's my idea:
validateFormula - return boolean true if the string starting with = is the correct, complete, parsable formula (evaluation result is unimportant) so it won't return the #ERR! error. Return false otherwise
normalizeFormula - return formula string where functions names are uppercase, sheet names case is fixed (as they were registered), same for named expressions, some whitespace is removed (ie from ranges) other whitespace is preserved. It should be exactly as if we stored it in a worksheet and requested with getCellFormula.

This was referenced Feb 24, 2020
@swistak35
Copy link
Contributor

This depends on the scope. We can have one global name but each sheet can have its own.

yes, but for now only workbook level is supporter (because my understanding is that we want to do #126 in March and I don't want to constantly increase scope in #24 :) )

What happens when we remove a named expression? From what I see its address is lost forever. We always increment it and will never reuse a row. Can it create a problem for us when multiple named expressions are created and removed over time?

No, note that we specifically ask for Sparse strategy for address mapping, so it's performant with sheet which has a lot of gaps, and doesn't lose memory.

To a helper private method initializeNamedExpressions so someone in the future would know straight away that the -1 sheet is created for them and they have to go in pair in case of refactoring.

Good catch!

Do we have the name of the named expression in the changes list? or is it only { sheet: -1, ... }. I think it would be helpful to have it as property there

Actually it's the other way around. We have a name (which is unique in given scope), we (will) have a scope (and sheet, if scope is worksheet). I didn't want -1 hack to leak outside the engine, so from the user point of view it's transparent where we store these named expressions.

@wojciechczerniak
Copy link
Contributor Author

yes, but for now only workbook level is supporter (because my understanding is that we want to do #126 in March and I don't want to constantly increase scope in #24 :) )

Fair point. I just want to keep other features in mind so we don't have to break the API next month.

No, note that we specifically ask for Sparse strategy for address mapping, so it's performant with sheet which has a lot of gaps, and doesn't lose memory.

Awesome 👍

Actually it's the other way around. We have a name (which is unique in given scope), we (will) have a scope (and sheet, if scope is worksheet). I didn't want -1 hack to leak outside the engine, so from the user point of view it's transparent where we store these named expressions.

True, we should keep -1 private. Either add scope as sheet number to indicate its scope should be named scope. Lack of scope/sheet indicates that it's global expression.

@swistak35
Copy link
Contributor

I think that #192 exhaust that issue, there are still things to do in terms of named expression, but that one is done.

For that reason I've proposed an additional argument to the API:

I've added sheetName as additional argument for now. Is there any usecase where it would be suitable to pass whole "mental position" of named expression instead of only sheet? Changing to position is easy, I just don't want to add broader API which will later need to be supported

@wojciechczerniak
Copy link
Contributor Author

wojciechczerniak commented Feb 27, 2020

I've added sheetName as additional argument for now. Is there any usecase where it would be suitable to pass whole "mental position" of named expression instead of only sheet? Changing to position is easy, I just don't want to add broader API which will later need to be supported

I would prefer sheetName as well. But what about OFFSET function, or can you think about any other that would be impossible to calculate without the position?

EDIT: Is it optional?

EDIT2: The name changeNamedExpressionExpression is confusing

public changeNamedExpressionExpression(expressionName: string, newExpression: RawCellContent): ExportedChange[] {

@swistak35
Copy link
Contributor

changeExpressionOfNamedExpression?

@wojciechczerniak
Copy link
Contributor Author

Then we will break the naming pattern. ChangeNamedExpression should be sufficient? Or updateNamedExpression?

swistak35 added a commit that referenced this issue Mar 5, 2020
swistak35 added a commit that referenced this issue Mar 5, 2020
@swistak35 swistak35 mentioned this issue Mar 5, 2020
7 tasks
@swistak35
Copy link
Contributor

swistak35 commented Mar 5, 2020

I would prefer sheetName as well. But what about OFFSET function, or can you think about any other that would be impossible to calculate without the position?

I think that assuming that we always calculate formula from A1 position is reasonable, at least for now. We can exchange that to full "cell address" later, if someone really needs it.

EDIT: Is it optional?

No. There's always some context, I could make it default to "first sheet" or something like that, but isn't that more confusing?


Anything else to do here?

swistak35 added a commit that referenced this issue Mar 5, 2020
@wojciechczerniak
Copy link
Contributor Author

Nope. All done in #167 #192 and #238

We will check with @aninde when the reference is needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Public methods and properties Feature Something we can add later on without introducing a breaking change
Projects
None yet
Development

No branches or pull requests

2 participants