Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operate or range over all valid cells in a column or row #451

Open
ceaum opened this issue Sep 11, 2020 · 8 comments
Open

Operate or range over all valid cells in a column or row #451

ceaum opened this issue Sep 11, 2020 · 8 comments

Comments

@ceaum
Copy link

ceaum commented Sep 11, 2020

Is it possible to operate (e.g. @sum) over all (valid) cells in a column or row, or to otherwise achieve this by ranging over all such cells?

I have tried "A:A" for column A, but this is not accepted as a valid range

@andmarti1424
Copy link
Owner

Actually I believe its not possible. I will work on adding something like this:
@sum(A0:@getent("A",@lastrow))
as soon as I find some time to do it..

@poetaman
Copy link
Contributor

poetaman commented Feb 28, 2021

@andmarti1424 If we are supporting expression in ranges (which is the intent of asking this: #501 (comment)), can we do it for the generic case instead of special functions?

If we do that, then this pattern will be part of it and will look something like:

=@sum(@cols(@mycol) # 0 : @cols(@mycol) # @rows(@lastrow))

And given this expression is run from @mycol, to get a@lastrow value there is no need to send a column name to a function like @getent.

Given @ceaum is asking to operate on all "Valid" cells, I would assume he means all cells that have been assigned a number directly or by an numeric expression. Pattern @sum(A0:@getent("A",@lastrow)) works for @sum if a column is continuous and monolithic (has only one continuous range of cells of with valid values from A<something> to A<something>) and puts the @sum above the column, few points:

  1. Someone wishes to do @avg over Valid cells in a range, and there are "unassigned/empty/label" cells in range. In that case, we will also need a @hasnum function that evaluates to 1 if a cell was assigned a valid number, or skips that cell if it evaluates to 0. The code will look something like:

=@sum(@cols(@mycol) # 0 : @cols(@mycol) # @rows(@lastrow), @hasnum)

The advantage of extracting column name from @cols(@mycol) is that the equation need not be modified when copied to other columns.

  1. Danger of @lastrow Lets assume that 1) is not a concern, and we assume operations are on contiguous cells (anyway @hasnum is a good boolean to have). This @lastrow model still is insufficient as it assumes that the column is monolithic (it has only one set of interesting values in range A to A<X+N>). If a user has another set of useful values from A<X+N+M> to A<X+N+M+P>, then @lastrow can't be used. A more scalable approach would be to have these two additional functions: @lastnrow(<start_row>), and @firstnrow(<start_row>) that return the last/first valid row that has a "number" starting from row number <start_row>.

Another reason why @lastnrow(<start_row>)/@firstnrow(<start_row>) is more scalable is: Lets say user wants to do @sum of cells right below the column of interest, and @avg below the @sum cell, being able to specify the <start_row> would make sure the cell that has @sum is not included in @avg. it is important to distinguish number from cells that are valid but have strings, so that we don't include them in our calculation (can mess up average). There can be equivalent functions @lastsrow(<start_row>)/@firstsrow(<start_row>) for strings if you wish. So a equation like this can be put under any continuous set of cells that have number, and it will sum them correctly:

=@sum(@cols(@mycol) # @rows(@firstnrow(@myrow-1)) : @cols(@mycol) # @rows(@myrow-1))

  1. Another reason why @sum(A<N>:@getent("A",@lastrow)) doesn't work is: Most spreadsheets put expressions like@sum, and @avg below the column being summed/averaged. Your @lastrow could end up including them which might affect calculation, or users will have to put these expressions above A<N> (i.e above the column), which is not as readable.

@andmarti1424
Copy link
Owner

The thing with this
=@sum(@cols(@mycol) # 0 : @cols(@mycol) # @rows(@lastrow))
is that every single function must be modified.
I would like to avoid that.

If you want copy the formula to other columns we can think in something like:
@sum(@getent(@myrow,1):@getent(@lastrow, @mycol))

Behaviour is like this: every cell in the range would be part of the sum, doesnt matter if an empty cell is in the middle, or a cell contain both text and valid numbers. Same with @AvG, It counts only if valid number is in cell.

About point 3, i believe you will be able to have a sum and avg result and copy it to other columns without problem, just using @lastrow and @lastrow-1 or something like that.

@andmarti1424
Copy link
Owner

andmarti1424 commented May 7, 2021

something like this should work now @sum(@getent(0, @mycol):@getent(@lastrow-1,@mycol))
Still to do some enhancements..

@poetaman
Copy link
Contributor

poetaman commented May 8, 2021

@andmarti1424 Some issues:

  1. Copy pasting the column to another column doesn't seem to add the formula to the dependency graph. I tried @avg.
  2. Adding the formula for the first time does not evaluate it correctly. It needs to be entered twice. =@sum(... in the same cell for it to show the correct sum value. The first time it just shows 0.00 as the result.
  3. Check the following video how the results are wrong in some cases, and also leads to circular error when there doesn't seem to be anything circular:
getent.mp4

@andmarti1424
Copy link
Owner

About 2 and 3, it seems @myrow is not updating correctly, and that leads to those misbehaviours.

About 1, why you say its not added to the depgraph? Because it wont update correctly after you copy it and you have to reenter it??
If thats the case, thats still in my todo list.

@andmarti1424
Copy link
Owner

btw, @reportaman, yes, its not in the doc yet. we need to think how the evaluation should be done. Cause for instance @lastrow gets updated all the time. should or shouldnt update all the cells that refer it?

@poetaman
Copy link
Contributor

poetaman commented May 10, 2021

@andmarti1424 Am glad you ask this question regarding @lastrow, and more importantly updating or not when it is used in a cell. IMHO, the fact that expressions evaluate in some cases when the value of the cells they refer change, and in some cases they do not evaluate when the cells they refer change is the single biggest risky choice of sc-im. It has implicitly added a "state" to expression cells; either the value in it can be VALID or STALE.

Currently a user has no way to know whether a cell holds a stale value or not. The only way to do that is to reevaluate every cell that holds expression one might suspect requires reevaluation. Given expression cells is what users end up consuming, not having expected (valid) value can break hearts with this app.

Possible solutions:

  1. Always evaluate: I guess @lastrow has a unique value per column? In that case, whenever @lastrow changes for a column, all expressions that refer to @lastrow in that column should evaluate.
  2. Signal VALID/STALE: When @lastrow changes for a column, all cells with expressions that use @lastrow should visually showcase that the value in the cell might be stale. For this end, sc-im can either a) color the cell with a different (new) color type (like EXPRESSION_STALE), or b) repeat some programmable ascii/unicode symbol in the cell to hide the stale value (so if the symbol is ?, the cell with potentially stale value will display ????), or c) provide both a) and b). I think having double visual feedback to the user in this case is a good idea, so c) is ideal.

Ideally in no case sc-im should show stale value in an expression cell without an indicator (or consider this as a bug in UI of sc-im).

In the meantime till it gets fixed, it might be a good idea to list all cases when expression cell might hold stale value, so users know what to expect & do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants