New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cell shadowing, avoid entering same formula multiple times #170

Open
ern0 opened this Issue Mar 31, 2016 · 6 comments

Comments

Projects
None yet
4 participants
@ern0

ern0 commented Mar 31, 2016

It's a usual scenario, when we have several lines (records), with some data fields and some formulas. When we add a new record, we just clone the last line, replace the data fields, but formulas remain unchanged. Finally we have a set of formulas cloned, as many times, as many lines we have.

When we enhance the formula, there are several issues:

  • we're modifying formulas in one of the lines, so complete result is not visible until we copy the new formula;
  • it's easy to skip pasting into some lines;
  • if data values and formulas are interleaved, it's a pain to select and copy formulas only.

Solution: cell cloning. A cell, which is a clone, should contain only a reference to the cloned cell. Upon evaluation, the cloned cell's formula applies.

Example:
Copy-paste style:
C1: A1 + B1
C2: A2 + B2
C3: A3 + B3
C4: A4 + B4
Clone style:
C1: A1 + B1
C2: ==C1
C3: ==C1
C4: ==C1

Other ideas:

  • clone cells should have different decoration,
  • clone cells should have normal edit mode (when editing the content of the cell, e.g. for C3: "==C1"), and "origin" edit mode, which selects editing of the source cell,
  • cloned cells (which functions as source for cloning) should have a tiny indicator with the number of clones.
@nokome

This comment has been minimized.

Show comment
Hide comment
@nokome

nokome Apr 1, 2016

Member

Thanks @ern0, that's a great idea! Really great. It would address a lot of the issues around incomplete/inconsistent formula vectors which is a common error in spreadsheets.

Just thinking about how this would be implemented:

  • on pasting a cell, if the source cell was a formula then the target would become a clone of it, if the source was data then the data would just be pasted. Same with dragging down/across.
  • when the clone cell was executed the actual expression that is executed would be a translation of the origin cell's expression by applying the row and column offsets (from the origin to the clone) to each cell id in the expression.
  • clone cells could use anchors on one or both dimensions (e.g. ==C$1) like in Excel to indicate that the dimension should not be translated - would need some way to specify that in the interface.

It would complement the cell mapping idea #118 (which would be useful for projecting native vectors or arrays onto the grid e.g. a R vector could be projected down a column using the formula profit : revenue - costs; the : is the proposed symbol for mapping cells #136). Clone cells would provide the same sort of functionality for single cell values.

Member

nokome commented Apr 1, 2016

Thanks @ern0, that's a great idea! Really great. It would address a lot of the issues around incomplete/inconsistent formula vectors which is a common error in spreadsheets.

Just thinking about how this would be implemented:

  • on pasting a cell, if the source cell was a formula then the target would become a clone of it, if the source was data then the data would just be pasted. Same with dragging down/across.
  • when the clone cell was executed the actual expression that is executed would be a translation of the origin cell's expression by applying the row and column offsets (from the origin to the clone) to each cell id in the expression.
  • clone cells could use anchors on one or both dimensions (e.g. ==C$1) like in Excel to indicate that the dimension should not be translated - would need some way to specify that in the interface.

It would complement the cell mapping idea #118 (which would be useful for projecting native vectors or arrays onto the grid e.g. a R vector could be projected down a column using the formula profit : revenue - costs; the : is the proposed symbol for mapping cells #136). Clone cells would provide the same sort of functionality for single cell values.

@ern0

This comment has been minimized.

Show comment
Hide comment
@ern0

ern0 Apr 1, 2016

Introducing shadow cell concept adds a layer to the model. Now we have two kind (OOP: class) of cells: native and shadow, so every operation (OOP: method) on the cell is doubled, one for native, one for shadow.

I think, the concept of shadow cell feature is pretty clear, the data representation and implementation is not too difficult, but it has strong effect on UI, which should be designed carefully.

When we copy a (non-shadow) cell, should we paste value or make target cell shadow? I've bought your idea that "values should be copied, formulas should be cloned". Value-copy is trivial (it makes no sense to clone a value, if one wants see the same value in the target cells, he or she can make vanilla references, it will "clone" the values). Formula-clone vs formula-copy, I think, is 80% - 20%, so make cloning the default is unquestionable. For the remaining 20% (e.g. when we want to create a new formula based on the copied one), some extra GUI functions should be implemented:

  • "paste formula instead of cloning" action (context menu and hotkey): IMHO, the decision, wheter we want to copy or clone should be make at the paste operation;
  • "convert clone to formula" action applied to selected cells;
  • "find same formulas and convert to clones" action: probably it should be limited to adjacent cells and one dimension, e.g. when there is a column with the same formulas, with the exception of top one, they should be converted to shadows to it.

ern0 commented Apr 1, 2016

Introducing shadow cell concept adds a layer to the model. Now we have two kind (OOP: class) of cells: native and shadow, so every operation (OOP: method) on the cell is doubled, one for native, one for shadow.

I think, the concept of shadow cell feature is pretty clear, the data representation and implementation is not too difficult, but it has strong effect on UI, which should be designed carefully.

When we copy a (non-shadow) cell, should we paste value or make target cell shadow? I've bought your idea that "values should be copied, formulas should be cloned". Value-copy is trivial (it makes no sense to clone a value, if one wants see the same value in the target cells, he or she can make vanilla references, it will "clone" the values). Formula-clone vs formula-copy, I think, is 80% - 20%, so make cloning the default is unquestionable. For the remaining 20% (e.g. when we want to create a new formula based on the copied one), some extra GUI functions should be implemented:

  • "paste formula instead of cloning" action (context menu and hotkey): IMHO, the decision, wheter we want to copy or clone should be make at the paste operation;
  • "convert clone to formula" action applied to selected cells;
  • "find same formulas and convert to clones" action: probably it should be limited to adjacent cells and one dimension, e.g. when there is a column with the same formulas, with the exception of top one, they should be converted to shadows to it.

@nokome nokome added the what:sheets label Apr 21, 2016

@sirinath

This comment has been minimized.

Show comment
Hide comment
@sirinath

sirinath Apr 25, 2016

I think format should be like:

c1:c4 = a1:a4 + b1:b4

Also maybe a RC style should be allowed like in Excel and orgmode.

Also perhaps you can look at the orgmode format: http://orgmode.org/manual/The-spreadsheet.html#The-spreadsheet, http://orgmode.org/worg/org-tutorials/org-spreadsheet-intro.html

sirinath commented Apr 25, 2016

I think format should be like:

c1:c4 = a1:a4 + b1:b4

Also maybe a RC style should be allowed like in Excel and orgmode.

Also perhaps you can look at the orgmode format: http://orgmode.org/manual/The-spreadsheet.html#The-spreadsheet, http://orgmode.org/worg/org-tutorials/org-spreadsheet-intro.html

@talwrii

This comment has been minimized.

Show comment
Hide comment
@talwrii

talwrii Oct 7, 2016

Aside To talk a little about orgmode as sirinath mentioned: Org mode has a concept of row and column formulae, this deals with the problem of "adding data but forgetting to update formulae"

I want to try to mutter philosophically about this a little, because I think this is a thorny problem.

Underlying problem

Generalising a little, the underlying problem here is:

  • Easily apply the same function to every entry in an array of numbers
  • Easily being able edit and look at the numbers
  • Easily being able to add entries to an array
  • Being able to easily work out how a value got worked out.

The trade off here is often between the the ease of these different activities.

Broadcasting approach

R and numpy make defining these formulas easy with broadcasting array3 = array1 + array2, however viewing and editing can be kind of fiddly. And working out where a value came from isn't too easy.

Grid approach

Excel makes viewing and editing easy, but defining the formula can be error prone, as can adding entries, partly because everything gets put onto the same sheet. Finding out where a value comes from is made easy with their colorful equation editor.

Broadcasting, with a direct acyclic graph display

There is a commonly used piece of software called analytica (I've never used it - but have strategy consultant type friends who do). Which takes a different approach:

  • Arrays are not embedded in an individual spreadsheet, but are instead freestanding objects
  • You define formulas relating these graphs
  • There is then a separate DAG graph-based display, that shows you all the arrays, and the relationships between them. You can then click on array objects to edit them

I think the DAG approach has benefits, although there's a cost in terms of familiarity and not being able to see everything at once... and not giving you a display. Of course this approach doesn't stop you from also having spreadsheets.

Now... if your spreadsheet supports numpy or R arrays there is a different approach, you place your entire array in a single cell, and then use R / numpy for broadcasting. Perhaps providing a way to map an entire spreadsheet table into a cell

Opinions?

talwrii commented Oct 7, 2016

Aside To talk a little about orgmode as sirinath mentioned: Org mode has a concept of row and column formulae, this deals with the problem of "adding data but forgetting to update formulae"

I want to try to mutter philosophically about this a little, because I think this is a thorny problem.

Underlying problem

Generalising a little, the underlying problem here is:

  • Easily apply the same function to every entry in an array of numbers
  • Easily being able edit and look at the numbers
  • Easily being able to add entries to an array
  • Being able to easily work out how a value got worked out.

The trade off here is often between the the ease of these different activities.

Broadcasting approach

R and numpy make defining these formulas easy with broadcasting array3 = array1 + array2, however viewing and editing can be kind of fiddly. And working out where a value came from isn't too easy.

Grid approach

Excel makes viewing and editing easy, but defining the formula can be error prone, as can adding entries, partly because everything gets put onto the same sheet. Finding out where a value comes from is made easy with their colorful equation editor.

Broadcasting, with a direct acyclic graph display

There is a commonly used piece of software called analytica (I've never used it - but have strategy consultant type friends who do). Which takes a different approach:

  • Arrays are not embedded in an individual spreadsheet, but are instead freestanding objects
  • You define formulas relating these graphs
  • There is then a separate DAG graph-based display, that shows you all the arrays, and the relationships between them. You can then click on array objects to edit them

I think the DAG approach has benefits, although there's a cost in terms of familiarity and not being able to see everything at once... and not giving you a display. Of course this approach doesn't stop you from also having spreadsheets.

Now... if your spreadsheet supports numpy or R arrays there is a different approach, you place your entire array in a single cell, and then use R / numpy for broadcasting. Perhaps providing a way to map an entire spreadsheet table into a cell

Opinions?

@nokome

This comment has been minimized.

Show comment
Hide comment
@nokome

nokome Oct 10, 2016

Member

@talwrii : thanks for your ideas! I don't have time to right now to give them proper attention. But I hope to come back to this topic within the next month. So I'll have a look at Analytica again and come back with a more considered response later.

Member

nokome commented Oct 10, 2016

@talwrii : thanks for your ideas! I don't have time to right now to give them proper attention. But I hope to come back to this topic within the next month. So I'll have a look at Analytica again and come back with a more considered response later.

@nokome

This comment has been minimized.

Show comment
Hide comment
@nokome

nokome Nov 15, 2017

Member

Very closely related (if not identical to the idea of clone cells) is a paper I found recently by Felienne Hermans & Tijs Van Der Storm Copy-Paste Tracking: Fixing Spreadsheets Without Breaking Them

Spreadsheets are the most popular live programming environments, but they are also notoriously fault-prone. One reason for this is that users actively rely on copy-paste to make up for the lack of abstraction mechanisms. Adding abstraction however, introduces indirection and thus cognitive distance. In this paper we propose an alternative: copy-paste tracking. Tracking copies that spreadsheet users make, allows them to directly edit copy-pasted formulas, but instead of changing only a single instance, the changes will be propagated to all formulas copied from the same source. As a result, spreadsheet users will enjoy the benefits of abstraction without its drawbacks.

copy-paste-tracking.pdf

Member

nokome commented Nov 15, 2017

Very closely related (if not identical to the idea of clone cells) is a paper I found recently by Felienne Hermans & Tijs Van Der Storm Copy-Paste Tracking: Fixing Spreadsheets Without Breaking Them

Spreadsheets are the most popular live programming environments, but they are also notoriously fault-prone. One reason for this is that users actively rely on copy-paste to make up for the lack of abstraction mechanisms. Adding abstraction however, introduces indirection and thus cognitive distance. In this paper we propose an alternative: copy-paste tracking. Tracking copies that spreadsheet users make, allows them to directly edit copy-pasted formulas, but instead of changing only a single instance, the changes will be propagated to all formulas copied from the same source. As a result, spreadsheet users will enjoy the benefits of abstraction without its drawbacks.

copy-paste-tracking.pdf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment