Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a variable to refer to other rows #3305

Closed
allanaaa opened this issue Nov 2, 2020 · 8 comments
Closed

Add a variable to refer to other rows #3305

allanaaa opened this issue Nov 2, 2020 · 8 comments
Labels
expression language Support for scripting languages (GREL, Python…) Type: Documentation Issues related to improving project documentation or tutorials. Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements.

Comments

@allanaaa
Copy link
Contributor

allanaaa commented Nov 2, 2020

I was wondering if there was a way to compare two rows' values to each other. For example, I wanted to create a column using diff() on a column of dates, outputting the distance between row1's date and row2's date (and subsequently between row2's date and row3's date, etc.).

Dates Diff between value and value in the next row (days)
2020-10-31 2
2020-10-29 5
2020-10-24 3
2020-10-21

There doesn't seem to be a way to call "the value of this cell in [this row's index plus one]". You can call other cells in the same row (by pointing to a specific column, with syntax like cells["Dates"].value), but not other cells in the same column (by pointing to a specific row with rowIndex).

Another use case: you could transform your data by transferring all the values in a column up by one or more cells.

Proposed solution

What I would like is a way to pull from any row's cell values by its index number. Both absolutely and relatively - as in, "row 6" as well as "this row minus two". I am terrible at syntax so I'll leave that bit to someone else.

It may also be useful to have a variable for the total number of rows/records in the project. (Which we offer at a smaller scale with row.record.rowCount.)

@allanaaa allanaaa added Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements. Status: Pending Review Indicates that the issue or pull request is awaiting review by project maintainers or collaborators labels Nov 2, 2020
@wetneb
Copy link
Sponsor Member

wetneb commented Nov 3, 2020

Similar to #1340

@iomicifikko
Copy link

iomicifikko commented Nov 12, 2020

@allanaaa Hi there, I'm using the following workaround to compare values between a row and its following:

  1. create a column row with the code row.index() (ex: 0,1,2...)
  2. create a column nextRow with the code row.index() + 1 (ex: 1,2,3...)
  3. use the cross function within the same project to create a new column (ex: startDate_nextRow) from the column row and import values from row2 into row1 (ex: cell.cross("sameProject", "nextRow")[0].cells["StartDate"].value
  4. compare the imported values (ex: endDate VS startDate_nextRow)

image

@wetneb
Copy link
Sponsor Member

wetneb commented Dec 15, 2022

I think there is now a standard syntax using the cross function, because when the project name and column name are omitted, they are understood as being the current project and the row index. So if you want to access the cell at row 123 in column Address, you can simply do cell.cross(123).cells["Address"].value[0] (I haven't checket it though).

I suggest we change this issue into a documentation issue (on the openrefine.org repo), where the task would be to document this workaround in https://openrefine.org/docs/manual/expressions

@antoine2711
Copy link
Member

antoine2711 commented Dec 15, 2022

So if you want to access the cell at row 123 in column Address, you can simply do cell.cross(123).cells["Address"].value[0] (I haven't checket it though).

I confirm this works with a slite change:
cross(123).cells["Address"][0].value (no cell. at the start and the [0] can go after the cell column, or after the value).

It's the same as: "123".cross().cells.Address[0].value.

Regards, Antoine

@wetneb
Copy link
Sponsor Member

wetneb commented Dec 15, 2022

ah yes, simply cross(123).cells["Address"].value[0] (no cell. in front).

@ostephens
Copy link
Sponsor Member

Hah - this is excellent - I had not realised this and have been abusing the record mode to achieve similar things

@thadguidry
Copy link
Member

thadguidry commented Dec 17, 2022

@wetneb @ostephens So I think this issue can be closed now and a new doc improvement issue can be made? It would be "swell" if @ostephens took that up for us, since I feel he might explain it very well for others.

@wetneb
Copy link
Sponsor Member

wetneb commented Apr 13, 2023

Agreed, I have opened an issue accordingly: OpenRefine/openrefine.org#214

@wetneb wetneb closed this as completed Apr 13, 2023
@tfmorris tfmorris added Type: Documentation Issues related to improving project documentation or tutorials. and removed Status: Pending Review Indicates that the issue or pull request is awaiting review by project maintainers or collaborators labels Oct 26, 2023
@antoine2711 antoine2711 added the expression language Support for scripting languages (GREL, Python…) label Jun 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
expression language Support for scripting languages (GREL, Python…) Type: Documentation Issues related to improving project documentation or tutorials. Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements.
Projects
None yet
Development

No branches or pull requests

7 participants