Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rownumber #60

Closed
VEZY opened this issue Apr 21, 2023 · 10 comments
Closed

rownumber #60

VEZY opened this issue Apr 21, 2023 · 10 comments

Comments

@VEZY
Copy link
Contributor

VEZY commented Apr 21, 2023

DataFrames defines rownumber and row, and I want to define a similar function for the rows (TimeStepRow) of my table type (TimeStepTable) to be able to retreive any other row of the Table from one row.

I was wondering if it would be interesting to add the functions here ?

@bkamins
Copy link
Member

bkamins commented Apr 21, 2023

row is not exported. rownumber is exported and could be added to DataAPI.jl. Let us wait for @nalimilan and @quinnj to express their optiion.

@quinnj
Copy link
Member

quinnj commented Apr 23, 2023

Hmmm.....row seems like an awfully generic name to export. I'm hesitant to export that. But rownumber seems fine.

@VEZY
Copy link
Contributor Author

VEZY commented Apr 23, 2023

Ha yes you're right I didn't think this through, and rownumber is enough for my usecase. What is the next step now ? Do I propose a PR ?

@bkamins
Copy link
Member

bkamins commented Apr 23, 2023

Sure - if you are willing to please propose a PR. Actually I think it would be worth to have three PRs (up to you to decide what you want to do, I can do the rest, if you let me know what you are willing to do):

  1. here, defining the interface function (without any implementation, but giving the API)
  2. in DataFrames.jl so that we add a method to the interface function instead of creating our own
  3. in Tables.jl to add rownumber to standard Tables.ColumnsRow, Tables.DictRow, Tables.IteratorRow ,Tables.MatrixRow ,Tables.Row (it would have to be checked for which it would be possible - I have not done this check)

Then the following steps to do:

  1. Make these three PRs and do not merge them (the issue is that we need to change the version of DataAPI.jl and bump it in DataFrames.jl and Tables.jl). Make sure that these three PRs play together well.
  2. Merge the PRs in DataAPI.jl and make a release
  3. Only then merge the PRs in DataFrames.jl and Tables.jl

This is a bit complex, but we are creating a new interface, which is always a bit more challenging 😄.

@VEZY
Copy link
Contributor Author

VEZY commented Apr 23, 2023

I believe I can try and do 1 and 2.

For 3, I'm also willing to try but I have some questions before:

I see that Tables.ColumnsRow, Tables.DictRow, Tables.IteratorRow, Tables.MatrixRow all have a row field, so if I'm not mistaken it is just a matter of importing DataAPI.rownumber and then defining e.g. rownumber(row::ColumnsRow) = getfield(row, :row) right ?

I see there's already getrow(c::ColumnsRow) = getfield(c, :row) (same for IteratorRow), so do we define rownumber(c::ColumnsRow) = getrow(c) instead of rownumber(row::ColumnsRow) = getfield(row, :row), or do we replace it entirely (getrow is not exported) ?

I'm not sure to understand how we are supposed to implement rownumber for Tables.Row. Do I add rownumber(x::Row) = rownumber(x.x) ?

I would understand if you feel that's its more work to explain this to me than just doing it by yourself. Just let me know what you prefer. In the mean time I can start with PR 1 and 2 whenever you confirm that's ok.

@VEZY VEZY changed the title rownumber and row rownumber Apr 23, 2023
@bkamins
Copy link
Member

bkamins commented Apr 23, 2023

This is OK. Regarding 3 - this is exactly what I have not checked as there are many objects to cover and I am not sure I remember correctly how they are defined 😄.

You need to check their definitions and add appropriate methods. I assume that what you propose is correct (it looks good). If you make a PR I will check it. (or I can make this PR if you prefer)

Anyway the PR for point 3 can be done after we finish PRs for points 1 and 2 so that we are sure that what we discuss does not introduce any problems (I would not expect this, but in practice different strange corner cases can happen).

@VEZY
Copy link
Contributor Author

VEZY commented Apr 23, 2023

OK I just proposed:
1 . A PR here #61
2. A PR in DataFrames: JuliaData/DataFrames.jl#3322
3. And a PR in Tables.jl: JuliaData/Tables.jl#334

Assuming that the implementation for Tables.Row (rownumber(x::Row) = rownumber(x.x)) is OK, maybe we should add a fallback or throw an explicit error if the row type does not implement rownumber ? In this case, do we catch the error in Tables.jl, or do we add a default method that returns an error in DataAPI.jl ?

@VEZY
Copy link
Contributor Author

VEZY commented Apr 23, 2023

Ho, sorry I just see I should have waited for 3.

@bkamins
Copy link
Member

bkamins commented Apr 23, 2023

Ho, sorry I just see I should have waited for 3.

No - it is OK to open all PRs. I just meant that if you are unsure we can wait.

maybe we should add a fallback or throw an explicit error if the row type does not implement rownumber ?

I was thinking about it. I came to the conclusion that for now we can leave it undefined. In this way we get MethodError and can flexibly later decide what to do.

The alternative would be to return missing by default. The reason is that we know that the row number exists but we do not know it - so this is exactly use case for missing. However, then users would need to handle Union{Int, Missing} in their code, and maybe it is easier to allow assuming that Int is returned.

A third alternative is to throw ArgumentError by default saying that rownumber is undefined for passed value.

@nalimilan - what do you think?

@nalimilan
Copy link
Member

Not sure. As you propose, let's throw an error for now so that we can change that later if needed.

@bkamins bkamins closed this as completed May 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants