Minimum interface to define a DataFrame-like type #3

Closed
juliohm opened this issue Jan 4, 2018 · 6 comments

Comments

@juliohm

juliohm commented Jan 4, 2018

It would be great to have a tutorial notebook covering the minimum interface that subtypes of AbstractDataFrame are expected to implement. Do you think it makes sense to have it here?

@bkamins
Owner

bkamins commented Jan 4, 2018

This is probably too complex for a tutorial and might change in the near future.
If you want to have a peek, there is a current thread on a similar issue, JuliaData/DataFrames.jl#1335. It references an example implementation of a new subtype of AbstractDataFrame, TypedDataFrame (https://github.com/JuliaData/DataFrames.jl/compare/nl/typed), which has almost 1000 lines of code.

I can keep this issue open, as maybe one day this interface will be stable enough to specify (but feel free to close it if you are OK with moving the discussion to the thread I mentioned).

@nalimilan

Sounds like something which should be documented in the DataFrames manual. But indeed, better stabilize it before working on the docs.

@juliohm
Author

juliohm commented Jan 5, 2018

@bkamins you mean 1000 lines to define the interface? o.O

I agree that the DataFrames.jl docs are a more appropriate place to define the interface, but since I couldn't find it there, I thought this repo would get it done more quickly. I have needed to define dataframe-like objects twice in my packages, but couldn't get it done.

@bkamins
Owner

bkamins commented Jan 5, 2018

Let me write down here a tentative API that a subtype of AbstractDataFrame is expected to implement (as of now; this will surely change):

  • getindex:
    • single number, single symbol, vector of numbers, vector of symbols, vector of Bool, Colon
    • additionally, a pair of arguments, where the second is as above and the first is one of: single number, vector of numbers, vector of Bool, Colon
  • index: returning type Index
  • copy, similar
  • nrow, ncol
  • convert to Matrix
  • hcat!
  • _vcat
  • all join methods
  • all reshaping methods

And there are functions that are not part of AbstractDataFrame API, but are defined for DataFrame:

  • push!
  • append!
  • categorical!
  • allowmissing!
  • deleterows!
  • delete!
  • merge!
  • insert!
  • empty!
  • setindex!
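
To make the read-only part of the tentative list above more concrete, here is a minimal sketch of a toy column store that supports `nrow`, `ncol`, `copy`, and a few of the `getindex` forms. Everything in it is an assumption made for illustration: `ToyFrame` and `colindex` are hypothetical names, the type does not subtype AbstractDataFrame, and the `nrow`/`ncol` functions are defined locally rather than extending DataFrames.jl. It is not the interface DataFrames.jl requires.

```julia
# Hypothetical toy type: column names plus one vector per column.
struct ToyFrame
    names::Vector{Symbol}
    columns::Vector{AbstractVector}
end

# Local stand-ins for nrow/ncol (not the DataFrames.jl functions).
nrow(tf::ToyFrame) = isempty(tf.columns) ? 0 : length(first(tf.columns))
ncol(tf::ToyFrame) = length(tf.columns)

# Resolve a column selector (name or position) to a position.
function colindex(tf::ToyFrame, name::Symbol)
    i = findfirst(isequal(name), tf.names)
    i === nothing && throw(KeyError(name))
    return i
end
colindex(tf::ToyFrame, i::Integer) = i

# copy duplicates the names and every column vector.
Base.copy(tf::ToyFrame) =
    ToyFrame(copy(tf.names), AbstractVector[copy(c) for c in tf.columns])

# Single-argument indexing: one column, a vector of column names, or Colon.
Base.getindex(tf::ToyFrame, col::Union{Symbol,Integer}) =
    tf.columns[colindex(tf, col)]
Base.getindex(tf::ToyFrame, cols::AbstractVector{Symbol}) =
    ToyFrame(collect(cols), AbstractVector[copy(tf[c]) for c in cols])
Base.getindex(tf::ToyFrame, ::Colon) = copy(tf)

# Two-argument indexing: the row selector is forwarded to the column vector,
# so a number, a vector of numbers, a vector of Bool, or Colon all work.
Base.getindex(tf::ToyFrame, rows, col::Union{Symbol,Integer}) = tf[col][rows]

tf = ToyFrame([:a, :b], [[1, 2, 3], ["x", "y", "z"]])
nrow(tf), ncol(tf)             # (3, 2)
tf[:a]                         # column by name
tf[2, :b]                      # "y"
tf[[true, false, true], :a]    # [1, 3]
```

Even this small subset needs several methods, which gives some idea of why the full list (joins, reshaping, `hcat!`, conversion to Matrix) adds up to a large amount of code.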

@bkamins bkamins closed this as completed Aug 20, 2018
@juliohm
Author

juliohm commented Feb 10, 2019

Is this API documented somewhere already? How does it relate to the Tables.jl API?

@bkamins
Owner

bkamins commented Feb 11, 2019

Unfortunately, it has not been written down.
Tables.jl is a more general and simpler API, which DataFrames.jl in particular satisfies. You can check in the /other/tables.jl file which methods need to be defined (some methods are already there for AbstractDataFrame, while others are specific to DataFrame and would have to be extended).
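
For comparison, here is a minimal sketch of what satisfying the Tables.jl column-access interface can look like for a custom type. It assumes the interface as documented for Tables.jl 1.x, which may differ from the one available when this comment was written. `MyTable` is a hypothetical type used only for illustration, and this is not the code in the file mentioned above.

```julia
using Tables

# Hypothetical column store: column names plus one vector per column.
struct MyTable
    names::Vector{Symbol}
    columns::Vector{AbstractVector}
end

# Declare the type a table that exposes its data column by column.
Tables.istable(::Type{MyTable}) = true
Tables.columnaccess(::Type{MyTable}) = true
Tables.columns(t::MyTable) = t

# Column lookup by name and by position.
Tables.columnnames(t::MyTable) = t.names
Tables.getcolumn(t::MyTable, i::Int) = t.columns[i]
function Tables.getcolumn(t::MyTable, nm::Symbol)
    i = findfirst(isequal(nm), t.names)
    i === nothing && throw(KeyError(nm))
    return t.columns[i]
end

t = MyTable([:a, :b], [[1, 2], ["x", "y"]])
Tables.getcolumn(t, :a)    # [1, 2]
Tables.columntable(t)      # materialize as a NamedTuple of vectors
```

This handful of methods is much shorter than the AbstractDataFrame list, which matches the description of Tables.jl as the simpler of the two APIs.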


3 participants