Revise the current API #26

Closed
sylwiabr opened this issue Sep 10, 2018 · 14 comments
Labels: Category: API (Changes to the library API), Type: Enhancement (Enhancements to the library)

Comments

@sylwiabr
Member

Some points that have been raised:

  • max of a column is not a number, but a column (I understand why that is so, but it is not nice to work with)
  • The map over a column works in a way that gets me totally lost. I need to specify the type of the column, but I don't know it in the first place. That's a big win for pandas, where I can map like I know it.
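
To make the two points above concrete, here is a minimal Python sketch of the behaviour being asked for. The `Column` class and its methods are hypothetical stand-ins written for illustration, not the library's actual types: `max` hands back a plain value instead of a one-element column, and `map` accepts any function without the caller declaring the element type first.

```python
from typing import Any, Callable, List


class Column:
    """Hypothetical toy column, used only to illustrate the requested API."""

    def __init__(self, values: List[Any]) -> None:
        self.values = list(values)

    def max(self) -> Any:
        # Return a plain scalar, skipping missing (None) entries.
        present = [v for v in self.values if v is not None]
        return max(present)

    def map(self, fn: Callable[[Any], Any]) -> "Column":
        # No element type is declared up front; missing values pass through.
        return Column([None if v is None else fn(v) for v in self.values])


prices = Column([3, None, 7, 5])
highest = prices.max()                  # a number, not a Column
doubled = prices.map(lambda v: v * 2)   # still a Column
```

For comparison, pandas behaves this way too: `Series.max()` returns a scalar and `Series.map` takes an arbitrary callable.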
@sylwiabr added the Type: Enhancement and Category: API labels on Sep 10, 2018
@sylwiabr
Member Author

Maybe the describe method should apply to the whole Dataframe, and describeColumn should take a columnName and do what the current describe does? @piotrMocz @kustosz?
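
A rough Python sketch of that split, under stated assumptions: the table is modelled as a plain dict of column lists, and the statistics returned (count, mean, min, max) are picked for illustration only, since the thread does not say what the current describe computes. The names `describe_column` and `describe` are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional

# Assumption: a table is a mapping from column name to a list of cells,
# with None standing in for a missing value.
Table = Dict[str, List[Optional[float]]]


def describe_column(table: Table, column_name: str) -> Dict[str, float]:
    # Per-column summary, i.e. the role of the current describe.
    present = [v for v in table[column_name] if v is not None]
    return {
        "count": len(present),
        "mean": mean(present),
        "min": min(present),
        "max": max(present),
    }


def describe(table: Table) -> Dict[str, Dict[str, float]]:
    # Whole-table summary: one per-column summary for every column.
    return {name: describe_column(table, name) for name in table}


sample: Table = {"a": [1.0, 2.0, 3.0], "b": [10.0, None, 30.0]}
summary = describe(sample)
```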

@mwu-tow
Contributor

mwu-tow commented Sep 11, 2018

Case: user has a table, wants to fill nulls in each column with a mean value for that column
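
As a reference point for this use case, here is a minimal stdlib-only Python sketch of the operation. The dict-of-lists table and the function name `fill_nulls_with_mean` are assumptions made for the example, not part of the library.

```python
from statistics import mean
from typing import Dict, List, Optional


def fill_nulls_with_mean(
    table: Dict[str, List[Optional[float]]],
) -> Dict[str, List[Optional[float]]]:
    """Replace each None with the mean of the non-missing values in its column."""
    filled = {}
    for name, values in table.items():
        present = [v for v in values if v is not None]
        if not present:
            # A column with no values has no mean; leave it unchanged.
            filled[name] = list(values)
            continue
        column_mean = mean(present)
        filled[name] = [column_mean if v is None else v for v in values]
    return filled


raw = {"a": [1.0, None, 3.0], "b": [None, 4.0, 6.0]}
clean = fill_nulls_with_mean(raw)
```

In pandas the same thing is usually a one-liner, `df.fillna(df.mean(numeric_only=True))`, which is roughly the level of convenience this case is asking for.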

@piotrMocz
Contributor

The thing I have very strong feelings about:
the Table class should expose the read<Format> method. Please keep in mind that we're not targeting expert programmers or even hobbyist programmers, but "data professionals".

Table.readCSV is a method that gives you a table from a CSV file. Table.readXLS is a method that gives you a table from an Excel file. And so on. Incidentally, this is exactly how pandas does it. Advantages:

  • single import for Table read and for Table processing
  • "ideological" compatibility with pandas
  • hides the implementation details and makes the API dead simple

Currently you write CSVParser.readFile. Let's look at the drawbacks:

  • when processing data, you need another entity: some parser. I wouldn't want to have to know about parsers when all I want to do is read a dataframe from a file
  • one more concept to keep track of
  • one more import

All in all, I suggest we at least add the Table.readCSV = CSVParser.readFile alias.
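
The alias idea can be sketched in Python as follows. Both classes here are hypothetical analogues written for this example (the real library is not Python), and the reader takes a file-like object so the snippet runs without touching disk.

```python
import csv
import io
from typing import Dict, List, TextIO


class Table:
    """Hypothetical table: column name -> list of string cells."""

    def __init__(self, columns: Dict[str, List[str]]) -> None:
        self.columns = columns

    @classmethod
    def read_csv(cls, source: TextIO) -> "Table":
        # Thin alias: delegate to the parser so callers never import it.
        return CSVParser.read_file(source)


class CSVParser:
    """Hypothetical parser, kept as an implementation detail."""

    @staticmethod
    def read_file(source: TextIO) -> Table:
        rows = list(csv.reader(source))
        header, body = rows[0], rows[1:]
        columns = {name: [row[i] for row in body] for i, name in enumerate(header)}
        return Table(columns)


data = io.StringIO("city,population\nKrakow,780000\nGdansk,470000\n")
table = Table.read_csv(data)
```

The parser still exists and can be used directly, but the common path needs only the one `Table` name.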

Like, really, please 😅

@sylwiabr
Member Author

sylwiabr commented Oct 3, 2018

All non-user-facing functions should have a _ prefix

@sylwiabr
Member Author

sylwiabr commented Oct 3, 2018

@piotrMocz what do you think about #57 in the context of reading/writing files?

@wdanilo
Member

wdanilo commented Oct 3, 2018

Moreover, the toInt method should be named toEnum

@sylwiabr
Member Author

sylwiabr commented Oct 3, 2018

Table.fromFile should have a better, more descriptive name

@sylwiabr
Member Author

sylwiabr commented Oct 3, 2018

removeColumn -> removeByIndex

@sylwiabr
Member Author

sylwiabr commented Oct 3, 2018

keep as the opposite of remove

@sylwiabr
Member Author

sylwiabr commented Oct 3, 2018

corr -> correlations
corrWith -> correlationsWith

@sylwiabr
Member Author

sylwiabr commented Oct 3, 2018

change generated names from CORR_WITH_foo to corr_with_foo

@sylwiabr
Member Author

sylwiabr commented Oct 4, 2018

sort -> sortMultiple
sort Descending by default
sort config (?)

@piotrMocz
Contributor

@piotrMocz what do you think about #57 in the context of reading/writing files?

I left a comment there ;)

@sylwiabr
Member Author

plotVerticalLayout [Plot] - to remove when lazy vis appears

4 participants