Implementation probably needed for sort-values
#8
Comments
@alanmarazzi: I'd be game to open a PR for this. Presuming you are okay with that, is there some context and guidance you'd like to provide?
I'm absolutely ok with that! I think that After this the true main step is to decide whether you want to use The difference between the two is whether you think that having an explicit positional first argument makes sense for the implementation. For example, here you can find The reason why I decided to implement them like that is because After this, a simple docstring with a couple of examples will be ok. There is no need to add docs for all the arguments; I have to get rid of those for the other functions as well, since it's impossible to list them all. One cool thing would be to link to the original docs https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html, but of course even this way you can't know whether people want the DataFrame or the Series implementation (suggestions are welcome). The examples can be used for some tests. Nothing too extensive: we don't want to test pandas, but the panthera implementation. After everything is working and the tests run, the last thing to do is to add the function to panthera/src/panthera/panthera.clj (line 12 in d35232e).
Please let me know about any questions you might have. I'd like to turn this into a contributing guide, and you'll see that once you get the hang of it, it's very simple to add functionality to this!
@alanmarazzi thanks for this writeup! I'm going to try to turn to this this week, and will certainly come back with any questions I may have. Regarding the docstring, can you give me an example of what you think is a good model for the docstrings going forward? I have to say it has been very convenient to have some of the docstrings that you have written in place. I'm thinking, for example, of the docstrings for
You're right. OK, just to be clear, a dumb example of the issue is this: https://pandas.pydata.org/pandas-docs/stable/search.html?q=swaplevel# Now, the The problem is that if you look at the docs, different arguments are accepted depending on the type you get. I'd settle for something like:
So, clarify only args and send people to the original docs for attrs (in this case there's none because I decided to not let people easily access Anyway, I'm really keen to receive suggestions about this. I already know pandas extensively, but I still find myself in the docs very often because of the size and depth of its API.
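The point about arguments differing by type can be illustrated in plain pandas. The following is a minimal sketch in Python, not panthera code: the wrapper name `sort_values` and its handling of `by` are assumptions made for illustration. It relies only on the fact that `DataFrame.sort_values` requires `by`, while `Series.sort_values` has no such parameter.

```python
import pandas as pd


def sort_values(obj, by=None, **kwargs):
    """Hypothetical wrapper: sort a Series or a DataFrame by value."""
    if isinstance(obj, pd.DataFrame):
        # DataFrame.sort_values needs to know which column(s) to sort on.
        if by is None:
            raise ValueError("`by` is required when sorting a DataFrame")
        return obj.sort_values(by=by, **kwargs)
    # Series.sort_values has no `by` parameter, so it is not forwarded.
    return obj.sort_values(**kwargs)


# Made-up data, for illustration only.
s = pd.Series([3, 1, 2], index=["x", "y", "z"])
df = pd.DataFrame({"k": ["a", "b", "c"], "v": [3, 1, 2]})

print(sort_values(s).tolist())
print(sort_values(df, by="v", ascending=False)["k"].tolist())
```

A single docstring for such a wrapper would have to describe both call shapes, which is one reason to document only the shared arguments and link to the pandas docs for the rest.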
The following piped transformation shows a gap that we might think about filling in this library:
In order to sort the n-unique count on the :value column, it was necessary, as things stand, to first cast the result as a data frame (it was a series after the group-by and aggregation fn), and then to call the sort_values method on the :pyobject.
It would be nice to set things up such that we don't need to do these extra steps.
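For reference, here is a sketch of the same situation in plain pandas, written in Python rather than Clojure. The data and the column names `group` and `value` are made up for illustration; the original pipeline is not reproduced here. It shows that the group-by plus unique-count step yields a Series, and that pandas lets you sort that Series directly, without the detour through a data frame.

```python
import pandas as pd

# Made-up data, for illustration only.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b", "c"],
    "value": [1, 1, 1, 2, 3, 5],
})

# Grouping and counting unique values returns a Series, not a DataFrame.
counts = df.groupby("group")["value"].nunique()

# The workaround described above: cast to a data frame, then sort by column.
via_frame = counts.to_frame().sort_values(by="value", ascending=False)

# Series.sort_values does the same job directly and takes no `by` argument.
direct = counts.sort_values(ascending=False)

print(type(counts).__name__)
print(direct.index[0], direct.iloc[0])
```

A sort-values function in panthera that accepts either type would presumably remove the need for the cast.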