array/tensor refinement #14
I really like the idea of some extra formatting for well-established data structures like NumPy arrays or pandas DataFrames. I can see that frosch will have a lot more configuration options backed by some default settings. Allowing extra "callback" hooks for different types is a really cool idea. Something like this, which you already showed:

    def nparray_printer(np_array: np.ndarray) -> str:
        return f"Custom repr here"

    # hook provides an interface where type hooks can be customized
    hook(
        type_hooks={
            np.ndarray: nparray_printer,
            # Other <type>: <function>
        }
    )

I really appreciate your suggestions and the time you put into them. I will definitely take a look at this. But you are also very welcome to implement them yourself :)
Hi,
I would like to discuss the direction first. :)
Many of the ideas are mutually exclusive,
so nobody should waste effort before we agree.
Of course, if I deeply disagree, I could have my own fork/version, but I think everybody sees a different aspect, so common wisdom leads to a better solution. If we agree on a general direction, I will send PRs. :)
E.g. I could imagine an exception formatter for logging purposes too (file/string) which uses the same backbone, but instead of installing a hook, it could be called as frosch.pretty_exception_str(exception) in a try/except block.
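To make the logging idea concrete, here is a minimal sketch. The name pretty_exception_str is only the hypothetical function proposed above, not an existing frosch API, and the body simply uses the standard library traceback module as a stand-in for whatever formatting backbone frosch would share with its hook.

```python
import traceback


def pretty_exception_str(exc: BaseException) -> str:
    """Hypothetical helper: render an exception as a string for logging."""
    # format_exception returns a list of text lines, each ending in a newline.
    lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
    return "".join(lines)


def risky_division(a: float, b: float) -> float:
    return a / b


try:
    risky_division(1, 0)
except ZeroDivisionError as error:
    # No hook is installed; the string can go to a file or a logger.
    report = pretty_exception_str(error)
```

The point of the design is that the caller decides where the text goes, so the same formatter could serve both the installed hook and explicit logging.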
Anyway, I think there are basically two parts. The first is an interface for custom types. Here it would be nice to match with isinstance(...), so people could write a generic solution (e.g. for all tensor types). The second is a mechanism to access the default formatter,
e.g. frosch.default_printer(type).
Implementing array formatting is a pain, so having the default is nice, but being able to extend it would be cool.
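The two parts could be sketched roughly as follows. Everything here is an assumption for illustration: format_value, default_printer and the shape of the type_hooks dictionary are hypothetical names, and default_printer just falls back to repr instead of any real frosch formatter.

```python
from typing import Any, Callable, Dict, Tuple, Type, Union

Printer = Callable[[Any], str]
TypeKey = Union[Type, Tuple[Type, ...]]


def default_printer(value: Any) -> str:
    """Stand-in for a built-in formatter (hypothetical)."""
    return repr(value)


def format_value(value: Any, type_hooks: Dict[TypeKey, Printer]) -> str:
    # isinstance matching: a hook for a base class also covers its subclasses,
    # and a tuple key covers several types at once.
    for type_key, printer in type_hooks.items():
        if isinstance(value, type_key):
            return printer(value)
    return default_printer(value)


def sequence_printer(value: Any) -> str:
    # Extend the default output instead of reimplementing it.
    return f"{default_printer(value)}  (len={len(value)})"


hooks = {(list, tuple): sequence_printer}
```

With these hooks, format_value([1, 2, 3], hooks) goes through sequence_printer, while a value with no matching hook, such as an int, falls through to the default.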
@dvolgyes Hey, I am fiddling around with this feature right now. Could you provide me with a sample "display" function that I can test with? Something in the form of:

    def display_np_array(np_array) -> str:
        # Your implementation here
        return "Here the string representation you want"

Thanks in advance :)
Hi, I meant something like this below. Marking: I just made a dummy example, but a generic marker would be nice, e.g. Anyway, here is a short example:
I am not sure about the configs for those functions. Allowing custom datatype display functions while also making the preexisting display functions configurable could turn out to be a little too confusing. For big datatypes like DataFrames and NumPy arrays this makes sense, but in most scenarios it would be overkill. While messing around with this I realised that this feature is not far from print debugging on steroids, which I am a fan of. I like tools like icecream. But we could make it even better with our custom datatype display functions.

    from frosch import frosch_print  # Needs a shorter name though
    frosch_print(np_array)
    # Output: like the one in the image

Frosch is going in the direction of a multifunctional debugging tool... What is your opinion, @dvolgyes?
There are many different ways; in the end, you need to choose what you prefer. :) More or less, that is why I used configuration parameters: to balance between tradeoffs. Advanced users could replace anything and everything they want or, in the worst case, just spin up their own version,
In this formulation, you could have your own class, and frosch could even provide a few alternatives, e.g.
Another point to consider: making the hook install directly on module import,
The class-based, exception-signaled approach has its own nice part: if there is an error, it would automatically fall back
If there is a CUDA error, user code falls back to your advanced plugin. However, of course, messing with the exception hooks, and using exceptions at the same time
From my point of view, I don't like IDEs; I usually work in the command line and basic editors,
It is a never-ending story, but plugins could make it flexible enough, but at the same time,
Of course, you could have a plugin system similar to flake8's, which discovers plugins through
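The automatic fallback mentioned above could work roughly like the sketch below. It is only one possible reading of the idea: CannotFormat, shape_printer, broken_printer and format_with_fallback are all hypothetical names, and repr stands in for the built-in formatter.

```python
from typing import Any, Callable, Iterable


class CannotFormat(Exception):
    """Hypothetical signal: this printer does not handle the value."""


def shape_printer(value: Any) -> str:
    # Only handles objects that expose a `shape` attribute.
    if not hasattr(value, "shape"):
        raise CannotFormat
    return f"{type(value).__name__} shape={tuple(value.shape)}"


def broken_printer(value: Any) -> str:
    # Simulates a plugin with a bug.
    raise RuntimeError("plugin bug")


def format_with_fallback(value: Any, printers: Iterable[Callable[[Any], str]]) -> str:
    for printer in printers:
        try:
            return printer(value)
        except Exception:
            # Any failure, deliberate or accidental, moves on to the next printer.
            continue
    # Last resort: the plain representation.
    return repr(value)
```

A printer that crashes never hides the value being debugged; at worst the output degrades to the plain repr.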
Hi,
Great project!
I think it would be relatively easy to make it even more awesome. :)
Many people work in data science nowadays, and there
tensor properties such as shape are a more frequent source of errors
than, e.g., the values in the array.
Currently arrays / tensors are displayed like this:
I would recommend at least two extra pieces of information: shape and dtype, something like this:
Or maybe:
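As a rough illustration of the shape and dtype suggestion, a printer could be as small as the sketch below. The function name and the output layout are my own assumptions, not frosch behaviour. It relies only on the shape and dtype attributes that both NumPy arrays and PyTorch tensors expose, and FakeArray is a stand-in so the sketch runs without either library installed.

```python
from typing import Any


def describe_array(value: Any) -> str:
    """Summarize an array-like by type, shape and dtype (illustrative layout)."""
    # Duck typing: anything with `shape` and `dtype` attributes works.
    shape = tuple(getattr(value, "shape", ()))
    dtype = getattr(value, "dtype", "unknown")
    return f"{type(value).__name__}(shape={shape}, dtype={dtype})"


class FakeArray:
    """Stand-in with the same two attributes as a real array."""
    shape = (2, 3)
    dtype = "float32"


summary = describe_array(FakeArray())
```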
Maybe it would be even nicer to make the hook configurable, like:
Of course, you can always add an almost infinite number of useful tricks for numpy/pytorch, e.g.
isnan/isinf checks, so here is a final variant to consider:
Of course, these might take time to print, but if it is configurable, then it doesn't really matter.
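A sketch of such an opt-in check, assuming a hypothetical check_values switch. It is written in pure Python over nested lists so it runs anywhere; for real arrays it uses tolist(), which both NumPy arrays and PyTorch tensors provide, although a real implementation would more likely use the library's own vectorized isnan/isinf.

```python
import math
from typing import Any, Iterator


def _flatten(nested: Any) -> Iterator[Any]:
    # Walk arbitrarily nested lists/tuples and yield the scalar items.
    if isinstance(nested, (list, tuple)):
        for item in nested:
            yield from _flatten(item)
    else:
        yield nested


def non_finite_summary(value: Any, check_values: bool = False) -> str:
    # The scan visits every element, so it is off unless explicitly requested.
    if not check_values:
        return ""
    data = value.tolist() if hasattr(value, "tolist") else value
    nan_count = inf_count = 0
    for item in _flatten(data):
        if isinstance(item, float):
            nan_count += math.isnan(item)
            inf_count += math.isinf(item)
    return f"nan={nan_count}, inf={inf_count}"
```

Because the check is disabled by default, the cost of scanning a large array is only paid by users who ask for it.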
Another way, which would also be cool, is having a configurable extra printer.
E.g.
In this case anybody could customize their own printer and add an extra representation.
I imagine the print like this:
And the function interface would be:
This might mess up the colors, I see that.
There could be other ways to define values, like returning a dictionary of texts
with keys like "pre_type", "post_type", "pre_value", "post_value", and formatting them something like this:
This way you don't have many issues with the coloring, since it is still controlled by you,
but people could customize the printer.
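The dictionary variant could be sketched like this. The four keys come from the suggestion above; the render function, the " = " separator and the overall layout are assumptions made only for illustration.

```python
from typing import Any, Callable, Dict

Extras = Dict[str, str]


def render(value: Any, extra_printer: Callable[[Any], Extras]) -> str:
    extras = extra_printer(value)
    # The tool still produces the type and value texts itself, so it keeps
    # control over how (and in which color) they are printed.
    type_text = type(value).__name__
    value_text = repr(value)
    parts = [
        extras.get("pre_type", ""),
        type_text,
        extras.get("post_type", ""),
        " = ",
        extras.get("pre_value", ""),
        value_text,
        extras.get("post_value", ""),
    ]
    return "".join(parts)


def length_extras(value: Any) -> Extras:
    # A user-supplied printer only fills the slots it cares about.
    return {"post_type": f"[len={len(value)}]"}


line = render([1, 2], length_extras)
```

Missing keys default to empty strings, so a custom printer can stay very small.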
Which one to choose?
Well, I like my own printers, e.g. ones that test for NaNs, but for most people a reasonable built-in would be the easiest,
maybe with some verbosity parameter in the hook installer.
But once again: it is a great project already, and much more readable than any default.
I really appreciate your efforts, and even if you don't incorporate anything from the above,
it is already a lot of help for a lot of people. But maybe the above could make it even better. :)