Histogram collections #45
Comments
Histograms and protocol buffers in the context of R are presented in this paper:
Hi Ryan, your suggestion looks nice. I have to study the protocol buffers to properly understand all the implications and benefits, so sorry for the delay (in the past and also future). Just to give a short initial feeling (which may change after proper studying):
One additional note:
Thanks a lot, I am starting with your article :-) I will try to come back with more competent comments asap. Cheers,
Hi, thanks for the reply. I implemented it really quickly to get some idea of how it might work. I, too, would likely need to think about this further. I've been mulling over how to do this somewhat correctly, with little progress. For metadata, protocol buffers in general appear to be the way to properly maintain, sustain, and pass the information around. The cross-language capability and versioning are the key features that make this an ideal format (for any metadata). After some further thought, I have come to the same conclusions.
TensorFlow has a histogram proto, but it lacks some important information that is contained in physt. Nevertheless, it is worthwhile as a reference: Ryan
Hi Ryan, finally I got to reading about protobufs and your code. I partly copied lines from your commits, partly wrote code from scratch (in order to fit with the restructured physt.io module and to deal with multidimensional histograms). I am still in doubt about the Collection: whether to make it a collectively manipulable entity (with shared binnings, metadata etc.)... This part of the API will probably change (hopefully keeping the old message readable). Also let me know if you want more credit than a mention in the README, you deserve it for the idea and the initial implementation. Thanks
Hi Jan,
Apologies for the delay. I have not had much of a chance to look into the
details, but at a quick glance this looks great. I should get back to the
histogram work in the next few weeks. If something comes up I will let you
know.
Thanks for the mention in the README! That's more than enough credit!
…On Fri, Sep 21, 2018 at 14:42 Jan Pipek ***@***.***> wrote:
Hi Ryan,
finally I got to reading about protobufs and your code. I partly copied
lines from your commits, partly wrote code from scratch (in order to fit
with restructured physt.io module and to deal with multidimensional
histograms). Please have a look and let me know if you find the new release
Also let me know if you want more credits than a mention in README, you
deserve it for the idea and the initial implementation.
Hi, you are welcome :-)
Create a collection of histograms in a file, plus a utility function to merge multiple files into a single file.
This provides a way to distribute processing and histogram creation, store the histograms, and merge them.
I implemented rudimentary versions with JSON and Google protobuf. TensorFlow stores summary data and histograms in protocol buffers, so I gave that a try. Protobufs have versioning capability, which is just one potential benefit. See my fork and the protobuf tag.
Happy to work on this some more with your input.
One thing that would help is to make 'name' required, so that the Summary protocol buffer map<string, Histogram> takes the histogram name as its key. This is important for merging. Possibly merging can be implemented without converting back into histogram objects, so we only deserialize, sum, and write out a new summary.
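To make the merging idea concrete, here is a minimal sketch using plain JSON rather than protobuf. It is not the code from my fork or physt's API: the function names `merge_collections` and `merge_files` and the `{"bins": ..., "frequencies": ...}` layout are assumptions made up for illustration. It shows why a required name matters (it is the merge key) and that merging only needs deserialize, sum, write.

```python
import json
from pathlib import Path


def merge_collections(collections):
    """Merge dicts mapping histogram name -> {"bins": [...], "frequencies": [...]}.

    Hypothetical layout for illustration only; not physt's serialization format.
    """
    merged = {}
    for collection in collections:
        for name, hist in collection.items():
            if name not in merged:
                # First occurrence: copy the lists so the inputs are not mutated.
                merged[name] = {
                    "bins": list(hist["bins"]),
                    "frequencies": list(hist["frequencies"]),
                }
                continue
            target = merged[name]
            # Summing bin contents only makes sense for identical binning.
            if target["bins"] != list(hist["bins"]):
                raise ValueError(f"Incompatible binning for histogram {name!r}")
            target["frequencies"] = [
                a + b for a, b in zip(target["frequencies"], hist["frequencies"])
            ]
    return merged


def merge_files(input_paths, output_path):
    """Read several JSON collection files, merge them, and write a single file."""
    collections = [json.loads(Path(p).read_text()) for p in input_paths]
    merged = merge_collections(collections)
    Path(output_path).write_text(json.dumps(merged, indent=2))
    return merged
```

Each worker would write its own collection file, and `merge_files` would be run once at the end. Histograms that appear in only one file are carried over unchanged, and a binning mismatch raises an error instead of silently producing wrong sums.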