Building a visualization tool for MXNet #4003
The way tensorboard works is that it takes in a log file printed in a specific format and then renders it. Here is what I think would be an ideal solution:
But I haven't looked into this in-depth so this might be hard/impossible. So feel free to do anything that works for you first. We can discuss whether we want to merge it into mxnet or provide it as a separate solution afterwards. |
Yes, tensorboard only requires the proto of the logged results, but I didn't find the entry point to create a Summary object, which is returned directly by scalar_summary (a tensorflow op), and that means we have to call tf.run. I'm trying to work around this. I'll look into it in the coming two weeks. |
I think tensorboard is relatively isolated. Last time I looked at the code, only the proto of the log file was needed |
My memory of using tensorboard is that those logfiles quickly get extremely large. Do people really share those logfiles with each other? It also made me worry that the huge amount of I/O would limit performance -- which would be more of an issue with MXNet than TF. So that's something else we can experiment/measure: what kind of IO bandwidth would be needed to produce these logfiles. |
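To make the I/O concern above concrete, here is a rough back-of-envelope sketch in Python. All the numbers (record size, tag count, batch rate, logging interval) are assumptions for illustration, not measurements:

```python
# Back-of-envelope estimate of TensorBoard-style logging bandwidth.
# All constants below are hypothetical; plug in your own measurements.

BYTES_PER_SCALAR_EVENT = 100   # assumed on-disk size of one scalar event (proto + framing)
TAGS = 20                      # number of scalar summaries logged each time
BATCHES_PER_SECOND = 50        # assumed training throughput
LOG_EVERY_N_BATCHES = 10       # logging interval

events_per_second = TAGS * BATCHES_PER_SECOND / LOG_EVERY_N_BATCHES
bytes_per_second = events_per_second * BYTES_PER_SCALAR_EVENT
print(f"{bytes_per_second / 1024:.1f} KiB/s")
```

Under these assumptions, scalar logging is on the order of 10 KiB/s, which is negligible; histogram and especially image summaries are orders of magnitude larger, so they are where the bandwidth worry would actually bite.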
cf. the example of using TensorBoard in minpy. @jermainewang may have more comments on the details |
@tqchen @jermainewang Thanks for the reference, and I've found an API for Although it has only |
minpy's way of using tensorboard could be migrated to mxnet quite easily. There are mainly three components:
We plan to put the codes here: https://github.com/dmlc/minpy/tree/visualize/minpy/visualize . I will ping you again after it is updated. |
The related PR is still under review here: dmlc/minpy#87 |
@jermainewang That's great! |
Hi, I've finished the scalar summary part and am currently exploring image summaries and histogram summaries. We did not plan to do audio and graph summaries for minpy, as minpy does not use a computational graph. But that should work for mxnet. I also noticed there is a new section in TensorBoard, after the release of TensorFlow v0.12, for word embeddings, which is super cool: https://www.tensorflow.org/versions/master/how_tos/embedding_viz/index.html#tensorboard-embedding-visualization. |
Hey guys, I've finished the first item in the TODOs, with generous help from @mufeili and @jermainewang. But it still requires a writer/RecordFileWriter from TF; I'll submit the code once I finish the writer. |
@mufeili Could you take a look at this issue? tensorflow/tensorflow#4181, in which danmane said it's 'tfrecord' that does the file-writing job. Then I dug into the code and found the relevant C++ code in tensorflow/core/lib/io/record_writer.cc and py_record_writer.cc; TensorFlow uses SWIG to wrap them for use in Python. I think it's too hard to rewrite these in Python, as they have so many dependencies, and they're not easy to use from other languages, which would force everyone to use the Python interface for visualization purposes. Can I just pull these related C++ files out, put them in the core library, and use SWIG or something else as a solution for the Python interface? @piiswrong Could you give me some suggestions? What's your convention for writing a wrapper from C to Python? |
@zihaolucky tensorflow/tensorflow/core/lib/io/record_writer.cc is exactly where I got stuck at first. We then decided to use tf.python_io.TFRecordWriter for the time being. |
Good news: I've found that someone has already given a solution for the writer. I migrated the code to MXNet and it works, so now we can use TensorBoard without relying on TF. I've pushed the code to my branch https://github.com/zihaolucky/mxnet/tree/feature/tensorboard-support-experiment — please check it out. |
@zihaolucky Awesome! I've had a quick look at it. I think it currently only supports scalar summary as well so I am not sure if the record_writer function would still work when it comes to other kinds of summaries. But still lots of thanks! |
@mufeili It seems it could also support other types of summaries, since it writes a serialized event; it just only provides a scalar summary API so far. |
Great work - exciting to see the progress! Note that you probably need to include the necessary copyright information if you borrow the code from some other project. |
@terrytangyuan Thanks for your kind reminder, I would do some research on the copyright issue. |
Since mxnet and tf both use the Apache license, it should be fine. Retaining the author comment at the beginning of each file should be enough.
It would be necessary to copy the LICENSE file from the original repo and retain the copyright notice |
Update, we now provide a PyPI package for TensorBoard fans :) |
I made a standalone tensorboard by extracting tensorboard's C++ dependencies from TensorFlow. |
@bravomikekilo great work! Any plan to ship it to dmlc/tensorboard? And I believe you have to make it easy to maintain, as tensorboard might change very often and new features keep coming in (as they said at the TF Dev Summit, they're going to provide a more flexible plugin module for tensorboard developers). That's why I focus on the logging part and try not to change the rendering part. Just my personal opinion. |
I mostly keep the structure of the tensorboard project. I'm going to make the structure match the official tensorflow, so we can sync their changes. I have enabled logging support from C++, so it will be much faster and more reliable. |
Or maybe we should merge dmlc/tensorboard into mxconsole? Most of the tensorboard functionality can be enabled from the reduced tensorflow. Meanwhile, we can split mxconsole into smaller modules. The reduced tensorflow can do many more things. |
@piiswrong @jermainewang any thoughts? |
I've already merged dmlc/tensorboard into bravomikekilo/mxconsole, including a native-library-powered summary API and a tutorial. |
What's the benefit of extracting the code vs cloning tensorflow? |
The library is smaller and easier to build. Meanwhile, a smaller code base is much clearer and more portable. |
A native library for potentially supporting more language interfaces seems like a good idea, but the maintainers still have to write a wrapper, which is the same workload as writing the logging interface in Scala or any other language. I encourage you to propose your roadmap for this direction of extracting the code and point out some promising benefits; otherwise, spending time on the 10% that differs while 90% is the same is not a good idea. |
Ok, I will try to add back the interface files for Go and Java from the original TensorFlow. |
Besides, the native library provides a faster implementation of crc32 and protobuf writing, and it is possible to merge in the native PNG encoding support. |
A sad story is that the Java and Go interfaces don't have summaries or anything like that; maybe they just add the summary ops to the graph. It seems all the loggers still need to be written. |
Consider focusing on logging. |
I can extract just the logging part; that is much smaller.
Maybe we should split the logging and the rendering?
So, to sum up.
An optional choice is to split tensorflow_fs out of mxconsole; that will make it easier to keep in sync |
@zihaolucky @bravomikekilo Are you planning to port Tensorboard to mxnetR binding?? It will be great!! :) |
I'm not good at R, but I will try. It shouldn't be too hard. |
Great, thanks a lot @bravomikekilo! |
@bravomikekilo, @zihaolucky, @RogerBorras, @thirdwing, it would be great if there were a visualization board for mxnetR! |
@lichen11 @bravomikekilo If you can figure out a way to write the event file and the summary protobufs in R, then it could be achieved. Just refer to https://github.com/dmlc/tensorboard/tree/master/python and ping me if you need any help. |
Hi hackers,
I've started working on building a visualization tool for MXNet, like TensorBoard for TensorFlow. As @piiswrong suggested in #3306, to 'try to strip TensorBoard out of tensorflow', I'm going to work in this direction. Here are some of my notes after reading TensorBoard's documentation and searching for its usage on the web; feel free to comment below.
Motivation and some backgrounds
I've tried to visualize data using matplotlib and a bunch of helper tools like tsne in my daily work, and I'm tired of rendering and adjusting the size/color of the images. Besides, it's not easy to share this data with my friends. TensorBoard, on the other hand, provides good solutions for our daily use cases, such as learning curves and parameter/embedding visualization, and it's easy to share. See TensorBoard for more.
Daily use cases
I think these could satisfy most people, and they are already supported by TensorBoard with tf.scalar_summary, tf.image_summary, tf.histogram_summary, and tensorboard.plugins.projector.
TensorBoard usage
Some snippets from a tutorial on how to use TensorBoard.
The logic above is quite clear: the accuracy and cost get updated every time sess.run is called and return a Summary, which is fed into the log through SummaryWriter.
Feasibility
1. Easy to borrow and directly use in MXNet?
I've successfully visualized the 'makeup' curve using the code below:
So it means we could pass in something common, here a numpy array and a normal int, and reuse most of the code. I'd like to discuss possible routes for creating an interface to connect MXNet and TensorBoard, and I need your advice. But let's keep it simple for now.
2. Could it be stripped out on its own?
From this README, I guess TensorBoard could be built independently?
TODO
To prove we can use TensorBoard in a dummy way:
To keep our code clean and lightweight:
Or we could install the entire TF together with MXNet? Is that acceptable?
I think it's okay, but it's not good for our users and makes this visualization tool too heavy, because we would also be running core code in TensorFlow (the sess and Tensor.eval are actually computed by TF). But it depends on our checks; hard to tell.
Or is there any other way to work around it? As the summary in writer.add_summary(summary, epoch * batch_count + i) is a proto, that means we could use only SummaryWriter without using TF's computation. It's possible according to the doc of SummaryWriter.add_summary.
If we decide to borrow TensorBoard: