Visualization bug #54

Closed

matthewcarbone opened this issue Aug 9, 2018 · 11 comments
Labels
priority: LOW lowest priority

Comments

@matthewcarbone
Collaborator

matthewcarbone commented Aug 9, 2018

I noticed something with the Reporting class. I'm not sure if reporting.data is supposed to print cleanly, but it doesn't for me:

[screenshot: reporting.data printing without clean formatting]

The standard output looks like this:

[screenshot: the standard output]

But I think an easy fix is to do this:

[screenshot: the proposed fix]

Thoughts?

@matthewcarbone matthewcarbone added the priority: LOW lowest priority label Aug 9, 2018
@mikkokotila
Contributor

Thanks! I have the new reporting almost rebuilt. It's now also more focused on giving us access to details that are vital for the next level of abstraction, i.e. optimizing the optimization. Once we get the current dev into master, I'll add the new reporting to dev. What do you think about merging with master now?

@matthewcarbone
Collaborator Author

I think we could, but if you're refactoring Reporting, we might as well wait until you're done with that. Is there any reason to merge sooner?

@mikkokotila
Contributor

OK great, yes, I think it's good we merge now first. As for reporting, I'm actually writing it from scratch on the basis of what I found useful. What features do you think would be useful to add to it?

@matthewcarbone
Collaborator Author

@mikkokotila There are definitely a few things from my experience with it that I would recommend:

  • Allow sorting by different quantities (validation accuracy, validation loss, training loss, etc.).
  • Definitely let the user easily trim the columns. In other words, maybe we don't care about anything other than the validation accuracy; in that case, let the user easily hide the training loss, validation loss and training accuracy. There is a very easy way to do this with pandas that I've used before. Let me know if you have any trouble finding/using it and I can dig it up.
  • Easy callbacks of various histories. Perhaps there's a way to get Scan to save all of those. This would be incredibly useful, I think.
  • Trimming rows as well, so that, say, only the top 5 permutations are displayed. This is also easy to do in pandas, I think.
  • Customizable display options (e.g. if the user is running in a terminal, let them set a flag so that instead of trying to display via pandas it just saves the .csv). A rough sketch of these last two points follows the list.
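
For those last two bullets, here's a minimal sketch of what I mean; the file name, the column names and the flag are all illustrative:

import pandas as pd
from IPython.display import display

# Load the experiment log (file name and column names are illustrative).
df = pd.read_csv('my_exp_1.csv')

# Trim rows: keep only the top 5 permutations by validation accuracy.
top5 = df.sort_values('val_acc', ascending=False).head(5)

# Customizable display: render inline in a notebook, or just save a .csv
# when running in a terminal.
use_notebook_display = True
if use_notebook_display:
    display(top5)
else:
    top5.to_csv('report.csv', index=False)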

Can't think of anything else off the top of my head, but yeah, that's a good place to start! Let me know if there's anything I can do. Otherwise, when you're done I'll try to help with comments/cleanup and whatnot. 👍

@mikkokotila
Contributor

@x94carbone thanks a lot. In terms of the callbacks, could you share some examples of the use cases? Storing the history is of course easy in general, and the data size should be totally fine as long as we handle it in an array the right way during the run and dump it to the df only at the end of the process.
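
Something like this is the pattern I mean; a minimal sketch, with dummy loops and values standing in for the real scan and metrics:

import pandas as pd

# During the scan: append one small dict per epoch instead of growing a df.
records = []
for round_id in range(3):          # stand-in for the scan's permutation loop
    for epoch in range(2):         # stand-in for the training epochs
        records.append({'round': round_id,
                        'epoch': epoch,
                        'loss': 0.5 / (epoch + 1),      # dummy metric values
                        'val_loss': 0.6 / (epoch + 1)})

# Only at the end of the process: dump everything into the df in one go.
df = pd.DataFrame(records)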

For the display side, I made the whole thing simpler: you have a class object that takes in one parameter, the log, and from there you have various properties (like the peak round). That makes it very easy to extend. The code is super clean, without any of the complexities of the current one.
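
Roughly this shape, as a minimal sketch of what I described; the property names and the 'val_acc' column are placeholders, not the actual new API:

import pandas as pd

class Reporting:

    def __init__(self, log):
        # log: path to the experiment .csv written during the scan
        self.data = pd.read_csv(log)

    @property
    def peak_round(self):
        # index of the round with the highest validation accuracy;
        # assumes the log has a 'val_acc' column (placeholder name)
        return self.data['val_acc'].idxmax()

    @property
    def rounds(self):
        # total number of rounds in the experiment
        return len(self.data)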

@matthewcarbone
Collaborator Author

Nice, that sounds good. By callbacks do you just mean how I manipulate the dataframe to trim columns and that sort of thing?

@matthewcarbone
Collaborator Author

Well, in case that was what you were referring to, here's the answer for the current state of Talos.

First, if you're using a notebook, these imports will make the output easier to deal with.

import talos as ta
import pandas as pd
from IPython.display import clear_output, display

Load the data from the run you just did with Talos, but suppress the output, since it's not always clearly formatted, depending on factors I haven't pinned down.

r = ta.Reporting('my_exp_1.csv')
clear_output()  # suppress the raw printout from Reporting

Here's an example of removing the columns in remove_columns and sorting the rows by sort_by.

# Columns to drop and the quantity to sort on.
remove_columns = ['round_epochs', 'acc', 'loss']
sort_by = 'val_fbeta_score'

# sort_values already returns a DataFrame, so no extra wrapping is needed.
rr = r.data.sort_values(sort_by, ascending=True).drop(remove_columns, axis=1)
display(rr)

Is this helpful?

@channhan007

Hi there,
For the format of the output, I don't know why the column names are not in the right order.
Do you guys know how to fix this?
Thanks
[screenshot: output with column names out of order]

@mikkokotila
Contributor

@channhan007 the whole reporting piece is going to be replaced with a new, much cleaner approach that also supports the various use cases, including far better integration with plots. In the meantime, for a simple table view, you might just use:

from pandas import read_csv

read_csv('experiment.csv')

@channhan007

@mikkokotila Thank you for your response. I looked at the .csv output file and the column labels are not in the right order, so I have to rearrange those column names myself.
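
For now I do the rearranging in pandas; a minimal sketch, with placeholder column names:

import pandas as pd

df = pd.read_csv('experiment.csv')

# Reindex with an explicit column list in the desired order
# (these names are placeholders for whatever the log actually contains).
ordered = ['round_epochs', 'loss', 'acc', 'val_loss', 'val_acc']
df = df[ordered]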

@mikkokotila
Contributor

The new reporting is now included in dev and the old one is removed, so I'm closing here. Later we can open separate issues for new Reporting() features. It lives in /utils/reporting.py.
