
Scan and Evaluation into csv #162

Closed
FlorianBury opened this issue Jan 3, 2019 · 5 comments
Labels: investigation (gathering information)

Comments


FlorianBury commented Jan 3, 2019

Hi, thanks for your answers on #144.

About the first question, I am currently trying to add the results from the evaluation into the csv file. This is because I believe it will be a better way to select the best models when using Predict(). Right now I just open and write the .csv file in my scripts after the scan, which is not always feasible.

You mentioned new available tools in your answer; is there any documentation about them, and which version should I use?

Also, I have access to a computing cluster (managed with slurm) and I wanted to split the trainings between different nodes (CPU, of course slower than a GPU, but since there is only one of the latter and hundreds of the former, parallel wins over serial). So basically I just split the experiment dict into smaller ones and distribute them over the different nodes, and in the end I concatenate the csv files for analysis.

But this poses a problem when appending the csv file due to the file stageout. Is something related to that being investigated or in the making?
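For reference, the concatenation step mentioned above can be done with the standard library alone. This is a minimal sketch under assumptions: the per-node file naming (results_node*.csv) and the helper name concat_results are hypothetical, not anything Talos produces.

```python
import csv
import glob

def concat_results(pattern, out_path):
    """Concatenate per-node result files matching `pattern` into one csv,
    writing the header row only once. File naming is hypothetical."""
    paths = sorted(glob.glob(pattern))
    header_written = False
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="") as f:
                reader = csv.reader(f)
                header = next(reader)
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                # Append the data rows from this node's file.
                writer.writerows(reader)
```

Because each node writes its own file and the merge happens once at the end, no two jobs ever append to the same csv concurrently.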

@mikkokotila
Contributor

I will put together an example Notebook for the new features. Will try to do it today.

In the meantime, can you clarify the meaning of "stageout" here? Also, did I gather correctly that what you are doing now for parallelism is completely automated?

@mikkokotila
Contributor

For the evaluation result to go into csv, I will look into this as well. I think it makes sense to have that as an option as it's consistent with Scan().

@mikkokotila mikkokotila self-assigned this Jan 4, 2019
@mikkokotila mikkokotila added the investigation gathering information label Jan 4, 2019
@FlorianBury
Author

FlorianBury commented Jan 4, 2019

Great, thanks!

Basically I just re-open the csv file and append the evaluation error to it in another part of the script. I am not sure it will behave correctly across different nodes; still working on that.
EDIT: I got it working now. Maybe not optimal, but enough for what I need.

Yes, it is automated, but it is rather basic. I just recover the whole set of parameters that talos generates, split it into different subsets, and convert them back into dictionaries that are saved as pkl files. Then the script launches the scans of each of these sub-dicts on several nodes with slurm.
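The splitting step described above could be sketched like this. It is a minimal sketch, not Talos code: the function name, file prefix, and round-robin chunking are assumptions, and each chunk is kept as a list of per-combination dicts rather than rebuilt into a dict-of-lists, since the latter could re-expand into a larger grid than the subset it came from.

```python
import itertools
import pickle

def split_param_grid(params, n_chunks, prefix="params_chunk"):
    """Expand a dict-of-lists parameter space into all combinations,
    split them into n_chunks subsets, and pickle one file per subset
    so each slurm job can load its own share. All names hypothetical."""
    keys = sorted(params)
    # Full cartesian product of the parameter space, one dict per combination.
    combos = [dict(zip(keys, values))
              for values in itertools.product(*(params[k] for k in keys))]
    # Round-robin the combinations into n_chunks roughly equal subsets.
    chunks = [combos[i::n_chunks] for i in range(n_chunks)]
    for i, chunk in enumerate(chunks):
        with open(f"{prefix}_{i}.pkl", "wb") as f:
            pickle.dump(chunk, f)
    return chunks
```

Each job would then unpickle its chunk and run its scans over those combinations only.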

@mikkokotila
Contributor

Sorry it took a little longer. The below is available in v.0.4.6 (for now in dev and daily-dev) and dramatically streamlines the process of evaluation through cross-validation.

scan_object = ta.Scan(x=x_train, y=y_train, params=p, model=gen_model)
scan_object_with_eval = ta.Autom8(scan_object, x_val, y_val)

The above completes an entire pipeline from the scan through evaluation and predictions. The prediction and evaluation results are added to the scan_object, and you can easily export to csv from there if you like. For example, to export the evaluation scores together with the parameters: scan_object.data.to_csv('results.csv').

For the most part, you might just want to evaluate the models right away without thinking about predictions yet; in that case you can simply:

scan_object = ta.Scan(x=x_train, y=y_train, params=p, model=gen_model)
scan_object.evaluate_models(x_val, y_val)

Again, you can export to csv with the same scan_object.data.to_csv('results.csv').
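Once exported, the best models can be picked out of that csv without pandas. A minimal sketch under assumptions: the val_acc column name and the helper name top_models are hypothetical and depend on which metrics your experiment actually records.

```python
import csv

def top_models(csv_path, metric="val_acc", n=3):
    """Return the n best rows from a results csv, sorted descending by the
    given metric column. Column name is an assumption; adjust as needed."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    # csv stores everything as strings, so convert the metric for sorting.
    return sorted(rows, key=lambda r: float(r[metric]), reverse=True)[:n]
```

For a loss-like metric you would sort ascending instead (drop reverse=True).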

@FlorianBury
Author

Looks much simpler, thanks!
