
redesign tuner for future deployment #7

Closed
WeiFoo opened this issue Nov 25, 2015 · 7 comments

WeiFoo (Contributor) commented Nov 25, 2015

To support algorithms other than DE, such as GALE, a new tuning interface should be designed and provided.
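
A minimal sketch of what such a pluggable tuner interface might look like (the class and method names below are hypothetical, not taken from this repo), assuming each tuner only needs the parameter ranges and an evaluation callback:

```python
# Hypothetical sketch of a tuner interface; names are illustrative,
# not the actual classes in this repo.
from abc import ABC, abstractmethod


class Tuner(ABC):
    """Common interface so DE, GALE, etc. can be swapped in."""

    def __init__(self, params, evaluate):
        self.params = params        # dict: name -> (low, high) range
        self.evaluate = evaluate    # callback: candidate dict -> score

    @abstractmethod
    def tune(self, budget):
        """Return the best candidate found within the evaluation budget."""


class DETuner(Tuner):
    def tune(self, budget):
        ...  # differential evolution loop goes here


class GaleTuner(Tuner):
    def tune(self, budget):
        ...  # wrapper around GALE / JMOO goes here
```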

WeiFoo added the ToDo label Nov 25, 2015
WeiFoo (Contributor, Author) commented Nov 27, 2015

Done!

WeiFoo closed this as completed Nov 27, 2015
timm commented Nov 29, 2015

u got DE and GALE etc in JMOO?

WeiFoo (Contributor, Author) commented Nov 29, 2015

I haven't used JMOO so far; I will write a wrapper to use GALE if necessary. Let's first see whether tuning SMOTE works or not.

timm commented Nov 30, 2015

hey @WeiFoo, i don't actually get why we DONT have DE+SMOTE results yet. given @azhe825's rig it should be fast++ to check this out....

WeiFoo (Contributor, Author) commented Nov 30, 2015

SMOTEing once is fast; SMOTEing at least 10_10_10 times with early termination is another story.

Tuning SMOTE is the most time-consuming task. I simplified the experiment, and it has now been running on the HPC for more than 16 hours, for only one small data set called anime.txt.

The HPC is not that much faster either. I have been running the same experiment on my laptop for 6 hours, and based on the logs it still needs about 4 more hours.

By the way, I rewrote about 90% of the code. That takes time.
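
For a rough sense of scale (illustrative numbers only, not the actual experiment settings), the run count multiplies quickly once each tuner evaluation wraps a 10_10_10-style repetition:

```python
# Back-of-the-envelope only; population size, generations, and per-run
# time are assumptions, not the settings used in this experiment.
candidates_per_generation = 10
generations = 10
repeats_per_evaluation = 10 * 10 * 10     # the "10_10_10" repetition
runs = candidates_per_generation * generations * repeats_per_evaluation
print(runs)                               # 100,000 SMOTE+train+test runs
# even at ~1 second per run that is already more than a day of compute,
# which is why early termination and simplifying the experiment matter
```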


timm commented Dec 1, 2015

> By the way, I rewrote about 90% of the code. That takes time.

acknowledged.

fyi- if smote tuning is soooo slow and data mining tuning is soooo fast then maybe the conclusion here is tune data miners, not pre-processor

on the other hand: why is smote slow? is it the NN calculations? if you find east1,west1 at the top level of WHERE, then use x from projecting each point onto the east1-west1 line and y = sqrt(a^2 - x^2) (where a is the point's distance to east1) to give each point a y-axis value, you could quickly divide the data into a 2d grid. then you divide each dimension into 16 (so now you have 16^2 buckets) and for each bucket, just keep 5 examples of each class (selected at random). so if you want to smote something, use east1,west1 to find its bin then pick any one of the 5 in that bin.
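
A sketch of that bucketing idea, under stated assumptions: `dist` and `label` are assumed helpers (distance between two rows, class of a row) and nothing here is code from this repo.

```python
import math
import random
from collections import defaultdict


def project(row, east, west, dist):
    """FastMap-style projection: x along the east-west axis (cosine rule),
    y = sqrt(a^2 - x^2) perpendicular to it."""
    a, b, c = dist(row, east), dist(row, west), dist(east, west)
    x = (a * a + c * c - b * b) / (2 * c)
    y = math.sqrt(max(a * a - x * x, 0.0))   # guard rounding noise
    return x, y


def build_grid(rows, east, west, dist, label, bins=16, per_class=5):
    """Split the 2-d projection into bins*bins buckets; keep at most
    `per_class` randomly chosen rows of each class in every bucket."""
    pts = [(r,) + project(r, east, west, dist) for r in rows]
    xs = [p[1] for p in pts]
    ys = [p[2] for p in pts]
    lo = (min(xs), min(ys))
    span = (max(xs) - lo[0] + 1e-12, max(ys) - lo[1] + 1e-12)
    buckets = defaultdict(lambda: defaultdict(list))
    for r, x, y in pts:
        i = min(int(bins * (x - lo[0]) / span[0]), bins - 1)
        j = min(int(bins * (y - lo[1]) / span[1]), bins - 1)
        buckets[(i, j)][label(r)].append(r)
    grid = {key: {c: random.sample(rs, min(per_class, len(rs)))
                  for c, rs in classes.items()}
            for key, classes in buckets.items()}
    return grid, lo, span


def smote_neighbor(row, east, west, dist, label, grid, lo, span, bins=16):
    """Find the row's bucket and pick any kept neighbour of the same class."""
    x, y = project(row, east, west, dist)
    i = min(int(bins * (x - lo[0]) / span[0]), bins - 1)
    j = min(int(bins * (y - lo[1]) / span[1]), bins - 1)
    pool = grid.get((i, j), {}).get(label(row), [])
    return random.choice(pool) if pool else None
```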

WeiFoo (Contributor, Author) commented Dec 1, 2015

The reason is that each evaluation when tuning SMOTE requires generating new data, fitting the learner, predicting, and computing the F/pd/precision values. I explain it here ==> #9
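
Roughly what one such evaluation has to do per candidate (sketch only; the helper names are placeholders, not functions from this repo or from issue #9):

```python
# Sketch of one tuning evaluation; all helper names are placeholders.
def evaluate_candidate(smote_params, train, test, Learner):
    balanced = smote(train, **smote_params)    # 1. generate new data
    model = Learner().fit(balanced)            # 2. fit the learner
    predicted = model.predict(test)            # 3. predict
    return f_pd_precision(test, predicted)     # 4. F / pd / precision

# the tuner calls this once per candidate, per fold, per repeat,
# so the SMOTE + fit + predict cost is paid hundreds of times
```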
