Preprocessing
nave91 edited this page Sep 2, 2014 · 2 revisions
Data can be preprocessed following Menzies' header conventions, where a mark in each column name declares that column's role:
| group | mark | type  | nump | wordp | meaning             |
|-------|------|-------|------|-------|---------------------|
|       | ?    |       |      |       | a column to ignore  |
| dep   | =    | klass |      | X     | a label to predict  |
| dep   | +    | more  | X    |       | a goal to maximize  |
| dep   | -    | less  | X    |       | a goal to minimize  |
| indep | $    | num   | X    |       | non-goal number     |
| indep | else | term  |      | X     | non-goal non-number |
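The table above can be read as a small lookup on the marks at the start of a column name. The following is a minimal sketch, not the original implementation: the function name `column_role` is hypothetical, and it assumes that marks only appear as a leading prefix and that `?` takes priority over any other mark (as in `?+$temperature`).

```python
MARKS = "?=+-$"  # the mark characters listed in the table

def column_role(name):
    """Return (type, is_numeric) for a header name, or None if ignored.

    Assumption: marks form a leading prefix of the name, e.g. "-$humidity".
    """
    name = name.strip()
    body = name.lstrip(MARKS)
    marks = name[:len(name) - len(body)]   # leading run of mark characters
    if "?" in marks:
        return None                        # a column to ignore
    if "=" in marks:
        return ("klass", False)            # a label to predict
    if "+" in marks:
        return ("more", True)              # a goal to maximize
    if "-" in marks:
        return ("less", True)              # a goal to minimize
    if "$" in marks:
        return ("num", True)               # non-goal number
    return ("term", False)                 # non-goal non-number

roles = {n: column_role(n)
         for n in ["outlook", "?+$temperature", "-$humidity", "windy", "=play"]}
```

Here `roles["-$humidity"]` is `("less", True)` and `roles["?+$temperature"]` is `None`, because the `?` mark wins over the others.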
For example, consider the weather dataset:
    #data/weather1.csv
    outlook,            # forecast
    ?+$temperature,     # degrees Fahrenheit
    -$humidity,         # % of dewpoint
    windy,              # boolean
    =play               # goal
    #################################################
    sunny    ,85 ,90 ,FALSE ,no
    sunny    ,80 ,90 ,TRUE  ,no
    overcast ,83 ,86 ,FALSE ,yes
    rainy    ,70 ,96 ,FALSE ,yes
    rainy    ,68 ,80 ,FALSE ,yes
    rainy    ,65 ,?  ,TRUE  ,no
    overcast ,64 ,65 ,TRUE  ,yes
    sunny    ,72 ,?  ,FALSE ,no
    sunny    ,69 ,70 ,FALSE ,yes
    rainy    ,75 ,80 ,FALSE ,yes
    sunny    ,75 ,70 ,TRUE  ,yes
    overcast ,72 ,90 ,TRUE  ,yes
    overcast ,81 ,75 ,FALSE ,yes
    rainy    ,71 ,90 ,TRUE  ,no
The header can be spread over several lines, one column per line, each with a trailing comment:
    outlook,            # forecast
    ?+$temperature,     # degrees Fahrenheit
    -$humidity,         # % of dewpoint
    windy,              # boolean
    =play               # goal
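One way to read a file laid out like this is to strip the `#` comments, drop whitespace, and join any line that ends in a comma onto the next one. The sketch below illustrates that idea under those assumptions; `read_rows` and `SAMPLE` are hypothetical names, and removing all whitespace is a simplification that would also remove spaces inside a cell.

```python
import re

def read_rows(text):
    """Yield one list of cells per record.

    Assumptions: '#' starts a comment, and a line whose remaining content
    ends with ',' continues on the next line.
    """
    pending = ""
    for line in text.splitlines():
        line = re.sub(r"#.*", "", line)    # drop the comment, if any
        line = re.sub(r"\s+", "", line)    # drop all whitespace
        if not line:
            continue                       # blank or comment-only line
        pending += line
        if pending.endswith(","):
            continue                       # record continues on the next line
        yield pending.split(",")
        pending = ""

SAMPLE = """#data/weather1.csv
outlook,            # forecast
?+$temperature,     # degrees Fahrenheit
-$humidity,         # % of dewpoint
windy,              # boolean
=play               # goal
#################################################
sunny    ,85 ,90 ,FALSE ,no
rainy    ,65 ,?  ,TRUE  ,no
"""

rows = list(read_rows(SAMPLE))
header, data = rows[0], rows[1:]
```

With this sample, `header` holds the five column names with their marks intact, and each entry in `data` is a list of five strings. Missing values stay as the string `?` and numbers are not converted, so any type handling is left to a later step.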