Configuration
Each configuration parameter is documented with its name, a description, and, where one exists, a default value in parentheses.
Value is true if input format is tabular, false if text (true)
feature.schema.file.path Schema JSON file HDFS path
Cost values for cost based arbitrator
Class attribute values. Generally these are obtained from the schema
Class attribute probability difference threshold for prediction purposes (-1)
Set to true if only feature probability needs to be output (false)
Batch size for random shuffling (1000)
Attribute ordinal list for the first set
Attribute ordinal list for the second set
Schema JSON file HDFS path
Attribute selection strategy for splitting. Choices are 1. userSpecified (specified by the user) 2. all (all attributes; an attribute may be split multiple times at different levels) 3. notUsedYet (only attributes not split yet will be split) 4. random (attributes will be selected randomly)
Splitting algorithms. Choices are 1. entropy 2. giniIndex 3. hellingerDistance 4. classConfidenceRatio (giniIndex)
Set to true if additional split probability is to be output (false)
Parent node information content
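
To illustrate the `entropy` and `giniIndex` choices above, here is a minimal sketch using the textbook definitions of both measures. The function names are hypothetical and are not taken from this project; the project's own implementation may differ in details such as scaling.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def gini_index(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def weighted_split_score(partitions, measure):
    """Size-weighted average impurity over the partitions a split produces."""
    total = sum(len(p) for p in partitions)
    return sum(len(p) / total * measure(p) for p in partitions)

# A split that separates the classes perfectly has zero weighted impurity.
parent = ["yes", "yes", "no", "no"]
split = [["yes", "yes"], ["no", "no"]]
gain = gini_index(parent) - weighted_split_score(split, gini_index)
```

The difference between the parent node's impurity and the weighted impurity of its partitions is the quantity a splitting algorithm typically tries to maximize.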
Schema JSON file HDFS path
Ordinal list for source attributes
Ordinal list for destination attributes
Correlation scale (1000)
Heterogeneity algorithm. Choices are 1. gini 2. uncertainty
Schema JSON file HDFS path
Set to true to output mutual info (false)
Mutual info scoring algorithms. Choices are 1. mutual.info.maximization 2. mutual.info.selection 3. joint.mutual.info 4. double.input.symmetric.relevance 5. min.redundancy.max.relevance (mutual.info.maximization)
Redundancy factor for the algorithm mutual.info.selection (1.0)
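
All of the scoring algorithms listed above build on mutual information between attributes. As a rough sketch, assuming the standard definition over discrete values, the basic quantity can be computed as follows. The function name is hypothetical, and the individual scoring algorithms combine this quantity in ways not shown here.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # joint value counts
    px = Counter(xs)               # marginal counts for the first sequence
    py = Counter(ys)               # marginal counts for the second sequence
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# A feature that fully determines a balanced binary class carries 1 bit.
feature = [0, 0, 1, 1]
label = ["no", "no", "yes", "yes"]
score = mutual_information(feature, label)
```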
Class attribute ordinal
Sampling batch size (500)
Feature conditional probability input file prefix (condProb)
Set to true if in validation mode (false)
Set to true if class conditional probability weighting is to be applied (false)
The mode of prediction. Choices are 1. classification 2. regression (classification)
The method of regression. Choices are 1. average 2. median 3. linearRegression 4. multiLinearRegression (average)
Number of nearest neighbors (10)
Type of kernel function for calculating score distance. Choices are 1. none 2. linearMultiplicative 3. linearAdditive 4. gaussian 5. sigmoid (none)
Parameter associated with kernel function (-1)
Set to true if class conditional probability distribution is to be output (false)
Set to true if score is to be inverse distance weighted (false)
Threshold value for score ratio threshold based classification (-1.0)
Set to true for cost based classifier (false)
Comma-separated class attribute values. Needed if use.cost.based.classifier is true and prediction.mode is classification
Misclassification cost. Needed if use.cost.based.classifier is true and prediction.mode is classification
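
The sketch below shows how inverse distance weighting and a cost-based decision could fit together in a nearest neighbor classifier. It is an illustration under assumptions, not this project's implementation: the function names, the `(distance, label)` input shape, and the cost dictionary keyed by `(actual, predicted)` are all hypothetical.

```python
from collections import defaultdict

def class_distribution(neighbors, inverse_distance_weighted=False):
    """Normalized class scores from a list of (distance, class_label) neighbors."""
    scores = defaultdict(float)
    for distance, label in neighbors:
        # Closer neighbors count more when weighting is on; guard against zero distance.
        weight = 1.0 / max(distance, 1e-9) if inverse_distance_weighted else 1.0
        scores[label] += weight
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

def min_cost_class(distribution, cost):
    """Pick the class whose prediction has the lowest expected misclassification cost."""
    labels = list(distribution)

    def expected_cost(predicted):
        return sum(distribution[actual] * cost.get((actual, predicted), 0.0)
                   for actual in labels if actual != predicted)

    return min(labels, key=expected_cost)

neighbors = [(1.0, "yes"), (4.0, "no"), (4.0, "no")]
plain = class_distribution(neighbors)
weighted = class_distribution(neighbors, inverse_distance_weighted=True)
# Missing a true "yes" is assumed to be five times as costly as a false alarm.
cost = {("yes", "no"): 5.0, ("no", "yes"): 1.0}
decision = min_cost_class(plain, cost)
```

With plain voting the majority class is "no", but the asymmetric cost still tips the decision to "yes", which is the point of a cost-based classifier.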
Number of fields to skip from the beginning (0)
Sub field delimiter between state and observation
Set to true if only some of the observations are tagged with states
Window function when partially.tagged is true
List of states
List of observations
Number of fields to skip from the beginning (0)
Comma-separated list of states
Transition probability scale
Number of fields to skip from the beginning (1)
Id field ordinal (0)
Set to true if only states need to be output (true)
Sub field delimiter (:)
HMM file HDFS path
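
Producing a state sequence from observations and an HMM is commonly done with the Viterbi algorithm. The page does not say which decoder is used here, so the following is only a generic sketch of that technique. The function name and the dictionary-based model layout are assumptions and do not reflect the HMM file format.

```python
def viterbi(observations, states, start_prob, trans_prob, emit_prob):
    """Most likely hidden state sequence for a sequence of observations."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_prob[s] * emit_prob[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] * trans_prob[p][s])
            scores[s] = best[-1][prev] * trans_prob[prev][s] * emit_prob[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return path[::-1]

states = ["Healthy", "Fever"]
start = {"Healthy": 0.6, "Fever": 0.4}
trans = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
         "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
        "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}
decoded = viterbi(["normal", "cold", "dizzy"], states, start, trans, emit)
```

For long sequences, a practical implementation would work with log probabilities to avoid numeric underflow.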
Current round number
Auer deterministic algorithm. Choices are 1. AuerUBC1 (AuerUBC1)
Count field ordinal
Reward field ordinal
HDFS path for file containing group batch size
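
Assuming the `AuerUBC1` choice refers to the UCB1 bandit algorithm of Auer et al., the selection step can be sketched from per-item trial counts and accumulated rewards, which is what the count and reward fields appear to supply. The function name is hypothetical, and rewards are assumed to be scaled to the range 0 to 1.

```python
import math

def ucb1_select(counts, rewards):
    """Index of the item to try next, given trial counts and total rewards per item."""
    # Every item is tried once before the confidence bound is used.
    for item, n in enumerate(counts):
        if n == 0:
            return item
    total = sum(counts)

    def upper_bound(i):
        mean = rewards[i] / counts[i]
        return mean + math.sqrt(2.0 * math.log(total) / counts[i])

    return max(range(len(counts)), key=upper_bound)

# The rarely tried second item wins despite its lower average reward.
choice = ucb1_select([100, 1], [60.0, 0.5])
```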
Current round number
Initial probability for random selection (0.5)
Random selection probability reduction algorithm. Choices are 1. linear 2. logLinear 3. AuerGreedy (linear)
Random selection probability reduction constant (1.0)
Count field ordinal
Reward field ordinal
Auer greedy constant. Needed when prob.reduction.algorithm is AuerGreedy (5.0)
HDFS path for file containing group batch size
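
The parameters above describe a greedy selection whose random exploration probability shrinks as rounds progress. The page does not give the reduction formulas, so this sketch assumes that `linear` means the initial probability times the reduction constant divided by the round number. That formula and all function names are assumptions for illustration only.

```python
import random

def exploration_probability(initial_prob, reduction_constant, round_num):
    """Assumed linear schedule: initial_prob * reduction_constant / round_num, capped at 1."""
    return min(1.0, initial_prob * reduction_constant / round_num)

def select_item(mean_rewards, explore_prob, rng):
    """Pick a random item with probability explore_prob, else the best item so far."""
    if rng.random() < explore_prob:
        return rng.randrange(len(mean_rewards))
    return max(range(len(mean_rewards)), key=lambda i: mean_rewards[i])

rng = random.Random(42)
mean_rewards = [0.1, 0.9, 0.3]
for round_num in (1, 10, 100):
    prob = exploration_probability(0.5, 1.0, round_num)
    item = select_item(mean_rewards, prob, rng)
```

Other reduction schedules would replace only `exploration_probability`; the selection step stays the same.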
Current round number
Strategy for exploration counts. Choices are 1. simple 2. pac (simple)
Exploration count factor. Needed when exploration.count.strategy is simple (2)
Reward difference. Needed when exploration.count.strategy is pac (0.2)
Probability difference. Needed when exploration.count.strategy is pac (0.2)
HDFS path for file containing group batch size
Current round number
Temperature constant (1.0)
Item trial count ordinal
Item reward ordinal
HDFS path for file containing group batch size
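
A temperature constant usually points to softmax (Boltzmann) selection, where each item is chosen with probability proportional to the exponential of its average reward divided by the temperature. Assuming that is the intent here, a minimal sketch looks like this. The function names are hypothetical.

```python
import math
import random

def softmax_probabilities(mean_rewards, temperature=1.0):
    """Selection probability per item; lower temperature favors the best item more."""
    peak = max(mean_rewards)  # subtracted for numeric stability, does not change the result
    weights = [math.exp((r - peak) / temperature) for r in mean_rewards]
    total = sum(weights)
    return [w / total for w in weights]

def sample_item(mean_rewards, temperature, rng):
    """Draw one item index according to the softmax probabilities."""
    probs = softmax_probabilities(mean_rewards, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

mean_rewards = [0.2, 0.8, 0.5]
warm = softmax_probabilities(mean_rewards, temperature=1.0)
cold = softmax_probabilities(mean_rewards, temperature=0.1)
picked = sample_item(mean_rewards, 1.0, random.Random(7))
```

A high temperature spreads selection almost evenly across items, while a low one makes the choice nearly greedy.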
Concurrency for spout (1)
Concurrency for bolts (1)
Number of worker processes (1)
Redis server host
Redis server port
Redis event queue
Redis reward queue
Interval for log messages
Reinforcement learning algorithm
Comma-separated list of the learner's actions
Action output Redis queue