Tagging should take into account confidence of input attributes #7

vaclavbartos · 2017-07-15T14:12:20Z

Currently, tagging rules ("conditions") assume input attributes to be binary (i.e. strictly true or false, with no confidence value). The confidence of the output can only be set by explicitly multiplying input values by numbers.

Support should be added for inputs that carry a confidence value. A term in a condition that is based on an attribute with confidence c will then also have confidence c (which can be further modified by arithmetic operations, of course). For example:

Record contains: "hostname_class": {v: "dynamic", c: 0.8}
Tag condition: 0.9*('dynamic' in hostname_class)
Result: tag is set with confidence = 0.72

Of course, if the input attribute is simple, i.e. it has no confidence value assigned, confidence = 1 is assumed.
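To make the intended behaviour concrete, here is a minimal Python sketch of how a single term could be evaluated. The helper name `term_confidence` and the dict layout with `"v"` and `"c"` keys are assumptions taken from the example record above, not the actual implementation or the final data model.

```python
def term_confidence(attr, wanted):
    """Return the confidence of the term (wanted in attr).

    Assumed storage (hypothetical): either {"v": value, "c": confidence}
    or a plain value / collection of values without confidence.
    """
    if isinstance(attr, dict) and "c" in attr:
        value, conf = attr["v"], attr["c"]
    else:
        value, conf = attr, 1.0  # plain attribute: confidence 1 is assumed

    if isinstance(value, (list, tuple, set)):
        present = wanted in value
    else:
        present = (wanted == value)

    # A term that does not hold has confidence 0, like a binary "false".
    return conf if present else 0.0


record = {"hostname_class": {"v": "dynamic", "c": 0.8}}

# Tag condition: 0.9*('dynamic' in hostname_class)
tag_conf = 0.9 * term_confidence(record["hostname_class"], "dynamic")
print(round(tag_conf, 2))  # 0.72
```

The result is rounded only for display, since the floating-point product is not exactly 0.72.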

Confidence of a combination of individual terms in a condition is computed as follows:

- Arithmetic operations behave normally.
- Logical and: A and B = A * B
  - Example:
    - Rule: ('dynamic' in hostname_class) and bl.tor --> tag "dynamic_tor"
    - Inputs: ('dynamic' in hostname_class).c = 0.9, bl.tor = 1.0 (implicitly, since blacklists have no confidence values)
    - Result: dynamic_tor.c = 0.9
- Logical or: A or B = 1 - ((1 - A) * (1 - B))
  - Example:
    - Rule: 0.9*('dynamic' in hostname_class) or 0.2*('dsl' in hostname_class) --> tag "dynamic"
    - Inputs: ('dynamic' in hostname_class).c = 1.0, ('dsl' in hostname_class).c = 1.0
    - Result: dynamic.c = 0.9*1.0 or 0.2*1.0 = 0.9 or 0.2 = 1 - (0.1 * 0.8) = 0.92
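The two combination rules can be sketched in Python as follows. The function names `conf_and` and `conf_or` are hypothetical and only illustrate the formulas above. Both formulas are what one gets when A and B are treated as probabilities of independent events.

```python
def conf_and(a, b):
    """Logical and: both terms must hold, so confidences multiply."""
    return a * b


def conf_or(a, b):
    """Logical or: 1 minus the chance that neither term holds."""
    return 1 - (1 - a) * (1 - b)


# Example for "and": ('dynamic' in hostname_class).c = 0.9, bl.tor = 1.0
dynamic_tor = conf_and(0.9, 1.0)

# Example for "or": 0.9*1.0 or 0.2*1.0
dynamic = conf_or(0.9 * 1.0, 0.2 * 1.0)

print(round(dynamic_tor, 2), round(dynamic, 2))  # 0.9 0.92
```

With inputs restricted to 0 and 1, both functions reduce to the ordinary boolean operators, so existing binary rules would keep their current results.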

See Data model for the proposed specification of how values with confidence should be stored in the database. The tagging scheme should automatically recognize whether a given input value is plain or has a confidence (by its data type and the presence of a ".c" attribute).

Note: The definition of confidence combinations may change. I'll need to prepare some real use cases and find out whether these operations are OK. So, do other issues first. I wrote this so you have an idea of what you will work on in the future.


vaclavbartos · 2023-04-28T08:47:35Z

Closing ages-old issues.

vaclavbartos added the enhancement label Jul 15, 2017

vaclavbartos assigned jakub-jancicka Jul 15, 2017

vaclavbartos unassigned jakub-jancicka Apr 28, 2023

vaclavbartos closed this as completed Apr 28, 2023
