#Echelon This file explains how to write an algorithm module, test it using test_algorithm.py, and run it using run_algorithms.py.
##Installation:
Dependencies:
-
Tested on Python v2.7
-
Python NLTK
##Usage:
- The 'statement' class in parse_init.py:
The statement class (an object of this class will be referred to from now on as statement) is used to create objects that hold information about a statement. A statement is defined as something a user sends as a message to other users, by hitting the enter key.
Each statement contains useful information, that can be accessed directly by using statement.attribute. For more information about what attributes can be accessed, please refer to the parse_input.py module.
-
How to write your algorithm:
i) Your first step will be to store a list of statements in a list, say, stat_list. Each statement has an attribute alg_lambda. You are only concerned with populating those alg_lambdas, whose issued_by is unknown('$$$').
ii) For each unknown statement in stat_list, your algorithm must assign certain values, called lambdas, to each user. A lambda is an indicator of the probability that that user has of being the issuing user. However it is NOT the probability. The value of a lambda will be 1 if the algorithm is completely ambivalent about that user, i.e, it has nothing to say about the possibility of that user being the issuing user. However if the algorithm is reasonably sure that it is at least twice as probable that user1 issued the statement, then user1's lambda will be 2. Similarly if the algorithm thinks that it is twice as probable that user2 did not issue the statement, then user2's lambda will be 1/2.
iii) The lambdas are used when calculating the final probability that user1 issued that statement. It is done by multiplying the lambdas obtained by different algorithms for user1. For example, if
(This is not programming. Read it like a maths text)
lambda_alg1 = 1 (alg1 is ambivalent) lambda_alg2 = 1.5 (alg2 is a little more certain that user1 issued it) lambda_alg3 = 0.9 (alg3 is a little certain that user1 did not issue it) lambda_alg4 = 0.01 (alg4 is almost certain that user1 did not issue it)
Finally,
lamda = lamda_alg1 * lamda_alg2 * lamda_alg3 * lamda_alg4 = 0.135
The final lamda obtained indicates a low, but not zero, possibility that user1 issued the statement.
iv) If you need more attributes than the statement class provides, you can derive a new class from the statement class to suit your needs. See the alg_line_context.py module for an example of this.
v) You must have a function in your algorithm module called run(). This
function is used to interface with the other modules, so it must conform to the following specifications:
a) It must take a single argument, a list of statements.
b) It need not return anything. However if it does, it will be
ignored.
c) For each unknown statement, it must either populate a user-lambda
dictionary as specified above, or assign the value {}.
Failure to conform to these specifications may result in any number of errors.
- How to test your algorithm:
In the module settings.py, import your algorithm module (alg_my_algorithm) and set the value of test_alg to alg_my_algorithm.
In bash, run:
python test_algorithm.py filename | less
-
How to optimise your algorithm:
i) Choose the parameters that you would like to vary, the range through which they must vary, and the steps. ii) Your algorithm must have the list 'params'. Make sure that the variables you have chosen as the parameters of your algorithm are assigned to values in params before every run. iii) Corresponding to each param in params, populate the variable 'ranges' with tuples of the range for each parameter. iv) Similarly, populate the variable 'steps' with how fast you'd like to step through the range.
HINT: Look at algorithms/alg_bracket.py or algorithms/alg_line_context.py for an example.
-
How to run algorithms:
Supposing that your algorithm module is named 'alg_my_algorithm.py'. In the module 'settings.py', add the lines:
import alg_my_algorithm
alg_list += alg_my_algorithm
Notice there's no '.py' extension when importing.
Now run your algorithm from bash using:
python run_algorithms.py filename | less
Warning: This feature is still under development. This will use all algorithms specified in settings.py. For testing your algorithm only, use test_algorithm.py(3). For optimising your algorithm, use optimise.py(4).