## **GML (Gradual Machine Learning) Framework Documentation**


When using the GML framework for inference, users need to provide the following three kinds of data:
1. Variables
2. Features
3. Easy instances (optional, since easy variables can also be marked directly in the variables)

**The main data structures are defined as follows:**

In [None]:
# 1. Variable data structure definition
variables = list()  # The element type is dict: variable
variable = {
  # User-provided properties
  'var_id': 1,                  # Variable ID (int); IDs start from 0
  'is_easy': False,            # Whether the variable is easy to infer
  'is_evidence': True,          # Whether the variable is evidence
  'true_label': 1,             # True label of the variable
  'label': 1,                 # Inferred label of the variable, 0 is negative, 1 is positive, -1 is unknown
  'feature_set':               # Feature set of the variable
  { 
    feature_id1: [theta1, feature_value1],
    feature_id2: [theta2, feature_value2],
    ...
  },
  # System-generated properties
  'probability': 0.99,          # Probability of the variable (inferred)
  'evidential_support': 0.3,     # Evidential support of the variable
  'entropy': 0.4,               # Entropy of the variable
  'approximate_weight': 0.3,     # Approximate weight of the variable
  ...
}
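
As an illustration of the structure above, the following minimal sketch builds a small `variables` list with concrete values. All IDs, labels, and numbers are made up for the example, and `make_variable` is a hypothetical helper, not part of the gml module. Only the user-provided keys are filled in; the system-generated keys are left for the framework.

```python
def make_variable(var_id, true_label, feature_set, is_evidence=False, label=-1):
    """Hypothetical helper: build one variable dict with the user-provided keys."""
    return {
        'var_id': var_id,
        'is_easy': False,
        'is_evidence': is_evidence,
        'true_label': true_label,
        'label': label,              # -1 means the label is still unknown
        'feature_set': feature_set,  # feature_id -> [theta, feature_value]
    }

# The theta and feature_value numbers below are invented for illustration.
variables = [
    make_variable(0, 1, {0: [0.9, 1.0]}, is_evidence=True, label=1),
    make_variable(1, 0, {0: [0.2, 1.0], 1: [0.5, 0.3]}),
    make_variable(2, 1, {1: [0.7, 0.8]}),
]

# Variables whose label is still unknown are the ones left to infer.
unlabeled = [v['var_id'] for v in variables if v['label'] == -1]
print(unlabeled)
```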

In [None]:
# 2. Feature data structure definition
features = list()  # The element type is dict: feature
feature = {
  # User-provided properties
  'feature_id': 1,  # Feature ID (int); IDs start from 0
  'feature_type': 'unary_feature/binary_feature',  # Either 'unary_feature' (single-factor) or 'binary_feature' (two-factor); these are the two types currently supported.
  'feature_name': 'good',  # Feature name. For a token feature, this is the token itself; for other feature types, it is the name of that type.
  'weight': {
    var_id1: [weight_value1, feature_value1],  # unary_feature
    (var_id3, var_id4): [weight_value2, feature_value2],  # binary_feature
    ...
  },
  # System-generated properties
  'tau': 0,
  'alpha': 0,
  'regression': object,
  ...
}
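
Because unary features key their `weight` entries by a single `var_id` and binary features key them by a `(var_id, var_id)` tuple, it can help to validate the data before inference. The sketch below is one possible check, assuming only the keys shown above; `check_features` and the feature name `'same_sentence'` are hypothetical and not part of the gml module.

```python
def check_features(variables, features):
    """Return a list of problems; an empty list means the data is consistent."""
    var_ids = {v['var_id'] for v in variables}
    problems = []
    for f in features:
        expected = 2 if f['feature_type'] == 'binary_feature' else 1
        for key in f['weight']:
            # Normalize unary keys (a bare int) to a 1-tuple.
            ids = key if isinstance(key, tuple) else (key,)
            if len(ids) != expected:
                problems.append((f['feature_id'], key, 'wrong key arity'))
            for var_id in ids:
                if var_id not in var_ids:
                    problems.append((f['feature_id'], key, 'unknown var_id'))
    return problems

variables = [{'var_id': 0}, {'var_id': 1}]
features = [
    {'feature_id': 0, 'feature_type': 'unary_feature', 'feature_name': 'good',
     'weight': {0: [0.5, 1.0], 1: [0.5, 1.0]}},
    {'feature_id': 1, 'feature_type': 'binary_feature', 'feature_name': 'same_sentence',
     'weight': {(0, 1): [0.3, 1.0], (1, 2): [0.3, 1.0]}},
]

problems = check_features(variables, features)
print(problems)  # the pair (1, 2) refers to a var_id that does not exist
```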


In [None]:
# 3. Easy instance data structure definition
# Users can also mark easy instances directly in variables without providing this file
easys = list()  # The element type is dict: easy
easy = {
    'var_id': 0,
    'label': 1
}
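
To show how the `easys` list relates to `variables`, here is a minimal sketch that copies each easy label onto the variable with the same `var_id`. `apply_easys` is a hypothetical helper, and treating easy instances as evidence is an assumption made for the example; the module's own labeling methods may behave differently.

```python
def apply_easys(variables, easys):
    """Hypothetical helper: copy easy labels onto the matching variables."""
    by_id = {v['var_id']: v for v in variables}
    for easy in easys:
        var = by_id[easy['var_id']]
        var['is_easy'] = True
        var['label'] = easy['label']
        var['is_evidence'] = True  # assumption: easy instances serve as evidence
    return variables

variables = [
    {'var_id': 0, 'is_easy': False, 'is_evidence': False, 'label': -1},
    {'var_id': 1, 'is_easy': False, 'is_evidence': False, 'label': -1},
]
easys = [{'var_id': 0, 'label': 1}]

apply_easys(variables, easys)
print([(v['var_id'], v['is_easy'], v['label']) for v in variables])
```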


**After preparing the required data, the main calling process is as follows:**

In [None]:
from gml import GML
from easy_instance_labeling import EasyInstanceLabeling
from evidential_support import EvidentialSupport
from evidence_select import EvidenceSelect
from approximate_probability_estimation import ApproximateProbabilityEstimation

EasyInstanceLabeling(variables, features, easys).label_easy_by_file()

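# evidential_support_method, approximate_probability_method and
# evidence_select_method must be defined before this call.
graph = GML(
    variables, features,
    evidential_support_method, approximate_probability_method, evidence_select_method,
    top_m=2000, top_k=10, update_proportion=0.01, balance=False,
)
graph.inference()  # Run factor graph inference
graph.score()  # Get the inference results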


**The detailed functional design of the main classes and functions in the current gml module is as follows:**

In [None]:
# GML class
class GML:
    def __init__(self, variables, features, evidential_support_method, approximate_probability_method, evidence_select_method, top_m=2000, top_k=10, update_proportion=0.01, balance=False):
        self.variables = variables
        self.features = features
        self.evidential_support_method = evidential_support_method
        self.approximate_probability_method = approximate_probability_method
        self.evidence_select_method = evidence_select_method
        self.top_m = top_m
        self.top_k = top_k
        self.update_proportion = update_proportion
        self.balance = balance

    # Main functions in GML class
    def evidential_support(self):
        # Calculate evidential support
        pass

    def approximate_probability_estimation(self):
        # Calculate approximate probability
        pass

    def select_top_m_by_es(self):
        # Select the top m variables by evidential support
        pass

    def select_top_k_by_entropy(self):
        # Select the top k variables by entropy
        pass

    def select_evidence(self):
        # Select evidence
        pass

    def construct_subgraph(self):
        # Construct subgraph
        pass

    def inference_subgraph(self):
        # Run inference on the subgraph
        pass

    def label(self):
        # Select one variable from the top k to label
        pass

    def inference(self):
        # Inference
        pass
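
The class above lists `select_top_m_by_es` and `select_top_k_by_entropy` among its steps without showing their bodies. The sketch below is one plausible reading of those two steps, not the module's implementation: it assumes that candidates are the non-evidence variables, that higher evidential support is preferred, that lower entropy is preferred, and that entropy is the binary Shannon entropy of `probability`. `binary_entropy` and `select_candidates` are hypothetical names.

```python
import math

def binary_entropy(p):
    """Shannon entropy (in bits) of a binary variable with P(label=1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def select_candidates(variables, top_m, top_k):
    """Assumed rule: keep the top_m by evidential support, then the top_k by lowest entropy."""
    unlabeled = [v for v in variables if not v['is_evidence']]
    by_support = sorted(unlabeled, key=lambda v: v['evidential_support'],
                        reverse=True)[:top_m]
    for v in by_support:
        v['entropy'] = binary_entropy(v['probability'])
    return sorted(by_support, key=lambda v: v['entropy'])[:top_k]

# Invented values for illustration only.
variables = [
    {'var_id': 0, 'is_evidence': True,  'evidential_support': 0.0, 'probability': 1.0},
    {'var_id': 1, 'is_evidence': False, 'evidential_support': 0.9, 'probability': 0.95},
    {'var_id': 2, 'is_evidence': False, 'evidential_support': 0.8, 'probability': 0.55},
    {'var_id': 3, 'is_evidence': False, 'evidential_support': 0.1, 'probability': 0.99},
]

chosen = select_candidates(variables, top_m=2, top_k=1)
print([v['var_id'] for v in chosen])
```

With these values, variable 3 is dropped in the first step despite its confident probability, because its evidential support is the lowest of the three candidates.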

In [None]:
# Easy instance labeling class
class EasyInstanceLabeling:
    def __init__(self, variables, features, easys):
        self.variables = variables
        self.features = features
        self.easys = easys

    # Currently provides the following methods:
    def label_easy_by_file(self):
        # Read the file to label easy instances
        pass

    def label_easy_by_clustering(self):
        # Method for labeling easy instances in entity recognition
        pass

    def label_easy_by_custom(self):
        # User-implemented labeling method
        pass


In [None]:
# Approximate probability estimation class
class ApproximateProbabilityEstimation:
    def __init__(self, variables):
        self.variables = variables

    # Currently provides the following methods:
    def approximate_probability_estimation_by_interval(self):
        # Used for entity recognition
        pass

    def approximate_probability_estimation_by_relation(self):
        # Used for sentiment analysis
        pass

    def approximate_probability_estimation_by_custom(self):
        # User-implemented estimation method
        pass
