## Notes
In this experiment, we tested the following approaches:
1. local dictionary
2. local dictionary + on multilingual word embeddings on unsupervised learning 
3. multilingual language models on supervised learning

CMFD2 is saved in python package format [cmfd](https://pypi.org/project/cmfd/)

In [11]:
# %pip install cmfd
# %pip install jieba
import pandas as pd
import re
import cmfd

Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.480 seconds.
Prefix dict has been built successfully.


In [None]:
# load the bm file to save the results
bm = pd.read_csv('path to BM.csv', dtype=str)

## cmfd2 - frequency

In [None]:
# 1. Load the dataset
df = pd.read_csv('path to BM.cs', dtype=str)
texts = df['text'].tolist()

result_list = []

# 2. Calculate the moral quantity of each text using the CMFD model
for text in texts :
    result = cmfd.moral_quantity(text, duplicate=False, with_word=True)
    result_list.append(result)

print(len(result_list))

# 3. clean the result 
flattened_data_list = []
for result in result_list:
    flattened_data = {}
    if result is None:
        fixed_dict = {
            ('altr',): {'num': 0.0, 'word': ''}, 
            ('auth',): {'num': 0.0, 'word': ''}, 
            ('care',): {'num': 0.0, 'word': ''}, 
            ('dili',): {'num': 0.0, 'word': ''}, 
            ('fair',): {'num': 0.0, 'word': ''}, 
            ('general',): {'num': 0.0, 'word': ''}, 
            ('libe',): {'num': 0.0, 'word': ''}, 
            ('loya',): {'num': 0.0, 'word': ''}, 
            ('mode',): {'num': 0.0, 'word': ''}, 
            ('resi',): {'num': 0.0, 'word': ''}, 
            ('sanc',): {'num': 0.0, 'word': ''}, 
            ('wast',): {'num': 0.0, 'word': ''}
        }
        for key, sub_dict in fixed_dict.items():
            for sub_key, value in sub_dict.items():
                new_key = f"{key[0]}_{sub_key}"
                flattened_data[new_key] = value
    else:
        for key, sub_dict in result.items():
            for sub_key, value in sub_dict.items():
                new_key = f"{key[0]}_{sub_key}"
                flattened_data[new_key] = value
    flattened_data_list.append(flattened_data)
        
# # Convert the flattened dictionary to a DataFrame
cmfd2 = pd.DataFrame(flattened_data_list)
# add the original text column
cmfd2['text'] = texts

cmfd2_clean = cmfd2[['text', 'care_num', 'auth_num', 'fair_num', 'loya_num', 'sanc_num', 'care_word', 'auth_word', 'fair_word', 'loya_word', 'sanc_word','general_num', 'general_word']]
new_column_names = {
    'text': 'text',
    'care_num': 'care',
    'auth_num': 'auth',
    'fair_num': 'fair',
    'loya_num': 'loya',
    'sanc_num': 'sanc',
    'care_word': 'care_word',
    'auth_word': 'auth_word',
    'fair_word': 'fair_word',
    'loya_word': 'loya_word',
    'sanc_word': 'sanc_word',
    'general_num': 'general_num',
    'general_word': 'general_word'
}

# Rename the columns
cmfd2_clean = cmfd2_clean.rename(columns=new_column_names)

# Define the columns to check
columns_to_check = ['care', 'auth', 'fair', 'loya', 'sanc']

# Function to determine the value for the new column
def determine_mfd2(row):
    values = row[columns_to_check]
    if values.max() == 0:
        return "non_moral"
    else:
        max_value = values.max()
        max_columns = [col for col in columns_to_check if row[col] == max_value]
        return ', '.join(max_columns)

# Apply the function to each row and create the new column
bm['mfd2'] = cmfd2_clean.apply(determine_mfd2, axis=1)

# save the result
cmfd2_clean.to_csv('raw_result_cmfd2.csv', index=False)

3087


## cmfd2 - multilingual word embeddings 

download the local language model from [fastText](https://fasttext.cc/docs/en/crawl-vectors.html) 

1. Each foundation (or "concept") is described by a list of keywords in C-MFD 2.0. 
2. The vector describing the concept is the average of all embeddings for these keywords.
3. A document is transformed into a vector by tokenizing and averaging the embedding of these tokens. This is called the document embedding.
4. To assess the relevance of a document with a concept, we consider the cosine similarity between their embeddings.

code is adapted from Josh
`python score_mf_ddr.py --data $DATA_DIR --text_col $TEXT_COL --verbose $VERBOSE --output $OUTPUT_FILE`

Print progress: `VERBOSE=1`

1. use the customized dictionary `customized.csv`
2. code is running in the remote server for memory sake

Notes

1. download `score_mf_ddr.py` from Josh's [repo](https://github.com/joshnguyen99/moral_axes/blob/main/scripts/utils/create_concepts.py)
2. comment out line 5, and add `make_concepts_from_lexicon` function in line 31-79 from [`create_concepts.py`](https://github.com/joshnguyen99/moral_axes/blob/main/scripts/utils/create_concepts.py)
3. revise line 25-27 to correctly read chinese embeddings model from FastText, with genism (this cost 2 hours to debug)
4. add line 39-50 to load cmfd2 dictionary
5. line 36, specify the tokenizer in spacy to Chinese
6. line 116, removed to_lower() function 
7. save the revised document as `score_mf_ddr_revise.py`

run the following code in the remote server
`python3 score_mf_ddr_revise.py --data BM.csv --text_col text --verbose 1 --output raw_cmfd2_embedding.csv`

## Cmfd 2 - multilingual word ebemddings + FrameAxis

1. prepared customized dataset `customized.csv`
2. `main.py` change
    1. add line 5 to load fasttext embedding data correctly 
    2. corresponding updated line 44
    3. changed the file name to `frameaxis_main.py` to be more descriptive in the oii-server
3. `frameaxis.py` change
    1. add line 5 to load fasttext embedding data correctly, corresponding updated line 44
    2. commented out line 9, 278, 279, no need to preprocess the data
    3. update line 29, the dictionary path
    4. added line 13-16 chinese spacy tokenizer, and update tokenizer in `doc_scores` function, all `split` to `tokenize`
    5. when updating `tokenize` lines in step 4, removed its vocabulary filter on line 250, 267 accordingly
    6. commented out line 298 to include NaN scores. 
    7. added progress bar code in line 264, and `from tqdm import tqdm` at line 4,  change the reporting frequency to every 100 documents in line 265

run the following code in the remote server
` python3 frameaxis_main.py --docs_colname text --input_file /BM.csv --output_file raw_result_cmfd2_frameaxis.csv --dict_type customized --word_embedding_model cc.zh.300.bin `




## cmfd2 - multilingual contextual embeddings - FrameAxis
- [?] do you have a local language dictionary?
- [?] do you have a local language word embedding model?


### 1. prepare dictionary

In [None]:
import pandas as pd
#### format the cmfd2.csv file to align with `mfd_original.csv` file
cmfd = pd.read_csv('/cmfd2.csv')
# only keep big-5 categories
cmfd = cmfd[cmfd['category'].isin(['care', 'auth', 'fair', 'loya', 'sanc'])]

# have to add a sentiment column for frame axis computation, used the "liam168/c2-roberta-base-finetuned-dianping-chinese" model for sentiment 

### load the model to add the sentiment layer
from transformers import AutoModelForSequenceClassification , AutoTokenizer, pipeline
import torch

ts_texts = cmfd['word'].tolist()
model_name = "liam168/c2-roberta-base-finetuned-dianping-chinese"
class_num = 2

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=class_num)
tokenizer = AutoTokenizer.from_pretrained(model_name)

classifier = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

for i in range(len(ts_texts)):
    sentiment_result = classifier(ts_texts[i])[0]
    cmfd.loc[i, 'sentiment'] = sentiment_result['label']
    cmfd.loc[i,'score'] =  sentiment_result['score']

#### map the sentiment label to vice/virtue
mapping = {
    'positive': 'virtue',
    'negative': 'vice'
}
cmfd['sentiment'] = cmfd['sentiment'].map(mapping)

# save the result to customized
cmfd.to_csv('customized.csv', index=False)



### 2. revise the FrameAxis package to use xlm-roberta-base embeddings

note
- to run the frameAxis pacakge, i created a virtue environment on the server `conda create --name virtue python=3.10.14`
- `conda activate virtue` to activate the environment
- `pip install transformers torch scikit-learn` to install necessary packages, for multilingual word embeddings 
- changed `main.py`
    - line 45-47, commmented out. 
    - added line 48-51, to download the model from pre-trained `FacebookAI/xlm-roberta-base` model 
    - changed line 55
- changed `frameAxis.py`
    - line 14-15, changed the model 
    - line 18, changed the class
    - line 19-21, changed word embedding extraction methods 
    - line 74-77, define the new get embedding function 
    - other functions from `model[word]` to `get_embedding(word)` to get the embeddings
    - line 96-97 to flatten the embeddings
    - commented out/deleted line 48-71, line 133-171 useless for customized dictionary, 
    - commented out line 299, we don't preprocess the text, as the preprocess.py file is only for English text. 
    - moved line 21 and to line 40 to use cmfd2 dictionary as the microframe dictionary, not the pretrained googlenews embeddings
    - continue checking: `def _compute_axes(self, mfd)` function is fine
    - continue checking: find the bug, because line 273-278, `baseline_docs` and `tfidf` are all empty in the current setting. 
        - line 286 `doc_tokens` will return empty list, because it still tokenizes the text with english methods. 
        - so i added line 286 `doc_tokens_multilingual = self.tokenizer.tokenize(doc)` and revised line 287 `doc_tokens = [x for x in doc_tokens_multilingual if x in self.vocab]`
    - contiue checking, worked out better but still 1/3 missing values. should not be like this based on this methods - so i deleted the filtering of the tokens in line 286 `doc_tokens = self.tokenizer.tokenize(doc)`. previously, the filtering was based on the english vocab, it wants to filter out the tokens in the document that are not in the `google-news-300` embedding models, but we don't have to do this for the multilingual embeddings. We tokenzie the text with the multilingual tokenizer, and keep them all for the similarity computation. 
    - optional: commented out line 324, depending on whether you want to keep NaN rows. 
    - added progress bar code in line 282, and `from tqdm import tqdm` at line 4
    - change the reporting frequency to every 100 documents in line 283



- run the command line `python main.py --docs_colname text --input_file /home/misinfo/turing_dso_misinfo/llms_mft_multilingual/paper_linked_final/datasets/BM.csv --output_file /home/misinfo/turing_dso_misinfo/llms_mft_multilingual/paper_linked_final/experiment_lan_tool/raw_result_cmfd2_frameaxis.csv --dict_type customized`



## Result Cleaning 

In [None]:
import pandas as pd

cmfd2_sim = pd.read_csv('raw_result_cmfd2_fasttext.csv', index_col=0)
print(cmfd2_sim.columns)    

Index(['text', 'source', 'source_lan', 'source_label', 'google_trans_en',
       'source_dataset', 'care_score', 'fairness_score', 'loyalty_score',
       'authority_score', 'sanctity_score'],
      dtype='object')


In [None]:
import pandas as pd
bm = pd.read_csv('BM.csv')

#### Function to determine the moral value for the text, based on the highest score (prob or freq) of the dictionary results
def determine_score(row):
    values = row[columns_to_check]
    if values.max() == 0:
        return "non_moral"
    else:
        max_value = values.max()
        max_columns = [col for col in columns_to_check if row[col] == max_value]
        return ', '.join(max_columns)

# decide the score label for each text 
columns_to_check = ['care', 'auth', 'fair', 'loya', 'sanc']

### Cleaning 1 - the CMFD2.0 python result
# the result might include more than one labels for the classification 
cmfd2_freq = pd.read_csv('/raw_result_cmfd2_freq.csv')
cmfd2_freq['freq_cmfd2'] = cmfd2_freq.apply(determine_score, axis=1)

#append the result to the result datarame
result_cmfd2 = pd.concat([bm, cmfd2_freq[['freq_cmfd2']]], axis=1)


#### cleaning 2 - the CMFD2.0 Josh's simple word-embedding result, with word embedding
cmfd2_sim = pd.read_csv('/raw_result_cmfd2_fasttext.csv')
cmfd2_sim = cmfd2_sim[['text', 'care_score', 'fairness_score', 'loyalty_score', 'authority_score', 'sanctity_score']]
new_column_names_cmfd2_sim = {
    'text': 'text',
    'care_score': 'care',
    'fairness_score': 'fair',
    'loyalty_score': 'loya',
    'authority_score': 'auth',
    'sanctity_score': 'sanc'
}

cmfd2_sim = cmfd2_sim.rename(columns=new_column_names_cmfd2_sim)
cmfd2_sim["cc_cmfd2"] = cmfd2_sim.apply(determine_score, axis=1)
result_cmfd2 = pd.concat([result_cmfd2, cmfd2_sim[['cc_cmfd2']]], axis=1)


#### Cleaning 3 - the CMFD2.0 frameAxis word-embedding result, with word embedding
cmfd2_frameaxis_sim = pd.read_csv('/raw_result_cmfd2_fasttext_frameaxis.csv')
cmfd2_frameaxis_sim = cmfd2_frameaxis_sim[['text', 'intensity_care', 'intensity_auth','intensity_fair', 'intensity_loya', 'intensity_sanc']]

# only use intensity, intensity columns are results from FrameAxis, other columns are from the previous dictionaries
new_column_names_cmfd2_frameaxis = {
    'text': 'text',
    'intensity_care': 'care',
    'intensity_fair': 'fair',
    'intensity_loya': 'loya',
    'intensity_auth': 'auth',
    'intensity_sanc': 'sanc'
}

cmfd2_frameaxis_sim = cmfd2_frameaxis_sim.rename(columns=new_column_names_cmfd2_frameaxis)
cmfd2_frameaxis_sim['cc_frameaxis_cmfd2'] = cmfd2_frameaxis_sim.apply(determine_score, axis=1)
result_cmfd2 = pd.concat([result_cmfd2, cmfd2_frameaxis_sim[['cc_frameaxis_cmfd2']]], axis=1)

#### Cleaning 4 - the CMFD2.0 frameAxis contextual-embedding result, with xml embedding
cmfd2_frameaxis_xml = pd.read_csv('raw_result_cmfd2_xml_frameaxis.csv')
cmfd2_frameaxis_xml = cmfd2_frameaxis_xml[['text', 'intensity_care', 'intensity_auth','intensity_fair', 'intensity_loya', 'intensity_sanc']]
cmfd2_frameaxis_xml = cmfd2_frameaxis_xml.rename(columns=new_column_names_cmfd2_frameaxis)
cmfd2_frameaxis_xml['xml_frameaxis_cmfd2'] = cmfd2_frameaxis_xml.apply(determine_score, axis=1)
result_cmfd2 = pd.concat([result_cmfd2, cmfd2_frameaxis_xml[['xml_frameaxis_cmfd2']]], axis=1)


# save to csv 
result_cmfd2.to_csv('result_cmfd2.csv', index=False)

## Data Analysis

In [8]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

In [None]:
result_cmfd2 = pd.read_csv('result_cmfd2.csv')

##### Define a function to deal with multiple matches for one document ######
def multi_match_split(df, bm_column, predict_column):
    """
    this function will generate two predicted results:
    fuzzy match means it will be a match if there is at least one value matched in the prediction (more true values)
    exact match means it will only be a match if the whole value is matched in the prediction (fewer true values)
    for non-moral values, it will remain as "non_moral" and counted in the coverage calculation, but not in the F1 score calculation
    """
    df[predict_column] = df[predict_column].fillna("non_moral")


    for index, row in df.iterrows():
        predict_value = row[predict_column]
        bm_value = row[bm_column]
        
        # if No commas, meaning there is only one value predicted by the model, copy the value to both columns
        if ',' not in predict_value:
            df.loc[index, f"{predict_column}_fuzzy_match"] = predict_value
            df.loc[index, f"{predict_column}_exact_match"] = predict_value
        else:
            
            # Multiple values, split by comma
            predict_value = predict_value.split(',')

            # very unlikely for the model to code a text to non-moral and moral values at the same time, but just in case
            if "non_moral" in predict_value:
                df.loc[index, f"{predict_column}_fuzzy_match"] = "non_moral"
                df.loc[index, f"{predict_column}_exact_match"] = "non_moral"
            else:  
                if bm_value in predict_value:
                    # If source_label is in the mfd values - it is a match
                    df.loc[index, f"{predict_column}_fuzzy_match"] = bm_value
                    # Select one of the other values
                    exact_match_value = next((val for val in predict_value if val != bm_value))
                    df.loc[index, f"{predict_column}_exact_match"] = exact_match_value
                else:
                    # If no match is found, select the first value for both columns
                    df.loc[index, f"{predict_column}_exact_match"] = predict_value[0]
                    df.loc[index, f"{predict_column}_fuzzy_match"] = predict_value[0]
    # trim the spaces before and after the string in the columns
    df.loc[:,f"{predict_column}_exact_match"] = df[f"{predict_column}_exact_match"].str.strip()
    df.loc[:,f"{predict_column}_fuzzy_match"] = df[f"{predict_column}_fuzzy_match"].str.strip()
    # df[f"{predict_column}"] = df[f"{predict_column}"].str.strip()

    return df

##### Define a function to calculate model performance ######

def model_performance_coverage(df, predicted_label_column):
    """
    This function will calculate the coverage of the model performance
    coverage is the percentage of the non-moral text predicted by the model, meaning the limitation of the model performance
    """
    # non_moral label is the label for the non-moral text
    non_moral_label = "non_moral"
    value_counts = df[predicted_label_column].value_counts()
    non_moral_count = value_counts.get(non_moral_label, 0)
    # show the percentage of the coverage
    coverage = (len(df) - non_moral_count) / len(df)
    # keep 4 digits after the decimal point
    coverage = round(coverage, 2)

    return coverage

#### Function to round the values in the classification report dictionary #####
def round_classification_report(report, digits=3):
    for key, value in report.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                report[key][sub_key] = round(sub_value, digits)
        else:
            report[key] = round(value, digits)
    return report


def model_performance(df, source_label_column, predicted_label_column):
    
    """
    This function will return a dataframe, containing the classification report, model's name and coverage figure
    """
    # calculate the coverage of the model
    model_coverage = model_performance_coverage(df, predicted_label_column)
    
    # then, remove non-moral values from the classification report as it is calculated in the coverage calculation
    df = df[df[predicted_label_column]!='non_moral']
    true_labels = df[source_label_column]
    predicted_labels = df[predicted_label_column]

    # Generate classification report
    class_report = classification_report(true_labels, predicted_labels, output_dict=True, zero_division=0)
    rounded_class_report = round_classification_report(class_report, digits=2)
    # Convert classification report to DataFrame
    report_df = pd.DataFrame(rounded_class_report)
    report_df = report_df.loc[['f1-score']]
    report_df = report_df.rename(index={'f1-score': f'f1 {predicted_label_column}'})
    
    # add coverage figure to a new column, so far there should be only one row in the dataframe
    report_df['model_coverage'] = model_coverage
    return report_df

######## ADD the random baseline to each benchmark dataset ########
    # Ground truth label distribution
label_BM_MFV = pd.Series({'care': 27,'auth': 25, 'loya': 16, 'fair': 12,'sanc': 10})
label_BM_CS =  pd.Series({'care': 389,'auth': 331, 'loya': 248, 'fair': 259,'sanc': 226})
label_BM_CV = pd.Series({'care': 619,'auth': 271, 'loya': 349, 'fair': 253,'sanc': 52})

def randome_baseline_performance(source_dataset):
    label_counts_str = f"label_{source_dataset}"
    
    label_counts = eval(label_counts_str)
    total_samples = label_counts.sum()

    # Initialize an empty list to store random classification reports
    classification_reports = []

    classification_reports = []
    for _ in range(1000):
            random_predictions = np.random.choice(label_counts.index, size=total_samples, p=label_counts / total_samples)
            true_labels = np.repeat(label_counts.index, label_counts.values)
            
            # Generate classification report for random predictions
            random_classification_report = classification_report(true_labels, random_predictions, output_dict=True, zero_division=0)
            # # classification_reports.append(random_classification_report)
            # ramdom_rounded_class_report = round_classification_report(random_classification_report, digits=2)
            random_report_df = pd.DataFrame(random_classification_report)
            # extract all keys and its corresponding 'f1-score' values if it has otherwise, any value
            extracted_values = {key: (value['f1-score'] if isinstance(value, dict) and 'f1-score' in value else value)
                        for key, value in random_classification_report.items()}

            classification_reports.append(extracted_values)

        # Average the classification reports
    classification_reports_df = pd.DataFrame(classification_reports)

    # calculate the average of each column in the dataframe
    avg_random_report = classification_reports_df.mean().to_frame().T
    random_report_df = avg_random_report.rename(index={0: 'f1 random baseline'})
    random_report_df["model_coverage"] = 1.00

    # Convert classification report to DataFrame
    random_report_df = round_classification_report(random_report_df, digits=2)
    
    return random_report_df

###### specify models we want to benchmark and compare ######
local_lexicon_models = ['freq_cmfd2', 'cc_cmfd2', 'cc_frameaxis_cmfd2','xml_frameaxis_cmfd2']

####### define a function to present the results in a table #######
def present_tables_by_BM(df, source_dataset):
    df = df[df['source_dataset'] == source_dataset]

    table_display = pd.DataFrame()
    # add the ramdom baseline to the table
    random_baseline = randome_baseline_performance(source_dataset)
    table_display = pd.concat([table_display, random_baseline], axis=0) # row bind
    
    for model_column in local_lexicon_models:
        df =  multi_match_split(df, 'source_label', model_column)
        df_row = model_performance(df, 'source_label', f'{model_column}_fuzzy_match')
        table_display = pd.concat([table_display, df_row], axis=0) # row bind
    
    print(f"Machine Translation MF Measurement Results Benchmarked with {source_dataset} Dataset")
    display(table_display)

In [24]:
import warnings
# Ignore specific warnings
warnings.filterwarnings("ignore", category=RuntimeWarning)
warnings.filterwarnings("ignore", category=pd.errors.SettingWithCopyWarning)

present_tables_by_BM(result_cmfd2, 'BM_MFV')


Machine Translation MF Measurement Results Benchmarked with BM_MFV Dataset


Unnamed: 0,auth,care,fair,loya,sanc,accuracy,macro avg,weighted avg,model_coverage
f1 random baseline,0.16,0.25,0.16,0.34,0.1,0.21,0.2,0.21,1.0
f1 freq_cmfd2_fuzzy_match,0.48,0.18,0.67,0.29,0.22,0.39,0.37,0.42,0.4
f1 cc_cmfd2_fuzzy_match,0.44,0.0,0.0,0.12,0.18,0.3,0.15,0.16,1.0
f1 cc_frameaxis_cmfd2_fuzzy_match,0.08,0.0,0.0,0.3,0.0,0.19,0.08,0.08,1.0
f1 xml_frameaxis_cmfd2_fuzzy_match,0.0,0.07,0.24,0.12,0.0,0.16,0.09,0.07,1.0


In [51]:
present_tables_by_BM(result_cmfd2, 'BM_CS')


Machine Translation MF Measurement Results Benchmarked with BM_CS Dataset


Unnamed: 0,auth,care,fair,loya,sanc,accuracy,macro avg,weighted avg,model_coverage
f1 random baseline,0.22,0.26,0.17,0.19,0.17,0.21,0.2,0.21,1.0
f1 freq_cmfd2_fuzzy_match,0.74,0.76,0.73,0.68,0.74,0.74,0.73,0.73,0.76
f1 cc_cmfd2_fuzzy_match,0.44,0.32,0.35,0.39,0.37,0.39,0.37,0.37,0.97
f1 cc_frameaxis_cmfd2_fuzzy_match,0.14,0.05,0.15,0.29,0.02,0.2,0.13,0.12,0.97
f1 xml_frameaxis_cmfd2_fuzzy_match,0.0,0.14,0.3,0.0,0.0,0.19,0.09,0.09,1.0


In [52]:
present_tables_by_BM(result_cmfd2, 'BM_CV')


Machine Translation MF Measurement Results Benchmarked with BM_CV Dataset


Unnamed: 0,auth,care,fair,loya,sanc,accuracy,macro avg,weighted avg,model_coverage
f1 random baseline,0.19,0.42,0.16,0.25,0.04,0.28,0.21,0.28,1.0
f1 freq_cmfd2_fuzzy_match,0.26,0.63,0.45,0.28,0.09,0.45,0.34,0.43,0.64
f1 cc_cmfd2_fuzzy_match,0.28,0.17,0.09,0.01,0.03,0.2,0.12,0.13,0.98
f1 cc_frameaxis_cmfd2_fuzzy_match,0.02,0.03,0.04,0.37,0.0,0.23,0.09,0.1,0.98
f1 xml_frameaxis_cmfd2_fuzzy_match,0.0,0.18,0.27,0.02,0.0,0.19,0.09,0.12,1.0
