# Zindi Challenge Metric calculation

This calculation combines all the important performance metrics into a single number that appears on the challenge leaderboard.

Main Todos:
1. [x] Inference of InkubaLM on small subset of test set for MT
2. [ ] Inference of InkubaLM on small subset of test set for sentiment
3. [ ] Inference of InkubaLM on small subset of test set for xnli
4. [x] Get MT score
5. [x] Get sent score
6. Get xnli score
7. [x] Get model size (needs checking)
8. [x] Combine all scores
9. Create dummy submission file from test data
10. Create submission file function

Pseudocode:

```
if submitted_model_size < inkubalm_size:
    percent_scale = submitted_model_size / inkubalm_size
    sent_score = get_f1score(senti_data)
    mt_score = get_human_compared_chrf(mt_data, human_avgs)
    nli_score = get_afriNLI(nli_data)
    avg_score = avg(sent_score, mt_score, nli_score)
    final_score = avg_score + (1 - percent_scale) * avg_score
```
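
The pseudocode can be turned into a small runnable sketch. The function name `combine_scores` and the example numbers are hypothetical, the three scores are assumed to be on the same 0 to 1 scale, and the behaviour for a model that is not smaller than InkubaLM is an assumption, since the pseudocode does not cover that case.

```python
def combine_scores(sent_score, mt_score, nli_score,
                   submitted_model_size, inkubalm_size):
    """Average the task scores and add a bonus for models smaller than InkubaLM."""
    avg_score = (sent_score + mt_score + nli_score) / 3
    if submitted_model_size < inkubalm_size:
        # Fraction of the InkubaLM size used by the submitted model
        percent_scale = submitted_model_size / inkubalm_size
        return avg_score + (1 - percent_scale) * avg_score
    # Assumption: no bonus when the model is not smaller (not specified above)
    return avg_score


# Hypothetical example: a model half the size of the reference model
example = combine_scores(0.5, 0.3, 0.4,
                         submitted_model_size=200, inkubalm_size=400)
print(example)
```

With this formula, a model half the size of the reference gets 1.5 times its average score.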

We have added some code that lets you run on a GPU if one is available. On a Colab notebook, enable the GPU by selecting:
Runtime -> Change runtime type -> T4
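
A minimal sketch of how to check which device is available, assuming PyTorch is the backend. The helper name `pick_device` is hypothetical, and how `model_function` actually selects the device is not shown in this notebook.

```python
def pick_device():
    """Return "cuda" when a GPU is visible to PyTorch, otherwise "cpu"."""
    try:
        import torch  # assumption: the model runs on PyTorch
    except ImportError:
        # Without PyTorch installed there is nothing to put on a GPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Running on:", device)
```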



In [1]:
#install necessary packages
!pip install objsize
!pip install sacrebleu
!pip install --upgrade transformers accelerate sentencepiece datasets evaluate -q

Collecting objsize
  Downloading objsize-0.7.0-py3-none-any.whl.metadata (12 kB)
Downloading objsize-0.7.0-py3-none-any.whl (11 kB)
Installing collected packages: objsize
Successfully installed objsize-0.7.0
Collecting sacrebleu
  Downloading sacrebleu-2.4.3-py3-none-any.whl.metadata (51 kB)
Collecting portalocker (from sacrebleu)
  Downloading portalocker-2.10.1-py3-none-any.whl.metadata (8.5 kB)
Collecting colorama (from sacrebleu)
  Downloading colorama-0.4.6-py2.py3-none-any.whl.metadata (17 kB)
Downloading sacrebleu-2.4.3-py3-none-any.whl (103 kB)
Downloading colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Downloading portalocker-2.10.1-py3-none-any.whl (18 kB)
Installing collected packages: portalocker, colorama, sacrebleu
Successfully 

In [5]:
import os
from google.colab import userdata
# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env
# vars as appropriate for your system.
os.environ["HF_TOKEN"] = userdata.get("HF_TOKEN")
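
Outside Colab, one possible way to handle the token is to fall back to an environment variable that was exported before the notebook started. This is a sketch; the helper name `get_hf_token` is hypothetical.

```python
import os


def get_hf_token():
    """Return the Hugging Face token from Colab secrets or from the environment."""
    try:
        # This module is only importable inside Google Colab
        from google.colab import userdata
    except ImportError:
        # Outside Colab: expect HF_TOKEN to be exported already
        return os.environ.get("HF_TOKEN")
    return userdata.get("HF_TOKEN")


token = get_hf_token()
if token:
    os.environ["HF_TOKEN"] = token
else:
    print("HF_TOKEN is not set; gated models will not download.")
```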

In [66]:
# DO NOT EDIT
# create submission file
import pandas as pd
import objsize

def create_submission(model, test_flag):
    #232053 is the size of inkuba
    # Model size is reported as the total number of model parameters
    obj_size = sum(p.numel() for p in model.parameters())

    # Dev-set prediction files carry a '_dev' suffix and go to a separate submission file
    suffix = '_dev' if test_flag else ''
    filename = 'submission_test.csv' if test_flag else 'submission.csv'
    # The order of the prediction files is important
    tasks = ['hau_sent', 'swa_sent', 'hau_mt', 'swa_mt', 'hau_xnli', 'swa_xnli']
    try:
        frames = [pd.read_csv(f'{task}_prediction{suffix}.csv') for task in tasks]
    except FileNotFoundError:
        print("Seems you have not completed all the tasks, please complete all the tasks before attempting to create your submission file")
        raise  # Re-raise the exception after printing the message

    res = pd.concat(frames, ignore_index=True)
    series = [{'Instruction': 'Model size', 'Input Text': obj_size}]
    submission = pd.concat([res, pd.DataFrame(series)], ignore_index=True)
    submission.to_csv(filename, index=False)

Make sure you have generated an HF_TOKEN and added it to your notebook. You can find instructions on how to do this here: https://github.com/google-gemini/gemma-cookbook/blob/main/Gemma/Gemma_Basics_with_HF.ipynb

Then make sure you have requested and been granted access to the gated InkubaLM model on Hugging Face.

Then make sure to restart your kernel before running the cells below.

# Step 1: Get results for all the tasks and all the languages and save them to separate CSV files

Essentially, this part cycles through each of the languages for each of the tasks (Hausa and Swahili; sentiment, MT and XNLI). This produces 6 files.

- hau_sent_prediction - results for Hausa sentiment
- swa_sent_prediction - results for Swahili sentiment
- hau_xnli_prediction - results for Hausa XNLI task
- swa_xnli_prediction - results for Swahili XNLI task
- hau_mt_prediction - results for machine translation from English to Hausa
- swa_mt_prediction - results for machine translation from English to Swahili

(We are simplifying the task by translating only from English into the African language, because this is typically the direction that does not perform well in commercial models. Files produced from the dev set carry a `_dev` suffix.)
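
The file names used by the cells in this notebook follow a `<lang>_<task>_prediction` pattern, with a `_dev` suffix for dev-set runs. A small sketch that lists the expected files and reports which ones are still missing (both helper names are hypothetical):

```python
from pathlib import Path

LANGS = ["hau", "swa"]
TASKS = ["sent", "mt", "xnli"]


def expected_prediction_files(dev=True):
    """List the six prediction CSV names, task by task, Hausa before Swahili."""
    suffix = "_dev" if dev else ""
    return [f"{lang}_{task}_prediction{suffix}.csv"
            for task in TASKS for lang in LANGS]


def missing_prediction_files(dev=True, folder="."):
    """Return the expected files that do not exist yet in `folder`."""
    return [name for name in expected_prediction_files(dev)
            if not (Path(folder) / name).exists()]


print(expected_prediction_files(dev=True))
print("Still missing:", missing_prediction_files(dev=True))
```

Running the second helper before creating a submission shows which inference cells still need to be executed.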


The instructions for the tasks and the expected outputs are specified in English so that the model is not penalised for being unable to process vernacular text coherently.

In [36]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [44]:
import model_function # functions to load model and run inference
import eval
import datasets

Load your model

In [69]:
from transformers import LlamaTokenizer, LlamaForCausalLM, LlamaConfig, AutoModelForCausalLM, AutoConfig, AutoTokenizer
#add your model here
model_name = "lelapa/InkubaLM-0.4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = model_function.load_model(model_name)

The BASE_PROMPT can be changed as a prompt engineering exercise to see whether performance can be improved through better prompts. Task instructions can also be adjusted to see if they improve results (make sure to set `custom_instruct=True` in order to use custom instruction statements).

In [11]:
BASE_PROMPT = "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n ### Instruction: {}\n\n ### Response: "
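
The template has a single `{}` placeholder, which `str.format` can fill with an instruction. The sketch below only illustrates the placeholder; how `model_function.main` combines the instruction with each input text is not shown in this notebook, and `build_prompt` is a hypothetical helper.

```python
# Copy of the template defined in the cell above
BASE_PROMPT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n ### Instruction: {}\n\n ### Response: "
)


def build_prompt(instruction, template=BASE_PROMPT):
    """Fill the single placeholder in the template with the instruction."""
    return template.format(instruction)


example_instruction = "Translate the following from English into Hausa."
prompt = build_prompt(example_instruction)
print(prompt)
```

If you edit the template, keep exactly one `{}` placeholder so that formatting still works.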

## Run Sentiment inference and create sentiment files

Load the dev datasets for the sentiment task for Hausa and Swahili

In [12]:
# Load Hugging Face dataset for sentiment tasks
swa_dataset  = datasets.load_dataset("lelapa/Zindi_sentiment_without_test_target", 'swahili')['dev'] # change the name of dataset here for other tasks
hau_dataset  = datasets.load_dataset("lelapa/Zindi_sentiment_without_test_target", 'hausa')['dev'] # change the name of dataset here for other tasks

Load the instruction for sentiment

In [13]:
# don't change this instruction
sent_instruction = "Please identify the sentiment reflected in this text based on the following guidelines: Positive: if a text implies positive sentiment, attitude, and emotional state. Negative: if a text implies negative sentiment or emotion. Neutral: if a text does not imply positive or negative language directly or indirectly. Provide sentiment labels only"

Run model inference and generate inference files for sentiment

In [14]:
#for swahili
model_function.main(model, tokenizer, BASE_PROMPT, sample_size = 3, max_new_tokens=15, task_instruction = sent_instruction, dataset = swa_dataset , csv_file_path='swa_sent_prediction_dev.csv', custom_instruct = False)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


In [18]:
#for hausa
model_function.main(model, tokenizer, BASE_PROMPT, task_instruction= sent_instruction, dataset= hau_dataset , csv_file_path='hau_sent_prediction_dev.csv', max_new_tokens=15, custom_instruct= False)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Run evaluation on the dev results to get the F1 score for Swahili and Hausa

In [19]:
eval.main("/content/swa_sent_prediction_dev.csv")
eval.main("/content/hau_sent_prediction_dev.csv")

Accuracy: 0.6666666666666666, F1 Score: 0.5333333333333333
Accuracy: 0.25, F1 Score: 0.1
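
The accuracy and F1 values printed above are consistent with a support-weighted F1 when the model predicts the same label for every example. This is an inference from the numbers, since the source of `eval.main` is not shown here. A dependency-free sketch, with made-up labels for illustration:

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the target exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n
    return total / len(y_true)


# Hypothetical case: three examples, the model always answers "Positive"
y_true = ["Positive", "Positive", "Negative"]
y_pred = ["Positive", "Positive", "Positive"]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

A model that collapses to one label can therefore still get a non-trivial score on a small sample, which is worth keeping in mind when reading dev results from only a few examples.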


## Run MT inference and create MT files

Load Dev sets for MT task for Hausa and Swahili

In [20]:
hau_dataset  = datasets.load_dataset("lelapa/Zindi_eng_african_without_test_target", 'eng-hau')['dev'] # change the name of dataset here for other tasks
swa_dataset = datasets.load_dataset("lelapa/Zindi_eng_african_without_test_target", 'eng-swa')['dev']

Load the instruction for MT

In [21]:
# don't change this instruction
mt_instruction = "Translate the following from {input_lang} into {output_lang}."

Run model inference and generate inference files for machine translation

In [23]:
#for hausa
hau_mt_instruction = mt_instruction.format(input_lang = "English", output_lang = "Hausa")
model_function.main(model,
                    tokenizer,
                    BASE_PROMPT,
                    task_instruction = hau_mt_instruction,
                    dataset = hau_dataset,
                    csv_file_path='hau_mt_prediction_dev.csv',
                    custom_instruct= False)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


In [24]:
#for swahili
swa_mt_instruction = mt_instruction.format(input_lang = "English", output_lang = "Swahili")
model_function.main(model, tokenizer, BASE_PROMPT, task_instruction= swa_mt_instruction, dataset= swa_dataset, csv_file_path='swa_mt_prediction_dev.csv', custom_instruct = False, )

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Run evaluation on the dev results to get the chrF score for Hausa and Swahili

In [25]:
df = pd.read_csv('hau_mt_prediction_dev.csv')
eval.calculate_chrf(df)
df = pd.read_csv('swa_mt_prediction_dev.csv')
eval.calculate_chrf(df)

{'score': 7.55537582392469, 'char_order': 6, 'word_order': 0, 'beta': 2}
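
The output reports `char_order` 6, `word_order` 0 and `beta` 2, which means the score is built from character n-grams up to length 6 with recall weighted more heavily than precision. The sketch below is a simplified, dependency-free illustration of that idea. It is not the implementation behind `eval.calculate_chrf` and may not reproduce the library's numbers exactly, because the library applies its own text handling and smoothing.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams, ignoring spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, char_order=6, beta=2):
    """Simplified chrF: F-beta over averaged character n-gram precision and recall."""
    precisions, recalls = [], []
    for n in range(1, char_order + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is too short to have n-grams of this order
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


print(simple_chrf("hadithi fupi", "hadithi fupi (2)"))
```

Because it works on characters rather than whole words, the metric gives partial credit for translations that are close in spelling or morphology.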

## Run Xnli inference and generate inference files for Xnli task

Load the dev sets for the XNLI task for Hausa and Swahili

In [26]:
hau_dataset  = datasets.load_dataset("lelapa/Zindi_Afrixnli_without_test_target", "hau")['dev'] # change the name of dataset here for other tasks
swa_dataset = datasets.load_dataset("lelapa/Zindi_Afrixnli_without_test_target", "swa")['dev'] # change the name of dataset here for other tasks

Load the instruction for AfriXnli

In [27]:
# The instruction is derived from the dataset when custom_instruct = False
# If you would like to try your own instruction, then edit the text below and set:
# custom_instruct = True
xnli_instruction = "Is the following question True, False or Neither?"

Run model inference and generate inference files for the XNLI task

In [28]:
#for hausa
model_function.main(model,
                    tokenizer,
                    BASE_PROMPT,
                    task_instruction = xnli_instruction,
                    dataset = hau_dataset,
                    csv_file_path='hau_xnli_prediction_dev.csv',
                    max_new_tokens=15,
                    custom_instruct=False)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


In [29]:
#for swahili
model_function.main(model, tokenizer, BASE_PROMPT, task_instruction = xnli_instruction, dataset = swa_dataset , csv_file_path='swa_xnli_prediction_dev.csv', max_new_tokens=15, custom_instruct = False)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Run evaluation on the dev results to get the F1 score for Swahili and Hausa

In [30]:
eval.main("/content/swa_xnli_prediction_dev.csv")
eval.main("/content/hau_xnli_prediction_dev.csv")

Accuracy: 0.25, F1 Score: 0.1
Accuracy: 0.25, F1 Score: 0.1


# Step 2: Check how well your model is performing on the Lelapa AI Zindi Scoreboard with this custom evaluation function

## Test your dev set performance

To create a test submission file from the dev set and calculate a score, run the submission test code below.

In [67]:
create_submission(model, test_flag=True)

In [64]:
import eval
eval.zindi_score('submission_test.csv')

Score for Sentiment Hausa: 0.1
Score for Sentiment Swahili: 0.5333333333333333
Score for AfriXnli Hausa: 0.1
Score for AfriXnli Swahili: 0.1
Score for MMT: {'score': 8.883356724971241, 'char_order': 6, 'word_order': 0, 'beta': 2}
Average performance score accross tasks and langs:  0.18443338011660915


Unnamed: 0,Input Text
23,0.184383
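
The printed average appears to be the plain mean of the four F1 scores and the chrF score rescaled from 0-100 to 0-1. This is inferred from the numbers in the output above, not from the source of `eval.zindi_score`. The arithmetic can be checked as follows:

```python
# Scores copied from the output above
f1_scores = {
    "Sentiment Hausa": 0.1,
    "Sentiment Swahili": 0.5333333333333333,
    "AfriXnli Hausa": 0.1,
    "AfriXnli Swahili": 0.1,
}
chrf_score = 8.883356724971241  # reported on a 0-100 scale

# Assumption: chrF is divided by 100 so all five scores share the 0-1 scale
all_scores = list(f1_scores.values()) + [chrf_score / 100]
average = sum(all_scores) / len(all_scores)
print(average)
```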


# Step 3: Use this code to combine all files into a single submission file

In [68]:
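# Note: this resets `model`. Re-run the model-loading cell above before running
# the inference cells below, since `create_submission` needs a loaded model.
model = ""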

Once you are satisfied with the performance on the dev set, create your prediction files from the test set to build your submission. Note that this uses the `test` split of each dataset.

In [70]:
BASE_PROMPT = "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n ### Instruction: {}\n\n ### Response: "

In [104]:
# DO NOT EDIT
import time
start_time = time.time()
swa_dataset  = datasets.load_dataset("lelapa/Zindi_sentiment_without_test_target", 'swahili')['test'] # change the name of dataset here for other tasks
hau_dataset  = datasets.load_dataset("lelapa/Zindi_sentiment_without_test_target", 'hausa')['test'] # change the name of dataset here for other tasks
#sent_instruction = '' #edit this if you have a custom instruction
model_function.main(model, tokenizer, BASE_PROMPT, sample_size = 50, task_instruction = sent_instruction, dataset = swa_dataset , csv_file_path='swa_sent_prediction.csv', max_new_tokens=15, custom_instruct = False)
model_function.main(model, tokenizer, BASE_PROMPT, sample_size = 50, task_instruction = sent_instruction, dataset = hau_dataset , csv_file_path='hau_sent_prediction.csv', max_new_tokens=15, custom_instruct = False)


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for o

In [105]:
hau_dataset  = datasets.load_dataset("lelapa/Zindi_Afrixnli_without_test_target", "hau")['test'] # change the name of dataset here for other tasks
swa_dataset = datasets.load_dataset("lelapa/Zindi_Afrixnli_without_test_target", "swa")['test'] # change the name of dataset here for other tasks
#xnli_instruction = '' #edit this if you have a custom instruction
model_function.main(
    model,
    tokenizer,
    BASE_PROMPT,
    sample_size = 50,
    task_instruction = xnli_instruction,
    dataset = hau_dataset ,
    csv_file_path='hau_xnli_prediction.csv',
    max_new_tokens=15,
    custom_instruct = False)
model_function.main(model, tokenizer, BASE_PROMPT, sample_size = 50, task_instruction = xnli_instruction, dataset = swa_dataset , csv_file_path='swa_xnli_prediction.csv', max_new_tokens=15, custom_instruct = False)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for o

In [106]:
hau_dataset = datasets.load_dataset("lelapa/Zindi_eng_african_without_test_target", 'eng-hau')['test'] # change the name of dataset here for other tasks
swa_dataset = datasets.load_dataset("lelapa/Zindi_eng_african_without_test_target", 'eng-swa')['test']
#hau_mt_instruction = '' #edit this if you have a custom instruction
#swa_mt_instruction = '' #edit this if you have a custom instruction
model_function.main(model, tokenizer, BASE_PROMPT, sample_size = 50, task_instruction = hau_mt_instruction, dataset = hau_dataset , csv_file_path='hau_mt_prediction.csv', custom_instruct = False)
model_function.main(model, tokenizer, BASE_PROMPT, sample_size = 50, task_instruction = swa_mt_instruction, dataset = swa_dataset , csv_file_path='swa_mt_prediction.csv', custom_instruct = False)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for o

Now create your submission file. Make sure the prediction files are named correctly so that you do not run into issues when creating the submission file.

In [82]:
end_time = time.time()
print("Time taken: ", end_time - start_time)

Time taken:  201.0567066669464


In [83]:
create_submission(model, test_flag=False)

# Step 4: For Zindi team

This function pulls the test dataset from the Hugging Face server, merges the targets back onto the dataset, and runs the same evaluation function on a submission file to determine the Zindi score.

Download the data that has the targets for the test set from Hugging Face (no one else will have access to the URL).

In [85]:
#install necessary packages
!pip install objsize
!pip install sacrebleu
!pip install --upgrade transformers accelerate sentencepiece datasets evaluate -q

In [86]:
import pandas as pd
import datasets



In [102]:
# order is important
hau_sent  = datasets.load_dataset("lelapa/Zindi_sentiment_with_target", 'hausa')['test'].to_pandas()
swa_sent  = datasets.load_dataset("lelapa/Zindi_sentiment_with_target", 'swahili')['test'].to_pandas()
hau_mmt = datasets.load_dataset("lelapa/Zindi_eng_african_with_target", 'eng-hau')['test'].to_pandas()
swa_mmt = datasets.load_dataset("lelapa/Zindi_eng_african_with_target", 'eng-swa')['test'].to_pandas()
hau_xnli  = datasets.load_dataset("lelapa/Zindi_Afrixnli_with_target", "hau")['test'].to_pandas()
swa_xnli = datasets.load_dataset("lelapa/Zindi_Afrixnli_with_target", "swa")['test'].to_pandas()

In [89]:
data = pd.read_csv('submission_test.csv')

In [100]:
res

Unnamed: 0,instruction,inputs,targets,task,data_source,ID,langs,premise
0,Changanua mawazo ya matini yanayofuata na uain...,dreamliner dar mwanza air tanzania tunapenda ...,Chanya,sentiment,afrisenti,1,swahili,
1,Changanua mawazo ya matini yanayofuata na uain...,karibu sana hon ndani ya,Wastani,sentiment,swahili_tweet,2,swahili,
2,Changanua mawazo ya matini yanayofuata na uain...,kuna la kujifunza kwa mtoto malcolm masoud kip...,Chanya,sentiment,afrisenti,3,swahili,
3,Changanua mawazo ya matini yanayofuata na uain...,habari pole sana kwa changamoto kwa sasa laini...,Wastani,sentiment,swahili_tweet,4,swahili,
4,Changanua mawazo ya matini yanayofuata na uain...,mjumbe wa kamati ya hamasa na uchangiaji ya sa...,Chanya,sentiment,afrisenti,5,swahili,
...,...,...,...,...,...,...,...,...
295,translate the following from english into swah...,short movie (2),hadithi fupi (2),mmt,wmt22,46,eng-swa,
296,translate the following from english into swah...,i voted on the first day,hata mimi nimekubali sana leo kwa siku ya kwanza,mmt,wmt22,47,eng-swa,
297,please convert this english content into swahili.,each state has a different culture of tolerance.,kila nchi ina demokrasia tofauti.,mmt,wmt22,48,eng-swa,
298,please convert this english content into swahili.,so alice decided to come with me.,"hivyo, bi mkubwa aliamua kutokuja nami.",mmt,wmt22,49,eng-swa,


In [110]:
# order is important
res = pd.concat([hau_sent, swa_sent, hau_mmt, swa_mmt, hau_xnli, swa_xnli],ignore_index=True)
merged_submission = pd.concat([res, data], axis=1)
merged_submission.to_csv("submission_to_score.csv")
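
`pd.concat(..., axis=1)` pairs rows by index rather than by an ID column, which is why the order of the target frames has to match the order used when the submission was built. A toy sketch with made-up rows (the target column names mirror the table above, the values are illustrative) shows the alignment and what happens to the extra `Model size` row:

```python
import pandas as pd

# Toy targets: two rows, in the same order as the predictions
targets = pd.DataFrame({"ID": [1, 2], "targets": ["Chanya", "Wastani"]})

# Toy submission: two prediction rows plus the trailing model-size row
submission = pd.DataFrame({
    "Instruction": ["sentiment", "sentiment", "Model size"],
    "Input Text": ["Chanya", "Chanya", 1000],
})

# Rows are paired by position in the index, not by any ID column
merged = pd.concat([targets, submission], axis=1)
print(merged)
```

The last row has no target, so its target columns are filled with NaN. If the frames were concatenated in a different order, predictions would be silently paired with the wrong targets.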

In [111]:
eval.zindi_score("submission_to_score.csv")

Score for Sentiment Hausa: 0.1
Score for Sentiment Swahili: 0.5333333333333333
Score for AfriXnli Hausa: 0.1
Score for AfriXnli Swahili: 0.1
Score for MMT: {'score': 8.883356724971241, 'char_order': 6, 'word_order': 0, 'beta': 2}
Average performance score accross tasks and langs:  0.18443338011660915


Unnamed: 0,Input Text
23,0.092217
