![logo](https://raw.githubusercontent.com/Pacific-AI-Corp/langtest/main/docs/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Pacific-AI-Corp/langtest/blob/main/demo/tutorials/misc/Add_New_Lines_and_Tabs_Tests.ipynb)

**LangTest** is an open-source Python library designed to help developers deliver safe and effective Natural Language Processing (NLP) models. Whether you are using **John Snow Labs, Hugging Face, spaCy** models or **OpenAI, Cohere, AI21, Hugging Face Inference API and Azure-OpenAI** based LLMs, it has you covered. You can test any Named Entity Recognition (NER), Text Classification, fill-mask, or Translation model with the library. It also supports testing LLMs for Question-Answering, Summarization and text-generation tasks on benchmark datasets. The library provides 60+ out-of-the-box tests. For a complete list of supported test categories, please refer to the [documentation](http://langtest.org/docs/pages/docs/test_categories).

Metrics are calculated by comparing the model's predictions on the original list of sentences against its predictions on the noisy (perturbed) list of sentences. The original annotated labels are not used at any point; the model is simply compared against itself in two settings.
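The self-comparison described above can be sketched in a few lines. This is only an illustration and not LangTest's implementation: `toy_classifier` and `is_robust` are hypothetical names, and the keyword rule stands in for a real model.

```python
def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for a model: POS if 'good' appears as a token."""
    tokens = text.lower().split(" ")  # naive split on single spaces only
    return "POS" if "good" in tokens else "NEG"


def is_robust(model, original: str, perturbed: str) -> bool:
    """Pass when the prediction on the perturbed text matches the original one."""
    expected = model(original)  # prediction on the clean sentence
    actual = model(perturbed)   # prediction on the noisy sentence
    return expected == actual   # no gold label is involved


original = "a good movie"
newline_case = "a\n good movie"  # newline attached to a word the rule ignores
tab_case = "a good\t movie"      # tab attached to the word the rule relies on

print(is_robust(toy_classifier, original, newline_case))
print(is_robust(toy_classifier, original, tab_case))
```

The second case fails because the toy model's naive tokenization sees `good\t` instead of `good`. That is the kind of whitespace sensitivity these robustness tests are meant to surface.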

# Getting started with LangTest on John Snow Labs

In [None]:
!pip install langtest==2.4.0

# Harness and Its Parameters

The Harness class is a testing class for Natural Language Processing (NLP) models. It evaluates the performance of an NLP model on a given task using test data and generates a report with the test results. Harness can be imported from the LangTest library in the following way.

In [3]:
from langtest import Harness

### Setup and Configure Harness

In [21]:
harness = Harness(
    task="text-classification",
    model={"model": "textcat_imdb", "hub": "spacy"},
    config={
        "tests": {
            "defaults": {"min_score": 0.7},
            "robustness": {
                # min_pass_rate is the threshold shown as minimum_pass_rate in the report
                "add_new_lines": {
                    "min_pass_rate": 0.7,
                    "parameters": {"max_lines": 5},
                },
                "add_tabs": {
                    "min_pass_rate": 0.7,
                    "parameters": {"max_tabs": 5},
                },
            },
        }
    },
)



Test Configuration : 
 {
 "tests": {
  "defaults": {
   "min_score": 0.7
  },
  "robustness": {
   "add_new_lines": {
    "min_pass_rate": 0.7,
    "parameters": {
     "max_lines": 5
    }
   },
   "add_tabs": {
    "min_pass_rate": 0.7,
    "parameters": {
     "max_tabs": 5
    }
   }
  }
 }
}


### Generating the test cases.

In [22]:
harness.generate()

Generating testcases...: 100%|██████████| 1/1 [00:00<?, ?it/s]




In [23]:
harness.testcases()

| | category | test_type | original | test_case |
|---|---|---|---|---|
| 0 | robustness | add_new_lines | Just as a reminder to anyone just now reading ... | Just as a reminder to anyone just now reading ... |
| 1 | robustness | add_new_lines | Like CURSE OF THE KOMODO was for the creature ... | Like CURSE OF\n THE KOMODO was for the creatur... |
| 2 | robustness | add_new_lines | I think that the costumes were excellent, and ... | I think that the costumes were excellent, and ... |
| 3 | robustness | add_new_lines | This is one of my most favorite movies of all ... | This\n\n\n is one of my most\n\n\n favorite mo... |
| 4 | robustness | add_new_lines | This program was on for a brief period when I ... | This program was on for a brief period when I ... |
| ... | ... | ... | ... | ... |
| 395 | robustness | add_tabs | The opening was a steal from "Eight-legged Fre... | The\t\t\t opening\t\t\t\t\t was\t\t\t a\t\t\t ... |
| 396 | robustness | add_tabs | Now don't get me wrong, I love seeing half nak... | Now\t\t\t\t\t don't\t\t\t\t\t get\t me\t wrong... |
| 397 | robustness | add_tabs | Though I saw this movie dubbed in French, so I... | Though\t\t\t\t I\t saw\t\t this\t\t\t\t\t movi... |
| 398 | robustness | add_tabs | This is one of the best presentations of the 6... | This\t\t\t\t is\t one\t\t\t\t\t of\t\t\t\t the... |


The `harness.testcases()` method returns the generated test cases as a pandas DataFrame.
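To make the two perturbations concrete, here is a rough sketch of how such test cases could be produced. It is inferred only from the shape of the outputs shown here (words followed by runs of `\t` or `\n`) and is not LangTest's actual code. The function bodies, the `prob` parameter and the `seed` parameter are assumptions for illustration.

```python
import random


def add_tabs(text: str, max_tabs: int = 5, seed: int = 0) -> str:
    """Append between 1 and max_tabs tab characters after every word."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    words = text.split(" ")
    return " ".join(word + "\t" * rng.randint(1, max_tabs) for word in words)


def add_new_lines(text: str, max_lines: int = 5, prob: float = 0.2, seed: int = 0) -> str:
    """Append between 1 and max_lines newlines after some randomly chosen words.

    `prob` (chance that a word receives newlines) is a hypothetical knob.
    """
    rng = random.Random(seed)
    perturbed = []
    for word in text.split(" "):
        if rng.random() < prob:
            word += "\n" * rng.randint(1, max_lines)
        perturbed.append(word)
    return " ".join(perturbed)


sentence = "This is one of my most favorite movies of all time"
print(repr(add_tabs(sentence)))
print(repr(add_new_lines(sentence)))
```

Both functions only add whitespace, so stripping the inserted characters recovers the original sentence. That is why a prediction change can be attributed to the perturbation alone.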

### Running the tests

`harness.run()` is called after `harness.generate()` and runs all the generated test cases, producing a pass/fail flag for each one.

In [24]:
harness.run()

Running testcases... : 100%|██████████| 400/400 [00:00<00:00, 2089.73it/s]




In [25]:
harness.generated_results()

| | category | test_type | original | test_case | expected_result | actual_result | pass |
|---|---|---|---|---|---|---|---|
| 0 | robustness | add_new_lines | Just as a reminder to anyone just now reading ... | Just as a reminder to anyone just now reading ... | POS | POS | True |
| 1 | robustness | add_new_lines | Like CURSE OF THE KOMODO was for the creature ... | Like CURSE OF\n THE KOMODO was for the creatur... | NEG | NEG | True |
| 2 | robustness | add_new_lines | I think that the costumes were excellent, and ... | I think that the costumes were excellent, and ... | POS | POS | True |
| 3 | robustness | add_new_lines | This is one of my most favorite movies of all ... | This\n\n\n is one of my most\n\n\n favorite mo... | POS | POS | True |
| 4 | robustness | add_new_lines | This program was on for a brief period when I ... | This program was on for a brief period when I ... | POS | NEG | False |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 395 | robustness | add_tabs | The opening was a steal from "Eight-legged Fre... | The\t\t\t opening\t\t\t\t\t was\t\t\t a\t\t\t ... | NEG | NEG | True |
| 396 | robustness | add_tabs | Now don't get me wrong, I love seeing half nak... | Now\t\t\t\t\t don't\t\t\t\t\t get\t me\t wrong... | NEG | NEG | True |
| 397 | robustness | add_tabs | Though I saw this movie dubbed in French, so I... | Though\t\t\t\t I\t saw\t\t this\t\t\t\t\t movi... | POS | NEG | False |
| 398 | robustness | add_tabs | This is one of the best presentations of the 6... | This\t\t\t\t is\t one\t\t\t\t\t of\t\t\t\t the... | POS | NEG | False |


`harness.generated_results()` returns the results as a pandas DataFrame, which is a convenient format for working with them. You can use it to quickly identify the test cases that failed and to determine where fixes are needed.
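As a small illustration of picking out the failures, the sketch below uses a plain list of dicts as a stand-in for the DataFrame. The rows are the visible ones from the truncated output, with the text columns left out, and the `index` key is added here only for readability.

```python
from collections import Counter

# Visible rows of the truncated results output (text columns omitted).
results = [
    {"index": 0, "test_type": "add_new_lines", "expected_result": "POS", "actual_result": "POS", "pass": True},
    {"index": 1, "test_type": "add_new_lines", "expected_result": "NEG", "actual_result": "NEG", "pass": True},
    {"index": 2, "test_type": "add_new_lines", "expected_result": "POS", "actual_result": "POS", "pass": True},
    {"index": 3, "test_type": "add_new_lines", "expected_result": "POS", "actual_result": "POS", "pass": True},
    {"index": 4, "test_type": "add_new_lines", "expected_result": "POS", "actual_result": "NEG", "pass": False},
    {"index": 395, "test_type": "add_tabs", "expected_result": "NEG", "actual_result": "NEG", "pass": True},
    {"index": 396, "test_type": "add_tabs", "expected_result": "NEG", "actual_result": "NEG", "pass": True},
    {"index": 397, "test_type": "add_tabs", "expected_result": "POS", "actual_result": "NEG", "pass": False},
    {"index": 398, "test_type": "add_tabs", "expected_result": "POS", "actual_result": "NEG", "pass": False},
]

# Keep only the rows whose prediction changed under perturbation.
failed = [row for row in results if not row["pass"]]

# Count failures per test type to see which perturbation hurts most.
failures_per_test = Counter(row["test_type"] for row in failed)

for row in failed:
    print(row["index"], row["test_type"], row["expected_result"], "->", row["actual_result"])
print(dict(failures_per_test))
```

On the real DataFrame, the same filter would typically be a boolean mask such as `df[~df["pass"]]`, assuming the `pass` column holds booleans.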

### Final Results

We can call `.report()`, which summarizes the results with pass and fail counts per test and an overall pass/fail flag.

To get `time_elapsed` for each test, pass the parameter `return_runtime=True` to the `.report()` method. The unit for `time_elapsed` can also be selected, e.g. seconds (s), milliseconds (ms) or microseconds (us).

In [26]:
harness.report()

| | category | test_type | fail_count | pass_count | pass_rate | minimum_pass_rate | pass |
|---|---|---|---|---|---|---|---|
| 0 | robustness | add_new_lines | 6 | 194 | 97% | 70% | True |
| 1 | robustness | add_tabs | 99 | 101 | 50% | 70% | False |
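
The arithmetic behind the report rows can be reproduced by hand. The sketch below uses a hypothetical helper, `summarize`, and assumes a test passes when its pass rate is greater than or equal to the minimum. The exact comparison and rounding LangTest applies are not stated here.

```python
def summarize(pass_count: int, fail_count: int, minimum_pass_rate: float):
    """Return (pass_rate, passed) for one test type."""
    total = pass_count + fail_count
    pass_rate = pass_count / total
    # Assumption: the threshold comparison is inclusive (>=).
    return pass_rate, pass_rate >= minimum_pass_rate


# Counts taken from the report: (pass_count, fail_count)
report_rows = {
    "add_new_lines": (194, 6),
    "add_tabs": (101, 99),
}

for test_type, (passed, failed) in report_rows.items():
    rate, ok = summarize(passed, failed, minimum_pass_rate=0.7)
    print(f"{test_type}: pass_rate={rate:.3f} passed={ok}")
```

With these counts, `add_new_lines` clears the 70% threshold comfortably, while `add_tabs` sits at roughly half and fails, matching the `pass` column of the report.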
