# Exploring CNN models and Fidex rule generation for MNIST classification

**Introduction:**

Welcome to HES-Xplain, our interactive platform designed to facilitate the use of explainable artificial intelligence (XAI) techniques. In this use case, we explore image classification with CNN models trained on MNIST.
By the end of this notebook, you'll have a solid understanding of how to train a CNN model and how to use the Fidex algorithms to extract rules from it.

**Objectives:**

    1. Observe a different use case where XAI can be used.
    2. Understand how to use CNNs and Fidex.
    3. Showcase the versatility of HES-Xplain using a different dataset and training model.
    4. Provide practical insights into applying CNNs and Fidex to MNIST classifiers through an interactive notebook.
    5. Foster a community of XAI enthusiasts and practitioners.

**Outline:**

    1. Dataset and Problem Statement.
    2. Model training.
    3. Local rules generation - Fidex.
    4. Global ruleSet generation - FidexGlo.
    5. Explanation and image generation.
    6. Conclusion.
    7. References.

# Workspace Setup


This section downloads the required dataset from our GitHub and huggingface.co repositories.

In [2]:
# download and extract dataset
import zipfile
import os
from huggingface_hub import hf_hub_download

REPO_ID = "HES-XPLAIN/Mnist"
FILENAME = "Mnist.zip"
dataset_file_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, repo_type="dataset")
extract_path = 'data/MnistDataset'

# Unzip the file
with zipfile.ZipFile(dataset_file_path, 'r') as zip_ref:
    # Extract all the files
    for member in zip_ref.namelist():
        # Remove the folder name 'Mnist' from the path
        member_path = os.path.relpath(member, start='Mnist')
        if member_path == '.':
            continue
        # Create the appropriate path in the destination directory
        target_path = os.path.join(extract_path, member_path)
        # Directory entries end with '/': create the folder and move on
        if member.endswith('/'):
            os.makedirs(target_path, exist_ok=True)
            continue
        # Create any directories needed to house the file
        os.makedirs(os.path.dirname(target_path), exist_ok=True)
        # Write the file to the directory
        with zip_ref.open(member) as source, open(target_path, 'wb') as target:
            target.write(source.read())

print(f"Dataset successfully extracted to: {extract_path}")



Dataset successfully extracted to: data/MnistDataset


# Dataset and Problem Statement

The dataset we'll be working with is called MNIST and is available on [Kaggle](https://www.kaggle.com/datasets/oddrationale/mnist-in-csv). It consists of 60'000 data samples representing images of digits between 0 and 9. Each image sample has 784 attributes (shape 28x28x1), each representing a pixel value ranging from 0 to 255.

**Problem Statement:** Our objective is to build a robust CNN classifier capable of accurately classifying the images among the 10 classes. By leveraging deep learning techniques and Fidex algorithms, we aim to not only achieve high classification performance but also gain insights into the attributes (pixels here) that contribute to the classification decisions.

We'll start by importing all libraries. A warning might appear, but there's no need to worry about it.

In [1]:
import os, sys, re
import numpy as np
from PIL import Image
from dimlpfidex.fidex import fidex, fidexGloRules, fidexGloStats, fidexGlo
from trainings.cnnTrn import cnnTrn

# utility function to preview a file entirely or only the first `nlines` lines
def previewFile(filepath, nlines=-1):
    lines = ""

    with open(filepath, "r") as f:
        if nlines == -1:
            for line in f:
                lines += line
        else:
            for _ in range(nlines):
                try:
                    lines += next(f)
                except StopIteration:
                    break
    print(lines)

We already preprocessed the data and saved it in the `data/MnistDataset` folder. It is composed of 50'000 training samples and 10'000 testing samples. It's not necessary to normalize the data as it is done during the CNN training process. In the upcoming chapter, we'll use our prepared dataset to train our CNN model.
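To get a feel for the data layout, the sketch below reads a data file into a NumPy array and reshapes each row into a 28x28x1 image. It assumes one sample per line with space-separated pixel values, which is the layout the image-generation cell later in this notebook relies on. `load_images` is a hypothetical helper written for this illustration and is not part of dimlpfidex.

```python
import numpy as np

def load_images(path, height=28, width=28, channels=1):
    """Read one flattened image per line and reshape to (n, height, width, channels)."""
    # ndmin=2 keeps a 2-D array even when the file holds a single sample
    flat = np.loadtxt(path, ndmin=2)
    expected = height * width * channels
    if flat.shape[1] != expected:
        raise ValueError(f"expected {expected} values per line, got {flat.shape[1]}")
    return flat.reshape(-1, height, width, channels)

# Possible usage (assumed path, reading a large file this way can be slow):
# test_images = load_images("data/MnistDataset/testData.txt")
# print(test_images.shape, test_images.min(), test_images.max())
```

The explicit size check makes a layout mismatch (for example, a class label appended to each line) fail loudly instead of producing misaligned images.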

# Model training

It's time now to train our model. We will use a special type of model called a CNN (convolutional neural network).

We use our Python program called [cnnTrn](https://github.com/HES-XPLAIN/dimlpfidex/blob/main/trainings/cnnTrn.py). Let's begin by printing the program's help message to see every available option:

In [2]:
status = cnnTrn("--help")

Usage: 
--train_data_file <str> --test_data_file <str> --original_input_size <pair<int [1, inf[>> --nb_channels <int [1,inf[> --model <{small, large, vgg, resnet}> --data_format <{normalized_01, classic, other}> --nb_classes <int [1,inf[> [-h, --help] [--json_config_file <str>] [--root_folder <str>] [--train_class_file <str>] [--test_class_file <str>] [--train_valid_pred_outfile <str>] [--test_pred_outfile <str>] [--valid_ratio <float ]0,inf[>] [--valid_data_file <str>] [--valid_class_file <str>] [--weights_outfile <str>] [--stats_file <str>] [--console_file <str>] [--nb_epochs <int [1,inf[>] [--nb_quant_levels <int [3,inf[>] [--K <float ]0,inf[>] [--model_input_size <pair<int [1, inf[>>] [--seed <{int [0,inf[}>]

This is a parser for cnnTrn


Parameters:

  ---------------------------------------------------------------------


  The arguments can be specified in the command or in a json configuration file with --json_config_file your_config_file.json.

  ----------------------------


The output reveals various options. Among these, we'll focus on the required parameters (and the `--root_folder` for convenience). Since we've already generated the train and test data files, our next step is determining the data's shape and the model we want to train. As we saw, an MNIST image has a shape of 28x28x1 and pixel values between 0 and 255, which corresponds to the `classic` data format. There are 10 possible classes, one per digit, and 28*28=784 attributes.

With these parameters in place, we can proceed to run our CNN model, allowing the remaining options to be determined by their default settings.

First, let's define some commonly used parameters:

In [3]:
rootDir = "data/MnistDataset/"
trainDataFile = "trainData.txt"
trainClassFile = "trainClass.txt"
testDataFile = "testData.txt"
testClassFile = "testClass.txt"

nclasses = 10
nattributes = 784

trainPredFile = "predTrain.out"
testPredFile = "predTest.out"
weightsFile = "weights.wts"

globalRulesFile = "globalRules.rls"

As it may take some time, the training has already been done. If you wish to train the model yourself, you can uncomment and run the next instructions:

In [4]:
args = f"""
        --root_folder {rootDir}
        --train_data_file {trainDataFile}
        --train_class_file {trainClassFile}
        --test_data_file {testDataFile}
        --test_class_file {testClassFile}
        --original_input_size [28,28]
        --data_format classic
        --nb_channels 1
        --nb_classes {nclasses}
        --model small
        """

#status = cnnTrn(args)
#if (status == 0):
#    print("cnnTrn done")

The algorithm generated the train and test predictions as well as the model's weights and statistics. All outputs are saved inside the `data/MnistDataset` folder.

The train and test accuracies are stored in the `stats.txt` file. Let's display them:


In [5]:
previewFile("data/MnistDataset/stats.txt", 20)

Training accuracy : 99.875927%.
Testing accuracy : 99.299997%.
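If you want to use these accuracies programmatically, for instance to compare them with the ruleset accuracy later on, a small parser is enough. The sketch below assumes the two-line format shown in the output above; `parse_accuracies` is a hypothetical helper, not part of dimlpfidex.

```python
import re

def parse_accuracies(text):
    """Extract values from lines such as 'Training accuracy : 99.87%.'."""
    pattern = r"(\w+) accuracy\s*:\s*([\d.]+)%"
    return {name: float(value) for name, value in re.findall(pattern, text)}

# Example with the two lines printed above
stats = parse_accuracies("Training accuracy : 99.875927%.\nTesting accuracy : 99.299997%.")
gap = stats["Training"] - stats["Testing"]
print(stats, f"train/test gap: {gap:.3f} points")
```

The gap between training and testing accuracy is a quick indicator of how much the model overfits.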



# Local rules generation - Fidex

Now we can generate some local rules to explain the model's results. We can start by launching [Fidex](https://hes-xplain.github.io/documentation/algorithms/fidex/fidex/) on one test sample. This will generate a rule explaining the sample locally. It is local because the algorithm searches for a rule for one sample only.

First of all, let's take a look at Fidex's arguments:

In [6]:
status = fidex("--help")


---------------------------------------------------------------------

The arguments can be specified in the command or in a json configuration file with --json_config_file your_config_file.json.

----------------------------

Required parameters:

--train_data_file <str>       Path to the file containing the train portion of the dataset
--train_class_file <str>      Path to the file containing the train true classes of the dataset, not mandatory if classes are specified in train data file
--train_pred_file <str>       Path to the file containing predictions on the train portion of the dataset
--test_data_file <str>        Path to the file containing the test sample(s) data, prediction (if no --test_pred_file) and true class(if no --test_class_file)
--weights_file <str>          Path to the file containing the trained weights of the model (not mandatory if a rules file is given with --rules_file)
--rules_file <str>            Path to the file containing the trained rules to be convert

Let's have a closer look at the Fidex help output. We can observe that there are **required parameters**. Let's have a look at them:

- `--train_data_file`: a file containing features from the training portion of the dataset
- `--train_pred_file`: a file containing predictions from the training portion of the dataset
- `--train_class_file`: a file containing classes from the training portion of the dataset
- `--test_data_file`: a file containing samples to be used when generating a local rule
- `--weights_file`: a file containing weights from a model training 
- `--rules_file`: a file containing the rules generated by a model training (in our case, we don't need it because we already have a `weights file` from the CNN training)
- `--rules_outfile`: a file name that will contain the output of the Fidex algorithm
- `--nb_attributes`: the number of attributes present in the dataset
- `--nb_classes`: the number of classes present in the dataset

There is also one optional argument that we are going to use:
- `--root_folder`: path defining the root directory where every other path specified in other arguments begins
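Since every program in this notebook receives its options as a single string of `--name value` pairs, it can be convenient to build that string from a dictionary. The sketch below assumes the programs split the string on whitespace, as the multi-line strings used in this notebook suggest. `build_args` is a hypothetical helper, not part of dimlpfidex.

```python
def build_args(params):
    """Turn {'nb_classes': 10} into '--nb_classes 10', one option per dict entry."""
    return " ".join(f"--{name} {value}" for name, value in params.items())

example_args = build_args({
    "root_folder": "data/MnistDataset/",
    "train_data_file": "trainData.txt",
    "nb_attributes": 784,
    "nb_classes": 10,
})
print(example_args)
```

Keeping the options in a dictionary makes it easy to reuse the common ones across several calls and override only what changes.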

All steps done until now will allow us to run the Fidex program. To see what happens, we launch it with just one sample. Therefore, we have saved beforehand the test data sample with its class and predictions in the file `data/MnistDataset/testDataSample.txt`.

**The execution should take about 1 minute.**

In [7]:
args = f"""
        --root_folder {rootDir}
        --train_data_file {trainDataFile}
        --train_class_file {trainClassFile}
        --train_pred_file {trainPredFile}
        --test_data_file testDataSample.txt
        --weights_file {weightsFile}
        --rules_outfile rule.rls
        --nb_attributes {nattributes}
        --nb_classes {nclasses}
        """

status = fidex(args)

Parameters list:
 - train_data_file                                                      data/MnistDataset/trainData.txt
 - train_pred_file                                                      data/MnistDataset/predTrain.out
 - train_class_file                                                    data/MnistDataset/trainClass.txt
 - test_data_file                                                  data/MnistDataset/testDataSample.txt
 - rules_outfile                                                             data/MnistDataset/rule.rls
 - root_folder                                                                       data/MnistDataset/
 - weights_file                                                           data/MnistDataset/weights.wts
 - nb_attributes                                                                                    784
 - nb_classes                                                                                        10
 - nb_quant_levels                             

The output of the algorithm shows us, in the terminal, a walkthrough of the process. At the end of it, you can observe the generated rule. Let's have a closer look at it by extracting the freshly written rule file:

In [8]:
previewFile("data/MnistDataset/rule.rls", 20)

No decision threshold is used.

Rule for sample 0 :

X739>=249.9 X230>=25.5 -> class 7
   Train Covering size : 145
   Train Fidelity : 1
   Train Accuracy : 1
   Train Confidence : 0.999995




The output displays a preview of a rule generated by Fidex. Each rule includes various properties:
- The index of the sample from which the rule has been generated
- The rule itself, composed of one or more antecedents and the prediction
- The number of samples, in the training dataset, covered by the rule
- The fidelity of the rule according to the model's predictions
- The accuracy of the rule
- The confidence of the rule with its choices, concerning the prediction values
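As a rough illustration of these properties, the sketch below computes the covering size, fidelity, and accuracy of one rule on a tiny made-up dataset. The definitions used here (fidelity is the share of covered samples for which the model predicts the rule's class, accuracy is the share whose true class is the rule's class) are an assumption based on the descriptions above, not the dimlpfidex implementation. Confidence is left out because it depends on the model's raw output values.

```python
import numpy as np

def rule_metrics(X, model_pred, true_class, antecedents, rule_class):
    """Covering size, fidelity and accuracy of one rule.

    antecedents: list of (attribute_index, operator, threshold),
    with operator either "<" or ">=".
    """
    covered = np.ones(len(X), dtype=bool)
    for attr, op, thr in antecedents:
        values = X[:, attr]
        covered &= (values < thr) if op == "<" else (values >= thr)
    n = int(covered.sum())
    if n == 0:
        return {"covering": 0, "fidelity": None, "accuracy": None}
    fidelity = float(np.mean(model_pred[covered] == rule_class))
    accuracy = float(np.mean(true_class[covered] == rule_class))
    return {"covering": n, "fidelity": fidelity, "accuracy": accuracy}

# Toy data: 4 samples with 3 attributes each (made-up values)
X = np.array([[250, 30, 0], [251, 40, 0], [10, 30, 0], [255, 26, 0]])
model_pred = np.array([7, 7, 1, 7])
true_class = np.array([7, 7, 1, 2])
metrics = rule_metrics(X, model_pred, true_class,
                       [(0, ">=", 249.9), (1, ">=", 25.5)], rule_class=7)
print(metrics)
```

In this toy example the rule covers three samples and always agrees with the model, but one covered sample has a different true class, so fidelity is higher than accuracy.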

In the antecedents, the Xi terms represent the ith pixel of the image (or ith attribute).

These rules provide insights into the model's predictions for each sample, helping to explain its decision-making process.

It is possible to run Fidex with all test samples to generate an explaining rule for each sample. However, we will skip this step because it would take too much time due to the size of the dataset.

In the next chapter, we will move on to global ruleSet generation using FidexGloRules. This will help us understand the overall behavior of the model by generating a set of global rules.


# Global ruleSet generation - FidexGlo
We have seen how to compute a rule that explains the decision of the model for a specific sample with the Fidex algorithm. But how can we get a general set of rules that characterizes the whole training dataset? The [FidexGloRules](https://hes-xplain.github.io/documentation/algorithms/fidex/fidexglorules) algorithm makes this possible.

A global ruleset is a collection of rules that explains the model's decision for each sample present on the training portion of the dataset. Let's have a look at the fidexGloRules arguments:

In [9]:
status = fidexGloRules("--help")


---------------------------------------------------------------------

The arguments can be specified in the command or in a json configuration file with --json_config_file your_config_file.json.

----------------------------

Required parameters:

--train_data_file <str>       Path to the file containing the train portion of the dataset
--train_class_file <str>      Path to the file containing the train true classes of the dataset, not mandatory if classes are specified in train data file
--train_pred_file <str>       Path to the file containing predictions on the train portion of the dataset
--weights_file <str>          Path to the file containing the trained weights of the model (not mandatory if a rules file is given with --rules_file)
--rules_file <str>            Path to the file containing the trained rules to be converted to hyperlocus (not mandatory if a weights file is given with --weights_file)
--global_rules_outfile <str>  Path to the file where the output rule(s) will be

While the required parameters are very similar to those of the `Fidex` algorithm, there are many optional arguments that you can use to customize the behavior of the algorithm. Let's have a look at some of them:

- `--heuristic`: selects among several ways to run the algorithm. They aim to increase execution speed, but they also affect the quality of the results.
- `--nb_threads`: number of threads used to run the algorithm. More threads accelerate the process.
- `--min_covering`: minimal number of samples a rule must cover
- `--max_failed_attempts`: maximum failed attempts allowed when generating a rule
- `--min_fidelity`: minimal fidelity allowed when generating a rule
- `--max_iterations`: maximum number of iterations, also the maximum possible number of antecedents in a rule
- `--dropout_dim`: probability of dropping a dimension when generating a rule
- `--dropout_hyp`: probability of dropping a hyperplane when generating a rule
- `--console_file`: a file where console outputs are redirected
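To give an intuition for the dropout options, the sketch below simulates dropping each of the 784 dimensions independently with a probability of 0.9 (an example value). It only illustrates the probability itself; how dimlpfidex applies dropout internally is not shown here and may differ.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
nb_attributes = 784
dropout_dim = 0.9  # example value

# Each dimension is kept with probability 1 - dropout_dim
kept = rng.random(nb_attributes) >= dropout_dim
print(f"{kept.sum()} of {nb_attributes} dimensions kept")
```

On average only about one dimension in ten is examined, which is why a high dropout reduces the computation time so much.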

The process is very long and can last several days, so we computed it beforehand. The ruleset is available in the file `data/MnistDataset/globalRules.rls`.

If you want to launch it yourself and have several processors available, add the `--nb_threads` parameter with the number of processors you want to use; this can speed up the process considerably. To accelerate it even more, you can use dropout: the algorithm will then randomly skip some dimensions or hyperplanes. For example, you can set both dropouts to 0.9 to skip nine out of ten dimensions and hyperplanes, which should be a lot faster. You just need to uncomment the call in the next cell:

In [10]:
args = f"""
        --root_folder {rootDir} 
        --nb_threads 4
        --train_data_file {trainDataFile}
        --train_class_file {trainClassFile}
        --train_pred_file {trainPredFile}
        --weights_file {weightsFile}
        --nb_attributes {nattributes}
        --nb_classes {nclasses}
        --heuristic 1
        --global_rules_outfile {globalRulesFile}
        --max_iterations 25
        --dropout_hyp 0.9
        --dropout_dim 0.9
        --console_file fidexGloRulesResult.txt
        """

#status = fidexGloRules(args)

The algorithm generated a file that we're going to partially observe:

In [11]:
previewFile("data/MnistDataset/globalRules.rls", 25)

Number of rules : 4910, mean sample covering number per rule : 101.016701, mean number of antecedents per rule : 7.106314
No decision threshold is used.

Rule 1: X323>=219.3 X517>=198.9 X180<5.1 X492<25.5 X272<25.5 X347<10.2 X576<20.4 X236<30.6 X436<132.6 X541<30.6 -> class 1
   Train Covering size : 2156
   Train Fidelity : 1
   Train Accuracy : 1
   Train Confidence : 0.999678

Rule 2: X236<15.3 X464<20.4 X151<10.2 X522<5.1 X406>=234.6 X353<30.6 X271<40.8 X345<76.5 X375<20.4 X293<158.1 X575<91.8 X553<10.2 -> class 1
   Train Covering size : 1996
   Train Fidelity : 1
   Train Accuracy : 1
   Train Confidence : 0.999813

Rule 3: X494<20.4 X410<15.3 X378>=229.5 X298<30.6 X490>=214.2 X569<15.3 X322>=239.7 X205<10.2 X515<66.3 X269<5.1 X597<15.3 -> class 1
   Train Covering size : 1851
   Train Fidelity : 1
   Train Accuracy : 0.99946
   Train Confidence : 0.99947

Rule 4: X483>=35.7 X408<66.3 X656>=10.2 X325<40.8 X488<25.5 X301>=15.3 X519<132.6 X569>=40.8 -> class 0
   Train Covering siz

> *The algorithm result is subject to randomness as it uses random processes to compute. Results may differ between executions.*

You can observe that the rules are ordered by their covering size. The first rule is the one that best describes the training portion of the dataset. The algorithm generated about 5000 rules explaining the whole training dataset. At the top of the file, you can see the number of rules, the mean covering number per rule, and the mean number of antecedents. Here is an example of a rule that you may obtain:<br>

```md
Rule 1: X323>=219.3 X517>=198.9 X180<5.1 X492<25.5 X272<25.5 X347<10.2 X576<20.4 X236<30.6 X436<132.6 X541<30.6 -> class 1
   Train Covering size : 2156
   Train Fidelity : 1
   Train Accuracy : 1
   Train Confidence : 0.999678
```

This is the first rule, which means that it has the largest covering: 2156 training samples satisfy it. It has **100% fidelity** with the model and is **100% accurate**.
This rule says that if the value of pixel X323 is greater than or equal to 219.3 and all the other antecedents are satisfied, then the image shows the digit 1. The rule has a **confidence above 99.9%**.<br>
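To make the reading of a rule concrete, here is a minimal sketch that parses a rule line and checks whether a sample satisfies it. It reuses the antecedent pattern from the image-generation cell of this notebook and assumes that attribute indices are 0-based and that rules only use `<` and `>=`, as in the rules shown here. `parse_rule` and `covers` are hypothetical helpers, and the rule is shortened for readability.

```python
import re

ANTECEDENT = re.compile(r"X(\d+)\s*([<>]=?)\s*([\d.]+)")

def parse_rule(line):
    """Split 'X1>=2.5 X3<4 -> class 1' into antecedents and the predicted class."""
    body, _, head = line.partition("->")
    antecedents = [(int(a), op, float(v)) for a, op, v in ANTECEDENT.findall(body)]
    predicted = int(head.split()[-1])
    return antecedents, predicted

def covers(sample, antecedents):
    """True when the sample satisfies every antecedent (0-based indices assumed)."""
    for attr, op, thr in antecedents:
        ok = sample[attr] < thr if op == "<" else sample[attr] >= thr
        if not ok:
            return False
    return True

# Shortened version of Rule 1, for illustration only
antecedents, predicted = parse_rule("Rule 1: X323>=219.3 X517>=198.9 X180<5.1 -> class 1")

sample = [0] * 784          # made-up image: all pixels black...
sample[323] = 255           # ...except two bright pixels
sample[517] = 200
print(covers(sample, antecedents), predicted)
```

A rule is activated only when all of its antecedents hold at once, so a completely black image would not be covered by this rule.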

To get statistics on the test portion of the dataset, let's execute the [fidexGloStats](https://hes-xplain.github.io/documentation/algorithms/fidex/fidexglostats) algorithm. Let's begin with an overview of the program's arguments:

In [12]:
status = fidexGloStats("--help")


---------------------------------------------------------------------

The arguments can be specified in the command or in a json configuration file with --json_config_file your_config_file.json.

----------------------------

Required parameters:

--test_data_file <str>        Path to the file containing the test portion of the dataset
--test_class_file <str>       Path to the file containing the test true classes of the dataset, not mandatory if classes are specified in test data file
--test_pred_file <str>        Path to the file containing predictions on the test portion of the dataset
--global_rules_file <str>     Path to the file containing the global rules obtained with fidexGloRules algorithm.
--nb_attributes <int [1,inf[> Number of attributes in the dataset
--nb_classes <int [2,inf[>    Number of classes in the dataset

----------------------------

Optional parameters: 

--json_config_file <str>      Path to the JSON file that configures all parameters. If used, this must be

As you can observe, the required arguments are pretty much the same as in previous executions. The only one that differs is `--global_rules_file`, which takes the global rules file on which to compute statistics. With the parameter `--global_rules_outfile`, the test statistics of each rule are written into a rules file. If you want to keep the original ruleset file unchanged, give this output file a different name.

Let's try this:

In [13]:
args = f"""
        --root_folder {rootDir}
        --test_data_file {testDataFile}
        --test_class_file {testClassFile}
        --test_pred_file {testPredFile}
        --global_rules_file {globalRulesFile}
        --nb_attributes {nattributes}
        --nb_classes {nclasses}
        --stats_file fidexGloStats.txt
        --global_rules_outfile globalRulesWithTestStats.rls
        """

status = fidexGloStats(args)

Parameters list:
 - test_data_file                                                        data/MnistDataset/testData.txt
 - test_pred_file                                                        data/MnistDataset/predTest.out
 - test_class_file                                                      data/MnistDataset/testClass.txt
 - global_rules_outfile                                  data/MnistDataset/globalRulesWithTestStats.rls
 - global_rules_file                                                  data/MnistDataset/globalRules.rls
 - root_folder                                                                       data/MnistDataset/
 - stats_file                                                       data/MnistDataset/fidexGloStats.txt
 - nb_attributes                                                                                    784
 - nb_classes                                                                                        10
 - positive_class_index                        

The execution of the algorithm generated a file that we named `fidexGloStats.txt`, containing pretty much the same feedback as the program output.

The output of the program shows various metrics, let's have a look at them individually:

- `Global statistics`: Several values expressing general information about the ruleset.
- `Decision threshold`: Threshold above which a class is considered true. In this case, the output states that `no decision threshold is used`.
- `Positive index class`: Indicates which class is considered the positive one. It only applies when a decision threshold is used, which is not the case here.
- `Global rule fidelity rate`: Expressing whether the ruleset accurately reflects the model's predictions.
- `Global rule accuracy`: Proportion of correct predictions made by the ruleset.
- `Explainability rate`: Proportion of the samples that could be explained by one or more rules.
- `Default rule rate`: Proportion of samples that could not be explained by a rule offered by the ruleset.
- `Mean number of correct activated rules`: Average number of correct rules activated per sample.
- `Mean number of wrong activated rules`: Average number of incorrect rules activated per sample.
- `Model test accuracy`: Accuracy of the model on the test dataset
- `Model test accuracy when rules agree`: Accuracy of the model on test samples where the ruleset and model predictions agree.
- `Model test accuracy when activated rules agree`: Accuracy when at least one activated rule agrees with the model's prediction.

With this program, you can have a general overview of the quality of the ruleset.

We have about **97% fidelity**, which is good, and a **rule accuracy (96.6%)** about 3 percentage points lower than the **model accuracy (99.3%)**, so the rules classify slightly worse than the model. The **explainability rate is above 95%**, so Fidex only needs to be run to get a rule in fewer than 5% of cases. Each sample can activate many rules: on average, a sample activates **8 correct rules** and **0.2 wrong rules**. A wrong rule is a rule with which the model doesn't agree, for example when the rule says 1 and the model says 2. The `Model test accuracy when rules agree` value is also interesting: the accuracy generally increases if we consider only samples where rules and model agree, and increases even more if we take only the activated rules (when there are no activated rules, we choose the model prediction). This means that the rules confirm the model's decisions well, but when no rule is found, the model's decision is more likely to be wrong. <br>
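The rates discussed above can be sketched in a few lines. The example below takes, for each sample, the classes predicted by the rules it activates and the model's prediction, then derives the explainability rate, the default rule rate, and the mean numbers of correct and wrong activated rules. Averaging over all samples (rather than only over explained ones) is an assumption, and `ruleset_summary` is a hypothetical helper, not the fidexGloStats implementation.

```python
def ruleset_summary(activated_classes, model_pred):
    """activated_classes[i] lists the class predicted by each rule activated by sample i."""
    n = len(model_pred)
    explained = sum(1 for rules in activated_classes if rules)
    # A rule is "correct" when it agrees with the model, "wrong" otherwise
    correct = sum(sum(c == p for c in rules)
                  for rules, p in zip(activated_classes, model_pred))
    wrong = sum(sum(c != p for c in rules)
                for rules, p in zip(activated_classes, model_pred))
    return {
        "explainability_rate": explained / n,
        "default_rule_rate": 1 - explained / n,
        "mean_correct_activated": correct / n,
        "mean_wrong_activated": wrong / n,
    }

# Made-up example: 4 samples, the second one activates no rule
summary = ruleset_summary([[1, 1, 2], [], [7], [3, 3]], model_pred=[1, 4, 7, 3])
print(summary)
```

Here three samples out of four activate at least one rule, and the first sample activates one rule that disagrees with the model.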

Finally, in the `globalRulesWithTestStats` file, you can now see the statistics of rules on the test set. Here is the same rule seen before with the test statistics:<br>

```md
Rule 1: X323>=219.3 X517>=198.9 X180<5.1 X492<25.5 X272<25.5 X347<10.2 X576<20.4 X236<30.6 X436<132.6 X541<30.6 -> class 1
   Train Covering size : 2156 --- Test Covering size : 370
   Train Fidelity : 1 --- Test Fidelity : 0.997297
   Train Accuracy : 1 --- Test Accuracy : 1
   Train Confidence : 0.999678 --- Test Confidence : 0.997311
```

We see that, on the test set, the rule no longer always agrees with the model, only in 99.7% of the covered cases. However, the rule accuracy is perfect. That means that the rule is very good in reality, with **100% correct classification**.

In the next chapter, we will get a comprehensive explanation for each sample using `FidexGlo` and with image generation. This will help us understand better the behavior of the model.

# Explanation and image generation

Now let's get explanations for a few test samples. We will use the [fidexGlo](https://hes-xplain.github.io/documentation/algorithms/fidex/fidexglo) algorithm with the files `testDataSamples`, `testClassSamples`, and `testPredSamples`, which contain 10 test samples. <br>
Let's begin with an overview of the program's arguments:

In [14]:
status = fidexGlo("--help")


---------------------------------------------------------------------

The arguments can be specified in the command or in a JSON configuration file with --json_config_file your_config_file.json.

----------------------------

Required parameters:

--test_data_file <str>        Path to the file containing test sample(s) data, prediction (if no --test_pred_file) and true classes if launching with fidex (--with_fidex and if no --test_class_file)
--global_rules_file <str>     Path to the file containing the global rules obtained with fidexGloRules algorithm.
--nb_attributes <int [1,inf[> Number of attributes in the dataset
--nb_classes <int [2,inf[>    Number of classes in the dataset

----------------------------

Optional parameters: 

--json_config_file <str>      Path to the JSON file that configures all parameters. If used, this must be the sole argument and must specify the file's relative path
--root_folder <str>           Path to the folder, based on main default folder dimlpfide

As you can observe, the required arguments are pretty much the same as in previous executions. The most important argument that differs is `--with_fidex`, which tells whether we want to execute `Fidex` when no rule is activated in the global ruleset for a given sample.
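The fallback behavior described here can be pictured with the sketch below: look for global rules activated by the sample and, only if none is found, ask for a local rule. This is a conceptual illustration under stated assumptions, not how fidexGlo is implemented. `satisfies`, `explain_sample`, and the `local_fidex` callable are hypothetical names, and rules are represented as `(antecedents, class)` tuples.

```python
def satisfies(sample, antecedents):
    """True when every (attribute, operator, threshold) antecedent holds."""
    return all((sample[a] < t) if op == "<" else (sample[a] >= t)
               for a, op, t in antecedents)

def explain_sample(sample, global_rules, local_fidex=None):
    """Return the activated global rules, or one local rule when none is activated."""
    activated = [rule for rule in global_rules if satisfies(sample, rule[0])]
    if activated or local_fidex is None:
        return activated
    # No global rule applies: fall back to a locally computed rule
    return [local_fidex(sample)]

# Made-up ruleset with a single rule on attribute 0
global_rules = [([(0, ">=", 100.0)], 1)]
# Stand-in for a local rule search (assumption for the example)
fake_local = lambda sample: ([(0, "<", 100.0)], 0)

print(explain_sample([150], global_rules, fake_local))  # global rule activated
print(explain_sample([5], global_rules, fake_local))    # falls back to the local rule
print(explain_sample([5], global_rules))                # no fallback: empty list
```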

We execute the algorithm like this:

In [None]:
args = f"""
        --root_folder {rootDir}
        --test_data_file testDataSamples.txt
        --test_class_file testClassSamples.txt
        --test_pred_file testPredSamples.txt
        --global_rules_file {globalRulesFile}
        --nb_attributes {nattributes}
        --nb_classes {nclasses}
        --explanation_file explanations.txt
        --console_file fidexGloResults.txt
        --with_fidex true
        --train_data_file {trainDataFile}
        --train_pred_file {trainPredFile}
        --train_class_file {trainClassFile}
        --weights_file {weightsFile}
        """

status = fidexGlo(args)

The explanations for each test sample can then be found in the file `explanations.txt`. Let's have a look inside the generated file:

In [None]:
previewFile("data/MnistDataset/explanations.txt", 50)

Many rules are activated for each sample and a global rule is found for each of those 10 samples, so `Fidex` was not called.

Now, we parse this explanation file to get the first explanation rule of each sample. We generate the image corresponding to the sample and add colored pixels where the rule is activated.

In [None]:
images = []
test_data = "data/MnistDataset/testDataSamples.txt"
with open(test_data, "r") as my_file:
    for line in my_file:
        # Pixel values are read as text: convert them to integers
        images.append([int(float(v)) for v in line.split()])

explanation_file = "data/MnistDataset/explanations.txt"
pattern = r'X(\d+)\s*([<>]=?)\s*([\d.]+)' # Regular expression pattern to match antecedents
rules = []
with open(explanation_file, "r") as my_file:
    for line in my_file:
        if line.startswith("R1: "):
            rules.append(line.strip())
        if line.startswith("Local rule"):
            # Search next non empty line (stop at the end of the file)
            next_line = next(my_file, None)
            while next_line is not None and not next_line.strip():
                next_line = next(my_file, None)
            if next_line is not None:
                rules.append(next_line.strip())

# Make sure the output folder exists
image_dir = "data/MnistDataset/images"
os.makedirs(image_dir, exist_ok=True)

# Build one colored image per sample
for id_sample in range(len(rules)):
    # Find all antecedents (attribute, inequality, threshold) of the rule
    antecedents = []
    matches = re.findall(pattern, rules[id_sample])

    # Process each match and store in antecedents
    for match in matches:
        attribute, inequality, value = match
        antecedent = {
            "attribute": int(attribute),
            "inequality": inequality,
            "value": float(value)
        }
        antecedents.append(antecedent)

    # Grayscale image as RGB triplets, then color the pixels used by the rule
    colorimage = [[v, v, v] for v in images[id_sample]]
    for antecedent in antecedents:
        if antecedent["inequality"] == "<":
            colorimage[antecedent["attribute"]] = [255, 0, 0]  # red: below the threshold
        else:
            colorimage[antecedent["attribute"]] = [0, 255, 0]  # green: at or above the threshold

    colorimage_array = np.array(colorimage).reshape(28, 28, 3)
    colorimage = Image.fromarray(colorimage_array.astype('uint8'))
    image_path = os.path.join(image_dir, f"img_{id_sample}_out.png")
    colorimage.save(image_path)


You can observe the 10 images in the `images` folder. Let's take a look at the first one:

<center><img src="data/MnistDataset/images/img_0_out.png" width="20%" /></center>
<center><i>First sample</i></center>
<br><br>

The red dots indicate pixels where the rule requires the value to be below a certain threshold, while the green dots represent pixels where the value must be above a threshold. You can observe which pixels are used by the model to decide which digit is represented.
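To locate a colored dot from an antecedent such as `X739`, you can convert the flat attribute index into a row and a column. The sketch below assumes 0-based attribute indices and row-major order, which matches the way the cell above indexes the pixel list and reshapes it to 28x28. `pixel_position` is a hypothetical helper.

```python
def pixel_position(attribute, width=28):
    """Map a flat attribute index to (row, column), row-major and 0-based."""
    return divmod(attribute, width)

# Antecedents of the local rule generated earlier: X739 and X230
for attr in (739, 230):
    print(f"X{attr} -> (row, column) = {pixel_position(attr)}")
# X739 -> (row, column) = (26, 11)
# X230 -> (row, column) = (8, 6)
```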

With the generation of local and global rules using the Fidex algorithms, we have a clearer view of how our model makes predictions. These rules help us understand the model's decisions, making it more transparent. Now, let's wrap up our findings and discuss the importance of explainable AI in the final chapter.

# Conclusion

In this notebook, we explored explainable AI using CNNs and the Fidex family of algorithms. We looked at our dataset, trained a CNN model, and examined the generated rules. We used `Fidex` to create a local rule explaining a given sample and `FidexGloRules` to generate a global ruleset for the entire training dataset. Then, we evaluated the ruleset with `FidexGloStats`, providing insights into the model's accuracy, fidelity, and explainability. Finally, we generated some explanations with `FidexGlo` and observed the rule directly on the image.

This process demonstrated how explainable AI techniques can clarify complex models, making them more transparent and trustworthy. By understanding our model's decision-making process, we can ensure better, more reliable outcomes in various applications. Using CNNs with Fidex offers a balanced approach to building interpretable and effective AI models and it is possible to obtain explanations on any image dataset you want.

To go further, you can explore the notebook on the Cracks dataset, which is another application of CNNs to image classification [TODO: insert link to the Cracks notebook].

# References

TODO: DimlpBT article by Guido
Fidex article by Guido (forthcoming)