# <font color='blue'>Training a CNN classifier to recognize 4 objects.</font>

In this notebook, we train a CNN to recognize whether an image shows a chandelier, a motorbike, a watch, or a laptop.   
The dataset is derived from the `Caltech 101` dataset and consists of roughly 400 images.  
We created this subset manually; the full dataset can be found [here](http://www.vision.caltech.edu/Image_Datasets/Caltech101/).  
The directory structure looks like this.  

caltech_subset  
├── test  
│   ├── chandelier  
│   ├── laptop  
│   ├── motorbikes  
│   └── watch  
└── train  
    ├── chandelier  
    ├── laptop  
    ├── motorbikes  
    └── watch  

Because the images are stored in this custom layout, we cannot use a built-in `Dataset` class to load them.  
Hence we need to create our own `Dataset` class, which is shown in the next section.  

Once we create this class, we will also define a simple `CNN` to train on this data.


```C++
// Create an alias for the functional module, like PyTorch's `import torch.nn.functional as F`
namespace F = torch::nn::functional;
```

## <font color='green'>Creating a custom Dataset class</font>

In PyTorch, we inherit from the `torch.utils.data.Dataset` class and override `__len__` and `__getitem__` to report the dataset size and to index a sample, respectively.  
We also define `__init__` according to our needs.    
Once the class is defined, we only need to pass the `Dataset` object to `torch.utils.data.DataLoader`, which handles the data-loading pipeline.

LibTorch follows the same approach, although the syntax is different.


In the next cell, we will create a class called `Caltech` inheriting from `torch::data::Dataset`.

The class has a constructor `Caltech()` that takes three arguments:
1. `input_path`- A text file containing the paths to the images.
2. `output_path`- A text file containing the label of each corresponding image.
3. `Path`- The path to the directory where `train` or `test` exists.

The constructor also stores these `image-paths` and corresponding `labels`.
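The reading step described above can be sketched without LibTorch. This is a minimal illustration, assuming one entry per line in each text file; `read_lines`, `counts_match`, and `read_lines_demo` are hypothetical helper names that are not part of the original program. Unlike the constructor shown later, the sketch skips empty lines and checks that both files hold the same number of entries.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every non-empty line of a stream.
std::vector<std::string> read_lines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            lines.push_back(line);
        }
    }
    return lines;
}

// Hypothetical check: every image path needs exactly one label.
bool counts_match(std::istream& paths, std::istream& labels)
{
    return read_lines(paths).size() == read_lines(labels).size();
}

// Usage on in-memory text instead of files (the contents are made up).
void read_lines_demo()
{
    std::istringstream paths("laptop/a.jpg\nwatch/b.jpg\n");
    std::istringstream labels("1\n3\n");
    assert(counts_match(paths, labels));
}
```

Taking a `std::istream&` rather than a file name lets the same helpers work on a `std::ifstream` in the real program and on a `std::istringstream` in a quick check.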


The function `size()` (equivalent to `__len__`) is overridden to return the total number of samples.

The function `get(size_t index)` (equivalent to `__getitem__`) is overridden to fetch the input image and the corresponding label (in the form of the data structure `Example`) at a particular index.

The text files passed as `input_path` and `output_path` were created beforehand by a separate program.
Each class is associated with a particular label. The mapping is as follows.  
`chandelier - 0, laptop - 1, motorbikes - 2, watch - 3` 
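The program that produced the label files is not shown in this notebook. As an illustration only, the sketch below derives a label from a relative path such as `laptop/image_0001.jpg`, assuming that the class name is the first folder in the path. The function name `label_for_path` and the example file names are hypothetical; only the class-to-label mapping comes from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (not part of the original program): derive the integer
// label from a relative image path such as "laptop/image_0001.jpg".
// Returns -1 when the leading folder is not one of the four known classes.
int label_for_path(const std::string& relative_path)
{
    assert(!relative_path.empty());

    static const std::map<std::string, int> class_to_label = {
        {"chandelier", 0}, {"laptop", 1}, {"motorbikes", 2}, {"watch", 3}};

    // The class name is everything before the first '/'.
    const std::size_t slash = relative_path.find('/');
    if (slash == std::string::npos) {
        return -1;
    }
    const auto it = class_to_label.find(relative_path.substr(0, slash));
    return it == class_to_label.end() ? -1 : it->second;
}
```

Returning `-1` for an unknown folder keeps the failure visible to the caller instead of silently assigning a valid class.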

``` C++
class Caltech : public torch::data::Dataset<Caltech>
{
    private:
        std::vector<std::string>image_paths;// to store the image-paths
        std::vector<int>labels; // to store the corresponding labels
        std::string data_path; // the path to the dataset
        torch::Tensor tensor_, target_; // to return the tensor and label

        // join two path fragments with exactly one '/' between them
        std::string join_paths(std::string head, const std::string& tail)
        {
            if (head.back() != '/') {
                head.push_back('/');
            }
            head += tail;
            return head;
        }

    public:
        // Create a constructor.
        explicit Caltech(const std::string& input_path, const std::string& output_path, const std::string& Path) 
     { 

            data_path = Path;
            // Read the image paths and store them inside `image_paths`
            std::ifstream file1(input_path);
            std::string curline1;
            while (std::getline(file1, curline1))
            {
                image_paths.push_back(curline1);
            }
            file1.close();  

            // Read the labels and store them inside `labels`
            std::ifstream file2(output_path);
            std::string curline2;
            while (std::getline(file2, curline2))
            {
                labels.push_back(std::stoi(curline2));
            }
            file2.close();  

    }

    /// Returns the length of the samples.
    torch::optional<size_t> size() const override
    { return image_paths.size(); }

    /// Returns a pair of input tensor and corresponding label at the given `index`.
    torch::data::Example<> get(size_t index) override
    {
        // read the image at the given index (OpenCV loads it in BGR channel order)
        cv::Mat image = cv::imread(join_paths(data_path, image_paths[index]));
        cv::resize(image, image, cv::Size(160, 160));
        // wrap the pixel buffer as an H x W x 3 byte tensor (no copy yet)
        torch::Tensor tensor_ = torch::from_blob(image.data, {image.rows, image.cols, 3}, at::kByte);
        // converting to float copies the data, so the tensor no longer depends on `image`
        tensor_ = tensor_.toType(at::kFloat);
        // reorder from H x W x C to C x H x W, the layout Conv2d expects
        tensor_ = tensor_.permute({2, 0, 1});
        // store the corresponding label at the given index.
        torch::Tensor target_ = torch::tensor(labels[index]);

        return { tensor_, target_ };
    }

 };
```
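The `permute({2, 0, 1})` call in `get()` reorders the image from height x width x channel (the interleaved layout OpenCV uses) to channel x height x width. The real `permute` only changes how the tensor is indexed; the sketch below performs an explicit copy purely to make the index mapping visible. The function name `hwc_to_chw` is hypothetical and the code uses plain `std::vector` buffers instead of tensors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: copy an interleaved H x W x C byte image into a
// planar C x H x W float buffer, mirroring toType(kFloat) + permute({2, 0, 1}).
std::vector<float> hwc_to_chw(const std::vector<unsigned char>& hwc,
                              std::size_t height,
                              std::size_t width,
                              std::size_t channels)
{
    assert(hwc.size() == height * width * channels);

    std::vector<float> chw(hwc.size());
    for (std::size_t h = 0; h < height; ++h) {
        for (std::size_t w = 0; w < width; ++w) {
            for (std::size_t c = 0; c < channels; ++c) {
                // source index: pixels are stored one after another, channels interleaved
                const std::size_t src = (h * width + w) * channels + c;
                // destination index: one full H x W plane per channel
                const std::size_t dst = (c * height + h) * width + w;
                chw[dst] = static_cast<float>(hwc[src]);
            }
        }
    }
    return chw;
}
```

As in the original `get()`, the values stay in the range 0 to 255 after the conversion; no normalization is applied.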

##  <font color='green'>Create a simple CNN</font>

``` C++
struct Net : torch::nn::Module
{
    Net(int64_t num_classes)    
    {
        // register the layers (sub-modules) so that their parameters are tracked
        conv1_1 = register_module("conv1_1", torch::nn::Conv2d(torch::nn::Conv2dOptions(3, 32, 3).padding(1)));
        conv1_2 = register_module("conv1_2", torch::nn::Conv2d(torch::nn::Conv2dOptions(32, 32, 3)));
        dp1 = register_module("dp1", torch::nn::Dropout(0.25));
        conv2_1 = register_module("conv2_1", torch::nn::Conv2d(torch::nn::Conv2dOptions(32, 64, 3).padding(1)));
        conv2_2 = register_module("conv2_2", torch::nn::Conv2d(torch::nn::Conv2dOptions(64, 64, 3)));
        dp2 = register_module("dp2", torch::nn::Dropout(0.25));
        conv3_1 = register_module("conv3_1", torch::nn::Conv2d(torch::nn::Conv2dOptions(64, 64, 3).padding(1)));
        conv3_2 = register_module("conv3_2", torch::nn::Conv2d(torch::nn::Conv2dOptions(64, 64, 3)));
        dp3 = register_module("dp3", torch::nn::Dropout(0.25));
        fc1 = register_module("fc1", torch::nn::Linear(2 * 2 * 64 * 81, 512));
        dp4 = register_module("dp4", torch::nn::Dropout(0.5));
        fc2 = register_module("fc2", torch::nn::Linear(512, num_classes));
    }
    // the forward function to guide the flow of tensors.
    torch::Tensor forward(torch::Tensor x)
    {
        x = torch::relu(conv1_1->forward(x));
        x = torch::relu(conv1_2->forward(x));
        x = torch::max_pool2d(x, 2);
        x = dp1(x);

        x = torch::relu(conv2_1->forward(x));
        x = torch::relu(conv2_2->forward(x));
        x = torch::max_pool2d(x, 2);
        x = dp2(x);
        
        x = torch::relu(conv3_1->forward(x));
        x = torch::relu(conv3_2->forward(x));
        x = torch::max_pool2d(x, 2);
        x = dp3(x);

        x = x.view({-1, 2 * 2 * 64 * 81});
        
        x = torch::relu(fc1->forward(x));
        x = dp4(x);
        x = torch::log_softmax(fc2->forward(x), 1);
        
        return x;
    }
    // declare the layers as empty holders; they are constructed in the constructor above.
    torch::nn::Conv2d conv1_1{nullptr};
    torch::nn::Conv2d conv1_2{nullptr};
    torch::nn::Conv2d conv2_1{nullptr};
    torch::nn::Conv2d conv2_2{nullptr};
    torch::nn::Conv2d conv3_1{nullptr};
    torch::nn::Conv2d conv3_2{nullptr};
    torch::nn::Dropout dp1{nullptr};
    torch::nn::Dropout dp2{nullptr};
    torch::nn::Dropout dp3{nullptr};
    torch::nn::Dropout dp4{nullptr};
    torch::nn::Linear fc1{nullptr};
    torch::nn::Linear fc2{nullptr};
};
```
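The input size of `fc1`, written as `2 * 2 * 64 * 81`, follows from the 160 x 160 input produced by the dataset class. The sketch below retraces that arithmetic with the usual output-size formula for convolution and pooling, `(in + 2 * padding - kernel) / stride + 1`, assuming dilation 1 and floor division. The helper names `out_size` and `flattened_features` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Output side length of a square convolution or pooling window
// (floor division, dilation 1): (in + 2 * padding - kernel) / stride + 1.
std::int64_t out_size(std::int64_t in, std::int64_t kernel,
                      std::int64_t stride, std::int64_t padding)
{
    assert(in + 2 * padding >= kernel);
    return (in + 2 * padding - kernel) / stride + 1;
}

// Number of features entering fc1 for a square input of side `side`,
// following the three conv-conv-pool stages of `Net`.
std::int64_t flattened_features(std::int64_t side)
{
    const std::int64_t channels = 64;  // output channels of conv3_2
    for (int stage = 0; stage < 3; ++stage) {
        side = out_size(side, 3, 1, 1);  // 3x3 conv, padding 1: size unchanged
        side = out_size(side, 3, 1, 0);  // 3x3 conv, no padding: size - 2
        side = out_size(side, 2, 2, 0);  // 2x2 max-pool, stride 2: halved (floor)
    }
    return channels * side * side;
}
```

For a side of 160 the spatial size goes 160, 158, 79 in the first stage, then 79, 77, 38, then 38, 36, 18, so `flattened_features(160)` is 64 * 18 * 18 = 20736, which equals `2 * 2 * 64 * 81`. Changing the resize in `get()` would require updating this number in both `fc1` and the `view` call.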

## <font color='green'> Creating functions for training and testing on every epoch</font>

``` C++
// create a template function for training.
// this function is called for every epoch.
template <typename DataLoader>
void train(int32_t epoch, Net& model, torch::Device device, DataLoader& data_loader, torch::optim::Optimizer& optimizer, size_t dataset_size)
{   // Put the model into training mode.
    model.train();
    double train_loss = 0;
    int32_t correct = 0;
    size_t batch_idx = 0;  
    // iterate for every batch in the dataset
    for (auto& batch : data_loader) {
        auto x = batch.data.to(device), targets = batch.target.to(device);
        optimizer.zero_grad();
        auto output = model.forward(x);
        // calculate the loss
        auto loss = F::cross_entropy(output, targets);
        AT_ASSERT(!std::isnan(loss.template item<float>()));
        // calculate the gradients and update the parameters
        loss.backward();
        optimizer.step();
        // accumulate the loss and count the correct predictions
        train_loss += loss.template item<float>();
        auto pred = output.argmax(1);
        correct += pred.eq(targets).sum().template item<int64_t>();
        batch_idx+=1;       
    }
    train_loss /= batch_idx;
    std::printf(
        "\n   Train set: Average loss: %.4f | Accuracy: %.3f",
        train_loss,
        static_cast<double>(correct) / dataset_size);
}
```
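`Net::forward` already returns `log_softmax` outputs, and `F::cross_entropy` applies a log-softmax of its own before taking the negative log-likelihood. Because log-softmax is idempotent, the loss value is the same either way. The sketch below checks this for a single sample with plain `std::vector` values; it is an illustration of the math, not LibTorch code, and the function names only mirror the LibTorch ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable log-softmax of one row of scores.
std::vector<double> log_softmax(const std::vector<double>& scores)
{
    assert(!scores.empty());
    const double max_score = *std::max_element(scores.begin(), scores.end());
    double sum = 0.0;
    for (double s : scores) {
        sum += std::exp(s - max_score);
    }
    const double log_sum = max_score + std::log(sum);

    std::vector<double> out;
    out.reserve(scores.size());
    for (double s : scores) {
        out.push_back(s - log_sum);
    }
    return out;
}

// Cross-entropy of one sample: the negative log-probability of the target class.
double cross_entropy(const std::vector<double>& scores, std::size_t target)
{
    assert(target < scores.size());
    return -log_softmax(scores)[target];
}
```

Calling `cross_entropy(log_softmax(scores), t)` gives the same value as `cross_entropy(scores, t)` up to rounding. For four classes with equal scores the loss is ln 4, roughly 1.386, which is a useful reference point for the loss of an untrained four-class model.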


``` C++
// create a template function for testing.
// this function is called for every epoch.
template <typename DataLoader>
void test(Net& model, torch::Device device, DataLoader& data_loader, size_t dataset_size)
{
    // LibTorch equivalent of PyTorch's `with torch.no_grad():`
    torch::NoGradGuard no_grad;
    // Switch the model to evaluation mode.
    model.eval();
    double test_loss = 0;
    int32_t correct = 0;
    // iterate over every batch in the dataset
    for (const auto& batch : data_loader) {
        auto data = batch.data.to(device), targets = batch.target.to(device);
        auto output = model.forward(data);
        // Calculate the loss
        test_loss += F::cross_entropy(
            output,
            targets,
            F::CrossEntropyFuncOptions().ignore_index(-100).reduction(torch::kSum))
            .template item<float>();
        // calculate the accuracy
        auto pred = output.argmax(1);
        correct += pred.eq(targets).sum().template item<int64_t>();
    }

    test_loss /= dataset_size;
    std::printf(
        "\n    Test set: Average loss: %.4f | Accuracy: %.3f\n",
        test_loss,
        static_cast<double>(correct) / dataset_size);
}
```
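The two functions report the average loss in slightly different ways: `train` averages the per-batch mean losses (it divides by the number of batches), while `test` sums the per-sample losses and divides by the dataset size. The two agree only when all batches have the same size. The sketch below illustrates the difference on made-up loss values; `Batches`, `mean_of_batch_means`, and `per_sample_mean` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-sample losses grouped by batch (values are made up for illustration).
using Batches = std::vector<std::vector<double>>;

// What `train` reports: the mean of the per-batch mean losses.
double mean_of_batch_means(const Batches& batches)
{
    assert(!batches.empty());
    double total = 0.0;
    for (const auto& batch : batches) {
        assert(!batch.empty());
        double batch_sum = 0.0;
        for (double loss : batch) {
            batch_sum += loss;
        }
        total += batch_sum / static_cast<double>(batch.size());
    }
    return total / static_cast<double>(batches.size());
}

// What `test` reports: the sum of all per-sample losses over the sample count.
double per_sample_mean(const Batches& batches)
{
    double total = 0.0;
    std::size_t count = 0;
    for (const auto& batch : batches) {
        for (double loss : batch) {
            total += loss;
            ++count;
        }
    }
    assert(count > 0);
    return total / static_cast<double>(count);
}
```

With one full batch `{1.0, 1.0}` and a smaller final batch `{4.0}`, the first function gives 2.5 and the second gives 2.0, so a short last batch is weighted more heavily in the training figure.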

## <font color='green'> Code inside the "main" Function </font>

``` C++
// declare the hyperparameters
int numEpochs = 8; // number of epochs
int trainBatchSize = 32; // batch size of training data
int testBatchSize = 16; // batch size of test data

// check if gpu is available
torch::DeviceType device_type;
if (torch::cuda::is_available()) {
    std::cout << "CUDA available! Training on GPU." << std::endl;
    device_type = torch::kCUDA;
}
else {
    std::cout << "Training on CPU." << std::endl;
    device_type = torch::kCPU;
}
torch::Device device(device_type);

// instantiate a model called `CaltechClassifier` and push it to device. (gpu or cpu)
Net CaltechClassifier(4);
CaltechClassifier.to(device);
```

```C++
// Setting the necessary attributes to a tensor.
auto options = torch::TensorOptions().dtype(torch::kFloat64); // .device(torch::kCPU, 1)
        
// Initialize the `Caltech` dataset with respective paths for both train and test data.
 
auto trainData = Caltech("./caltech_subset/train_paths.txt","./train_labels.txt", "./caltech_subset/train").map(torch::data::transforms::Stack<>());
auto testData = Caltech("./caltech_subset/test_paths.txt", "./test_labels.txt", "./caltech_subset/test").map(torch::data::transforms::Stack<>());
```

```C++
// create the `DataLoader` for training as well as testing data.
auto trainDataloader = torch::data::make_data_loader<torch::data::samplers::RandomSampler>(trainData, trainBatchSize);
auto testDataloader = torch::data::make_data_loader<torch::data::samplers::RandomSampler>(testData, testBatchSize);

// Get the number of samples in training data as well as test-data
const int64_t trainLen = trainData.size().value();
const int64_t testLen = testData.size().value();

// create an optimizer `Adam` with the parameters of the model to optimize.
// We will use beta1 as 0.5 and beta2 as 0.9, with learning-rate of 0.0002
torch::optim::Adam optimizer(CaltechClassifier.parameters(), torch::optim::AdamOptions(2e-4).betas(std::make_tuple(0.5,0.9)));
```
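To make the optimizer settings concrete, the sketch below applies the standard Adam update rule to a single scalar parameter with the values used above (learning rate 2e-4, beta1 0.5, beta2 0.9). It is a plain C++ illustration, not the LibTorch implementation; `AdamState`, `AdamConfig`, and `adam_step` are hypothetical names, and `eps = 1e-8` is an assumption because the notebook does not set it.

```cpp
#include <cassert>
#include <cmath>

// Running statistics that Adam keeps for one parameter.
struct AdamState {
    double m = 0.0;  // first-moment (mean) estimate of the gradient
    double v = 0.0;  // second-moment (uncentered variance) estimate
    long step = 0;   // number of updates applied so far
};

// Hyperparameters; lr and betas follow the notebook, eps is an assumption.
struct AdamConfig {
    double lr = 2e-4;
    double beta1 = 0.5;
    double beta2 = 0.9;
    double eps = 1e-8;
};

// Apply one Adam update to a scalar parameter and return the new value.
double adam_step(double param, double grad, AdamState& state,
                 const AdamConfig& cfg = AdamConfig{})
{
    assert(cfg.lr > 0.0);
    state.step += 1;
    state.m = cfg.beta1 * state.m + (1.0 - cfg.beta1) * grad;
    state.v = cfg.beta2 * state.v + (1.0 - cfg.beta2) * grad * grad;
    // bias correction compensates for m and v starting at zero
    const double m_hat = state.m / (1.0 - std::pow(cfg.beta1, state.step));
    const double v_hat = state.v / (1.0 - std::pow(cfg.beta2, state.step));
    return param - cfg.lr * m_hat / (std::sqrt(v_hat) + cfg.eps);
}
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of the gradient's magnitude.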

```C++
// for every epoch call the train function as well as the test function.
for (int epoch = 1; epoch <= numEpochs; ++epoch)
{
    std::cout << "Epoch " << epoch << " statistics." << std::endl;
    train(epoch, CaltechClassifier, device, *trainDataloader, optimizer, trainLen);
    test(CaltechClassifier, device, *testDataloader, testLen);
}
```

**All the code shown above has been written in a C++ file. Let's run a few bash commands to execute that code. The following code cells will build and run the code in the CPP file.**

Note that we use the `OpenCV` C++ library to read images.  
Hence we need to specify the path to OpenCV's CMake files.  

Essentially, we need to set the path to the directory that contains the files `OpenCVConfig.cmake`, `OpenCVModules.cmake`, `OpenCVConfig-version.cmake`, and `OpenCVModules-release.cmake`.  
On my desktop, these are located at `/home/chetan/cv2/OpenCV/lib/cmake/opencv4`.  

Therefore, you need to set this path manually in `CMakeLists.txt`, in the line  
`find_package(OpenCV REQUIRED PATHS "path/to/opencv/cmake-files")`

In [1]:
%cd libtorch-train-cnn/

/home/chetan/projects/piethon/pth_course/c3_w15_dl_pytorch/LibTorch/libtorch-train-cnn


In [2]:
!ls

caltech_subset	Caltech_training.cpp  CMakeLists.txt  run_training.sh


In [3]:
!mkdir build

In [4]:
%cd build

/home/chetan/projects/piethon/pth_course/c3_w15_dl_pytorch/LibTorch/libtorch-train-cnn/build


In [5]:
!cmake ..

-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE  
-- Found Torch: /home/chetan/projects/piethon/pth_course/c3_w15_d

In [6]:
!cmake --build . --config Release

Scanning dependencies of target Libtorch-week15-trainCNN
[ 50%] Building CXX object CMakeFiles/Libtorch-week15-trainCNN.dir/Caltech_training.cpp.o
[100%] Linking CXX executable Libtorch-week15-trainCNN
[100%] Built target Libtorch-week15-trainCNN


In [7]:
%cd ..

/home/chetan/projects/piethon/pth_course/c3_w15_dl_pytorch/LibTorch/libtorch-train-cnn


In [8]:
!./build/Libtorch-week15-trainCNN

Training on CPU.
Epoch 1 statistics.
   Train set: Average loss: 1.9607 | Accuracy: 0.287
   Test set: Average loss: 1.3588 | Accuracy: 0.522
Epoch 2 statistics.
   Train set: Average loss: 1.2910 | Accuracy: 0.408
   Test set: Average loss: 1.1972 | Accuracy: 0.500
Epoch 3 statistics.
   Train set: Average loss: 1.0302 | Accuracy: 0.576
   Test set: Average loss: 0.9109 | Accuracy: 0.717
Epoch 4 statistics.
   Train set: Average loss: 0.6811 | Accuracy: 0.735
   Test set: Average loss: 1.0157 | Accuracy: 0.565
Epoch 5 statistics.
   Train set: Average loss: 0.5266 | Accuracy: 0.818
   Test set: Average loss: 0.3735 | Accuracy: 0.913
Epoch 6 statistics.
   Train set: Average loss: 0.4325 | Accuracy: 0.834
   Test set: Average loss: 0.2998 | Accuracy: 0.913
Epoch 7 statistics.
   Train set: Average loss: 0.3048 | Accuracy: 0.887
   Test set: Average loss: 0.7111 | Accuracy: 0.696
Epoch 8 statistics.
   Train set: Average loss: 0.3518 | Accuracy: 0.866
   Test set: Average loss: 0.3403 |

## <font color='green'> References </font>

1. http://www.vision.caltech.edu/Image_Datasets/Caltech101/