# MXNet `RecordIO` Format Engineering

## Creation of `.rec` Files
In this notebook, we convert an image dataset into a [RecordIO `.rec` file](https://mesos.apache.org/documentation/latest/recordio/) using MXNet's `im2rec.py` tool. A `.rec` file packs the images and their labels into a single binary file, so MXNet can read training data sequentially instead of opening many small image files. We will:

- Install necessary dependencies.
- Define the resize dimension for the images.
- Execute the `im2rec.py` script to generate the `.rec` files.

**Note:** The `im2rec.py` script is available in [the Apache MXNet repository](https://github.com/apache/mxnet/blob/master/tools/im2rec.py). This notebook runs a local copy stored at `utils/im2rec.py`.

### Steps:
1. **Dependency Installation**: Depending on the OS distribution, install the required system and Python dependencies.
2. **Resize Dimension Setting**: Define the resize dimension parameter.
3. **Execution of `im2rec.py` Script**: Run the script to generate `.rec` files from the image dataset located at `airplane_detection/images/`.
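
The commands in this notebook do not pass `--list`, so `im2rec.py` expects `test.lst` and `train.lst` to exist already (the script output reports reading them). Each line of a `.lst` file is tab-separated: an integer index, one or more numeric label values, and an image path relative to the image root. The sketch below is a minimal pre-flight check under that assumption; `check_lst` is a hypothetical helper name, not part of MXNet.

```python
import os

def check_lst(lst_path, image_root):
    """Return (entry count, missing image paths) for a tab-separated .lst file."""
    count, missing = 0, []
    with open(lst_path, "r") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            fields = [x.strip() for x in line.strip().split("\t")]
            if len(fields) < 3:
                raise ValueError(f"line {line_no}: expected index, label(s), path")
            int(fields[0])                     # index must parse as an integer
            [float(v) for v in fields[1:-1]]   # every label value must be numeric
            if not os.path.isfile(os.path.join(image_root, fields[-1])):
                missing.append(fields[-1])     # image referenced but not on disk
            count += 1
    return count, missing
```

For example, `check_lst("train.lst", "airplane_detection/images/")` should report an empty `missing` list before the packing step is run.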

In [2]:
%%capture
!pip install 'numpy==1.23.0'
!pip install opencv-python-headless
!pip install mxnet

In [None]:
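def get_linux_distribution():
    """Return the ID field of /etc/os-release (e.g. "debian"), or None if unavailable."""
    try:
        with open("/etc/os-release", "r") as f:
            for line in f:
                if line.startswith("ID="):
                    # The value may be quoted on some distributions, e.g. ID="centos"
                    return line.split("=", 1)[1].strip().strip("\"'")
    except FileNotFoundError:
        pass  # no os-release file on this system
    return None

# Install the system libraries used here alongside OpenCV on Debian images
if get_linux_distribution() == "debian":
    !apt-get update
    !apt-get install ffmpeg libsm6 libxext6 -y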

In [4]:
import sys
!{sys.executable} -m pip install opencv-python
!{sys.executable} -m pip install mxnet

In [19]:
RESIZE_SIZE = 256
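
To our understanding of the upstream `im2rec.py`, `--resize` sets the length of the shorter image edge and scales the longer edge proportionally, so images are not forced into a 256×256 square. The helper below (`resized_shape`, a hypothetical name) is a minimal sketch of that arithmetic; check it against your copy of the script before relying on it.

```python
def resized_shape(height, width, resize):
    """Return (new_height, new_width) with the shorter edge scaled to `resize`."""
    if height > width:
        # Portrait: width is the shorter edge
        return height * resize // width, resize
    # Landscape or square: height is the shorter edge
    return resize, width * resize // height

print(resized_shape(720, 1280, 256))   # (256, 455)
print(resized_shape(1280, 720, 256))   # (455, 256)
```

If the packed labels hold bounding boxes in pixel coordinates rather than normalized ones, they would need to be scaled by the same factor.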

In [20]:
!python utils/im2rec.py --resize $RESIZE_SIZE --pack-label test airplane_detection/images/

Creating .rec file from /root/test.lst in /root
multiprocessing not available, fall back to single threaded encoding
time: 0.024298906326293945  count: 0


In [21]:
!python utils/im2rec.py --resize $RESIZE_SIZE --pack-label train airplane_detection/images/

Creating .rec file from /root/train.lst in /root
multiprocessing not available, fall back to single threaded encoding
time: 0.039402008056640625  count: 0
time: 29.80021834373474  count: 1000
time: 22.50049614906311  count: 2000


We have generated `.rec` files for the `test` and `train` splits from the images in `airplane_detection/images/`. MXNet can now stream training data from these files instead of opening individual image files.
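
To sanity-check a generated file without loading MXNet, the records can be counted by walking the binary layout. The sketch below assumes the layout used by dmlc-core's RecordIO writer: each chunk starts with a 32-bit magic number `0xCED7230A`, followed by a 32-bit word whose upper 3 bits are a continuation flag and whose lower 29 bits are the payload length, with the payload padded to a 4-byte boundary. It also assumes little-endian byte order. `count_records` is a hypothetical helper, not an MXNet API.

```python
import struct

RECORDIO_MAGIC = 0xCED7230A  # assumption: magic number used by dmlc-core RecordIO

def count_records(rec_path):
    """Count logical records in a .rec file by reading only the chunk headers."""
    count = 0
    with open(rec_path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break  # end of file
            magic, lrecord = struct.unpack("<II", header)
            if magic != RECORDIO_MAGIC:
                raise ValueError("unexpected magic number; not a RecordIO file?")
            cflag = lrecord >> 29                 # upper 3 bits: continuation flag
            length = lrecord & ((1 << 29) - 1)    # lower 29 bits: payload length
            f.seek((length + 3) // 4 * 4, 1)      # skip payload plus padding
            if cflag in (0, 1):                   # 0 = whole record, 1 = first chunk
                count += 1
    return count
```

Comparing `count_records("train.rec")` with the number of lines in `train.lst` would show whether any images were skipped during packing.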

### Achievements:
- **Efficient Data Loading**: Each `.rec` file stores images and labels contiguously, so the data loader can read them sequentially.
- **Preparation for Model Training**: With the `.rec` files created, the dataset is ready to be consumed by MXNet's record-based data iterators.

### Next Steps:
- Use the generated `.rec` files for training your MXNet model.
- Evaluate the model's performance using relevant metrics.
- Optimize the model as needed based on the evaluation results.