# Chapter 4. Creating TFRecords

## 4.1 Goal

Convert the labelled XML annotation files to TFRecord files in two steps: XML to CSV, then CSV to TFRecord.

## 4.2 Convert the labelled XML files to one CSV file per split.

### 4.2.1 Download `xml_to_csv.py` from [datitran's GitHub repository](https://github.com/datitran/raccoon_dataset).

### 4.2.2 Modify `xml_to_csv.py`.

Replace

```python
def main():
    image_path = os.path.join(os.getcwd(), 'annotations')
    xml_df = xml_to_csv(image_path)
    xml_df.to_csv('raccoon_labels.csv', index=None)
    print('Successfully converted xml to csv.')
```

with

```python
def main():
    for directory in ['train', 'test']:
        # Point this at your dataset; use 'macaroni-images/{}' for the macaroni images.
        image_path = os.path.join(os.getcwd(), 'airplane-images/{}'.format(directory))
        xml_df = xml_to_csv(image_path)
        # One CSV per split: data/train_labels.csv and data/test_labels.csv.
        xml_df.to_csv('data/{}_labels.csv'.format(directory), index=None)
        print('Successfully converted {} xml to csv.'.format(directory))
```
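For orientation, the conversion performed in this step can be sketched as follows. This is a minimal illustration, not the upstream script: it assumes the XML files follow the Pascal VOC layout, and `SAMPLE_XML` and `annotation_to_rows` are hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout (an assumption, not a file from this dataset).
SAMPLE_XML = """<annotation>
  <filename>plane_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>airplane</name>
    <bndbox><xmin>48</xmin><ymin>60</ymin><xmax>300</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""


def annotation_to_rows(xml_text):
    """Return one tuple per <object>: filename, width, height, class, xmin, ymin, xmax, ymax."""
    root = ET.fromstring(xml_text)
    filename = root.findtext('filename')
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    rows = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        rows.append((
            filename, width, height, obj.findtext('name'),
            int(box.findtext('xmin')), int(box.findtext('ymin')),
            int(box.findtext('xmax')), int(box.findtext('ymax')),
        ))
    return rows


rows = annotation_to_rows(SAMPLE_XML)
print(rows)
```

Each labelled box becomes one row, so an image with several objects contributes several rows to the CSV.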

### 4.2.3 Prepare for running `xml_to_csv.py`.

(1) Install pandas.

```bash
$ conda install pandas
```

(2) Make a subdirectory `data`.

```bash
$ mkdir data
```

### 4.2.4 Run `xml_to_csv.py`.

```bash
$ python xml_to_csv.py
```
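Before moving on, it can help to confirm that the generated CSV files look as expected. The sketch below assumes the header matches the column names written by the upstream `xml_to_csv.py`; `summarize_labels` is a hypothetical helper, not part of that script.

```python
import csv

# Column names assumed to match the header written by the upstream xml_to_csv.py.
EXPECTED_COLUMNS = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']


def summarize_labels(csv_path):
    """Return (row_count, sorted class names); raise ValueError on an unexpected header."""
    with open(csv_path, newline='') as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != EXPECTED_COLUMNS:
            raise ValueError('unexpected header: {}'.format(reader.fieldnames))
        rows = list(reader)
    return len(rows), sorted({row['class'] for row in rows})
```

Calling `summarize_labels('data/train_labels.csv')` and `summarize_labels('data/test_labels.csv')` should report a non-zero row count and only the class names you labelled.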

## 4.3 Convert the CSV files to TFRecord files.

### 4.3.1 Download `generate_tfrecord.py` from [datitran's GitHub repository](https://github.com/datitran/raccoon_dataset).

### 4.3.2 Modify `generate_tfrecord.py`.

(1) Replace 

```python
# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'raccoon':
        return 1
    else:
        None
```

with

```python
# TO-DO replace this with label map
def class_text_to_int(row_label):
    # Use 'airplane' here instead when converting the airplane images.
    if row_label == 'macncheese':
        return 1
    else:
        return None
```
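The `TO-DO` comment hints at a label map. One possible way to follow it is a dictionary lookup, sketched below. `LABEL_MAP` is a hypothetical name, and raising on an unknown label is a design choice made for this sketch, not the behaviour of the upstream script.

```python
# Hypothetical label map; ids start at 1 because 0 is conventionally reserved for background.
LABEL_MAP = {'macncheese': 1}


def class_text_to_int(row_label):
    """Return the integer id for a class name, failing loudly on unknown labels."""
    try:
        return LABEL_MAP[row_label]
    except KeyError:
        raise ValueError('Unknown label: {!r}'.format(row_label))
```

Failing on an unknown label surfaces a typo in the annotations at conversion time instead of silently passing `None` along.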

(2) Add an `img_path` command-line flag next to the existing flags, then use it in `main` to locate the images.

```python
flags = tf.app.flags
flags.DEFINE_string('img_path', '', 'Path to the images')
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS
```

```python
def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), FLAGS.img_path)
```

### 4.3.3 Clone the TensorFlow models repository and set it up as in Chapter 1.

```bash
$ git clone https://github.com/tensorflow/models.git
$ cd models/research
$ protoc object_detection/protos/*.proto --python_out=.
$ export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
```

### 4.3.4 Install the `object_detection` library.

```bash
$ cd models/research  # skip this if you are still in models/research from 4.3.3
$ python setup.py install
```

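### 4.3.5 Run the `generate_tfrecord.py` script twice, once per split.

Run only the pair of commands that matches the dataset you converted in 4.2 and named in `class_text_to_int`. Both pairs read and write the same files under `data`, so running the second pair overwrites the output of the first.

(1) For macaroni images: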

```bash
$ python generate_tfrecord.py --img_path=macaroni-images/train --csv_input=data/train_labels.csv --output_path=data/train.record
$ python generate_tfrecord.py --img_path=macaroni-images/test --csv_input=data/test_labels.csv --output_path=data/test.record
```

(2) For airplane images:

```bash
$ python generate_tfrecord.py --img_path=airplane-images/train --csv_input=data/train_labels.csv --output_path=data/train.record
$ python generate_tfrecord.py --img_path=airplane-images/test --csv_input=data/test_labels.csv --output_path=data/test.record
```
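To check that the output files are not empty, the records can be counted without importing TensorFlow. The sketch below walks the TFRecord framing (an 8-byte little-endian length, a 4-byte checksum, the payload, and another 4-byte checksum). `count_tfrecords` is a hypothetical helper, and it skips the checksums instead of verifying them.

```python
import struct


def count_tfrecords(path):
    """Count records by walking the TFRecord framing; checksums are skipped, not verified."""
    count = 0
    with open(path, 'rb') as f:
        while True:
            header = f.read(8)                  # uint64 little-endian payload length
            if not header:
                break                           # clean end of file
            if len(header) < 8:
                raise ValueError('truncated length field')
            (length,) = struct.unpack('<Q', header)
            # Length checksum (4 bytes) + payload + payload checksum (4 bytes).
            body = f.read(4 + length + 4)
            if len(body) < length + 8:
                raise ValueError('truncated record')
            count += 1
    return count
```

The counts for `data/train.record` and `data/test.record` should equal the number of distinct image filenames in the corresponding CSV file, assuming the script writes one record per image.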