# Image Selection

Our dataset contains about 179k images, and we do not want to include all of them in the time-lapse video. For example, some images are too dark because they were captured early in the morning or in the evening. There are other reasons to leave images out as well. We will therefore select a smaller subset for creating our time-lapse video.

## How Do We Select Images?

A very simple approach is to select one image per day, taken around noon. This seems reasonable for two reasons. First, it gives us a relatively small subset of the available data: one image per day leads to around 365 images. Second, image brightness should not be a problem at that time of day.

Other possible approaches:
- select multiple images for each day with a given brightness value
- select multiple images for each day within a given timeframe, e.g. 9am to 5pm
- select multiple images for each day, by selecting the first/last *x* images matching some *condition*

Many other selection criteria are possible.
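
The timeframe and "first/last *x* images" variants from the list above could be sketched as follows. This is only an illustration, not part of the project code: the function name `select_in_window`, its parameters, and the use of UTC for deciding what counts as "9am" are all assumptions. It works on a plain list of Unix timestamps and uses only the standard library.

```python
from collections import defaultdict
from datetime import datetime, timezone


def select_in_window(timestamps, start_hour=9, end_hour=17,
                     per_day=None, take_last=False, tz=timezone.utc):
    """Keep timestamps whose local hour lies in [start_hour, end_hour).

    Hypothetical helper. `per_day` is expected to be a positive integer
    or None; if given, only the first (or, with take_last=True, the last)
    `per_day` matches of each day are kept.
    """
    by_day = defaultdict(list)
    for ts in sorted(timestamps):
        local = datetime.fromtimestamp(ts, tz)
        if start_hour <= local.hour < end_hour:
            by_day[local.date()].append(ts)

    selected = []
    for day in sorted(by_day):
        chosen = by_day[day]
        if per_day is not None:
            chosen = chosen[-per_day:] if take_last else chosen[:per_day]
        selected.extend(chosen)
    return selected
```

The hour check stands in for "some *condition*"; a brightness-based condition would need the image data itself and is not covered by this sketch.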

But for now, let's start with the first, very simple approach: selecting one image per day around noon.

In [1]:
import pandas as pd
import numpy as np
import datetime

from src.features import build_features

ALL_TIMESTAMPS_FILE = '../data/raw/image-timestamps.txt'
SELECTED_TIMESTAMPS_FILE = '../data/processed/selected-image-timestamps-noon.txt'

# load all image timestamps from file
timestamps = pd.read_csv(
    ALL_TIMESTAMPS_FILE,
    names=['timestamp'],
    dtype={'timestamp': np.int64})

In [2]:
# for details of the get_timestamps_in_range function, see src/features/build_features

# extract timestamps at noon
timestamps_noon = build_features.get_timestamps_in_range(timestamps, 12, 5)

display(timestamps_noon)

| | timestamp |
|---|---|
| 0 | 1488884432 |
| 1 | 1488970832 |
| 2 | 1489057232 |
| 3 | 1489143632 |
| 4 | 1489230032 |
| ... | ... |
| 358 | 1522407658 |
| 359 | 1522493999 |
| 360 | 1522580459 |
| 361 | 1522666799 |
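
The implementation of `get_timestamps_in_range` lives in the project sources and is not shown here. As a rough idea of what a "one image per day around noon" selection could look like, here is a minimal standard-library sketch. The function name `select_one_per_day` is hypothetical, and reading the arguments `12` and `5` as "target hour" and "tolerance in minutes" is an assumption, as is the use of UTC as the reference timezone.

```python
from datetime import datetime, timezone


def select_one_per_day(timestamps, hour=12, tolerance_min=5, tz=timezone.utc):
    """Pick, for each day, the timestamp closest to `hour`:00.

    Hypothetical helper, not the project's get_timestamps_in_range.
    Timestamps further than `tolerance_min` minutes from the target
    time are ignored, so a day may end up without any image.
    """
    best = {}  # date -> (distance to target in seconds, timestamp)
    for ts in timestamps:
        local = datetime.fromtimestamp(ts, tz)
        target = local.replace(hour=hour, minute=0, second=0, microsecond=0)
        distance = abs((local - target).total_seconds())
        if distance > tolerance_min * 60:
            continue
        day = local.date()
        if day not in best or distance < best[day][0]:
            best[day] = (distance, ts)
    return sorted(ts for _, ts in best.values())
```

With the DataFrame loaded above, a call might look like `select_one_per_day(timestamps['timestamp'].tolist(), hour=12, tolerance_min=5)`. Whether "noon" should be taken in UTC or in the camera's local time depends on how the timestamps were recorded.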


In [3]:
# write to file
timestamps_noon.to_csv(SELECTED_TIMESTAMPS_FILE, header=False, index=False)
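
Before moving on, it can be worth checking that the selection really contains at most one image per day and that all images fall into the expected hour. The following is a small sketch of such a check; `summarize_selection` is a hypothetical name, and hours are reported in UTC by assumption.

```python
from collections import Counter
from datetime import datetime, timezone


def summarize_selection(timestamps, tz=timezone.utc):
    """Summarize a list of Unix timestamps by day and by hour of day."""
    moments = [datetime.fromtimestamp(ts, tz) for ts in timestamps]
    per_day = Counter(m.date() for m in moments)
    per_hour = Counter(m.hour for m in moments)
    return {
        "days": len(per_day),  # number of distinct days covered
        "duplicate_days": sorted(d for d, n in per_day.items() if n > 1),
        "hours": dict(per_hour),  # hour of day -> number of images
    }
```

It could be called as `summarize_selection(timestamps_noon['timestamp'].tolist())`. An empty `duplicate_days` list and a single dominant entry in `hours` would suggest the selection behaves as intended.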

# Prepare Selected Images for Class Prediction

Find the selected images in the filesystem and copy them to the *selected-images-noon* directory:

```
$ xargs -a ~/selected-image-timestamps-noon.txt -I# find . -name "pic_#.jpg" -exec cp {} selected-images-noon \;
```

If your version of find does not support the *-exec* option, do this in two steps:

```
$ xargs -a ~/selected-image-timestamps-noon.txt -I# find . -name "pic_#.jpg" > selected-images-path-noon.txt
$ xargs -a selected-images-path-noon.txt cp -t selected-images-noon
```
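
Where GNU `xargs` or `cp -t` are not available, the same find-and-copy step could also be done in Python. The sketch below is an assumption-laden alternative, not the method used above: the function name `copy_selected_images` is hypothetical, and it relies on the `pic_<timestamp>.jpg` naming scheme seen in the commands.

```python
import shutil
from pathlib import Path


def copy_selected_images(timestamps_file, search_root, target_dir):
    """Copy every pic_<timestamp>.jpg listed in `timestamps_file`.

    Searches `search_root` recursively and copies matches into
    `target_dir` (created if missing). Returns the copied file names.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    lines = Path(timestamps_file).read_text().splitlines()
    wanted = {f"pic_{line.strip()}.jpg" for line in lines if line.strip()}

    copied = []
    # Materialize the search first so files copied below are not revisited.
    for path in list(Path(search_root).rglob("pic_*.jpg")):
        if target.resolve() in path.resolve().parents:
            continue  # skip files already inside the target directory
        if path.name in wanted:
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return sorted(copied)
```

A call such as `copy_selected_images('selected-image-timestamps-noon.txt', '.', 'selected-images-noon')` would mirror the shell commands; the paths are placeholders.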

The selected images are now saved in the *selected-images-noon* directory and can be used for further processing.