# Dataframe Generator
This notebook generates the train and test dataframes from the raw images in the dataset directory. Each image is described by hand-crafted features:
- average red of the image (continuous numeric value)
- average green of the image (continuous numeric value)
- average blue of the image (continuous numeric value)
- average saturation of the image (continuous numeric value)
- contrast of the image (continuous numeric value)
- angular second moment (ASM) of the image (continuous numeric value)
- homogeneity of the image (continuous numeric value)

Each row also stores the image path (`img_path`) and the image label (`type_label`).
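
The `feature_extractor` module is not shown in this notebook, so the following is only a minimal sketch of how features like these could be computed, not the project's actual implementation. It assumes 8-bit RGB pixels, HSV saturation for `avg_s`, and a gray-level co-occurrence matrix (GLCM) built from horizontally adjacent pixels for the three texture features. The function names `color_features` and `glcm_features` are hypothetical.

```python
import colorsys


def color_features(pixels):
    """Average R, G, B (0-255) and HSV saturation (0-1) over (r, g, b) tuples."""
    n = len(pixels)
    avg_r = sum(r for r, _, _ in pixels) / n
    avg_g = sum(g for _, g, _ in pixels) / n
    avg_b = sum(b for _, _, b in pixels) / n
    # colorsys expects channels scaled to [0, 1]; index 1 of HSV is saturation
    avg_s = sum(
        colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in pixels
    ) / n
    return {"avg_r": avg_r, "avg_g": avg_g, "avg_b": avg_b, "avg_s": avg_s}


def glcm_features(gray):
    """Contrast, ASM and homogeneity from a GLCM of horizontal neighbours.

    `gray` is a 2D list of integer gray levels. Assumption: offset (0, 1),
    non-symmetric counts, normalised to probabilities.
    """
    counts = {}
    total = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            counts[(left, right)] = counts.get((left, right), 0) + 1
            total += 1

    contrast = asm = homogeneity = 0.0
    for (i, j), count in counts.items():
        p = count / total
        contrast += p * (i - j) ** 2           # large when neighbours differ a lot
        asm += p ** 2                          # large when a few pairs dominate
        homogeneity += p / (1 + (i - j) ** 2)  # large when neighbours are similar
    return {"contrast": contrast, "ASM": asm, "homogeneity": homogeneity}


# Toy inputs: one pure-red and one gray pixel, and a 2x3 gray-level grid.
colors = color_features([(255, 0, 0), (128, 128, 128)])
texture = glcm_features([[0, 0, 1], [0, 0, 1]])
```

On real images, a library routine such as scikit-image's GLCM functions would normally replace the hand-written loop; the pure-Python version is used here only to keep the arithmetic visible.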

### Note
You don't have to run this notebook again on your machine. All the results are already saved as `train_images_features.csv` and `test_images_features.csv`, which can be fed to the machine learning models directly or after some extra preprocessing.

However, if you still want to run it, be aware that **it can take considerable time and resources, and may bring your computer down if you don't have enough resources**.
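
To show how the saved files might be consumed, here is a small sketch that reads a CSV in the same layout and splits it into features and labels. An in-memory sample with two rows taken from this notebook's training output stands in for `train_images_features.csv`; the names `feature_columns`, `X`, and `y` are illustrative choices, not part of the project.

```python
import io

import pandas as pd

# Stand-in for train_images_features.csv (same columns, two sample rows).
sample_csv = io.StringIO(
    "id,img_path,avg_r,avg_g,avg_b,avg_s,contrast,ASM,homogeneity,type_label\n"
    "0,./dataset/cloudy/cloudy168.jpg,87.07608,91.337559,96.500416,"
    "0.132656,6.308549,0.003071,0.619131,cloudy\n"
    "1,./dataset/cloudy/cloudy87.jpg,104.670648,113.177775,127.331482,"
    "0.209552,17.488994,0.002722,0.594842,cloudy\n"
)

# With the real file: pd.read_csv("train_images_features.csv", index_col="id")
train_df = pd.read_csv(sample_csv, index_col="id")

feature_columns = ["avg_r", "avg_g", "avg_b", "avg_s",
                   "contrast", "ASM", "homogeneity"]
X = train_df[feature_columns]  # numeric inputs for a model
y = train_df["type_label"]     # target class
```

Because `contrast` spans a much wider range than `ASM` in the outputs shown here, feature scaling is one plausible example of the "extra preprocessing" mentioned above.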

In [1]:
import glob
import pandas as pd

from feature_extractor import extract_images_features

In [2]:
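def generate_train(path, train_label):
    """Extract features for every image under path/<label>/ and stack them into one dataframe."""
    df_list = []
    for label in train_label:
        print("\nProcessing label {}".format(label))

        # `path` must end with a separator, e.g. "./dataset/"
        image_paths = glob.glob(path + "{}/*".format(label))
        # every image in this folder shares the folder's label
        labels = [label] * len(image_paths)
        df_list.append(extract_images_features(image_paths, labels))

    return pd.concat(df_list)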
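
`generate_train` assumes one sub-folder per label under `path`. The sketch below reproduces only the path-and-label collection step on a throwaway directory, so it can be run without the dataset. It uses `os.path.join` instead of string concatenation, which is a swapped-in choice rather than what the function above does, and `collect_labeled_paths` is a hypothetical helper name.

```python
import glob
import os
import tempfile


def collect_labeled_paths(root, labels):
    """Return parallel lists of image paths and labels from root/<label>/*."""
    paths, targets = [], []
    for label in labels:
        # sorted() makes the order reproducible; glob order is arbitrary
        found = sorted(glob.glob(os.path.join(root, label, "*")))
        paths.extend(found)
        targets.extend([label] * len(found))
    return paths, targets


# Build a tiny fake dataset: two "cloudy" files and one "rainy" file.
with tempfile.TemporaryDirectory() as root:
    for label, names in {"cloudy": ["a.jpg", "b.jpg"], "rainy": ["c.jpg"]}.items():
        os.makedirs(os.path.join(root, label))
        for name in names:
            open(os.path.join(root, label, name), "w").close()

    paths, targets = collect_labeled_paths(root, ["cloudy", "rainy"])
    file_names = [os.path.basename(p) for p in paths]
```

Joining path components this way removes the requirement that `path` end with a slash.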

In [3]:
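def generate_test(path, children_path, test_label_map):
    """Extract features for the test images, inferring each label from its file-name prefix."""
    labeled_paths = []
    labels = []
    for image_path in glob.glob(path + children_path):
        # compare case-insensitively, e.g. "Cloud_4.jpg" matches the "cloud" prefix
        file_name = image_path.split('/')[-1].lower()
        for prefix, label in test_label_map.items():
            if file_name.startswith(prefix):
                labeled_paths.append(image_path)
                labels.append(label)
                break
        # files without a known prefix are skipped so paths and labels stay aligned

    return extract_images_features(labeled_paths, labels)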
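
The label lookup in `generate_test` relies on file-name prefixes. As a small illustration, the sketch below isolates that lookup in a hypothetical helper, `label_from_name`, which returns `None` for files that match no prefix so the caller can decide whether to skip them. `os.path.basename` is used here in place of splitting on `/`, and `notes.txt` is a made-up file name used only to show the unmatched case.

```python
import os


def label_from_name(image_path, label_map):
    """Map a file name to a label by case-insensitive prefix, or None."""
    file_name = os.path.basename(image_path).lower()
    for prefix, label in label_map.items():
        if file_name.startswith(prefix):
            return label
    return None


label_map = {"cloud": "cloudy", "foggy": "foggy", "rain": "rainy",
             "shine": "shine", "sunrise": "sunrise"}

matched = label_from_name("./dataset/alien_test/Cloud_4.jpg", label_map)
unmatched = label_from_name("./dataset/alien_test/notes.txt", label_map)  # hypothetical file
```

Returning `None` makes an unmatched file explicit instead of silently leaving the label list shorter than the path list.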

In [4]:
train_label = ["cloudy", "foggy", "rainy", "shine", "sunrise"]
train_df = generate_train("./dataset/", train_label)

display(train_df.head())
train_df.to_csv("train_images_features.csv", index_label="id")


Processing label cloudy
Processing 1 / 300 images
Processing 2 / 300 images
Processing 3 / 300 images
Processing 4 / 300 images
Processing 5 / 300 images
Processing 6 / 300 images
Processing 7 / 300 images
Processing 8 / 300 images
Processing 9 / 300 images
Processing 10 / 300 images
Processing 11 / 300 images
Processing 12 / 300 images
Processing 13 / 300 images
Processing 14 / 300 images
Processing 15 / 300 images
Processing 16 / 300 images
Processing 17 / 300 images
Processing 18 / 300 images
Processing 19 / 300 images
Processing 20 / 300 images
Processing 21 / 300 images
Processing 22 / 300 images
Processing 23 / 300 images
Processing 24 / 300 images
Processing 25 / 300 images
Processing 26 / 300 images
Processing 27 / 300 images
Processing 28 / 300 images
Processing 29 / 300 images
Processing 30 / 300 images
Processing 31 / 300 images
Processing 32 / 300 images
Processing 33 / 300 images
Processing 34 / 300 images
Processing 35 / 300 images
Processing 36 / 300 images
...

Processing 2 / 300 images
Processing 3 / 300 images
Processing 4 / 300 images
Processing 5 / 300 images
Processing 6 / 300 images
Processing 7 / 300 images
Processing 8 / 300 images
Processing 9 / 300 images
Processing 10 / 300 images
Processing 11 / 300 images
Processing 12 / 300 images
Processing 13 / 300 images
Processing 14 / 300 images
Processing 15 / 300 images
Processing 16 / 300 images
Processing 17 / 300 images
Processing 18 / 300 images
Processing 19 / 300 images
Processing 20 / 300 images
Processing 21 / 300 images
Processing 22 / 300 images
Processing 23 / 300 images
Processing 24 / 300 images
Processing 25 / 300 images
Processing 26 / 300 images
Processing 27 / 300 images
Processing 28 / 300 images
Processing 29 / 300 images
Processing 30 / 300 images
Processing 31 / 300 images
Processing 32 / 300 images
Processing 33 / 300 images
Processing 34 / 300 images
Processing 35 / 300 images
Processing 36 / 300 images
Processing 37 / 300 images
Processing 38 / 300 images
...

Processing 299 / 300 images
Processing 300 / 300 images

Processing label rainy
Processing 1 / 300 images
Processing 2 / 300 images
Processing 3 / 300 images
Processing 4 / 300 images
Processing 5 / 300 images
Processing 6 / 300 images
Processing 7 / 300 images
Processing 8 / 300 images
Processing 9 / 300 images
Processing 10 / 300 images
Processing 11 / 300 images
Processing 12 / 300 images
Processing 13 / 300 images
Processing 14 / 300 images
Processing 15 / 300 images
Processing 16 / 300 images
Processing 17 / 300 images
Processing 18 / 300 images
Processing 19 / 300 images
Processing 20 / 300 images
Processing 21 / 300 images
Processing 22 / 300 images
Processing 23 / 300 images
Processing 24 / 300 images
Processing 25 / 300 images
Processing 26 / 300 images
Processing 27 / 300 images
Processing 28 / 300 images
Processing 29 / 300 images
Processing 30 / 300 images
Processing 31 / 300 images
Processing 32 / 300 images
Processing 33 / 300 images
Processing 34 / 300 images
...

Processing 295 / 300 images
Processing 296 / 300 images
Processing 297 / 300 images
Processing 298 / 300 images
Processing 299 / 300 images
Processing 300 / 300 images

Processing label shine
Processing 1 / 250 images
Processing 2 / 250 images
Processing 3 / 250 images
Processing 4 / 250 images
Processing 5 / 250 images
Processing 6 / 250 images
Processing 7 / 250 images
Processing 8 / 250 images
Processing 9 / 250 images
Processing 10 / 250 images
Processing 11 / 250 images
Processing 12 / 250 images
Processing 13 / 250 images
Processing 14 / 250 images
Processing 15 / 250 images
Processing 16 / 250 images
Processing 17 / 250 images
Processing 18 / 250 images
Processing 19 / 250 images
Processing 20 / 250 images
Processing 21 / 250 images
Processing 22 / 250 images
Processing 23 / 250 images
Processing 24 / 250 images
Processing 25 / 250 images
Processing 26 / 250 images
Processing 27 / 250 images
Processing 28 / 250 images
Processing 29 / 250 images
Processing 30 / 250 images
...

Processing 43 / 350 images
Processing 44 / 350 images
Processing 45 / 350 images
Processing 46 / 350 images
Processing 47 / 350 images
Processing 48 / 350 images
Processing 49 / 350 images
Processing 50 / 350 images
Processing 51 / 350 images
Processing 52 / 350 images
Processing 53 / 350 images
Processing 54 / 350 images
Processing 55 / 350 images
Processing 56 / 350 images
Processing 57 / 350 images
Processing 58 / 350 images
Processing 59 / 350 images
Processing 60 / 350 images
Processing 61 / 350 images
Processing 62 / 350 images
Processing 63 / 350 images
Processing 64 / 350 images
Processing 65 / 350 images
Processing 66 / 350 images
Processing 67 / 350 images
Processing 68 / 350 images
Processing 69 / 350 images
Processing 70 / 350 images
Processing 71 / 350 images
Processing 72 / 350 images
Processing 73 / 350 images
Processing 74 / 350 images
Processing 75 / 350 images
Processing 76 / 350 images
Processing 77 / 350 images
Processing 78 / 350 images
Processing 79 / 350 images
...

Processing 338 / 350 images
Processing 339 / 350 images
Processing 340 / 350 images
Processing 341 / 350 images
Processing 342 / 350 images
Processing 343 / 350 images
Processing 344 / 350 images
Processing 345 / 350 images
Processing 346 / 350 images
Processing 347 / 350 images
Processing 348 / 350 images
Processing 349 / 350 images
Processing 350 / 350 images


| | img_path | avg_r | avg_g | avg_b | avg_s | contrast | ASM | homogeneity | type_label |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ./dataset/cloudy/cloudy168.jpg | 87.07608 | 91.337559 | 96.500416 | 0.132656 | 6.308549 | 0.003071 | 0.619131 | cloudy |
| 1 | ./dataset/cloudy/cloudy87.jpg | 104.670648 | 113.177775 | 127.331482 | 0.209552 | 17.488994 | 0.002722 | 0.594842 | cloudy |
| 2 | ./dataset/cloudy/cloudy141.jpg | 116.599169 | 130.515403 | 150.623516 | 0.294946 | 511.862221 | 0.000795 | 0.256738 | cloudy |
| 3 | ./dataset/cloudy/cloudy274.jpg | 134.550592 | 137.921905 | 131.195361 | 0.163266 | 4.876999 | 0.00261 | 0.604593 | cloudy |
| 4 | ./dataset/cloudy/cloudy273.jpg | 123.177525 | 128.685279 | 136.836869 | 0.103191 | 12.65495 | 0.001312 | 0.427159 | cloudy |
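
One detail worth checking in the saved CSV: `pd.concat` keeps each input frame's own index by default. If `extract_images_features` returns frames indexed from 0 (an assumption, since that function is not shown here), the `id` column written by `to_csv(..., index_label="id")` would restart at 0 for every label. The sketch below uses two small stand-in frames to show the default behaviour and the `ignore_index=True` alternative.

```python
import pandas as pd

# Stand-ins for two per-label frames, each with a default 0-based index.
cloudy = pd.DataFrame({"type_label": ["cloudy", "cloudy"]})
rainy = pd.DataFrame({"type_label": ["rainy", "rainy"]})

combined = pd.concat([cloudy, rainy])                       # original indexes are kept
renumbered = pd.concat([cloudy, rainy], ignore_index=True)  # fresh 0..n-1 index

default_ids = list(combined.index)
unique_ids = list(renumbered.index)
```

If unique row ids matter downstream, `ignore_index=True` (or a later `reset_index(drop=True)`) is one way to get them.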


In [5]:
test_label_map = {
    "cloud" : "cloudy",
    "foggy" : "foggy",
    "rain" : "rainy",
    "shine" : "shine",
    "sunrise" : "sunrise"
}
test_df = generate_test("./dataset/", "alien_test/*", test_label_map)

display(test_df)
test_df.to_csv("test_images_features.csv", index_label="id")

Processing 1 / 30 images
Processing 2 / 30 images
Processing 3 / 30 images
Processing 4 / 30 images
Processing 5 / 30 images
Processing 6 / 30 images
Processing 7 / 30 images
Processing 8 / 30 images
Processing 9 / 30 images
Processing 10 / 30 images
Processing 11 / 30 images
Processing 12 / 30 images
Processing 13 / 30 images
Processing 14 / 30 images
Processing 15 / 30 images
Processing 16 / 30 images
Processing 17 / 30 images
Processing 18 / 30 images
Processing 19 / 30 images
Processing 20 / 30 images
Processing 21 / 30 images
Processing 22 / 30 images
Processing 23 / 30 images
Processing 24 / 30 images
Processing 25 / 30 images
Processing 26 / 30 images
Processing 27 / 30 images
Processing 28 / 30 images
Processing 29 / 30 images
Processing 30 / 30 images


| | img_path | avg_r | avg_g | avg_b | avg_s | contrast | ASM | homogeneity | type_label |
|---|---|---|---|---|---|---|---|---|---|
| 0 | ./dataset/alien_test/rain_1.jpg | 89.380456 | 108.331348 | 92.704043 | 0.34014 | 104.245269 | 0.001284 | 0.284216 | rainy |
| 1 | ./dataset/alien_test/shine_1.jpg | 84.179988 | 145.610668 | 202.758774 | 0.650524 | 25.325296 | 0.003029 | 0.637173 | shine |
| 2 | ./dataset/alien_test/foggy_5.jpg | 111.901819 | 122.010252 | 123.489308 | 0.122873 | 11.67735 | 0.0025 | 0.705304 | foggy |
| 3 | ./dataset/alien_test/Cloud_4.jpg | 88.340293 | 151.776627 | 193.868678 | 0.557622 | 17.755358 | 0.00265 | 0.506353 | cloudy |
| 4 | ./dataset/alien_test/foggy_7.jpg | 122.728178 | 73.513052 | 36.486659 | 0.714519 | 8.337532 | 0.003289 | 0.732819 | foggy |
| 5 | ./dataset/alien_test/sunrise_5.jpg | 111.447737 | 64.423823 | 46.635379 | 0.712069 | 151.4112 | 0.000367 | 0.250481 | sunrise |
| 6 | ./dataset/alien_test/foggy_3.jpg | 120.045989 | 102.307547 | 83.251539 | 0.302281 | 11.199281 | 0.001511 | 0.41331 | foggy |
| 7 | ./dataset/alien_test/shine_3.jpg | 137.93186 | 178.009487 | 215.315102 | 0.363824 | 12.915281 | 0.003435 | 0.638476 | shine |
| 8 | ./dataset/alien_test/foggy_10.jpg | 132.539126 | 132.405096 | 126.782115 | 0.110793 | 11.561278 | 0.00238 | 0.64998 | foggy |
| 9 | ./dataset/alien_test/foggy_1.jpg | 66.39806 | 80.541773 | 67.158103 | 0.26073 | 12.427625 | 0.001643 | 0.550861 | foggy |
