Balance data for latest version #76

Nixellion · 2017-10-26T19:17:10Z

I tried to adapt the balance data code for the latest version, the one that supports these nine outputs:

w = [1, 0, 0, 0, 0, 0, 0, 0, 0]
s = [0, 1, 0, 0, 0, 0, 0, 0, 0]
a = [0, 0, 1, 0, 0, 0, 0, 0, 0]
d = [0, 0, 0, 1, 0, 0, 0, 0, 0]
wa = [0, 0, 0, 0, 1, 0, 0, 0, 0]
wd = [0, 0, 0, 0, 0, 1, 0, 0, 0]
sa = [0, 0, 0, 0, 0, 0, 1, 0, 0]
sd = [0, 0, 0, 0, 0, 0, 0, 1, 0]
nk = [0, 0, 0, 0, 0, 0, 0, 0, 1]
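The nine vectors above are one-hot labels, one position per key combination. As a minimal sketch, assuming the position order is exactly the one listed (w, s, a, d, wa, wd, sa, sd, nk), they could be generated instead of typed out. The names `KEYS`, `one_hot` and `decode` are illustrative and not taken from the project:

```python
# Hypothetical helper names; only the key order comes from the list above.
KEYS = ['w', 's', 'a', 'd', 'wa', 'wd', 'sa', 'sd', 'nk']

def one_hot(key):
    """Return the one-hot list for a key name such as 'wa'."""
    vec = [0] * len(KEYS)
    vec[KEYS.index(key)] = 1
    return vec

def decode(vec):
    """Return the key name for a one-hot list."""
    return KEYS[list(vec).index(1)]

print(one_hot('wa'))          # [0, 0, 0, 0, 1, 0, 0, 0, 0]
print(decode(one_hot('sd')))  # sd
```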

But when I try to train a model on this data, I get "value out of range" errors. I suppose I'm doing something wrong in balance data.

Is there updated code anywhere? Or am I not supposed to balance the data myself? Without balancing, the neural net always chooses forward.

Nixellion · 2017-10-26T22:49:09Z

Oh, OK. It seems the actual problem is that collect_data only records two states: the default (all zeroes) and w. The script then 'balances' every class down to the size of the smallest one, which is empty. Huh.
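A quick way to confirm this kind of problem is to count the labels per class before balancing. The following is only a sketch, assuming each sample is a (frame, choice) pair with a nine-element one-hot choice; `class_counts` is a hypothetical helper, and the sample data is made up to mimic the situation described above:

```python
NUM_CHOICES = 9  # assumed: w, s, a, d, wa, wd, sa, sd, nk

def class_counts(samples, num_choices=NUM_CHOICES):
    """Count samples per one-hot position; also count labels that match none."""
    one_hot_sorted = [0] * (num_choices - 1) + [1]
    counts = [0] * num_choices
    unmatched = 0
    for _frame, choice in samples:
        choice = list(choice)
        if sorted(choice) == one_hot_sorted:
            counts[choice.index(1)] += 1
        else:
            unmatched += 1  # e.g. the all-zero default state
    return counts, unmatched

# Made-up data: only 'w' and the all-zero default were recorded.
samples = [
    ('frame0', [1, 0, 0, 0, 0, 0, 0, 0, 0]),
    ('frame1', [0, 0, 0, 0, 0, 0, 0, 0, 0]),
    ('frame2', [1, 0, 0, 0, 0, 0, 0, 0, 0]),
]
counts, unmatched = class_counts(samples)
print(counts, unmatched)  # [2, 0, 0, 0, 0, 0, 0, 0, 0] 1
print(min(counts))        # 0
```

With a minimum of 0, truncating every class to the smallest one leaves no training data at all.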

frossaren · 2017-11-01T16:46:34Z

Ahhh, so that's probably why I have huge problems getting it to work. By the way, do we know whether this project is dead? Sentdex hasn't approved any PRs or added anything in a long time.

Nixellion · 2017-11-02T11:43:44Z

I don't know, but the stream with this bot runs pretty much 24/7.

kymckay · 2017-12-06T14:00:10Z

For anyone looking at this in future, I have a slightly rewritten balance_data.py to handle an arbitrary number of choices and also repack files below a specified threshold of training data. Gist here.
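The gist itself is not reproduced in this thread. As a rough sketch of the general idea only (not kymckay's code), balancing over an arbitrary number of one-hot choices could look like the following. The function name `balance`, the `seed` parameter, and the decision to balance only across classes that actually occur are all assumptions made for illustration:

```python
import random

def balance(samples, seed=None):
    """Truncate every class that occurs to the size of the smallest one.

    samples: iterable of (frame, choice) pairs, choice being a one-hot list
    of any length. Labels that are not one-hot are skipped.
    """
    rng = random.Random(seed)
    buckets = {}
    for frame, choice in samples:
        choice = list(choice)
        if sorted(choice) != [0] * (len(choice) - 1) + [1]:
            continue  # skip all-zero or otherwise malformed labels
        buckets.setdefault(choice.index(1), []).append((frame, choice))
    if not buckets:
        return []
    smallest = min(len(bucket) for bucket in buckets.values())
    balanced = []
    for bucket in buckets.values():
        rng.shuffle(bucket)  # pick a random subset of each class
        balanced.extend(bucket[:smallest])
    rng.shuffle(balanced)    # mix the classes together
    return balanced
```

Ignoring classes with no samples avoids the empty result described earlier in the thread, at the cost of producing a data set that does not cover every choice.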

Phillyclause89 · 2019-05-06T18:20:42Z

This is how I went about modifying balance_data.py to balance across the 9 possible choices:

import random
from collections import Counter

import numpy as np
import pandas as pd

FILE_I_END = 7   # number of the last training_data-N.npy file
OFFSET = 10      # added to the file number when naming the balanced output
NUM_CHOICES = 9  # w, s, a, d, wa, wd, sa, sd, nk

data_order = list(range(1, FILE_I_END + 1))
random.shuffle(data_order)
for i in data_order:
    try:
        file_name = 'training_data-{}.npy'.format(i)
        # Each row is [image, one-hot choice].
        train_data = np.load(file_name, allow_pickle=True)
        print(file_name, len(train_data))
        df = pd.DataFrame(train_data)
        print(df.head())
        print(Counter(df[1].apply(str)))

        # One bucket per one-hot position, in the order listed above.
        buckets = [[] for _ in range(NUM_CHOICES)]
        for img, choice in train_data:
            choice = list(choice)  # also accepts labels stored as NumPy arrays
            if sorted(choice) == [0] * (NUM_CHOICES - 1) + [1]:
                buckets[choice.index(1)].append([img, choice])
            else:
                print('no matches')

        # Truncate every class to the size of the smallest one. If any class
        # has no samples, smallest is 0 and the balanced file ends up empty.
        smallest = min(len(bucket) for bucket in buckets)
        final_data = []
        for bucket in buckets:
            random.shuffle(bucket)  # shuffle once per class, not on every append
            final_data.extend(bucket[:smallest])
        random.shuffle(final_data)

        # Store the ragged [image, choice] rows in an explicit object array.
        out = np.empty((len(final_data), 2), dtype=object)
        for row, (img, choice) in enumerate(final_data):
            out[row, 0], out[row, 1] = img, choice
        np.save('balanced_training_data-{}.npy'.format(i + OFFSET), out)

    except Exception as e:
        print(str(e))

daniel-kukiela closed this as completed Jul 21, 2022
