# Question #1: How to take advantage of multiple available GPUs?


- ### Suggested solution

For every model we train, we add the following line (for 16 GPUs, as __in the case of a DGX-2__):

model = keras.utils.multi_gpu_model(model, gpus = 16)
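
As far as I understand the Keras documentation, `multi_gpu_model` works by data parallelism: each incoming batch is cut into sub-batches, one per GPU, and the results are concatenated afterwards. The pure-Python sketch below only illustrates that arithmetic for the batch size used later in this notebook. `sub_batch_bounds` is a hypothetical helper, not a Keras function, and letting the last GPU absorb any remainder is an assumption of the sketch.

```python
def sub_batch_bounds(batch_size, n_gpus):
    """Return one (start, end) row range per GPU for a single batch."""
    step = batch_size // n_gpus
    bounds = []
    for i in range(n_gpus):
        start = i * step
        # Assumption of this sketch: the last GPU takes any leftover rows
        end = batch_size if i == n_gpus - 1 else start + step
        bounds.append((start, end))
    return bounds


bounds = sub_batch_bounds(64000, 16)
print(len(bounds), bounds[0], bounds[-1])
```

With a batch of 64000 rows and 16 GPUs, each GPU would see 4000 rows per step, so a batch size that is a multiple of the GPU count keeps the load even.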

# Question #2: How to source training batches from several "heavy" files?

- ### Suggested solution

We use a DataGenerator in the following way:

- Suppose each file holds 256K instances, i.e. 4 batches of 64K instances __per file__. We want to open a file only once rather than once for each of the 4 batches it contains. We create a generator as shown in the cell below. We know, for example, that __file_1__ contains batches 0 to 3 and __file_2__ contains batches 4 to 7. So to get batch 4 we open __file_2__ and keep it in memory for batches 5, 6 and 7.

- We create the DataGenerator __once__ (train_generator) and then make a fresh copy of it for every new training run with:
__copy.deepcopy(train_generator)__
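
The batch-to-file bookkeeping described above comes down to one integer division and one modulo. Here is a minimal sketch of it, assuming every file holds exactly `size_file` rows. `locate_batch` is a hypothetical helper name, and file numbers are zero-based, so __file_1__ is file 0 and __file_2__ is file 1.

```python
def locate_batch(index, size_file=256000, batch_size=64000):
    """Map a global batch index to (file number, first row, end row)."""
    batches_per_file = size_file // batch_size
    file_number = index // batches_per_file   # which file holds the batch
    intra_index = index % batches_per_file    # position of the batch in that file
    start = intra_index * batch_size
    return file_number, start, start + batch_size


for index in (0, 3, 4, 7):
    print(index, locate_batch(index))
```

Batches 0 to 3 all map to file 0 and batches 4 to 7 to file 1, so a new file only has to be read when `file_number` changes.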



In [None]:
import keras
import numpy as np
import pandas as pd


class DataGenerator_N(keras.utils.Sequence):

    def __init__(self, list_FILES, size_FILES = 256000, batch_size = 64000, shuffle = True):

        self.batch_size = batch_size
        
        self.list_FILES = list_FILES
        
        self.n_files = len(self.list_FILES)
        
        self.size_files = size_FILES
        
        self.shuffle = shuffle
        
        # Number of whole batches stored in each file
        self.batches_per_file = size_FILES // batch_size
        
        # Shuffle the file order if requested and reset the file pointer
        self.on_epoch_end()
        
        # We open the first file in the list
        self.current_file = 0
        self.data = pd.read_csv(self.list_FILES[self.current_file], sep = ';')

    def __len__(self):
        'Denotes the number of batches per epoch'
        return self.n_files * self.batches_per_file

    def __getitem__(self, index):
        'Generate one batch of data'
        
        # If the requested batch is in another file, we update self.data
        # by opening that file (only one file is kept in memory at a time)
        file_index = index // self.batches_per_file
        if file_index != self.current_file:
            self.current_file = file_index
            self.data = pd.read_csv(self.list_FILES[self.current_file], sep = ';')
        
        # Position of the batch inside the current file
        intra_index = index % self.batches_per_file
        start = intra_index * self.batch_size

        data_temp = self.data.iloc[start : start + self.batch_size]
        
        Y = data_temp.price.values
        X = data_temp.drop(columns = ['nbDates', 'price']).values

        return X, Y

    def on_epoch_end(self):
        'Shuffles the list of files after each epoch'
        if self.shuffle:
            np.random.shuffle(self.list_FILES)
        
        # -1 means "no file loaded": the file order may have changed,
        # so the next __getitem__ call reloads the right file
        self.current_file = -1
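
The __copy.deepcopy(train_generator)__ step mentioned earlier matters because the generator carries state (the current file and the file order). A minimal sketch of why separate copies keep training runs independent, using `TinyGenerator` as a hypothetical stand-in that only mimics that state:

```python
import copy


class TinyGenerator:
    """Hypothetical stand-in for a generator that tracks its current file."""

    def __init__(self, files):
        self.files = files
        self.current_file = 0


train_generator = TinyGenerator(["file_1", "file_2"])

# Each training run gets its own copy, so changing one copy does not
# touch the template or the other copies.
run_a = copy.deepcopy(train_generator)
run_b = copy.deepcopy(train_generator)
run_a.current_file = 1
run_a.files.reverse()

print(train_generator.current_file, run_b.files, run_a.files)
```

Note that a deep copy also duplicates every attribute of the object, so with the real generator the currently loaded DataFrame would presumably be copied as well.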
            

# Question #3: How to train several models in parallel?

- ### Suggested solution

Suppose we have a list of tasks where each task contains a __model to train__. 

For each task we create a new thread (__threading.Thread(target = procedure, args = (task,))__). Each thread executes the following procedure:

__________________________________________

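import tensorflow as tf
from keras import backend as K

def procedure(task):

    # One session per thread, registered as the Keras session
    session = tf.Session()
    K.set_session(session)
    with session.as_default():
        with session.graph.as_default():
            task.model.fit_generator(...)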
__________________________________________
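
The thread-per-task launch described above can be sketched with the standard library alone. In this sketch `procedure` is a placeholder that only records that it ran instead of training anything, and the task names and the `results` dictionary are hypothetical, introduced for illustration.

```python
import threading


def procedure(task, results):
    # Placeholder for the real training call (e.g. fit_generator);
    # here we only record that the task was processed.
    results[task] = "trained " + task


tasks = ["model_a", "model_b", "model_c"]  # hypothetical task names
results = {}

# args must be a tuple; with a single argument it would be (task,)
threads = [threading.Thread(target=procedure, args=(task, results)) for task in tasks]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait for every training thread to finish

print(sorted(results))
```

Joining every thread before reading `results` ensures that all tasks have finished.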

- ##### Will calling __K.set_session(session)__ from many threads in parallel, __each with a different session__, cause the sessions to interfere with each other?

- ##### Is it a good approach to training several models in parallel?


# Question #4: How to avoid memory leaks in TensorFlow? Can the procedure described in Question #3 affect them?

- ### Suggested solution

We clear the sessions and graphs inside __procedure(task)__ as follows (in every thread):


__________________________________________

import gc

import tensorflow as tf
from keras import backend as K

def procedure(task):

    session = tf.Session()
    K.set_session(session)
    
    model = task.get_model()
    
    with session.as_default():
        with session.graph.as_default():
            model.fit_generator(...)
    
    # clearing the session and resetting its graph
    
    del model
    K.clear_session()
    tf.reset_default_graph()
    gc.collect()
    
    # closing the session
    session.close()
     
__________________________________________
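
One detail of the procedure above is that the cleanup lines only run if training finishes without an exception. Below is a minimal sketch of guarding the cleanup with `try`/`finally`. `FakeSession` is a hypothetical stand-in for the real session object, used only to show the control flow, and this `procedure` takes different arguments from the one above.

```python
class FakeSession:
    """Hypothetical stand-in for a session, tracking whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def procedure(session, train):
    try:
        train()
    finally:
        # Runs whether or not train() raised an exception
        session.close()


def failing_train():
    raise RuntimeError("training failed")


session = FakeSession()
try:
    procedure(session, failing_train)
except RuntimeError:
    pass

print(session.closed)
```

The same pattern would apply to the other cleanup calls: anything placed in the `finally` block runs even when a training thread fails.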

- ##### Is this redundant?
- ##### Will this do what we expect?
- ##### Will this help avoid memory leaks?

