
Verbose Optimizer Output #58

Closed · felixmaximilian opened this issue Jul 8, 2016 · 5 comments

felixmaximilian commented Jul 8, 2016

Is there any way to see the details of the optimization process, either while it runs or as a summary afterwards?
I am currently struggling with a dying kernel when I increase n_iter, and it's impossible to find the cause without any debug output.
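
fastFM's BPR solver does not appear to expose a verbose flag, so one crude workaround is to refit with a growing n_iter and evaluate held-out pairs after each stage. A minimal sketch; X, train_pairs, and test_pairs are hypothetical stand-ins for a feature matrix and (positive, negative) pair index arrays:

import numpy as np
from fastFM import bpr

def pairwise_accuracy(scores, pairs):
    # fraction of pairs where the positive row outscores the negative one
    return np.mean(scores[pairs[:, 0]] > scores[pairs[:, 1]])

for n_iter in (1000, 5000, 10000, 50000):
    fm = bpr.FMRecommender(n_iter=n_iter, init_stdev=0.01, step_size=0.1,
                           rank=100, random_state=11)
    # full refit per stage: slow, but needs no changes to the solver
    fm.fit(X, train_pairs)
    acc = pairwise_accuracy(fm.predict(X), test_pairs)
    print("n_iter=%d pairwise accuracy=%.4f" % (n_iter, acc))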

felixmaximilian (author) commented

I managed to get one more line of output by using the pyspark console instead of the notebook:
a segmentation fault, which you can see on the very last line.

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Python version 2.7.10 (default, Dec  8 2015 18:25:23)
SparkContext available as sc, HiveContext available as sqlContext.
>>> import cPickle as pickle
>>> from scipy import io, sparse
>>> preferencesLocalArray = pickle.load(open("preferences.pickle", "rb"))
>>> features = io.mmread(open("sparse_features.mmw", "rb"))
>>> features = features.tocsc()
>>> # shuffle pairwise preferences for train and test split
... import numpy as np
>>> np.random.seed(123L)
>>> random_indices = np.random.randint(preferencesLocalArray.shape[0], size=preferencesLocalArray.shape[0])
>>> preferencesLocalArrayShuffled = np.array(preferencesLocalArray[random_indices])
>>>
>>> train_percentage = 95
>>> trainIdx = range(int(preferencesLocalArray.shape[0] / 100.0 * train_percentage))
>>> testIdx = range(int(preferencesLocalArray.shape[0] / 100.0 * train_percentage), preferencesLocalArray.shape[0])
>>> posExamples = preferencesLocalArrayShuffled[testIdx, 0]
>>> negExamples = preferencesLocalArrayShuffled[testIdx, 1]
>>> from fastFM import bpr
>>> import numpy as np
>>>
>>> fm = bpr.FMRecommender(n_iter=70000, init_stdev=0.01, l2_reg_w=.2, l2_reg_V=1., step_size=.1, rank=100, random_state=11)
>>>
>>> fm.fit(features, preferencesLocalArrayShuffled)
Segmentation fault

ibayer (owner) commented Jul 10, 2016

@felixmaximilian
It looks like the error occurs in the solver, which is implemented as a C extension using Cython.
There are basically two ways to debug this:

  1. Use the Cython debugger: http://docs.cython.org/src/userguide/debugging.html
  2. Run your data through the C cli (https://github.com/ibayer/fastFM-core) with gdb.

I usually go with the second option. Unfortunately, the BPR SGD implementation is very sensitive to the hyperparameter settings, especially step_size. Bad settings can lead to vanishing or exploding gradients that crash fastFM. This should definitely be improved at some point.
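
For the gdb route, one common shortcut (not specific to fastFM) is to run the Python reproduction itself under gdb rather than the C cli; crash_repro.py is a hypothetical script that loads the data and calls fit:

$ gdb --args python crash_repro.py
(gdb) run
...
Program received signal SIGSEGV, Segmentation fault.
(gdb) bt

bt then prints the C-level backtrace, which should point at the offending spot in the Cython/C solver.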

felixmaximilian (author) commented

Thanks for your assistance. What would be your suggestion for improving the BPR SGD? An Adam-like dynamic step_size?
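
For reference, a minimal sketch of the Adam update (Kingma & Ba, 2014), i.e. the kind of per-parameter dynamic step size suggested above; illustrative only, not fastFM code:

import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # running estimates of the gradient mean and uncentered variance
    m = b1 * m + (1.0 - b1) * grad
    v = b2 * v + (1.0 - b2) * grad ** 2
    # bias-correct the estimates (t is the 1-based step count)
    m_hat = m / (1.0 - b1 ** t)
    v_hat = v / (1.0 - b2 ** t)
    # the effective step size shrinks where gradients are large or noisy,
    # which also guards against the exploding updates mentioned above
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v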

felixmaximilian (author) commented

Thanks to your help I found the bug in my own code. It turns out the solver doesn't like duplicate training samples, which isn't surprising. The gdb output was very helpful here!
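
Worth noting for readers with the same symptom: np.random.randint(n, size=n) samples with replacement, so the shuffle in the session above is a likely source of the duplicates. A sketch of two ways to avoid them:

import numpy as np

# shuffle without replacement instead of randint (which can repeat indices)
idx = np.random.permutation(preferencesLocalArray.shape[0])
shuffled = preferencesLocalArray[idx]

# or drop exact duplicate (pos, neg) rows; axis= needs numpy >= 1.13
deduped = np.unique(shuffled, axis=0)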

ibayer (owner) commented Jul 12, 2016

@felixmaximilian
Glad to hear that you fixed your problem.

"solver doesn't like duplicate training samples"

Can you expand on this? It could be good to add a check for this on the Python side.
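
A sketch of what such a check could look like; pairs is the (n, 2) array of (positive, negative) row indices passed to fit, and the helper name is hypothetical:

import numpy as np

def check_no_duplicate_pairs(pairs):
    # np.unique with axis=0 treats each (pos, neg) row as one item
    # (the axis= argument needs numpy >= 1.13)
    n_unique = np.unique(pairs, axis=0).shape[0]
    if n_unique != pairs.shape[0]:
        raise ValueError("found %d duplicate preference pairs"
                         % (pairs.shape[0] - n_unique))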
