Skip to content
Mu Li edited this page Feb 11, 2016 · 3 revisions

Overview

Assume the objective f(w) = ℓ(w) + r(w)

Implementation

Iterating implemented in LBFGSLearner::RunScheduler

PrepareData();
InitServer();
InitWorker();
for (int k = 0; k < max_num_epochs; ++k) {
  PushGradient(k);
  PrepareCalcDirection(k);
  CalcDirection(k);
  α = 1;
  while (wolfe_condition_not_satisfied) {
     PushGradient(k);
     LineSearch(k, α);
     α *= ρ;
  }
}

The wolfe condition

  1. f(wk+αpk) ≤ f(wk)+c1αpkT∂f(wk)
  2. pkT∂f(wk+αpk) ≥ c2pkT∂f(wk)
method Worker nodes Server nodes
PrepareData 1. find the unique feature IDs
2. count the feature appearance
3. push all to severs
Return: number of examples read
-
InitServer - 1. initialize w0
Return: r(w0)
InitWorker 1. pull w0 from servers
2. calculate ∂ℓ(w0)
Return: ℓ(w0)
-
PushGradient(k) 1. push ∂ℓ(wk) to servers -
PrepareCalcDirection(k) - 1. ∂f(wk)=∂ℓ(wk)+∂r(wk)
2. add ∂f(wk) - ∂f(wk-1) to s
3. add wk - wk-1 to y
Return: B = [s,y,∂f]T[s,y,∂f]
CalcDirection(k, B) - 1. calculate pk by the two-loop
Return pkT∂ℓ(wk)
LineSearch(k,α) 1. pull pk from server if hasn't yet
2. wk+1 = wk+αpk
3. calculate ∂ℓ(wk+1)
Return: [ℓ(wk+1), ∂ℓ(wk+1)Tpk]
1. wk+1 = wk+αpk
Return: [r(wk+1), ∂r(wk+1)Tpk]

the actual implementation may be slightly different in order to save the computation and memory.

User Guides

Development Notes

Clone this wiki locally