
Plot benchmark speed against pytorch #20

Closed
4 tasks
coreylowman opened this issue May 20, 2022 · 5 comments
Labels: documentation (Improvements or additions to documentation)

Comments

coreylowman (Owner) commented May 20, 2022

  • Linear batched forward (matmul & broadcast add)
  • Backprop algorithm
  • Optimizer updates
  • Forward with tape & without tape (see the sketch after this list)
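For the last item, here is a minimal sketch of what a with-tape vs. without-tape timing comparison could look like. It reuses the dfdx 0.x API from the benchmark snippet below (Linear, Tensor2D, .traced()); this is an illustrative assumption, not code from the issue:

use dfdx::prelude::*;
use rand::{prelude::StdRng, SeedableRng};
use std::time::{Duration, Instant};

fn main() {
    let mut rng = StdRng::seed_from_u64(0);
    let l: Linear<512, 256> = Default::default();

    const N: usize = 10000;
    let mut with_tape = Duration::default();
    let mut without_tape = Duration::default();
    for _ in 0..N {
        let x1: Tensor2D<32, 512> = Tensor2D::randn(&mut rng);
        let x2: Tensor2D<32, 512> = Tensor2D::randn(&mut rng);

        // Forward pass that records operations onto a gradient tape.
        let start = Instant::now();
        let _y = l.forward(x1.traced());
        with_tape += start.elapsed();

        // Plain forward pass with no gradient bookkeeping.
        let start = Instant::now();
        let _y = l.forward(x2);
        without_tape += start.elapsed();
    }
    println!("with tape: {:?}, without tape: {:?}", with_tape, without_tape);
}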
coreylowman (Owner, Author) commented:

Rust code for benchmarking:

use dfdx::prelude::*;
use rand::{prelude::StdRng, SeedableRng};
use rand_distr::StandardNormal;
use std::time::{Duration, Instant};

fn main() {
    let mut rng = StdRng::seed_from_u64(0);

    // 512 -> 256 linear layer with randomly initialized parameters.
    let mut l: Linear<512, 256> = Default::default();
    l.randomize(&mut rng, &StandardNormal);

    let mut opt = Adam::default();

    const N: usize = 10000;
    let mut total = Duration::default();
    for _ in 0..N {
        let x: Tensor2D<32, 512> = Tensor2D::randn(&mut rng);
        let y = l.forward(x.traced());
        let loss = y.square().mean();
        // Only the backward pass and the optimizer update are timed;
        // data generation and the forward pass are excluded.
        let start = Instant::now();
        let gradients = loss.backward();
        opt.update(&mut l, gradients);
        total += start.elapsed();
    }
    println!("{:?} batches/s", N as f32 / total.as_secs_f32());
}

coreylowman (Owner, Author) commented:

Python code for benchmarking:

from datetime import datetime, timedelta
import torch

torch.manual_seed(0)

# 512 -> 256 linear layer, matching the Rust benchmark.
l = torch.nn.Linear(512, 256)
opt = torch.optim.Adam(l.parameters())

total = timedelta()
N = 10000
for _ in range(N):
    x = torch.randn(32, 512)
    y = l(x)
    loss = y.square().mean()
    # Time zero_grad, backward, and the optimizer step only.
    start = datetime.now()
    opt.zero_grad()
    loss.backward()
    opt.step()
    total += datetime.now() - start

print(N / total.total_seconds())  # batches/s
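Note that the two timed regions are not quite identical: the Rust snippet times backward() plus the optimizer update, while the Python snippet additionally times opt.zero_grad().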

coreylowman added the documentation label on May 26, 2022
coreylowman mentioned this issue on May 26, 2022
coreylowman (Owner, Author) commented:

Both the dfdx version and the torch version should use flush-to-zero (#60)
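On the PyTorch side this can be done with torch.set_flush_denormal(True). For the Rust side, here is a minimal sketch of enabling flush-to-zero on x86_64 via the MXCSR register; the helper name is hypothetical and this is only an assumption about how #60 could be addressed, not the change that landed:

// Hypothetical helper: enable flush-to-zero on x86_64 before benchmarking.
fn enable_flush_to_zero() {
    #[cfg(target_arch = "x86_64")]
    unsafe {
        use std::arch::x86_64::{_MM_FLUSH_ZERO_ON, _MM_SET_FLUSH_ZERO_MODE};
        // Sets the FTZ bit of the MXCSR register so denormal results flush to zero.
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    }
}

fn main() {
    enable_flush_to_zero();
    // ... run the benchmark loop from the snippet above ...
}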

coreylowman added a commit referencing this issue on Jun 28, 2022: "For simplicity of example"
coreylowman mentioned this issue on Oct 19, 2022 (5 tasks)
coreylowman (Owner, Author) commented:

Closing. Might do this in the future, but benchmarking will continue to be ad hoc for now.
