Conversation
@cozytk Can we merge this now?
@hyunwoongko Yes. It passes all tests, and only a few review points are left in these files:
oslo/torch/_C/init.py
oslo/torch/nn/parallel/data_parallel/sharded_data_parallel.py
oslo/torch/optim/cpu_adam.py
@hyunwoongko Done!
@@ -0,0 +1,214 @@
Is this needed?
@@ -0,0 +1,36 @@
#pragma once
Is this needed too?
#include "cublas_v2.h"
#include "cuda.h"
#include "curand.h"
#include "gemm_test.h"
@hyunwoongko cpu_adam.h requires context.h. context.h requires gemm_test.h.
#pragma once

#include "StopWatch.h"
#include "cublas_wrappers.h"
@hyunwoongko gemm_test.h includes cublas_wrappers.h.
void SetSeed(uint64_t new_seed) { _seed = new_seed; }

void TestGemmFP16(bool test_gemm, int batch_size, int seq_len, int head_num,
You can remove this, no?
@hyunwoongko I'm gonna apply it!
Title
-FusedAdam & CPUAdam
Description
Tasks