Skip to content
This repository has been archived by the owner on Aug 5, 2022. It is now read-only.

fp16 blas implementation patch #5

Merged
merged 8 commits into from
Feb 8, 2017
Merged

fp16 blas implementation patch #5

merged 8 commits into from
Feb 8, 2017

Conversation

listenlink
Copy link

@listenlink listenlink commented Jan 24, 2017

Hi @gongzg .

This patch implement the fp16 functionality, kernel function/ blas api/ tuning mechanism are all included.
Then can pass all of our clBLAS half test suite.

While I don't add the intelblas_gemm half version currently, because our intel_gemm implementation are still under review at upstream, this PR can be a clean and independent patch for upstream to merge.

<< "$GLOBAL " << sdtype << "* C, $SIZE_T ldc, $SIZE_T Cstart, $SIZE_T Cstride,"
<< "float" << " beta)"
<< std::endl;
}
stream << "{" << std::endl;
stream.inc_tab();

Copy link

@ptillet ptillet Jan 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably these two branches for the kernel argument can be replaced by:

std::string abdtype = (sdtype=="half")?"float":sdtype

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will modified with your suggestion

@@ -34,7 +34,7 @@ namespace isaac
class scalar;
class expression_tree;

union ISAACAPI values_holder
struct ISAACAPI values_holder
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why couldn't we keep a union here? Using a struct instead will multiply the size of the class by 4, which will increase the (already high) overhead associated with expression trees construction and parsing

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because of the introducing of class half, use struct can avoid compiling error.

lib/array.cpp Outdated
case ULONG_TYPE:{ unsigned long t = v.uint64;return t;} break;
case HALF_TYPE: (half)(v.float16);
case FLOAT_TYPE:{float t = v.float32;return t;} break;
case DOUBLE_TYPE: { double t = v.float64;return t;} break;
default: throw unknown_datatype(dtype_);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it work to just keep a macro here (if the union is not changed to a struct), even for the half-type?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current code has to use struct mentioned above, so comment out the macro here.

}else{
result.alpha = value_scalar(1, dtype);
result.beta = value_scalar(is_add?1:-1, dtype);
}
}
}
else{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I would prefer a single declaration of:
numeric_type abtype = (dtype==HALF_TYPE)?:FLOAT_TYPE:dtype;
over all these branches :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will follow your suggestion and modified them, thx

INSTANTIATEHALFOP(double)

}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I don't understand the purpose of this class, as opposed to some uint16_t, since cl_mem is blind to the underlying datatype anyway.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is to add the data type of half and not just the type define of uint16_t. With the help of this class, we can let the code see "half" in the process without compiling errors.

@listenlink
Copy link
Author

Have refined the code to remove half on value scalar, please have a check, thank you!

@ptillet
Copy link

ptillet commented Feb 8, 2017

That looks good to me! Feel free to submit a PR in isaac's dev repository :)

@gongzg
Copy link

gongzg commented Feb 8, 2017

@ptillet will submit to your repo soon. @listenlink Thanks for your contribution.

@gongzg gongzg merged commit e54662b into intel:fp16 Feb 8, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants