Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sketching from adapters #5365

Merged
merged 7 commits into from
Mar 7, 2020
Merged

Sketching from adapters #5365

merged 7 commits into from
Mar 7, 2020

Conversation

RAMitchell
Copy link
Member

Overhaul of GPU sketching code to support sketching on external data.

Todo:
Verify peak memory usage

Some performance charts:
AdapterDeviceSketch, DeviceSketch, DeviceSketchOld and WQSketch
Sketch performance varying number of columns

@trivialfis
Copy link
Member

trivialfis commented Feb 26, 2020

I'm a little bit concerned about removing weight, we should do some experiments to estimate the impact on various datasets with different characteristics.

@hcho3
Copy link
Collaborator

hcho3 commented Feb 26, 2020

Weighted quantile sketch is one of prominent contribution of XGBoost paper (2016). Is it necessary to remove it?

@RAMitchell
Copy link
Member Author

We cant currently access weights via the adapter constructor - they get added after in different c_api functions. If we can change the c_api to get weights on dmatrix construction we can do it.

@trivialfis
Copy link
Member

Can we create an empty DMatrix handle at first?

@RAMitchell RAMitchell marked this pull request as ready for review March 3, 2020 23:02
@RAMitchell
Copy link
Member Author

Current implementation preserves existing behaviour, using weights to compute quantiles when available from the DMatrix. When we implement DMatrix from adapters we can revisit how to access weights in this case.

src/common/hist_util.cu Show resolved Hide resolved
}
// Count the entries in each column and exclusive scan
void GetColumnSizesScan(int device,
dh::caching_device_vector<size_t>* column_sizes_scan,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defined bst_row_t and bst_feature_t and we should use them more often.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. If something is a sum I think it is reasonable to use size_t. For example a sum of bst_row_t can exceed 32 bits.

src/common/hist_util.cu Outdated Show resolved Hide resolved
src/common/hist_util.cu Show resolved Hide resolved
src/common/hist_util.cu Show resolved Hide resolved
src/common/hist_util.cu Outdated Show resolved Hide resolved
src/common/hist_util.cu Outdated Show resolved Hide resolved
src/common/hist_util.cu Outdated Show resolved Hide resolved
src/common/hist_util.cu Outdated Show resolved Hide resolved
@trivialfis
Copy link
Member

trivialfis commented Mar 6, 2020

I don't think the maker function can be a cause of fatal error right? It's just a helper function for type deduction

@trivialfis
Copy link
Member

@RAMitchell

This is the whole definition of thrust::make_transform_iterator:

template <class AdaptableUnaryFunction, class Iterator>
inline __host__ __device__
transform_iterator<AdaptableUnaryFunction, Iterator>
make_transform_iterator(Iterator it, AdaptableUnaryFunction fun)
{
  return transform_iterator<AdaptableUnaryFunction, Iterator>(it, fun);
} // end make_transform_iterator

Other than it's a host device function, there's no difference with your implementation. Could you take a deeper look into the fatal error you encountered? I think it's worth the debugging effort.

@RAMitchell
Copy link
Member Author

My implementation explicitly specifies the return type. The issue has always beem automatic return type deduction on msvc. I can look further.

@trivialfis
Copy link
Member

trivialfis commented Mar 6, 2020

Yup thanks! You can just print the return type by debugger or use typeid(RetType).name(). If that's the root cause than there's nothing we can do. But I'm still interested in what other type can MSVC deduce. ;-) It's just an unification algorithm, should not be way too weird.

@RAMitchell
Copy link
Member Author

These are the two return types for my version and thrust:

thrust::transform_iterator<__nv_dl_wrapper_t<__nv_dl_tag<void (__cdecl*)(xgboost::data::CupyAdapter *,unsigned __int64,unsigned __int64,float,xgboost::common::SketchContainer *,int),&xgboost::common::ProcessBatch<xgboost::data::CupyAdapter>,1>,xgboost::data::CupyAdapterBatch const >,thrust::counting_iterator<unsigned __int64,thrust::use_default,thrust::use_default,thrust::use_default>,xgboost::data::COOTuple,thrust::use_default>

thrust::transform_iterator<__nv_dl_wrapper_t<__nv_dl_tag<void (__cdecl*)(xgboost::data::CupyAdapter *,unsigned __int64,unsigned __int64,float,xgboost::common::SketchContainer *,int),&xgboost::common::ProcessBatch<xgboost::data::CupyAdapter>,2>,xgboost::data::CupyAdapterBatch const >,thrust::counting_iterator<unsigned __int64,thrust::use_default,thrust::use_default,thrust::use_default>,thrust::use_default,thrust::use_default>

@trivialfis
Copy link
Member

Let me take another look today. It seems weird that the GetRowStride is defined twice and have to pull all data to host.

@RAMitchell RAMitchell merged commit a38e7bd into dmlc:master Mar 7, 2020
sriramch added a commit to sriramch/xgboost that referenced this pull request Mar 11, 2020
sriramch added a commit to sriramch/xgboost that referenced this pull request Mar 11, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Jun 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants