Add Quantiles to Numerical analyses in AnalyzeSpark #436
What changes were proposed in this pull request?
Uses T-Digests¹ to cheaply track quantiles in the numerical column analysis in Spark. This also provides the structure for later replacing histograms as currently implemented: a T-Digest can cheaply track a CDF, and a histogram is one derivative away from it (bin counts are differences of the CDF at the bin edges).
Gives a direction towards fixing #290
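To illustrate the "one derivative away" remark, here is a minimal sketch of recovering histogram bin counts from a CDF. It is not code from this patch: `empirical_cdf` is a hypothetical exact stand-in for the approximate CDF a T-Digest would provide, and `histogram_from_cdf`, the sample values, and the bin edges are all made up for illustration.

```python
from bisect import bisect_right


def empirical_cdf(sorted_values, x):
    """Fraction of values <= x. Hypothetical exact stand-in for a digest's approximate CDF."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def histogram_from_cdf(cdf, edges, n):
    """Estimate the count in each bin (lo, hi] as n * (cdf(hi) - cdf(lo))."""
    return [n * (cdf(hi) - cdf(lo)) for lo, hi in zip(edges, edges[1:])]


# Illustrative data only; a real analysis would stream column values into a digest.
values = sorted([1, 2, 2, 3, 5, 8, 13, 21])
edges = [0, 2, 5, 10, 25]

counts = histogram_from_cdf(lambda x: empirical_cdf(values, x), edges, len(values))
print(counts)
```

With an approximate CDF the counts become estimates, but the bin edges can be chosen after the data has been seen, which a fixed-bin histogram does not allow.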
How was this patch tested?
Extension of the