[AutoML] Reservoir sample dataset statistics

Currently, AutoML calculates dataset statistics from the first 1,000 rows of a dataset. Instead, we should calculate them from a random sample of 1,000 rows, because the first 1,000 rows can be biased if the dataset is sorted by label, by any other column, by time of collection, etc. We can use reservoir sampling to obtain a uniform random sample of a fixed size in a single pass over the dataset, without needing to know the total number of rows in advance.
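
For reference, here is a minimal sketch of the idea, assuming rows can be consumed as a plain Python iterable. It uses the classic "Algorithm R" variant of reservoir sampling. The function name `reservoir_sample` and its `k` and `seed` parameters are hypothetical and chosen for illustration; they are not existing AutoML APIs, and the real implementation may need to fit whatever data loading abstraction AutoML uses.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int = 1000, seed: Optional[int] = None) -> List[T]:
    """Return a uniform random sample of up to k rows in a single pass.

    Hypothetical helper (Algorithm R); not part of any existing AutoML API.
    """
    rng = random.Random(seed)  # local RNG so a seed gives reproducible samples
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Pick a slot uniformly from 0..i (inclusive); the new row
            # replaces an existing one with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = row
    return reservoir


# Usage: sample 1,000 rows from a stream of 10,000 without materializing it.
sample = reservoir_sample(iter(range(10_000)), k=1000, seed=0)
```

If the dataset has fewer than `k` rows, this sketch simply returns all of them. Memory use is proportional to `k` rather than to the dataset size, which is what makes it suitable for a single streaming pass.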