SAMPLE function not sampling randomly #3730
Last updated: 2015-08-28 13:41:54 +0200
Date: 2015-05-28 09:58:52 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0
if you have a table with only two rows, the SAMPLE function does not pick each row approximately 50% of the time. something is biasing the SAMPLE draws.
Steps to Reproduce:
here's the sql command.
SELECT * FROM ( SELECT 1 AS col UNION ALL SELECT 2 AS col ) AS temp SAMPLE 1;
but you have to run it a thousand times, however you prefer. here's the R code to do that with MonetDB.R--
here's a reproducible example using R code to repeat the sampling 1000 times. in both SAMPLE examples below, the database pulls the 2 fewer than 200 times out of 1000. shouldn't it be close to 500 out of 1000? this does not look random (and could mislead users). sorry if i'm misunderstanding something. thank you!!
# start in an empty directory somewhere
pid <- monetdb.server.start( batfile )
db <- dbConnect( MonetDB.R() , "monetdb://localhost:50000/test" , wait = TRUE )
dbGetQuery( db , "SELECT 1 AS col UNION ALL SELECT 2 AS col" )
out <- NULL
out <- NULL
# ALSO not random
in both of the above examples of R output, more than 4 out of 5 draws were 1s instead of 2s.
# half 1s half 2s
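to put a number on "shouldn't it be close to 500 out of 1000": the snippet below is a rough standalone sketch, not part of the original report. it assumes each SAMPLE 1 call is an independent fair draw between the two rows and computes the binomial probability of seeing at most 199 twos in 1000 draws. the helper name `binom_cdf` and the cutoff of 199 (read off "less than 200") are illustrative choices.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Assumption: 1000 independent draws, each returning row 2 with probability 0.5.
n_draws = 1000
observed_twos = 199  # "less than 200 times out of 1000" read as at most 199

p_tail = binom_cdf(observed_twos, n_draws)
print(f"P(at most {observed_twos} twos in {n_draws} fair draws) = {p_tail:.3e}")
```

if the draws were really fair, a count this low would essentially never happen by chance, so the skew reported above points at the sampling itself and not at bad luck.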
Date: 2015-06-02 13:50:32 +0200
For complete details, see http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=91c472eabc48
Date: 2015-06-03 14:20:44 +0200
Looking into this, in particular the behaviour of rand() on our beloved Windows.
Date: 2015-06-03 15:26:08 +0200
Could not reproduce on Windows 7 Professional (64-bit) with MonetDB Oct2014-SP3, both 32-bit and 64-bit. Also could not find any flaw in the random number generator on Windows.
Date: 2015-08-10 20:41:20 +0200
sorry, this has not been fixed on windows using the Jul2015 release
MonetDB 5 server v11.21.1 "Jul2015"
Date: 2015-08-11 14:56:41 +0200
For complete details, see http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=29e2ba371d58
Date: 2015-08-11 15:22:27 +0200
For complete details, see http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=5c206e0510fe
Date: 2015-08-23 19:45:49 +0200
just tested this- looks like it's been solved on the latest testing version. thanks thanks!!
MonetDB 5 server v11.21.3 "Jul2015"
Date: 2015-08-28 13:41:54 +0200
Jul2015 has been released.