Add 'F_ShuffleNumbers' as Global Function #872

AnnieRuru · 2015-11-16T22:43:28Z

@MishimaHaruna
you are the only one left can play algorithm script with me =/

I kinda like to add this function into svn

have been play with

prontera,155,185,5  script  sakjdhaksdjh    1_F_MARIA,{
    mes "random item giver. Input how many items you want";
    next;
    input .@num, 1, 10;
    callfunc( "F_ShuffleNumbers", 0, .maxitem, .@output, .@num );
    for ( .@i = 0; .@i < .@num; .@i++ )
        getitem .item[.@output[.@i]], 1;
    close;
OnInit:
    freeloop true;
    for ( .@i = 501; .@i < 32768; .@i++ )
        if ( getitemname(.@i) != "null" )
            .item[.@j++] = .@i;
    .maxitem = .@j;
    end;
}

I prefer this function made by keyworld over this
http://www.fredosaurus.com/notes-cpp/misc/random-shuffle.html
because the above link will ask you to create a temporary array length equal to max-min
like the script I wrote, there is about 9000+ entry from itemDB
this one made by keyworld will only loop as few as how many index the user need

and BEFORE you say this is useless
there is a popular custom script that I think can use it right now
https://github.com/HerculesWS/Hercules/blob/master/npc/custom/quests/hunting_missions.txt#L75
although I was the one proposed that SQL method to Euphy, but I still think its ugly
if this function is merge, the script will be much cleaner

    callfunc "F_ShuffleNumbers", 0, getarraysize(.AllowMobs) -1, .@output, .Quests;
    for (set .@i,0; .@i<.Quests; set .@i,.@i+1) {
        setd "Mission"+.@i, .AllowMobs[.@output[.@i]];
        setd "Mission"+.@i +"_",0;
    }

MishimaHaruna · 2015-11-18T02:39:57Z

you are the only one left can play algorithm script with me =/

And are you dissatisfied with me, onee-sama? ;)

This algorithm is very interesting, I had to read it a few times to figure out what it was doing.
It is indeed very efficient, but are you aware that it doesn't allow certain outputs that would otherwise be possible?

Let's take for example (I'll pick a simple one just to show this, but it also happens with other inputs) the call callfunc("F_ShuffleNumbers", 0, 2, .@foo), to sort three numbers from the set [0, 1, 2].
There are six theoretical outputs: 0 1 2, 0 2 1, 1 0 2, 1 2 0, 2 0 1, 2 1 0. It's impossible for the function as it is now, to produce 1 2 0 as output.

When the function is called with those arguments, both .@range and .@count will be 3, so the while loop will be repeated three times (with .@i = {0, 1, 2} < 3). The first time, .@r will be generated from the set {0, 1, 2}, the second time {1, 2} and the third time {2}, for a total of six possible sequences:

{0, 1, 2} x {1, 2} x {2} = {
  (0, 1, 2),
  (0, 2, 2),
  (1, 1, 2),
  (1, 2, 2),
  (2, 1, 2),
  (2, 2, 2)
}

Which will produce the following outputs:

(I'll skip the steps and the intermediate values of the variables during the loop, I'm sure everyone would be bored and I'd get a cramp by typing them)

Sequence (0, 1, 2) -> tmp1 = [1, 1, 1]; tmp2 = [0, 1, 2]; output : [0, 1, 2]
Sequence (0, 2, 2) -> tmp1 = [1, 0, 1]; tmp2 = [0, 2, 1]; output : [0, 2, 1]
Sequence (1, 1, 2) -> tmp1 = [0, 1, 1]; tmp2 = [1, 0, 2]; output : [1, 0, 2]
Sequence (1, 2, 2) -> tmp1 = [0, 1, 1]; tmp2 = [1, 0, 2]; output : [1, 0, 2]
Sequence (2, 1, 2) -> tmp1 = [0, 1, 1]; tmp2 = [2, 1, 0]; output : [2, 1, 0]
Sequence (2, 2, 2) -> tmp1 = [0, 0, 1]; tmp2 = [2, 0, 1]; output : [2, 0, 1]

Do you have any more details about this algorithm? Where was it described? Did KeyWorld come up with it, or it's based on something known?

Ideally, we'd want an algorithm without the bias this one has (and that can produce every result)... but I can't think of a way that's as efficient as this one (and it's bothering me to no end :D).

If the aim is just to pick k elements from a list of N elements (with k much smaller than N), then the solution is simple and we could leverage the sparseness of the Hercules arrays with something similar to:

deletearray(.@flags);
for (.@i = 0; .@i < .@count; ++.@i) {
    do {
      .@pick = rand(.@static, .@static + .@range - 1);
    } while (.@flags[.@pick - .@static]);
   .@output[.@i] = .@pick;
}

but that becomes severely inefficient as k gets closer to N (unbounded time).

In any case, if we want to keep this algorithm despite the bias, it should be noted in a comment, so that people will know what they're using.

By the way, on an unrelated note, you might want to replace .@i++ with ++.@i, since it's more efficient (the first one needs to cache the previous value of .@i to return it, while the second one doesn't)

AnnieRuru · 2015-11-18T18:01:56Z

And are you dissatisfied with me, onee-sama? ;)

OH MY ! Haru-sama, I'm sooo happy in fact
should call me imouto-chan =D

but are you aware that it doesn't allow certain outputs that would otherwise be possible?

AHHHHH !!!
%@#&(~%=: ...... I used this shuffle algorithm for about 3 years
https://rathena.org/board/topic/73771-battleground-emperium-with-shuffle-team-players/?p=154174
..... only today I know it gave bias result

now have to throw it away

Ideally, we'd want an algorithm without the bias this one has (and that can produce every result)... but I can't think of a way that's as efficient as this one (and it's bothering me to no end :D).

the best shuffle algorithm to shuffle whole array is Fisher Yates shuffle
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
just google search shuffle cards, the word Fisher Yates shuffle pops up non-stop

In any case, if we want to keep this algorithm despite the bias, it should be noted in a comment, so that people will know what they're using.

no, now I know that shuffle algorithm give bias result, I wont use it anymore

Do you have any more details about this algorithm? Where was it described? Did KeyWorld come up with it, or it's based on something known?

I got it from here
https://www.eathena.ws/board/index.php?s=&showtopic=269819&view=findpost&p=1477370

keyworld always invent new stuffs and new technique, always amazed us
like the sudoku script ...... eathena forum down, I can't give the link now

http://upaste.me/59b622000817023d1
continue on next post

EDIT: I made a mistake on the last one =/

AnnieRuru · 2015-11-18T18:24:43Z

so the non-bias shuffle algorithm, that gives consistent result

on 600 tries, it give 1 of the 6 combinations averagely 100 times, is

fisher yates shuffle modern method
your snippet there
keyworld's infinite loop shuffle

yours and keyworld's one are quite near, so let me test it on the performance

http://upaste.me/3e21220016ec96594

yours = 7894 mili-seconds
keyworld = 405531 mili-seconds

so I guess yours better

EDIT: err ... need more performance test ...

wait ...
WHAT THE ?
fisher yates shuffle perform even worst than keyworld's
fisher yates = 702~731 mili-seconds

so I guess has to stick to yours =/
perform fastest with hercules script engine

change to haru's snippet available

forgot the whitespace ...

AnnieRuru · 2015-11-18T19:38:15Z

tested with

prontera,153,180,5  script  F ShuffleNumbers 2  1_F_MARIA,{
    callfunc "F_ShuffleNumbers", -50, 50, .@output, 5;
    for ( .@i = 0; .@i < 5; ++.@i )
        .@print$[.@i] = .@output[.@i];
    dispbottom implode( .@print$[.@i], "," );
    end;
}

http://adrianquark.blogspot.my/2008/09/how-to-shuffle-array-correctly.html
after I read this, start to scare a little bit and ...

prontera,150,180,5  script  F ShuffleNumbers    1_F_MARIA,{
    dispbottom "=== "+ strnpcinfo(0) +" ===";
    .@loop = 2400;
    freeloop true;
    for ( .@l = 0; .@l < .@loop; ++.@l ) {
        callfunc "F_ShuffleNumbers", 0, 3, .@output;
        for ( .@i = 0; .@i < 4; ++.@i )
            .@tmp$[.@i] = .@output[.@i] +"";
        .@print$ = implode( .@tmp$, "," );
        setarray .@resultset$,
            "0,1,2,3","0,1,3,2","0,2,1,3","0,2,3,1","0,3,1,2","0,3,2,1",
            "1,0,2,3","1,0,3,2","1,2,0,3","1,2,3,0","1,3,0,2","1,3,2,0",
            "2,0,1,3","2,0,3,1","2,1,0,3","2,1,3,0","2,3,1,0","2,3,0,1",
            "3,0,1,2","3,0,2,1","3,1,0,2","3,1,2,0","3,2,0,1","3,2,1,0";
        for ( .@i = 0; .@i < 24; ++.@i ) {
            if ( .@print$ == .@resultset$[.@i] ) {
                .@resultset[.@i]++;
                break;
            }
        }
    }
    for ( .@i = 0; .@i < 24; ++.@i ) {
        dispbottom "result set "+( .@i +1 )+" '"+ .@resultset$[.@i] +"' has "+ .@resultset[.@i] +" times.";
        .@total += .@resultset[.@i];
    }
    dispbottom "total = "+ .@total;
    end;
}

still can give every one of the 24 combinations correctly

MishimaHaruna · 2015-11-19T13:53:13Z

yours = 7894 mili-seconds
keyworld = 405531 mili-seconds

so I guess yours better

EDIT: err ... need more performance test ...

wait ...
WHAT THE ?
fisher yates shuffle perform even worst than keyworld's
fisher yates = 702~731 mili-seconds

My algorithm and KeyWorld's are very similar, but mine uses arrays (which are very optimized in the Hercules scripting engine), while his uses string operations. Operations on strings are always slow, and compare is especially slow (depends on the OS/compiler implementation, and is likely to perform especially poorly on MSVC for long strings, unless they changed their strstr implementation recently). In his while loop, he does an O(n) (if lucky) operation for each attempt to generate a non already used number, while I do an operation (an array lookup is a DBMap lookup) that's potentially O(log(n)) (DBMap's RB-tree lookup), or O(1) (DBMap cache hit).

The problem in both, is that they require an unbounded amount of draws, and it gets worse and worse when the array is very large and you want to shuffle it completely. It's not a problem if you only need to draw a relatively small amount of shuffled numbers, but as the size of the output and the size of the input get closer, the chance to generate a "new" number gets very small.

In defense of Fisher-Yates, it should perform well when shuffling large sets completely. The only bottleneck I see is the partial deletearray operation. That's a slow operation because it needs to move the data in the array, and since our arrays are DBMap-backed, it'll cause a very large amount of lookups and edits (and rb-tree rebalancing). I'd be curious to know what happens if you optimized it this way:

function        script  modern_fisher_yates     {
        deletearray getarg(1);
        .@size = .@tmpsize = getarg(0);
        if ( .@size <= 0 )
                return 0;
        for ( .@i = 0; .@i < .@size; ++.@i )
                .@array_[.@i] = .@i;
        for ( .@i = 0; .@i < .@size; ++.@i ) {
                .@rand = rand(.@tmpsize);
                set getelementofarray( getarg(1), .@i ), .@array_[.@rand];
                .@array_[.@rand] = .@array[--.@tmpsize]; // Optimization
                // deletearray(.@array_[.@tmpsize], 1); // This is actually not necessary
        }
        return .@size;
}

Instead of removing elements from the middle of the array, I'm replacing the used elements with the last one (then optionally removing the last element -- that's much faster than removing from the middle, but even so not necessary, so I commented it out). That should perform better.

MishimaHaruna · 2015-11-19T15:00:55Z

Here's the test on my laptop (took the average out of 100 iterations of each algorithm -- results seem more consistent when I repeat it):

Each result set shows, in order, Haru_shuffle, keyworld_inf_shuffle, modern_fisher_yates, modern_fisher_yates_haru
script used: http://upaste.me/98067f

with .@samplesize 100
[Debug]: script debug : 0 110000000 : time used = 1.03 ms
[Debug]: script debug : 0 110000000 : time used = 3.65 ms
[Debug]: script debug : 0 110000000 : time used = 1.25 ms
[Debug]: script debug : 0 110000000 : time used = 0.73 ms

with arrays of size 1000
[Debug]: script debug : 0 110000000 : time used = 16.85 ms
[Debug]: script debug : 0 110000000 : time used = 386.40 ms
[Debug]: script debug : 0 110000000 : time used = 148.39 ms
[Debug]: script debug : 0 110000000 : time used = 10.89 ms

with arrays of size 5000
[Debug]: script debug : 0 110000000 : time used = 171.96 ms
[Debug]: script debug : 0 110000000 : time used = 13902.73 ms
[Debug]: script debug : 0 110000000 : time used = 13504.53 ms
[Debug]: script debug : 0 110000000 : time used = 135.17 ms

Test environment was:

[Info]: Hercules 64-bit for Mac OS X
[Info]: Git revision (src): '82e8740e31c64df43c203f847c81a10d9b74adb2'
[Info]: Git revision (scripts): '82e8740e31c64df43c203f847c81a10d9b74adb2'
[Info]: OS version: 'Mac OS X 10.11.1 15B42 [x86_64]'
[Info]: CPU: 'Intel Core i5 (2.3 GHz) [2]'
[Info]: Compiled with Clang v7.0.0
[Info]: Compile Flags: -g -O2 -pipe -ffast-math -fvisibility=hidden -Wall -Wextra -Wno-sign-compare -Wno-unused-parameter -Wempty-body -Wnewline-eof -Wint-conversion -Wenum-conversion -Wshorten-64-to-32 -Wconstant-conversion -Wbool-conversion -Wformat-security -Wno-format-nonliteral -Wno-switch -Wno-missing-field-initializers -Wshadow -fno-strict-aliasing -ggdb -DHAVE_EXECINFO -DMAXCONN=16384 -I../common -DHAS_TLS -DHAVE_SETRLIMIT -DHAVE_STRNLEN -DDEBUG -DDISABLE_RENEWAL -I/usr/include

AnnieRuru · 2015-11-19T15:44:31Z

yeah I already tested it before this post came out

erm .. now I re-read the wiki again.
the method I used deletearray is apparently pencil-and-paper method =/

yours that slashed out the index is modern method
so I was wrong about it

EDIT: of course I tested with the script in post ... #5 for the consistency and bug test

change again to Fisher Yate's shuffle modern method

- ... fix the example function name ...

AnnieRuru · 2015-11-19T16:22:47Z

the problem is ... now the script in post#1
need freeloop now ...

AnnieRuru · 2015-11-22T07:31:34Z

eathena forum up
this is keyworld's sudoku script
https://www.eathena.ws/board/index.php?s=&showtopic=273636&view=findpost&p=1500521
I'm sure haru will be amazed as well =D

MishimaHaruna · 2015-11-23T11:06:31Z

Wow, truly amazing. That's surely an elegant and compact way to check a sudoku solution :o

MishimaHaruna · 2015-12-13T02:10:25Z

Merged as bb214d4, thank you!

Add 'F_ShuffleNumbers' as Global Function

545a80a

MishimaHaruna added the status:inprogress Issue is being worked on / the pull request is still a WIP label Nov 16, 2015

AnnieRuru added 2 commits November 19, 2015 03:19

Add 'F_ShuffleNumbers' as Global Function

0b8ff08

change to haru's snippet available

Add 'F_ShuffleNumbers' as Global Function

4862d9a

forgot the whitespace ...

AnnieRuru added 2 commits November 19, 2015 23:49

*Add 'F_ShuffleNumbers' as Global Function

507d3e6

change again to Fisher Yate's shuffle modern method

Add 'F_ShuffleNumbers' as Global Function

ea798b6

- ... fix the example function name ...

MishimaHaruna closed this Dec 13, 2015

MishimaHaruna removed the status:code-review Awaiting code review label Dec 13, 2015

AnnieRuru mentioned this pull request Apr 2, 2018

Cleaned up custom battleground scripts rathena/rathena#3025

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add 'F_ShuffleNumbers' as Global Function #872

Add 'F_ShuffleNumbers' as Global Function #872

AnnieRuru commented Nov 16, 2015

MishimaHaruna commented Nov 18, 2015

AnnieRuru commented Nov 18, 2015

AnnieRuru commented Nov 18, 2015

AnnieRuru commented Nov 18, 2015

MishimaHaruna commented Nov 19, 2015

MishimaHaruna commented Nov 19, 2015

AnnieRuru commented Nov 19, 2015

AnnieRuru commented Nov 19, 2015

AnnieRuru commented Nov 22, 2015

MishimaHaruna commented Nov 23, 2015

MishimaHaruna commented Dec 13, 2015

Add 'F_ShuffleNumbers' as Global Function #872

Add 'F_ShuffleNumbers' as Global Function #872

Conversation

AnnieRuru commented Nov 16, 2015

MishimaHaruna commented Nov 18, 2015

AnnieRuru commented Nov 18, 2015

AnnieRuru commented Nov 18, 2015

AnnieRuru commented Nov 18, 2015

MishimaHaruna commented Nov 19, 2015

MishimaHaruna commented Nov 19, 2015

AnnieRuru commented Nov 19, 2015

AnnieRuru commented Nov 19, 2015

AnnieRuru commented Nov 22, 2015

MishimaHaruna commented Nov 23, 2015

MishimaHaruna commented Dec 13, 2015