Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add 'F_ShuffleNumbers' as Global Function #872

Closed
wants to merge 5 commits into from

Conversation

AnnieRuru
Copy link
Contributor

@MishimaHaruna
you are the only one left can play algorithm script with me =/

I kinda like to add this function into svn

have been play with

prontera,155,185,5  script  sakjdhaksdjh    1_F_MARIA,{
    mes "random item giver. Input how many items you want";
    next;
    input .@num, 1, 10;
    callfunc( "F_ShuffleNumbers", 0, .maxitem, .@output, .@num );
    for ( .@i = 0; .@i < .@num; .@i++ )
        getitem .item[.@output[.@i]], 1;
    close;
OnInit:
    freeloop true;
    for ( .@i = 501; .@i < 32768; .@i++ )
        if ( getitemname(.@i) != "null" )
            .item[.@j++] = .@i;
    .maxitem = .@j;
    end;
}

I prefer this function made by keyworld over this
http://www.fredosaurus.com/notes-cpp/misc/random-shuffle.html
because the above link will ask you to create a temporary array length equal to max-min
like the script I wrote, there is about 9000+ entry from itemDB
this one made by keyworld will only loop as few as how many index the user need

and BEFORE you say this is useless
there is a popular custom script that I think can use it right now
https://github.com/HerculesWS/Hercules/blob/master/npc/custom/quests/hunting_missions.txt#L75
although I was the one proposed that SQL method to Euphy, but I still think its ugly
if this function is merge, the script will be much cleaner

    callfunc "F_ShuffleNumbers", 0, getarraysize(.AllowMobs) -1, .@output, .Quests;
    for (set .@i,0; .@i<.Quests; set .@i,.@i+1) {
        setd "Mission"+.@i, .AllowMobs[.@output[.@i]];
        setd "Mission"+.@i +"_",0;
    }

@MishimaHaruna MishimaHaruna added the status:inprogress Issue is being worked on / the pull request is still a WIP label Nov 16, 2015
@MishimaHaruna
Copy link
Member

you are the only one left can play algorithm script with me =/

And are you dissatisfied with me, onee-sama? ;)

This algorithm is very interesting, I had to read it a few times to figure out what it was doing.
It is indeed very efficient, but are you aware that it doesn't allow certain outputs that would otherwise be possible?

Let's take for example (I'll pick a simple one just to show this, but it also happens with other inputs) the call callfunc("F_ShuffleNumbers", 0, 2, .@foo), to sort three numbers from the set [0, 1, 2].
There are six theoretical outputs: 0 1 2, 0 2 1, 1 0 2, 1 2 0, 2 0 1, 2 1 0. It's impossible for the function as it is now, to produce 1 2 0 as output.

When the function is called with those arguments, both .@range and .@count will be 3, so the while loop will be repeated three times (with .@i = {0, 1, 2} < 3). The first time, .@r will be generated from the set {0, 1, 2}, the second time {1, 2} and the third time {2}, for a total of six possible sequences:

{0, 1, 2} x {1, 2} x {2} = {
  (0, 1, 2),
  (0, 2, 2),
  (1, 1, 2),
  (1, 2, 2),
  (2, 1, 2),
  (2, 2, 2)
}

Which will produce the following outputs:

(I'll skip the steps and the intermediate values of the variables during the loop, I'm sure everyone would be bored and I'd get a cramp by typing them)

Sequence (0, 1, 2) -> tmp1 = [1, 1, 1]; tmp2 = [0, 1, 2]; output : [0, 1, 2]
Sequence (0, 2, 2) -> tmp1 = [1, 0, 1]; tmp2 = [0, 2, 1]; output : [0, 2, 1]
Sequence (1, 1, 2) -> tmp1 = [0, 1, 1]; tmp2 = [1, 0, 2]; output : [1, 0, 2]
Sequence (1, 2, 2) -> tmp1 = [0, 1, 1]; tmp2 = [1, 0, 2]; output : [1, 0, 2]
Sequence (2, 1, 2) -> tmp1 = [0, 1, 1]; tmp2 = [2, 1, 0]; output : [2, 1, 0]
Sequence (2, 2, 2) -> tmp1 = [0, 0, 1]; tmp2 = [2, 0, 1]; output : [2, 0, 1]

Do you have any more details about this algorithm? Where was it described? Did KeyWorld come up with it, or it's based on something known?

Ideally, we'd want an algorithm without the bias this one has (and that can produce every result)... but I can't think of a way that's as efficient as this one (and it's bothering me to no end :D).

If the aim is just to pick k elements from a list of N elements (with k much smaller than N), then the solution is simple and we could leverage the sparseness of the Hercules arrays with something similar to:

deletearray(.@flags);
for (.@i = 0; .@i < .@count; ++.@i) {
    do {
      .@pick = rand(.@static, .@static + .@range - 1);
    } while (.@flags[.@pick - .@static]);
   .@output[.@i] = .@pick;
}

but that becomes severely inefficient as k gets closer to N (unbounded time).

In any case, if we want to keep this algorithm despite the bias, it should be noted in a comment, so that people will know what they're using.

By the way, on an unrelated note, you might want to replace .@i++ with ++.@i, since it's more efficient (the first one needs to cache the previous value of .@i to return it, while the second one doesn't)

@AnnieRuru
Copy link
Contributor Author

And are you dissatisfied with me, onee-sama? ;)

OH MY ! Haru-sama, I'm sooo happy in fact
should call me imouto-chan =D

but are you aware that it doesn't allow certain outputs that would otherwise be possible?

AHHHHH !!!
%@#&(~%=: ...... I used this shuffle algorithm for about 3 years
https://rathena.org/board/topic/73771-battleground-emperium-with-shuffle-team-players/?p=154174
..... only today I know it gave bias result

now have to throw it away

Ideally, we'd want an algorithm without the bias this one has (and that can produce every result)... but I can't think of a way that's as efficient as this one (and it's bothering me to no end :D).

the best shuffle algorithm to shuffle whole array is Fisher Yates shuffle
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
just google search shuffle cards, the word Fisher Yates shuffle pops up non-stop

In any case, if we want to keep this algorithm despite the bias, it should be noted in a comment, so that people will know what they're using.

no, now I know that shuffle algorithm give bias result, I wont use it anymore

Do you have any more details about this algorithm? Where was it described? Did KeyWorld come up with it, or it's based on something known?

I got it from here
https://www.eathena.ws/board/index.php?s=&showtopic=269819&view=findpost&p=1477370

keyworld always invent new stuffs and new technique, always amazed us
like the sudoku script ...... eathena forum down, I can't give the link now

http://upaste.me/59b622000817023d1
continue on next post

EDIT: I made a mistake on the last one =/

@AnnieRuru
Copy link
Contributor Author

so the non-bias shuffle algorithm, that gives consistent result

  • on 600 tries, it give 1 of the 6 combinations averagely 100 times, is
  1. fisher yates shuffle modern method
  2. your snippet there
  3. keyworld's infinite loop shuffle

yours and keyworld's one are quite near, so let me test it on the performance

http://upaste.me/3e21220016ec96594

yours = 7894 mili-seconds
keyworld = 405
531 mili-seconds

so I guess yours better

EDIT: err ... need more performance test ...

wait ...
WHAT THE ?
fisher yates shuffle perform even worst than keyworld's
fisher yates = 702~731 mili-seconds

so I guess has to stick to yours =/
perform fastest with hercules script engine

change to haru's snippet available
@AnnieRuru
Copy link
Contributor Author

tested with

prontera,153,180,5  script  F ShuffleNumbers 2  1_F_MARIA,{
    callfunc "F_ShuffleNumbers", -50, 50, .@output, 5;
    for ( .@i = 0; .@i < 5; ++.@i )
        .@print$[.@i] = .@output[.@i];
    dispbottom implode( .@print$[.@i], "," );
    end;
}

http://adrianquark.blogspot.my/2008/09/how-to-shuffle-array-correctly.html
after I read this, start to scare a little bit and ...

prontera,150,180,5  script  F ShuffleNumbers    1_F_MARIA,{
    dispbottom "=== "+ strnpcinfo(0) +" ===";
    .@loop = 2400;
    freeloop true;
    for ( .@l = 0; .@l < .@loop; ++.@l ) {
        callfunc "F_ShuffleNumbers", 0, 3, .@output;
        for ( .@i = 0; .@i < 4; ++.@i )
            .@tmp$[.@i] = .@output[.@i] +"";
        .@print$ = implode( .@tmp$, "," );
        setarray .@resultset$,
            "0,1,2,3","0,1,3,2","0,2,1,3","0,2,3,1","0,3,1,2","0,3,2,1",
            "1,0,2,3","1,0,3,2","1,2,0,3","1,2,3,0","1,3,0,2","1,3,2,0",
            "2,0,1,3","2,0,3,1","2,1,0,3","2,1,3,0","2,3,1,0","2,3,0,1",
            "3,0,1,2","3,0,2,1","3,1,0,2","3,1,2,0","3,2,0,1","3,2,1,0";
        for ( .@i = 0; .@i < 24; ++.@i ) {
            if ( .@print$ == .@resultset$[.@i] ) {
                .@resultset[.@i]++;
                break;
            }
        }
    }
    for ( .@i = 0; .@i < 24; ++.@i ) {
        dispbottom "result set "+( .@i +1 )+" '"+ .@resultset$[.@i] +"' has "+ .@resultset[.@i] +" times.";
        .@total += .@resultset[.@i];
    }
    dispbottom "total = "+ .@total;
    end;
}

still can give every one of the 24 combinations correctly

@MishimaHaruna
Copy link
Member

yours = 7894 mili-seconds
keyworld = 405
531 mili-seconds

so I guess yours better

EDIT: err ... need more performance test ...

wait ...
WHAT THE ?
fisher yates shuffle perform even worst than keyworld's
fisher yates = 702~731 mili-seconds

My algorithm and KeyWorld's are very similar, but mine uses arrays (which are very optimized in the Hercules scripting engine), while his uses string operations. Operations on strings are always slow, and compare is especially slow (depends on the OS/compiler implementation, and is likely to perform especially poorly on MSVC for long strings, unless they changed their strstr implementation recently). In his while loop, he does an O(n) (if lucky) operation for each attempt to generate a non already used number, while I do an operation (an array lookup is a DBMap lookup) that's potentially O(log(n)) (DBMap's RB-tree lookup), or O(1) (DBMap cache hit).

The problem in both, is that they require an unbounded amount of draws, and it gets worse and worse when the array is very large and you want to shuffle it completely. It's not a problem if you only need to draw a relatively small amount of shuffled numbers, but as the size of the output and the size of the input get closer, the chance to generate a "new" number gets very small.

In defense of Fisher-Yates, it should perform well when shuffling large sets completely. The only bottleneck I see is the partial deletearray operation. That's a slow operation because it needs to move the data in the array, and since our arrays are DBMap-backed, it'll cause a very large amount of lookups and edits (and rb-tree rebalancing). I'd be curious to know what happens if you optimized it this way:

function        script  modern_fisher_yates     {
        deletearray getarg(1);
        .@size = .@tmpsize = getarg(0);
        if ( .@size <= 0 )
                return 0;
        for ( .@i = 0; .@i < .@size; ++.@i )
                .@array_[.@i] = .@i;
        for ( .@i = 0; .@i < .@size; ++.@i ) {
                .@rand = rand(.@tmpsize);
                set getelementofarray( getarg(1), .@i ), .@array_[.@rand];
                .@array_[.@rand] = .@array[--.@tmpsize]; // Optimization
                // deletearray(.@array_[.@tmpsize], 1); // This is actually not necessary
        }
        return .@size;
}

Instead of removing elements from the middle of the array, I'm replacing the used elements with the last one (then optionally removing the last element -- that's much faster than removing from the middle, but even so not necessary, so I commented it out). That should perform better.

@MishimaHaruna
Copy link
Member

Here's the test on my laptop (took the average out of 100 iterations of each algorithm -- results seem more consistent when I repeat it):

Each result set shows, in order, Haru_shuffle, keyworld_inf_shuffle, modern_fisher_yates, modern_fisher_yates_haru
script used: http://upaste.me/98067f

with .@samplesize 100
[Debug]: script debug : 0 110000000 : time used = 1.03 ms
[Debug]: script debug : 0 110000000 : time used = 3.65 ms
[Debug]: script debug : 0 110000000 : time used = 1.25 ms
[Debug]: script debug : 0 110000000 : time used = 0.73 ms
with arrays of size 1000
[Debug]: script debug : 0 110000000 : time used = 16.85 ms
[Debug]: script debug : 0 110000000 : time used = 386.40 ms
[Debug]: script debug : 0 110000000 : time used = 148.39 ms
[Debug]: script debug : 0 110000000 : time used = 10.89 ms
with arrays of size 5000
[Debug]: script debug : 0 110000000 : time used = 171.96 ms
[Debug]: script debug : 0 110000000 : time used = 13902.73 ms
[Debug]: script debug : 0 110000000 : time used = 13504.53 ms
[Debug]: script debug : 0 110000000 : time used = 135.17 ms

Test environment was:

[Info]: Hercules 64-bit for Mac OS X
[Info]: Git revision (src): '82e8740e31c64df43c203f847c81a10d9b74adb2'
[Info]: Git revision (scripts): '82e8740e31c64df43c203f847c81a10d9b74adb2'
[Info]: OS version: 'Mac OS X 10.11.1 15B42 [x86_64]'
[Info]: CPU: 'Intel Core i5 (2.3 GHz) [2]'
[Info]: Compiled with Clang v7.0.0
[Info]: Compile Flags: -g -O2 -pipe -ffast-math -fvisibility=hidden -Wall -Wextra -Wno-sign-compare -Wno-unused-parameter -Wempty-body -Wnewline-eof -Wint-conversion -Wenum-conversion -Wshorten-64-to-32 -Wconstant-conversion -Wbool-conversion -Wformat-security -Wno-format-nonliteral -Wno-switch -Wno-missing-field-initializers -Wshadow -fno-strict-aliasing -ggdb -DHAVE_EXECINFO -DMAXCONN=16384 -I../common -DHAS_TLS -DHAVE_SETRLIMIT -DHAVE_STRNLEN -DDEBUG -DDISABLE_RENEWAL -I/usr/include

@AnnieRuru
Copy link
Contributor Author

yeah I already tested it before this post came out

erm .. now I re-read the wiki again.
the method I used deletearray is apparently pencil-and-paper method =/

yours that slashed out the index is modern method
so I was wrong about it

EDIT: of course I tested with the script in post ... #5 for the consistency and bug test

change again to Fisher Yate's shuffle modern method
- ... fix the example function name ...
@AnnieRuru
Copy link
Contributor Author

the problem is ... now the script in post#1
need freeloop now ...

@AnnieRuru
Copy link
Contributor Author

eathena forum up
this is keyworld's sudoku script
https://www.eathena.ws/board/index.php?s=&showtopic=273636&view=findpost&p=1500521
I'm sure haru will be amazed as well =D

@MishimaHaruna
Copy link
Member

Wow, truly amazing. That's surely an elegant and compact way to check a sudoku solution :o

@MishimaHaruna MishimaHaruna added type:enhancement Issue describes an enhancement or feature that should be implemented component:scripts Affecting the scripts and NPCs status:code-review Awaiting code review and removed status:inprogress Issue is being worked on / the pull request is still a WIP labels Nov 23, 2015
@MishimaHaruna
Copy link
Member

Merged as bb214d4, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:scripts Affecting the scripts and NPCs type:enhancement Issue describes an enhancement or feature that should be implemented
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants