Skip to content

Add tsort#359

Closed
akiradeveloper wants to merge 10 commits into
uutils:masterfrom
akiradeveloper:feature/add-tsort-v1
Closed

Add tsort#359
akiradeveloper wants to merge 10 commits into
uutils:masterfrom
akiradeveloper:feature/add-tsort-v1

Conversation

@akiradeveloper
Copy link
Copy Markdown
Contributor

Signed-off-by: Akira Hayakawa ruby.wktk@gmail.com


Another test project is here

https://github.com/akiradeveloper/rust-tsort

Testing

It generates random graph and compares both outputs of GNU-version and Rust-version. I think Rust-version works correctly.

Performance

The current version doesn't perform well for especially large scale dataset.

Here is a experimental result from the above test project.

The test is first create DAG of N nodes and N(N-1)/2 edges and then feed the dataset into both versions. Note, t_new is of Rust-version and t_ref is of GNU-version.

N increases from top to bottom as logarithmic scale (10, 20, 50, 100, 200, 500, 1000, 2000, 5000). When the N is small. Rust-version is not bad however, it loses more the higher the N is. At the last case, it's 10 times worse.

The algorithm is O(|V|+|E|) which is O(|V|^2) in this case. I will improve the algorithm if I find better one.

TEST30000 result: OK, acyclic:true, time: t_new/t_ref=0.008438441/0.006900673
TEST30001 result: OK, acyclic:true, time: t_new/t_ref=0.009235107/0.006930367
TEST30002 result: OK, acyclic:true, time: t_new/t_ref=0.009223152/0.007113545
TEST30003 result: OK, acyclic:true, time: t_new/t_ref=0.013038735/0.007669914
TEST30004 result: OK, acyclic:true, time: t_new/t_ref=0.034065732/0.011519491
TEST30005 result: OK, acyclic:true, time: t_new/t_ref=0.186134332/0.03995111
TEST30006 result: OK, acyclic:true, time: t_new/t_ref=0.862109813/0.149578491
TEST30007 result: OK, acyclic:true, time: t_new/t_ref=4.584371179/0.62385227
TEST30008 result: OK, acyclic:true, time: t_new/t_ref=42.988573293/4.288322895

Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Comment thread tsort/tsort.rs Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be let n = start_nodes.shift().unwrap();. You wouldn't need the next line in this case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regression I said in below (refactoring looping n_out_edges) was found caused by this change.

diff --git a/tsort/tsort.rs b/tsort/tsort.rs
index 1a514a9..b0c6283 100644
--- a/tsort/tsort.rs
+++ b/tsort/tsort.rs
@@ -154,7 +154,7 @@ impl Graph {
         }

         while !start_nodes.is_empty() {
-            let n = start_nodes[0].clone();
+            let n = start_nodes.shift().unwrap();
             start_nodes.remove(0);

             self.result.push(n.clone());

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. You need to remove start_nodes.remove(0) below. Otherwise, it will try to remove the node twice.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The double removing removed and it came back to life. Thanks.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. :)

@Arcterus
Copy link
Copy Markdown
Collaborator

Everything else seems fine. The performance issue can be worked out later. 👍

Comment thread tsort/tsort.rs Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo.

Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
@akiradeveloper
Copy link
Copy Markdown
Contributor Author

All proposals applied.

No regression found and performance really improved. I think refactoring that inner loop did it.

I will submit another pull request

akira@Hercules:~/rust-tsort$ rake run_tests
TEST0 result: OK, acyclic:false, time: t_new/t_ref=0.008577448/0.006712845
TEST1 result: OK, acyclic:false, time: t_new/t_ref=0.007850375/0.006539402
TEST10000 result: OK, acyclic:false, time: t_new/t_ref=0.009184312/0.00813009
TEST10001 result: OK, acyclic:false, time: t_new/t_ref=0.009967213/0.007744685
TEST10002 result: OK, acyclic:false, time: t_new/t_ref=0.009206604/0.006665816
TEST10003 result: OK, acyclic:true, time: t_new/t_ref=0.011407594/0.00696269
TEST10004 result: OK, acyclic:false, time: t_new/t_ref=0.009013078/0.006981327
TEST10005 result: OK, acyclic:false, time: t_new/t_ref=0.010660868/0.007739061
TEST10006 result: OK, acyclic:true, time: t_new/t_ref=0.008568894/0.00687528
TEST10007 result: OK, acyclic:false, time: t_new/t_ref=0.008746539/0.006904187
TEST10008 result: OK, acyclic:false, time: t_new/t_ref=0.008943133/0.006871977
TEST10009 result: OK, acyclic:true, time: t_new/t_ref=0.011155742/0.007051623
TEST2 result: OK, acyclic:true, time: t_new/t_ref=0.009606108/0.006861813
TEST20000 result: OK, acyclic:true, time: t_new/t_ref=0.016767014/0.008741542
TEST20001 result: OK, acyclic:true, time: t_new/t_ref=0.109895699/0.031641821
TEST20002 result: OK, acyclic:true, time: t_new/t_ref=0.38382173/0.102129412
TEST20003 result: OK, acyclic:true, time: t_new/t_ref=0.030457973/0.011940991
TEST20004 result: OK, acyclic:true, time: t_new/t_ref=0.427178843/0.112283163
TEST20005 result: OK, acyclic:true, time: t_new/t_ref=0.222073998/0.061800087
TEST20006 result: OK, acyclic:true, time: t_new/t_ref=0.506039759/0.134829509
TEST20007 result: OK, acyclic:true, time: t_new/t_ref=0.057163792/0.018484568
TEST20008 result: OK, acyclic:true, time: t_new/t_ref=0.079317631/0.024882092
TEST20009 result: OK, acyclic:true, time: t_new/t_ref=0.022170054/0.009995337
TEST3 result: OK, acyclic:false, time: t_new/t_ref=0.008100273/0.006701319
TEST30000 result: OK, acyclic:true, time: t_new/t_ref=0.008127883/0.006822402
TEST30001 result: OK, acyclic:true, time: t_new/t_ref=0.008253873/0.006932948
TEST30002 result: OK, acyclic:true, time: t_new/t_ref=0.009098254/0.006977331
TEST30003 result: OK, acyclic:true, time: t_new/t_ref=0.012857694/0.009357036
TEST30004 result: OK, acyclic:true, time: t_new/t_ref=0.030292198/0.011266431
TEST30005 result: OK, acyclic:true, time: t_new/t_ref=0.136233344/0.040125623
TEST30006 result: OK, acyclic:true, time: t_new/t_ref=0.562640418/0.149092327
TEST30007 result: OK, acyclic:true, time: t_new/t_ref=2.460193113/0.613513872
TEST30008 result: OK, acyclic:true, time: t_new/t_ref=16.838372892/4.308812779
TEST4 result: OK, acyclic:true, time: t_new/t_ref=0.007966765/0.006680662
TEST5 result: OK, acyclic:false, time: t_new/t_ref=0.007779832/0.007144855
failures=[]
max ratio=4.010004052524504
min ratio=1.1905286178404915
ave ratio=2.5372806169406505
std dev=1.194178433942098

Comment thread tsort/tsort.rs
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be written as

print!("{name} v{version}

Usage:
    {name} [OPTIONS] FILE

{usage}",
        name = NAME,
        vers = VERSION,
        usage = getopts::usage(...));

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll probably refactor all of the utilities to use this at some point.

@akiradeveloper akiradeveloper mentioned this pull request Jul 20, 2014
@Arcterus
Copy link
Copy Markdown
Collaborator

Yay! 😄

Yes, it's faster because of the inner loop. In case you're interested, the reason for this is the loop no longer has bounds checks, and the elements in the Vec are no longer being shifted on each loop iteration.

@Arcterus
Copy link
Copy Markdown
Collaborator

I assume you want me to merge the other pull request, so I can close this one, correct?

@akiradeveloper
Copy link
Copy Markdown
Contributor Author

I assume you want me to merge the other pull request, so I can close this one, correct?

Yes. Please close and merge #361

@Arcterus Arcterus closed this Jul 20, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants