Make VList's children Rc'ed #3050

Merged
merged 16 commits into from Apr 2, 2023

Conversation

futursolo
Member

@futursolo futursolo commented Dec 19, 2022

Description

Separated from #3042 to address the bundle size issue.
This pull request makes VList store its children as Option<Rc<Vec<VNode>>>.

This improves the performance of {self.children.clone()}: running the SSR benchmark locally shows an improvement of about 5% on the benchmark based on the function_router example. However, since we do not have a benchmark that depends heavily on cloning children, the performance increase might not be fully demonstrated.
But since ChildrenRenderer<VNode> stores its VNodes in a Vec<VNode>, an allocation is still involved every time children are passed.
This will be addressed in #3042, which should give a further performance increase.

This is also not a breaking change.
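
As a rough illustration of why keeping the children behind an Rc makes children.clone() cheap, here is a minimal sketch. Node and List are hypothetical stand-ins, not yew's actual VNode and VList, and the helper names are assumptions made for this example.

```rust
use std::rc::Rc;

// Hypothetical stand-in for yew's VNode.
#[derive(Clone, Debug, PartialEq)]
struct Node(String);

// Hypothetical stand-in for VList: `None` represents an empty list.
#[derive(Clone, Debug, Default)]
struct List {
    children: Option<Rc<Vec<Node>>>,
}

impl List {
    /// Wraps the nodes in an Rc, or stores `None` when there are no nodes.
    fn from_nodes(nodes: Vec<Node>) -> Self {
        Self {
            children: if nodes.is_empty() { None } else { Some(Rc::new(nodes)) },
        }
    }

    /// Returns true when both lists point at the same underlying Vec.
    fn shares_storage_with(&self, other: &List) -> bool {
        match (&self.children, &other.children) {
            (Some(a), Some(b)) => Rc::ptr_eq(a, b),
            (None, None) => true,
            _ => false,
        }
    }
}
```

Cloning such a List only bumps a reference count; the Vec and its nodes are not copied until something needs to mutate them.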

Checklist

  • I have reviewed my own code
  • I have added tests

@futursolo futursolo added the A-yew Area: The main yew crate label Dec 19, 2022
github-actions[bot]
github-actions bot previously approved these changes Dec 19, 2022

github-actions bot commented Dec 19, 2022

Visit the preview URL for this PR (updated for commit f36a4f1):

https://yew-rs-api--pr3050-rc-vlist-5ng46tcw.web.app

(expires Sun, 09 Apr 2023 08:51:49 GMT)

🔥 via Firebase Hosting GitHub Action 🌎


github-actions bot commented Dec 19, 2022

Benchmark - SSR

Yew Master

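╭─────────────────┬───────┬──────────┬──────────┬───────────┬────────────────────╮
│ Benchmark       │ Round │ Min (ms) │ Max (ms) │ Mean (ms) │ Standard Deviation │
├─────────────────┼───────┼──────────┼──────────┼───────────┼────────────────────┤
│ Baseline        │ 10    │ 441.475  │ 472.882  │ 456.081   │ 10.591             │
│ Hello World     │ 10    │ 765.489  │ 823.710  │ 786.136   │ 18.570             │
│ Function Router │ 10    │ 2798.263 │ 2947.195 │ 2865.960  │ 47.193             │
│ Concurrent Task │ 10    │ 1009.352 │ 1011.246 │ 1010.119  │ 0.513              │
╰─────────────────┴───────┴──────────┴──────────┴───────────┴────────────────────╯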

Pull Request

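╭─────────────────┬───────┬──────────┬──────────┬───────────┬────────────────────╮
│ Benchmark       │ Round │ Min (ms) │ Max (ms) │ Mean (ms) │ Standard Deviation │
├─────────────────┼───────┼──────────┼──────────┼───────────┼────────────────────┤
│ Baseline        │ 10    │ 400.329  │ 416.155  │ 407.255   │ 4.165              │
│ Hello World     │ 10    │ 799.978  │ 843.181  │ 816.637   │ 12.053             │
│ Function Router │ 10    │ 2868.003 │ 3048.499 │ 2915.970  │ 52.352             │
│ Concurrent Task │ 10    │ 1009.032 │ 1010.655 │ 1010.209  │ 0.456              │
│ Many Providers  │ 10    │ 2079.316 │ 2189.794 │ 2126.344  │ 34.657             │
╰─────────────────┴───────┴──────────┴──────────┴───────────┴────────────────────╯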


github-actions bot commented Dec 19, 2022

Size Comparison

examples master (KB) pull request (KB) diff (KB) diff (%)
async_clock 99.157 101.879 +2.722 +2.745%
boids 169.247 171.810 +2.562 +1.514%
communication_child_to_parent 90.023 92.598 +2.574 +2.859%
communication_grandchild_with_grandparent 103.565 103.522 -0.043 -0.041%
communication_grandparent_to_grandchild 99.584 99.693 +0.109 +0.110%
communication_parent_to_child 87.276 89.931 +2.654 +3.041%
contexts 105.871 106.125 +0.254 +0.240%
counter 85.306 87.972 +2.666 +3.125%
counter_functional 85.491 88.307 +2.815 +3.293%
dyn_create_destroy_apps 88.087 90.819 +2.732 +3.102%
file_upload 99.393 102.020 +2.627 +2.643%
function_memory_game 162.298 164.169 +1.871 +1.153%
function_router 335.665 331.980 -3.685 -1.098%
function_todomvc 157.935 159.681 +1.746 +1.106%
futures 222.872 225.227 +2.354 +1.056%
game_of_life 105.903 108.117 +2.214 +2.090%
immutable 179.313 182.636 +3.322 +1.853%
inner_html 81.717 84.624 +2.907 +3.558%
js_callback 110.199 110.230 +0.031 +0.028%
keyed_list 197.192 198.554 +1.361 +0.690%
mount_point 84.885 87.732 +2.848 +3.355%
nested_list 111.360 111.226 -0.135 -0.121%
node_refs 92.496 94.783 +2.287 +2.473%
password_strength 1539.751 1542.321 +2.570 +0.167%
portals 95.641 95.771 +0.131 +0.137%
router 307.191 303.426 -3.766 -1.226%
simple_ssr 140.471 140.751 +0.280 +0.200%
ssr_router 372.402 368.772 -3.630 -0.975%
suspense 107.392 107.342 -0.050 -0.046%
timer 88.191 90.846 +2.654 +3.010%
todomvc 140.130 142.163 +2.033 +1.451%
two_apps 85.911 88.618 +2.707 +3.151%
web_worker_fib 149.982 152.561 +2.578 +1.719%
webgl 84.426 87.260 +2.834 +3.357%

⚠️ The following examples have changed their size significantly:

examples master (KB) pull request (KB) diff (KB) diff (%)
async_clock 99.157 101.879 +2.722 +2.745%
boids 169.247 171.810 +2.562 +1.514%
communication_child_to_parent 90.023 92.598 +2.574 +2.859%
communication_parent_to_child 87.276 89.931 +2.654 +3.041%
counter 85.306 87.972 +2.666 +3.125%
counter_functional 85.491 88.307 +2.815 +3.293%
dyn_create_destroy_apps 88.087 90.819 +2.732 +3.102%
file_upload 99.393 102.020 +2.627 +2.643%
function_memory_game 162.298 164.169 +1.871 +1.153%
function_router 335.665 331.980 -3.685 -1.098%
function_todomvc 157.935 159.681 +1.746 +1.106%
futures 222.872 225.227 +2.354 +1.056%
game_of_life 105.903 108.117 +2.214 +2.090%
immutable 179.313 182.636 +3.322 +1.853%
inner_html 81.717 84.624 +2.907 +3.558%
mount_point 84.885 87.732 +2.848 +3.355%
node_refs 92.496 94.783 +2.287 +2.473%
router 307.191 303.426 -3.766 -1.226%
timer 88.191 90.846 +2.654 +3.010%
todomvc 140.130 142.163 +2.033 +1.451%
two_apps 85.911 88.618 +2.707 +3.151%
web_worker_fib 149.982 152.561 +2.578 +1.719%
webgl 84.426 87.260 +2.834 +3.357%

@futursolo
Member Author

Looks like function_router is not a good example for benchmarking bundle size for this particular pull request.

github-actions[bot]
github-actions bot previously approved these changes Dec 19, 2022
This reverts commit 3ca55be.
@futursolo
Member Author

futursolo commented Dec 19, 2022

Twiggy:

       +2643 ┊ <T as alloc::slice::hack::ConvertVec>::to_vec::hc92fe50ec0e432f4

The size increase occurs because rustc can no longer determine at compile time whether Clone is used for VNode (in the Vec holding VList's children), since VList now uses Rc::make_mut and Rc::try_unwrap. Most examples are simple and do not otherwise clone children, so they now carry code they did not need before, hence the size increase.

This pull request is optimised for applications that actually use children.clone(), for which a decrease can be seen in the function_router and router examples. I would expect most applications of reasonable complexity to include this usage.
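
To make the Clone dependency concrete, here is a minimal sketch of the usual Rc::try_unwrap pattern with a clone fallback, using only the standard library. The helper name into_owned is hypothetical and this is not the code from the pull request.

```rust
use std::rc::Rc;

/// Hypothetical helper: turns shared children into an owned Vec.
/// The Vec is moved out when this is the only reference and cloned otherwise,
/// so the clone path has to be compiled in even if it never runs.
fn into_owned<T: Clone>(children: Rc<Vec<T>>) -> Vec<T> {
    Rc::try_unwrap(children).unwrap_or_else(|shared| (*shared).clone())
}
```

Because the fallback is reachable from every caller, the Vec clone code (which goes through the slice to_vec machinery, consistent with the symbol in the Twiggy output) can no longer be dropped as dead code.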

I have added a Many Providers test, which does not show much of a difference in this pull request:

╭─────────────────┬───────┬──────────┬──────────┬───────────┬────────────────────╮
│ Benchmark       │ Round │ Min (ms) │ Max (ms) │ Mean (ms) │ Standard Deviation │
├─────────────────┼───────┼──────────┼──────────┼───────────┼────────────────────┤
│ Baseline        │ 10    │ 300.598  │ 301.481  │ 300.914   │ 0.265              │
│ Hello World     │ 10    │ 373.998  │ 382.672  │ 376.125   │ 3.212              │
│ Function Router │ 10    │ 1161.455 │ 1189.956 │ 1169.719  │ 10.572             │
│ Concurrent Task │ 10    │ 1009.627 │ 1016.863 │ 1012.009  │ 2.213              │
│ Many Providers  │ 10    │ 1146.857 │ 1183.132 │ 1158.769  │ 14.629             │
╰─────────────────┴───────┴──────────┴──────────┴───────────┴────────────────────╯

But with a slight modification, with the benchmark adapted for #3042, the performance would increase by about 20%:

╭─────────────────┬───────┬──────────┬──────────┬───────────┬────────────────────╮
│ Benchmark       │ Round │ Min (ms) │ Max (ms) │ Mean (ms) │ Standard Deviation │
├─────────────────┼───────┼──────────┼──────────┼───────────┼────────────────────┤
│ Baseline        │ 10    │ 300.385  │ 301.135  │ 300.652   │ 0.274              │
│ Hello World     │ 10    │ 377.968  │ 385.530  │ 379.333   │ 2.535              │
│ Function Router │ 10    │ 1112.299 │ 1134.035 │ 1117.234  │ 7.269              │
│ Concurrent Task │ 10    │ 1009.125 │ 1012.555 │ 1010.737  │ 1.059              │
│ Many Providers  │ 10    │ 895.768  │ 903.255  │ 900.444   │ 2.456              │
╰─────────────────┴───────┴──────────┴──────────┴───────────┴────────────────────╯
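
For reference, the "about 20%" figure can be checked against the two local tables above by comparing the Many Providers means (1158.769 ms and 900.444 ms). A small sketch of that arithmetic follows; percent_reduction is a hypothetical helper name.

```rust
/// Relative reduction in mean run time, as a percentage of the `before` value.
fn percent_reduction(before_ms: f64, after_ms: f64) -> f64 {
    (before_ms - after_ms) / before_ms * 100.0
}
```

With the means above this gives roughly a 22% reduction for Many Providers, and roughly 4.5% for Function Router (1169.719 ms to 1117.234 ms).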

I think the performance increase might be worth the bundle size tradeoff in this case?

github-actions[bot]
github-actions bot previously approved these changes Dec 20, 2022
@futursolo futursolo enabled auto-merge (squash) December 21, 2022 01:30
@voidpumpkin
Member

Would work for me, if PR gets updated.

github-actions[bot]
github-actions bot previously approved these changes Apr 2, 2023
@@ -14,7 +15,7 @@ enum FullyKeyedState {
 #[derive(Clone, Debug)]
 pub struct VList {
     /// The list of child [VNode]s
-    pub(crate) children: Vec<VNode>,
+    pub(crate) children: Option<Rc<Vec<VNode>>>,
Member

Maybe this is not relevant, but I was wondering whether using IArray would help here?

I see 2 possible benefits, but I have not investigated in detail:

  1. IArray uses Rc<[T]> instead of Rc<Vec<T>>. I guess that would save an allocation somewhere.
  2. IArray has a variant Static(&'static [T]), which could maybe be used in place of None here.
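
To picture the two points above, here is a minimal sketch of an enum with a static variant and an Rc<[T]> variant. SharedSlice is a hypothetical type modelled only on the description in this comment; it is not the actual definition of IArray from the implicit-clone crate.

```rust
use std::rc::Rc;

// Hypothetical enum modelled on the description above.
#[derive(Clone, Debug)]
enum SharedSlice<T: 'static> {
    // Borrowed from static memory: no allocation and no reference count.
    Static(&'static [T]),
    // Elements live in the same allocation as the reference counts.
    Rc(Rc<[T]>),
}

impl<T: 'static> SharedSlice<T> {
    /// An empty value that needs no allocation (the role `None` plays in the PR).
    const fn empty() -> Self {
        SharedSlice::Static(&[])
    }

    fn as_slice(&self) -> &[T] {
        match self {
            SharedSlice::Static(s) => *s,
            SharedSlice::Rc(s) => &s[..],
        }
    }
}

impl<T: 'static> From<Vec<T>> for SharedSlice<T> {
    fn from(items: Vec<T>) -> Self {
        // Moves the elements into a single reference-counted slice allocation.
        SharedSlice::Rc(Rc::from(items))
    }
}
```

Note that a slice behind an Rc has a fixed length, so this shape offers no in-place push or remove.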

Member

(The only issue is the hard requirement for VNode to implement ImplicitClone, but I think we can allow that because in the end this would be desirable; see discussion #3022.)

Member Author

@futursolo futursolo Apr 2, 2023

I think VList implements DerefMut<Target = Vec<VNode>>, which allows manipulation without cloning the entire array when the reference count is 1. We may not be able to keep this behaviour with IArray if it uses slices.
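
The copy-on-write behaviour described here can be sketched with Rc::make_mut. List is a hypothetical stand-in for VList, and children_mut and storage_ptr are illustrative names rather than yew's implementation.

```rust
use std::rc::Rc;

// Hypothetical stand-in for VList, generic over the node type.
#[derive(Clone, Debug)]
struct List<T> {
    children: Option<Rc<Vec<T>>>,
}

impl<T: Clone> List<T> {
    fn new() -> Self {
        Self { children: None }
    }

    /// Returns a mutable Vec, allocating an empty one on first use.
    /// `Rc::make_mut` clones the Vec only if another handle still shares it.
    fn children_mut(&mut self) -> &mut Vec<T> {
        let shared = self.children.get_or_insert_with(|| Rc::new(Vec::new()));
        Rc::make_mut(shared)
    }

    /// Address of the shared Vec, used here only to observe when a copy happens.
    fn storage_ptr(&self) -> Option<*const Vec<T>> {
        self.children.as_ref().map(|rc| Rc::as_ptr(rc))
    }
}
```

While the list holds the only reference, pushes happen in place; once a clone exists, the first mutation copies the Vec and leaves the clone untouched.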

Member

Now that you mention it, I think I had issues when converting the code because of that haha

Comment on lines +50 to +58
        match self.children {
            Some(ref m) => m,
            None => {
                // This is mutable because the Vec<VNode> is not Sync
                static mut EMPTY: Vec<VNode> = Vec::new();
                // SAFETY: The EMPTY value is always read-only
                unsafe { &EMPTY }
            }
        }
Member

This looks a bit tricky overall. Is it possible to achieve this in safe Rust?

I'm also not sure I understand why we need an Option around the children type. It is probably related to this whole thing.

Can't we have a default static value like this?

static EMPTY: Rc<Vec<VNode>> = Rc::new(Vec::new())

(Maybe a thread-local one but you see the idea)

Member Author

Since VNode is !Send (and !Sync), a global static EMPTY: &Vec<VNode> = &Vec::new() could not be shared between threads. (Rust will not compile this anyway, because a static must be Sync.)

If we use thread_local!, the variable only lives as long as the thread, not as long as the program, and hence is not 'static. This means that Rust cannot guarantee that the reference will not outlive the value it points to. (Which is not quite right, since a !Send value cannot outlive the thread it resides on, so it is technically 'static from its own point of view.) In this case, I think the only way is to teach the Rust compiler a lesson with some additional knowledge.

However, we can still do it in safe Rust by leaking the memory.
If we didn't have SSR, this might be preferable to unsafe, since there would only be 1 thread. But we also have SSR these days, so this would leak memory for each thread ever created.

thread_local! {
    static EMPTY: &'static Vec<VNode> = Box::leak(Box::default());
}

EMPTY.with(|m| *m)

If there is a way to achieve this with safe Rust, I am also very curious to know.
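
A self-contained variant of the leaking approach above, written as a sketch to show the per-thread cost. Node is a hypothetical stand-in for VNode, and the helper functions are assumptions made for this example.

```rust
use std::rc::Rc;
use std::thread;

// Hypothetical stand-in for VNode; the Rc field makes it !Send and !Sync.
#[allow(dead_code)]
struct Node(Rc<str>);

thread_local! {
    // Each thread that touches EMPTY leaks one (empty) Vec of its own.
    static EMPTY: &'static Vec<Node> = Box::leak(Box::<Vec<Node>>::default());
}

/// Returns the calling thread's leaked empty Vec.
fn empty_children() -> &'static Vec<Node> {
    EMPTY.with(|m| *m)
}

/// Address of the calling thread's EMPTY, as a plain integer so that it can
/// be sent across threads for comparison.
fn empty_addr() -> usize {
    empty_children() as *const Vec<Node> as usize
}

/// Address of EMPTY as seen from a freshly spawned thread.
fn empty_addr_on_new_thread() -> usize {
    thread::spawn(empty_addr).join().expect("thread panicked")
}
```

Within one thread every call returns the same address, while each new thread gets, and never frees, its own allocation. That is the per-thread leak under SSR mentioned above.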

Member

Maybe a stupid question but... do we actually need to impl Deref and DerefMut on VList??

The only use within its own crate is this one, and it's kind of a no-brainer to work around:

diff --git a/packages/yew/src/virtual_dom/vlist.rs b/packages/yew/src/virtual_dom/vlist.rs
index fc211ce9..40c94d5d 100644
--- a/packages/yew/src/virtual_dom/vlist.rs
+++ b/packages/yew/src/virtual_dom/vlist.rs
@@ -43,6 +43,7 @@ impl Default for VList {
     }
 }
 
+/*
 impl Deref for VList {
     type Target = Vec<VNode>;
 
@@ -58,13 +59,16 @@ impl Deref for VList {
         }
     }
 }
+*/
 
+/*
 impl DerefMut for VList {
     fn deref_mut(&mut self) -> &mut Self::Target {
         self.fully_keyed = FullyKeyedState::Unknown;
         self.children_mut()
     }
 }
+*/
 
 impl VList {
     /// Creates a new empty [VList] instance.
@@ -135,7 +139,7 @@ impl VList {
         match self.fully_keyed {
             FullyKeyedState::KnownFullyKeyed => true,
             FullyKeyedState::KnownMissingKeys => false,
-            FullyKeyedState::Unknown => self.iter().all(|c| c.has_key()),
+            FullyKeyedState::Unknown => self.children.iter().flat_map(|x| x.iter()).all(|c| c.has_key()),
         }
     }
 }
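
The replacement expression in the last hunk iterates over an Option<Rc<Vec<_>>> without going through Deref. A minimal sketch of that pattern, with hypothetical helper names and Option<u32> standing in for a child's key:

```rust
use std::rc::Rc;

/// Iterates over optional shared children, yielding nothing for `None`.
fn iter_children<T>(children: &Option<Rc<Vec<T>>>) -> impl Iterator<Item = &T> {
    children.iter().flat_map(|c| c.iter())
}

/// Hypothetical keyed check: each element is `Some` when that child has a key.
fn all_keyed(children: &Option<Rc<Vec<Option<u32>>>>) -> bool {
    iter_children(children).all(|key| key.is_some())
}
```

An absent list behaves like an empty one here, so all_keyed is vacuously true for None.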

Member Author

I guess the original intention behind having VList dereference to Vec<VNode> was to let users use all the methods available on a vector to CRUD VLists.

E.g.: In bounce, I used VList as a Vec to recursively read all available html to be used with the <head /> element.
https://github.com/bounce-rs/bounce/blob/master/crates/bounce/src/helmet/comp.rs#L27
I didn't use any mutable operations in bounce helmet, but I can imagine that use cases which manipulate a VList exist.

Comment on lines 71 to +73
     pub const fn new() -> Self {
         Self {
-            children: Vec::new(),
+            children: None,
Member

(To answer my question earlier: why do we need an option around it? It's because we want to make a const fn constructor.)

Member Author

In addition, Rc::new() also results in an allocation, even if the vector is empty. :)
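
Both points (the const fn constructor and the avoided allocation) can be seen in a small sketch. List is a hypothetical stand-in for VList and len is an illustrative helper.

```rust
use std::rc::Rc;

// Hypothetical stand-in for VList.
struct List<T> {
    children: Option<Rc<Vec<T>>>,
}

impl<T> List<T> {
    /// `None` needs no allocation and can be built in a const context.
    /// `Some(Rc::new(Vec::new()))` could not be used here, because `Rc::new`
    /// is not a const fn and allocates the reference-count block.
    const fn new() -> Self {
        Self { children: None }
    }

    /// Number of children; an absent list counts as empty.
    fn len(&self) -> usize {
        self.children.as_ref().map_or(0, |c| c.len())
    }
}

// Evaluated at compile time.
const EMPTY: List<u8> = List::new();
```

Vec::new() is itself const and allocation-free, which is why the previous Vec<VNode> field allowed a const fn constructor without needing an Option.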

Member

@cecton cecton left a comment

Overall I think it's fine! 🚀

@futursolo futursolo merged commit 9d7fafa into yewstack:master Apr 2, 2023
20 checks passed