Great job, I have a few questions #8

Open
timonweb opened this issue Mar 6, 2020 · 23 comments
Comments

@timonweb

timonweb commented Mar 6, 2020

Hello Eddy,

I just wanted to say "Hi" and thank you for doing this! I've been dreaming about LiveView for Django since the day it first got announced. Definitely going to play with Reactor.

I have a few questions:

  1. What's the feature parity compared to the original Phoenix LiveView? Does it implement everything, or is something missing?
  2. Have you experienced any performance drawbacks? I know that Elixir is fast, and WebSockets are super cheap on Erlang, but what about Python/Django?
  3. Do you consider this library more or less complete? Are you planning to add more features?
  4. And sorry for asking, but why do you use CoffeeScript? Is there any benefit? I'm asking because I think it might be a contribution blocker for a lot of people (myself included).

Other than that, great job and thank you again!

@edelvalle
Owner

Hey Tim thanks for the interest, let me answer your questions and then go over other topics.

  1. I'm not making a clone of Phoenix LiveView. I'm taking the core idea: server-side rendered components with their state in the backend, forwarding the relevant events from the front-end to the back-end to modify that state and re-render, then sending a diff to the front-end and applying it with morphdom. But this implementation is not a port. I like VueJS a lot, so I took some ideas from them in the notation they use for event binding.
  2. Well... Doing simple queries (getting something, saving something, loading a QuerySet) during event forwarding takes ~40 ms from what I have seen. But I have not tested this with 1,000 concurrent users.
  3. This lib gets the job done for now, for my needs. I'm developing it as an experiment; it is not extensively tested, but it works quite well, and for sure it has its limitations.
  4. I use CoffeeScript because I find writing plain JavaScript overwhelming, with all this =>{};;()=== stuff. No other reason. Modern JS has taken a lot of ideas from CoffeeScript, and that's good, but it is just comfort for me.

I'm planning to use this in production soon, but I don't advise it, use at your own risk.
In the test directory there is a todo list MVC implemented using the lib, if you wanna play around with that.

Thanks a lot for the note; you are the first person asking about this. This is my side project. I use it because it speeds things up and I don't have to write APIs and front-end code: I write as much regular Django as I can (HTML over the wire), and I write the few interactive components using this. This library also has out-of-the-box support for subscribing to server-side notifications using the pub/sub that Django Channels offers.
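
As an illustration of the core loop described in this comment (state on the server, an event forwarded from the browser, a re-render), here is a minimal sketch. It is not the library's code: `Counter`, `dispatch` and the `receive_<event>` lookup are assumptions made for the example, and the diffing and morphdom steps are only noted in a comment.

```python
import html
from dataclasses import dataclass


@dataclass
class Counter:
    """Hypothetical component: its state lives on the server."""
    id: str
    amount: int = 0

    def receive_inc(self, by: int = 1) -> None:
        # Event handler: mutate the server-side state.
        self.amount += by

    def render(self) -> str:
        # Stand-in for a template render.
        return f'<div id="{html.escape(self.id)}">{self.amount}</div>'


def dispatch(component, event: str, **payload) -> str:
    """Forward a front-end event to the component, then re-render it."""
    handler = getattr(component, f"receive_{event}")
    handler(**payload)
    # A real implementation would diff this against the previous render,
    # and the browser would apply the result with morphdom.
    return component.render()


counter = Counter(id="counter-1")
first = counter.render()
second = dispatch(counter, "inc", by=2)
```

The browser only ever sends the event name and its payload; everything that decides what the HTML looks like stays in Python.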

@jbjuin
Collaborator

jbjuin commented Mar 6, 2020

Hi Eddy, Tim,

let me multiply Tim's kudos for this project !
I forked it a few months ago after searching for "liveview for django" :)

I'm currently working with it for a website in beta with a few users. I deployed it with gunicorn with uvicorn workers. For now the deployment with django 3.0 is not possible due to incompatibilities with channels.

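I made 3 major adjustments to speed things up:

  • use jinja2 for a x5-10 speed up
  • use the morphdom speed up trick
  • do not use "diff match patch" in the backend: just send back the full HTML. With jinja2 it's faster.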

I do not have facts about the exact speed-up but it's quite visible and made this usable for me.

I've added the following features:

  • an event handler (receive_xxx) can return False to not re-render the component
  • each component has a reference to its parent and an emit method to send event to it

I also like vuejs and I made 2 companions apps:

  • a store (ala vuex) with mutable global state that can be shared between components with a commit method and a "subscribe_store" for components, multiple store possible,
  • a router (ala vue-router) with "router-link" and "router-view" components with some routing settings with url parameters (sadly not very clean for now but barely usable... )

Anyway, it's a wonder to use and sped up the development by a huge amount for this first website.
The backend -> frontend push is a pleasure to have at hand, especially when using celery.

I didn't pull your last updates with the vue syntax @click, etc.
So cool !

I also made some experiments:

  • it works pretty well with some js lib, I made cool viz using graphjs (one could start an "easier to extend" plotly Dash),
  • I made a file uploader serializing to b64, I could make a simple lib that defines a re-usable upload component,
  • backend driven animation (ala liveview demo): updating a div rotation angle in real-time... just for fun. It works but it's quite heavy on CPU... a GIF here: https://ibb.co/qRs0vLt it's not fluid in the image but was on my computer.

I have a few questions:

  • Eddy do you plan to accept merge requests on this project ?
  • I have an issue with using threads in event handlers... I guess it's due to how channels work (with a thread pool). But I cannot find a way to fix this. For instance when 2 components listen to the same store mutation I cannot update them concurrently when the event arrives. Did you experiment with this ?

Congrats again with this awesome project !

@edelvalle
Owner

Hahaha, this got even better... 😄 I'm happy

Let me go point by point:

For now the deployment with django 3.0 is not possible due to incompatibilities with channels.

I'm running the current version with Django 3 already

use jinja2 for a x5-10 speed up

I was guessing that would work, but I didn't want to add jinja2 as hard dependency to the project. It should be up to the lib user. But If jinja is present reactor should provide the template tags for it.

use the morphdom speed up trick

Cool!

Do not use "diff match patch" in the backend: just send back the full HTML. With jinja2 it's faster.

I was not expecting this, is this diff that slow? Is it even slow for normal django templates? Should we allow this feature go on or off using configuration?

On features you added:

an event handler (receive_xxx) can return False to not re-render the component

I don't want to write return True all over the place. We should keep the safe behavior (that is, "re-render") as the default, so if you don't want to re-render (which is the less common case) then you could return False, and None is treated as True (re-render).

each component has a reference to its parent and an emit method to send event to it

Cool! I needed this and did it as self.send('event_name', id=parent_id). Making it a method, great idea!

On the two companion apps...

Let me take a look to understand better what you did.

On your experiments:

  • I also need to make graphs in the front-end, it would be cool to have a way to call JS in the front-end and pass it some JSON, or render the JSON and let the Js lib read it.
  • File upload: that's tricky. I have done some too; it would be interesting to take a look.
  • Animations? That's insane.... hahahaha, is not made for this, but whatever!

Answers

do you plan to accept merge requests on this project ?

Yes, please!

I have an issue with using threads in event handlers... I guess it's due to how channels work (with a thread pool). But I cannot find a way to fix this. For instance when 2 components listen to the same store mutation I cannot update them concurrently when the event arrives. Did you experiment with this ?

I'm not sure what you did there, but async is concurrent, not parallel, and threads in Python are not truly parallel either: the GIL time-slices them, so they are concurrent but not parallel.

@jbjuin
Collaborator

jbjuin commented Mar 7, 2020

Hahaha, this got even better... 😄 I'm happy

Me too :)

Let me go point by point:

For now the deployment with django 3.0 is not possible due to incompatibilities with channels.

I'm running the current version with Django 3 already

I think I had troubles with gunicorn + uvicorn and django 3... some weird "no traceback" kick out of the app... it was too weird and I did not have enough time to investigate and just stick to 2.4.

use jinja2 for a x5-10 speed up

I was guessing that would work, but I didn't want to add jinja2 as hard dependency to the project. It should be up to the lib user. But If jinja is present reactor should provide the template tags for it.

Totally agree.

use the morphdom speed up trick

Cool!

Do not use "diff match patch" in the backend: just send back the full HTML. With jinja2 it's faster.

I was not expecting this, is this diff that slow? Is it even slow for normal django templates? Should we allow this feature go on or off using configuration?

I will make a full performance testing example, just to be more factual about my own claims.

On features you added:

an event handler (receive_xxx) can return False to not re-render the component

I don't want to write return True all over the place. We should keep the safe behavior (that is, "re-render") as the default, so if you don't want to re-render (which is the less common case) then you could return False, and None is treated as True (re-render).

This is exactly what I did. You have to explicitly return False to not re-render. Another approach could be to raise a NoRender exception, even more explicit, but weird to raise something that is not an error...
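
The rule agreed on here (only an explicit False skips the render, None means re-render) could be sketched as follows. `Component`, `dispatch` and `Search` are hypothetical names for the example, not the fork's actual code.

```python
class Component:
    """Hypothetical base class; only the re-render rule is sketched."""

    def __init__(self):
        self.render_count = 0

    def render(self) -> str:
        self.render_count += 1
        return f"<p>render #{self.render_count}</p>"

    def dispatch(self, event: str, **payload):
        result = getattr(self, f"receive_{event}")(**payload)
        # Only an explicit False skips rendering; None (no return
        # statement) and every other value trigger a re-render.
        if result is False:
            return None
        return self.render()


class Search(Component):
    def receive_typing(self, text: str = ""):
        if len(text) < 3:
            return False  # too short to search: keep the current HTML
        self.query = text  # falls through, returns None, so re-render


search = Search()
skipped = search.dispatch("typing", text="ab")
rendered = search.dispatch("typing", text="abc")
```

Comparing with `is False` rather than testing truthiness is what keeps handlers without a return statement on the safe, re-rendering path.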

each component has a reference to its parent and an emit method to send event to it

Cool! I needed this and did it as self.send('event_name', id=parent_id). Making it a method, great idea!

Yes it's quite useful. I had to tweak the ComponentHierarchy when building children to pass link to self - the parent. I did not realise that one could pass the id to send. So one could just pass the parent id.
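
A sketch of the simpler variant discussed here, where emit is only a thin wrapper around send with the parent's id. The `registry` dictionary and the `received` list are stand-ins invented for the example; the only detail taken from the thread is the `self.send('event_name', id=parent_id)` call shape.

```python
class Component:
    """Hypothetical component; a class-level registry stands in for
    the library's component hierarchy."""

    registry = {}

    def __init__(self, id, parent_id=None):
        self.id = id
        self.parent_id = parent_id
        self.received = []
        Component.registry[id] = self

    def send(self, event, id, **payload):
        # Deliver an event to the component with the given id.
        target = Component.registry[id]
        target.received.append((event, payload))

    def emit(self, event, **payload):
        # Convenience wrapper: address the event to the parent.
        if self.parent_id is None:
            raise ValueError(f"component {self.id!r} has no parent")
        self.send(event, id=self.parent_id, **payload)


todo_list = Component("list")
item = Component("item-1", parent_id="list")
item.emit("item_done", item_id="item-1")
```

Storing only the parent's id, rather than a reference to the parent object, avoids touching how children are built.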

On the two companion apps...

Let me take a look to understand better what you did.

On your experiments:

* I also need to make graphs in the front-end, it would be cool to have a way to call JS in the front-end and pass it some JSON, or render the JSON and let the Js lib read it.

The way I did it is the second one you mention: I just update the data in Python and dump it as JSON in the template; the JS lib graphjs just parses it from JSON and redraws the graph on update.
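
One way to do the "dump it as JSON in the template" step safely is to emit the data inside a non-executing script tag that the JS library reads back. The helper below is a sketch with a made-up name (`json_script_tag`); Django ships a `json_script` template filter that serves the same purpose.

```python
import json


def json_script_tag(data, element_id: str) -> str:
    """Serialize data into a <script type="application/json"> tag."""
    payload = json.dumps(data)
    # Escape characters that could close the tag or start markup.
    for char, escaped in (("<", "\\u003C"), (">", "\\u003E"), ("&", "\\u0026")):
        payload = payload.replace(char, escaped)
    return f'<script id="{element_id}" type="application/json">{payload}</script>'


points = {"labels": ["a", "b"], "values": [1, 2], "title": "</script>"}
tag = json_script_tag(points, "chart-data")
```

On the browser side the chart code would read the element's text content and parse it as JSON after each update.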

* File upload: that's tricky. I have done some too; it would be interesting to take a look.

I used JS to manage the file input, convert it to b64, and send it to the Python component, which just converts it back to bytes. In the end it's quite clean and straightforward...
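
The Python half of that upload path can be very small. This is a sketch, assuming the browser sends either a bare base64 string or the data URL form that `FileReader.readAsDataURL` produces; `decode_upload` is a hypothetical helper name.

```python
import base64
import binascii


def decode_upload(data_url_or_b64: str) -> bytes:
    """Turn the base64 text sent by the browser back into bytes."""
    # Data URLs look like "data:<mime>;base64,<payload>"; accept that
    # form as well as a bare base64 string.
    if data_url_or_b64.startswith("data:"):
        _, _, data_url_or_b64 = data_url_or_b64.partition(",")
    try:
        return base64.b64decode(data_url_or_b64, validate=True)
    except binascii.Error as exc:
        raise ValueError("upload is not valid base64") from exc


original = b"hello reactor"
encoded = "data:text/plain;base64," + base64.b64encode(original).decode("ascii")
decoded = decode_upload(encoded)
```

Base64 adds roughly a third to the payload size, so this suits small files better than large ones.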

* Animations? That's insane.... hahahaha, is not made for this, but whatever!

Yep ! It was just to see if we can do something similar to the LiveView demo animation.

Answers

do you plan to accept merge requests on this project ?

Yes, please!

Nice ! Here is my plan: the code I made is quite dirty for now (too many experiments :) ), so in the coming month I will clean it up, merge your latest updates and push it back to GitHub. I will add the 2 companion apps, store and router, as separate repositories. I will try to add examples and update the README to explain the jinja2 implementation and the performance improvements (with a benchmark).

I have an issue with using threads in event handlers... I guess it's due to how channels work (with a thread pool). But I cannot find a way to fix this. For instance when 2 components listen to the same store mutation I cannot update them concurrently when the event arrives. Did you experiment with this ?

I'm not sure what you did there, but async is concurrent, not parallel, and threads in Python are not truly parallel either: the GIL time-slices them, so they are concurrent but not parallel.

Let me explain a little bit more: in the store I track which component should be updated when an attribute of the state is mutated. When multiple components "listen" to the same store state attribute, I would like to alert all of them concurrently. I don't need their answer; I just want them to update, whatever the order... So threading should work here, but it does not. The listening components are updated one after the other, in the order they were registered with the store. I think it's not a problem of GIL + threads; it seems that spawning threads does not work at all... Anyway, it's not an emergency for now.
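
One possible way to get the "notify everyone, whatever the order" behaviour without threads is to make the listeners coroutines and start them together. This sketch is an assumption about how such a store could look, not the store app described above; `Store`, `subscribe` and `commit` are illustrative names. It only helps when listeners await something: a listener that blocks (for example on a synchronous ORM call) would still hold up the others.

```python
import asyncio


class Store:
    """Hypothetical store: tracks which listeners watch which attribute."""

    def __init__(self):
        self.state = {}
        self.listeners = {}  # attribute name -> list of async callbacks

    def subscribe(self, attribute, callback):
        self.listeners.setdefault(attribute, []).append(callback)

    async def commit(self, attribute, value):
        self.state[attribute] = value
        callbacks = self.listeners.get(attribute, [])
        # Start every listener at once; none waits for the one
        # registered before it.
        await asyncio.gather(*(cb(value) for cb in callbacks))


finished = []


async def slow_listener(value):
    await asyncio.sleep(0.05)
    finished.append(("slow", value))


async def fast_listener(value):
    await asyncio.sleep(0.01)
    finished.append(("fast", value))


store = Store()
store.subscribe("count", slow_listener)  # registered first
store.subscribe("count", fast_listener)
asyncio.run(store.commit("count", 1))
```

Here the listener registered second finishes first, which is the behaviour the sequential version could not give.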

Again, great work you did with that project ! It may become something big :)

I'm totally convinced with it becoming a very convenient "SPA" like building framework with a lot of pros:

  • django based ! (Security, robustness, ORM/migrations, users/groups/perms, libraries, etc.)
  • great "SPA" UX,
  • push capability,
  • few js,
  • no API to maintain,

To be fair there are some cons:

  • new,
  • performances with large user base,
  • websockets are not always supported by browsers. I have an Ubuntu phone (not updated with UBports) and my browser does not support ws. It would be great to implement a fallback to long polling (although I have no idea how to do this now :) )

I can't stop thinking about all the possibilities for our team. We may gain an order of magnitude for speed of dev, that's crazy.

Thank you again !

@edelvalle
Owner

edelvalle commented Mar 7, 2020

I think I had troubles with gunicorn + uvicorn and django 3... some weird "no traceback" kick out of the app... it was too weird and I did not have enough time to investigate and just stick to 2.4.

Had this problem, upgrade your uvicorn and should be solved

This is exactly what I did. You have to explicitly return False to not re-render. Another approach could be to raise a NoRender exception, even more explicit, but weird to raise something that is not an error...

To raise is even more weird and try catching is slow in Python and also is goto oriented programming, so.. no way.

I had to tweak the ComponentHierarchy when building children to pass link to self - the parent. I did not realise that one could pass the id to send. So one could just pass the parent id.

Yes do that! That could be the first PR. 😄 since sounds very straight forward and does not change the current logic.


I played with the isEqualNode optimization in morphdom. I just have to skip it on the first render of the component so that the event bindings can be transpiled, but after that it can be applied. I will make a PR for this and you can review it. #10

For the moment I don't want to remove the HTML diffing, because I'm scared of the volume of traffic that could cause. But I have not used this at scale, so I can't tell you whether the problem is high CPU load on the server or too much outgoing traffic from sending full HTML. I don't know; I did it because it sounded reasonable and LiveView in Phoenix does something similar.

On the topic of companion apps, I'm curious to see what you did. For the moment, I'm not making full pages on this, just simple components that need to be interactive or I want to update them in real-time. I try to use as much of old-style django as I can.

I hope this project is useful, even if it is just to the two of us.

@timonweb
Author

timonweb commented Mar 8, 2020

Guys, I'm glad that I've ignited this discussion! There are a lot of interesting insights!

@jbjuin
Collaborator

jbjuin commented Mar 9, 2020

I think I had troubles with gunicorn + uvicorn and django 3... some weird "no traceback" kick out of the app... it was too weird and I did not have enough time to investigate and just stick to 2.4.

Had this problem, upgrade your uvicorn and should be solved

Haaa great !

This is exactly what I did. You have to explicitly return False to not re-render. Another approach could be to raise a NoRender exception, even more explicit, but weird to raise something that is not an error...

To raise is even more weird and try catching is slow in Python and also is goto oriented programming, so.. no way.

Seems alright then.

I had to tweak the ComponentHierarchy when building children to pass link to self - the parent. I did not realise that one could pass the id to send. So one could just pass the parent id.

Yes do that! That could be the first PR. 😄 since sounds very straight forward and does not change the current logic.

Ok.

I played with the isEqualNode optimization in morphdom. I just have to skip it on the first render of the component so that the event bindings can be transpiled, but after that it can be applied. I will make a PR for this and you can review it. #10

Ok.

For the moment I don't want to remove the HTML diffing, because I'm scared of the volume of traffic that could cause. But I have not used this at scale, so I can't tell you whether the problem is high CPU load on the server or too much outgoing traffic from sending full HTML. I don't know; I did it because it sounded reasonable and LiveView in Phoenix does something similar.

Yeah, here we reach the CPU/IO trade-off, which will depend on each specific case: a lot of small components that update often, or big components that change slowly.

I will take your code again from scratch and run benchmarks on all those options so that we step forward based on facts. Maybe we can agree on some "typical" usecases ?

To start, I suggest:

  1. A large single component HTML table.
  2. A long list where each row is a component that updates randomly.

Measuring:

  • render time,
  • CPU load,
  • and bandwidth usage (size of render sent back to the client)

What do you think ? Eddy, Tim ?

We may have to also design a benchmark to assess more specifically the multi-user limits... memory especially I guess.

On the topic of companion apps, I'm curious to see what you did. For the moment, I'm not making full pages on this, just simple components that need to be interactive or I want to update them in real-time. I try to use as much of old-style django as I can.

Those apps have "hard corners" for now, not really polished, especially the router. But anyway I will push them as soon as possible, which is within a month range at best.

I hope this project is useful, even if it is just to the two of us.

I'm convinced it will be for others as well, I guess we are not so special :)

@timonweb Yes, I wanted to talk to Eddy anyway but your questions woke me up !

@edelvalle
Owner

Making those benchmarks? Amazing, thanks!

@jbjuin don't worry about hard corners, this project is all hard corners... just push so I get a better idea of what you are talking about, and then I can disagree.. 😆

@jbjuin
Collaborator

jbjuin commented Mar 9, 2020

Just to start, first benchmarks give:

Bench1

The 10000 rows HTML table in single component.

  • templates + diff ~ 1500ms to load data and receive them in the browser... then DOM building take some more time... not evaluated here.

  • jinja2 + diff ~ 1200 ms

  • templates no diff ~ 450ms

  • jinja2 no diff ~ 150ms

  • payload received is the same for all since we make a full update ~2.20 MB

  • Not playing with morphdom here.

  • Using default diff-match-patch, the full python version.

@edelvalle
Owner

I get it, this is just how much it takes to server-side render and transmit to the front-end.

I aim for simplicity and good defaults, so if we have to compile and have our own diffing later, we can do it. But for now, do you suggest removing diff-match-patch?

@jbjuin
Collaborator

jbjuin commented Mar 10, 2020

I will make the bench 2: update a single row in a big table... in that case diff match patch will probably win when using django templates, the question is versus jinja2.

We also have to evaluate results using the C++ python wrapped diff match patch. I tried and it seems that it did not improve a lot, but I will formalize that.

We could start with sane defaults and propose an option "per component" in the class definition... since this is very component specific.
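
The per-component option suggested here could be a plain class attribute. The sketch below is hypothetical: the attribute name `use_diff`, the payload shape, and the choice of default are all invented for illustration, and the stdlib `difflib` stands in for diff-match-patch.

```python
import difflib


class Component:
    """Hypothetical base class with a class-level diff switch."""

    use_diff = True  # arbitrary default for the sketch; subclasses may override

    def __init__(self):
        self.last_html = ""

    def payload(self, new_html: str) -> dict:
        old_html, self.last_html = self.last_html, new_html
        if not self.use_diff:
            return {"type": "html", "body": new_html}
        diff = "".join(difflib.unified_diff(
            old_html.splitlines(keepends=True),
            new_html.splitlines(keepends=True),
        ))
        return {"type": "diff", "body": diff}


class BigTable(Component):
    use_diff = False  # sending the full HTML is cheaper to produce here


table = BigTable()
full = table.payload("<table></table>\n")

label = Component()
label.payload("a\n")
patch = label.payload("b\n")
```

A class attribute keeps the decision next to the component it concerns, so a benchmark result for one component does not force the choice on all the others.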

@jbjuin
Collaborator

jbjuin commented Mar 13, 2020

Hi,
here is the second benchmark. It's quite manual, so variations are significant. I just used the timing given in the console log between sending an event and getting the answer... so server HTML rendering + sending data.

Bench1

The 10000 rows HTML table.

  • templates + diff ~ 1500ms to load data and receive them in the browser... then DOM building take some time

  • jinja2 + diff ~ 1200 ms

  • templates no diff ~ 450ms

  • jinja2 no diff ~ 150ms

  • Not playing with morphdom here.

  • Using default diff-match-patch, the full python version.

Bench2

A 10000 row HTML table.
Update 10/100/1000 random rows.
I noted for few tries the time between sending event and receiving response and the payload size.

10 rows

  • templates + diff ~ 530 ms / 1.17 KB
  • jinja2 + diff ~ 250 ms / 1.17 KB
  • templates no diff ~ 450 ms / 2.2 MB
  • jinja2 no diff ~ 120 ms / 2.2 MB

100 rows

  • templates + diff ~ 1140 ms / 16 KB
  • jinja2 + diff ~ 850 ms / 16 KB
  • templates no diff ~ 450 ms / 2.2 MB
  • jinja2 no diff ~ 130 ms / 2.2 MB

1000 rows

  • templates + diff ~ 1500 ms / 2.18 MB
  • jinja2 + diff ~ 1200 ms / 2.18 MB
  • templates no diff ~ 450 ms / 2.2 MB
  • jinja2 no diff ~ 130 ms / 2.2 MB

I did not expect this. It seems that diffing time increases with the number of changed lines. I'm running this locally, so transport time is negligible... so this is not in real conditions.
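
For anyone who wants to repeat this kind of measurement without a browser, a rough harness could look like the sketch below. It is not the benchmark used above: it swaps diff-match-patch for the stdlib `difflib`, fakes the template render with string formatting, and scales the table down by ten, so its numbers are not comparable to the ones in this comment. It only shows how diff time and payload size can be recorded side by side.

```python
import difflib
import random
import time


def render_table(values):
    """Stand-in for a template render: one <tr> per line."""
    rows = [f"<tr><td>{i}</td><td>{v}</td></tr>" for i, v in enumerate(values)]
    return "<table>\n" + "\n".join(rows) + "\n</table>\n"


def measure(row_count, changed, seed=0):
    """Return (diff_seconds, diff_bytes, full_bytes) for one update."""
    rng = random.Random(seed)
    values = [0] * row_count
    old_html = render_table(values)
    for index in rng.sample(range(row_count), changed):
        values[index] = 1
    new_html = render_table(values)

    start = time.perf_counter()
    diff = "".join(difflib.unified_diff(
        old_html.splitlines(keepends=True),
        new_html.splitlines(keepends=True),
        n=0,  # no context lines: keep the patch minimal
    ))
    elapsed = time.perf_counter() - start
    return elapsed, len(diff.encode()), len(new_html.encode())


for changed in (1, 10, 100):
    seconds, diff_bytes, full_bytes = measure(1000, changed)
    print(f"{changed:>4} rows: diff {seconds * 1000:.1f} ms, "
          f"{diff_bytes} B vs full {full_bytes} B")
```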

@edelvalle
Owner

I think responsiveness is the most important thing, even if it costs a bit more data transfer. Optimizing for raw speed is better, for the client and also for the server side.

@jbjuin
Collaborator

jbjuin commented Apr 24, 2020

On the responsiveness side, I tried to use Mako templates.
They are ~15% faster than jinja2.

I wanted to try "spitfire" (see https://github.com/youtube/spitfire) but it's only for python 2.7.

I just saw the official Elixir LiveView trailer by Chris McCord. I envy the Elixir templating system and how they build the diff. So smart: minimal payload + fast as hell...

With the situation right now in France I have a little more time (no job commute) to work on this. I will try to publish my changes soon.

@edelvalle
Owner

I think you have gone very, very deep down the rabbit hole, and that's enough.... but it is good to know.
I'm using reactor in production, btw... at my company https://iskra.ml

@jbjuin
Collaborator

jbjuin commented Apr 26, 2020

Congrats on your company ! Quite a promise ! I thought you were employed at another company, is this new ? Do you provide GPUs ?

On the reactor side, I really love it. I'm just catching up with your latest developments and will propose some pull requests as soon as I have the time to clean up my mess.

@runekaagaard

runekaagaard commented Jun 10, 2020

Dear @edelvalle et al.

I've also been very inspired by Elixir's LiveView and wanted to drop by and say hi.

After going through the whole journey of writing code for the web during the last decade, and having felt the pains of the different programming models:

  • HTML only: Hard to do something dynamic. Otherwise good
  • HTML with enhanced jQuery: Disconnect between server html and client code. Error prone
  • Ajax and Handlebars: Hard to manage state. Performance issues on large pages
  • Ajax and React: A big new abstraction. Code duplication client/server. Slow initial load

combined with ever-decreasing network latency, LiveView feels like a breath of fresh air. And one that has the potential to give an advantage like, for instance, Django's Admin had back in the day. Therefore it's very exciting to follow your progress with a LiveView in the Django space!

I've been working on a similar project over at https://github.com/runekaagaard/hypergen. Some of the differences to your project are:

  • Uses Python/Cython as a template language. Not html
  • Doesn't use websockets (yet?) and therefore no app state on the server
  • Not tied to Django (yet?)

It's still not ready for production, but there are some demos that work.

The first place where I reached a hard problem was with making an efficient pipeline between the server and the client. I, like you, also experimented with diff-match-patch, and found it to be a non-optimal solution. You have to render the entire page on every request and it's a pretty expensive algorithm both on the server and the client.

After the initial full page load optimally you would only render on the server and send to the client:

  • Updated parts
  • New parts and where to insert them
  • Ids of deleted parts
  • Ids of parts moving to a new place

By part I mean a reasonably sized chunk of HTML. And then we are right back to a cache invalidation problem: we need a way to know if any of the data used by a part has changed since the last time we rendered it.
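
To make the list above concrete, here is a sketch of the bookkeeping for deciding which parts to send, using a content hash per part. It is an illustration only: `plan_update` and its return shape are invented, moved parts are left out, and the sketch still renders every part before comparing, so it saves bandwidth but not render time. Skipping the render itself needs the invalidation information discussed next.

```python
import hashlib


def fingerprint(html: str) -> str:
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def plan_update(previous: dict, current: dict) -> dict:
    """Compare the last render with the current one, part by part.

    `previous` maps part ids to fingerprints from the last render;
    `current` maps part ids to freshly rendered HTML.
    """
    new_prints = {pid: fingerprint(html) for pid, html in current.items()}
    return {
        "updated": {pid: current[pid] for pid in current
                    if pid in previous and previous[pid] != new_prints[pid]},
        "inserted": {pid: current[pid] for pid in current
                     if pid not in previous},
        "deleted": [pid for pid in previous if pid not in current],
        "fingerprints": new_prints,  # keep for the next comparison
    }


before = {"a": fingerprint("<p>1</p>"), "b": fingerprint("<p>2</p>")}
after = {"a": "<p>1</p>", "b": "<p>two</p>", "c": "<p>3</p>"}
plan = plan_update(before, after)
```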

This led me to begin a new project at https://github.com/runekaagaard/django-treeform. It's still just in the beginning. The idea is that we need a declarative way of describing a transformation from stuff in the database to a tree structure with leaves made of scalars. When we have that, we can reason about the transformation a lot better, which allows for asking questions about the transformation without actually running it. In the context of Django, that is useful for:

  • getting metadata about the used fields such as labels, datatypes and validators
  • calculating a hash for the transformation, thus enabling caching
  • solving the N+1 problem by knowing what to select based on child nodes
  • separating the structure of the data from the actual data
  • generating database triggers that maintain a version id of both the transformation as a whole and chosen sub parts.

With such a system in place, it gets a lot easier to know if you should render a given part or not. Then in the context of django-reactor a template could look something like:

{% for guy in guys %}
  {% cachepart guy %}
    <h1>This part is expensive to render</h1>
    <p>We better cache it!</p>
    {% for friend in guy.friends %}
      <p>
        Good thing the database automatically invalidates 
        the cache of the guy when one of his friends change.
      </p>
    {% endfor %}
  {% endcachepart %}
{% endfor %}

Then it would be trivial to send only changed parts to the frontend.

Anyway, thanks for reading. It felt like a good place to talk with like-minded individuals :)

P.S.: Did you ever consider using the javascript from Elixir LiveView and reverse engineering a python backend for it? It already seems like a pretty mature piece of software.

@edelvalle
Owner

Nice! Very interesting observations.

Wow, I really like the ideas in https://github.com/runekaagaard/django-treeform

That's an interesting (declarative and functional) way to pipeline data transformations, and it could make them possible to reason about. Yes.

The first place where I reached a hard problem was with making an efficient pipeline between the server and the client. I, like you, also experimented with diff-match-patch, and found it to be a non-optimal solution. You have to render the entire page on every request and it's a pretty expensive algorithm both on the server and the client.

Yes, @jbjuin benchmarked this and it is really slow, but I'm also concerned about network latency when sending the whole HTML.

P.S.: Did you ever consider using the javascript from Elixir LiveView and reverse engineering a python backend for it? It already seems like a pretty mature piece of software.

Doing this means implementing the same protocol they use. It could be possible to mimic it and use it as the gold standard. But when I started this I just experimented with many things; to do that I would have to do a complete rewrite or start a new project.

I would like to experiment with the most optimal option for Python: not rendering in the backend, but having a client-side template language and pushing state to the client. That way there is no need to render in the backend every time.

It could be nice to create an abstraction that is django independent because I would like to use this also with https://github.com/encode/starlette, for example.

Right now I'm quite busy in other areas of my company that are not web related, but when I focus again in the web part I will think harder about this.

Thanks for your post.... is very inspiring!

@mback2k

mback2k commented Jun 19, 2020

This is really great work! While people are chiming in with previous iterations, I tried this in the past:

Maybe some ideas can be taken from these proof-of-concept iterations as well. My goal was to just be able to use Django-based templates on the backend and frontend while the frontend parts would refresh on QuerySet or Model changes. Unfortunately, lack of time left this stuck on Django 1.11.

@edelvalle
Owner

edelvalle commented Jul 16, 2020

@jbjuin you were sooooooo damn right about diff_match_patch, it is sooooooooooooo slow.... I removed it in 1.7.0b0

@jbjuin
Collaborator

jbjuin commented Jul 19, 2020

Hi Eddy, numbers talk :)
Did you try the C++ wrapped version ? I tested it very quickly and it seemed that it was also slow, but I didn't measure it...
I still need to find time to push the router and the store.
How is your company doing ?

@edelvalle
Owner

@jbjuin to remove external dependencies and still have some level of diffing, in the latest release I'm using difflib, which is in the stdlib. It is a line-based diff, not as effective as diff_match_patch, but it is fast and provides a middle ground between bandwidth and performance.
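
A line-based diff with the stdlib `difflib` can be sketched as below. This is not the format the library actually sends: `make_patch` and `apply_patch` are invented for the example, and the apply step is written in Python only so the round trip can be checked, whereas in practice it would run in the browser.

```python
import difflib


def make_patch(old: str, new: str) -> list:
    """Line-based patch: only the changed ranges are included."""
    old_lines, new_lines = old.splitlines(), new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [
        (i1, i2, new_lines[j1:j2])  # replace old[i1:i2] with these lines
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old: str, patch: list) -> str:
    """What the receiving side would do (in Python here for checking)."""
    lines = old.splitlines()
    # Apply from the end so earlier indices stay valid.
    for i1, i2, replacement in reversed(patch):
        lines[i1:i2] = replacement
    return "\n".join(lines)


old_html = "<ul>\n<li>milk</li>\n<li>eggs</li>\n</ul>"
new_html = "<ul>\n<li>milk</li>\n<li>bread</li>\n<li>eggs</li>\n</ul>"
patch = make_patch(old_html, new_html)
```

Because the unit of change is a whole line, how much this saves depends on how the template breaks its output into lines: a one-character change in a very long line resends the entire line.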

The company is going; we are going to release a new product this week: prediction of any column in a CSV or Excel file... and providing those predictions over a GUI or API.

@jbjuin
Collaborator

jbjuin commented Jul 23, 2020

Hey, difflib is a great idea ! Nice ! Good luck for the product launch !
