Reimplement livereload in a simpler and better way #2385
Conversation
This discards the dependency on 'livereload' and 'tornado'. This implementation is based on stdlib http.server and is much smaller, while not being a downgrade in any way.

This fixes #2061: multiple file changes no longer cause the site to repeatedly rebuild.

It also makes reloads much more responsive; they're "instant" rather than being on a 1-second polling interval or such.

I also was careful to keep backwards compatibility, including providing logging messages that feel just like the old ones.

Each HTML page gets JavaScript injected into it (like before), but now, instead of connecting to a WebSocket, it requests a long-polling endpoint, which blocks until a newer version ("epoch") of the site is available - then the page knows to reload itself. After a minute without events, the endpoint will just return the unchanged epoch (to avoid timeouts), and then the page requests the same endpoint again. The "downtime" in between requests is not a problem, because these are not events that need to be received in real time; the endpoint can instantly report that the latest version is newer than the one the page identified itself as.

The reason to replace WebSocket isn't that it's bad or something, just that the stdlib doesn't have it. But long-polling works completely fine here too.
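The epoch-based long-polling described above can be sketched with a `threading.Condition` (a minimal illustration of the technique; the names `EpochTracker`, `bump`, and `wait_for_epoch` are made up for this sketch, not the actual MkDocs code):

```python
import threading

class EpochTracker:
    """Tracks the current site version ("epoch") and lets pollers wait for a newer one."""

    def __init__(self):
        self._epoch = 0
        self._condition = threading.Condition()

    def bump(self):
        # Called after a rebuild: advance the epoch and wake all waiting pollers.
        with self._condition:
            self._epoch += 1
            self._condition.notify_all()

    def wait_for_epoch(self, known_epoch, timeout=60):
        # Long-poll: block until the epoch exceeds what the page already has,
        # or until the timeout elapses - then return the unchanged epoch so
        # the page simply re-polls instead of its request timing out.
        with self._condition:
            self._condition.wait_for(lambda: self._epoch > known_epoch, timeout=timeout)
            return self._epoch

tracker = EpochTracker()
threading.Timer(0.1, tracker.bump).start()  # simulate a rebuild shortly after
assert tracker.wait_for_epoch(0) == 1       # blocks briefly, then sees the new epoch
assert tracker.wait_for_epoch(1, timeout=0.05) == 1  # timeout: unchanged epoch
```

The page compares the returned epoch to the one it identified itself with and reloads if it is newer; otherwise it just issues the same request again.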
This is interesting. Still undecided how I feel about it. To be clear, it is better than what we have, at least in some respects.

On the one hand, I love the fact that we could drop a few dependencies (especially tornado, which is huge and way overkill for our needs). On the other hand, this adds to the maintenance burden of MkDocs. Yes, it could be broken out into a separate library, but it is so small and purpose-built that it wouldn't make much of a standalone library and would likely only be used by MkDocs anyway. And that means we would probably need to maintain it ourselves. As a third-party lib it would likely grow additional features, which would only increase the maintenance burden. Whereas, by keeping it in-house, we can keep it simple and purpose-built. I keep going round and round like that. So, moving on to other concerns for now...

Will this require any changes by third parties and/or users? Will plugins need to change in any way (especially those that make use of the `on_serve` event)?

How well does this server handle multiple connections at once? A user could have 2 browser tabs/windows open to the same (or different) pages at the same time. Or, to cite the extreme case, some users have reported running a dev server in their local office with multiple employees connected to the server simultaneously while making edits, etc. True, we don't officially support that use case, but I'd like to know what to expect.

I couldn't help but notice that the amount of test coverage of this patch is very low. I assume you are planning to add tests before this would get merged. Of course, if you wanted to wait for approval first, so as not to waste your time, I understand.

And now some personal observations... It had not occurred to me to use the stdlib http.server for this. I had created waylan/rheostatic some time ago with the idea that we would eventually use it as the server for MkDocs. That was before MkDocs 1.0, when directory URLs were terribly buggy.
I wanted to switch to using extension-less URLs. I never followed through because I realized that it would change everyone's URLs (dropping the slash at the end) and make all old links invalid.

That said, rheostatic was implemented as a WSGI app. I always imagined it could be wrapped in a livereload WSGI app. I was surprised I couldn't find one, until I more recently realized that Python-Livereload can also wrap a WSGI app. Although, I suppose the lack of a definitive Python WebSocket lib is a factor as well.

In any event, if you do make a standalone library, I have to wonder if it would get more traction as a WSGI app. Then any WSGI server could just use it as a wrapper. Or am I showing my age here? I remember when WSGI was the new hotness. Now it seems that the cool kids are using asyncio frameworks (ASGI) and the like.
Well yes, but I'm not going anywhere, so maybe no problem there.
I have prevented breakages in ways that I could foresee.
Yeah, it works totally fine (I tried). The server in the stdlib has been threaded for quite a while now.
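For reference, a sketch of the threaded stdlib server handling several connections at once (`ThreadingHTTPServer` has been in the stdlib since Python 3.7; the plain `SimpleHTTPRequestHandler` here is just for illustration, not the MkDocs handler):

```python
import threading
import urllib.request
from http.server import ThreadingHTTPServer, SimpleHTTPRequestHandler

# Each incoming request is handled in its own thread, so multiple browser
# tabs (or multiple users) can be served simultaneously.
server = ThreadingHTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

port = server.server_address[1]
statuses = []

def fetch():
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as r:
        statuses.append(r.status)

threads = [threading.Thread(target=fetch) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
server.shutdown()

assert statuses == [200, 200, 200]  # all three concurrent requests succeeded
```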
Yes, like you say, we'll think about testing after initial approval. For now the PR is minimal.
Well it'd be a tough sell regardless, seeing as many production static servers wouldn't support that kind of layout.
It is indeed a good consideration, and I also noticed this. There is probably some way to hack WSGIRequestHandler into it, but I haven't tried it.
Thanks, that addresses my concerns.
Those two lines of code are exactly what prompted my question. My concern is that if an existing plugin is passing in something other than the original
This is the one thing giving me pause. I wonder if this might be a better way to go.
I would not recommend it as the main course of action. The main code (static file serving) is not WSGI-aware at all, but it's the main workhorse here. You also can't hook into it from a WSGI route (e.g. the code substituting a file object must be kept as-is). And the only other code here is the single long-polling endpoint, which could probably be done through a WSGI fallback, but why pull in that machinery?

Anyway, the main reason is that there's no good way to make those two coexist. It's feasible to choose the type of handler at the very beginning via hardcoded paths, but I don't think the static file handler code is flexible enough to allow for a fallback handler (as in: try to serve a file, else go to a different handler).

And, again, what are we doing this for? A single endpoint? Or, which other endpoints would you have in mind for this? Or, why do we need a generic server here?
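For concreteness, the hardcoded-path dispatch mentioned above could look roughly like this (a sketch only, not the actual MkDocs handler; the `/livereload/` path and the epoch payload are illustrative assumptions):

```python
import json
import threading
import urllib.request
from http.server import ThreadingHTTPServer, SimpleHTTPRequestHandler

EPOCH = 7  # stand-in for the current site version

class Handler(SimpleHTTPRequestHandler):
    def do_GET(self):
        # Dispatch on a hardcoded path prefix: the single long-polling
        # endpoint gets special treatment, everything else falls through
        # to the stdlib static file handler.
        if self.path.startswith("/livereload/"):
            body = json.dumps({"epoch": EPOCH}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            super().do_GET()

server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
with urllib.request.urlopen(f"http://127.0.0.1:{port}/livereload/0") as r:
    assert json.load(r)["epoch"] == 7
server.shutdown()
```

This works for one well-known endpoint chosen up front; what it cannot easily express is "try to serve a file, and only if that fails, hand off to another handler", which is the fallback behavior discussed above.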
That could indeed happen, but should MkDocs really expose a generic implementation of watching arbitrary files and executing arbitrary actions? Is there any valid use case for that?

This rework actually started off as including this functionality, but I intentionally excluded it when adapting the code for this pull request. It really is a lot of code and also makes it more complicated to implement the throttling mentioned in the code. You need to consider that only one of the actions at a time can be allowed to run, that the throttling mechanism may need to become duplicated per-handler, etc. A lot of edge cases. (And despite me saying that an earlier version already had this implemented, that implementation did not, in fact, handle edge cases well.)

Anyway, I did a full search through all plugins on PyPI. How I searched the code:

```shell
page=1
while curl -s 'https://libraries.io/api/Pypi/mkdocs/dependents?api_key=REDACTED&per_page=100&page='"$page" | jq -r '.[].name' | grep ''; do ((page++)); done | \
  xargs -n1 -P5 bash -c 'curl -s "https://pypi.org/pypi/$0/json" | jq -r ".releases[.info.version][] | select(.packagetype==\"sdist\") | .url" | xargs -t wget -c -q'
find . -name '*.tar.gz' -exec tar -xzf {} \;
grep -r -n on_serve */
```
"Compatible" here means that no code change is required. Generally it's viable to support all versions >=1.1.1 with a single code path.
True, but that doesn't mean someone hasn't done it. After all, it is possible with the current implementation. Therefore, as a courtesy to our users (and to plugin devs), we MUST provide a graceful deprecation. If that makes it harder to implement and/or reduces performance for a couple of releases until we completely remove support, so be it. This is non-negotiable.
I literally searched through all public plugins, and they haven't done it.
That is done now. |
Where is the DeprecationWarning? You seemed to have been suggesting that supporting anything other than the builtin

Or are you now of the opinion that we should just leave support in indefinitely?
Overall, I don't have a strong preference for or against deprecation. Why I didn't make it deprecated:
But yeah, I should've mentioned that I didn't add the deprecation. Still, though, please consider what my feeling was upon the first response being "Where is the DeprecationWarning".
Sorry if I wasn't clear before. Previously, the only reason a plugin could provide its own

If you are looking for examples of how to deprecate something: in versions 1.0.0, 0.17.0, and 0.16.0 we deprecated a number of things. Those things have mostly been removed now, but you may find some examples there (perhaps start with the release notes). You can also see how the deprecation progressed through multiple releases, with the behavior changing slightly at each step along the way, until any mention of it was completely removed. Maybe see #921 and #1026 for a few examples.

Sorry, we have never raised any
In case it wasn't clear, I'm okay with using the stdlib http.server. I'm willing to accept this assuming the following is met:
(just like the old server)
Additionally, building will now always happen after no changes have been detected for 0.1 seconds, rather than instantly upon the first change.
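That rebuild-after-quiet behavior can be sketched with a timer-based debounce (an illustrative sketch; the `Debouncer` class and its names are made up here, not the actual MkDocs code):

```python
import threading
import time

class Debouncer:
    """Run `action` only once no further change notifications have arrived
    for `quiet_period` seconds (sketch, not the MkDocs implementation)."""

    def __init__(self, action, quiet_period=0.1):
        self.action = action
        self.quiet_period = quiet_period
        self._timer = None
        self._lock = threading.Lock()

    def notify(self):
        # Each file-change event restarts the countdown, so a burst of
        # changes results in a single rebuild once things settle down.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.quiet_period, self.action)
            self._timer.start()

builds = []
d = Debouncer(lambda: builds.append("build"), quiet_period=0.05)
for _ in range(10):  # a burst of 10 rapid changes...
    d.notify()
time.sleep(0.2)
assert builds == ["build"]  # ...triggers exactly one build
```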
So I actually tried this for the first time. It seems to work fine, with one glaring issue: the server does not shut down immediately upon typing Ctrl+C.

In other matters, I'm okay with the current logging output:
It is clean, easy to read, and easy to distinguish between server logs and build logs. But I don't love it. Of course, it is a huge improvement over before and is certainly something we can live with. Barring any better suggestions, it should be okay to leave it as-is.
I can reproduce your finding on Windows.
It's not tied in any direct way. The main hypothesis is that if the main thread is blocked on one instruction, it is impossible to receive the interrupt (see mkdocs/mkdocs/livereload/__init__.py, line 107 in ac71ddf).
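A common workaround for that hypothesis (a sketch of the general technique, not necessarily the exact fix that was pushed) is to run the server in a background thread and keep the main thread sleeping in short slices, so a KeyboardInterrupt can be delivered between slices instead of being stuck behind one long blocking call:

```python
import threading
from http.server import ThreadingHTTPServer, SimpleHTTPRequestHandler

server = ThreadingHTTPServer(("127.0.0.1", 0), SimpleHTTPRequestHandler)
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()

# In real usage, Ctrl+C in the main thread would break the loop; here a
# timer sets the event so the example terminates on its own.
shutdown_requested = threading.Event()
threading.Timer(0.2, shutdown_requested.set).start()

try:
    # The main thread only ever blocks for 0.1s at a time, so a
    # KeyboardInterrupt raised between slices is noticed promptly.
    while not shutdown_requested.wait(timeout=0.1):
        pass
except KeyboardInterrupt:
    pass
finally:
    server.shutdown()

thread.join(timeout=2)
assert not thread.is_alive()  # serve_forever returned after shutdown()
```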
Pushed a solution.
So why don't you say what's off about it?
That was weird. After updating, I couldn't get the server to shut down at all on the first run. Saves seemed to have no effect. Then, suddenly, after one of my saves, the next keyboard interrupt initiated a shutdown. I have restarted and shut down the server multiple times since and can't replicate the behavior. It now appears to be working fine.

There is an oddity I noticed, though. If I have left my browser tab open after a server shutdown, then upon restarting the server, it automatically reconnects to the browser, which is great. The strange bit is that the
I think what bothers me is that the time is in the middle. I would have expected it to be at the front (before INFO). But that wouldn't work with the format of the build messages. Unfortunately, I don't have any suggestions for improvement.

I see you are logging HTTP responses as debug messages (available in
It is not amazing, but it's expected in the current "stateless" implementation. First the old page connects, gets logged, and gets a reload response; then the new page connects and also gets logged. But it's actually possible to skip the logging in the case of an instant reload response (avoiding the first of those two); it just makes for slightly longer code. I pushed that now.
Requests themselves do get logged, intentionally only in verbose mode, which also matches the old implementation (see mkdocs/mkdocs/livereload/__init__.py, lines 245 to 247 in 5607bd4).
For the particular case of "browser connected": first of all, the actual URL is

(see mkdocs/mkdocs/livereload/__init__.py, lines 173 to 176 in 5607bd4)
So, do you want to just log requests to HTML files in particular? Or fake the messages about these livereload connections to look more like actual responses?
Hmm, I see it now. Not sure why I wasn't seeing that before. Maybe because I expected it to be the second message, not the third, as it is below.

In any event, no concern here. Sorry for the noise.
I like it. Less confusing to the user.
So, in this particular pull request I have experienced very constructive interactions, and I'm grateful for it. However, such cases are exceptional, and it's untenable to contribute in an environment where the maintainer's attitude towards a contribution is set by his preconceived beliefs, plus some random chance, and is not possible to affect with any factual arguments. I don't intend to jeopardize this particular pull request in any way, but going forward, something will have to change. I opened a discussion here: Concerns about maintainership of MkDocs
I believe this is almost ready to go. Unless I'm missing something, it seems that we only need the following:
One other thing is to decide whether we should include oprypin/mkdocs@b2ac728 (Offset the local site root according to the sub-path of the site_url) in this PR or separately. I don't see any reason not to include it, but @oprypin seems to suggest it should be kept separate. This is his PR, so if he wants to keep that separate, he can. I only mention it here for completeness.

The big question is whether we should delay the release of MkDocs 1.2 for this issue or not. It is not currently in the milestone, which (as of this posting) only has two remaining open issues. I didn't add it to the milestone initially because I didn't know how it would progress. Of course, as the readiness of this PR is dependent on an upstream project (see item 1 above), we have no idea of or control over when that will happen. My thinking is that we can tentatively add it to the milestone now and reexamine the situation once all remaining issues in the milestone are resolved. If need be, we can remove it from the milestone at that time and proceed with the release of 1.2 without this. Of course, it would be preferable to include this if we can.
Thanks much for the analysis.
Agreed to do that. And they have promised a release tomorrow, so that's nice.
With that in mind, hopefully the wait is not significant. And we even have the option of not waiting, because (and I don't prefer this approach, but I'm pointing it out):
Right -- I would just really like to see the initial commit of this server in its minimal state, with add-ons kept separate. Besides, as-is this is kinda a non-breaking change, while oprypin@b2ac728 is a kinda-breaking change. But there's no good way to ensure that these are merged as 2 commits while staying within one PR, and also no way to do dependent PRs if the branches aren't in the main repository.
I was still seeing random failures on Windows, in addition to the consistent ones on PyPy, so may as well add those sleeps everywhere.
I was looking into the fact that tests still occasionally fail randomly -- not on PyPy, despite the attention previously being brought to PyPy. I made GitHub Actions run this new test suite 50 times on each platform. So I added sleeps directly to the tests, which changed the failure rate to 0.

So, seeing as we need those sleeps anyway, and that that's what the fix for PyPy is anyway, now we don't even need to pin the watchdog version.

Mind you, the sleeps sound like a pretty bad practice to add, but each of them is 10ms; those are not real waits (I guess it's basically something to ensure the execution yields to the other thread). And these tests are already slow due to watchdog's startup and shutdown taking time, so that's not a big deal.

In summary, I consider this fully ready now.
I'll give it a test. Before that: would you be so kind as to remove the merge commits from the PR, please? (The feature in this form already incorporates other changes via merge.) And then rebase cleanly on top of the target branch (master)? That would be great, @oprypin
That is not true. Comparing this branch to master shows no other changes.
Why? Removing or avoiding them was never the plan. And I don't think the fact that it has some slightly chaotic commits affects the final installed result. It certainly doesn't affect the direct
I don't want to throw away the history of how the PR progressed. It already merges the latest of the target branch. In the end, GitHub will just "rebase" the very final result itself.
@oprypin Thank you for your hard work putting this together, and for your recent comment; much appreciated. I'm deeply sorry that my asking caused so much commentary and consideration; it was not meant that way. I kindly ask you to accept my apology. It is not, and never has been, a blocker for my test (which is going well so far), it was just an early comment. Which makes me especially sorry that my asking was the cause of so much further ado. It was merely a QC thing, VCS-related, so just a technicality. I know this does not make it better, and I have to ask you to accept my apology nevertheless.
I'm glad you didn't, as I would not have either. As written, I feel sorry I asked, as it caused so many considerations; I will refrain from further comments at least until this is resolved. My intent was to keep unrelated changes out of it to ease my local review, and even with the best intent, that was absolutely selfish. My belief was that my really short comment would be handled with sovereignty, and I have to find out for myself first what made me think so, as I was obviously wrong and offended you.

I will step away from adding any further comments to this PR for now, as I don't want to risk or derail it in any way. Be it entirely subjective feedback of mine or test results, or even how I was able to test it, I fear the risk of negativity from any kind of interaction does not justify the benefits it may have under the circumstances I believed or expected, and I owe it to you to verify this first. I'm deeply sorry, and the only thing left for me is the hope that you can accept my apology and that we can build trust in each other.
@ktomk No problem at all, and thanks for your heartfelt message. I really just wanted to explain in detail what I often see as a misconception. |
@oprypin Thanks, this is taking a burden off my shoulders. So, if it would be OK for you to continue regardless of any misconceptions, it would be fine for me if we could leave it so. And if you're interested in why I do a rebase check, I'm open to talking with you about that as well; it's just that I think it does not belong here. Sounds fair to you?
I am presenting this implementation here as part of the MkDocs repository itself. That is because this way people can try it easily, as well as review it easily, without external boilerplate. I also actually think that it would be good for MkDocs to have the flexibility provided by an inlined implementation. But if you think this should be made into a library, I'd be happy to provide that. In fact, I'll probably do that in any case. Just need a good name for it...