Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #27685 -- Added watchman support to the autoreloader. #8819

Merged
merged 1 commit into from Jan 14, 2019

Conversation

@orf
Copy link
Contributor

@orf orf commented Jul 27, 2017

Ticket

The ticket is about adding support for watchman to the autoreloader, but a lot of Aymeric's points rang true - especially about a complete rewrite rather than a bit-by-bit refactor. There are a few other related tickets to do with the autoreloader that would require some re-engineering before they could be got right.

This MR does a big cleanup of the autoreloader code, which ends up to be half the size, be (i hope) a lot less hairy and have some interesting features.

Firstly, it implements two signals (autoreload_started and file_changed) that allow other parts of Django and perhaps even third party apps to customize what files are watched and offer custom handing when a file is changed. This is used to implement the i18n translations reset code, which currently lives in the autoreload code (and IMO really should not). Users of the signal are given the auto-reload instance which has a watch method, which accepts a glob argument, e.g:

autoreloader.watch('some_directory/', '**/*.html')

Secondly the autoreload code is split into two classes, a BaseReloader and a concrete StatReloader. In the future this can be used to add support for watchman or any other algorithm for detecting changes - it just needs to implement a yield_changes function that yields the paths of files that have changed.

It also changes the behaviour slightly: the current implementation replaces .pyc files with .py, but I'm not sure this is still valid. It assumes that .py files live next to .pyc files, which in Python 3 and __pycache__ directories may not be true. So I removed the code that handled that, as well as Jython-specific stuff.

It doesn't currently include support for catching SyntaxErrors, which I kept out in case anyone had a good idea of how to do a clean implementation of it. The current code (to quote Aymeric) is 'horrific'. It also doesn't include the ensure_echo_on code, which was a ticket that was added a long time ago. It has no tests and perhaps it's not required any more?

@orf orf changed the title REF 27685 autoreloader refactor Refs #27685 -- Refactor the autoreloader Jul 27, 2017
@orf orf force-pushed the orf:27685-autoreloader-refactor branch 2 times, most recently from 6d5ce78 to 9333a2c Jul 27, 2017
@aaugustin
Copy link
Member

@aaugustin aaugustin commented Jul 29, 2017

Thanks! I commented here https://code.djangoproject.com/ticket/27685#comment:6.

I'm planning to review this in the next few days (he says, optimistically).

@aaugustin aaugustin self-requested a review Jul 29, 2017
@orf
Copy link
Contributor Author

@orf orf commented Aug 15, 2017

@aaugustin, you said ping me by mid august: ping 🏓

@aaugustin
Copy link
Member

@aaugustin aaugustin commented Aug 19, 2017

I'm taking a look this week-end.

@aaugustin
Copy link
Member

@aaugustin aaugustin commented Aug 20, 2017

My original idea was to watch the current working directory rather than all imported Python modules. This changes the behavior for the better — I think. It's easier to explain. It also avoids debates about .pyc files, translation catalogs, etc.

I'd like to see how watchman will integrate with the APIs you defined. You said "it just needs to implement a yield_changes function" — the devil is in the details there. Perhaps it will work just fine :-) I think you should focus on this next.

The file_changed signal is a very nice touch.

Regarding ensure_echo_on, you should take a look at related commits or tickets to determine whether it's still needed. In doubt, keep it.

@orf
Copy link
Contributor Author

@orf orf commented Aug 20, 2017

Thanks for taking a look. I will:

  • Re-add ensure_echo_on - I think the tickets I read found the issue, but not the cause, so perhaps it's best to leave it in
  • Try and add watchman, and refactor the API if necessary
  • Add handing for exceptions - I removed it because there was apparently a better way. Need to do some more reading into this

I was also onboarding some new junior developers last week, and one thing I noticed that threw them a little bit was the fact that the autoreloader closes the socket while reloading. So often when refreshing after a change the browser would throw an error because nothing was listening on that port.

If you agree I'd like to fix that as well, as far as I can tell the fix would be to open the socket in the outer manage.py and pass it into the subprocess. It's a little thing but if it's simple to fix it might be good.

@aaugustin
Copy link
Member

@aaugustin aaugustin commented Aug 21, 2017

Yes, why not. Check the behavior before / after in the following cases: run manage.py runserver:

  • with an error in the settings, then remove that error
  • with an error in another module then remove that error
  • without errors, then add an error in the settings, then remove that error
  • without errors, then add an error in another file, then remove that error

BTW I just remembered the reason for watching the whole current directory rather just imported Python modules: that will catch file additions like admin.py.

@orf
Copy link
Contributor Author

@orf orf commented Aug 21, 2017

@aaugustin
Copy link
Member

@aaugustin aaugustin commented Aug 21, 2017

We could have a configurable set of excludes with reasonable defaults for the CWD... Anyway, you write the code, you get to decide :-) Let's focus on watchman integration first!

@orf
Copy link
Contributor Author

@orf orf commented Aug 27, 2017

I started on getting watchman working, and it was going well. However I got sidetracked with the exception handing situation, which needs some thought and redesigning. I've come up with a solution - however it adds a bit of complexity so I wanted to get some feedback.

The tl;dr is that the initial manage.py process needs to handle the file watching, not the child process. The reason being is that if there is an error starting the project (like an issue in the settings or a bug in Django itself) the autoreload code won't run. So how do we know when to restart the process?

My solution is this: the child manage.py is spawned using multiprocessing, which allows us to pass a Queue into it. Any files that need to be watched are dispatched into the queue, which are picked up by the parent process. When these files change the child is terminated and restarted. If the child fails to start due to an error, the last state of the watched files is used to reload the process.

This seems cleaner than the current implementation but obviously is more complex. Also some people have strong reactions when multiprocessing is mentioned. It's also inspired by a generic-wsgi reloader project linked in the developers thread.

I'm going to implement this fully with the current stat reloader and post a message to the developers list to gather feedback I think. Integrating watchman also needs this ticket to be fixed to be much use I think, as the pypi package hasn't been updated in a while and has little documentation. The overall code might be simpler with the multiprocessing queue-based approach above due to the way the watchman library works.

@orf orf force-pushed the orf:27685-autoreloader-refactor branch 2 times, most recently from 8f96562 to 05dd33a Aug 27, 2017
@orf
Copy link
Contributor Author

@orf orf commented Aug 27, 2017

I've pushed what I've got so far. It works, but I'm not sure I like it. The original code, while being hacky, was quite elegant. This approach is more complex for sure.

In any case I'll start working on adding watchman integration next.

@orf
Copy link
Contributor Author

@orf orf commented Aug 28, 2017

I've added initial support for watchman! It was actually easier than I thought - I took a bit of code from flask to handle finding the appropriate roots to listen on which sped things up a lot. It's not done yet - it will trigger a reload for any change under the roots it listens on, but it's good as a proof of concept.

Until watchman push a new pywatchman release (which I'm bugging them to do), you need to clone the watchman repository and run setup.py install from the watchman/python directory for this to work.

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from 568b2d5 to dbedf5f Aug 28, 2017
@jonashaag
Copy link
Contributor

@jonashaag jonashaag commented Nov 2, 2017

I'm interested in continuing to work on this. What's the state of this PR?

@orf
Copy link
Contributor Author

@orf orf commented Nov 3, 2017

@jonashaag
Copy link
Contributor

@jonashaag jonashaag commented Nov 4, 2017

I've had a look at your implementation and I found it complicated (loads of subprocesses and threads etc). But may the problem is simply very complicated. I wonder if an approach with a completely separated autoreloader (like hupper) wouldn't be simpler. I also wonder if we couldn't simply re-use large parts of werkzeugs autoreloader.

Anyways, I rebased to current master which you can find here: https://github.com/jonashaag/django/tree/autoreloader

@orf
Copy link
Contributor Author

@orf orf commented Nov 11, 2017

So yeah, unfortunately this did get overly complicated. I wanted to run with the ideas I had to completion and see how they would work out.

The main issue is that on some systems registering a file watcher can be quite expensive. I wanted to have the parent django process manage this, and the child one that actually runs the application notify the parent of which files to watch. This brings in a lot of complexity, I guess it could be avoided by just registering/un-registering the fs notifications in the child process, just as flask does. I think there is much both projects could share.

It seems really minor, but one thing I would really like to do is to open the socket in the parent (reloader) process and pass it into the child one. This means refreshing the page as the child is reloading will not result in a Connection Refused error as it currently does, but the socket will buffer the request until the child has finished reloading and can accept request. I was onboarding several new joiners (and new-ish to Django), and this was actually quite a big pain point.

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from dbedf5f to 3ea9163 Nov 11, 2017
@orf orf force-pushed the orf:27685-autoreloader-refactor branch 3 times, most recently from a59571e to f92aad9 Dec 9, 2017
@orf
Copy link
Contributor Author

@orf orf commented Dec 9, 2017

I shelved the complex multi-process branch and reverted to the simpler threading based approach we currently use. I've added support for Watchman and the current polling approach, and cleaned up the autoreloader.py file a lot. You can try it out with manage.py runserver --watchman.

I'll add tests and documentation next, any advice with testing approaches for this would be greatly appreciated.

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from b8e33fe to 74135a8 Dec 9, 2017
@orf orf force-pushed the orf:27685-autoreloader-refactor branch 5 times, most recently from 364872a to 6623bd2 Nov 28, 2018
@orf orf force-pushed the orf:27685-autoreloader-refactor branch from 6623bd2 to aaf56d8 Dec 29, 2018
Copy link
Member

@timgraham timgraham left a comment

Should I install watchman from source on the CI machines? docs/internals/contributing/writing-code/unit-tests.html "Runnig all the tests" also needs an update.

Perhaps the release notes should mention the removal of pyinotify support?

tests/requirements/py3.txt Outdated Show resolved Hide resolved
docs/ref/django-admin.txt Outdated Show resolved Hide resolved


def watch_for_translation_changes(sender, **kwargs):
from django.conf import settings

This comment has been minimized.

@timgraham

timgraham Jan 9, 2019
Member

Some docstrings for these methods could be useful.

This comment has been minimized.

@orf

orf Jan 9, 2019
Author Contributor

I added a docstring, I was not sure how detailed to be.

I also removed the watching of built-in Django .mo files, there are a lot of them and I do not think there is any value in continually polling them.

tests/utils_tests/test_autoreload.py Show resolved Hide resolved
@orf orf force-pushed the orf:27685-autoreloader-refactor branch from aaf56d8 to 08ce861 Jan 9, 2019
@orf
Copy link
Contributor Author

@orf orf commented Jan 9, 2019

Thank you @timgraham, I've addressed your comments. Regarding the CI: I think you have to install it from source. I think we should install it on a single runner first (the sqlite one?) and get that working before you spend time rolling it out everywhere?

Also I think that the daemon will be automatically started by the pywatchman library as long as the binary is available on the system path (it is on MacOS), which might make admin easier.

There are a few configuration options that might be worth setting, specifically for a CI environment. The GC times could be turned down from the default of 12 hours: https://facebook.github.io/watchman/docs/config.html#gc_age_seconds

The inotify limits may also have to be tuned: https://facebook.github.io/watchman/docs/install.html#linux-inotify-limits

@collinanderson
Copy link
Contributor

@collinanderson collinanderson commented Jan 10, 2019

@orf Do you mention in the docs / release notes that inotify is no longer supported?

@timgraham
Copy link
Member

@timgraham timgraham commented Jan 12, 2019

I installed watchman on my machine and the tests are working except:

======================================================================
FAIL: test_check_availability (utils_tests.test_autoreload.WatchmanReloaderTests)
----------------------------------------------------------------------
django.utils.autoreload.WatchmanUnavailable: Watchman version >= 4.9 required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/python3.7.0/lib/python3.7/unittest/mock.py", line 1191, in patched
    return func(*args, **keywargs)
  File "/home/tim/code/django/tests/utils_tests/test_autoreload.py", line 602, in test_check_availability
    self.RELOADER_CLS.check_availability()
  File "/opt/python3.7.0/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/tim/code/django/django/test/testcases.py", line 666, in _assert_raises_or_warns_cm
    self.assertIn(expected_message, str(getattr(cm, cm_attr)))
AssertionError: 'Cannot connect to the watchman service' not found in 'Watchman version >= 4.9 required'

Does it work for you?

I did nothing to "ensure watchman is running" so I'll amend the documentation as I make my edits.

@timgraham timgraham changed the title Refs #27685 -- Refactor the autoreloader Fixed #27685 -- Made the autoreloader use watchman. Jan 12, 2019
@orf
Copy link
Contributor Author

@orf orf commented Jan 12, 2019

@timgraham timgraham changed the title Fixed #27685 -- Made the autoreloader use watchman. Fixed #27685 -- Added watchman support to the autoreloader. Jan 12, 2019
@timgraham timgraham force-pushed the orf:27685-autoreloader-refactor branch from 90651a1 to 3e014e8 Jan 12, 2019
Copy link
Member

@timgraham timgraham left a comment

I pushed my edits.

django/utils/autoreload.py Outdated Show resolved Hide resolved
django/utils/autoreload.py Show resolved Hide resolved
@orf orf force-pushed the orf:27685-autoreloader-refactor branch from cb33552 to a34d6ef Jan 12, 2019
@timgraham timgraham force-pushed the orf:27685-autoreloader-refactor branch from a34d6ef to 43c1431 Jan 12, 2019
@timgraham timgraham merged commit c8720e7 into django:master Jan 14, 2019
19 checks passed
19 checks passed
@timgraham
docs Build #18271 ended
Details
@timgraham
flake8 Build #18380 ended
Details
@timgraham
isort Build #18411 succeeded in 24 sec
Details
@timgraham
pr-mariadb/database=mysql,label=mariadb,python=python3.7 Build #2851 ended
Details
@timgraham
pr-mariadb/database=mysql_gis,label=mariadb,python=python3.7 Build #2851 ended
Details
@timgraham
pull-requests-bionic/database=mysql,label=bionic-pr,python=python3.5 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=mysql,label=bionic-pr,python=python3.7 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=mysql_gis,label=bionic-pr,python=python3.5 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=mysql_gis,label=bionic-pr,python=python3.7 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=postgis,label=bionic-pr,python=python3.5 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=postgis,label=bionic-pr,python=python3.7 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=postgres,label=bionic-pr,python=python3.5 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=postgres,label=bionic-pr,python=python3.7 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=spatialite,label=bionic-pr,python=python3.5 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=spatialite,label=bionic-pr,python=python3.7 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=sqlite3,label=bionic-pr,python=python3.5 Build #3329 ended
Details
@timgraham
pull-requests-bionic/database=sqlite3,label=bionic-pr,python=python3.7 Build #3329 ended
Details
@timgraham
pull-requests-javascript Build #14773 ended
Details
@timgraham
pull-requests-windows/database=sqlite3,label=windows,python=Python37 Build #10380 ended
Details
@orf orf deleted the orf:27685-autoreloader-refactor branch Jan 18, 2019
@adamchainz
Copy link
Member

@adamchainz adamchainz commented Jan 20, 2019

@orf Thanks for the great work here 🎉

@ArtemBernatskyy
Copy link

@ArtemBernatskyy ArtemBernatskyy commented Feb 4, 2019

Found a bug with infinite reloading.
Tested on Django==2.2a1

Watching for file changes with StatReloader
Performing system checks…

System check identified no issues (0 silenced).
/var/webapps/project/code/backend/project/contacts/api/serializers.py changed, reloading.
Watching for file changes with StatReloader
Performing system checks…

System check identified no issues (0 silenced).
/var/webapps/project/code/backend/project/contacts/api/serializers.py changed, reloading.
Watching for file changes with StatReloader
Performing system checks…

System check identified no issues (0 silenced).
/var/webapps/project/code/backend/project/contacts/api/serializers.py changed, reloading.
Watching for file changes with StatReloader
Performing system checks…

System check identified no issues (0 silenced).
/var/webapps/project/code/backend/project/contacts/api/serializers.py changed, reloading.
Watching for file changes with StatReloader
Performing system checks…

System check identified no issues (0 silenced).
/var/webapps/project/code/backend/project/contacts/api/serializers.py changed, reloading.
Watching for file changes with StatReloader
Performing system checks…

and it goes to infinite reloading loop (I haven't changed files)
Vagrant, Pyenv (1.2.9), Python (3.6.8), pywathcman not installed

@timgraham
Copy link
Member

@timgraham timgraham commented Feb 4, 2019

Please reports bugs with a Trac ticket. I guess we may need more details to understand the problem and reproduce the issue.

@orf
Copy link
Contributor Author

@orf orf commented Feb 4, 2019

Hey @ArtemBernatskyy, are you able to create a ticket on https://code.djangoproject.com/query?

There is a django.utils.autoreload logger that outputs diagnostic information if set to DEBUG. If you're able to do this then that information would greatly help in the ticket.

@ArtemBernatskyy
Copy link

@ArtemBernatskyy ArtemBernatskyy commented Feb 4, 2019

@orf @timgraham , damn, I can't reproduce it (

If somehow I'll be able to reproduce it I'll create a ticket.
Thx.

@kezabelle
Copy link
Contributor

@kezabelle kezabelle commented Feb 5, 2019

/var/webapps/project/code/backend/project/contacts/api/serializers.py
/var/webapps/project/code/backend/prvoltprojectrestful/contacts/api/serializers.py

Perhaps it's nothing, but 2 different paths to the same file, maybe?

@ArtemBernatskyy
Copy link

@ArtemBernatskyy ArtemBernatskyy commented Feb 5, 2019

@kezabelle no it's me ) I've edited the log a little bit. Fixed now.

@mazulo mazulo mentioned this pull request Aug 12, 2019
2 of 8 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment