New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #27685 -- Added watchman support to the autoreloader. #8819

Merged
merged 1 commit into from Jan 14, 2019

Conversation

Projects
None yet
10 participants
@orf
Copy link
Contributor

orf commented Jul 27, 2017

Ticket

The ticket is about adding support for watchman to the autoreloader, but a lot of Aymeric's points rang true - especially about a complete rewrite rather than a bit-by-bit refactor. There are a few other related tickets to do with the autoreloader that would require some re-engineering before they could be got right.

This MR does a big cleanup of the autoreloader code, which ends up to be half the size, be (i hope) a lot less hairy and have some interesting features.

Firstly, it implements two signals (autoreload_started and file_changed) that allow other parts of Django and perhaps even third party apps to customize what files are watched and offer custom handing when a file is changed. This is used to implement the i18n translations reset code, which currently lives in the autoreload code (and IMO really should not). Users of the signal are given the auto-reload instance which has a watch method, which accepts a glob argument, e.g:

autoreloader.watch('some_directory/', '**/*.html')

Secondly the autoreload code is split into two classes, a BaseReloader and a concrete StatReloader. In the future this can be used to add support for watchman or any other algorithm for detecting changes - it just needs to implement a yield_changes function that yields the paths of files that have changed.

It also changes the behaviour slightly: the current implementation replaces .pyc files with .py, but I'm not sure this is still valid. It assumes that .py files live next to .pyc files, which in Python 3 and __pycache__ directories may not be true. So I removed the code that handled that, as well as Jython-specific stuff.

It doesn't currently include support for catching SyntaxErrors, which I kept out in case anyone had a good idea of how to do a clean implementation of it. The current code (to quote Aymeric) is 'horrific'. It also doesn't include the ensure_echo_on code, which was a ticket that was added a long time ago. It has no tests and perhaps it's not required any more?

@orf orf changed the title REF 27685 autoreloader refactor Refs #27685 -- Refactor the autoreloader Jul 27, 2017

@orf orf force-pushed the orf:27685-autoreloader-refactor branch 2 times, most recently from 6d5ce78 to 9333a2c Jul 27, 2017

@aaugustin

This comment has been minimized.

Copy link
Member

aaugustin commented Jul 29, 2017

Thanks! I commented here https://code.djangoproject.com/ticket/27685#comment:6.

I'm planning to review this in the next few days (he says, optimistically).

@aaugustin aaugustin self-requested a review Jul 29, 2017

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Aug 15, 2017

@aaugustin, you said ping me by mid august: ping 🏓

@aaugustin

This comment has been minimized.

Copy link
Member

aaugustin commented Aug 19, 2017

I'm taking a look this week-end.

@aaugustin

This comment has been minimized.

Copy link
Member

aaugustin commented Aug 20, 2017

My original idea was to watch the current working directory rather than all imported Python modules. This changes the behavior for the better — I think. It's easier to explain. It also avoids debates about .pyc files, translation catalogs, etc.

I'd like to see how watchman will integrate with the APIs you defined. You said "it just needs to implement a yield_changes function" — the devil is in the details there. Perhaps it will work just fine :-) I think you should focus on this next.

The file_changed signal is a very nice touch.

Regarding ensure_echo_on, you should take a look at related commits or tickets to determine whether it's still needed. In doubt, keep it.

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Aug 20, 2017

Thanks for taking a look. I will:

  • Re-add ensure_echo_on - I think the tickets I read found the issue, but not the cause, so perhaps it's best to leave it in
  • Try and add watchman, and refactor the API if necessary
  • Add handing for exceptions - I removed it because there was apparently a better way. Need to do some more reading into this

I was also onboarding some new junior developers last week, and one thing I noticed that threw them a little bit was the fact that the autoreloader closes the socket while reloading. So often when refreshing after a change the browser would throw an error because nothing was listening on that port.

If you agree I'd like to fix that as well, as far as I can tell the fix would be to open the socket in the outer manage.py and pass it into the subprocess. It's a little thing but if it's simple to fix it might be good.

@aaugustin

This comment has been minimized.

Copy link
Member

aaugustin commented Aug 21, 2017

Yes, why not. Check the behavior before / after in the following cases: run manage.py runserver:

  • with an error in the settings, then remove that error
  • with an error in another module then remove that error
  • without errors, then add an error in the settings, then remove that error
  • without errors, then add an error in another file, then remove that error

BTW I just remembered the reason for watching the whole current directory rather just imported Python modules: that will catch file additions like admin.py.

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Aug 21, 2017

@aaugustin

This comment has been minimized.

Copy link
Member

aaugustin commented Aug 21, 2017

We could have a configurable set of excludes with reasonable defaults for the CWD... Anyway, you write the code, you get to decide :-) Let's focus on watchman integration first!

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Aug 27, 2017

I started on getting watchman working, and it was going well. However I got sidetracked with the exception handing situation, which needs some thought and redesigning. I've come up with a solution - however it adds a bit of complexity so I wanted to get some feedback.

The tl;dr is that the initial manage.py process needs to handle the file watching, not the child process. The reason being is that if there is an error starting the project (like an issue in the settings or a bug in Django itself) the autoreload code won't run. So how do we know when to restart the process?

My solution is this: the child manage.py is spawned using multiprocessing, which allows us to pass a Queue into it. Any files that need to be watched are dispatched into the queue, which are picked up by the parent process. When these files change the child is terminated and restarted. If the child fails to start due to an error, the last state of the watched files is used to reload the process.

This seems cleaner than the current implementation but obviously is more complex. Also some people have strong reactions when multiprocessing is mentioned. It's also inspired by a generic-wsgi reloader project linked in the developers thread.

I'm going to implement this fully with the current stat reloader and post a message to the developers list to gather feedback I think. Integrating watchman also needs this ticket to be fixed to be much use I think, as the pypi package hasn't been updated in a while and has little documentation. The overall code might be simpler with the multiprocessing queue-based approach above due to the way the watchman library works.

@orf orf force-pushed the orf:27685-autoreloader-refactor branch 2 times, most recently from 8f96562 to 05dd33a Aug 27, 2017

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Aug 27, 2017

I've pushed what I've got so far. It works, but I'm not sure I like it. The original code, while being hacky, was quite elegant. This approach is more complex for sure.

In any case I'll start working on adding watchman integration next.

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Aug 28, 2017

I've added initial support for watchman! It was actually easier than I thought - I took a bit of code from flask to handle finding the appropriate roots to listen on which sped things up a lot. It's not done yet - it will trigger a reload for any change under the roots it listens on, but it's good as a proof of concept.

Until watchman push a new pywatchman release (which I'm bugging them to do), you need to clone the watchman repository and run setup.py install from the watchman/python directory for this to work.

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from 568b2d5 to dbedf5f Aug 28, 2017

@jonashaag

This comment has been minimized.

Copy link
Contributor

jonashaag commented Nov 2, 2017

I'm interested in continuing to work on this. What's the state of this PR?

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Nov 3, 2017

@jonashaag

This comment has been minimized.

Copy link
Contributor

jonashaag commented Nov 4, 2017

I've had a look at your implementation and I found it complicated (loads of subprocesses and threads etc). But may the problem is simply very complicated. I wonder if an approach with a completely separated autoreloader (like hupper) wouldn't be simpler. I also wonder if we couldn't simply re-use large parts of werkzeugs autoreloader.

Anyways, I rebased to current master which you can find here: https://github.com/jonashaag/django/tree/autoreloader

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Nov 11, 2017

So yeah, unfortunately this did get overly complicated. I wanted to run with the ideas I had to completion and see how they would work out.

The main issue is that on some systems registering a file watcher can be quite expensive. I wanted to have the parent django process manage this, and the child one that actually runs the application notify the parent of which files to watch. This brings in a lot of complexity, I guess it could be avoided by just registering/un-registering the fs notifications in the child process, just as flask does. I think there is much both projects could share.

It seems really minor, but one thing I would really like to do is to open the socket in the parent (reloader) process and pass it into the child one. This means refreshing the page as the child is reloading will not result in a Connection Refused error as it currently does, but the socket will buffer the request until the child has finished reloading and can accept request. I was onboarding several new joiners (and new-ish to Django), and this was actually quite a big pain point.

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from dbedf5f to 3ea9163 Nov 11, 2017

@orf orf force-pushed the orf:27685-autoreloader-refactor branch 3 times, most recently from a59571e to f92aad9 Dec 9, 2017

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Dec 9, 2017

I shelved the complex multi-process branch and reverted to the simpler threading based approach we currently use. I've added support for Watchman and the current polling approach, and cleaned up the autoreloader.py file a lot. You can try it out with manage.py runserver --watchman.

I'll add tests and documentation next, any advice with testing approaches for this would be greatly appreciated.

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from b8e33fe to 74135a8 Dec 9, 2017

@orf orf force-pushed the orf:27685-autoreloader-refactor branch 5 times, most recently from f5f77cc to a59017f Nov 18, 2018

@orf orf force-pushed the orf:27685-autoreloader-refactor branch 5 times, most recently from 364872a to 6623bd2 Nov 28, 2018

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from 6623bd2 to aaf56d8 Dec 29, 2018

@timgraham
Copy link
Member

timgraham left a comment

Should I install watchman from source on the CI machines? docs/internals/contributing/writing-code/unit-tests.html "Runnig all the tests" also needs an update.

Perhaps the release notes should mention the removal of pyinotify support?

Show resolved Hide resolved tests/requirements/py3.txt Outdated
Show resolved Hide resolved docs/ref/django-admin.txt Outdated


def watch_for_translation_changes(sender, **kwargs):
from django.conf import settings

This comment has been minimized.

@timgraham

timgraham Jan 9, 2019

Member

Some docstrings for these methods could be useful.

This comment has been minimized.

@orf

orf Jan 9, 2019

Contributor

I added a docstring, I was not sure how detailed to be.

I also removed the watching of built-in Django .mo files, there are a lot of them and I do not think there is any value in continually polling them.

Show resolved Hide resolved tests/utils_tests/test_autoreload.py

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from aaf56d8 to 08ce861 Jan 9, 2019

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Jan 9, 2019

Thank you @timgraham, I've addressed your comments. Regarding the CI: I think you have to install it from source. I think we should install it on a single runner first (the sqlite one?) and get that working before you spend time rolling it out everywhere?

Also I think that the daemon will be automatically started by the pywatchman library as long as the binary is available on the system path (it is on MacOS), which might make admin easier.

There are a few configuration options that might be worth setting, specifically for a CI environment. The GC times could be turned down from the default of 12 hours: https://facebook.github.io/watchman/docs/config.html#gc_age_seconds

The inotify limits may also have to be tuned: https://facebook.github.io/watchman/docs/install.html#linux-inotify-limits

@collinanderson

This comment has been minimized.

Copy link
Contributor

collinanderson commented Jan 10, 2019

@orf Do you mention in the docs / release notes that inotify is no longer supported?

@timgraham

This comment has been minimized.

Copy link
Member

timgraham commented Jan 12, 2019

I installed watchman on my machine and the tests are working except:

======================================================================
FAIL: test_check_availability (utils_tests.test_autoreload.WatchmanReloaderTests)
----------------------------------------------------------------------
django.utils.autoreload.WatchmanUnavailable: Watchman version >= 4.9 required

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/python3.7.0/lib/python3.7/unittest/mock.py", line 1191, in patched
    return func(*args, **keywargs)
  File "/home/tim/code/django/tests/utils_tests/test_autoreload.py", line 602, in test_check_availability
    self.RELOADER_CLS.check_availability()
  File "/opt/python3.7.0/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/tim/code/django/django/test/testcases.py", line 666, in _assert_raises_or_warns_cm
    self.assertIn(expected_message, str(getattr(cm, cm_attr)))
AssertionError: 'Cannot connect to the watchman service' not found in 'Watchman version >= 4.9 required'

Does it work for you?

I did nothing to "ensure watchman is running" so I'll amend the documentation as I make my edits.

@timgraham timgraham changed the title Refs #27685 -- Refactor the autoreloader Fixed #27685 -- Made the autoreloader use watchman. Jan 12, 2019

@orf

This comment has been minimized.

Copy link
Contributor

orf commented Jan 12, 2019

@timgraham timgraham changed the title Fixed #27685 -- Made the autoreloader use watchman. Fixed #27685 -- Added watchman support to the autoreloader. Jan 12, 2019

@timgraham timgraham force-pushed the orf:27685-autoreloader-refactor branch from 90651a1 to 3e014e8 Jan 12, 2019

@timgraham
Copy link
Member

timgraham left a comment

I pushed my edits.

Show resolved Hide resolved django/utils/autoreload.py Outdated
Show resolved Hide resolved django/utils/autoreload.py

@orf orf force-pushed the orf:27685-autoreloader-refactor branch from cb33552 to a34d6ef Jan 12, 2019

@timgraham timgraham force-pushed the orf:27685-autoreloader-refactor branch from a34d6ef to 43c1431 Jan 12, 2019

@timgraham timgraham merged commit c8720e7 into django:master Jan 14, 2019

19 checks passed

docs Build #18271 ended
Details
flake8 Build #18380 ended
Details
isort Build #18411 succeeded in 24 sec
Details
pr-mariadb/database=mysql,label=mariadb,python=python3.7 Build #2851 ended
Details
pr-mariadb/database=mysql_gis,label=mariadb,python=python3.7 Build #2851 ended
Details
pull-requests-bionic/database=mysql,label=bionic-pr,python=python3.5 Build #3329 ended
Details
pull-requests-bionic/database=mysql,label=bionic-pr,python=python3.7 Build #3329 ended
Details
pull-requests-bionic/database=mysql_gis,label=bionic-pr,python=python3.5 Build #3329 ended
Details
pull-requests-bionic/database=mysql_gis,label=bionic-pr,python=python3.7 Build #3329 ended
Details
pull-requests-bionic/database=postgis,label=bionic-pr,python=python3.5 Build #3329 ended
Details
pull-requests-bionic/database=postgis,label=bionic-pr,python=python3.7 Build #3329 ended
Details
pull-requests-bionic/database=postgres,label=bionic-pr,python=python3.5 Build #3329 ended
Details
pull-requests-bionic/database=postgres,label=bionic-pr,python=python3.7 Build #3329 ended
Details
pull-requests-bionic/database=spatialite,label=bionic-pr,python=python3.5 Build #3329 ended
Details
pull-requests-bionic/database=spatialite,label=bionic-pr,python=python3.7 Build #3329 ended
Details
pull-requests-bionic/database=sqlite3,label=bionic-pr,python=python3.5 Build #3329 ended
Details
pull-requests-bionic/database=sqlite3,label=bionic-pr,python=python3.7 Build #3329 ended
Details
pull-requests-javascript Build #14773 ended
Details
pull-requests-windows/database=sqlite3,label=windows,python=Python37 Build #10380 ended
Details

@orf orf deleted the orf:27685-autoreloader-refactor branch Jan 18, 2019

@adamchainz

This comment has been minimized.

Copy link
Member

adamchainz commented Jan 20, 2019

@orf Thanks for the great work here 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment