Rendering courses from forks and content fragments caching #362
So, I guess it's about time to submit this... I'm submitting now so you guys can get acquainted with the implementation before the sprint (I'll come Friday afternoon). Feel free to question anything and everything.
How courses from forks work:
Courses and runs in the base repo are defined by
The forks are serviced by model
Otherwise, contents of course pages are rendered in the fork. (So even the formats of the pages can be updated). By content I mean: things which are in the template folder
The content rendered in the fork is run through a HTML whitelist (
In the fork, I'm using the logger which gets urls from
There's a new file in the root of the repository,
How caching works:
The first thing that needs to be said about caching is that Arca has its own caching. If a cache is set, every result is stored under a key made from the commit hash of the repo at the time of execution and the task definition. So if the same task is executed and nothing has changed in the repo, the result comes straight from the cache. This is useful for forks, since the tasks and the repos won't change often. (The Arca cache is persistent on Travis, see below.)
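To illustrate keying results by commit hash and task definition, here is a minimal sketch. It is not Arca's actual implementation: the function names, the key format and the in-memory dict standing in for the cache are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory stand-in for a persistent cache backend.
_results: Dict[str, Any] = {}


def result_cache_key(commit_hash: str, task_definition: dict) -> str:
    """Build a key from the repo state and a canonical form of the task.

    ASSUMPTION: the task definition is JSON-serializable; the real key
    format used by Arca is not shown in this PR.
    """
    task_blob = json.dumps(task_definition, sort_keys=True)
    task_digest = hashlib.sha256(task_blob.encode("utf-8")).hexdigest()
    return f"{commit_hash}_{task_digest}"


def run_cached(commit_hash: str, task_definition: dict,
               execute: Callable[[], Any]) -> Any:
    """Run `execute` only if this (commit, task) pair has not been seen."""
    key = result_cache_key(commit_hash, task_definition)
    if key not in _results:
        _results[key] = execute()
    return _results[key]
```

With a key like this, a new commit in the fork changes the key and forces a fresh run, while repeated builds of an unchanged fork are served from the cache.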
But that's not the only caching I implemented. I also did the content fragments caching as requested in #175. I'm reusing the arca dogpile.cache region for it.
Since absolute urls are generated in lessons for almost everything (subpages, solutions, static files), the first change I made was to generate relative urls instead. Since the url generators were already injected into the rendering, I just modified the functions to generate relative urls. The functions are generated in
The next thing needed was the list of urls generated in a specific fragment, so it doesn't have to be parsed to get urls to freeze. I used the logger from Frozen-Flask again, this time in a custom context manager way (
So that's the value: a content fragment with relative urls (so it can be reused), together with the list of relative urls that have to be frozen when the fragment is used.
The key is a bit more complicated. I created names which can share individual fragments. The namespace is the commit hash of the commit which last modified anything inside of the folder
So when the value is retrieved from cache, absolute urls are generated from the current url and the relative urls and they're added to the queue of urls to be frozen.
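The round trip between relative and absolute urls can be sketched with the standard library alone. This is only an illustration under assumptions: the helper names and the example paths are hypothetical and are not the functions used in this PR.

```python
import posixpath
from typing import List
from urllib.parse import urljoin


def relative_url(current: str, target: str) -> str:
    """Return `target` relative to the page served at `current`.

    Both arguments are absolute URL paths; `current` is assumed to be a
    directory-style url ending with a slash.
    """
    base_dir = posixpath.dirname(current)
    rel = posixpath.relpath(target, start=base_dir)
    # relpath normalizes away a trailing slash, so restore it.
    if target.endswith("/") and not rel.endswith("/"):
        rel += "/"
    return rel


def absolute_urls(current: str, relative_urls: List[str]) -> List[str]:
    """Resolve cached relative urls against the page currently rendered."""
    return [urljoin(current, rel) for rel in relative_urls]
```

Because the cached fragment only contains relative urls, the same fragment and url list can be resolved against any page that embeds it, and the resulting absolute urls can then be queued for freezing.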
Now only sharing the cache with forks is left to explain:
The base repository creates a key for the content fragment of the specific page from the fork, on the same principles as described above. The cache is then checked for the value. If the value is present, the key is provided to the fork and the fork can decide if it wants to use it. If so,
If the value is not present in cache, the key is not provided and the fork has to render the content on its own. The value the fork returns is then used to populate the cache for next time.
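The exchange between the base repository and the fork can be sketched roughly as follows. This is a simplified model, not the code in this PR: the cache is a plain dict, the fork is a callable, and the convention that the fork returns `None` to signal "use the cached value" is an assumption chosen only for the sketch.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (HTML with relative urls, relative urls that have to be frozen)
Fragment = Tuple[str, List[str]]

# Hypothetical stand-in for the shared cache region.
fragment_cache: Dict[str, Fragment] = {}


def render_fork_content(
    key: str,
    fork_render: Callable[[Optional[str]], Optional[Fragment]],
) -> Fragment:
    """Offer a cache key to the fork only when a value exists for it."""
    offered_key = key if key in fragment_cache else None
    result = fork_render(offered_key)

    if offered_key is not None and result is None:
        # ASSUMPTION: None means the fork accepted the cached fragment.
        return fragment_cache[key]
    if result is None:
        raise ValueError("fork returned no content and nothing was cached")
    if offered_key is None:
        # Nothing was cached, so the fork's output populates the cache.
        fragment_cache[key] = result
    return result
```

In this model a fork that declines the offered key simply has its own output used, and the existing cache entry is left untouched.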
How errors are handled:
By default, all errors in forks are silenced. This can be overridden by setting the environment variable RAISE_FORK_ERRORS to
Of course, the forks can be malfunctioning in a number of ways.
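The "silence by default, raise on request" behaviour could look roughly like the sketch below. The helper names and the warning markup are hypothetical, and the value compared against (`"true"`) is an assumption, since the exact value expected for `RAISE_FORK_ERRORS` is not shown above.

```python
import os
from typing import Callable, Mapping


def should_raise_fork_errors(environ: Mapping[str, str] = os.environ) -> bool:
    # ASSUMPTION: the string "true" (case-insensitive) enables raising.
    return environ.get("RAISE_FORK_ERRORS", "").lower() == "true"


def render_fork_page(render: Callable[[], str],
                     environ: Mapping[str, str] = os.environ) -> str:
    """Render a page from a fork, falling back to a warning on failure."""
    try:
        return render()
    except Exception:
        if should_raise_fork_errors(environ):
            raise
        # Hypothetical placeholder shown instead of the broken page.
        return '<p class="warning">This page could not be rendered from the fork.</p>'
```

Silencing keeps one broken fork from failing the whole build, while raising is handy when debugging a fork locally.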
The Docker backend is used in Travis. To speed things up, the built images (with installed requirements) are pushed to a registry, so next time they can be pulled instead of building. While implementing the solution I was using https://hub.docker.com/r/mikicz/naucse/tags/, but I think the best solution would be to create a new account for naucse.
Unfortunately, once forks are introduced the build takes some extra time compared to the current version. This is mainly because of the extra boot time (sudo is required for docker) and pulling/building images.
The first thing I've done to improve the situation is persistent arca cache. (Meaning if there are no new commits in a fork, the whole thing can be retrieved from cache.)
If a really bad combination comes about (e.g. clean cache and new requirements), the freeze itself can be killed by Travis, which kills builds that don't print anything for 10 minutes (and freezing doesn't).
So the second thing is a new cli command which lists all courses and runs with info about them (slug and title, dates for runs, repo/branch for forked ones). Even courses which can't be pulled or don't return basic info are printed (only slug, repo/branch and a warning), so it serves as a nice overview of what's actually being rendered. The practical effect, though, is that all docker images are pulled or built during this command, which also prints output regularly so it won't be killed, and the freezing itself then doesn't take that much time.
Finally, Travis prints out debug information about what failed (the exact url is always mentioned so debugging is easier...). The
I implemented the webhook for triggering builds in a separate repository at mikicz/naucse-hooks. Currently, the hook has to be installed manually; I'll code the automatic creation later.
It's a simple Flask app which listens at
For my own naucse I've deployed the hook at naucse_hooks.poul.me. I'm willing to keep hosting the hooks on my VPS (vpsfree.cz), however, I'd rather not use the
Naucse uses the Current Environment backend when launched locally (as mentioned in #175). Caching is enabled, so pages from forks are cached and so are content fragments in lessons from the base repository. The cache for content fragments in the base repository is disabled automatically if your local version is modified (so the changes you're making are always visible).
In addition to testing stuff manually, I've written a couple of tests. They make a local fork from the current state of the repo (even with changes); however, they only test the basic stuff, usually just the structure of the response and a couple of values. Furthermore, they test how naucse behaves if there's an error in a fork: the page should return 200 with a warning.
What should be finalized:
What's not in this PR:
I will submit these later in further pull requests.
Oh, Travis doesn't trust me to provide an encrypted variable, so I had to comment out the settings which push images to docker (which isn't done in this PR anyway, because there aren't any forked repositories...)
Yeah, I know. I should've submitted this a while ago tbh to give you more time, but you know, procrastination... Plus I wanted to write up what I'm actually doing so you don't have to go by code alone...
A large part of the diff is a refactor of templates (which, I think, is more manageable if you only look at the result and not at the diff), but there's a lot of code as well...