Add var/tmp / $kernel->getTmpDir() #23354

Closed
nicolas-grekas opened this issue Jul 3, 2017 · 13 comments · Fixed by #36515
Labels
HttpKernel RFC RFC = Request For Comments (proposals about features that you want to be discussed)

Comments

@nicolas-grekas
Member

| Q                | A
| ---------------- | ---
| Bug report?      | no
| Feature request? | no
| BC Break report? | no
| RFC?             | yes
| Symfony version  | 3.4

We have made a lot of progress towards supporting read-only filesystems for Symfony apps.
We managed to have the var/cache/ folder used almost only for build time artifacts.
I think we should do the last step and define the var/cache/ folder as the one used to store build time artifacts only.
There is at least one blocker: the cache pools, which need runtime write access.
Thus, I propose we add a new "tmp" folder, with a related $kernel->getTmpDir() method, defined as OK for storing runtime (and build time of course, for warmups) data.

@nicolas-grekas nicolas-grekas added the RFC RFC = Request For Comments (proposals about features that you want to be discussed) label Jul 3, 2017
@nicolas-grekas nicolas-grekas added this to the 3.4 milestone Jul 4, 2017
@ro0NL
Contributor

ro0NL commented Jul 8, 2017

Like the idea :) but doesn't /cache sound like /tmp already? What about var/build / getBuildDir instead?

@havvg
Contributor

havvg commented Jul 15, 2017

What are the actual benefits of this change? I mean, there is a cache directory, which is widely used by bundles and even applications. What are the benefits of adding such a new directory, other than putting meta constraints on it, that are no issue right now? What's the intent for this change?

Assuming that you (as in the framework) are the only one using the cache directory is totally wrong to me. There are most likely applications (and probably even bundles) out there that already use the cache directory for runtime caching.

Anyone "smart enough" would not even use the new tmp directory, because their temporary data is already outside of the project directory, e.g. on a RAM disk. Not to mention there will be totally different approaches to temporary data management.

What's the "No BC Break" solution of this? Adding a new interface, checking for its existence and falling back to the old behaviour (use cache), while implementing it on the abstract kernel.

I would consider those BC Break, and I'm mentioning them here, because we have all seen this happening already, e.g. with the Form component.

  • Adding a new method to the Symfony\Component\HttpKernel\Kernel and relying on the implementation? That will kill apps that do not use the abstract class.
  • Adding a new method to the Symfony\Component\HttpKernel\KernelInterface would be BC break by definition.

Even requiring users to set up the new directory is a BC break on the infrastructural level of the framework: you need to change deployment processes, grant access to new directories, etc. What are the customer needs for this change?

@nicolas-grekas
Member Author

nicolas-grekas commented Aug 17, 2017

The benefit would be that the "cache" directory could be mounted as read-only, and that we, the core team, would consider any write attempt there a bug. The "tmp" dir would be the only one requiring read+write access, and we (the community) could advertise that the less you put there the better.

About the BC break, it is not required: we could just start by putting the tmp dir inside the cache dir as a first step in 3.4, then (if worth it) move it outside by default in 4.0, for new projects only ideally.
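
The two-step approach described above could be sketched roughly as follows. This is only an illustration: `TmpDirKernelSketch` is a hypothetical stand-in class rather than Symfony's `Kernel`, `getTmpDir()` is the method proposed in this issue rather than an existing API, and the paths are made up. It assumes PHP 7.4+ for typed properties.

```php
<?php

// Hypothetical stand-in for a kernel; NOT Symfony's actual Kernel class.
final class TmpDirKernelSketch
{
    private string $projectDir;
    private string $environment;
    private ?string $tmpDir;

    public function __construct(string $projectDir, string $environment, ?string $tmpDir = null)
    {
        $this->projectDir = $projectDir;
        $this->environment = $environment;
        $this->tmpDir = $tmpDir; // optional override, used in the "second step"
    }

    // Build-time artifacts only; could be mounted read-only after warmup.
    public function getCacheDir(): string
    {
        return $this->projectDir.'/var/cache/'.$this->environment;
    }

    // Runtime-writable data (proposed method, assumed name).
    // First step: defaults to a subfolder of the cache dir, so nothing moves.
    public function getTmpDir(): string
    {
        return $this->tmpDir ?? $this->getCacheDir().'/tmp';
    }
}

// First step: tmp lives inside the cache dir.
$firstStep = new TmpDirKernelSketch('/srv/app', 'prod');

// Second step: a new project moves tmp outside of the cache dir.
$secondStep = new TmpDirKernelSketch('/srv/app', 'prod', '/srv/app/var/tmp');

echo $firstStep->getTmpDir(), "\n";
echo $secondStep->getTmpDir(), "\n";
```

With a default like this, existing deployments would keep a single writable tree, while new projects could split the two directories and mount the cache dir read-only.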

@dkarlovi
Contributor

dkarlovi commented Dec 27, 2017

> There is at least one blocker: the cache pools, which need runtime write access.

IMO this is significant enough to warrant a consideration: you're trying to make the cache folder read-only, but the cache pool is in your way so you'll move the cache pool out of cache folder. :)

Anyway, this is obviously a misnomer situation: various subsystems (build-time, run-time) have different requirements on file-system storage, with varying read/write and volatility needs. For the record, I think having a dedicated tmp folder (volatile runtime data) is a good move. 👍

The Linux hier man page talks about this. I pulled out some of the more important entries (I'm not proposing to introduce all or even most of these, but it makes sense to see the usage patterns):

/var
This directory contains files which may change in size, such as spool and log files.

/var/cache
Data cached for programs.

Volatile runtime data.

/var/lib
Variable state information for programs.

This is where MySQL stores its database files so it's non-volatile runtime data.

/var/lock
Lock files are placed in this directory. The naming convention for device lock files is LCK..<device> where <device> is the device's name in the file system. The format used is that of HDU UUCP lock files, that is, lock files contain a PID as a 10-byte ASCII decimal number, followed by a newline character.

/var/log
Miscellaneous log files.

/var/run
Run-time variable files, like files holding process identifiers (PIDs) and logged user information (utmp). Files in this directory are usually cleared when the system boots.

/var/spool
Spooled (or queued) files for various programs.

/var/tmp
Like /tmp, this directory holds temporary files stored for an unspecified duration.

@dkarlovi
Contributor

dkarlovi commented Dec 27, 2017

BTW, segmenting files by when, how and by whom they are written also plays nicely with established runtime environments such as SELinux (used in most/all Red Hat products like RHEL/CentOS and OpenShift), where you have "contexts" in which certain paths (properly labeled files, to be exact) are accessible.

Free-for-all type of situation is not really great.

@smoelker

Are there any new thoughts or is there any progress on this issue? I need to find a way to deploy a "hot build cache" to reduce warmup time of my Symfony applications. I'm running background tasks on Azure Functions and I suffer from long warmup times when scaling out to multiple instances.

@ghost

ghost commented Jul 16, 2019

@smoelker: It does at least seem to work if you point the cache pools and the profiler to alternative directories. It's not hard to try out, so you should see what problems you run into (if any).

I have a very simple application in which I ship the container with the application in /usr/share/myapp, and then point the cache and profiler to directories in /var.

NOTE though that cache:clear will not clear those, so you have to use the cache:pool commands to clear the cache pools, and rm -rf the profiler files.

@jaikdean

Naming confusion aside, having a well-defined boundary between build-time files created when “warming the cache” and files that can be written at runtime would be a massive, very welcome improvement for deploying to modern hosting stacks.

For example, when running on AWS Lambda, the filesystem is read-only. There is a writeable /tmp directory that can be used at runtime, but there's no officially supported way to deploy pre-warmed files there. Having the distinction between the two directories would make this trivial to use.

@mnapoli
Contributor

mnapoli commented Apr 21, 2020

I have opened #36515 to try to move this topic forward. Solving this would make Symfony ready for AWS Lambda and other serverless platforms (like Azure Functions as well).

mnapoli added a commit to mnapoli/symfony that referenced this issue Apr 21, 2020
mnapoli added a commit to mnapoli/symfony that referenced this issue Apr 21, 2020
@Nemo64
Contributor

Nemo64 commented Apr 21, 2020

I'm not sure if I should give input here or at #36515, but I'd argue that introducing a new "immutable cache" would be less prone to backwards compatibility issues and could be named better.

Something like getBuildDir() which returns a directory where files can be generated during cache warmup, like the container, but must not be modified during normal runtime.

Existing code relying on the cache dir will continue to work because the guarantee of getCacheDir won't change. Any existing bundle/project that relies on the cache dir must opt in to the new immutable build cache, instead of having to opt out, as would be the case if we added a temp dir and changed the cache dir definition.
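
The opt-in idea could look roughly like the sketch below. The class names and paths are hypothetical and stand in for a real kernel; only the method names `getCacheDir()` and `getBuildDir()` come from this thread. The assumption illustrated here is that the build dir defaults to the cache dir, so code that never overrides it sees no change. It assumes PHP 7.4+ for typed properties.

```php
<?php

// Hypothetical stand-in for a kernel base class; NOT Symfony's actual Kernel.
class BuildDirKernelSketch
{
    protected string $projectDir;
    protected string $environment;

    public function __construct(string $projectDir, string $environment)
    {
        $this->projectDir = $projectDir;
        $this->environment = $environment;
    }

    // Unchanged guarantee: this directory stays writable at runtime.
    public function getCacheDir(): string
    {
        return $this->projectDir.'/var/cache/'.$this->environment;
    }

    // Opt-in directory for warmup artifacts that must not change at runtime.
    // Assumed default: same as the cache dir, so existing apps are unaffected.
    public function getBuildDir(): string
    {
        return $this->getCacheDir();
    }
}

// An application that opts in: build artifacts ship with the (read-only)
// project, runtime caches go to the system temp directory.
final class ReadOnlyDeployKernelSketch extends BuildDirKernelSketch
{
    public function getBuildDir(): string
    {
        return $this->projectDir.'/var/build/'.$this->environment;
    }

    public function getCacheDir(): string
    {
        return sys_get_temp_dir().'/app-cache/'.$this->environment;
    }
}

$legacy = new BuildDirKernelSketch('/srv/app', 'prod');
$optIn = new ReadOnlyDeployKernelSketch('/srv/app', 'prod');

echo $legacy->getBuildDir(), "\n"; // same path as $legacy->getCacheDir()
echo $optIn->getBuildDir(), "\n";
```

The design choice is that nothing is forced on existing code: only projects that override the build dir take on the constraint of keeping it read-only at runtime.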

@dkarlovi
Contributor

I think the problem is mostly naming, yes. Currently it's said the container is "cached", which is not really true, it's prebuilt. Cache implies volatility which isn't compatible with immutability obviously, so having "cache" be immutable is counter-intuitive to say the least, as I've noted before.

@mnapoli
Contributor

mnapoli commented Apr 21, 2020

Those are very good points!

Introducing a getBuildDir() would make complete sense, and would be even easier to explain and keep BC.

I'm happy to follow that route.

@fabpot
Member

fabpot commented Aug 18, 2020

Closing to focus the discussion on the open PR instead.

@fabpot fabpot closed this as completed Aug 18, 2020
mnapoli added a commit to mnapoli/symfony that referenced this issue Aug 19, 2020
mnapoli added a commit to mnapoli/symfony that referenced this issue Aug 20, 2020
…che directory

Build artifacts and caches should go in the new "build directory". This directory should only be read at runtime.

The original cache directory should now be used for caches written at runtime.
fabpot added a commit that referenced this issue Aug 21, 2020
…it from the cache directory (mnapoli)

This PR was squashed before being merged into the 5.2-dev branch.

Discussion
----------

[HttpKernel] Add `$kernel->getBuildDir()` to separate it from the cache directory

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes <!-- please update src/**/CHANGELOG.md files -->
| Deprecations? | no
| Tickets       | Fix #23354
| License       | MIT
| Doc PR        | symfony/symfony-docs#... <!-- required for new features -->

In order to support deploying on read-only filesystems (e.g. AWS Lambda in my case), I have started implementing #23354.

This introduces `$kernel->getBuildDir()`:

- `$kernel->getBuildDir()`: for cache that can be warmed and deployed as read-only (compiled container, annotations, etc.)
- `$kernel->getCacheDir()`: for cache that can be written at runtime (e.g. cache pools, session, profiler, etc.)

I have probably missed some places or some behavior of Symfony that I don't know. Don't consider this PR perfect, but rather I want to help move things forward :)

TODO:

- [ ] Changelog
- [ ] Upgrade guide
- [ ] Documentation

Commits
-------

ec945f1 [HttpKernel] Add `$kernel->getBuildDir()` to separate it from the cache directory
10 participants