
Consider storing ddev db on persistent (local) storage #208

Closed
rfay opened this issue May 7, 2017 · 12 comments

@rfay
Member

rfay commented May 7, 2017

What happened (or feature request):

Feature request: Store ddev db on persistent storage

So far I've destroyed a lot lot lot of sites. I've imported databases and imported files, and things work OK. But then I destroy them, and they aren't there when I want them.

My workflow in the past has always been to have many of my sites in dev form on my Mac workstation, always easily accessible. Even if I destroy a site the database is still in my mysql db. So I haven't been able to wean myself from that so far, because I know that at any moment I might destroy all my ddev containers.

Could we consider moving to persistent storage to avoid this?

I was reminded of this by Ten Things to Avoid in Docker Containers. NUMBER 1 is "Don’t store data in containers". We know that's going to be an issue in other products... but lo and behold I think it's an issue in ddev.


@rfay rfay changed the title Consider storing ddev db/files on persistent (local) storage Consider storing ddev db on persistent (local) storage May 8, 2017
@tannerjfco
Contributor

I think part of the motivation behind providing the "stop" command is to help alleviate this issue and allow shutting down the environment without data loss, but I do agree we should make it possible to retain the mysql data even if the containers are removed. This would likely be more in line with user expectations, and I would hate for people to lose data due to misunderstanding what rm currently does.

Making this change does have some important considerations however:

  • Performance - transitioning mysql data to a volume mount may severely degrade performance. Future docker FS cache options may help alleviate this, but this will require some benchmarks to determine which option will best suit us - we may also have to decide whether to favor reads or writes in choosing a method. Alternatively, we could consider rsync to another location to get a local copy without a performance hit on the site. This approach may be viable since we're not dealing with 2-way sync between host and container.
  • Clean state - If we support persisting the mysql data, we need to provide a way for users to hit the reset button and get a clean mysql instance. This might be a case easily resolved with documentation to delete the mount directory.
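The one-way copy idea above can be sketched in a few lines, here using `docker cp` as a simpler stand-in for rsync. The container name, project name, and backup path are illustrative assumptions, not ddev's actual naming:

```shell
# Hypothetical sketch: snapshot the MySQL datadir out of the running db
# container onto the host. Container name and paths are assumptions.
CONTAINER=ddev-myproject-db
BACKUP_DIR="$HOME/.ddev/myproject/mysql-backup"

mkdir -p "$BACKUP_DIR"

# One-way copy: no bind mount, so no steady-state performance hit.
# Guarded so the sketch is a no-op where docker isn't available.
if command -v docker >/dev/null 2>&1; then
  docker cp "$CONTAINER":/var/lib/mysql "$BACKUP_DIR"
fi
```

Because the copy is one-way, there is no host/container sync to keep consistent; the tradeoff is that the backup is only as fresh as the last copy.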

In the meantime, we could probably add confirmation to ddev rm and warn of its destructive nature.

@rickmanelius
Contributor

A few thoughts:

  • Approximately two years ago, Kevin was working on a solution with Docker that created almost a "black box" approach where the container was loading the files, code, AND db from the host machine. I don't think it was so much a performance issue as a stability issue, where the local files might get corrupted if the mount dropped mid-flight.
  • The problem we're trying to solve is a user's habits and expectations. I know that the practice @rfay is referring to is one that I'm a bit hard-wired into myself. I would typically get the entire files + code + db working, but let the DB persist the longest, with the ability to simply nuke the files/code at a moment's notice. And if the files got out of sync, I typically didn't care, because local was meant to be a sandbox. Perhaps there is a different way to solve it. What if we prompted a user before a ddev rm to (by default) save a drush archive? That still wouldn't address the use case where someone decides to delete the docker containers directly, bypassing ddev, but it does give some guardrails.

Sounds like we need further investigation on LOE, OS compatibility, and performance before this can become prioritized and actionable.

@rickmanelius
Contributor

Additional thoughts:

  • If the idea is that a user could reconnect with a database that was already imported, we would need to work through the scenarios of how a user would re-attach. For example, during ddev config, would we scan the file mount to determine which databases are available and then run the additional functionality contained within the import-db command to re-establish a settings.php for Drupal? Or do we provide a flag on import-db to specify an existing database location?
  • Do we provide users with the option (via a config) to flag whether they want to use the container versus file mount?

@rickmanelius rickmanelius added this to the v0.8 milestone Jun 13, 2017
@rickmanelius
Contributor

Since we're in a research phase, we should get a sense of the performance impact and LOE to implement.

@rfay
Member Author

rfay commented Jun 14, 2017

Here are the results of some of Rick's favorite testing: D8 install:

Mount type                 D8 web install   D8 drush si -y
host-mounted (not cached)  2:45             1:13
host-mounted cached        2:45             1:14
Internal                   2:40             1:10

After install, the general feel of the site was just fine with the host mount. I'm quite surprised that there was this little degradation (if there is any at all; it's hard to measure in this case).

All that's required to change this is:

  • minor change to remove VOLUME from the mysql container's dockerfile
  • ddev should create the directory for mysql (in .ddev I'd assume)
  • ddev adds the mysql mount to the docker-compose template.
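The docker-compose change in the last bullet might look roughly like the following fragment. The service name, image name, and the `.ddev/mysql` host path are illustrative assumptions based on this thread, not ddev's actual template:

```yaml
# Hypothetical fragment of the db service in ddev's compose template,
# bind-mounting the MySQL datadir into the project's .ddev directory.
services:
  db:
    image: some/mysql-image:latest        # illustrative image name
    volumes:
      - ./.ddev/mysql:/var/lib/mysql      # datadir persisted on the host
```

With the `VOLUME` directive removed from the Dockerfile, this bind mount becomes the single place the data lives, and `ddev rm` no longer implies data loss.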

@rfay rfay self-assigned this Jun 15, 2017
@rickmanelius
Contributor

Hi @rfay. Thanks for completing the performance tests and detailing out the specific requirements. I believe I'm tracking, but think it's important to reflect back what you're saying to ensure I don't have any gaps.

  • Performance: based on web and CLI tests, it looks like there is a slight performance penalty of up to 5 seconds (representing 3% to 6% of the total time) by mounting the mysql directory on the host machine.
  • Requirements: given that we're only talking about the mysql files/storage, this adds no new requirements on the host machine (e.g. this doesn't require a Linux, macOS, or Windows user to install mysql on the host machine if it's not already present).
  • Directory: You mention that it would make sense to store in the .ddev directory, but it wasn't clear if you were proposing the local project directory OR the global .ddev directory. If the goal is persistent storage that an end-user wouldn't accidentally delete, it would probably make sense to use the global ~/.ddev directory. Of course, we could just start with the app-specific .ddev directory because it's more likely to get cleaned out; if a user goes beyond ddev rm and wants to wipe out everything, they won't have lingering databases hanging around.
  • Workflow: We still need to nail down when this is removed. Anyone running a ddev import-db would expect to lose the previous DB, but what about ddev rm? We already explicitly warn the user about the containers going. However, there is likely a scenario where someone wants to nuke the containers but leave the DB. And if the user leaves the DB on persistent storage with the containers gone, it would seem reasonable when running ddev start to essentially look for that DB and generate the settings.php and wp-config.php files to point to that persistent DB. Just things that we need to nail down, at least for a first pass so that this persistence is actually useful and it doesn't get wiped out anyway!

@rickmanelius
Contributor

Two next actions before I think we can make a decision on this:

  • Get the workflow item resolved.
  • Sanity check with another member of the dev team.

@rfay
Member Author

rfay commented Jun 15, 2017

It's pretty easy to do an experimental PR for this if that will help sort things.

@rickmanelius
Contributor

Reviewing. Providing some resources for those who are following along: https://docs.docker.com/engine/tutorials/dockervolumes/.

@rickmanelius
Contributor

After a quick scan of the Docker documentation, I'm actually biasing towards the Docker Volumes approach. A couple key points:

  • Fat Finger Protection. One of my concerns with storing a database in ~/.ddev or in project/.ddev is that if someone says something like "oh, just delete your .ddev folder" to fix another issue, the user could inadvertently lose their data. It's going to happen. It's only a matter of time!
  • Container Independence. Initially, I thought volumes might be lost when a container was discarded. It seems like there are protections in place to prevent that. Notably, you need to use a completely different command to add/remove volumes, and the default behavior of pruning volumes is to leave any that are actively in use by a container.
  • Business Logic. We get functionality for free. We no longer have to check if containers are still using that data.
  • Parity with non-local Solutions. We are likely going to need to use a similar approach in our hosting offerings. It would seem useful to adopt similar solutions when possible.
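For contrast with the bind-mount approach, a named-volume version of the compose file might look roughly like this. The service and volume names are illustrative assumptions, not ddev's actual configuration:

```yaml
# Hypothetical sketch of the Docker named-volume approach. The volume
# lives in Docker's own storage and survives container removal.
services:
  db:
    volumes:
      - mysql-data:/var/lib/mysql   # named volume, not a host path

volumes:
  mysql-data: {}
```

Destroying the data then requires a deliberate, separate step such as `docker volume rm <volume>` (or `docker-compose down -v`), which is exactly the fat-finger protection described above: deleting a .ddev folder can't take the database with it.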

@rickmanelius
Contributor

rickmanelius commented Jun 20, 2017

I realize we're bouncing around the conversation between multiple PRs and the issue summary. Regardless of the approach, we still need to figure out the fate of the ddev rm command (see conversation here) as well as provide a way within ddev to remove the data.

Proposed next actions:

  • Need to resolve ddev rm and provide a way to purge the data (potentially with a new command).
  • Thoroughly test the workflow to make sure it makes sense.

Regardless, unless there is a major reason to adopt the non-Docker approach, I think we gained a lot from both PRs, but we should go in the Docker volumes direction.

@rfay
Member Author

rfay commented Jun 25, 2017

Not done yet, done & redone and reverted, still needs work.
