Get rid of the state file #10474

Closed
josb opened this Issue Dec 1, 2016 · 5 comments

josb commented Dec 1, 2016

Terraform v0.8.0-dev

https://www.terraform.io/docs/state/index.html says:

Note: Terraform currently requires the state to exist after Terraform has been run. Technically, at some point in the future, Terraform should be able to populate the local state file with the real infrastructure if the file didn't exist. But currently, Terraform state is a mixture of both a cache and required configuration and isn't optional.

I'd go even further and suggest there should be no state file at all. Terraform should just query the remote state every time it's needed. The state file going out of sync with the real infrastructure state is a major source of issues with Terraform. Only the provider knows what the state is. I realize this would make Terraform slower, but correctness is more important.

If this isn't possible, can you tell me why not, please?

Thanks for your consideration.

Member

mitchellh commented Dec 1, 2016

I'm glad you asked since I haven't had a chance to really write this down in any single place.

This isn't possible because there needs to be some database of some sort to map Terraform config <=> real world. For some providers like AWS you could theoretically use something like AWS tags (early versions of Terraform actually had no state file and did this). We quickly ran into problems: not all resources support tags.
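To make the "database" point concrete, here is a minimal Python sketch of why some mapping is needed. It is not Terraform's actual state format: the resource addresses, the IDs, and the `plan` helper are all hypothetical. The configuration knows a resource only by its address, the provider knows it only by an ID it assigned, and something has to link the two.

```python
# Hypothetical illustration only; not Terraform's real state format.
# The config names resources by address; the provider names them by ID.
config_addresses = {"aws_vpc.main", "aws_subnet.a"}

# The "state": the only place that links an address to a real-world ID.
state = {
    "aws_vpc.main": "vpc-0a1b2c",
    "aws_instance.old": "i-12345",
}

def plan(config_addresses, state):
    """Classify each address as create, keep, or destroy."""
    actions = {}
    for addr in sorted(config_addresses | set(state)):
        if addr in config_addresses and addr not in state:
            actions[addr] = "create"
        elif addr in config_addresses:
            actions[addr] = "keep"
        else:
            # Present in state but removed from the config.
            actions[addr] = "destroy"
    return actions

actions = plan(config_addresses, state)
print(actions)
```

Without the `state` dictionary there would be no way to tell that `aws_instance.old` was ever created by this configuration, so nothing would know to destroy it.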

Going forward from there, we ran into bigger issues: we encode more than just attributes in the state file. We have to encode things like depends_on so that when you delete items from a configuration we can delete them in the proper order. We can't just encode rules like "subnets before VPC" in Terraform because this also affects cross-provider resources and the complexity is effectively infinite.
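The ordering problem can be sketched as follows. This is an illustration under assumptions, not Terraform's implementation: the `stored_dependencies` data and the `destroy_order` helper are hypothetical, and the sort uses Python's standard `graphlib` module. The point is that once a resource has been removed from the configuration, the recorded dependencies are the only remaining source for a safe deletion order.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency data recorded in state for resources that have
# since been removed from the configuration. Each address maps to the
# addresses it depends on.
stored_dependencies = {
    "aws_vpc.main": [],
    "aws_subnet.a": ["aws_vpc.main"],
    "aws_instance.web": ["aws_subnet.a"],
}

def destroy_order(deps):
    """Return addresses so dependents are destroyed before their dependencies."""
    # static_order() yields dependencies first, which is creation order.
    create_order = list(TopologicalSorter(deps).static_order())
    # Destruction is the reverse of creation.
    return list(reversed(create_order))

order = destroy_order(stored_dependencies)
print(order)
```

No provider-specific rule such as "subnets before VPC" appears anywhere in the sketch; the order falls out of the recorded edges, which is why the same approach would work across providers.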

In addition to depends_on, we are going to store (in the future) information like last run time, when a resource was created, lifecycle options like prevent_destroy to avoid accidental destruction, Terraform-specific tags/annotations on resources, etc.

We need state somewhere.

You brought up sync issues. By default, Terraform refreshes the state on every plan/apply operation. This is effectively the same as if we didn't have a state file.

The pain I've found people have with state files is generally conflicts when two people modify the same file. We are certainly working to improve that, but "removing the state file" just shifts a WHOLE INCREDIBLE AMOUNT of complexity from one place to a completely new place.

Beyond that, you also brought up performance. We have customers that manage over 10,000 resources with Terraform (in a single state file). Personally, we don't recommend managing that many resources in a single state file, but Terraform can do it. They get around this with clever tooling around targeted refreshing. If Terraform synced every resource on every operation, these users just could not use Terraform. They must work under the assumption that the state is in sync most of the time, and allow errors when it's not.
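A rough Python sketch of the cost difference, assuming one remote API call per resource refreshed. The `refresh` function, the `fetch_remote` stand-in for a provider call, and the resource addresses are all hypothetical and only illustrate the idea of refreshing a targeted subset.

```python
# Hypothetical sketch; fetch_remote stands in for a provider API call.
def refresh(state, fetch_remote, targets=None):
    """Refresh the targeted addresses (all of them if targets is None).

    Returns the number of remote calls made.
    """
    addresses = list(state) if targets is None else list(targets)
    calls = 0
    for addr in addresses:
        state[addr] = fetch_remote(addr)
        calls += 1
    return calls

# 10,000 cached resources, as in the scenario described above.
state = {f"example_resource.r{i}": {"size": "small"} for i in range(10_000)}

def fetch_remote(addr):
    # Pretend the real infrastructure now reports a different size.
    return {"size": "large"}

# Targeted refresh touches one resource; a full refresh touches all of them.
targeted_calls = refresh(state, fetch_remote, targets=["example_resource.r7"])
full_calls = refresh(dict(state), fetch_remote)  # run on a copy
print(targeted_calls, full_calls)
```

After the targeted refresh, every untargeted entry in `state` still holds its cached value, which is exactly the "assume the state is in sync most of the time" trade-off.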

But I think it's important to reiterate that the state file isn't a convenient performance optimization. If anything, it is an inconvenient performance optimization, and we need it anyway to store the critical metadata described above.

We have plans to improve things though! For example, for Terraform 0.9 we actually plan to split the state into two files: tfcache and tfdata (final names TBD). The tfcache will be the attribute data that syncs, and you can openly ignore this if you want and let Terraform sync your entire state. The tfdata is critical metadata for syncing and operations that must not be deleted. This should help lower conflicts a lot and simplify management.
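As a purely illustrative Python sketch of that split: the record layout, the field names, and the `split_state` helper below are assumptions based only on the description above, not on any shipped Terraform format.

```python
# Hypothetical sketch: field names and the split itself are illustrative.
CACHE_FIELDS = {"attributes"}  # could be rebuilt by refreshing from the provider

def split_state(state):
    """Split each resource record into re-syncable cache and critical data."""
    tfcache, tfdata = {}, {}
    for addr, record in state.items():
        tfcache[addr] = {k: v for k, v in record.items() if k in CACHE_FIELDS}
        tfdata[addr] = {k: v for k, v in record.items() if k not in CACHE_FIELDS}
    return tfcache, tfdata

state = {
    "aws_subnet.a": {
        "id": "subnet-123",
        "depends_on": ["aws_vpc.main"],
        "attributes": {"cidr_block": "10.0.1.0/24"},
    }
}

tfcache, tfdata = split_state(state)
print(tfcache)
print(tfdata)
```

In this sketch, throwing away `tfcache` loses nothing that a refresh could not recover, while `tfdata` keeps the ID mapping and the dependency information that cannot be rediscovered.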

As an aside: This sort of issue reminds me of another issue. I'm not trying to degrade your viewpoint in any way and I appreciate you asking this question, but it's a pattern I see often that usually isn't the right approach: X is complicated and causes problems, please get rid of X. When something is complicated and causes problems, the implementors generally have had multiple conversations about "why do we have this? do we need it? can we get rid of it?" and have determined that it's either needed or that there is a way to improve it. State falls under this and I hope that this helps!

As I said, I'm glad you asked since I haven't had a chance to really write this down in any single place.

Another time this cropped up (the issue is still around here somewhere, closed) is when someone recommended we abandon representing infrastructure as a graph, and just try everything in parallel until it succeeds. That may be oversimplifying it but basically: retry until you don't get an error or you've retried enough times.

The graph is SUPER complicated but it enables a level of safety and understanding. I view state as a similar thing, but we've done it less well... up to this point. We're working on improving that with time though.

@mitchellh mitchellh closed this Dec 1, 2016

Author

josb commented Dec 1, 2016

Thanks for taking the time to write up this excellent explanation, @mitchellh! Would it make sense to put this in the docs under the heading "Why does Terraform have a state file?"

Member

mitchellh commented Dec 1, 2016

That's a good idea @josb. I'll type that up tonight.

mitchellh added a commit that referenced this issue Dec 3, 2016

Member

mitchellh commented Dec 3, 2016

@josb Check out: #10519

We'll publish this online as part of TF 0.8.

mitchellh added a commit that referenced this issue Dec 3, 2016

Merge pull request #10519 from hashicorp/b-state-purpose
website: document state purpose [GH-10474]

gusmat pushed a commit to gusmat/terraform that referenced this issue Dec 6, 2016

Author

josb commented Dec 9, 2016

Thanks @mitchellh, that looks great. It gives folks like me who are interested in the inner mechanics of Terraform more insight into the design decisions underlying the implementation. Very helpful.
