Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve comments and flow style when programmatically modifying meltano.yml (Ruamel) #2133

Closed
MeltyBot opened this issue Jul 17, 2020 · 2 comments · Fixed by #6308
Closed

Comments

@MeltyBot
Copy link
Contributor

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/2177

Originally created by @DouweM on 2020-07-17 16:23:10


PyYAML doesn't support this currently (yaml/pyyaml#90), but https://pypi.org/project/ruamel.yaml does.

Note:

Our usage of the Canonical class may also be a factor we need to consider in our implementation. Noted by @kgpayne below: https://gitlab.com/meltano/meltano/-/issues/2177#note_708071594

@MeltyBot
Copy link
Contributor Author

@tayloramurphy
Copy link
Collaborator

After reflecting on this issue more deeply I believe it makes sense to prioritize for several reasons.

First, it is a bad experience to have Meltano mess with the structure and formatting of YAML files too much. Many data projects have folks hand-editing YAML and there aren't a ton of projects in the data space that rely on the CLI to update YAML (maybe great expectations does a bit). Even with dbt the sources.yml, models.yml and dbt_project.yml are generally all hand-edited. So we're relatively unique in that we push the CLI on folks but we know that people will likely hand edit things as well. I've done this myself many a time. (This also makes me think of telemetry possibilities based on the hash of the YAML file).

Second, for the rest of 2022 and into 2023 we're heavily focusing on the CLI on our path to a Managed offering. With the YAML file being the core of your Meltano project, we're expecting users to look at and reference it a lot. Even in the UI we will likely expose the YAML file to users. It's an interaction point between them and the Meltano service - a key part of that relationship is not doing things that are unexpected or frustrating. If I move some keys around and then ask Meltano to do something, I wouldn't want it to undo my changes.

Third, this can hamper our version control story a bit. If Meltano is making changes to the structure I wasn't expecting, when I go to commit a change, I can be surprised by the diff. This violates quite aggressively the Principle of least astonishment.

Fourth, it does align well with our new user experience. If I'm coming from dbt in particular where I'm used to hand-editing YAML, and then I start mixing the CLI with manual edits (or comments etc.) and then Meltano just deletes those, that would be a deeply frustrating experience.

Given all that, and that this is scoped down to an 8, I'm keen to get this done. This has gone from a "nice-to-have" to something that absolutely lowers the user experience in light of our focus on the CLI.

FYI @DouweM @aaronsteers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants