Formula dependency management #12179

Open
chrismoos opened this Issue Apr 22, 2014 · 47 comments

@chrismoos
Contributor

I think there should be a standard way to manage formula dependencies. It is very natural for a state to be composed of other states, but when using third-party formulas there is no easy way to manage the dependencies, versions, etc.

A popular tool for Chef is Berkshelf. You put a file in your cookbook root (or state tree root in Salt's case) like this:

source "https://api.berkshelf.com"

metadata

cookbook "mysql"
cookbook "nginx", "~> 2.6"

There is a CLI command so you can fetch and install all of the dependencies, for example.

I propose we do the same thing for Salt's formulas:

Saltfile - This will contain the dependencies, sources, etc.
Saltfile.lock - This will contain all of the active/installed dependencies and their versions.

There will be a tool that will fetch and install the dependencies, just like the berkshelf command.

I think that having this feature is really important, especially as your state tree gets more advanced and you start pulling in third-party formulas.

Example Saltfile:

- sources:
    - https://formula.mycompany.com
    - https://formula.saltstack.org
- dependencies:
    nginx:
    redis:
        version: '>= 1.0.5'
    ntp:
        git: 'https://github.com/saltstack-formulas/ntp-formula.git'
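
As a rough illustration of how a tool might read such a file, here is a minimal Python sketch. It assumes the YAML above has already been parsed (for example by a YAML library) into the list-of-mappings structure shown. The Saltfile format is only a proposal in this thread, and `normalize_dependencies` is a hypothetical helper name, not an existing Salt API.

```python
# Hypothetical sketch: the Saltfile format is only a proposal, not a Salt feature.
# This literal mirrors what a YAML parser would return for the example above.
SALTFILE = [
    {"sources": [
        "https://formula.mycompany.com",
        "https://formula.saltstack.org",
    ]},
    {"dependencies": {
        "nginx": None,  # a bare "nginx:" entry parses to None
        "redis": {"version": ">= 1.0.5"},
        "ntp": {"git": "https://github.com/saltstack-formulas/ntp-formula.git"},
    }},
]


def normalize_dependencies(saltfile):
    """Return one dict per dependency with explicit name/version/git keys."""
    merged = {}
    for section in saltfile:  # flatten the top-level list of one-key mappings
        merged.update(section)
    normalized = []
    for name, spec in sorted(merged.get("dependencies", {}).items()):
        spec = spec or {}
        normalized.append({
            "name": name,
            "version": spec.get("version"),  # None: any version is acceptable
            "git": spec.get("git"),          # None: look the formula up in the sources
        })
    return normalized


if __name__ == "__main__":
    for dependency in normalize_dependencies(SALTFILE):
        print(dependency)
```

Normalizing every entry to the same shape up front would let the fetch/install tool treat "no constraints", "version constraint" and "explicit git source" uniformly.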
@pidah
pidah commented Apr 22, 2014

+1

@westurner
Contributor

Questions:

  • Should the metadata be stored in a separate JSON/YAML file which does not require code execution?
  • Should the metadata be stored within templated sls files?
  • How to specify/handle GitFS/HgFS branch <-> environment mappings?
  • How easy should it be to diff between forks and communicate changes? [EDIT]
  • See also: #12183 "ENH: GPG signatures, branch-environment map (GitFS/HgFS workflow)"

Python packaging tools handle dependency graphs:

Conda packages solve for this with many languages:

@basepi basepi added the Feature label Apr 23, 2014
@basepi basepi added this to the Approved milestone Apr 23, 2014
@ahambrick

👍 Would be a great help in implementing a Continuous Delivery Process.

@avimar
Contributor
avimar commented Jun 24, 2014

Just to mention: nodejs has a rather awesome npm system for uploading, managing, and using dependencies (from any source!). It even allows each dependency to have its own sub-dependencies at its own specified versions, so they won't conflict with other things using a different version of the same dependency.

@elmariofredo

+1 for a simple sub-dependency chain; something like an npmjs.org registry for formulas would also be nice.

@westurner
Contributor

Formula Dependencies

In lieu of a standard way to manage this (e.g. setup.py + pip with $VIRTUAL_ENV/src[/salt-formulas] on sys.path and/or GitFS and/or salt file_roots),
an informal README.rst heading for "Formula Dependencies" may be helpful.

e.g. https://github.com/bechtoldt/iscdhcp-formula/blob/master/README.rst#formula-dependencies :

Formula Dependencies
====================

None

Namespacing

It may be easier to prefix/postfix things with <github-username>. e.g.:

https://github.com/salt-formula/salt-formula
salt-formula-salt-formula

https://github.com/westurner/salt-formula
westurner-salt-formula
@westurner
Contributor

Python Packages

Packaging salt formulas as Python packages with setup_requires/requirements.txt dependencies:

Tools

Caveats

  • Specifying dependencies with near-complete URIs / URLs (e.g. Golang) would be great.
  • It's possible to index JSON[-LD] metadata without indexing code
  • Conda is also platform-portable/specifiable: http://conda.pydata.org/docs/spec.html

[EDIT]

@westurner westurner referenced this issue in westurner/cookiecutter-saltformula Aug 22, 2014
Closed

ENH: Add "Formula Dependencies" heading to README.rst template #2

@edword
edword commented Oct 14, 2014

+1 for some sort of berkshelf or npm like dep management

@bechtoldt
Contributor

+1

@skylerberg
Contributor

I like @westurner's plan of using Python packages. However, we also need to be able to include the installed formulas in salt easily.

I think this could be solved with a pypifs, which would be like gitfs, but for Python packages. So instead of specifying git repos, you would have a list with entries like:

  - westurner-salt-formula
  - salt-formula-apache-formula

This would handle finding the packages on your system, and would also find and include all of the dependencies based on requirements.txt.

Thus by editing your salt master's config and restarting the salt master, you could have all of your formulas and not have to worry about dependencies at all.

Finally, you should be able to specify a version just like you would when using pip manually

  - salt-formula-apache-formula==1.0.4
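
To make the idea a bit more concrete, here is a minimal sketch of how such entries might be checked against what is installed. The pypifs backend does not exist; `parse_entry`, `installed_version` and `check_entry` are hypothetical names. The only real API used is the standard library's `importlib.metadata`, and only exact `==` pins are handled.

```python
from importlib import metadata


def parse_entry(entry):
    """Split 'name==version' into (name, version); version is None if unpinned."""
    name, separator, version = entry.partition("==")
    return name.strip(), (version.strip() if separator else None)


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_entry(entry):
    """Return True if the entry is installed and matches its pin, if any."""
    name, wanted = parse_entry(entry)
    found = installed_version(name)
    if found is None:
        return False
    return wanted is None or found == wanted


if __name__ == "__main__":
    for entry in ["westurner-salt-formula", "salt-formula-apache-formula==1.0.4"]:
        print(entry, parse_entry(entry), check_entry(entry))
```

A real backend would still need to map each installed package to a directory under file_roots; that part is left out here.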
@iggy
Contributor
iggy commented Nov 10, 2014

What about something simple like using git submodules? Maybe Salt could even have some magic that adds the top-level subdirs of the submodule to the top-level path structure.

i.e.

graphite-formula---+----graphite----init.sls
                   |
                   +--- nginx-formula (submodule) --- nginx --- init.sls
                   |

and the graphite and nginx dirs get added to the top level salt dir (somehow, haven't really thought too much about that yet).

@skylerberg
Contributor

I think git submodules have several drawbacks compared to packaging.

Git commits do not hold the same semantic meaning that package releases do. For example, if you update a package with a bugfix, then you would have to go into all of the packages that depend on it and change their submodules. With versioned packages you do not need to pin such a specific revision; only the major version must match (unless you need features introduced in a minor version).

Shared dependencies would be duplicated.

I think packages could be handled more elegantly: no need to include .gitmodules, no need to initialize submodules every time you clone, etc.
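
The "same major version" rule can be sketched in a few lines of Python. This is only an illustration under the assumption of plain numeric major.minor.patch versions; `is_compatible` is a hypothetical helper, not part of pip or Salt.

```python
def parse_version(text):
    """Turn '1.0.4' into the tuple (1, 0, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(installed, minimum):
    """True if installed has the same major version and is at least minimum."""
    have, need = parse_version(installed), parse_version(minimum)
    return have[0] == need[0] and have >= need


if __name__ == "__main__":
    print(is_compatible("1.0.5", "1.0.4"))  # bugfix release of a dependency
    print(is_compatible("2.0.0", "1.0.4"))  # new major version
```

With a rule like this, a bugfix release of a dependency is picked up without touching the formulas that depend on it, which is the contrast with pinned submodule commits.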

@iggy
Contributor
iggy commented Nov 10, 2014

.gitmodules is worse than a dependencies.txt/SaltFile/whatever.yaml/etc somehow?

And there's nothing that says you can't have a script/shell alias/whatever that does the checkout -> submodule init (in place of pip/npm/etc).

And as far as having to change .gitmodules when you commit fixes, git supports branches for submodules. So maybe each formula has a branch for each upstream release (or just master if it's a fairly generic formula).


I honestly think this is a problem that doesn't need to be solved right now.

The -formulas have enough other problems that the landscape could be completely different by the time we get around to needing real dependency management.

I'm not saying having this discussion is pointless, but I don't think implementing something right now is prudent. And I think too much discussion on the topic takes something away from the real problems that the formulas have.

There is a real problem of developer bandwidth right now. Trying to shoehorn formula dependencies in right now when nobody really knows what formulas will eventually look like is a Bad Idea™

@skylerberg
Contributor

I agree that inside the formula, .gitmodules is equivalent to requirements.txt. However, I would like to see a solution where formula users do not need to have a .gitmodules and configure gitfs to point to the submodules. They should only have to change the Salt config, rather than change the Salt config and also keep other files hanging around.

Of course, having the packages and a gitfs-like way to include them is a rather large change and, as you said, there are more important problems in formulas at the moment.

When we do get to solving this problem, I just want to make sure that we do it right (whatever right ends up being).

@westurner
Contributor

Is it possible to pull a specific version with .gitmodules, or just a branch?

How do I avoid push -f'ing over a whole tree?

@jeffrey4l
Contributor

I'd like to use Python packages to manage the formulas, just like python-xstatic [1] does.

E.g. there would be a package named nginx-salt-formula which can be installed through pip or easy_install.

There are several benefits to this:

  1. Version and dependency management are easy. Just change the setup.py/requirements.txt file in the formula, and pip can resolve the dependencies.
  2. A formula may depend on some Python packages in some cases (for example, nginx-salt-formula may have its own _state or _module which requires some Python packages). This can be handled by pip.
  3. Installation is easy. Just adding the package's name to the Salt master config should be enough.

[1] https://pypi.python.org/pypi/XStatic

@UtahDave
Member

+1 for using python packages.

@iggy
Contributor
iggy commented Nov 12, 2014

Currently we have a hard enough time getting people to contribute their changes back. It's also difficult getting things merged for formulas that the couple people that can commit don't understand.

I'm worried that something along the lines of full pypi packages would make that even worse.

Not to mention the fact that formulas aren't even python code...

If you require strict formula ownership, I see the number of formulas plummeting.

Again, this is as things stand now. I think things will likely be different at some future time.

@whiteinge
Member

Very interesting discussion so far. Quick note about one remark:

the couple people that can commit

Everyone on the Contributors team should have full commit access on all formulas repos. I know a few of those have slipped through the cracks. If you notice one let me know and I'll add it under the team. On a related note, I have plans to toss a web interface up (soon as work-load permits) that will allow people on the Contributors team to create repos and fork repos into the org.

@iggy
Contributor
iggy commented Nov 13, 2014

I meant more that people are reluctant to commit changes to formulas they don't use (unless they seem like obvious changes). Making it more difficult for people to contribute at this point in time doesn't seem prudent.

FWIW, I've personally had very good response with getting my PRs committed.

@chrismoos
Contributor

I really believe that having the formulas reside in a Git repository somewhere is going to be the best. I agree with @iggy that doing pypi packages just raises the barrier to contributing. There is plenty of evidence that the model of forking a git repo to contribute has been very successful. It encourages people to make changes and to push them back upstream.

What's really needed is just a way to manage locating and fetching the formulas that you depend on. I don't think we have to say that We must use Git!, but instead be flexible with where formula dependencies can reside. Look at projects like CocoaPods, Bundler, and Berkshelf. They have some things in common like:

  • Dependencies are not limited to a specific source (i.e. you can use git, the local file system, etc.)
  • Dependencies are set forth in a file in the top level directory
  • The tool resolves dependency versioning between components

In addition, all of the aforementioned tools have been wildly successful at what they do and have really provided an easy way for people to collaborate and contribute.

CocoaPods has a central repo, kind of like Homebrew, which holds the canonical list of all packages and metadata for each version. This gives you the ability to just specify a dependency with a simple name (and an optional version specifier). The central repository approach is a good one but obviously requires maintenance and people to manage pull requests from people wanting to add their packages to the official list.

Bundler and Berkshelf also have central listings of packages, albeit a bit different than CocoaPods.

I propose the following high level idea:

  • Formulas have a file in the top-level directory containing package metadata (the package metadata file)
    • Dependencies
    • Package information
    • etc.
  • A tool will be developed to facilitate the following (at a minimum)
    • Resolve and install dependencies
    • Generate a sample package metadata file
  • There will be a Git repository created to house all of the available formulas
    • Pull requests will be sent to bring in new formulas or update existing ones
    • Each formula will be a directory in the repository that contains a version folder, and inside of that folder is the package metadata file.
  • Dependencies can be listed as:
    • A name and version; this falls back to the official Git repository to locate the package
    • A local file path where the formula is located
    • A Git repository URL and branch/commit specifiers (also, consider supporting hg as well)
    • At a later time, maybe support https + the SHA256 of the formula's tarball

Obviously there is a lot to spec out, but my 2 cents is that the above is the way to go, not pypi.
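
The four ways of listing a dependency described above could be modelled roughly like this. It is a sketch only: the `Dependency` class, its field names and `tarball_matches` are all hypothetical, since no metadata format has been agreed on in this thread.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dependency:
    """One entry from a (hypothetical) package metadata file."""
    name: str
    version: Optional[str] = None   # resolved via the central Git repository
    path: Optional[str] = None      # local directory containing the formula
    git: Optional[str] = None       # Git repository URL
    ref: Optional[str] = None       # branch or commit for a Git source
    url: Optional[str] = None       # https location of a tarball
    sha256: Optional[str] = None    # expected digest of that tarball

    def source_kind(self):
        """Decide where the formula should be fetched from."""
        if self.path:
            return "local"
        if self.git:
            return "git"
        if self.url and self.sha256:
            return "tarball"
        return "central"


def tarball_matches(data, expected_sha256):
    """Check downloaded tarball bytes against the pinned SHA256 digest."""
    return hashlib.sha256(data).hexdigest() == expected_sha256.lower()


if __name__ == "__main__":
    print(Dependency("nginx", version="1.0.4").source_kind())
    print(Dependency("ntp", git="https://github.com/saltstack-formulas/ntp-formula.git",
                     ref="master").source_kind())
```

Explicit sources (path, git, tarball) take precedence here, and a bare name with an optional version falls through to the central listing; that ordering is a design assumption, not something settled above.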

@TheCatPlusPlus

setuptools along with pip/peep already allows for everything listed above, Salt really doesn't have to reinvent the wheel (heh). And you don't necessarily have to make people upload anything to PyPI: just make a custom index that generates package entries from the currently existing GitHub organisation.

@iggy
Contributor
iggy commented Nov 13, 2014

An instance came up just yesterday with dependencies in a formula that I help maintain. To my knowledge it's the first formula to specifically list any dependencies. This particular instance is the aptly formula and it says it depends on the nginx formula.

The thing is, it doesn't depend on the nginx formula. It depends on a salt state module called nginx (we have our own nginx state module rather than using the formula).

So if this had been codified already and the aptly formula had a hard dependency on the nginx-formula, we wouldn't have used it (or more likely I would have cloned the code, gutted the nginx dependency, fixed all the issues it had, and never bothered to contribute my fixes back upstream).

Just a use-case to keep in mind.

P.S. I'm still not sold on pypi being open to a bunch of packages that are going to contain virtually no python code. All over the pypi site it specifically says it's for python packages, modules, and apps. Not a bunch of yaml code with some jinja sprinkled in.

@westurner
Contributor

Python Packages

I put these notes together about python packages, in general: https://westurner.github.io/dotfiles/tools.html#python-packages

Examples of including package_data in Python packages:

You can generate MANIFEST.in from the repository manifest:

git ls-files | sed 's/\(.*\)/include \1/g' > MANIFEST.in

You can add commands to setup.py:

# setup.py
import subprocess

# setuptools also supplies distutils on Python 3.12+, where the stdlib copy was removed
import setuptools
from distutils.command.build import build as DistutilsBuildCommand


def generate_manifest_in_from_hg():
    """Generate MANIFEST.in from 'hg manifest'"""
    print("generating MANIFEST.in from 'hg manifest'")
    cmd = r'''hg manifest | sed 's/\(.*\)/include \1/g' > MANIFEST.in'''
    return subprocess.call(cmd, shell=True)


def generate_manifest_in_from_git():
    """Generate MANIFEST.in from 'git ls-files'"""
    cmd = r'''git ls-files | sed 's/\(.*\)/include \1/g' > MANIFEST.in'''
    return subprocess.call(cmd, shell=True)


class RunCommand(setuptools.Command):
    """Base class for simple commands that take no options"""
    user_options = []
    description = "<TODO>"

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        print(self.__class__.__name__)


class GitManifestCommand(RunCommand):
    """Generate MANIFEST.in from $(git ls-files)"""
    description = __doc__

    def run(self):
        generate_manifest_in_from_git()


# ...
class DotfilesBuildCommand(DistutilsBuildCommand):
    """re-generate MANIFEST.in and build"""
    description = (
        "update MANIFEST.in AND " + DistutilsBuildCommand.description)

    def run(self):
        generate_manifest_in_from_git()
        DistutilsBuildCommand.run(self)


setuptools.setup(
    # name, version, packages, etc. go here
    cmdclass={
        'git_manifest': GitManifestCommand,
        'build': DotfilesBuildCommand,
    }
)

And then

python setup.py git_manifest
python setup.py build   # calls generate_manifest_in_from_git() before building

Hosting salt formula python packages

How is git insufficient?

  • Packaging is an important stopgap in any release process.
  • Controlling committer access is as difficult as controlling the
    release process
  • See TODO about signing commits.
    (#12183
    ENH: GPG signatures, branch-environment map (GitFS/HgFS workflow))

Version Strings

  • Ideally, package version strings could contain a http://semver.org
    version number and a git commit id.
    E.g. major.minor.patch-<gitcommitid>, in order to make
    diff-ing easy.
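
A minimal sketch of parsing such a version string follows. The regular expression and `parse_formula_version` are assumptions made for illustration. Note that semver.org itself uses `-` for pre-release identifiers and `+` for build metadata, so a stricter scheme might write major.minor.patch+<gitcommitid> instead.

```python
import re

# major.minor.patch, optionally followed by "-<gitcommitid>" (7 to 40 hex digits)
VERSION_RE = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:-(?P<commit>[0-9a-f]{7,40}))?$"
)


def parse_formula_version(text):
    """Return ((major, minor, patch), commit_id_or_None) for a version string."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError("not a major.minor.patch[-commit] version: %r" % text)
    numbers = tuple(int(match.group(key)) for key in ("major", "minor", "patch"))
    return numbers, match.group("commit")


if __name__ == "__main__":
    print(parse_formula_version("1.0.4-3f2a9c1"))
    print(parse_formula_version("1.0.4"))
```

Keeping the commit id separate from the numeric part means ordering can be done on the numbers alone, while the commit id is available for a `git diff` between two releases.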

Testing

Should there be a requirement that each formula can be tested with a
standard interface and/or convention?

Something like ./tests/__init__.py in each formula?

Or should that just be functionality provided in salt core?

Documentation in the README?

@westurner
Contributor

For merging each salt-formula into one major (git) repository (as I think @chrismoos is describing):

I don't know how to do this with hg; though I'm sure there's a way. The immutability of hg has always been a selling point for me.

@jeffrey4l
Contributor

There is another big issue in the salt-formula repositories: the current formulas have little version management. That makes them risky, or even unusable, in production environments, because a formula may change and cause problems when there is only a single master branch.

I think this is a big issue which will block the re-use of formulas.

@samos123
samos123 commented Dec 9, 2014

+1 for using Python packages

@westurner westurner referenced this issue in westurner/winswitch-formula Jan 8, 2015
Merged

don't use minus in module names #1

@bechtoldt
Contributor

I'm thinking about extending https://github.com/bechtoldt/vcs-gather with dependency resolution support for SaltStack formulas and Puppet modules. The metadata.json file from Puppet (https://docs.puppetlabs.com/puppet/latest/reference/modules_publishing.html#write-a-metadatajson-file) could be acceptable for it.

@westurner
Contributor

@bechtoldt https://github.com/westurner/pyrpo (pyrpo -s . -r sh) and/or pypi:vcs and/or https://github.com/conda/conda/tree/master/conda (http://conda.pydata.org/docs/#requirements (pycosat) may be useful).

Conda packages have a meta.yaml file. https://github.com/conda/conda-recipes/blob/master/requests/meta.yaml

Python packages have a pydist.json (PEP 426)

@bechtoldt bechtoldt referenced this issue in bechtoldt/formula-docs Jun 21, 2015
Open

Provide a clean way to manage Salt environments (Git repos) #2

1 of 8 tasks complete
@DanyC97
DanyC97 commented Jun 29, 2015

Very useful info. Is there any traction on this for the next Salt release?
I'm asking because I'm at the point where I want to move from states (where I have parent-child/inheritance relationships) to a formula-based layout, but after seeing this topic I'm worried I'll bump into a bigger problem.

@westurner
Contributor

Salt Formulas work great without automated dependency resolution (formula
dependency management).

Here's one way to do Salt Formulas in separate repos + GitFS:

https://github.com/saltstack-formulas/salt-formula/blob/master/salt/formulas.sls

@bechtoldt
Contributor

I'm going to implement bechtoldt/GatherGit#3 in a few weeks which will address the ideas of this issue. If you have any further comments, let me know.

@westurner
Contributor

That would be cool.

For test cases, you might have a look at some of the:

And a start at a test framework for salt formulas:

@bechtoldt
Contributor

@westurner salt formula testing is a completely different topic, I'll cover that in bechtoldt/formula-docs#4 :)

@westurner
Contributor

@bechtoldt Some tests are probably appropriate? (e.g. 'compiles' w/o syntax error, [...])

Should this/these metadata/test skeletons be standard functionality of e.g. salt.formulas or copied into every formula?

An example metadata file in https://github.com/westurner/cookiecutter-saltformula could be helpful.

@bechtoldt
Contributor

ouh, you mean testing the metadata itself? of course, this will be important.

@westurner
Contributor

where/how do I call e.g. check_formula_metadata('./path'),
check_formula_'importable'('name')?
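
For the sake of discussion, a `check_formula_metadata` helper might look something like the sketch below. Everything here is an assumption: the file name `metadata.json` (borrowed from the Puppet convention mentioned earlier), the required keys, and the return format. No such function exists in Salt.

```python
import json
import os

# Hypothetical minimum set of keys a formula metadata file would need.
REQUIRED_KEYS = ("name", "version", "dependencies")


def check_formula_metadata(path):
    """Return a list of problems found in <path>/metadata.json (empty if none)."""
    metadata_file = os.path.join(path, "metadata.json")
    if not os.path.isfile(metadata_file):
        return ["missing metadata.json"]
    try:
        with open(metadata_file) as handle:
            metadata = json.load(handle)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return ["invalid JSON: %s" % exc]
    if not isinstance(metadata, dict):
        return ["metadata must be a JSON object"]
    return ["missing key: %s" % key for key in REQUIRED_KEYS if key not in metadata]


if __name__ == "__main__":
    print(check_formula_metadata("./path"))
```

Returning a list of problems rather than raising on the first one would let a test runner report everything wrong with a formula in a single pass.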


@bechtoldt
Contributor

👍

@westurner westurner referenced this issue in westurner/provis Sep 7, 2015
Open

BLD: GitFS configuration #4

@bechtoldt
Contributor

The Salt Package Manager might be a solution for this issue in the future. I think it's still in a very early state. I'll file some feature requests.. :)

#24896 (PR)
#25210
#25211

https://docs.saltstack.com/en/develop/topics/spm/

@rallytime
Contributor

Good call @bechtoldt. Any additional thoughts about this ^^ @techhat?

@activars

Here are my thoughts. We need dependency management in order to:

  1. Easily reproduce the formulas used in a code base
  2. Explicitly understand each formula's original reference location
  3. Compare or track down changes in formulas
  4. Collaborate on both formula module development and large, complex formula deployments

There are many ways to implement this; the simplest is to start with git, like the Golang community's Godep. This has some great advantages:

  • Everything is versioned and hosted (privately or publicly), so there is no need to worry about storage location
  • Dependency management aims to produce a working package (a combination of formulas); using Godep's approach would help us get there first (a quick win)
  • Introducing a "Saltfile"-like specification would be a second challenge (medium and long term)
@activars

btw, spm looks great.

@themalkolm
Contributor

Are there any plans to have dependencies in spm?
