Define workflow for new versions of already existing packages #101

Closed
memsharded opened this issue Jan 18, 2016 · 22 comments

Comments

@memsharded
Member

Let's suppose that I publish MyLib/0.1@memsharded/stable. But I made a mistake in my packages, and the package doesn't link in some cases. I want to publish a new package, what should I do?

  • Remove the old packages, overwrite with the new ones
  • Create a new channel, like stable-1, stable-2, and publish there the new packages

Maybe it is just a matter of clearly defining a workflow, and no new features of the tool are needed, but at least a recommended way to solve the issue should be defined.

@bjoernpollex

On the client-side it might be useful to explicitly exclude certain versions from a version range. So I may be requiring version 1.X.Y of a given dependency, but I know that 1.3.2 has a bug that breaks my build.
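A minimal sketch of what such an exclusion could look like, assuming plain `major.minor.patch` version strings. The helper names `parse_version` and `pick_version` are hypothetical and are not part of Conan:

```python
from typing import Iterable, Optional, Set, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Split a 'major.minor.patch' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def pick_version(available: Iterable[str], major: int, excluded: Set[str]) -> Optional[str]:
    """Return the highest available version with the given major number
    that is not explicitly excluded, or None if nothing matches."""
    candidates = [
        version
        for version in available
        if parse_version(version)[0] == major and version not in excluded
    ]
    return max(candidates, key=parse_version, default=None)
```

With `available = ["1.2.0", "1.3.1", "1.3.2", "2.0.0"]`, requiring major version 1 and excluding `1.3.2` would select `1.3.1`.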

@TyRoXx
Contributor

TyRoXx commented Jan 26, 2016

Conan should keep a history of all uploads on the server. The conan.io web site should also display this history.
Maybe it would be easier if Conan would just pull changes from Git directly when notified by a commit hook. You would get a version history for free.

@TyRoXx
Contributor

TyRoXx commented Feb 2, 2016

TLDR: It seems that the only option is to use the channel as a tag if you don't want to change the version.

I have two different workflows for two different kinds of conanfiles.

When I am the author and packager at the same time, I increase the minor version of the library to get a new, unique export reference.

When I am only the packager - for example for WebSocket++ - I am not sure what to do. The library already has a version and I use that exact version for Conan so that users of my export know what they get. Conan seems to have "stable references" that look like bzip2/1.0.6@lasote/stable:d798d7a029717444eae5633c08050939fe92cb8c, but these seem to include the settings already.

@memsharded
Member Author

Some other user also suggested that conan automatically check upstream for changes in packages (it could be done with package sha-signatures, which are already embedded). I think it is doable and it could help with this issue: if a package is modified upstream, it will be updated downstream. Probably the UX has to be considered: should it be opt-in or opt-out?

@thiagocrepaldi

👍 @memsharded I like that idea. In fact, it might also be related to #15. I've been talking to Diego about this.

Conan is really good for referencing final releases. But there is room to improve it for the development cycle of libraries/applications. Below is my idea regarding a possible workflow and how the feature above might help.

Assume we have a development of an app (app1) in progress that depends on another development library (lib1). When app1 is installed locally, it will have the latest version of lib1. However, after some time, a new 'development' version of lib1 might be submitted into VCS (e.g. GIT). In that case, Continuous Integration would have the opportunity to generate a new 'development' package and upload it to conan server. A second 'lib1' is available remotely, but locally we don't know that - and never will unless we remove it locally and reinstall it.
It would be great if, the next time we executed app1's 'conan install', we were warned about the new package for lib1 and given the chance to install it in our system.

That approach has two benefits: 1) Even during the development phase we would save time by not recompiling our dependencies from scratch all the time. 2) We would no longer keep using an old version of a dependency for a long time before realizing that a new development version is out.

Another possibility is to implement a GIT hook as suggested by @TyRoXx instead of using Continuous Integration to create the precompiled package. One disadvantage is that we would depend too much on the VCS. If GIT dies tomorrow or breaks backward compatibility with something, Conan would require refactoring to accommodate the new VCS or the new behavior.

@memsharded
Member Author

@thiagocrepaldi If git dies tomorrow, I think most of us would die too... XDDD

About keeping the whole history of the same package, let's say MyLib/0.1@myuser/testing, it is not clear to me. It doesn't make sense to keep a whole history if you cannot reference its entries to use them in your builds. So the commit or some other info would be necessary. If such a commit is added as an option to the package, then you get it, as it would generate a different sha, and thus a different package for each commit.

That approach does not exclude the synchronization approach. If a package with exactly the same signature is continuously overwritten, then consumers of that package should be notified and be able to upgrade automatically, without having to manually conan remove the packages to retrieve fresh ones. I will propose an implementation of this for the 0.8 release, as this feature seems necessary and clearly defined.

@thiagocrepaldi

@memsharded Today I was talking to the guys in our company about keeping, optionally, git's sha1 inside the precompiled packages. That is a great feature, indeed.

However, it would be even more useful if we kept, somehow inside the precompiled package, not only the sha1 of the (toplevel) project, but also the sha1 of all dependent packages. With this information, conan would be able to retrieve from the server an exact snapshot of all the source-code/precompiled packages. That is particularly useful for hot-fix scenarios. We don't want to use the latest versions of all packages, to reduce the risk of adding too many changes. A small change in a full environment would suffice.

Is the feature you are proposing taking the second paragraph into consideration (storing the sha1 of all packages in the package), or should conan stick to a simpler implementation and keep only the parent's sha1?

It is important to note that if we store the sha1 of packages, we should have a way to get a package based on the sha1, not only on the package version+name+user+channel. Do you agree?

@memsharded
Member Author

Hi everybody,

So far, we have implemented the synchronization of overwritten packages, so if you make a mistake and want to re-upload a package with exactly the same version, user, channel, you can (it was already possible before). But now, the consumer is notified that their dependency is outdated, and can update it from upstream (it is opt-in; nobody wants a dependency to be automatically deleted and re-installed without control). It has been released in 0.8 and so far it is working fine.
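As a rough illustration of the idea only (not of how Conan implements it), an outdated check could compare a digest recorded for each locally installed package against the digest currently published upstream. The function name and the dictionary layout below are assumptions made for this sketch:

```python
from typing import Dict, List


def outdated_packages(local: Dict[str, str], remote: Dict[str, str]) -> List[str]:
    """Return the references whose local digest differs from the remote one.

    Both arguments map a package reference to a content digest.
    References that are missing upstream are ignored, since there is
    nothing to update them from.
    """
    return sorted(
        reference
        for reference, digest in local.items()
        if reference in remote and remote[reference] != digest
    )
```

Keeping the check as a separate, explicit step matches the opt-in behavior described above: nothing is replaced until the consumer asks for it.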

Now, we are onto these other issues. I have specifically created a simple example (https://github.com/memsharded/conan-hello-embed) in which I use the git repository commits in the channel field, so I can get a new conan package for each commit very easily. I have used a small python script, which should be simple to understand: https://github.com/memsharded/conan-hello-embed/blob/master/build.py. This script could be very easily used in CI to automatically generate package recipes, package binaries, and upload them to a conan server.

If I understood @thiagocrepaldi correctly, conan already stores in each package the exact reference of its dependencies, so it is fully deterministic (you can check the conaninfo.txt file inside packages). If you are using the repo git sha as the channel, as proposed in my example, then your "requirements" will contain that specific sha, which helps to track the origin of the sources.

Note that with my example, a full reference is something like:

[full_requires]   Hello/0.1@memsharded/c0a877dde08930a751c277b1a11ec7453802a62f:63da998e3642b50bee33f4449826b2d623661505

That is, there are two shas: the first one is the channel, corresponding to the package recipe that packages such commit, and the second one is the package binary ID, that is the sha of all settings that differentiate this particular binary (e.g. Visual 14, Debug, 64, MDd) from other binaries.
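To make the structure of such a reference concrete, here is a small parsing sketch. The regular expression and the `Reference` type are illustrative assumptions based only on the `name/version@user/channel:package_id` shape shown above; this is not Conan's own parser:

```python
import re
from typing import NamedTuple, Optional


class Reference(NamedTuple):
    name: str
    version: str
    user: str
    channel: str               # in this example, the git commit sha
    package_id: Optional[str]  # sha of the settings for one binary, if present


_REFERENCE = re.compile(
    r"^(?P<name>[^/@:]+)/(?P<version>[^/@:]+)"
    r"@(?P<user>[^/@:]+)/(?P<channel>[^/@:]+)"
    r"(?::(?P<package_id>[0-9a-f]+))?$"
)


def parse_reference(text: str) -> Reference:
    """Split 'name/version@user/channel[:package_id]' into its fields."""
    match = _REFERENCE.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid reference: {text!r}")
    return Reference(**match.groupdict())
```

Parsing the reference above would give `Hello`, `0.1`, `memsharded`, the commit sha as channel, and the binary ID as `package_id`; a reference without the trailing `:...` part yields `package_id=None`.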

Please note that you can also change the package version number, and have it named "dev" or any other thing.

I'd love to have feedback on this approach, if it could serve your needs, etc. This is notwithstanding other issues and ongoing work, specifically I am considering too the "scopes" for dev-production, and allowing some logic on the versions (latest, ranges, etc)

@Overdrivr

Overdrivr commented Aug 23, 2016

So far, we have implemented the synchronization of overwritten packages, so if you make a mistake and want to re-upload a package with exactly the same version, user, channel, you can (it was already possible before).

Personally, I am not a big fan at all of being able to overwrite an existing package version.

If you messed up a package version that doesn't build, then just push a new version fixing it and increment patch number (major.minor.patch).

Imagine the case where a user account on conan.io is compromised, and existing versions of the user's package are overwritten by broken ones. This will break all other modules of all other packages that depend on it. (Push the scenario further by imagining that the compromised user's packages are overwritten by packages with backdoors, malware, etc.)

On the other hand, if overwriting is not possible, then the attacker will only be able to push a new version. Users that have hardcoded (and really, if you are a bit serious about deps management, you should) dependencies versions will be fine.

Another, more likely scenario: you overwrite by mistake the latest package version with a new, API-breaking one. Same bad outcome, builds breaking all over the place, even with hardcoded deps.

@memsharded
Member Author

Hi @Overdrivr!

I'd also like this behavior, but there are reasons to allow overwriting; let me explain:

  • Sometimes it happens that you upload a package for a version, let's say Boost/1.60.0, and your package is broken, because you forgot to package libs in the package() method. You certainly don't want to create a Boost/1.60.1 version, as it would diverge from the real version of the library you are packaging, and that would be extremely confusing.
  • You might use the channel as a patch version: stable.v1, stable.v2, etc, but it seems a bit ugly, and package consumers generally don't like this approach.
  • Users that are currently creating and sharing conan packages wanted this behavior. It is important to have as many packages as possible, so package creators' feature requests are very important.
  • Actually, the current heavy usage of conan is not in the conan.io server, but using the in-house conan_server. Any corporate user concerned about security basically will not trust conan.io (or any other remote), so they create and host their own packages in-house. Getting existing recipes from conan.io or github and creating packages, thanks to the decentralization, is very easy, so they have full control over their full dependencies trees, and they decide which policy to follow, if they overwrite packages or not.

In any case, you are right, this is an important issue that shouldn't be overlooked. Overwriting is still necessary to avoid creating a nightmare of minor versions/patches that do not correspond to the library version. Maybe freezing could be applied to specific channels, such as stable.*. If you want to depend on any testing or development channel, you know that they can change, but stables or releases are frozen forever.
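A sketch of how such a per-channel policy could be expressed, assuming the server simply matches the channel name against a list of frozen patterns. The pattern list and the function name are hypothetical, not an existing Conan setting:

```python
from fnmatch import fnmatchcase
from typing import Sequence

# Hypothetical policy: anything on "stable" or "stable.*" is frozen.
FROZEN_CHANNEL_PATTERNS = ("stable", "stable.*")


def can_overwrite(channel: str, frozen_patterns: Sequence[str] = FROZEN_CHANNEL_PATTERNS) -> bool:
    """Return True if an existing package on this channel may be re-uploaded."""
    return not any(fnmatchcase(channel, pattern) for pattern in frozen_patterns)
```

Under this policy an upload to `testing` could replace an existing package, while uploads to `stable` or `stable.v2` would be rejected once the package exists.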

@Overdrivr

Hi @memsharded !

Thanks for the answer. I didn't think of the use case where people package external libraries indeed.

However, I still maintain it is extremely bad to overwrite a specific version.

  • It breaks confidence you can have in a package version string
  • It can lead to extremely hard-to-debug issues
  • This is against one of the core rules of SEMVER.

What do I do if I accidentally release a backwards incompatible change as a minor version?

[...] Even under this circumstance, it is unacceptable to modify versioned releases.

I understand you want to grow the contributor-base, and early contributors may be requesting all sorts of features, but this kind of design choice will be a no-go for many other future contributors.

@mnowy

mnowy commented Aug 30, 2016

Hi guys!

I think that package update is one of the most important functionalities of Conan.
It gives "Maven SNAPSHOT" like functionality for C/C++ developers.

This functionality should be much better documented because, apart from this discussion and the changelog entries in the documentation, there is no mention of any "conan install --update" option.

Additionally, I think it could be a good idea to show a warning message when the user has an outdated package in the local repo.

@memsharded
Member Author

I totally agree, this is clearly an undocumented feature; let's open a specific issue for this in the docs repository.

About showing the user warning messages: this was on purpose. We made the notifications for updates opt-in instead of automatic (it was automatic at the beginning). Some users are managing large projects (from 50 to >300 packages), and making conan install automatically do a remote check of dependencies makes it unnecessarily slower, even worse with a poor internet connection. So we decided to make it explicit: conan info --update checks for updates, while conan install --update actually executes the update of obsolete packages.

@NingyuShi

I agree with @Overdrivr on the strictness of a package manager: ideally, a version that is pushed to the server should be frozen, so that my build can later go back to exactly the same dependencies it was created with.

At the same time, I understand the pain of fighting over the tiny bits of the conanfile.py to get a complex package to work, and being able to overwrite a published version is very useful. So I would propose to have both: each uploaded package gets a unique id, and users can search for it, get it, and put it into their requirements if they want. At the same time, if people are doing rapid development and just want to use a version like 1.1, then they will get the latest uploaded version. There should be a server configuration for each user/channel that says whether packages there can be overwritten or not. For scenarios that require strictness, users can use the frozen package with its id; otherwise they just use some version.

@ttencate
Contributor

@Overdrivr @memsharded The way Arch Linux solves this issue is by having a separate "package version" in the version number. For instance, Boost/1.60.0-1 would be the first packaging of upstream version 1.60.0, and if the package maintainer somehow messed up, or dependencies get upgraded, or something else was changed in the packaging but not upstream, then they would simply release Boost/1.60.0-2.
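A small sketch of that convention, assuming the packaging revision is a trailing `-N` on the version string and that a version without a suffix has not been given a revision yet. The function name is made up for illustration:

```python
def bump_package_revision(version: str) -> str:
    """Increment the trailing '-N' packaging revision of a version string.

    A version without a numeric suffix is treated as not yet revisioned
    and becomes '<version>-1'.
    """
    upstream, separator, revision = version.rpartition("-")
    if separator and revision.isdigit():
        return f"{upstream}-{int(revision) + 1}"
    return f"{version}-1"
```

The upstream part of the version is never touched, so `1.60.0-2` still clearly refers to upstream release `1.60.0`.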

@memsharded
Member Author

Thanks @ttencate, that sounds like a good approach and is in line with the recent discussions in the "revision" thread: #798

@marco-m

marco-m commented Aug 13, 2017

We at work do exactly what @ttencate suggests: Foo/1.2.3-N, where N is the recipe revision number. Each time we touch the recipe but not the version of the package it points to, we just bump the revision. This works well.

Being able to override a package reference in the conan server, as is the case now, is wrong in my opinion. It is exactly the same as doing a git commit --amend and force push on the master branch: you don't do it. Public history is immutable. Just bump the revision number :-)

@kenfred

kenfred commented Sep 7, 2017

What about using a VCS like git for recipe revisions? A client could specify a hash or tag to get a particular recipe. If not specified, the most recent is chosen.

If not specified in the recipe, it would still be useful to pin exact revisions so you and your colleagues or the CI server are definitely on the same revision of package recipes. For this I would suggest a conan.lock file that works a lot like yarn.lock or the npm equivalent.

The benefit of doing it this way is leveraging existing git features, for example merging. Under @ttencate's approach, a recipe has a race condition if two people bump the revision number and try to submit a recipe. Another git benefit is that the recipes will be stored very efficiently, as just text file diffs.

I doubt the conan server would need branching or more advanced git features. But handling merges and maintaining history should be required if overwriting recipes is allowed.

I mostly agree with @marco-m and think that public recipe revisions should be mostly immutable. EXCEPT, I would like the ability to update the URL of a git clone in the source function. That way, if I migrate the location of the git repo, the recipe still works perfectly. In fact, this is one of the prime reasons I use conan over git submodules. I need to do this without changing the recipe revision.

@kenfred

kenfred commented Sep 7, 2017

Let me make another case for automatic revisioning within Conan.

I should have the confidence that a recipe (or a recipe and a .lock file) will always result in the exact same build output. That means recipes that pull from a git repo should contain a tag or hash that resolves to a single commit.

If I am co-developing packages (the subject of the development flow issue label) then the commit of a dependency I'm tracking is rapidly changing. I still want the recipe to contain the hash so I can go back to an arbitrary commit and have a fully reproducible build. Therefore, I need to continually revision and update the recipes as my development progresses. Doing this manually is painful and now you can see why I'm worried about recipe revision collisions when multiple developers are involved.

You can imagine how this will clutter the conan server. There will be many recipe revisions whose only change is an update to a repo hash in the source function. I'd prefer the conan server manage the revisions of recipes (or at least be aware of it) so it can hide all the redundant copies. Also it can efficiently store the recipes themselves since there is such minor change to the text.

@marco-m

marco-m commented Sep 7, 2017

@kenfred in my opinion there should be NO exceptions to the recipe immutability rule. Again, it is like the git master branch: you do not rewrite its history. In any case, this is somewhat tangential to this discussion, since already today you can install your own conan server and override the recipes (or am I missing something? :-)

Regarding the "clutter on the conan server", how can this be avoided? Also, if the conan server were to learn about git as you propose, in the end the conan server is not only a "repository" of recipes, it is also a "repository" of binary artifacts (the packages), so the "clutter" due to packages would still be there. Otherwise the conan server, depending on how it is tracking recipes (plain versioning or git-aware), would sometimes provide binary packages and sometimes not. Although I can understand the dev workflow problem (and in my team we want to use conan exactly for this use case), I am scared by any additional toggles or conditionals added to conan or any tool. Conditionals add bugs in code and in human understanding :-)

@kenfred

kenfred commented Sep 7, 2017

@marco-m I threw out a lot of info, so let me better organize my ideas.

Automated Revisioning

You and @ttencate suggested a convention where a person manually bumps the revision of a recipe and uploads it. It becomes a completely independent recipe where the package version is augmented with a recipe revision number. (i.e. Foo/1.2.3-N@user/channel).

The issues I see with that approach are:

  1. It is manual
  2. You have no way of managing collisions if multiple people are bumping the revision and uploading
  3. It is impractical for less-stable packages. For co-developed packages that are constantly changing, it becomes a manual task you must do upon every checkin and the chances of collisions are very high. Also your Conan server is littered with redundant recipes where the only difference is the source repo commit referenced.

My (rough) idea to address these issues:

Handle the revisioning automatically within Conan. When I export or upload to a remote, and if the recipe has the same name/version/user/channel as an existing recipe, it doesn't simply overwrite the recipe. It maintains the history of recipes. Someone using the package can get the latest by using Foo/1.2.3@user/channel (maintaining backward compatibility) or get a specific revision of the recipe by something like Foo/1.2.3@user/channel:hash or Foo/1.2.3@user/channel:rev#.

What this solves:

  1. It is automatic. Conan generates the hash or bumps the rev #.
  2. Conan can manage the race condition of multiple people uploading with some merge strategy.
  3. Even though each revision can be thought of as a distinct recipe, Conan understands that it is an evolution of a single recipe, so it can present the info in a nicer way. Rather than a flat list of every revision of every recipe, it can stack the history of a single recipe. It allows the Conan repository to organize the recipes in a multi-dimensional fashion. For example, if I do conan search Foo/1.2.3*, I don't get inundated with a huge list of packages, I get a single result that tells me how many revisions are present. I can then optionally inspect a particular revision.
  4. With the aforementioned benefits (automated, managing the race condition, and the organization of recipe revisions) it is feasible to rev the recipe rapidly as a co-developed package changes.
  5. I think you can do it in a backward-compatible way. Without specifying the revision, it picks the latest.

I compare this feature to the way git works: it maintains a history of commits and manages merge conflicts. However, it needn't actually be git under the hood. It can be simplified.
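To make the proposal a bit more tangible, here is a rough in-memory sketch. It assumes the revision identifier is simply the SHA-1 of the recipe text and that a reference without a `:revision` suffix means "latest". The `RecipeStore` class is hypothetical and leaves out the merge/race handling mentioned in point 2:

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


class RecipeStore:
    """Keeps every uploaded revision of a recipe instead of overwriting it."""

    def __init__(self) -> None:
        # reference -> revision hashes, oldest first
        self._history: Dict[str, List[str]] = defaultdict(list)
        # revision hash -> recipe text
        self._recipes: Dict[str, str] = {}

    def upload(self, reference: str, recipe_text: str) -> str:
        """Store a recipe and return its automatically generated revision."""
        revision = hashlib.sha1(recipe_text.encode("utf-8")).hexdigest()
        self._recipes[revision] = recipe_text
        history = self._history[reference]
        if not history or history[-1] != revision:
            history.append(revision)
        return revision

    def revisions(self, reference: str) -> List[str]:
        """List the known revisions of a reference, oldest first."""
        return list(self._history.get(reference, []))

    def get(self, reference: str) -> str:
        """Fetch 'name/version@user/channel' (latest) or '...:revision' (pinned)."""
        base, _, revision = reference.partition(":")
        history = self._history.get(base, [])
        if not history:
            raise KeyError(f"unknown recipe: {base}")
        if not revision:
            revision = history[-1]
        elif revision not in history:
            raise KeyError(f"unknown revision {revision} for {base}")
        return self._recipes[revision]
```

A search front end could then show one entry per reference together with `len(store.revisions(reference))`, rather than a flat list of every revision.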

Lock File

This is a suggestion to make automatically revisioned recipes more convenient and safer. If I get the latest by using Foo/1.2.3@user/channel, as described above, there is a risk that I'll be using rev N and a colleague will be using rev N+1.

So the proposed solution is to automatically generate a file called conan.lock where the recipe revision number is explicitly stated. I can check in conan.lock to my source repository so that colleagues will always be using the same revision of a recipe.

Now I have the option to explicitly call out Foo/1.2.3@user/channel:hash in conanfile.txt or, more conveniently, use Foo/1.2.3@user/channel and let the revision be marked for me automatically in conan.lock.

The lock file is also helpful (I'd say critical) for resolutions. Say Foo depends on A/1.2 and B/2.0, but A also depends on B/1.9. When you do conan install for Foo, you must manually resolve which version of B to use. The result of this resolution can be stored in the lock file to ensure that colleagues and CI servers pick the exact same resolution and their builds are the same.

NPM package locks and Yarn lock file do something very similar for precisely these reasons.
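A minimal sketch of the lock idea, assuming the lock is a JSON mapping from reference to pinned revision. The file format and the function names are invented for illustration; the thread only proposes the concept of a conan.lock:

```python
import json
from typing import Dict


def resolve_revision(reference: str, latest_revision: str, pinned: Dict[str, str]) -> str:
    """Use the pinned revision if one exists; otherwise pin the latest one."""
    return pinned.setdefault(reference, latest_revision)


def dump_lock(pinned: Dict[str, str]) -> str:
    """Serialize the pins in a stable order, so the file diffs cleanly in VCS."""
    return json.dumps(pinned, indent=2, sort_keys=True)


def load_lock(text: str) -> Dict[str, str]:
    """Read pins back from the serialized lock contents."""
    return dict(json.loads(text))
```

The first install records whatever "latest" resolved to; colleagues and CI who check out the same lock contents then get that same revision, even if a newer one has been uploaded in the meantime.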

Mutable Recipes

As you said, this is only tangential to the revisioning discussion.

I agree that the ability to change a given revision of a recipe once it is public is like mutating public git history which is BAD. However, if the source repo a recipe pulls from is migrated to another server, an immutable recipe becomes broken and useless. What's worse, every commit of any customer who depended on that recipe is now broken as well; I can't go back to a random commit of Foo and do conan install because the recipe for Bar no longer has a valid URL to clone from! If I am allowed to edit the server URL within the recipe, then I restore the recipe to working order and all the customers remain happy.

Think of it this way: for a recipe to truly be immutable, the URL it pulls from must also be constant. But we know that repos move, mirrors change, things happen. It's even encouraged in the distributed nature of git. So if such a migration happens, you have already "rewritten history". The ability to update the recipe allows me to seamlessly recover.

Therefore, I would advocate allowing recipes to be overwritten with the suggestion that it should almost never be done. When it is done, it should only be to change how the source is retrieved and have no impact on recipe options or the binary results.

The only way to make a recipe fully immutable is to snapshot the source within the recipe and avoid the clone altogether.

Thanks,
Ken

@memsharded
Member Author

This issue has been followed up in the "Package Revisions" issue, please follow conversation there: #798

ctin pushed a commit to ctin/conan that referenced this issue Jun 15, 2018