Feature request: first-class support for "namespaces" in registries #1836
Comments
Now here is the important part. Suppose that somebody else creates a GitLab organization (technically GitLab calls them "groups" instead of "organizations") at the location Now this person wants to register a new namespace in the Julia General registry that corresponds to their This is not a bad thing. This is just the same as when someone registers a package, and then later, someone else wants to register the same package name. For example, Fredrik has an excellent package with the name Same thing here. If the namespace name already exists, you must pick a different name for your namespace. So let us return to the previous example. We have someone that has created the So they have to pick a different name for their namespace. For example, they may choose to pick the name |
In the above examples, I used different names for the GitHub/GitLab organizations and the namespaces. I did this to emphasize the difference between a GitHub/GitLab organization and the namespace in the Julia General registry. They are not the same thing, and they do not need to have different names. Now, that being said, I think in the vast majority of cases, people will choose to have a namespace that is the same as (or very similar to) the name of their GitHub/GitLab/Bitbucket/etc. organization. For example, take @EricForgy's GitHub organization If he chooses |
This would be beautiful ❤️ Thanks for spelling it out like that and getting the conversation started @DilumAluthge 🙌 |
For reference, in addition to the mentioned:
There is also:
|
I was thinking this kind of layout: Normal packages:
Namespaced packages:
And e.g. the information for a namespace would be here:
|
To elaborate, one of the major advantages that this proposal has over those other issues/PRs is that this proposal does not require any other registries. Everything is registered in the Julia General registry. In contrast, most or all of those other issues/PRs require people to maintain their own Julia package registries. |
That is probably better. Looking at, just as an example, |
What about not having a separate
with
? 🤔 Edit: I'm not sure the second "letter" folder is necessary. It is unlikely an org is going to have enough packages to warrant that, but it is more future proof 🤔 |
add Namespace/Package
add @Namespace/Package => PackageSpec(namespace="Namespace", name="Package")
Nevermind. It is an error, so it is fine without Edit: Notes:
|
Why do we need namespaces when we have UUIDs? |
I think that By that same argument, why do we have package names when we have package UUIDs? For user convenience. |
Update: I spent most of today looking into this. I have some upcoming high-profile projects that can potentially highlight JuliaFinance so I'd like to register some packages, but hope to get this namespace stuff worked out first. A picture is worth a thousand words so... I can now add packages with a namespace and it will split the namespace from the package name. I did end up adding a new field At the moment, it just falls back to using the name to determine the UUID, but the idea is that if there are multiple packages with the same name, it can use the namespace, if any, to determine which one to use. Next, on the registry side, I made some minor changes. First, I moved all JuliaFinance packages currently registered with General, i.e.
to
0fd90b74-7c1f-579e-9252-02cd883047b9 = { name = "Currencies", path = "J/JuliaFinance/Currencies" }
44e31299-2c53-5a9b-9141-82aa45d7972f = { name = "DayCounts", path = "J/JuliaFinance/DayCounts" }
4f18b42c-503e-5345-9536-bb0f25fc7038 = { name = "BusinessDays", path = "J/JuliaFinance/BusinessDays" }
a33ca353-0707-5c2b-b398-646075a850cd = { name = "CurrenciesBase", path = "J/JuliaFinance/CurrenciesBase" }
It seems to be working as I hoped. Until the tooling (Registrator, TagBot, etc.) gets updated to handle namespaces, I don't mind doing this manually with PRs directly to General, but don't worry: there will not be many updates. There is certainly work to do with tooling, but that can come later, I think. For example, Registrator could check a Namespace.toml before registering new packages, but that is "nice to have", I think. This will solve one of my major problems, so I appreciate your consideration. What do you think? |
I mean, adding by name is just a key to finding the UUID. We can add user/package as another key to that same UUID without having to mess with namespaces etc. |
For what it's worth, that is pretty much all I'm doing. It still uses name to find UUID, but if there is more than one UUID it will use the namespace before giving up. I'll try to submit a PR tomorrow. It is just a few lines added. Not major surgery by any means 🙏 Edit: I think I have an idea that is in the spirit of your comment and of #1071. I'll probably still need that |
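The tie-breaking lookup described above (find the UUID by name, and consult the namespace only when the name matches more than one UUID) could be sketched roughly as follows. This is only an illustration in plain Julia: `RegisteredPackage` and `resolve_uuid` are hypothetical names, not part of Pkg, and the two `Instruments` entries borrow the UUIDs from the `[deps]` example that appears later in this thread.

```julia
using UUIDs

# Hypothetical registry row; NOT the real Pkg data structure.
struct RegisteredPackage
    uuid::UUID
    name::String
    namespace::Union{Nothing,String}   # `nothing` means "no namespace"
end

# Look up by name first; use the namespace only to break ties.
function resolve_uuid(registry::Vector{RegisteredPackage}, name::AbstractString,
                      namespace::Union{Nothing,AbstractString}=nothing)
    matches = filter(p -> p.name == name, registry)
    isempty(matches) && error("no package named $name")
    length(matches) == 1 && return first(matches).uuid
    namespace === nothing && error("the name $name is ambiguous")
    narrowed = filter(p -> p.namespace == namespace, matches)
    length(narrowed) == 1 || error("cannot resolve $namespace/$name")
    return first(narrowed).uuid
end

registry = [
    RegisteredPackage(UUID("0fd90b74-7c1f-579e-9252-02cd883047b9"), "Currencies", "JuliaFinance"),
    RegisteredPackage(UUID("55efc701-121b-4840-9663-6fa785ef03be"), "Instruments", "Finance"),
    RegisteredPackage(UUID("be17f318-eccd-4180-afc1-ec5a11373fef"), "Instruments", "Music"),
]

resolve_uuid(registry, "Currencies")             # unique name, no namespace needed
resolve_uuid(registry, "Instruments", "Music")   # namespace breaks the tie
```

With this shape, an unambiguous name keeps working exactly as it does today, and the namespace only matters when two registered packages share a name.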
What do you mean when you say "user"? |
Also, if we add "user" as another key, then we have the issue where the package name by itself is not a good name. One purpose of a namespace is that An example is the JuliaFinance package But |
I just meant that we can try harder to "pattern match" against info that is in the registry.
🤷 it is their loss really if they want to use such regular names. And from above, if it will be used in the code only by name, then what is the point of differentiating namespaces when installing? |
One of the big issues we have had in the General registry is that people want to register packages with names that in my opinion are not suitable for the registry. For example, someone might make a package for doing expectation-maximization algorithms. They might want to name their package So what does that person do? The only way for them to get to keep the name And this is how we ended up with multiple open source organizations maintaining their own registries. Examples include (but are not limited to) the BioJulia registry and the JuliaFinance registry. But there are many ecosystem problems with organizations having their own registries. Just to give one example: NewPkgEval only tests packages in General. So packages in other registries are not tested. So one of my main goals here is to give open source organizations the ability to have whatever package names they want, without polluting the "top-level namespace" of the General registry, and without having multiple registries. So in my And then later, someone else writes a package for solving problems in electricity and magnetism. They call this package |
Personally I think this has the potential to get confusing. What if there is a GitHub user named But anyway, this sort of URL pattern matching doesn't get to one of the core issues, which is that we want to register names that do not pollute the top-level namespace of the General registry. |
Actually I think it is our loss 😂 . Because as @StefanKarpinski said on Slack, if it isn't typo-squatting and it isn't offensive, then we let the package author choose the name. If the package author insists on a "bad" name, we still merge it. So it's our loss in that we end up with these "bad" package names polluting the top-level namespace of the registry. |
I don't understand these arguments. There will still be a package named
Yes, because Pkg won't just blindly add a package if the specification is ambiguous.
But how is that a loss for us as registry maintainers? It will just be a loss for the package maintainer in the sense that people will not find the package as easily. |
So that's fine for the first organization to register the name |
Last I checked, we don't actually allow two packages in the General registry to have the same name. Am I mistaken? |
Thanks @DilumAluthge and thanks @fredrikekre for thoughts and feedback 🙌 From the above, I can tell the issue is understood and hope it can be taken seriously (I haven't registered a package on General for ages because of it). Naming is important. A name that isn't appropriate in a general context could be perfectly appropriate in the context of a namespace. So I hope we are past the question of whether disambiguating is a good thing and can focus on how to do it 🙏 In my first attempt at namespacing (#1064), I went too far. The namespace actually pointed to a different registry. I can still imagine a time when we could allow registries to depend on other registries, e.g. #1072, #1791, but that is a bigger question than we are addressing with this issue. In this issue, we are trying to allow namespacing within one registry, namely General. If I understand, Stefan and Fredrik are both suggesting disambiguating along the lines of #1071. For example, we already know the repo URL which is typically of the form: so we can just parse that if we need a tie breaker on the name. For example, pkg> add OrgName/Package.jl and it can match that against the repo URL. For sure, that is possible, but what Dilum (and I) are proposing here is a little more robust, while at the same time, not requiring much more effort. As a concrete example, the next two packages I would want to register on General are:
I imagine getting pushback on both names because they are too generic for General, yet they are perfectly reasonable for JuliaFinance (and I really do not want to preface the package names with "Financial***"). Fredrik, you make a good point. Just because a package can be disambiguated with a namespace shouldn't mean that you can't add it without the namespace if it isn't ambiguous. My draft PR accommodates both. For example, another package I might register is called XBRL.jl. There is no other name I would consider for such a package, and it is clear within the context of JuliaFinance what it is. Nevertheless, I think it is unlikely anyone outside JuliaFinance would register a package called XBRL, so we should be able to simply do:
pkg> add XBRL
even though
pkg> add JuliaFinance/XBRL
is also fine. I'm on board with this idea 👍 In this way, if there is a clash, instead of having an interactive selection, we can simply suggest something along the lines of:
Error: The name `FX` is ambiguous. Please add with either:
pkg> add Animators/FX
or
pkg> add JuliaFinance/FX
Between parsing the repo URL and adding a more robust namespace, I lean toward the namespace, though I see that either can be done with a similar (small) amount of effort. It is almost finished already 😊 I'll try to get a PR up later today and that will make it easier to pick at 🙏 |
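As a rough illustration of the suggested error text, here is a minimal sketch that builds the message from the list of namespaces containing the clashing name. `ambiguity_message` is a hypothetical helper written for this example; it is not an existing Pkg function, and the exact wording Pkg would use is not settled anywhere in this thread.

```julia
# Build the suggested error text for an ambiguous package name.
# `namespaces` lists every namespace that contains a package called `name`.
function ambiguity_message(name::AbstractString, namespaces::Vector{String})
    suggestions = ["  pkg> add $ns/$name" for ns in sort(namespaces)]
    return "The name `$name` is ambiguous. Please add with either:\n" *
           join(suggestions, "\nor\n")
end

println(ambiguity_message("FX", ["JuliaFinance", "Animators"]))
```

Sorting the namespaces just keeps the suggestions in a stable order, so the message does not depend on registration order.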
So, I tried to register a package named https://github.com/DilumAluthge/Literate.jl/commit/1f2d06300df93e113dcac5795c9a1cba90018f27 So we currently do not allow you to register a package with the same name as an existing registered package. |
So then how does pattern matching the URLs help anything? If someone has registered a package named |
But that is not correct. If someone registers a package with the name |
I just don't understand why having two packages with the same name is somehow more desirable than just having proper namespaces. |
Maybe if someone tries to register a package with the same name as an already registered package, they are required to use namespace and then it is their problem if their users pull the wrong package. They need to specify clear instructions to use namespace when adding.
I don't see these as mutually exclusive. You can have both. Best if I can get a PR up. You will see what I mean. |
Think about how many people have something like If I register my own package with the name |
There is no chance we would allow that to happen 👍 |
But if we allow multiple packages with the same name, by definition we would allow that to happen. |
This is exactly what I want. But that's not the solution that e.g. @fredrikekre is suggesting. |
Not necessarily. I will show you. Be patient 🙏 |
Alright, I'll wait for your PR! |
We can have both. I agree with both of you 👍 |
I'm not particularly interested in this subject but since I got dragged into it...
If people can't organize their own local registries with unique package names, I suspect they have bigger problems than the tooling. |
That's a great point. There isn't any reason we need to have multiple packages with the same name in local (private) registries. So I don't think we need to add support for multiple packages with the same name to LocalRegistry.jl. |
But that's why we have UUIDs; I don't think we need to add another layer of disambiguation.
I don't think there are any decisions since this has not come up before.
People can use whatever names they want -- pushback is usually just a suggestion. However, I would argue that naming something
You already get prompted in case of multiple packages.
That is a limitation/bug of Registrator.
Why does that suck? This is never seen by users.
It will be the same level of confusion IMO, there will still be multiple packages with the same name.
Why? This is an interactive REPL and you will just have to select the package you want.
If you are worried about this your instructions should be
Well, that is not the recommended approach so hopefully people don't do this. It is recommended to use a project file where both name + uuid are already specified so it will not cause any problems. |
Just to add my opinion, this feature is quite useful, but does not replace the proposition at #1791. Both propositions could be combined. This proposition #1836 would allow for a better organization of packages. Many Julia organizations could file for a NameSpace. This would solve the "package naming" issue, where, for instance, it would be silly to register "Countries.jl" from "JuliaFinance" directly into the general registry. But, all packages inside "JuliaFinance" NameSpace would still need to pass the "General" registry governance: approvals and name conventions (maybe). If this gets combined with #1791, a registry repo could be registered into the General registry as a NameSpace. Why is this useful? In one word: Governance. If, for instance, the registry repo at "JuliaFinance" gets registered into General, then we don't need to register every single package under "JuliaFinance" into the "General" registry. Also, "JuliaFinance" owners would be responsible for approvals when updating the "JuliaFinance" registry repo after the NameSpace gets approved by General repo owners. This is the value I see in the proposal at #1791. But this could be done as a second step after this issue #1836. |
Hi @felipenoris 👋 Thanks for your thoughts on the subject 🙌
I got pulled into some other things that are higher priority at the moment, but yes, the idea I have in mind combines elements of both #1791 and this issue. I am cautiously optimistic, but think having a PR to reference will aid the conversation, but it might take me a few days 🙏 |
So the idea is to use #1791 to sidestep governance issues? I don’t think that makes sense from any perspective. Technically, it feels a bit silly: do we really need yet another layer of indirection? What’s next? A registry of registries of registries? But from a governance perspective it also doesn’t make sense: if something is visible by default with a standard Julia install, then the Julia project must have governance over it. Having the technical ability to make other registries visible from General is not the issue. If we had such a feature, then the maintainers of Julia itself would have the same responsibility to users to make sure that packages in those available-by-default registries meet the same (lightweight) standards as General. So either Julia’s maintainers have enough control over those external registries to ensure it is also reasonably safe and correct — in which case why bother with a separate registry? — or we lack such control and it’s totally irresponsible to delegate a huge portion of the default namespace to external groups without being able to ensure that. In short, hard no on #1791. Namespaces, on the other hand, seem plausible since the argument that you want to have I do, however, think the focus here on how to organize the registry and file names is misplaced. Worry about that last—it doesn’t even matter from Pkg’s perspective. It’s only a problem for the tooling that manages registries. Focus instead on how these namespaces are supposed to work from a UI perspective and figure out a clear semantics for them. Before getting general buy in for semantics and UI, doing all the work to implement a PR seems premature. It is, however, a fine way to explore the UI if you’re ok with the possibility that you might need to throw away that work if there isn’t general agreement on the approach. |
Thanks for chiming in @StefanKarpinski . Your opinion obviously matters so I'm glad you're here. First comment, I think you know Felipe and I both work in finance. Finance is a highly regulated industry. If Felipe and I are talking about registries depending on other registries, rest assured, the reason is not to loosen governance. If anything, it is to have better governance, but that is not the point of this issue. This issue has a much cleaner focus, i.e. have first-class support for namespaces in registries with a focus on Like Felipe, I once thought having a separate registry would be good for JuliaFinance and having it play nice with With feedback from others (thanks @alecloudenback and @ScottPJones), I've come around and now I want JuliaFinance packages in One thing that is different now is that we have things like LocalRegistry.jl which seems to make it easier to manage local registries than last time I tried it.
Sure. So my idea starts with @DilumAluthge's idea of introducing a new When This brings up a related issue. There are already a ton of packages already registered on
In other words, I suggest that "no namespace" is a namespace and no two packages can have the same name in the same namespace. If
works today, it should always work. If I want to register a new Flux.jl on "Registering" a namespace on An added benefit of this is that it will play nicely with local private registries. Currently, we support private registries, but your local registry cannot have a package with the same name as a package registered on
and it will know to use my private registry (and we can add warnings etc. if needed for namespaces taking you out of
which could be attractive for large corporations. I would like to register JuliaFinance stuff on If we can get behind this idea of namespaces AND the rule that no two packages can have the same name in the same namespace where the default "no namespace" is considered a namespace, then that can simplify some things in Pkg and I'm happy to do the heavy lifting to get that to work since this is so important to me. Thanks again for your consideration 🙏 |
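The rule proposed here (no two packages share a name within one namespace, with "no namespace" itself counted as a namespace) can be sketched as a simple uniqueness check on `(namespace, name)` pairs. This is only an illustration under my own assumptions: `register!` and the `taken` set are hypothetical and do not correspond to any existing Registrator or Pkg API, and `"MyNamespace"` is a made-up namespace name.

```julia
# Each registered package is identified by (namespace, name).
# `nothing` stands for the default "no namespace" namespace.
function register!(taken::Set, name::String,
                   namespace::Union{Nothing,String}=nothing)
    key = (namespace, name)
    if key in taken
        where = namespace === nothing ? "the default namespace" : "namespace $namespace"
        error("$name is already registered in $where")
    end
    push!(taken, key)
    return key
end

taken = Set{Tuple{Union{Nothing,String},String}}()
register!(taken, "Flux")                  # default namespace
register!(taken, "Flux", "MyNamespace")   # allowed: a different namespace
# register!(taken, "Flux")                # would throw: duplicate in the default namespace
```

Because existing packages all live in the default namespace, a check like this would leave every current registration untouched.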
@StefanKarpinski, thanks for your reply. I respect your view and I was happy to close #1791 given that the namespace feature got better acceptance. I just would like to point out that it is a strong statement to associate my proposal with "sidestep governance". For me it is more of a delegation, which often occurs when you organize people into groups of interest or domains. We could debate and disagree on whether it is silly or technically inferior to namespacing and I get that. But all ideas we discuss are for the better of this community we care so much about, and not to damage the community. |
Perhaps I misunderstood when you wrote:
To me that sounds as though, when JuliaFinance is registered with General, General loses governance over its own namespace, ceding control over a portion of it to JuliaFinance. If registries that are registered in General are also managed by the maintainers of the Julia project as a whole, then delegating like this would be ok, since it would have no effect, but having Julia maintainers manage more registries seems like more work, not to mention complexity, to what end? If the goal is for checks on JuliaFinance packages to be more stringent than on General, then that's a viable position, but I think that would be better served by having a system where various entities can perform different kinds of checks and vouch for arbitrary subsets of a registry, via some form of trust metadata. In other words, it's an orthogonal concern to splitting up registries. |
Please 🙏 This proposal has nothing to do with governance. It is about namespaces. I would like to register packages on I did spend some time both writing code to make sure it makes sense and explaining it the best I could. I think what I describe is a clean solution to namespaces that leverage recent developments for managing local registries. If you have feedback on the proposal, that would be great and I'm happy to incorporate better ideas and I'm happy to put a PR together. What do you think? Should I drop it? Does this issue have a chance? |
Okay, how about this: Pkg.add("Coverage") Everyone has this in their |
In my proposal, no one will need to change any |
Yeah, sorry. Didn’t mean to sidetrack this, I was just addressing the part where #1791 got brought into it and then replying to responses. |
If I could sum up my proposal (TL;DR), I'd say it boils down to:
I am thinking we can point LocalRegistry.jl to |
What? Then you don't do things correctly.
We already have a way to disambiguate packages, and that is by UUID. I don't understand why we need a second way to "kinda specify" the package when we have an absolute way of doing it. In particular, what happens when I want to use the namespace |
Here, you also asked:
Dilum responded here:
You replied:
Then we danced around a bit until Stefan said:
This ☝️ is the answer to your question. Please accept this answer 🙏 I hope we can get past the question of whether namespaces are useful (they are) and start focusing on whether and, if so, how to implement them. I see two options:
Option 1
Implement #1071. This allows multiple packages with the same name to be registered in This would be ok, but it isn't obvious to me how to protect packages that were already registered that should somehow, I think, be given priority. In my example above, if I register If Option 1 is what we want, we can make that work and I'm happy to help implement that.
Option 2
Implement this issue, i.e. first-class support for namespaces. As I envision it, this can be implemented almost entirely using existing tools. In this approach, a namespace would be a full subregistry embedded into Implementing first-class support for namespaces this way comes with multiple advantages. For one, once we have proper first-class namespaces, we can introduce the reasonable rule:
This would be a non-breaking rule (since there are currently no duplicate package names in
will always and forever grab the FluxML version without the need to disambiguate because they are already registered in the default "no namespace" namespace of The rule would also lead to a significant simplification in the implementation of
Decision needed
The first thing we need to decide is:
If it is a flat out "No", then we can stop the discussion now and I'll try to find another way to proceed with JuliaFinance outside of If the answer is a "Maybe, it depends on the implementation", then I am happy to implement a PR for review myself. Then the question becomes, "Which option do we want to implement?" My preference would be to implement Option 2: first-class support for namespaces and I am happy to submit a working PR, but I'm also happy to help with Option 1 if we decide to go that way. From a user (and package maintainer) perspective, Option 2 should introduce absolutely no change for existing packages in @StefanKarpinski, at the end of the day, I think this is your call, so please consult with anyone you think should have a say on this and let me know what you think. I have some projects going on and need to make some decisions about how to proceed 🙏 Edit: Btw, on Slack, Stefan mentioned the possibility of introducing this as an experimental feature, which would be totally fine with me. More than fine. It would be awesome 👍 |
Some notes from discussion of this issue on the pkg-dev call today. Namespacing mechanisms are fundamentally about preventing/disambiguating name collisions. However, to a large extent Pkg already handles name collisions gracefully because of UUIDs:
There are only two places where any kind of further disambiguation might be helpful:
One solution to the second problem might be to implement JuliaLang/julia#33047 and decouple the name that a project refers to a dependency by from its canonical name. Then, when someone tries to add the second Another issue that was discussed was whether disambiguators should be hierarchical (namespaces) or more like tags/labels/keywords/categories. Recently, there was some discussion of whether some packages belonged in the JuliaIO or JuliaData organizations. So there's often some ambiguity in these things. Why not allow both? Some packages may be relevant to both finance and economics. Why not allow installing them by Another use case would be tagging packages that are useful for CI with the If we went with a non-hierarchical category system instead of a hierarchical namespace system, then the categories would be ordered with the earlier ones taking precedence. If you install both
[deps]
"Finance/Instruments" = "55efc701-121b-4840-9663-6fa785ef03be"
"Music/Instruments" = "be17f318-eccd-4180-afc1-ec5a11373fef"
The usage would look like |
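For what it is worth, qualified keys like the ones in that `[deps]` table are already valid TOML as long as they are quoted. A minimal sketch of reading them back with the `TOML` standard library (available since Julia 1.6) follows; splitting the key on `/` into a category and a name is my own assumption about how such keys might be interpreted, not something decided in this thread.

```julia
using TOML

# The [deps] table from the example above, with quoted "Category/Name" keys.
project = TOML.parse("""
[deps]
"Finance/Instruments" = "55efc701-121b-4840-9663-6fa785ef03be"
"Music/Instruments" = "be17f318-eccd-4180-afc1-ec5a11373fef"
""")

deps = project["deps"]
for key in sort(collect(keys(deps)))
    # Assumed convention: everything before the slash is the category.
    category, name = split(key, "/")
    println(category, " => ", name, " (", deps[key], ")")
end
```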
Thanks for the update Stefan 👍 It sounds like the discussion brought out some good ideas to think about. I appreciate everyone taking the subject seriously and working together to find a solution. |
This issue is a feature request and proposal for first-class support for "namespaces" in registries.
The core idea is this: currently, you can register a package, and that package has a URL. In this proposal, you will also be able to register a namespace, and that namespace has a URL. Once a namespace has been registered, a package can be registered under that namespace if and only if the URL for the package begins with the URL for the namespace.
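The "if and only if the URL begins with the namespace URL" rule amounts to a prefix check. Here is a minimal sketch in plain Julia; `NamespaceEntry` and `can_register` are hypothetical names used only for illustration, and they are not part of Pkg or Registrator.

```julia
# Hypothetical record for a registered namespace; not an existing Pkg type.
struct NamespaceEntry
    name::String
    url::String   # e.g. "https://github.com/MyFooGitHub/"
end

# A package may be registered under `ns` only if its repository URL
# begins with the namespace URL.
function can_register(ns::NamespaceEntry, package_url::AbstractString)
    # Force a trailing slash so ".../MyFooGitHub" cannot match ".../MyFooGitHubEvil/...".
    prefix = endswith(ns.url, "/") ? ns.url : ns.url * "/"
    return startswith(package_url, prefix)
end

ns = NamespaceEntry("MyFooNamespace", "https://github.com/MyFooGitHub/")
can_register(ns, "https://github.com/MyFooGitHub/CoolDilumStuff.jl")   # true
can_register(ns, "https://github.com/SomeoneElse/CoolDilumStuff.jl")   # false
```

The comparison here is case-sensitive; whether a real implementation should normalize case or URL schemes first is left open.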
To avoid confusion, let's be careful about our terminology.
Here is a rough sketch of the workflow:
So I am envisioning something like this workflow:
Step 1: I create an organization in GitHub, GitLab, Bitbucket, etc. For example, suppose I create an organization at github.com/MyFooGitHub.
Step 2: I use some process to register a namespace with the Julia General registry. For example, I make a PR to the General registry that registers a new namespace called MyFooNamespace. And as part of that PR, I provide the URL https://github.com/MyFooGitHub/. Once the PR has been merged, we have created the namespace MyFooNamespace in the Julia General registry. A package can be registered in the MyFooNamespace namespace if and only if its URL begins with the string https://github.com/MyFooGitHub/.
To go into more detail, a namespace is a "thing" that is registered in the General registry and has the following properties, all of which are constants:
For example, we might have a file in the General registry with filename /namespaces/MyFooNamespace/Namespace.toml with the following contents:
Step 3: I register packages. For example, I create a package called CoolDilumStuff.jl. The URL for this package is https://github.com/MyFooGitHub/CoolDilumStuff.jl. I choose to register this package in the MyFooNamespace namespace. So the "full name" of my package is MyFooNamespace/CoolDilumStuff.
When Registrator makes the PR to register this package, it would, for example, create a file with filename /namespaces/MyFooNamespace/C/CoolDilumStuff/Package.toml with the following contents:
Step 4: I can install my package in Pkg. In the Pkg REPL mode, I would install with this:
] add MyFooNamespace/CoolDilumStuff
If I want to use the Pkg API, I would install with either this:
Or, equivalently, this:
Step 5: After I install my package, I can import it by simply typing either this:
import CoolDilumStuff
Or this:
using CoolDilumStuff
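To make the relationship between the "full name" and the registry layout concrete, here is a small sketch that splits `Namespace/Package` and derives the `Package.toml` path used in Step 3. Both functions are hypothetical helpers written for this illustration; the `/namespaces/...` layout is just the one sketched in this proposal, and the path for packages without a namespace assumes a first-letter directory directly under the registry root.

```julia
# Split "Namespace/Package" into its parts; a bare name has no namespace.
function split_full_name(spec::AbstractString)
    parts = split(spec, "/")
    length(parts) == 1 && return (nothing, String(parts[1]))
    length(parts) == 2 || error("expected Name or Namespace/Name, got $spec")
    return (String(parts[1]), String(parts[2]))
end

# Registry path following the layout sketched in Step 3 (an assumption, not a settled design).
function registry_path(spec::AbstractString)
    namespace, name = split_full_name(spec)
    letter = uppercase(first(name))   # first-letter directory, e.g. 'C'
    namespace === nothing && return "/$letter/$name/Package.toml"
    return "/namespaces/$namespace/$letter/$name/Package.toml"
end

split_full_name("MyFooNamespace/CoolDilumStuff")   # ("MyFooNamespace", "CoolDilumStuff")
registry_path("MyFooNamespace/CoolDilumStuff")
```

Note that the namespace only affects installation and the registry path; the code that imports the package still uses the bare name, as in Step 5.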
This issue is kind of related to #1071. However, the proposal in this issue is a little different than what is proposed in #1071. If I understand correctly, #1071 does some automatic pattern-matching on the URL. In contrast, this issue explicitly adds full first-class support for official namespaces in registries.
This issue is not the same as #1064.