Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow access to internals (unless overridden) #52314

Open
LilithHafner opened this issue Nov 27, 2023 · 14 comments
Open

Disallow access to internals (unless overridden) #52314

LilithHafner opened this issue Nov 27, 2023 · 14 comments
Labels
design Design of APIs or of the language itself kind:feature Indicates new feature / enhancement requests

Comments

@LilithHafner
Copy link
Member

LilithHafner commented Nov 27, 2023

Originally posted by @StefanKarpinski in #49973 (comment)

Not breaking existing code limits what we can do. However, there's a decent argument to be made that since what's breaking is not a public API, blocking access to package internals is not actually technically breaking according to semver. Of course, we still have to be cautious to prevent massive ecosystem breakage. It would not go well to just flip the "no private access" switch for the whole ecosystem at once. Here's a possible transition strategy:

  1. Allow packages to opt into preventing access to their private internals. This should probably be a flag in the project file, but should maybe also be in the registry. There are three cases for a project that depends on a package that blocks private access:
    1. Project doesn't access internals so there's no problem;
    2. Project accesses internals but can easily use a public API instead—minor fix required;
    3. Project accesses internals and can't easily use a public API instead—can't be easily fixed.
  2. To handle the last case, we also allow projects to override one of their dependencies blocking private access. A project opting into this acts as an explicit indicator that non-breaking upgrades can break.
  3. For a while make the flag for whether private access is allowed or not mandatory—you can choose yes or no but you need to have some value in the project file.
  4. Later we change to preventing private access by default. Projects can drop the flag if the value is to block private access since that's the new default and only packages that need to allow private access need to keep the flag in their project files.

I'm not sure about the implementation side. If we can control module internals access per depender that would be ideal. That suggests that private/public needs to be metadata associated with the binding itself, which is kind of gnarly. Maybe we can do something with auto-wrapping each package in a public-only wrapper that rebinds only the public bindings of the internal package. That's the simplest to implement, but feels kind of icky.

Having public/private annotations on fields in structs would also be good, but I'm not quite ready to tackle that.

@LilithHafner
Copy link
Member Author

I agree that preventing access to private internals must be opt in at first for the package whose internals are being accessed because this is not a viable option for packages that have public unexported functionality and have not yet adopted the public keyword.

I think that we should disallow (in the general registry) packages that both declare dependence on internals and declare compatibility with an infinite set of minor versions (an infinite set of patch versions is okay IMO)

This would pair well with normalizing retroactively changing compat entries on a package when new versions of their dependants are released.

That suggests that private/public needs to be metadata associated with the binding itself, which is kind of gnarly.

It already is :)

julia/src/julia.h

Lines 589 to 602 in 5b2fcb6

typedef struct _jl_binding_t {
JL_DATA_TYPE
_Atomic(jl_value_t*) value;
jl_globalref_t *globalref; // cached GlobalRef for this binding
_Atomic(struct _jl_binding_t*) owner; // for individual imported bindings (NULL until 'resolved')
_Atomic(jl_value_t*) ty; // binding type
uint8_t constp:1;
uint8_t exportp:1; // `public foo` sets `publicp`, `export foo` sets both `publicp` and `exportp`
uint8_t publicp:1; // exportp without publicp is not allowed.
uint8_t imported:1;
uint8_t usingfailed:1;
uint8_t deprecated:2; // 0=not deprecated, 1=renamed, 2=moved to another package
uint8_t padding:1;
} jl_binding_t;

@LilithHafner LilithHafner added design Design of APIs or of the language itself kind:feature Indicates new feature / enhancement requests labels Nov 27, 2023
@ParadaCarleton
Copy link
Contributor

Having public/private annotations on fields in structs would also be good, but I'm not quite ready to tackle that.

I think we might not need a new language feature for this--you can overload getproperty to error on accessing a field. (Maybe a keyword that does this automatically would be enough?)

@LilithHafner
Copy link
Member Author

For internal fields, I would want getproperty to work in my package code (and in the code of any package that depends on my internals) but not in ordinary code that is using MyPackage. a.b lowers to Base.getproperty(a, :b) (not __MODULE__.getproperty(a, :b) so IIUC there is no good way to make getproperty error only when other modules use it without another language feature.

@Tokazama
Copy link
Contributor

Tokazama commented Nov 27, 2023

This seems like the type of thing where it would be really nice to have the this/self keyword.

@LilithHafner
Copy link
Member Author

How would that help?

@Tokazama
Copy link
Contributor

IIUC, we're just trying to inject the call context somewhere between lowering owner.name and getglobal(owner, name) so that we can do more than just ispublic(owner, name) to test what throws an error? If there were a global context variable at the top of the module that changed how GlobalRef(owner, name) was resolved, then we could add a unique path for certain modules to go straight to getglobal instead of testing ispublic.

@BioTurboNick
Copy link
Contributor

BioTurboNick commented Nov 28, 2023

It would be nice if this feature was something you could turn on for a code block/on the REPL, and not just be for an entire package. Force the developer to document which specific internals they are accessing, either as a parallel to public/export in listing the symbols used (or e.g. something like import private Base:), or as a macro encasing the usage.

@Tokazama
Copy link
Contributor

It would be nice if this feature was something you could turn on for a code block/on the REPL, and not just be for an entire package. Force the developer to document which specific internals they are accessing, either as a parallel to public/export in listing the symbols used (or e.g. something like import private Base:), or as a macro encasing the usage.

This makes more sense to me than fully supporting getproperty access to explicitly marked private symbols in a module.

@StefanKarpinski StefanKarpinski changed the title Disallow access to internals Disallow access to internals (without override) Nov 29, 2023
@PallHaraldsson
Copy link
Contributor

PallHaraldsson commented Nov 30, 2023

It would not go well to just flip the "no private access" switch for the whole ecosystem at once.

A.
I suppose we can disallow by default. If we want to do that, then should we have a moratorium on old packages?

Maybe that's the wrong word, but I'm thinking all old (registered) packages are allowed the old rules. Packages registered from now on conform to the new default rules. Or since they've been in development for a while maybe allow registering under the old rules for a while, 1 or 2 months. From publishing this policy.

Then nobody needs to change their Project file or wherever this is marked. The marking will be added to the registry for old packages for you. Possibly you would also be able to add it explicitly (or not) for packages registered from now on.

There's a question, what to do about upgrading old packages? They probably need the old rules, forever..., at least until next major version upgrade only?

B.
One other possibility is why are people accessing internals? Because the API wasn't thought out? Only allow it for 0.x? Presumably in 1.x, or at least after next major upgrade the API was thought out.

@LilithHafner
Copy link
Member Author

@StefanKarpinski, does "Without override" mean there is no override option or that there is no access unless the user overrides it?

@StefanKarpinski StefanKarpinski changed the title Disallow access to internals (without override) Disallow access to internals (unless overridden) Dec 1, 2023
@StefanKarpinski
Copy link
Sponsor Member

I just wanted to make it clear that it's not a "there's no way to access internals" proposal, but rather that we want accessing internals to require a project-level declaration of the need to do so. I've edited the title again to hopefully make that clearer.

@LilithHafner
Copy link
Member Author

I think it would be useful to discuss this on triage and try to lay out a roadmap.

@baggepinnen
Copy link
Contributor

Preventing field access by modifying getproperty may not be a generally applicable solution. Explicit field access by calling the function getfield is quite common in practice for multiple reasons

  • Broadcasting, i.e., getfield.(vec_of_structs, :field): https://juliahub.com/ui/Search?q=getfield.&type=code (note how the search results only reach the letter H before the maximum number of results is reached)
  • Avoid type instabilities by recursive calls to getproperty etc.
  • Symbolically represented code (i.e., using Symbolics.jl)

@LilithHafner
Copy link
Member Author

Ian suggested a three phase approach

  1. Implement the public keyword so folks can declare public/private API
  2. Have tooling support this (e.g. docstring tooltips display a different color for private symbols, opt-in Aqua tests for private access, treat tab completion differently for public/non-public (color, presence), don't shadow-complete private names, etc.)
  3. Make a language-level enforced system (either through true enforcement, or with sort of name-spacing/sintax to bypass it)

Maybe we do 3 (this issue) eventually, maybe not. Jeff is not optimistic about doing 3, I am, but I don't know the internals as well.

Let's table this (3) until 2 is done well.

@LilithHafner LilithHafner removed the status:triage This should be discussed on a triage call label Feb 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design Design of APIs or of the language itself kind:feature Indicates new feature / enhancement requests
Projects
None yet
Development

No branches or pull requests

7 participants