Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get rid of integer allocation #189

Closed
RalfJung opened this issue Jun 7, 2017 · 41 comments · Fixed by #197
Closed

Get rid of integer allocation #189

RalfJung opened this issue Jun 7, 2017 · 41 comments · Fixed by #197
Labels
C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement

Comments

@RalfJung
Copy link
Member

RalfJung commented Jun 7, 2017

@eddyb mentioned that there are plans to get rid of the integer allocation (NEVER_ALLOC_ID). This is an attempt to summarize briefly what the plan is, and to document that such a plan exists. (I suppose the "refactor" label would make sense here?)

The idea is to change the "meaning" of the type Pointer to "pointers that actually point somewhere in memory". Every operation that also works on pointers obtained from integers should use a different type. They could either use PrimVal (but then these methods all have to handle undef...), or a new type that factors the Pointer and Bytes cases out of PrimVal, but cannot be Undef.

ptr-to-int-cast and their inverse literally become a noop on the data side. There is no longer any need to perform "normalization" in binary_op because all data now has a canonical representation. (The memory already does something like this, somewhat, by not adding any relocations when a pointer from the integer allocation is written to memory.)

Open questions:

  • What about zero-sized types? We could either keep the ZST allocation, or make allocation return an integer when a ZST is allocated.
  • Which name should the type that's either Pointer or just bytes have? PrimDefVal? I'm not good with names...
@oli-obk oli-obk added C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement refactor labels Jun 8, 2017
@oli-obk
Copy link
Contributor

oli-obk commented Jun 8, 2017

We can't remove the ZST allocation. If we did, we might have accidential conversions from integers to.zst pointers. I don't think it's much of an issue: zst pointers aren't handled specially as muh as integer pointers.

Wrt the naming: I think we should turn primval into two nested enums, the PrimVal type which is either Bytes or Pointer and a PrimDef type or sth which is either Undef or Def(PrimVal). This will make pattern matching as easy as it is now, and allow taking out the defined part without any additional effort.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 8, 2017

So PrimDef is the thing that can be undefined? Sounds somewhat inintuitive to me. I mean, there's little chance this could be used incorrectly, but still.
Another option would be to use Option<PrimVal>. But using a dedicated type may be clearer.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 8, 2017

You may rename PrimDef to anything you want ;) MaybeVal?

@oli-obk
Copy link
Contributor

oli-obk commented Jun 8, 2017

With Option it'll get messy since I think we do have some Option<PrimVal> in the code somewhere I think

@RalfJung
Copy link
Member Author

RalfJung commented Jun 8, 2017

If we were doing this from scratch, I'd probably call the current PrimDef Value, and call the current Value Variable or VariableValue or VariableVal or so.

@eddyb
Copy link
Member

eddyb commented Jun 8, 2017

We don't use "variable" in MIR terminology for all locals. MIR trans would call it LocalRef.
I think LocalVal or LocalValue would work fine.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 8, 2017

We don't use "variable" in MIR terminology for all locals.

I see.
Another option may be LocalState, because it describes the current state of a local. Unfortunately, that type sounds like it is talking about "local state" in general. I'd suggest Local if that was not already used for the thing identifying a local...

EDIT: My point here is that the ByRef case doesn't actually describe the value the local has -- that value is stored in memory can change without the Value (using current naming) changing. However, I will acknowledge that "state" doesn't really capture this much better than "value".

@oli-obk
Copy link
Contributor

oli-obk commented Jun 8, 2017

Local can be renamed to LocalId

@eddyb
Copy link
Member

eddyb commented Jun 8, 2017

There's a reason MIR trans uses the FooRef convention, heh.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 8, 2017

Where FooRef is the thing identifying a foo, or saying what state it currently has?

@eddyb
Copy link
Member

eddyb commented Jun 8, 2017

The former, if I'm reading you right. It looks like it was borrowed from LLVM but it doesn't make much sense because there it's used in the C API for the types that are pointers to the C++ classes.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 8, 2017

By "the thing identifying a local", I mean the integer which is an index into the list of locals. From the name, that's also what I expect a Ref to be. So then things would be called LocalRef/LocalId (newtype around integer) and Local (that's the by-ref/by-val/by-val-pair thing).

That is, assuming it is even worth renaming this stuff. That's why I said "if we started from scratch" above. ;)

@eddyb
Copy link
Member

eddyb commented Jun 8, 2017

Oh, no, FooRef contains the rustc_trans/LLVM "instance" of mir::Foo.
So LocalRef contains the actual value, either an SSA immediate or an alloca.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 9, 2017

Oh. That's odd naming to me... it's not a reference to a local then. I suppose they mean that it's the referent of a local.

@eddyb
Copy link
Member

eddyb commented Jun 9, 2017

No it literally is cargo-culting the LLVM C API, because a SSA value is ValueRef.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 9, 2017

I don't think we need the *Ref naming scheme in miri. Let's just stay with SomeThingId and SomeThing

@oli-obk
Copy link
Contributor

oli-obk commented Jun 9, 2017

Also I think renaming everything in a more sensible manner is fine. We do have RLS so any renaming should be trivial to do.

@eddyb
Copy link
Member

eddyb commented Jun 9, 2017

(it's not like most renaming isn't just a multi-file regex replacement - I do those in my editor often)

That scheme doesn't work for things like mir::Lvalue, though, does it?

@oli-obk
Copy link
Contributor

oli-obk commented Jun 9, 2017

What do you mean? There's no need to rename rustc types. Also lvalue is not an ID. mir::Lvalue is the unresolved lvalue similar to a path, miri::Lvalue is the resolved lvalue containing pointers and such.

@eddyb
Copy link
Member

eddyb commented Jun 9, 2017

If you already have miri::Foo for mir::Foo, why would you need FooId or anything of the sort?
I wonder if people would object to doing the former in rustc_trans.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 9, 2017

Uhm we don't. I got it confused with Lvalue::Local. So no mass renaming necessary.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 9, 2017

ah, right, mir::Local is in librustc. I forgot about that. Wouldn't it be somewhat confusing to then call the "state of a local" also Local, just in a different module? (This is regarding the proposal to rename the current PrimDef to Local.)
Given these circumstances, I guess I'll go with @eddyb here and prefer LocalVal.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 12, 2017

We can't separate Undef from PrimVal, because a PrimVal for a ZST is Undef. Creating a new Zst variant makes no sense, because that has no additional purpose (we discussed that back when ZSTs were moved from always being ByRef to also making them ByVal).

@eddyb
Copy link
Member

eddyb commented Jun 12, 2017

because a PrimVal for a ZST is Undef

What's wrong with Bytes(0) or anything else?

@oli-obk
Copy link
Contributor

oli-obk commented Jun 12, 2017

You might be able to do shenigans that end up reading it. But I can't think of any off the top of my head.

@RalfJung
Copy link
Member Author

(we discussed that back when ZSTs were moved from always being ByRef to also making them ByVal).

Is that discussion available anywhere? I am asking because that would have been my suggestion: Make ZSTs always ByRef, after all, that doesn't even require an allocation. Just a pointer to the ZST allocation.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 12, 2017

That was on IRC in rust-internals. search the log for eddyb saying undef I think

@RalfJung
Copy link
Member Author

Do you remember a rough date? I have no idea whether I am looking for something that happened a month ago or a year ago. A brief search of the git history didn't turn up anything.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 12, 2017

I have no clue. I can look in my local logs. But it was something about how zsts should never be read from anyway, so undef as zst makes more sense than a special zst value. But moving to byref to the zst alloc sounds good.

@eddyb
Copy link
Member

eddyb commented Jun 12, 2017

Yeah the point is that the representation doesn't matter not that we should pick a specific one.

@RalfJung
Copy link
Member Author

It matters insofar as we better be sure that any attempt to observe it throws an error. ;)

But moving to byref to the zst alloc sounds good.

Now you got me confused. Didn't you say above that this is how things used to be, and then they were changed, probably for a reason?

@oli-obk
Copy link
Contributor

oli-obk commented Jun 12, 2017

Yea it was changed. But I think that was part of other changes that were easier with that change happening, too

@RalfJung
Copy link
Member Author

So some other things will get harder if we go back to "ZSTs are always ByRef"?

@oli-obk
Copy link
Contributor

oli-obk commented Jun 13, 2017

Yes. We'll have problems with follow_by_ref_value, but I just checked, we don't use it everywhere anymore. I guess we refactored all other uses out. Looks easily doable now, but it'll mean a zst can't be a primval anymore, not sure if we rely on that anywhere else.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 13, 2017

I've done some work in this direction: oli-obk@e59ff23 still has some problems, since the pointers in lvalue need to be able to store integers, too.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 13, 2017

the pointers in lvalue need to be able to store integers, too.

When does that arise? Sounds like an invalid lvalue to me, couldn't that error out instantaneously?

@RalfJung
Copy link
Member Author

So you didn't end up even using a type that's just "Bytes or Ptr". Is handling the additional Undef case a problem anywhere?

@oli-obk
Copy link
Contributor

oli-obk commented Jun 14, 2017

Not yet. It might even be better to include it in some situations, because dereferencing and immediately referencing a pointer with undef value should fine

@RalfJung
Copy link
Member Author

You mean like so?

fn main() {
    let x : *const i32 = unsafe { mem::uninitialized() };
    let y = unsafe { &*x };
}

So you are saying, Undef lvalues are fine? I suppose that's an option. That would also explain when "integer pointer" lvalues can arise.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 14, 2017

Yes

@RalfJung
Copy link
Member Author

Sounds reasonable. So the Lvalue::Ptr will change to take a PrimVal? And then I suppose the return lvalue can cease again to be an Option; Undef is covering that case just fine. Nice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants