Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the design of GlobalAlias to not require dest type to match source type #10739

Closed
lattner opened this issue Jul 15, 2011 · 16 comments
Closed
Labels
bugzilla Issues migrated from bugzilla code-cleanup llvm:core

Comments

@lattner
Copy link
Collaborator

lattner commented Jul 15, 2011

Bugzilla Link 10367
Resolution FIXED
Resolved on Jun 02, 2014 21:54
Version 1.0
OS All
CC @asl,@majnemer,@jayfoad,@arsenm,@pcc,@rnk,@TNorthover

Extended Description

Global alias is currently defined to have its own type, and then have an initializer of the same type. The initializer is a "Constant*" which is either a) a global value, b) a constant expr bitcast, c) a constantexpr gep with all zero indices, d) null (which isn't valid, but transiently happens).

This doesn't make sense for a number of reasons. Instead, the initializer of a GlobalAlias should be required to be a GlobalValue, but the type of the source and dest of the alias should not be required to be the same, they should just be completely decoupled.

-Chris

@asl
Copy link
Collaborator

asl commented Jul 15, 2011

But what are these reasons? Right now everyone can look "through" alias without thinking how it should interpret the stuff.

@lattner
Copy link
Collaborator Author

lattner commented Jul 15, 2011

Because the design is broken. Aliases are aliases to a global not a constant expression. The fact that we have to allow getelementptr (for example) is a sign that this is a hack.

@lattner
Copy link
Collaborator Author

lattner commented Jul 15, 2011

I'm not saying that this is a huge priority to fix. I know that the current implementation works fine.

@asl
Copy link
Collaborator

asl commented Jul 15, 2011

Originally only GlobalValue was allowed as an aliasee and types should match. This was relaxed lately, but I don't remember the exact reason. Probably because it was implemented in gcc in such a wrong way that people often 'exploited' this and even considered as a feature (that types of alias / aliasee might now match), thus it was changed lately...

Maybe we should just be more conservative on this matter?

@lattner
Copy link
Collaborator Author

lattner commented Jul 15, 2011

The types have to be able to mismatch. This is important for the IR linker among other things. I'm just saying that a constantexpr isn't the right way to represent this.

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 25, 2011

How would this interact with RAUW? Suppose you have

@​a = i32 alias i64 @​b

(for example), then you use RAUW to replace @​b with (eg) a ptrtoint:

@​a = i32 alias (ptrtoint i8* @​b.1 to i64)

Then you would get an illegal alias.

@lattner
Copy link
Collaborator Author

lattner commented Jul 25, 2011

Globals are always pointers, but your issue is a valid one none-the-less with bitcasts.

@jayfoad
Copy link
Contributor

jayfoad commented Jul 29, 2011

I posted a patch here to make GlobalAlias's API look like the aliasee is always a GlobalValue; that is, setAliasee() takes a GlobalValue, and getAliasee() always returns a GlobalValue:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110725/125031.html

@lattner
Copy link
Collaborator Author

lattner commented Jul 29, 2011

I like this approach, this seems better than my suggestion in the description above.

@jayfoad
Copy link
Contributor

jayfoad commented Aug 10, 2011

I posted a patch here to make GlobalAlias's API look like the aliasee is always
a GlobalValue; that is, setAliasee() takes a GlobalValue, and getAliasee()
always returns a GlobalValue:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110725/125031.html

Duncan argued that it would be better if the type of the GlobalAlias always matched the type of its operand. I updated the patch accordingly:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110801/125477.html

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 31, 2013

I was thinking about this a bit. The object files we support have a different view of what an alias is.

On Mach-O there is (or there will be once the linker implements it) N_INDR. It is just a directive that tells the linker that this symbol should have the same address as another one.

The consequences of implementing alias like that are

  • Alias can point to symbols in other files.
  • Alias must point to a symbol.

On ELF (and I am almost sure on COFF), an alias is just another symbol. It just happens to point to a position also pointed by another symbol.

The consequences of implementing alias like that are:

  • Alias can only point to symbols in this file.
  • Alias can point inside an object.

This could be useful for representing cases where the abi mandates a symbol inside a particular data structure (like microsoft vtables).

My suggestion is then:

  • Have ELF/COFF backends error on an attempt to alias a declaration. (right now they miscompile).
  • Implement explicit COMDATs at the llvm IR. This is needed to have the notion of a set of symbols/data that can be dropped as a group.
  • Allow alias with non-zero GEPs, but have the Mach-O backend error on them.

@TNorthover
Copy link
Contributor

TNorthover commented Oct 31, 2013

For MachO, I was toying with only producing an N_INDR symbol when the destination couldn't be resolved within the file. I don't think I'd implemented it like that yet, because I'd viewed it as more of an optimisation; but I now see it affects semantics and permitted forms.

Making that a policy decision would seem to remove at least some of the special cases if I'm reading your description properly. If so, I'd be happy to make sure it does work like that, if it would help.

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 31, 2013

For MachO, I was toying with only producing an N_INDR symbol when the
destination couldn't be resolved within the file. I don't think I'd
implemented it like that yet, because I'd viewed it as more of an
optimisation; but I now see it affects semantics and permitted forms.

Making that a policy decision would seem to remove at least some of the
special cases if I'm reading your description properly. If so, I'd be happy
to make sure it does work like that, if it would help.

It would help in two ways

  • Knowing what can be implemented in MachO
  • Having any alias support at all in MachO :-)

Thanks!

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 15, 2014

Starting with r198349 we also accept addrspacecast, and that does look like a valid use (see the thread following the commit).

For this bug this probably means that the "new alias" should also have an optional address space.

@rnk
Copy link
Collaborator

rnk commented Feb 13, 2014

One other design issue with GlobalAlias is that the LangRef doesn't give them a section, but in practice, I can construct an alias with a section using LLVM's C++ API because GlobalAlias inherits from GlobalValue. However, the section doesn't round trip through bitcode or LLVM asm because it was skipped.

There's a few other attributes on GlobalValue that aren't present in LangRef, but they probably don't matter as much.

@llvmbot
Copy link
Collaborator

llvmbot commented Jun 3, 2014

This was fixed, but not in the way that is discussed in this bug. Instead, alias are defined to point to arbitrary expressions and we changed all the code that thought it was possible to, for example, find the aliased symbol.

The argument that convinced me to go this ways is that at the assembly level expressions like

a = b - c

are sometime valid depending on what b and c are. Before r210062 there was no way to produce such an expression in LLVM. What we now have is

  • GlobalObjecs introduce new data. They are functions or variables. They normally have a symbol (except for some cases with private linkage).
  • GlobalAliases define a label pointing somewhere. At the LLVM level it is not possible to know if the expression defining the location is valid or not.

I would love to be able to tell if a given expression is valid at the IR level without dropping support for representing any expression an assembler supports, but that would be quite a massive change:

With pr10368 we would basically get expressions to represent what is representable with relocations in object files. In addition to that we would need to be have a different expression type for all the cases that don't need a relocation. This would require knowing at the llvm level what section/atom a label is in.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla code-cleanup llvm:core
Projects
None yet
Development

No branches or pull requests

6 participants