-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for fast-path alloc / init methods and direct methods. #268
Conversation
Requires this clang branch: https://github.com/davidchisnall/llvm-project/tree/new-libobjc2-features I'll raise that as an LLVM PR when this is merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few tests seem to fail on Ubuntu 22.04.3 LTS aarch64
(Not sure if they are completely related).
I have compiled llvm, clang, and lld from your branch. Below are the test results:
cmake -B build -DCMAKE_C_COMPILER=clang-18 -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_LINKER=ld.lld -DTESTS=1 -GNinja
hugo@libobjc2:/Users/hugo/Source/github.com/gnustep/libobjc2$ clang-18 --version
clang version 18.0.0git (https://github.com/davidchisnall/llvm-project 0820147633f4943f4ba342fcace9aecc03f8500c)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
hugo@libobjc2:/Users/hugo/Source/github.com/hmelder/llvm-project$ clang++ --version
clang version 18.0.0git (https://github.com/davidchisnall/llvm-project 0820147633f4943f4ba342fcace9aecc03f8500c)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
90% tests passed, 20 tests failed out of 194
Total Test time (real) = 10.96 sec
The following tests did not run:
159 - UnexpectedException (Skipped)
160 - UnexpectedException_optimised (Skipped)
161 - UnexpectedException_legacy (Skipped)
162 - UnexpectedException_legacy_optimised (Skipped)
The following tests FAILED:
13 - AssociatedObject (SEGFAULT)
14 - AssociatedObject_optimised (SEGFAULT)
15 - AssociatedObject_legacy (SEGFAULT)
16 - AssociatedObject_legacy_optimised (SEGFAULT)
17 - AssociatedObject2 (SEGFAULT)
18 - AssociatedObject2_optimised (SEGFAULT)
19 - AssociatedObject2_legacy (SEGFAULT)
20 - AssociatedObject2_legacy_optimised (SEGFAULT)
43 - FastARC (SEGFAULT)
44 - FastARC_optimised (SEGFAULT)
45 - FastARC_legacy (SEGFAULT)
46 - FastARC_legacy_optimised (SEGFAULT)
53 - FastPathAlloc_legacy (Subprocess aborted)
54 - FastPathAlloc_legacy_optimised (Subprocess aborted)
89 - RuntimeTest (SEGFAULT)
90 - RuntimeTest_optimised (SEGFAULT)
91 - RuntimeTest_legacy (SEGFAULT)
92 - RuntimeTest_legacy_optimised (SEGFAULT)
173 - PropertyIntrospectionTest2_arc (SEGFAULT)
174 - PropertyIntrospectionTest2_arc_optimised (SEGFAULT)
A control run with Ubuntu clang version 14.0.0-1ubuntu1.1
:
cmake -B build-14 -DCMAKE_C_COMPILER=clang-14 -DCMAKE_CXX_COMPILER=clang++-14 -DCMAKE_LINKER=ld.lld-14 -DTESTS=1 -GNinja
100% tests passed, 0 tests failed out of 194
Total Test time (real) = 11.37 sec
The following tests did not run:
51 - FastPathAlloc (Skipped)
52 - FastPathAlloc_optimised (Skipped)
53 - FastPathAlloc_legacy (Skipped)
54 - FastPathAlloc_legacy_optimised (Skipped)
159 - UnexpectedException (Skipped)
160 - UnexpectedException_optimised (Skipped)
161 - UnexpectedException_legacy (Skipped)
162 - UnexpectedException_legacy_optimised (Skipped)
fast_paths.m
Outdated
{ | ||
objc_send_initialize(cls); | ||
} | ||
fprintf(stderr, "Testing metaclass %p (%s) for fast path: %d\n", cls->isa, class_getName(cls), objc_test_class_flag(cls->isa, objc_class_flag_fast_alloc_init)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fprintf
statement seems to be a left over
dtable.c
Outdated
static SEL retain, release, autorelease, isARC, alloc, allocWithZone, init, isTrivialAllocInit; | ||
if (NULL == retain) | ||
{ | ||
retain = sel_registerName("retain"); | ||
release = sel_registerName("release"); | ||
autorelease = sel_registerName("autorelease"); | ||
isARC = sel_registerName("_ARCCompliantRetainRelease"); | ||
alloc = sel_registerName("alloc"); | ||
allocWithZone = sel_registerName("allocWithZone:"); | ||
init = sel_registerName("init"); | ||
isTrivialAllocInit = sel_registerName("_TrivialAllocInit"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are alloc
, allocWithZone
, init
, and istrivialAllocInit
needed here? They do not seem to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, they’re left over from before I pulled this out into a separate function.
#if __clang_major__ < 18 | ||
// Skip this test if clang is too old to support it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think testing the fast path functions directly is not a bad idea. The bug in the AArch64 message send logic occurred because we never called objc_msgSend
directly, but relied on clang.
The non-legacy test-cases of this PR are passing. A lot of general cases do not. |
(lldb) run
Process 30526 launched: '/Users/hugo/Source/github.com/gnustep/libobjc2/build/Test/FastARC' (aarch64)
Process 30526 stopped
* thread #1, name = 'FastARC', stop reason = signal SIGSEGV: invalid address (fault address: 0xffffff7ffff8)
frame #0: 0x0000fffff7f8054c libobjc.so.4.6`isSmallObject(obj=<unavailable>) at class.h:417
414 extern Class SmallObjectClasses[7];
415
416 static BOOL isSmallObject(id obj)
-> 417 {
418 uintptr_t addr = ((uintptr_t)obj);
419 return (addr & OBJC_SMALL_OBJECT_MASK) != 0;
420 } There seems to be an infinite loop: A selected list of lldb frames(lldb) frame select 1
frame #1: 0x0000fffff7f7f160 libobjc.so.4.6`retain(objc_object*, unsigned char) [inlined] isPersistentObject(obj=0x0000aaaaaaaf6b38) at arc.mm:291:6
288 return YES;
289 }
290 // Small objects are never accessibly by reference
-> 291 if (isSmallObject(obj))
292 {
293 return YES;
294 }
(lldb) frame select 2
frame #2: 0x0000fffff7f7f140 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:302:6
299
300 static inline id retain(id obj, BOOL isWeak)
301 {
-> 302 if (isPersistentObject(obj)) { return obj; }
303 Class cls = obj->isa;
304 if ((Class)&_NSConcreteMallocBlock == cls ||
305 (Class)&_NSConcreteStackBlock == cls)
(lldb) frame select 3
frame #3: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9
589 extern "C" OBJC_PUBLIC id objc_retain(id obj)
590 {
591 if (nil == obj) { return nil; }
-> 592 return retain(obj, NO);
593 }
594
595 extern "C" OBJC_PUBLIC id objc_retainAutorelease(id obj)
(lldb) frame select 4
frame #4: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9
310 {
311 return retain_fast(obj, isWeak);
312 }
-> 313 return [obj retain];
314 }
315
316 extern "C" OBJC_PUBLIC BOOL objc_release_fast_no_destroy_np(id obj)
(lldb) frame select 5
frame #5: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9
589 extern "C" OBJC_PUBLIC id objc_retain(id obj)
590 {
591 if (nil == obj) { return nil; }
-> 592 return retain(obj, NO);
593 }
594
595 extern "C" OBJC_PUBLIC id objc_retainAutorelease(id obj)
(lldb) frame select 6
frame #6: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9
310 {
311 return retain_fast(obj, isWeak);
312 }
-> 313 return [obj retain];
314 }
315
316 extern "C" OBJC_PUBLIC BOOL objc_release_fast_no_destroy_np(id obj)
[...]
frame #18317: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9
frame #18318: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9
frame #18319: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9
frame #18320: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9
frame #18321: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9
frame #18322: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9
frame #18323: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9
frame #18324: 0x0000fffff7f7f228 libobjc.so.4.6`retain(obj=0x0000aaaaaaaf6b38, isWeak=NO) at arc.mm:313:9
frame #18325: 0x0000fffff7f7f10c libobjc.so.4.6`objc_retain(obj=0x0000aaaaaaaf6b38) at arc.mm:592:9 Same issue in Process 30554 launched: '/Users/hugo/Source/github.com/gnustep/libobjc2/build/Test/RuntimeTest' (aarch64)
testInvalidArguments() ran
testAMethod() ran
testAMethod() ran
testGetMethod() ran
testProtocols() ran
testMultiTypedSelector() ran
testClassHierarchy() ran
testAllocateClass() ran
Instance of NSObject: 0xaaaaaaafc6a8
Enter synchronized code
Process 30554 stopped
* thread #1, name = 'RuntimeTest', stop reason = signal SIGSEGV: invalid address (fault address: 0xffffff7ffff8)
frame #0: 0x0000fffff7f8054c libobjc.so.4.6`isSmallObject(obj=<unavailable>) at class.h:417
414 extern Class SmallObjectClasses[7];
415
416 static BOOL isSmallObject(id obj)
-> 417 {
418 uintptr_t addr = ((uintptr_t)obj);
419 return (addr & OBJC_SMALL_OBJECT_MASK) != 0;
420 } |
Just tested it with clang main and the failures seem to be related to these changes (probably in clang): hugo@libobjc2:/Users/hugo/Source/github.com/gnustep/libobjc2$ clang --version
clang version 18.0.0git (ssh://git@github.com/hmelder/llvm-project.git 360996ac5ad26714a6ddbee45730fbcfb7dc3eea) 98% tests passed, 4 tests failed out of 194
Total Test time (real) = 10.99 sec
The following tests did not run:
159 - UnexpectedException (Skipped)
160 - UnexpectedException_optimised (Skipped)
161 - UnexpectedException_legacy (Skipped)
162 - UnexpectedException_legacy_optimised (Skipped)
The following tests FAILED:
51 - FastPathAlloc (Subprocess aborted)
52 - FastPathAlloc_optimised (Subprocess aborted)
53 - FastPathAlloc_legacy (Subprocess aborted)
54 - FastPathAlloc_legacy_optimised (Subprocess aborted) The failure of FastPathAlloc was expected. |
Can you show the output from the failing tests? Older clang should simply not call these functions, so I must have broken something internally. |
[NoAlloc allocWithZone: NULL]; | ||
assert(!called); | ||
called = NO; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[NoAlloc allocWithZone: NULL]; | |
assert(!called); | |
called = NO; | |
called = NO; | |
[NoAlloc allocWithZone: NULL]; | |
assert(!called); |
This is great, thank you for implementing this! Optimizations like the fast path are very much relevant for our app, and I’m looking forward to being able to use direct methods to optimize hot paths in our codebase.
Do I understand this correctly that we should be able to add |
Yes, exactly. Though very few classes will override
I didn't think that needed new tests because it doesn't change existing behaviour. Calls to |
Sure! For what ever reason, only FastARC and the FastPathAllloc_legacy tests seem to be failing... The following tests FAILED:
43 - FastARC (Subprocess aborted)
44 - FastARC_optimised (Subprocess aborted)
45 - FastARC_legacy (Subprocess aborted)
46 - FastARC_legacy_optimised (Subprocess aborted)
53 - FastPathAlloc_legacy (Subprocess aborted)
54 - FastPathAlloc_legacy_optimised (Subprocess aborted) This is with the same configuration and libobjc2 40fa8e3. I am not sure why I had the test failures yesterday (I am now using a clang build with hugo@libobjc2:/Users/hugo/Source/github.com/gnustep/libobjc2$ ./build/Test/FastARC
FastARC: /Users/hugo/Source/github.com/gnustep/libobjc2/Test/FastARC.m:80: int main(): Assertion `called == YES' failed.
Aborted It seems like |
That’s confusing. I haven’t changed the situations when the fast arc flag is set or cleared, so I’m not sure why that test is failing. |
I can generate the LLVM IR for this test if this helps |
Well, the confusing thing is that this shouldn’t rely on anything in clang. It calls objc_retain explicitly, and somehow that’s detecting fast ARC is safe for the class, in spite of it not opting in. |
Here
Yep. Weird. |
Well turns out that configuring libobjc2 with Here a test with llvm-project @ 360996ac5ad26714a6ddbee45730fbcfb7dc3eea:
So this should be related to the clang changes :/ |
I disabled --- a/clang/include/clang/Basic/ObjCRuntime.h
+++ b/clang/include/clang/Basic/ObjCRuntime.h
@@ -211,7 +211,7 @@ public:
case GCC:
return false;
case GNUstep:
- return true;
+ return false;
case ObjFW:
return false;
} |
Lines 361 to 380 in 6528090
PatchWhy not call retain, release, and autorelease using objc_msgSend, to avoid replacement? commit 3adbaaaa16aeafeaa5f09adf2f916f985e0412fe (HEAD -> direct-methods-changes)
Author: hmelder <service@hugomelder.com>
Date: Sat Jan 13 02:20:56 2024 +0100
ARC: Call retain, release, autorelease using objc_msgSend directly
diff --git a/arc.mm b/arc.mm
index 5a4cc89..3c53e44 100644
--- a/arc.mm
+++ b/arc.mm
@@ -310,7 +310,7 @@ static inline id retain(id obj, BOOL isWeak)
{
return retain_fast(obj, isWeak);
}
- return [obj retain];
+ return objc_msgSend(obj, @selector(retain));
}
extern "C" OBJC_PUBLIC BOOL objc_release_fast_no_destroy_np(id obj)
@@ -376,7 +376,7 @@ static inline void release(id obj)
objc_release_fast_np(obj);
return;
}
- [obj release];
+ objc_msgSend(obj, @selector(release));
}
static inline void initAutorelease(void)
@@ -436,7 +436,7 @@ static inline id autorelease(id obj)
}
return obj;
}
- return [obj autorelease];
+ return objc_msgSend(obj, @selector(autorelease));
}
extern "C" OBJC_PUBLIC unsigned long objc_arc_autorelease_count_np(void)
|
And for the failing legacy tests: commit 75b9dd39b2b4e4ff5750f511c6c60f504ce6a933 (HEAD -> direct-methods-changes)
Author: hmelder <service@hugomelder.com>
Date: Sat Jan 13 02:25:22 2024 +0100
Move non-legacy unit tests into NEW_TESTS
diff --git a/Test/CMakeLists.txt b/Test/CMakeLists.txt
index fc190df..7743cdc 100644
--- a/Test/CMakeLists.txt
+++ b/Test/CMakeLists.txt
@@ -20,11 +20,9 @@ set(TESTS
BlockTest_arc.m
ConstantString.m
Category.m
- DirectMethods.m
ExceptionTest.m
FastARC.m
FastARCPool.m
- FastPathAlloc.m
FastRefCount.m
Forward.m
ManyManySelectors.m
@@ -86,6 +84,8 @@ endif()
# shouldn't be run in legacy mode.
set(NEW_TESTS
category_properties.m
+ DirectMethods.m
+ FastPathAlloc.m
)
remove_definitions(-D__OBJC_RUNTIME_INTERNAL__=1) |
Thanks. Well also need the fallback path for platforms with no assembly, but that’s easy for me to add. |
40fa8e3
to
0e2e885
Compare
I believe this is now all fixed. |
LLVM PR is here: Please test! |
Great! Thank you for implementing this :) |
The fast paths follow the pattern that we established for fast ARC: Framework base classes can opt in by implementing a `+_TrivialAllocInit` method. This opt-in behaviour is inherited and is removed implicitly in any subclass that implements alloc or init methods (alloc and init are treated independently). Compilers can emit calls to `objc_alloc(cls)` instead of `[cls alloc]`, `objc_allocWithZone(cls)` instead of `[cls allocWithZone: NULL]`, and `objc_alloc_init` instead of `[[cls alloc] init]`. Direct methods don't require very much support in the runtime. Apple reuses their fast path for `-self` (which is supported only in the Apple fork of clang, not the upstream version) for a fast init. Given that the first few fields of the runtime's class structure have been stable for around 30 years, I'm happy moving the flags word (and the initialised bit, in particular) into the public ABI. This lets us do a fast-path check for whether a class is initialised in class methods and call `objc_send_initialize` if it isn't. This function is now exposed as part of the public ABI, it was there already and does the relevant checks without invoking any of the message-sending machinery. Fixes #165 #169
0e2e885
to
f5e02c4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All tests are passing now!
Can someone run the -base test suite with this and the clang changes? |
9643 Passed tests
34 Dashed hopes
1 Skipped set
0 Failed set |
Do we see any noticeable code size change between the two (for -base, with the old and new compilers)? |
I didn't know that We have shaved about ~2MB off the shared library. StatClang with changes from PR: hugo@libobjc2:~/libs-base$ stat /usr/local/lib/libgnustep-base.so.1.29.0
File: /usr/local/lib/libgnustep-base.so.1.29.0
Size: 12693576 Blocks: 24776 IO Block: 4096 regular file
Device: 24h/36d Inode: 1635115 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-01-13 21:45:03.466030044 +0100
Modify: 2024-01-13 21:45:03.484030115 +0100
Change: 2024-01-14 11:11:39.554747674 +0100
Birth: 2024-01-14 11:11:39.542747847 +0100 Clang 14: hugo@libobjc2:~/libs-base$ stat /usr/local/lib/libgnustep-base.so.1.29.0
File: /usr/local/lib/libgnustep-base.so.1.29.0
Size: 14734176 Blocks: 28768 IO Block: 4096 regular file
Device: 24h/36d Inode: 1640092 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-01-14 11:19:37.063676199 +0100
Modify: 2024-01-14 11:19:37.091675786 +0100
Change: 2024-01-14 11:20:10.209187899 +0100
Birth: 2024-01-14 11:20:10.204187972 +0100 ConfigurationGNUstep Makeexport DEB_HOST_MULTIARCH=aarch64-linux-gnu
./configure --with-layout=debian \
--enable-native-objc-exceptions \
--enable-objc-arc \
--prefix=/usr \
--with-runtime-abi=gnustep-2.2 \
--with-library-combo=ng-gnu-gnu \
CC="clang" CXX="clang++" CPP="clang -E" LDFLAGS="-fuse-ld=lld -L/usr/lib/$(DEB_HOST_MULTIARCH)" SHELLPROG=/bin/bash GNUMAKE=make GNUstep Base./configure |
Looks like roughly a 14% reduction. Very nice to have. I'll add this to the release notes. |
The fast paths follow the pattern that we established for fast ARC: Framework base classes can opt in by implementing a
+_TrivialAllocInit
method.This opt-in behaviour is inherited and is removed implicitly in any subclass that implements alloc or init methods (alloc and init are treated independently).
Compilers can emit calls to
objc_alloc(cls)
instead of[cls alloc]
,objc_allocWithZone(cls)
instead of[cls allocWithZone: NULL]
, andobjc_alloc_init
instead of[[cls alloc] init]
.Direct methods don't require very much support in the runtime. Apple reuses their fast path for
-self
(which is supported only in the Apple fork of clang, not the upstream version) for a fast init. Given that the first few fields of the runtime's class structure have been stable for around 30 years, I'm happy moving the flags word (and the initialised bit, in particular) into the public ABI. This lets us do a fast-path check for whether a class is initialised in class methods and callobjc_send_initialize
if it isn't. This function is now exposed as part of the public ABI, it was there already and does the relevant checks without invoking any of the message-sending machinery.Fixes #165 #169