Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ObjC memory management errors more debuggable #1111

Merged
merged 32 commits into from
May 5, 2024
Merged

Conversation

liamappelbe
Copy link
Contributor

When retaining or releasing a reference to an ObjC object or block, assert that the pointer is valid. The goal is to catch memory management errors (most importantly, retaining a ref to a dead object) in Dart where we can get a nice stack trace. Without this change, the retain/release call will seg fault in ObjC land, which gives no useful debugging information.

The key functions are isValidBlock (in objective_c.c) and isValidObject (in internal.dart).

isValidBlock is based on the getBlockRetainCount function from the ffigen tests. Blocks have a field named isa, which describes what kind of block it is. I tracked down all possible values of this field (there are 6), and so I can just check that the value is one of these. When a block is destroyed the isa field is cleared, so the odds of falsely declaring a block as valid are pretty low.

isValidObject is similar. Objects have an 8 byte header, part of which is a pointer to a class object. The object_getClass function extracts that pointer, and then we check if it's valid by looking it up in a set that contains every valid class object (there are typically a few thousand). We construct this set using objc_copyClassList. Like blocks, this header is cleared when the object is destroyed, so false positives are unlikely. See #1038 for more info.

Fixes #1038

@liamappelbe liamappelbe requested a review from dcharkes May 1, 2024 03:51
@github-actions github-actions bot added type-infra A repository infrastructure change or enhancement package:ffigen package:objective_c labels May 1, 2024
@github-actions github-actions bot removed the type-infra A repository infrastructure change or enhancement label May 1, 2024
@liamappelbe
Copy link
Contributor Author

Looks like I used the wrong base branch. Commits e4a844e and later are real.

Copy link

github-actions bot commented May 1, 2024

PR Health

Changelog Entry ✔️

Details
Package Changed Files

Changes to files need to be accounted for in their respective changelogs.

Coverage ⚠️

Details
File Coverage
pkgs/objective_c/lib/src/c_bindings_generated.dart 💔 Not covered
pkgs/objective_c/lib/src/internal.dart 💔 Not covered

This check for test coverage is informational (issues shown here will not fail the PR).

This check can be disabled by tagging the PR with skip-coverage-check.

License Headers ✔️

Details
// Copyright (c) 2024, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files
no missing headers

All source files should start with a license header.

Unrelated files missing license headers
Files
pkgs/ffi/example/main.dart
pkgs/ffigen/example/libclang-example/generated_bindings.dart
pkgs/ffigen/example/shared_bindings/generate.dart
pkgs/ffigen/example/shared_bindings/lib/generated/a_gen.dart
pkgs/ffigen/example/shared_bindings/lib/generated/a_shared_b_gen.dart
pkgs/ffigen/example/shared_bindings/lib/generated/base_gen.dart
pkgs/ffigen/example/simple/generated_bindings.dart
pkgs/ffigen/lib/src/config_provider/config_spec.dart
pkgs/ffigen/lib/src/header_parser/clang_bindings/clang_bindings.dart
pkgs/ffigen/test/collision_tests/expected_bindings/_expected_decl_decl_collision_bindings.dart
pkgs/ffigen/test/collision_tests/expected_bindings/_expected_decl_symbol_address_collision_bindings.dart
pkgs/ffigen/test/collision_tests/expected_bindings/_expected_decl_type_name_collision_bindings.dart
pkgs/ffigen/test/collision_tests/expected_bindings/_expected_reserved_keyword_collision_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_comment_markup_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_dart_handle_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_forward_decl_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_functions_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_imported_types_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_native_func_typedef_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_opaque_dependencies_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_packed_structs_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_regress_384_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_struct_fptr_fields_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_typedef_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_unions_bindings.dart
pkgs/ffigen/test/header_parser_tests/expected_bindings/_expected_varargs_bindings.dart
pkgs/ffigen/test/large_integration_tests/_expected_cjson_bindings.dart
pkgs/ffigen/test/large_integration_tests/_expected_libclang_bindings.dart
pkgs/ffigen/test/large_integration_tests/_expected_sqlite_bindings.dart
pkgs/ffigen/test/native_test/_expected_native_test_bindings.dart
pkgs/jni/lib/src/lang/jcharacter.dart
pkgs/jni/lib/src/third_party/generated_bindings.dart
pkgs/jni/lib/src/third_party/global_env_extensions.dart
pkgs/jni/lib/src/third_party/jni_bindings_generated.dart
pkgs/jnigen/android_test_runner/lib/main.dart
pkgs/jnigen/example/in_app_java/lib/android_utils.dart
pkgs/jnigen/example/kotlin_plugin/example/lib/main.dart
pkgs/jnigen/example/kotlin_plugin/lib/kotlin_bindings.dart
pkgs/jnigen/example/kotlin_plugin/lib/kotlin_plugin.dart
pkgs/jnigen/example/pdfbox_plugin/lib/pdfbox_plugin.dart
pkgs/jnigen/example/pdfbox_plugin/lib/src/third_party/org/apache/pdfbox/pdmodel/PDDocument.dart
pkgs/jnigen/example/pdfbox_plugin/lib/src/third_party/org/apache/pdfbox/pdmodel/PDDocumentInformation.dart
pkgs/jnigen/example/pdfbox_plugin/lib/src/third_party/org/apache/pdfbox/pdmodel/_package.dart
pkgs/jnigen/example/pdfbox_plugin/lib/src/third_party/org/apache/pdfbox/text/PDFTextStripper.dart
pkgs/jnigen/example/pdfbox_plugin/lib/src/third_party/org/apache/pdfbox/text/_package.dart
pkgs/jnigen/lib/src/bindings/descriptor.dart
pkgs/jnigen/lib/src/elements/elements.g.dart
pkgs/jnigen/test/jackson_core_test/third_party/bindings/com/fasterxml/jackson/core/_package.dart
pkgs/jnigen/tool/command_runner.dart

Package publish validation ✔️

Details
Package Version Status
package:ffi 2.1.2 already published at pub.dev
package:ffigen 12.0.0-wip WIP (no publish necessary)
package:jni 0.9.0 already published at pub.dev
package:jnigen 0.9.0 already published at pub.dev
package:native_assets_builder 0.6.1 already published at pub.dev
package:native_assets_cli 0.5.4 already published at pub.dev
package:native_toolchain_c 0.4.1 already published at pub.dev
package:objective_c 0.0.1-wip WIP (no publish necessary)

Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.

@coveralls
Copy link

coveralls commented May 1, 2024

Coverage Status

coverage: 91.246% (+0.6%) from 90.676%
when pulling 49de3a4 on memerror
into 6a9282c on main.

@liamappelbe
Copy link
Contributor Author

liamappelbe commented May 1, 2024

Looks like the object ref counting test sometimes crashes (2/7 times I've run it). The pointer passed to object_getClass doesn't need to be a valid object, but it does need to be readable. I guess the memory page that the object was sitting in was unmapped right after the object was deleted.

I can work around this for the test by using one of the Mac virtual memory functions to test if the pointer is readable. These return an error code on failure rather than crashing. I'll do this for both the object and block tests.

I could potentially do this virtual memory test in the assert itself, but I feel like that would be overkill. I've never seen this crash on my local machine, so I think it's fairly rare. And unlike the test, if an unreadable pointer is passed to the assert, we were about to crash anyway, and the assert is just to give a best effort stack trace.

@liamappelbe liamappelbe marked this pull request as ready for review May 2, 2024 04:35
@dcharkes
Copy link
Collaborator

dcharkes commented May 2, 2024

I could potentially do this virtual memory test in the assert itself, but I feel like that would be overkill. I've never seen this crash on my local machine, so I think it's fairly rare. And unlike the test, if an unreadable pointer is passed to the assert, we were about to crash anyway, and the assert is just to give a best effort stack trace.

Hm, maybe we should do it in asserts. There could be a chance that it's more common on a different type of machine. We don't pay any costs for asserts in release mode, so I'm not worried about the speed.

We should probably also make some clear documentation on the FFIgen and ObjectiveC package readme's that if there's segfaults the first thing users should do is run with asserts enabled.

@liamappelbe
Copy link
Contributor Author

liamappelbe commented May 2, 2024

Hm, maybe we should do it in asserts. There could be a chance that it's more common on a different type of machine. We don't pay any costs for asserts in release mode, so I'm not worried about the speed.

Done.
Actually, it looks like that check is too strict to use in general code. ObjC sometimes uses "tagged pointers" (this means something different than in Dart, so I can't just mask off the tag).

We should probably also make some clear documentation on the FFIgen and ObjectiveC package readme's that if there's segfaults the first thing users should do is run with asserts enabled.

Yeah, I need to write a readme for package:objective_c. At the moment it just has a placeholder readme. When I do I'll include a note about this.

@github-actions github-actions bot added the type-infra A repository infrastructure change or enhancement label May 2, 2024
isLeaf: true, symbol: 'getObjectRetainCount')
external int _getObjectRetainCount(Pointer<Void> object);

int getObjectRetainCount(Pointer<ObjCObject> object) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive by: we don't like get in Dart land:
https://dart.dev/effective-dart/design#avoid-starting-a-method-name-with-get

It seems to apply to functions in addition to methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

@dcharkes dcharkes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

- name: Run VM tests
run: dart test
- name: Collect coverage
- name: Run VM tests and collect coverage
run: ./tool/coverage.sh
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this give nice logs if the tests fail on CI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. I thought the set -e line of the coverage script would be sufficient, but apparently not. This is another good argument to delete that script and use a Dart script that gives us more control. Filed #1120

// nice stack trace before the real crash, but it would be a problem if
// isValidObject broke due to a runtime update.
final mask =
Abi.current() == Abi.macosX64 ? 0x00007ffffffffff8 : 0x00000001fffffff8;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any documentation we can reference for these masks?

if not, maybe leave a link here to your investigation comment on the GitHub issue.

Nit: Maybe make two consts for the different archs on separate lines so it's easier to see how they differ.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

uint64_t getObjectRetainCount(ObjectRefCountExtractor* object) {
// The object ref count is stored in the largest byte of the object header,
// for counts up to 255. Higher counts do something more complicated.
return object->header >> 56;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider defining 256 as a special const and documenting on the Dart side that we give the refcount as [0, ..., 255, k256OrMore]? With k256OrMore = 256;. So that we have well-defined behavior above 255?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The values above 255 don't set this to a fixed constant, but they do always set the high bit (I think the lowest 7 bits of the count live here, and the rest live elsewhere). So I can check if the value is above 127 and then return k128OrMore.

doGC();
expect(counter.value, 0);
expect(objectRetainCount(obj1bRaw), 0);
calloc.free(counter);
});
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test that exercises the behavior above 255. (And define and document the behavior, also see the other comment.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@github-actions github-actions bot removed the type-infra A repository infrastructure change or enhancement label May 5, 2024
@liamappelbe liamappelbe merged commit b2b0e32 into main May 5, 2024
19 checks passed
@liamappelbe liamappelbe deleted the memerror branch May 5, 2024 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can we make memory management bugs more debuggable?
4 participants