Update or replace ffigen4 #13

Open
xrme opened this issue Feb 18, 2017 · 80 comments

Comments

@xrme
Member

xrme commented Feb 18, 2017

The interface databases that CCL uses are generated by a program called ffigen4. It is a set of patches to gcc-4.0.0 (see http://svn.clozure.com/publicsvn/ffigen4/)

These patches should be brought up-to-date. Alternatively, it might be an option to replace ffigen4 with some other tool. https://github.com/rpav/c2ffi might be suitable.



@eschaton

What about just using CFFI? Or would that be insufficient?

@xrme
Member Author

xrme commented Feb 18, 2017

The interface databases make the #_ and #$ reader macros work. These reader macros are used extensively in the implementation of CCL itself.

I'm not a CFFI user, so I'm not really qualified to say whether it is nicer than CCL's native FFI, but I can say that I think that CCL's native FFI is a great feature.

@ailisp
Contributor

ailisp commented Mar 11, 2017

As I understand it, CFFI is only a portability layer, much like what bordeaux-threads does for CCL's multiprocessing. So it may not be appropriate to use CFFI here, since we don't need to use the interface database in other CL implementations, and Clozure's FFI provides more functionality. I suggest updating ffigen, because writing a new backend for c2ffi would probably only be of interest for developing CCL itself; library authors usually use CFFI rather than platform-specific FFIs.

@ghost

ghost commented May 30, 2017

I agree that CCL's native FFI is a great feature. Unfortunately, few projects build on it; most build on CFFI instead. So I made a little project, ccl-cffi, which exposes the same function interface as CFFI but implements it on top of CCL more efficiently.
This way my programs run more efficiently, and I can still use other packages that depend on CFFI.

@ailisp
Contributor

ailisp commented Dec 27, 2017

I would like to work on this. Today I took a look at ffigen and CCL's FFI docs. Having to patch gcc every time to build ffigen doesn't look good. Using https://github.com/rpav/c2ffi is a good idea. Here I can either

  1. add a driver for c2ffi to generate the ffi format
    or
  2. replace lib/parse-ffi.lisp with a new Lisp program that takes c2ffi's sexp output as input and outputs cdb.

@xrme Which way do you prefer? Approach 2 uses Lisp, which is of course more fun than C++ :) Thanks!

@xrme
Member Author

xrme commented Dec 27, 2017

You are welcome to work on this if you want to, but I worry that it is a rather big project.

I agree that we should try out https://github.com/rpav/c2ffi. If we can parse a simple header file with c2ffi and convince ourselves that the output matches (well, is isomorphic to) the current ffigen output, then that will give us some confidence that we can make it work.

I don't think we can completely replace parse-ffi.lisp. But I see no problem with writing (in Lisp or whatever) some program that will reformat c2ffi's output (either json or sexp) into the s-expression style ffigen format that parse-ffi.lisp knows how to process. If we find that c2ffi is working for us, we can consider writing a c2ffi driver in C++ at a later time.

My only reservation about c2ffi is that it uses an unstable (if not private) API to clang. There is a library called libclang. It provides a stable, C-based API. When I last looked at it, I didn't see how libclang dealt with C preprocessor content.

It would be great if we could use libclang for the interface translator, but maybe this is either not possible, or too much work.

If you are feeling up to the task of investigating this, then great! Thank you and good luck to you. I'll help you any way I can. If you spend some time on it and decide that it is too much trouble, I will certainly understand that, too.

@ailisp
Contributor

ailisp commented Dec 29, 2017

@xrme Thanks for the detailed observations. I need to check whether c2ffi's output is isomorphic to ffigen's, once I get ffigen4 working and can compare their output. It's a bit difficult to get a working gcc 4.0 in a current environment, though it's easier in an old VM. But I will first try to patch the current gcc (7.2); if that works, at least we'll have a modern ffigen4 and can compare its output with c2ffi's.

c2ffi looks "relatively" stable, as it only gets updated for new LLVM versions and the example JSON output hasn't changed for 4 years:
https://github.com/rpav/c2ffi/blame/llvm-5.0.0/README.md
But I'll contact Ryan Pavlik to see if its API is stable (after making sure the output is isomorphic).

As for libclang, I did some searching, and there's a new flag for exposing C preprocessor content:
https://stackoverflow.com/questions/13881506/retrieve-information-about-pre-processor-directives
I agree with you that using libclang would be a lot of work. libclang is interesting and I would like to learn it, but it will take some time.

@xrme
Member Author

xrme commented Dec 29, 2017

If you haven't already, it may be helpful to consult https://trac.clozure.com/ccl/wiki/BuildFFIGEN and also https://trac.clozure.com/ccl/wiki/CustomFramework

In particular, there's a Mac-specific ffigen branch. I don't know if it builds on an up-to-date system. I have an ffigen binary that works.

Also see http://svn.clozure.com/publicsvn/ffigen4/ (in particular the branches/ directory)

@ailisp
Contributor

ailisp commented Dec 29, 2017

Thanks for these guides. Today I tried to build it on Arch Linux, but the gcc-4.0 makefile doesn't work with gcc-7.2. So I built it in a Fedora 4 VM, which has exactly gcc-4.0.0. The build is almost automatic, except that I needed to give the location of objc-act.c to patch it. I then compared its output with c2ffi's:
input:

#define FOO (1 << 2)

const int BAR = FOO + 10;

typedef struct my_point {
    int x;
    int y;
    int odd_value[BAR + 1];
} my_point_t;

enum some_values {
    a_value,
    another_value,
    yet_another_value
};

void do_something(my_point_t *p, int x, int y);

c2ffi's output

[
{ "tag": "const", "name": "BAR", "location": "/home/rpav/test.h:3:11", "type": { "tag": ":int" }, "value": 14 },
{ "tag": "struct", "name": "my_point", "id": 0, "location": "/home/rpav/test.h:5:16", "bit-size": 544, "bit-alignment": 32, "fields": [{ "tag": "field", "name": "x", "bit-offset": 0, "bit-size": 32, "bit-alignment": 32, "type": { "tag": ":int" } }, { "tag": "field", "name": "y", "bit-offset": 32, "bit-size": 32, "bit-alignment": 32, "type": { "tag": ":int" } }, { "tag": "field", "name": "odd_value", "bit-offset": 64, "bit-size": 480, "bit-alignment": 32, "type": { "tag": ":array", "type": { "tag": ":int" }, "size": 15 } }] },
{ "tag": "typedef", "name": "my_point_t", "location": "/home/rpav/test.h:9:3", "type": { "tag": ":struct", "name": "my_point", "id": 0 } },
{ "tag": "enum", "name": "some_values", "id": 0, "location": "/home/rpav/test.h:11:6", "fields": [{ "tag": "field", "name": "a_value", "value": 0 }, { "tag": "field", "name": "another_value", "value": 1 }, { "tag": "field", "name": "yet_another_value", "value": 2 }] },
{ "tag": "function", "name": "do_something", "location": "/home/rpav/test.h:17:6", "variadic": false, "parameters": [{ "tag": "parameter", "name": "p", "type": { "tag": ":pointer", "type": { "tag": "my_point_t" } } }, { "tag": "parameter", "name": "x", "type": { "tag": ":int" } }, { "tag": "parameter", "name": "y", "type": { "tag": ":int" } }], "return-type": { "tag": ":void" } }
]

ffigen's output. I modified the struct definition slightly for ANSI C; otherwise ffigen complains "struct size is variant". A lot of (macro ...) lines are also omitted here.

(macro ("test.h" 1) "FOO" "(1 << 2)")
(var ("test.h" 3)
 "BAR"
 (int ()) (static))
(struct ("" 0)
 "my_point"
 (("x" (field (int ()) 0 4))
  ("y" (field (int ()) 4 4))
  ("odd_value" (field (array 5 (int ())) 8 20))))
(type ("test.h" 9)
 "my_point_t"
 (struct-ref "my_point"))
(enum ("" 0)
 "some_values"(("a_value" 0)("another_value" 1)("yet_another_value" 2)))
(enum-ident ("" 0)
 "a_value" 0)
(enum-ident ("" 0)
 "another_value" 1)
(enum-ident ("" 0)
 "yet_another_value" 2)
(function ("test.h" 17)
 "do_something"
 (function
  ((pointer (typedef "my_point_t")) (int ()) (int ()) )
  (void ())) (extern))
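To make the comparison concrete, here is a minimal Python sketch of a reformatter that turns c2ffi-style JSON entries into ffigen-style s-expressions. It is only an illustration under stated assumptions: the function names (type_sexp, struct_sexp, enum_sexp) are hypothetical, only the type tags that appear in the sample above are handled, and the idea that ffigen field offsets and sizes are in bytes is inferred from the sample output rather than from any documentation.

```python
import json

# Only the primitive tags seen in the sample; a real tool would need many more.
PRIMITIVES = {":int": "(int ())", ":void": "(void ())"}


def type_sexp(t):
    """Render a c2ffi type object in the ffigen-style notation shown above."""
    tag = t["tag"]
    if tag in PRIMITIVES:
        return PRIMITIVES[tag]
    if tag == ":pointer":
        return "(pointer %s)" % type_sexp(t["type"])
    if tag == ":array":
        return "(array %d %s)" % (t["size"], type_sexp(t["type"]))
    if tag == ":struct":
        return '(struct-ref "%s")' % t["name"]
    # Assumption: any other tag is the name of a typedef.
    return '(typedef "%s")' % tag


def struct_sexp(entry):
    """c2ffi reports bits; the ffigen sample appears to use bytes (assumption)."""
    fields = " ".join(
        '("%s" (field %s %d %d))'
        % (f["name"], type_sexp(f["type"]), f["bit-offset"] // 8, f["bit-size"] // 8)
        for f in entry["fields"]
    )
    return '(struct ("" 0) "%s" (%s))' % (entry["name"], fields)


def enum_sexp(entry):
    idents = "".join('("%s" %d)' % (f["name"], f["value"]) for f in entry["fields"])
    return '(enum ("" 0) "%s" (%s))' % (entry["name"], idents)


SAMPLE = """
[{"tag": "struct", "name": "my_point", "fields": [
   {"tag": "field", "name": "x", "bit-offset": 0, "bit-size": 32, "type": {"tag": ":int"}},
   {"tag": "field", "name": "y", "bit-offset": 32, "bit-size": 32, "type": {"tag": ":int"}}]},
 {"tag": "enum", "name": "some_values", "fields": [
   {"tag": "field", "name": "a_value", "value": 0},
   {"tag": "field", "name": "another_value", "value": 1}]}]
"""

for entry in json.loads(SAMPLE):
    render = struct_sexp if entry["tag"] == "struct" else enum_sexp
    print(render(entry))
```

The sketch ignores source locations and the separate enum-ident lines that ffigen emits, and it does nothing about macros, which are the part c2ffi does not provide.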

For top-level variable, struct, typedef, enum and function definitions, c2ffi's output contains enough information to build an ffi definition. What ffigen has but c2ffi doesn't is macro definitions, though c2ffi has an option -M to dump macro definitions to a separate file:

const long __c2ffi_FOO = FOO;

It doesn't really parse the macro definition, but this is a clever workaround: it lets clang compile this snippet, so c2ffi then knows the value of FOO and is able to convert it into a defconst. But to generate an ffigen-style (macro ("test.h" 1) "FOO" "(1 << 2)") I would need to patch c2ffi :-(
I wondered how c2ffi would deal with macros like #define max(a,b) ((a)>(b)?(a):(b)), so I tried that too. c2ffi simply outputs nothing for it, while ffigen leaves a raw (macro ...) line as expected.
I also found that c2ffi needs to be updated for each new version of clang. So, based on your suggestions, my final plan is:

  1. Maintain ffigen's patches for the current gcc and maybe future gcc versions;
  2. Study libclang and c2ffi's source, and build a slight variant that includes raw macro lines and outputs the ffi format; ideally it would also use libclang's new support for C preprocessor content.

@xrme
Member Author

xrme commented Dec 30, 2017

Thanks for that research. Your planned approach seems good.

@ailisp
Contributor

ailisp commented Dec 31, 2017

Hi @xrme. I made a little progress today. Also, who is gb in the svn log? I would like to rebase and use his name in git. Thanks!
I made some minor changes to Makefile.in. Now it can be built with a recent gcc, but it still needs the gcc-4.0.0 source to be downloaded (in the gcc-4.0.0 branch). I tested building with gcc version 7.2.1 20171128 (GCC):
https://github.com/ailisp/ffigen
I am also trying to patch gcc-7.2.0 in the gcc-7.2.0 branch. However, building an unpatched gcc 7.2 took me ~2 hours, so progress is slow. If there's still no progress, I'll study libclang and c2ffi and work on a new ffigen.

@xrme
Member Author

xrme commented Dec 31, 2017

@ailisp: gb is Gary Byers gb@clozure.com. He doesn't have a GitHub id.

Don't feel pressured to get this done because I mentioned this issue from that FreeBSD 12 bug. I can always build an ffigen on an older system and copy it to a FreeBSD 12 system if I need to.

@ailisp
Contributor

ailisp commented Jan 3, 2018

@xrme Thanks. Recent progress: after reading ffigen.c, I found that its structure is difficult to fit onto libclang. libclang gives you the AST and you walk it, whereas the current ffigen.c is patched into gcc and executes during gcc's parsing step. Building a new version on libclang will be simpler than adapting the current ffigen.c. Sorry about this, though the previous work from Gary, Helmut and others is quite helpful and I'll attribute most of the contributions to them.
Current libclang support for preprocessing information is still incomplete. As we know, even for an empty .h file there are hundreds of lines of #define __GNUC__ 4, #define __linux__ 1, etc. libclang can only get __GNUC__ but not 4, and the filename for these macros is NULL. For macros in specific files, such as /usr/include/stdio.h or a foo.h, it's not a problem: libclang can get the start/end locations of the macro definition and I can read it from the file manually. So I can get:

(macro ("test.h" 1) "FOO" "(1 << 2)") 

but not:

(macro ("" 1) "__GNUC__" ???)

??? is not accessible (because there is no file to read it from). After a long attempt I feel ashamed to find that I can get these from clang -dM -E -x c /dev/null > predefined.h :-)
Also, I'm delighted to find that I only need to produce the raw visible macro lines; parse-ffi.lisp will take care of recursive replacement, macros with arguments, and parsing and evaluating C expressions. It's really great work.
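As a rough illustration of "produce the raw macro lines and let parse-ffi.lisp do the rest", here is a small Python sketch that scans header text for object-like #define lines and prints them in the (macro ...) form shown above. It is a stand-in, not what ffigen5 does: it uses a regular expression instead of libclang source locations, the name macro_lines is hypothetical, it skips function-like macros because their ffigen representation is not shown in this thread, and it does not handle line continuations or quote escaping.

```python
import re

# name, an optional "(" directly after the name (function-like macro), then the body
DEFINE = re.compile(r"^\s*#\s*define\s+([A-Za-z_]\w*)(\(?)(.*)$")


def macro_lines(filename, text):
    """Return ffigen-style (macro ...) lines for object-like macros in text."""
    out = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = DEFINE.match(line)
        if not m:
            continue
        name, paren, body = m.groups()
        if paren:
            # Function-like macro: skipped in this sketch (format not shown here).
            continue
        out.append('(macro ("%s" %d) "%s" "%s")' % (filename, lineno, name, body.strip()))
    return out


HEADER = "#define FOO (1 << 2)\n#define max(a,b) ((a)>(b)?(a):(b))\n"
for line in macro_lines("test.h", HEADER):
    print(line)
```

The same function could be pointed at the predefined.h produced by the clang -dM -E command above, which is one way to cover the built-in macros that have no file location.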

@ailisp
Contributor

ailisp commented Jan 8, 2018

Progress report: finished macros, enums, references to primitive types, part of references to pointer types, and defining variables of primitive type:
https://github.com/ailisp/ffigen5
While testing various kinds of pointer type, I found some very bad news about function pointers:
When parsing void (*f)(void);, the original ffigen will produce

(var ("test.h" 32)
 "f"
 (pointer (function
  ()
  (void ()))) (static))

With libclang, I can first recognize that f is a pointer, but clang_getPointeeType on this type returns a CXType_Unexposed, which means this information (the function prototype that f points to) is not exported to libclang. It can only be accessed through clang's C++ library LibTooling (which is also used by c2ffi). But its introduction says:
https://clang.llvm.org/docs/Tooling.html

Do not use LibTooling when you…:
want a stable interface so you don’t need to change your code when the AST API changes

What else I can get from libclang is the raw string of f's type: void (*)(void)
I'm thinking about 3 ways to handle this (all have some disadvantages):

  1. Isolate and wrap the required C++ part in a separate small library parallel to libclang, which would need updating as clang updates. It requires additional maintenance in the future but is more general, and it's possible there are other features only available in LibTooling.
  2. Though libclang doesn't give access to the function pointer's pointee type, it does give access to function declarations. Add a temporary line that replaces (*) with an internal name ___g1234_, so I'm able to produce something like: (function () (void ())). This also works if there are parameters. But if there are function pointer parameters, well, it gets a little messy.
  3. Ignore it and simply treat it as a void * pointer. From what I read in parse-ffi.lisp, doing this looks safe (maybe I lose something?). But I want the output to be at least as complete as the original ffigen's, so I don't like this way.

Any good ideas about this? Thanks!

@xrme
Member Author

xrme commented Jan 9, 2018

Thank you, @ailisp, for investigating this.

I really want to use the stable libclang interface if we possibly can.

Let's try your approach number 3. The C ABIs don't distinguish between a function pointer and any other generic pointer. Writing something like:

(#_qsort :address base :size_t nel :size_t width :address comp)

where comp is defined via defcallback seems fine to me. There's no way anyone is going to write out the type of the comp function in CCL's FFI notation (even if the notation supports function pointers, which I'm not sure it even does).

@ailisp
Contributor

ailisp commented Jan 9, 2018

Thank you! Sounds good, since it doesn't affect how we use such callbacks in Lisp. I also prefer a stable interface.

@eschaton

Please also file a bug against clang if you can, I think they’d want to know that this information isn’t exposed.

@ailisp
Contributor

ailisp commented Jan 18, 2018

@eschaton Hi, thanks and sorry for the late response. I was busy with an interview in San Francisco and just got back home. I posted a message to the cfe-dev mailing list: http://lists.llvm.org/pipermail/cfe-dev/2018-January/056566.html. I haven't heard any replies, though.
@xrme I'm mostly done with type references. I have a question about transparent unions. Does "transparent union" mean something like:

struct {
    int a;
    union {
        int b;
        float c;
    };
};

or the gcc extension __attribute__((__transparent_union__))?

@ailisp
Contributor

ailisp commented Jan 19, 2018

Today I finished almost all of the C part. What's left is ObjC classes and categories. I have a question about function definitions: around line 460 of ffigen.c:

      /* struct ffi_typeinfo *arg_type_info; */
        /*
          It seems like functions that take a fixed number of arguments
          have a "void" argument after the fixed arguments, while those
          that take an indefinite number don't.
          That's perfectly sane, but the opposite of what LCC does.
          So, if there's a "void" argument at the end of the arglist,
          don't emit it; if there wasn't, emit one.
          Sheesh.
        */

But what I tested seems to show the opposite:
given:

int af(int a, ...);
int bf(int a);

ffigen gives:

(function ("test.h" 62)
 "af"
 (function
  ((int ()) (void ()))
  (int ())) (extern))
(function ("test.h" 63)
 "bf"
 (function
  ((int ()) )
  (int ())) (extern))

Is this comment obsolete? I used the same behavior that ffigen produces.
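The observed behavior can be summarized in a few lines of Python. This is only a sketch of the rule seen in the output above (a trailing (void ()) is emitted for variadic functions and nothing extra for fixed-arity ones); the helper name function_sexp and its parameters are hypothetical, and the whitespace differs from ffigen's, which should not matter for an s-expression reader.

```python
def function_sexp(filename, line, name, arg_types, return_type, variadic):
    """Format a function declaration the way the ffigen output above does.

    arg_types and return_type are already-rendered type strings such as "(int ())".
    """
    args = list(arg_types)
    if variadic:
        # Matches the observed output for af(int a, ...): a trailing (void ()).
        args.append("(void ())")
    return '(function ("%s" %d) "%s" (function (%s) %s) (extern))' % (
        filename, line, name, " ".join(args), return_type)


print(function_sexp("test.h", 62, "af", ["(int ())"], "(int ())", variadic=True))
print(function_sexp("test.h", 63, "bf", ["(int ())"], "(int ())", variadic=False))
```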

@xrme
Member Author

xrme commented Feb 20, 2018

@ailisp I've had a chance to experiment with your code, and it looks very promising. It is so helpful that you figured out so much of the libclang API. Thank you very much.

I need to generate a new set of interface databases from FreeBSD 12 header files. I spent part of today hacking on and using (my private fork of) your libclang-based ffigen, and I think it's going to work. I'm planning to spend the next two days on this and see how far I get. Starting Thursday, I'll be away for two weeks and probably won't have a chance to do very much hacking on CCL, but I am hoping that two days will be enough time to get it done.

FreeBSD will be a good start because we won't have to worry about dealing with Objective-C.

@ailisp
Contributor

ailisp commented Feb 20, 2018 via email

@GOFAI

GOFAI commented Sep 11, 2018

Any further progress on this?

@ailisp
Contributor

ailisp commented Sep 11, 2018

Hi @GOFAI
@xrme has further updates in https://github.com/xrme/ffigen5; I'm not sure whether it's fully working.

@xrme
Member Author

xrme commented Sep 11, 2018

I have gotten far enough with a new ffigen to be able to generate working headers for FreeBSD. I have been meaning to track down that code and check it in, but I haven't done that yet. I will try to do that soon.

@GOFAI

GOFAI commented Sep 12, 2018

I'm particularly interested in generating interface files for the newer macOS frameworks like SceneKit. How complete is the ObjC functionality?

@ailisp
Contributor

ailisp commented Sep 12, 2018 via email

@GOFAI

GOFAI commented Sep 14, 2018

Has anyone gotten ffigen4 to compile on macOS using a recent XCode? The ObjC blocks version (ffigen-apple-gcc-5646/ffigen4) exits compilation on the following errors:

../../gcc-5646/gcc/toplev.c:564:1: error: redefinition of a 'extern inline'
      function 'floor_log2' is not supported in C99 mode
floor_log2 (unsigned HOST_WIDE_INT x)
^
../../gcc-5646/gcc/toplev.h:174:1: note: previous definition is here
floor_log2 (unsigned HOST_WIDE_INT x)
^
../../gcc-5646/gcc/toplev.c:599:1: error: redefinition of a 'extern inline'
      function 'exact_log2' is not supported in C99 mode
exact_log2 (unsigned HOST_WIDE_INT x)
^
../../gcc-5646/gcc/toplev.h:180:1: note: previous definition is here
exact_log2 (unsigned HOST_WIDE_INT x)
^

I'd try compiling it using the Homebrew formula that provides Apple's gcc 4.2.1-5666.3, but it only works on OS X 10.9 or older.

@ailisp
Contributor

ailisp commented Sep 14, 2018 via email

@GOFAI

GOFAI commented Nov 13, 2019

I'm finally getting around to trying to test this--got caught up in work stuff. In the version of ffigen5 that works with h-to-ffi.sh, was the CFLAGS argument in the script left the same as for ffigen4? It's complaining at me about some of the inputs:

error: unknown argument: '-quiet'
error: unknown argument: '-fffigen'
warning: /Users/Walrus/ffigen5/ffigen5: 'linker' input unused [-Wunused-command-line-argument]

@GOFAI

GOFAI commented Nov 15, 2019

OK, I think I'm getting close. h-to-ffi.sh now processes the Cocoa headers without errors. When I tried to test it on SceneKit, though, it turns out that there are a few typedefs intended for SIMD that confuse it because the associated CXTypeKind is "Unexposed." Here's an example from simd/vector_types.h:

/*! @abstract A vector of two 16-bit signed (twos-complement) integers.
 *  @description In C++ and Metal, this type is also available as
 *  simd::short2. The alignment of this type is greater than the alignment
 *  of short; if you need to operate on data buffers that may not be
 *  suitably aligned, you should access them using simd_packed_short2
 *  instead.                                                                  */
typedef __attribute__((__ext_vector_type__(2))) short simd_short2;

I'm doing some exploratory programming to determine what can be extracted from the cursors for these, but I'm not actually sure how they ought to be represented in the ffigen output. Maybe something like this?

(type ("/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk/usr/include/simd/vector_types.h" 220)
 "simd_short2"
 (array 2 (short ())))

@GOFAI

GOFAI commented Nov 17, 2019

It turns out that LLVM 9's libclang exposes the ExtVector type, solving the problem above, and I also found a way to handle protocol-qualified types. I've made semi-educated guesses about how to handle both of these, because I'm not sure that they existed in the version of Cocoa that you provided the historical ffigen output for.

My current version of ffigen5 writes .ffi files for macOS frameworks that parse-ffi.lisp reads in without complaint, which suggests that the formatting of the .ffi file is OK. parse-ffi.lisp seems to hang while writing out the .cdb files, though, but I'm not sure why because it's not giving me useful error messages and the .cdb files themselves don't appear to be human-readable. (The issue could have originated with the arguments passed in the populate.sh script I used, as well.) A bug report on my ffigen5 fork from one of you familiar with running the old system would be very appreciated.

Update: ./populate.sh and parse-ffi now run without complaint for the Cocoa and SceneKit headers from the macOS 10.14 SDK as of 11/19.

@GOFAI

GOFAI commented Nov 27, 2019

Update: with a few tweaks to the ObjC bridge to account for differences in the ffigen5 output, (require "COCOA") with the new interface files now gets to the point where it tries to open a new listener and then chokes because of a difficulty encountered while trying to execute HEMLOCK-INTERFACE:BUFFER-END-MARK for the listener's echo buffer. Is that akin to the difficulties you've been having with the different store versions of the IDE?

The tweaks are as follows: it turns out one of the struct definitions for an ObjC class in the ffigen4 output, that for NSConstantString, is used by the bridge, so it has to be specified manually. A few low-level clang constructs, namely __va_list_tag and __builtin_va_list, turn up in the ffigen5 output and also need to be defined. I'll be happy to share what I've been using for those as well as the tweaked populate.sh files with anyone who wants to experiment. Once everything is working I'll clean up the ffigen5 code.

@GOFAI

GOFAI commented Apr 17, 2020

Since I've been stuck inside due to the pandemic, I've been revisiting this again and have managed to recreate the same issue on another machine with a more recent macOS (10.14 vs. 10.13) and a saner compiler install. With the new interface files and the necessary tweaks to the ObjC bridge, (require "COCOA") gets very far along, including adding the "CCL" icon to the dock. My read is that the bridge and the interface files must be close to right or it'd flame out a lot earlier.

It seems that the specific issue that's causing it to fail has to do with idiosyncrasies in the cocoa ide; in particular, (#/textStorage echo-area-view) returns a NS-TEXT-STORAGE with the new interface files but a HEMLOCK-TEXT-STORAGE with the old ones (cocoa-editor.lisp line 2274). The hemlock-buffer method falls through on the former and the echo-buffer gets set to NIL. Does anyone with experience hacking on the IDE have any advice? A possible remedy is to define a #/textStorage method for echo-area-view.

I'll be more than glad to share interface files/etc. generated with ffigen5 for anyone who wants to experiment.

@GOFAI

GOFAI commented Apr 20, 2020

I've managed to use ffigen5 to generate working interface files for Cocoa and SceneKit from the OSX 10.8 SDK. With these files (require "COCOA") launches a working IDE under macOS 10.14.

I need to add one feature to the ffigen5 source to make it work cleanly, but it looks like this thing has crossed the threshold of usability, subject to a few minor changes to the ObjC bridge. Unfortunately I think getting recent SDK headers to work is going to require greater modification to the ObjC bridge.

@eugeneia
Contributor

Woot! Amazing hacks happening. :-)

@GOFAI

GOFAI commented Apr 21, 2020

Upon experimentation, I've managed to make working interface directories for the 10.9 SDK Cocoa headers with ffigen5. Unfortunately, the 10.10 SDK produces the same issues as the 10.14 SDK. If I had to guess, the issue is very subtle and relatively low-level.

@xrme
Member Author

xrme commented Apr 21, 2020

That's awesome progress. The current interfaces are generated from the 10.8 SDK, and you've been able to make working interfaces from the 10.9 SDK.

It may well be that we'll need to make some modifications to deal with newer constructs used in the headers from 10.10 (and later).

What's the link to your ffigen repository, if you are willing to let me take a look?

@GOFAI

GOFAI commented Apr 21, 2020

My fork of ffigen5 is here. This is set up to build against Homebrew clang--I'm not sure if XCode currently provides a current enough libclang for it to work. It's also somewhat picky about include paths: h-to-ffi.sh needs an -isystem directive pointing to the correct stdarg.h header, which sometimes isn't in a sensible place under macOS. I've put in a sample populate.sh showing how it's set up on one of my machines.

@GOFAI

GOFAI commented Apr 21, 2020

As for the tweaks to the ObjC bridge needed for these interface files to work: these type definitions can simply be evaluated in the REPL before running (require "COCOA"):

(def-foreign-type nil
                  (:struct #>NSConstantString
                           (:isa (:* (:struct #>objc_class)))
                           (:bytes (:* :char))
                           (#>numBytes :int)
                           (:_unused :int)))

;;; Need to define ObjC "instancetype" keyword as a proxy for ObjC id
(def-foreign-type :instancetype :id)

;;; Some kind of definition for __builtin_va_list (this is a clang builtin)
;;; and probably __va_list_tag too...

;;; Supposedly it's defined as follows on AMD64...
;;; typedef struct {
;;;   unsigned int gp_offset;
;;;   unsigned int fp_offset;
;;;   void *overflow_arg_area;
;;;   void *reg_save_area;
;;; } va_list[1]

(def-foreign-type :__va_list_tag
                  (:struct nil
                           (:gp_offset :unsigned)
                           (:fp_offset :unsigned)
                           (:overflow_arg_area :address)
                           (:reg_save_area :address)))

(def-foreign-type :__builtin_va_list (:array :__va_list_tag 1))

The following change to objc-support.lisp is sometimes, but not always, necessary:

;;; from objc-support.lisp, line 590
;;; FOREIGN-SIZE only works on foreign types defined with an alias
;;; Not always, but often necessary to make this change depending
;;; on combination of SDK version and compiler flags

(defun initialized-nsobject-p (nsobject)
  (or (objc-class-p nsobject)
      (objc-metaclass-p nsobject)
      (has-lisp-slot-vector nsobject)
      (let* ((cf-p (%cf-instance-p nsobject)) 
             (isize (if cf-p (external-call "malloc_size" :address nsobject :size_t) (%objc-class-instance-size (#/class nsobject))))
             (skip (if cf-p (+ (foreign-size :id :bytes) 4 #+64-bit-target 4) (foreign-size :id :bytes))))
        (declare (fixnum isize skip))
        (or (> skip isize)
            (do* ((i skip (1+ i)))
                 ((>= i isize))
              (declare (fixnum i))
              (unless (zerop (the (unsigned-byte 8) (%get-unsigned-byte nsobject i)))
                (return t)))))))

@eschaton

eschaton commented Apr 21, 2020

Support for the latest language constructs in Apple’s headers is why I’ve advocated that ffigen be reimplemented atop libclang in the strongest possible terms. It should be possible to use it to parse the headers into something that Lisp can call the runtime with “generically,” though there might be some subtlety in when to invoke which objc_msgSend variant for structure return types and such.

Edit: I see that ffigen5 is based on libclang, that’s great! Thank you!

@GOFAI

GOFAI commented Apr 23, 2020

I think I may have an idea as to what's causing the recent SDKs to fail--it may be an easier fix than I'd feared. But it turns out that some low-level ObjC stuff isn't in the ffigen5 output and the ObjC bridge breaks down in CCL 1.12 with it because of their absence from the libc interface. (For example, ObjC #$YES and #$NO.) The IDE works in 1.12 with the old interfaces because these types are defined from the 10.8 SDK in the Cocoa interface, but with the ffigen5 Cocoa interfaces they aren't defined anywhere. Could you please send me the .ffi files from legacy ffigen used to make the Cocoa interface for CCL 1.12, particularly that for objc_exception.h?

@xrme
Member Author

xrme commented Apr 24, 2020

@GOFAI You can grab a copy from https://ccl.clozure.com/~rme/Cocoa.ffi.gz.

@GOFAI

GOFAI commented Apr 24, 2020

Thanks! I've managed to fix the #$YES/#$NO issue, but now CCL 1.12 with the new ffigen5 interfaces gets stuck when trying to add a modeline when opening the listener window. That reminded me of something you said in this issue last year. Perhaps the problem I'm seeing is similar?

Under 1.11 with the 10.9 SDK I had to tweak a few things to correct some faulty behavior opening documents, in the course of which I discovered that the CLOS-type initializers for some ObjC classes (e.g., those using make-instance and keywords) aren't being set up properly. Instead of simply not working or throwing errors they can sometimes return improperly-initialized instances of related classes. This turned out to be the cause of the issue I was having with the echo-area, but switching to the #/ syntax cleared it up.

@eschaton

Is the new ffigen following #import and #include directives? The boundaries between the objc, CoreFoundation, and Foundation headers can be pretty fluid for a variety of reasons, and there’s heavy reliance on header imports causing the right symbols to be visible at ObjC/Swift compile time. In other words, if ffigen works by processing the result of #import <Foundation/Foundation.h> in an ObjC file to produce a package for Foundation it should do the right thing, but if it works by processing each of the headers in Foundation.framework/Headers then it may not.

There are also a bunch of macros set by the compiler that ffigen will need to take into account (which hopefully will happen as a side-effect of using libclang), most important among them MACOSX_VERSION_MIN_REQUIRED and MACOSX_VERSION_MAX_ALLOWED. Without being specified these will both default to the same value (e.g. 101500 when building against the macOS 10.15 SDK) while to target earlier releases of macOS you’ll need to have the former set to a lower value. (This is varied for ObjC/Swift apps by setting the Deployment Target build setting, say to 10.12, which then causes -mmacosx-version-min=10.12 to be passed to the compiler and linker.)

Finally, a few years ago Apple added C++-style “typed enum declarations” to ObjC, which provide significant developer benefit and are critical for how enum declarations import into Swift. In particular, Apple didn’t just add the language feature, but also a couple of __attribute__s to describe the intent of an enum declaration: whether the items should be treated as combinable or mutually exclusive. I know CFFI doesn’t take this into account, but it would be good if ffigen did.

@GOFAI

GOFAI commented Apr 24, 2020

Using libclang, #import and #include seem to work as they should. Passing -mmacosx-version-min= to ffigen5 changes its output, which suggests that MAC_OS_X_VERSION_MIN_REQUIRED is getting set correctly.

So far I've been focused on just reproducing ffigen4 outputs; I'm not sure that the existing .ffi file format expected by CCL's PARSE-FFI can actually express the new features that Apple has been adding to ObjC, or if CCL's ObjC bridge can make use of them. I'm in favor of adding that functionality in principle, but significant changes to the bridge could be non-trivial to implement.

@bitmappergit

@GOFAI could you push your updates for that?

@GOFAI

GOFAI commented May 16, 2020

My current version can be found here. This version of ffigen5 built interfaces to Cocoa from the 10.9 SDK that worked under CCL 1.11 as well as interfaces to SceneKit, SpriteKit, and Model I/O that work with CCL 1.12. (Unfortunately, I can't ascertain why the ffigen5 interfaces to Cocoa don't work under 1.12.) The populate.sh scripts will need to be tweaked to match your environment--please let me know if you have any questions.

@bitmappergit

Isn't that missing the fixes for YES and NO?

@GOFAI

GOFAI commented May 16, 2020

Fixes for YES and NO need to be added when building interfaces from the 10.14 SDK but are unnecessary with the earlier ones I've tested (10.8, 10.9, 10.10, and 10.11). I put a little effort into trying to come up with a more elegant solution but got distracted with trying to make additional framework interfaces that work with CCL 1.12. My interim solution consisted of kludging it by cutting and pasting the needed two entries into the .ffi files I generated from the 10.14 SDK. The necessary two lines are:

(macro ("" 0) "YES" "((BOOL)1)")
(macro ("" 0) "NO" "((BOOL)0)")
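For anyone who would rather script that cut-and-paste step, here is a minimal Python sketch that appends the two entries to a .ffi file only when they are missing. It assumes each macro form sits on one line in the shape `(macro ("file" line) "NAME" "body")`, which is inferred from the two lines above rather than from any format specification, and that appending at the end of the file is acceptable to PARSE-FFI. The function names are hypothetical.

```python
import re
from pathlib import Path

# The two entries quoted in the comment above, copied verbatim.
BOOL_MACROS = {
    "YES": '(macro ("" 0) "YES" "((BOOL)1)")',
    "NO": '(macro ("" 0) "NO" "((BOOL)0)")',
}

# Assumed shape of a macro form: (macro ("file" line) "NAME" "body")
MACRO_NAME = re.compile(r'^\s*\(macro\s+\("[^"]*"\s+\d+\)\s+"([^"]+)"')


def add_missing_bool_macros(ffi_text: str) -> str:
    """Return ffi_text with YES/NO macro entries appended if absent."""
    defined = set()
    for line in ffi_text.splitlines():
        match = MACRO_NAME.match(line)
        if match:
            defined.add(match.group(1))
    missing = [entry for name, entry in BOOL_MACROS.items() if name not in defined]
    if not missing:
        return ffi_text  # nothing to do; leave the text untouched
    separator = "" if ffi_text == "" or ffi_text.endswith("\n") else "\n"
    return ffi_text + separator + "\n".join(missing) + "\n"


def patch_ffi_file(path: Path) -> bool:
    """Patch one .ffi file in place; return True if it was changed."""
    original = path.read_text()
    patched = add_missing_bool_macros(original)
    if patched != original:
        path.write_text(patched)
        return True
    return False
```

Running `patch_ffi_file` over the .ffi files generated from the 10.14 SDK before invoking PARSE-FFI would reproduce the manual workaround; running it a second time leaves the files unchanged.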

@bitmappergit

This is basically the last thing before I can submit a pull request with Dark Mode support for the IDE.

It depends on the new system color definitions introduced in the Cocoa.framework.
Right now I've got a hack that works by directly sending the message to NSColor via objc-send-message.

@GOFAI

GOFAI commented May 18, 2020

From my experiments, it looks like getting the full Cocoa headers from the 10.14 SDK to work will be hard and may require changes, possibly significant, to the ObjC bridge. But for your use case, there's probably an immediately applicable workaround. PARSE-FFI consumes every .ffi file in the directory it's run on and adds its contents to the .cdb files it builds. Since it sounds like you just need a small number of methods that use only constructs already present in the existing Cocoa headers, it should be possible to put the ffigen5 output for those methods/classes into a short .ffi file and include it with the ffigen4 .ffi files used for the Cocoa interface in CCL 1.12. Running PARSE-FFI as usual should then make the needed methods available to add Dark Mode to the IDE.

@snunez1

snunez1 commented Mar 16, 2024

Just wondering how (if) this was concluded?

@GOFAI

GOFAI commented Mar 17, 2024

I paused my efforts on this waiting for the Apple Silicon port of CCL, as it would make sense to modernize both the ObjC bridge and the header files together to work with the new version. I'd basically given up hope in the last couple of years that the port was going to happen, but recent messages on openmcl-devel suggest that maybe the arm64 port will eventually come about after all. If so, I have a lot of ideas from my experiments back in 2020 about how to rework the ObjC bridge and header interface system that I'd be game for trying to implement.

@xrme
Member Author

xrme commented Apr 17, 2024

@GOFAI, I am going to pull from your repo and then transfer xrme/ffigen5 to the Clozure organization.

Please let me know if you don't want this, but I'd like to build on your work if that's OK with you.

@ailisp I hope this is OK with you also. Your work on figuring out how libclang works has been extremely useful.

@xrme
Member Author

xrme commented Apr 17, 2024

I decided to put everything in https://github.com/Clozure/ccl-ffigen.

My idea here is that source code for the ffigen tool will be there, as well as the scripts and so forth for generating the ffi files for each of the platforms.

It may be that we want part or all of that in the ccl repository, but I propose that we use this location for now.

@xrme
Member Author

xrme commented Aug 14, 2024

With the 1.13 release, the FreeBSD, Linux, and Solaris-ish ports all are using interface databases built using the new translator.

The Mac and Windows ports are still carrying forward the old interface databases.
