New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce unified Lua ffi.cdef builder #4202
Comments
Just to realize the problem: SWIM needs to duplicate 96 lines from two headers; schema.lua duplicates 97 lines; tuple.lua duplicates 49 lines; buffer.lua duplicates 29 lines; digest.lua duplicates 10 lines; error.lua duplicates 27 lines; fiber.lua - 8 lines; iconv.lua - 5 lines; uuid.lua - 24 lines. |
I propose to combine the two options into: |
Thanks for the proposal! It is good that this thing is started. |
Wow, the feature is one vital for both Tarantool developers and users, and I'd like to add my two cents. At first, I guess As for me, the processing of ffi.include('example.h') Moreover I see another approach to differ functions to be exported via #define FFISAFE
FFISAFE int
box_example1(void);
FFISAFE int
box_example1(void); There are only functions mentioned in the original proposal. And what about typedefs and enums declarations? |
I like Talking of Talking of What bother me more - can we do that import at runtime? And how? |
We can just load all generated definitons at once at startup, before loading built-in Lua modules. This way ffi.{cdef_file,cdef_tag,import} will not be needed. This solution looks as easiest for me and it handles the stated problem. Aside of that we don't need to design a general purpose feature that nobody actually asks. FFISAFE proposal looks interesting, however I don't sure it actually handles FFI misuse problem: what if a function that is called from FFISAFE one is changed? It seems that this is the primary source of misusages. Prevention of FFI misusages deserves its own issue. I guess it requires a way more work, then automatic generation of cdefs for FFI. |
Thank you all for your comments!
|
I propose to don't add any Waiting for comments from @Gerold103 and @igormunkin. |
What is 'generated declarations' you are talking about? Only a few things are a part of public API in module.h. I need to be able to export internal not generated structures, functions, types. Otherwise this is useless. Moreover, public 'generated' API is fixed. It never changes, so this code does not need that feature. Also what is 'load at startup'? You need Talking of the problem about 'misusage' - it has nothing to do with this ticket. The ticket is purely about simplification of |
I'm not about So, I propose to generate a file that may look like so: local ffi = require('ffi')
ffi.cdef([[
<all declarations between /* \cond ffi */ and /* \endcond ffi */ here>
]]) Then add the file name to |
Thanks for the explanation. Yeah, that looks good to me as well. I don't have a strict preference to |
@Totktonada, thanks for the proposal! Yours one looks to fit better regarding the problem to be solved. For internal usage I'm totally OK with your solution, and I totally agree that we need another issue for "FFI misusage" problem. Could you please dump the final proposal as @servoin asked before? |
So as I currently understand, there should be some module which generates .lua source file containing ffi.cdef with definitions from some .h file using tags to include only parts of this .h file. Correct me if I'm wrong. What I don't understand for now is how this module will be called during the build process. Could you please give an example? |
#4202 (comment) is not enough? Don't know what to add here.
In the similar way as module.h is generated. See |
BTW, should we do the same for |
Currently, there are ffi.cdefs not only in tarantool, but also in its submodules, namely: msgpuck and small. The question is: what should we do with these ffi.cdefs in submodules? Leave them as is or apply changes for unified cdef builder (it will require separate patches for submodules)? Also, the question about luaL_cdef() calls is still open. |
IMO, yes, because this libraries are standalone and tarantool depends on them but not the opposite |
Looks connected with #911. |
Lua ffi is quite fast and simple mechanism allowing to call C functions directly, without using Lua C API. It is usually faster, sometimes much faster. But ffi requires to define all the C methods again in a Lua file in ffi.cdef function. It is as bad as any other code duplication - when the original API is changed, a one could forget to update Lua files and ffi.cdef arguments. It is especially painful, when ffi C definition is really big. The goal is to introduce a mechanism to generate ffi C definitions. The implementation. =================== The mechanism to generate ffi C definitions is similar to how module.h is generated. There is a script extra/fficdefgen which parses specified set of C headers and produces content for ffi.cdef. The idea is similar to module.h generation: use comment tags in C headers to rip off the content between two tags: /* \cond ffi */ and /* \endcond ffi */ Everything between these tags will be the output of this script. The output of extra/fficdefgen is placed into src/lua/load_ffi_defs.lua file. This file contains three parts: 1. Header - from src/lua/load_ffi_defs.header.lua 2. Output from extra/fficdefgen 3. Footer - from src/lua/load_ffi_defs.footer.lua So we can add some C definitions manually before and after those which are automatically generated. This is important since some definitions cannot be added automatically, e.g. from system headers, submodules and containing identifiers which are Lua keywords (see comments in load_ffi_defs.header.lua and load_ffi_defs.footer.lua). load_ffi_defs.lua file is produced by rebuild_module_fficdef function in cmake/fficdef.cmake CMake script. load_ffi_defs.lua is then compiled into src/lua/load_ffi_defs.lua.c source (see lua_sources list in CMakeLists.txt). load_ffi_defs.lua content is referenced in lua_modules in src/lua/init.c and is loaded at startup with all ffi.cdef definitions gathered from specified C headers and load_ffi_defs.header.lua/load_ffi_defs.footer.lua. So, what we need to include in C headers are the tags. But how do we know which C headers to include into load_ffi_defs.lua? There is a list (similar to one for module.h) in src/CMakeLists.txt file called fficdef_headers. rebuild_module_fficdef function is also called from CMakeLists.txt. Important notice. For now, only core C headers are included into fficdef_headers list. This means that headers from Tarantool submodules are not included and some C definitions are thereby manually placed into load_ffi_defs.header.lua/load_ffi_defs.footer.lua. Currently, this ffi.cdef builder is kind of internal mechanism, but can be used by anyone following the scenario: 1. Include /* \cond ffi */ and /* \endcond ffi */ tags into needed C header. 2. Include needed C header in fficdef_headers list in CMakeLists.txt. Closes: #4202
Lua ffi is quite fast and simple mechanism allowing to call C functions directly, without using Lua C API. It is usually faster, sometimes much faster. But ffi requires to define all the C methods again in a Lua file in ffi.cdef function. It is as bad as any other code duplication - when the original API is changed, a one could forget to update Lua files and ffi.cdef arguments. It is especially painful, when ffi C definition is really big. The goal is to introduce a mechanism to generate ffi C definitions. The implementation. =================== The mechanism to generate ffi C definitions is similar to how module.h is generated. There is a script extra/fficdefgen which parses specified set of C headers and produces content for ffi.cdef. The idea is similar to module.h generation: use comment tags in C headers to rip off the content between two tags: /* \cond ffi */ and /* \endcond ffi */ Everything between these tags will be the output of this script. The output of extra/fficdefgen is placed into src/lua/load_ffi_defs.lua file. This file contains three parts: 1. Header - from src/lua/load_ffi_defs.header.lua 2. Output from extra/fficdefgen 3. Footer - from src/lua/load_ffi_defs.footer.lua So we can add some C definitions manually before and after those which are automatically generated. This is important since some definitions cannot be added automatically, e.g. from system headers, submodules and containing identifiers which are Lua keywords (see comments in load_ffi_defs.header.lua and load_ffi_defs.footer.lua). load_ffi_defs.lua file is produced by rebuild_module_fficdef function in cmake/fficdef.cmake CMake script. load_ffi_defs.lua is then compiled into src/lua/load_ffi_defs.lua.c source (see lua_sources list in CMakeLists.txt). load_ffi_defs.lua content is referenced in lua_modules in src/lua/init.c and is loaded at startup with all ffi.cdef definitions gathered from specified C headers and load_ffi_defs.header.lua/load_ffi_defs.footer.lua. So, what we need to include in C headers are the tags. But how do we know which C headers to include into load_ffi_defs.lua? There is a list (similar to one for module.h) in src/CMakeLists.txt file called fficdef_headers. rebuild_module_fficdef function is also called from CMakeLists.txt. Important notice. For now, only core C headers are included into fficdef_headers list. This means that headers from Tarantool submodules are not included and some C definitions are thereby manually placed into load_ffi_defs.header.lua/load_ffi_defs.footer.lua. Currently, this ffi.cdef builder is kind of internal mechanism, but can be used by anyone following the scenario: 1. Include /* \cond ffi */ and /* \endcond ffi */ tags into needed C header. 2. Include needed C header in fficdef_headers list in CMakeLists.txt. Closes: #4202
Created infrastructure which simplifies building of cdef header for integration into SQL AST module. We will use approach similar to module_api scripts, but integrated into build the different way, e.g: 1. we created `cdef_source` cmake macro, which is similar to `lua_source` macro and embeds source literal to the Tarantool executable, but this file is created as result of preprocessing similar to module_api; 2. we use `\cond ffi` / `\endcond ffi` for marking regions we would like to keep for exporting to FFI; 3. we preprocess using the referenced file as start. Preprocessor invoked with `-CC` option first to retain `\cond`/'\endcond` comments ; 4. We cleanup the generated file using sed to delete empty lines end prerpocessor noise; 5. We leave only relevant `\(end)cond` region; 6. Then preprocess again to remove comments and trim header; 7. The generated file to be embedded to C as literal via `txt2c`; 8. And at the initialization of Tarantool we could use this literal for luaL_cdef to make known whole AST hierarchy via single hop; Could not resist and not note that I hate cmake, it took more than single day to figure out the proper way to construct $CC command line extended with all necessary include paths. I had to use `COMMAND_EXPAND_LISTS` `add_custom_command`' option, which is there only since cmake 3.8, and if we had to use anything older then we ought to rework this approach using external script. But I'd prefer we not do it. Inspired by #4202
Prior commit has established facilities for semi-automatic build of header file whih might be consumed by ffi.cdef. Here we use these facilities for SQL AST generation. All relevant headers have been marked appropriately with: - `/** \cond ffi */` for range begin; - `/** \endcond ffi */` for range end. This all sounds very similar to module_api, but integrated in a different way - we create `sql_ast_cdef[]` literal, not publicly accessible headers. Please not be surprised with strange renames of typedefs used with `struct StructName` here and there. It simplifies integration with LuaJIT ffi compiler. NB! Eventually we have to fix error propagation from LuaJIT ffi compiler in luaL_cdef. At the moment it simply produces 'Unknown exception' and stops. I had to use luajit for checking syntax errors there, which is gracefully shown there ``` 20:08 $ luajit loadcdef.lua luajit: loadcdef.lua:12: declaration specifier expected near 'ExprList' \ at line 83 stack traceback: [C]: in function 'cdef' loadcdef.lua:12: in main chunk [C]: at 0x5584f179c1d0 ``` Still is inspired by #4202
Prior commit has established facilities for semi-automatic build of header file whih might be consumed by ffi.cdef. Here we use these facilities for SQL AST generation. All relevant headers have been marked appropriately with: - `/** \cond ffi */` for range begin; - `/** \endcond ffi */` for range end. This all sounds very similar to module_api, but integrated in a different way - we create `sql_ast_cdef[]` literal, not publicly accessible headers. Please not be surprised with strange renames of typedefs used with `struct StructName` here and there. It simplifies integration with LuaJIT ffi compiler. NB! Eventually we have to fix error propagation from LuaJIT ffi compiler in luaL_cdef. At the moment it simply produces 'Unknown exception' and stops. I had to use luajit for checking syntax errors there, which is gracefully shown there ``` 20:08 $ luajit loadcdef.lua luajit: loadcdef.lua:12: declaration specifier expected near 'ExprList' \ at line 83 stack traceback: [C]: in function 'cdef' loadcdef.lua:12: in main chunk [C]: at 0x5584f179c1d0 ``` Still is inspired by #4202
Created infrastructure which simplifies building of cdef header for integration into SQL AST module. We will use approach similar to module_api scripts, but integrated into build the different way, e.g: 1. we created `cdef_source` cmake macro, which is similar to `lua_source` macro and embeds source literal to the Tarantool executable, but this file is created as result of preprocessing similar to module_api; 2. we use `\cond ffi` / `\endcond ffi` for marking regions we would like to keep for exporting to FFI; 3. we preprocess using the referenced file as start. Preprocessor invoked with `-CC` option first to retain `\cond`/'\endcond` comments ; 4. We cleanup the generated file using sed to delete empty lines end prerpocessor noise; 5. We leave only relevant `\(end)cond` region; 6. Then preprocess again to remove comments and trim header; 7. The generated file to be embedded to C as literal via `txt2c`; 8. And at the initialization of Tarantool we could use this literal for luaL_cdef to make known whole AST hierarchy via single hop; Could not resist and not note that I hate cmake, it took more than single day to figure out the proper way to construct $CC command line extended with all necessary include paths. I had to use `COMMAND_EXPAND_LISTS` `add_custom_command`' option, which is there only since cmake 3.8, and if we had to use anything older then we ought to rework this approach using external script. But I'd prefer we not do it. Inspired by #4202
Prior commit has established facilities for semi-automatic build of header file whih might be consumed by ffi.cdef. Here we use these facilities for SQL AST generation. All relevant headers have been marked appropriately with: - `/** \cond ffi */` for range begin; - `/** \endcond ffi */` for range end. This all sounds very similar to module_api, but integrated in a different way - we create `sql_ast_cdef[]` literal, not publicly accessible headers. Please not be surprised with strange renames of typedefs used with `struct StructName` here and there. It simplifies integration with LuaJIT ffi compiler. NB! Eventually we have to fix error propagation from LuaJIT ffi compiler in luaL_cdef. At the moment it simply produces 'Unknown exception' and stops. I had to use luajit for checking syntax errors there, which is gracefully shown there ``` 20:08 $ luajit loadcdef.lua luajit: loadcdef.lua:12: declaration specifier expected near 'ExprList' \ at line 83 stack traceback: [C]: in function 'cdef' loadcdef.lua:12: in main chunk [C]: at 0x5584f179c1d0 ``` Still is inspired by #4202
Lua ffi is quite fast and simple mechanism allowing to call C functions directly, without using Lua C API. It is usually faster, sometimes much faster. But ffi requires to define all the C methods again in a Lua file in
ffi.cdef
function. It is as bad as any other code duplication - when the original API is changed, a one could forget to update Lua files andffi.cdef
arguments. It is especially painful, when ffi C definition is really big.Interestingly, we have exactly the same issue with
module.h
file - it is in fact a set of copy-pasted code blocks from some header files. But it is copy-pasted automatically. CMake calls a special routine walking over specified header files and extracting code inside/** \cond public */
and/** \endcond public */
tags.Goal of this ticket is to introduce a similar mechanism to generate ffi C definitions. API RFC:
Option 1
File box_example.h
File box_example.lua
Option 2
This option allows to collect C definitions by name, from multiple files.
File box_example.h
File box_another_example.h
File box_example.lua
The text was updated successfully, but these errors were encountered: