Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework Lua module searchers and loaders #9511

Merged
merged 9 commits into from
Jan 24, 2024

Conversation

igormunkin
Copy link
Collaborator

@igormunkin igormunkin commented Dec 19, 2023

This patchset implements the refactoring of Tarantool package.loaders required for Tarantool integrity protection (needed for tarantool/tarantool-ee#585).

Per-patch review is highly recommended (though GitHub works badly in this mode).

Since Tarantool follows Lua 5.1 semantics, the set of routines in package.loaders is used to tweak Lua modules loading. E.g. overriding
machinery has been implemented on top of the package.loaders. Unfortunately, package.loaders are not such flexible as it's needed and this mechanism is redesigned in Lua 5.2.

Starting from Lua 5.2 package.loaders has been superseded by package.searchers. This is quite natural change, since nobody wants to tweak the "loading" rules, but rather the "searching" process. As a result of this redesign evolution up to the Lua 5.4, users are able to obtain the "source" of the module being loaded (i.e. file name or specific sentinel for preload packages).


This first patch introduces Tarantool internal table with package.searchers and reimplements package.preload and tarantool.builtin loaders using the machinery described above:

  • The new searcher function, respecting Lua 5.4 semantics, is implemented for each loader.
  • Each searcher is wrapped with the function to implement "dummy" loader respecting Lua 5.1 semantics.

Originally, Lua uses only constant path patterns in package.path and package.cpath. Tarantool enhances user experience here and provides more relaxed rules for module searching: some paths are calculated at the moment the module is being searched and are relative to the current working directory. Unfortunately it makes "searchers" more complex, since all the paths can't be calculated when Tarantool starts up.

Furthermore, there is the particular priority while loading Lua modules in Tarantool. Originally, Lua tries to load the particular Lua script using the hints from package.path and only if no Lua module is found, Lua tries to load the corresponding Lua C library using the hints from package.cpath. Tarantool loaders are prioritized by the locations at first and only then by the module implementation. This also makes "searchers" structure more complex than it could be.

Anyway, "file" loaders are split into the three layers:

  1. "Pathogen": the function generated by <gen_path_builder>, building the hints for searchers.
  2. "Searcher": the function generated by <gen_file_searcher> helper, using the particular file loading method (<load_lua> or <load_lib>) and the given pathogen to obtain the hints with path patterns to be used in <package.searchpath> routine.
  3. "Loader": the function generated by <gen_file_loader> helper, following Lua 5.1 loaders semantics and using the particular searcher.

All the generated searchers are stored in the <package.searchers> analogue introduced in the first patch of this refactoring series.


Considering the way "app" paths injection is implemented in Tarantool loaders to save the loading priority, the chained loaders machinery introduced in the commit 7e9051c ("lua: don't set built-in modules to package.loaded") is also implemented for the new searching mechanism. The approach is the same as the one used for chaining loaders. As a result, "app" paths searchers are injected into the "cwd" paths searchers the same way it has been done in the original loaders context.

The third patch also introduces numerical indices into <package.searchers> analogue to provide the convenient searchers <-> loaders mapping used in the final patch of this series.


Within the fourth patch the overriding mechanism introduced in the commit ec47c38 ("lua: allow to override a built-in module") is moved to the "searching" phase of the new loading machinery. Furthermore, package.path/package.cpath searchers are also added to the indexed list made for the mapping.


Within the fifth patch the mechanism for enabling/disabling loaders introduced in the commit 87b4da3 ("lua: enable/disable override loader") is moved to the "searching" phase of the new loading machinery.


Though croot loader is not the popular way to work with Lua C libraries, some modules are organised using this mechanism, so they have to be verified via Tarantool integrity protection system, hence croot loader has to be reimplemented to the new searching machinery.

The corresponding croot searcher is implemented in this patch and added to the indexed part of the <package.searchers> table.

It's worth to mention two peculiarities related to the croot loader and searcher:

  • Since Tarantool uses original croot loader from LuaJIT, it ignores all paths except ones specified in <package.cpath>. However, there might be some croot-like modules in the Tarantool-specific cpaths (cwd or application root). It looks like croot machinery is not widely used, if this bug bothers nobody, but it's still need to be fixed.
  • Similar situation relates to the overriding mechanism, that was recently introduced: there is no "prefixed" croot loader and all libraries using croot loading machinery are ignored for the overridden paths as a result.

Though all Lua 5.1 loaders machinery will be superseded by Lua 5.4 searchers machinery, Tarantool still should follow Lua 5.1 semantics, so the "legacy" loader is required. This patch implements the common helper <gen_legacy_loader> to wrap all loaders and make the new machinery conform the Lua 5.1 semantics.

Moreover, some since all tweaks and helpers are implemented for searching machinery, <chain_loaders> is not required anymore.


Finally, to make it possible to instrument the searchers with Tarantool integrity protection, two steps are left.

  • Implement mapping from searchers table to <package.loaders> using <gen_legacy_loader> helper.
  • Obtain the searcher from the table directly, so the runtime instrumentation take the effect.

As a result <gen_legacy_loader> is reimplemented to use the table with searchers as an upvalue instead of the particular searcher routiine, and <package.loaders> table is filled by the function generated with the updated helper.



The first patch of the series fixes error handling between module "searching" and "loading" processes. Originally, if module is not found, the error message is returned to and is accumulated to the "not found" message. If no suitable module is found, the error is raised by <require>. In case, when the desired module is found, tries to load it. If any error occurs while loading the module, the "loader" yields the corresponding error. The latter part was broken since the commit 4e27b0e ("app: let user define search root for packages"), so if the error occurs while loading the module, Tarantool proceed with searching the module in other locations.

This bug was found via lua-Harness suite used in tarantool/luajit testing routine, so the test added in the patch partially duplicates the particular test case in lua-Harness.

@igormunkin igormunkin added the full-ci Enables all tests for a pull request label Dec 19, 2023
@coveralls
Copy link

coveralls commented Dec 19, 2023

Coverage Status

coverage: 86.927% (-0.006%) from 86.933%
when pulling ffd3654 on igormunkin:imun/instrument-loaders
into 4007552
on tarantool:master
.

@igormunkin igormunkin removed the full-ci Enables all tests for a pull request label Jan 21, 2024
@igormunkin igormunkin force-pushed the imun/instrument-loaders branch 5 times, most recently from b94fda1 to b6071a6 Compare January 22, 2024 09:51
@igormunkin igormunkin marked this pull request as ready for review January 22, 2024 09:58
@igormunkin igormunkin requested a review from a team as a code owner January 22, 2024 09:58
@igormunkin igormunkin added the full-ci Enables all tests for a pull request label Jan 22, 2024
Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Igor!
Thanks for the patchset!
I'll proceed with the review per-patch.

[PATCH 1/9] lua: fix loader behaviour in case of syntax error

Thanks for the patch!
Please consider my comments below.

processes. Originally, if module is not found, the error message is

Typo: s/module/a module/

returned to and is accumulated to the "not found" message.

Nit: "accumulated" isn't clear to me, do you mean "appended"?

Tarantool proceed with searching the module in other locations.

Typo: s/proceed/proceeds/

This bug was found via lua-Harness suite[1] used in tarantool/luajit

Typo: s/via lua-Harness/via the lua-Harness/
Typo: s/suite[1]/suite [1]/
Typo: s<in tarantool/luajit><in the tarantool/luajit>

NO_CHANGELOG=bugfix

Should we mention this in the changelog as bugfix then?

Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[PATCH 2/9] lua: implement searchers for preload and builtin

Thanks for the patch!
LGTM, with a few insignificant nits below.

Since Tarantool follows Lua 5.1[1] semantics, the set of routines in

Typo: s/Lua 5.1[1]/Lua 5.1 [1]/
Here and below with other links.

Unfortunately, package.loaders are not such flexible as it's needed and

Typo: s/not such flexible as/not as flexible as/
Typo: s/it's needed/they should be/

package.searchers. This is quite natural change, since nobody wants to

Typo: s/quite natural/quite a natural/

src/lua/loaders.lua Outdated Show resolved Hide resolved
src/lua/loaders.lua Outdated Show resolved Hide resolved
src/lua/loaders.lua Outdated Show resolved Hide resolved
Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[PATCH 3/9] lua: implement searchers for file modules

I see no flaws, so patch LGTM, except several nits to the commit message.

This patch proceeds the refactoring of Tarantool package.loaders

Typo: s/proceeds/proceeds with/

package.cpath. Tarantool enhances user experience here and provides more

Typo: s/user/the user/ or s/user/a user/

working directory. Unfortunately it makes "searchers" more complex,

Typo: s/Unfortunately/Unfortunately,/

Worth saying that "app" paths injection is still implemented via chained

Typo: s/Worth/It is worth/

Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[PATCH 4/9] lua: implement chained searchers

I see no flaws, so patch LGTM, except a single nit regarding to the commit message and a comment below.

This patch proceeds the refactoring of Tarantool package.loaders

Typo: s/proceeds/proceeds with/

src/lua/loaders.lua Show resolved Hide resolved
Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[PATCH 5/9] lua: implement prefixed searchers

LGTM, with two nits regarding the commit message.

This patch proceeds the refactoring of Tarantool package.loaders

Typo: s/proceeds/proceeds with/

Within this patch the overriding mechanism introduced in the commit

Typo: s/Within this patch/Within this patch,/

Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[PATCH 6/9] lua: implement conditional searchers

LGTM, with two nits regarding the commit message.

This patch proceeds the refactoring of Tarantool package.loaders

Typo: s/proceeds/proceeds with/

Within this patch the mechanism for enabling/disabling loaders

Typo: s/Within this patch/Within this patch,/

Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[PATCH 7/9] lua: implement croot searcher

LGTM, with a minor nits regarding the commit message and the comments below.

This patch proceeds the refactoring of Tarantool package.loaders

Typo: s/proceeds/proceeds with/

verified via Tarantool integrity protection system, hence croot loader

Typo: s/hence/hence,/

  • Since Tarantool uses original croot loader from LuaJIT, it ignores all

Typo: s/original croot/the original croot/

if this bug bothers nobody, but it's still need to be fixed.

Typo: s/it's still need/it still needs/

  • Similar situation relates to the overriding mechanism, that was

Typo: s/Similar/A similar/
Typo: s/mechanism,/mechanism/

Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[PATCH 8/9] lua: implement common legacy loader generator

LGTM, with a minor comments regarding the commit message.

This patch proceeds the refactoring of Tarantool package.loaders

Typo: s/proceeds/proceeds with/

Moreover, some since all tweaks and helpers are implemented for

Typo: s/some//

Copy link
Collaborator

@Buristan Buristan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[PATCH 9/9] lua: implement searchers-to-loaders mapping

LGTM, with a minor comments regarding the commit message.

instrumentation take the effect.

Typo: s/take/takes/

As a result <gen_legacy_loader> is reimplemented to use the table with

Typo: s/As a result/As a result,/

@igormunkin igormunkin force-pushed the imun/instrument-loaders branch 2 times, most recently from 286beaa to bf6e4fa Compare January 23, 2024 21:53
src/lua/loaders.lua Outdated Show resolved Hide resolved
Copy link
Member

@Totktonada Totktonada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK for the patchset.

Thank you for the decent work!

@Totktonada Totktonada removed their assignment Jan 24, 2024
@igormunkin igormunkin requested a review from a team as a code owner January 24, 2024 09:50
The patch fixes error handling between module "searching" and "loading"
processes. Originally, if a module is not found, the error message is
returned to <require> and is appended to the "not found" message. If no
suitable module is found, the error is raised by <require>. In case,
when the desired module is found, <require> tries to load it. If any
error occurs while loading the module, the "loader" yields the
corresponding error. The latter part was broken since the commit
5be3a82 ("lua: fix loaders behavior"),
so if the error occurs while loading the module, Tarantool proceeds with
searching the module in other locations.

This bug was found via the lua-Harness suite[1] used in the
tarantool/luajit testing routine, so the test added in the patch
partially duplicates the particular test case in lua-Harness.

[1]: https://fperrad.frama.io/lua-Harness/

Needed for tarantool/tarantool-ee#585

NO_DOC=bugfix
This patch initiates the refactoring of Tarantool package.loaders
required for Tarantool integrity protection.

Since Tarantool follows Lua 5.1[1] semantics, the set of routines in
package.loaders is used to tweak Lua modules loading. E.g. overriding
machinery has been implemented on top of the package.loaders.
Unfortunately, package.loaders are not as flexible as they should be and
this mechanism is redesigned in Lua 5.2[2].

Starting from Lua 5.2 package.loaders has been superseded by
package.searchers. This is quite a natural change, since nobody wants to
tweak the "loading" rules, but rather the "searching" process. As a
result of this redesign evolution up to the Lua 5.4[3], users are able
to obtain the "source" of the module being loaded (i.e. file name or
specific sentinel for preload packages).

This patch introduces Tarantool internal table with package.searchers
and reimplements package.preload and tarantool.builtin loaders using the
machinery described above:
* The new searcher function, respecting Lua 5.4 semantics, is
  implemented for each loader.
* Each searcher is wrapped with the function to implement "dummy" loader
  respecting Lua 5.1 semantics.

[1]: https://www.lua.org/manual/5.1/manual.html
[2]: https://www.lua.org/manual/5.2/manual.html
[3]: https://www.lua.org/manual/5.4/manual.html

Needed for tarantool/tarantool-ee#585

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
This patch continues the refactoring of Tarantool package.loaders
required for Tarantool integrity protection.

Originally, Lua uses only constant path patterns in package.path and
package.cpath. Tarantool enhances the user experience here and provides
more relaxed rules for module searching: some paths are calculated at
the moment the module is being searched and are relative to the current
working directory. Unfortunately, it makes "searchers" more complex,
since all the paths can't be calculated when Tarantool starts up.

Furthermore, there is the particular priority while loading Lua modules
in Tarantool. Originally, Lua tries to load the particular Lua script
using the hints from package.path and *only if* no Lua module is found,
Lua tries to load the corresponding Lua C library using the hints from
package.cpath. Tarantool loaders are prioritized by the locations at
first and only then by the module implementation. This also makes
"searchers" structure more complex than it could be.

Anyway, "file" loaders are split into the three layers:
1. "Pathogen": the function generated by <gen_path_builder>, building
   the hints for searchers
2. "Searcher": the function generated by <gen_file_searcher> helper,
   using the particular file loading method (<load_lua> or <load_lib>)
   and the given pathogen to obtain the hints with path patterns to be
   used in <package.searchpath> routine
3. "Loader": the function generated by <gen_file_loader> helper,
   following Lua 5.1 loaders semantics and using the particular searcher

All the generated searchers are stored in the <package.searchers>
analogue introduced in the first patch of this refactoring series.

It's worth saying that "app" paths injection is still implemented via
chained loaders, but this machinery will be reimplemented within the
following patches in this series.

Needed for tarantool/tarantool-ee#585

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
This patch continues the refactoring of Tarantool package.loaders
required for Tarantool integrity protection.

Considering the way "app" paths injection is implemented in Tarantool
loaders to save the loading priority, the chained loaders machinery
introduced in the commit 7e9051c ("lua:
don't set built-in modules to package.loaded") is also implemented for
the new searching mechanism. The approach is the same as the one used
for chaining loaders. As a result, "app" paths searchers are injected
into the "cwd" paths searchers the same way it has been done in the
original loaders context.

This patch also introduces numerical indices into <package.searchers>
analogue to provide the convenient searchers <-> loaders mapping used in
the final patch of this series.

Needed for tarantool/tarantool-ee#585

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
This patch continues the refactoring of Tarantool package.loaders
required for Tarantool integrity protection.

Within this patch, the overriding mechanism introduced in the commit
ec47c38 ("lua: allow to override a
built-in module") is moved to the "searching" phase of the new loading
machinery. Furthermore, package.path/package.cpath searchers are also
added to the indexed list made for the mapping.

Needed for tarantool/tarantool-ee#585

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
This patch continues the refactoring of Tarantool package.loaders
required for Tarantool integrity protection.

Within this patch, the mechanism for enabling/disabling loaders
introduced in the commit 87b4da3 ("lua:
enable/disable override loader") is moved to the "searching" phase of
the new loading machinery.

Needed for tarantool/tarantool-ee#585

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
This patch continues the refactoring of Tarantool package.loaders
required for Tarantool integrity protection.

Though croot loader is not the popular way to work with Lua C libraries,
some modules are organised using this mechanism, so they have to be
verified via Tarantool integrity protection system, hence, croot loader
has to be reimplemented to the new searching machinery.

The corresponding croot searcher is implemented in this patch and added
to the indexed part of the <package.searchers> table.

It's worth to mention two peculiarities related to the croot loader and
searcher:

* Since Tarantool uses the original croot loader from LuaJIT, it ignores
  all paths except ones specified in <package.cpath>. However, there
  might be some croot-like modules in the Tarantool-specific cpaths (cwd
  or application root). It looks like croot machinery is not widely
  used, if this bug bothers nobody, but it still needs to be fixed.
* A similar situation relates to the overriding mechanism that was
  recently introduced: there is no "prefixed" croot loader and all
  libraries using croot loading machinery are ignored for the overridden
  paths as a result.

Needed for tarantool/tarantool-ee#585

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
This patch continues the refactoring of Tarantool package.loaders
required for Tarantool integrity protection.

Though all Lua 5.1 loaders machinery will be superseded by Lua 5.4
searchers machinery, Tarantool still should follow Lua 5.1 semantics, so
the "legacy" loader is required. This patch implements the common helper
<gen_legacy_loader> to wrap all loaders and make the new machinery
conform the Lua 5.1 semantics.

Moreover, since all tweaks and helpers are implemented for searching
machinery, <chain_loaders> is not required anymore.

Needed for tarantool/tarantool-ee#585

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
This patch finalizes the refactoring of Tarantool package.loaders
required for Tarantool integrity protection.

Finally, to make it possible to instrument the searchers with Tarantool
integrity protection, two steps are left.
* Implement mapping from searchers table to <package.loaders> using
  <gen_legacy_loader> helper.
* Obtain the searcher from the table directly, so the runtime
  instrumentation takes the effect.

As a result, <gen_legacy_loader> is reimplemented to use the table with
searchers as an upvalue instead of the particular searcher routine, and
<package.loaders> table is filled by the function generated with the
updated helper. Furthermore, the aforementioned table with searchers is
added to 'internal.loaders' module exports to make the instrumentation
more convenient for users.

Needed for tarantool/tarantool-ee#585

NO_DOC=refactoring
NO_TEST=refactoring
NO_CHANGELOG=refactoring
@igormunkin igormunkin merged commit a11cb66 into tarantool:master Jan 24, 2024
60 checks passed
@igormunkin
Copy link
Collaborator Author

The patchset is backported to release/3.0 in scope of a24ebea..1ddbfc2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full-ci Enables all tests for a pull request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants