New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rgw/lua: allow access to object data #46550
Conversation
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
comments from RGW refactoring call:
|
79d3978
to
12ae262
Compare
doc/radosgw/lua-scripting.rst
Outdated
Tracing functions can be used only in `postRequest` context. | ||
|
||
- ``Request.Trace.SetAttribute()`` - sets the attribute for the request's trace. | ||
Takes two arguments. The first is the `key`, which should be a string. The second is the value, which can either be a string or a number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First bit of line 322 isn’t a sentence. Suggest
“the request’s trace, and takes two arguments.”
Also if `key` is in backticks, shouldn’t `value` be as well?
Maybe clarify if “number” means an integer, a real, or either?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hate to quibble, but I think you added "of" instead of "or"
(will squash before merge) Signed-off-by: yuval Lifshitz <ylifshit@redhat.com>
jenkins test make check |
17a7462
to
6d3448c
Compare
src/rgw/rgw_op.cc
Outdated
} | ||
if (run_lua) { | ||
filter = &*run_lua; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we want to add this lua filter last so its data isn't transformed by compression or encryption? have you tested these combinations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
running lua on compressed and encrypted data won't be useful.
this is why i added it first in the "put" case.
if someone wants to encrypt or compress the after lua read it, there should not be any limitation.
anyway, will test to see that it works, and update the documentation accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
running lua on compressed and encrypted data won't be useful.
this is why i added it first in the "put" case.
right, i just think you have it backwards. this code is adding wrapper filters around the object processor, so the filter added last is the one on the outside, so it sees the data first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about the "get" case:
https://github.com/ceph/ceph/pull/46550/files#diff-319900fb50bdb02677039fb81949243b9c0e7d98690f6fddf35e6ca5d5d52b2cR2308 ?
here i added the lua filter last, should i switch it to be first (so it runs last)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the "get" case looks correct as-is. there we're wrapping filters around send_response_data()
. you're adding the lua filter before decompress/decrypt filters, so your lua filter should see the data in plain-text. it's still worth testing to confirm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the "get" case was reading the compressed data.
i moved it to be first in 3ca1a48
so it is executed last
doc/radosgw/lua-scripting.rst
Outdated
Global ``RGW`` Table | ||
Data Context | ||
-------------------- | ||
Both ``getdata`` and ``putdata`` contexts has a single field named ``Data`` which is read only, optional and iterable (byte by byte). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's probably worth noting that these scripts only see one chunk of data per execution, right?
doc/radosgw/lua-scripting.rst
Outdated
|
||
- ``Request.Trace.SetAttribute(<key>, <value>)`` - sets the attribute for the request's trace. | ||
The function takes two arguments: the first is the ``key``, which should be a string, and the second is the ``value``, which can either be a string or a number (integer or double). | ||
Using the attribute, you can locate specific traces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may then locate specific traces by using this attribute.
doc/radosgw/lua-scripting.rst
Outdated
|
||
- ``Request.Trace.AddEvent(<name>, <attributes>)`` - adds an event to the first span of the request's trace | ||
An event is defined by event name, event time, and zero or more event attributes. | ||
Therefore, the function accepts one or two arguments. A string containing the event ``name`` should be the first argument, followed by the event ``attributes``, which is optional for events without attributes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function accepts one or two arguments: A string
doc/radosgw/lua-scripting.rst
Outdated
Global ``RGW`` Table | ||
Data Context | ||
-------------------- | ||
Both ``getdata`` and ``putdata`` contexts has a single field named ``Data`` which is read only, optional and iterable (byte by byte). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/has/have/
s/read only/read-only/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs lgtm, a couple of small requests but not enough to block approval.
doc/radosgw/lua-scripting.rst
Outdated
A request (pre or post) or data (get or put) context script may be constrained to operations belonging to a specific tenant's users. | ||
The request context script can also access fields in the request and modify certain fields, as well as the `Global RGW Table`_. | ||
The data context script can access the content of the object as well as the request fields and the `Global RGW Table`_. | ||
All Lua language features can be used in all contexts. | ||
|
||
By default, all lua standard libraries are available in the script, however, in order to allow for other lua modules to be used in the script, we support adding packages to an allowlist: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lua capitalized?
doc/radosgw/lua-scripting.rst
Outdated
By default, this directory would be `/tmp/luarocks/<entity name>`. Its prefix part (`/tmp/luarocks/`) could be set to a different location via the `rgw_luarocks_location` configuration parameter. | ||
Note that this parameter should not be set to one of the default locations where luarocks install packages (e.g. `$HOME/.luarocks`, `/usr/lib64/lua`, `/usr/share/lua`) | ||
By default, this directory would be ``/tmp/luarocks/<entity name>``. Its prefix part (``/tmp/luarocks/``) could be set to a different location via the ``rgw_luarocks_location`` configuration parameter. | ||
Note that this parameter should not be set to one of the default locations where luarocks install packages (e.g. ``$HOME/.luarocks``, ``/usr/lib64/lua``, ``/usr/share/lua``) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Period at the end?
doc/radosgw/lua-scripting.rst
Outdated
@@ -511,3 +526,39 @@ in `postRequest` context, we can add attributes and events to the request's trac | |||
|
|||
Request.Trace.AddEvent("second event", event_attrs) | |||
|
|||
- Calculate the entropy and size of uploaded objects and print to debug log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious, how is this entropy value useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be used to detect encryption of objects. added this to the doc
3ca1a48
to
f5f4ef5
Compare
doc/radosgw/lua-scripting.rst
Outdated
The data context script can access the content of the object as well as the request fields and the `Global RGW Table`_. | ||
All Lua language features can be used in all contexts. | ||
|
||
By default, all Lua standard libraries are available in the script, however, in order to allow for other lua modules to be used in the script, we support adding packages to an allowlist: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a second s/lua/Lua/ on this line :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed (together with some more lower case "lau") - hope this is the last round :-)
f5f4ef5
to
95c22b8
Compare
jenkins test api |
Signed-off-by: Yuval Lifshitz <ylifshit@redhat.com>
95c22b8
to
f4501f6
Compare
jenkins test make check |
http://pulpito.front.sepia.ceph.com/yuvalif-2022-08-07_13:01:20-rgw:verify-wip-yuval-lua-filter-distro-default-smithi/
|
@yuvalif please run the full rgw suite with --subset ~1/3, there are important regression tests in the other subsuites! |
i ran with: |
@@ -2084,6 +2086,20 @@ int RGWGetObj::get_data_cb(bufferlist& bl, off_t bl_ofs, off_t bl_len) | |||
return send_response_data(bl, bl_ofs, bl_len); | |||
} | |||
|
|||
int RGWGetObj::get_lua_filter(std::unique_ptr<RGWGetObj_Filter>* filter, RGWGetObj_Filter* cb) { | |||
std::string script; | |||
const auto rc = rgw::lua::read_script(s, store, s->bucket_tenant, s->yield, rgw::lua::context::getData, script); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this broke the build on main. it looks like a conflict with a1e21d0 which changed the signature of read_script()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not clear how we're supposed to get that LuaManager all the way into RGWGetObj/PutObj. in the meantime, i prepared a revert at #47612
Signed-off-by: Yuval Lifshitz ylifshit@redhat.com
testing
MON=1 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -n -d
bin/radosgw-admin script put --infile=./data.lua --context=putData
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows