Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common,ceph: add output file switch to dump json to #57215

Merged
merged 6 commits into from
May 23, 2024
Merged

Conversation

batrick
Copy link
Member

@batrick batrick commented May 1, 2024

The ceph tell mds.X cache dump and ceph tell mds.X ops commands have a
useful --path argument that directs the daemon to save the dump of the output
to a file local the daemon. This has several advantages:

  • We don't need to construct the JSON output in memory, which may be many gigabytes.
  • Streaming writes to a local file is significantly faster than sending the data over the ceph messenger.
  • The command spends as little time as possible holding relevant locks.

However, only some commands support this and we could make it generic to the
admin socket interface. So, add a generic --output-file argument to achieve
this.

The main concern with this is security (telling the daemon to write to
arbitrary files) but this is gated by the session caps which require
"allow_all" and that's typically only valid for the client.admin.

Fixes: https://tracker.ceph.com/issues/65747

Checklist

  • Tracker (select at least one)
    • References tracker ticket
  • Component impact
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@batrick batrick added the common label May 1, 2024
@github-actions github-actions bot added the core label May 1, 2024
@batrick batrick marked this pull request as ready for review May 8, 2024 15:27
@batrick batrick requested review from a team as code owners May 8, 2024 15:27
@batrick batrick requested a review from a team May 8, 2024 15:27
@batrick batrick added cephfs Ceph File System needs-review labels May 8, 2024
@batrick
Copy link
Member Author

batrick commented May 8, 2024

jenkins test make check arm64

@batrick
Copy link
Member Author

batrick commented May 8, 2024

This PR is under test in https://tracker.ceph.com/issues/65867.

Copy link
Contributor

@rzarzynski rzarzynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea very much!

src/common/Formatter.h Outdated Show resolved Hide resolved
if (format == "json-file") {
if (output.size()) {
auto* jff = new JSONFormatterFile(output, false);
auto&& of = jff->get_ofstream();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not sure the r-value reference gives here anything. Below it's used just as l-value one.

Copy link
Contributor

@leonid-s-usov leonid-s-usov May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto&& var is a deducing reference type, which can effectively be l-value or r-value depending on the initializer. See https://en.cppreference.com/w/cpp/language/auto#Explanation:

For example, given const auto& i = expr;, the type of i is exactly the type of the argument u in an imaginary template template<class U> void f(const U& u) if the function call f(expr) was compiled. Therefore, auto&& may be deduced either as an lvalue reference or rvalue reference according to the initializer, which is used in range-based for loop.

Copy link
Contributor

@rzarzynski rzarzynski May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a definition. Still, what's the benefit behind the extra visual clutter? Later it's used as regular l-value reference. Do we get anything from increasing mental load on humans who read this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd promote the usage of auto && so that people would be less intimidated by it.
However, I agree that in this context this isn't useful. If anything, I'd suggest an auto const& here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, my thoughts on using auto&& is that it makes the caller more resilient to possible minor type changes like a T* to std::shared_ptr<T>. If we had:

auto* T = foo();

then that definition would need updated. Similarly:

auto T = foo();

is invalid if T changes from a pointer to std::unique_ptr<T>&.

On the other hand,

auto&& T = foo();

is the most generic and so long as T still behaves like a pointer, we don't care how it's wrapped/proxied. (Similar logic can be constructed for why auto&& is preferable even if presently the return value of foo() is a T& as is the case here.)

Copy link
Contributor

@leonid-s-usov leonid-s-usov May 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPD: sorry, the below doesn't apply to the pointer-like semantics. See my next comment.

The downside of using auto&& in this context is that it takes the cv-qualification of the initializer expression, while the constness of this local variable should be controlled by the call site rather than by the API. In this block, we want a const reference to whatever is returned, regardless of whether the initializer provides a const object.

This is in contrast with the use case of

for(auto&& i = std::cbegin(vec); ...)

, where we still control the constness at the call site, but with the initializer expression rather than the type definition.

Copy link
Contributor

@leonid-s-usov leonid-s-usov May 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@batrick you encouraged me for the following test. The TL;DR; is that auto && is a shorter form of auto const& as far as the call-site is concerned, so its usage is justified 👍🏻.
The potential unsafe usage of a mutable reference to a managed ptr is more of an API issue.

#include <vector>
#include <memory>
#include <cassert>

struct C {
    std::vector<int> items;
};

std::shared_ptr<C> global_c(new C());
std::unique_ptr<C> global_unique_c(new C());

C*                          get_p()     { return global_c.get(); }
C const*                    get_cp()    { return global_c.get(); }
std::shared_ptr<C>          get_s()     { return global_c; }
std::shared_ptr<C> &        get_rs()    { return global_c; }
std::shared_ptr<C> const&   get_crs()   { return global_c; }
std::unique_ptr<C>          get_u()     { return std::make_unique<C>(); }
std::unique_ptr<C> &        get_ru()    { return global_unique_c; }
std::unique_ptr<C> const&   get_cru()   { return global_unique_c; }

int main(int argc, char** argv)
{
    { auto c = get_p(); c->items.size(); }
    { auto c = get_cp(); c->items.size(); }
    { auto c = get_s(); c->items.size(); }
    { auto c = get_rs(); c->items.size(); }
    { auto c = get_crs(); c->items.size(); }
    { auto c = get_u(); c->items.size(); }
    // { auto c = get_ru(); c->items.size(); }     // error: call to implicitly-deleted copy constructor of 'std::unique_ptr<C>'
    // { auto c = get_cru(); c->items.size(); }    // error: call to implicitly-deleted copy constructor of 'std::unique_ptr<C>'

    // { auto& c = get_p(); c->items.size(); }     // error: non-const lvalue reference cannot bind to a temporary
    // { auto& c = get_cp(); c->items.size(); }    // error: non-const lvalue reference cannot bind to a temporary
    // { auto& c = get_s(); c->items.size(); }     // error: non-const lvalue reference cannot bind to a temporary
    { auto& c = get_rs(); c->items.size(); }
    { auto& c = get_crs(); c->items.size(); }
    // { auto& c = get_u(); c->items.size(); }     // error: non-const lvalue reference cannot bind to a temporary
    { auto& c = get_ru(); c->items.size(); }
    { auto& c = get_cru(); c->items.size(); }


    { auto&& c = get_p(); c->items.size(); }
    { auto&& c = get_cp(); c->items.size(); }
    { auto&& c = get_s(); c->items.size(); }
    { auto&& c = get_rs(); c->items.size(); }
    { auto&& c = get_crs(); c->items.size(); }
    { auto&& c = get_u(); c->items.size(); }
    { auto&& c = get_ru(); c->items.size(); }
    { auto&& c = get_cru(); c->items.size(); }

    { auto const& c = get_p(); c->items.size(); }
    { auto const& c = get_cp(); c->items.size(); }
    { auto const& c = get_s(); c->items.size(); }
    { auto const& c = get_rs(); c->items.size(); }
    { auto const& c = get_crs(); c->items.size(); }
    { auto const& c = get_u(); c->items.size(); }
    { auto const& c = get_ru(); c->items.size(); }
    { auto const& c = get_cru(); c->items.size(); }

    {
        auto&& c = get_ru();
        // this is valid code, but it's unlikely we ever want a mutable reference to a managed ptr
        // however, IMO it's more of a problem with the API
        [&] { c.reset(); }(); // unhandled race if this is executed by a separate thread
        assert(!get_ru());
    }

    {
        auto const& c = get_ru();
        // [&] { c.reset(); }(); // error: 'this' argument to member function 'reset' has type 'const std::unique_ptr<C>', but function is not marked const
    }
}

Copy link
Member Author

@batrick batrick May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about auto const&&? 😈

PendingReleaseNotes Outdated Show resolved Hide resolved
Copy link
Contributor

@leonid-s-usov leonid-s-usov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@batrick , this is great!

I'd like to suggest an enhancement to this PR. It would mainly address a case of containerized setups, where the same host is running multiple daemons. It also applies to non-containerized setups like vstart clusters.

The issue in those setups is that multiple daemons may write to the same run/log directory. If I want to create dumps of ops from all mdses, for example, I'd like to run

ceph --format=json-file tell mds.\* ops

and know that the system would create unique files with the daemon's name in the filename suffix.

You may note that I haven't specified --output-file here, and that's another point. I would suggest that if omitted, the output file name would be generated automatically - subject to suffixing with the target daemon name.

If provided, I'd either consider it an --output-file-prefix that would be augmented with the daemon name suffix unconditionally, or have it templated with some placeholder like %daemon (this syntax is exemplary) to let the system inject the daemon's name or any other supported template parameters into the final output file name.

I'm not sure yet if this behavior is only useful for the tell or daemon cases, and if so, whether it makes any sense for other ceph commands. It could be we are looking here at different options to avoid overloading.

@batrick
Copy link
Member Author

batrick commented May 9, 2024

@batrick , this is great!

I'd like to suggest an enhancement to this PR. It would mainly address a case of containerized setups, where the same host is running multiple daemons. It also applies to non-containerized setups like vstart clusters.

The issue in those setups is that multiple daemons may write to the same run/log directory. If I want to create dumps of ops from all mdses, for example, I'd like to run

ceph --format=json-file tell mds.\* ops

and know that the system would create unique files with the daemon's name in the filename suffix.

You may note that I haven't specified --output-file here, and that's another point. I would suggest that if omitted, the output file name would be generated automatically - subject to suffixing with the target daemon name.

If provided, I'd either consider it an --output-file-prefix that would be augmented with the daemon name suffix unconditionally, or have it templated with some placeholder like %daemon (this syntax is exemplary) to let the system inject the daemon's name or any other supported template parameters into the final output file name.

I'm not sure yet if this behavior is only useful for the tell or daemon cases, and if so, whether it makes any sense for other ceph commands. It could be we are looking here at different options to avoid overloading.

Hmm, this reminds me that the --path argument for ops also dumps the file name to the output json. I think we could do that and add a prefix.

Copy link
Contributor

@leonid-s-usov leonid-s-usov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second thoughts on the new --format. Do we force json-file to be pretty? Or should we add a json-pretty-file?

Strictly speaking, this feature doesn't add a new format, but rather a remote output redirection option. I don't have a good name suggestion yet, but I'm pretty sure this shouldn't be a new format

@batrick
Copy link
Member Author

batrick commented May 10, 2024

Second thoughts on the new --format. Do we force json-file to be pretty? Or should we add a json-pretty-file?

Strictly speaking, this feature doesn't add a new format, but rather a remote output redirection option. I don't have a good name suggestion yet, but I'm pretty sure this shouldn't be a new format

I have changed it so all that's needed is a --daemon-output-file switch. PTAL.

@batrick
Copy link
Member Author

batrick commented May 10, 2024

Note to reviewers: the switch is now --daemon-output-file. There was a conflict with the existing undocumented --out-file. Ask me how I figured that out. See also: #57393.

The formatter would be deleted and automatically flush anyway but this is
conforming to the API.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The `ceph tell mds.X cache dump` and `ceph tell mds.X ops` commands have a
useful `--path` argument that directs the daemon to save the dump of the output
to a file local the daemon. This has several advantages:

* We don't need to construct the JSON output in memory, which may be many gigabytes.
* Streaming writes to a local file is significantly faster than sending the data over the ceph messenger.
* The command spends as little time as possible holding relevant locks.

However, only some commands support this and we could make it generic to the
admin socket interface. So, add a generic --daemon-output-file argument to
achieve this.

The main concern with this is security (telling the daemon to write to
arbitrary files) but this is gated by the session caps which require
"allow_all" and that's typically only valid for the client.admin.

Fixes: https://tracker.ceph.com/issues/65747
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Its path/name include random characters making it unsuitable for static diffs.

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Copy link
Contributor

@leonid-s-usov leonid-s-usov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done! I love how you leveraged the existing config template functionality to generate the automatic filename.

The only source of discomfort for me is the :tmp: magic constant. You did a good job documenting it throughout, so it could stay this way, but maybe we'll give it another thought before we move forward with this.

I had a similar hiccup in the quiesce CLI: I wanted to use --await as a flag but optionally accept a wait timeout as a value. Ceph CLI parsing can't deal with this, so I had to make two separate arguments, --await as a switch and --await-for=<timeout> with a value.
This one feels similar, we want to enable the daemon local output functionality but avoid providing an explicit name for the file.

I prefer a separate argument switch to denote an autogenerated daemon filename. This would be cleaner code-wise and easier to understand for someone reading ceph -h output.

For this example, I had to add the local bit to your switch name so that it would sound good without the -file suffix:

--daemon-local-output[=true]             the daemon will write to a temp file, see config...
--daemon-local-output-file <filename>    the daemon will write to the provided local file name

@leonid-s-usov
Copy link
Contributor

leonid-s-usov commented May 11, 2024

Side note: the missing ceph cmd functionality is a generic default value argument option in the cmddesc model. Having that, we could pull it off with a single argument switch.

@batrick
Copy link
Member Author

batrick commented May 14, 2024

Well done! I love how you leveraged the existing config template functionality to generate the automatic filename.

The only source of discomfort for me is the :tmp: magic constant. You did a good job documenting it throughout, so it could stay this way, but maybe we'll give it another thought before we move forward with this.

I had a similar hiccup in the quiesce CLI: I wanted to use --await as a flag but optionally accept a wait timeout as a value. Ceph CLI parsing can't deal with this, so I had to make two separate arguments, --await as a switch and --await-for=<timeout> with a value. This one feels similar, we want to enable the daemon local output functionality but avoid providing an explicit name for the file.

I prefer a separate argument switch to denote an autogenerated daemon filename. This would be cleaner code-wise and easier to understand for someone reading ceph -h output.

For this example, I had to add the local bit to your switch name so that it would sound good without the -file suffix:

--daemon-local-output[=true]             the daemon will write to a temp file, see config...
--daemon-local-output-file <filename>    the daemon will write to the provided local file name

I've wavered one way or the other on this. My current preference is to just stick with --daemon-output-file=:tmp:. I feel it's simpler to keep a single switch and the potential for conflict is too small to complicate the API.

Copy link
Contributor

@leonid-s-usov leonid-s-usov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've wavered one way or the other on this. My current preference is to just stick with --daemon-output-file=:tmp:. I feel it's simpler to keep a single switch and the potential for conflict is too small to complicate the API.

I hear you! The downside of using a magic value is that it's arbitrary and is not validated by the argparse framework. On the other hand, two-switch approach pollutes the argument namespace in a binding way due to the backward compatibility requirements.

As I said, the best solution would be switches with default values. And the good news is that your approach is better future-compatible with this.

@batrick
Copy link
Member Author

batrick commented May 17, 2024

This PR is under test in https://tracker.ceph.com/issues/66086.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants