Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RGW - Allow starting RGW/dbstore without connecting to Mons #45987

Merged
merged 2 commits into from Apr 29, 2022

Conversation

dang
Copy link
Contributor

@dang dang commented Apr 21, 2022

DBStore, and some other Stores like Motr, don't need to connect to the
Mons to work. However, startup automatically connects to the mons.
There's provision to not connect, but the split isn't quite right. We
need to call global_pre_init() to get config from the file, to determine
which store to start, but we then need to decide before calling
global_init() whether the configured store needs to connect to the mons.

This requires a slight change to global_init() to set no_mon_config from
the new flags.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

const auto& config_store = g_conf().get_val<std::string>("rgw_backend_store");

cerr << "config_store: " << config_store << std::endl;
if ((config_store == "dbstore") ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine for immediate purposes, but how would we handle this with dynamic stores? would a command line option be appropriate here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(more than fine, I was expecting a lot more to be involved :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting a lot more, too. I'm not sure what to do about dynamic stores. I have a few ideas, though.

  1. Require all stores to be defined at startup (not, not all filters, but all stores). They need not be loaded, but defined.
  2. Find some way to replace g_conf() with per-store config. Then, each store can initialize it's config in the correct way. This may need a lot of work in the config code, I don't know, and haven't looked yet.
  3. Strict upgrade of g_conf(). That is, if a store is loaded that needs RADOS, tear down g_conf() and re-create it with the MON connection.
  4. A config option to say whether this RGW should connect to a ceph cluster. In general, either a ceph cluster is available, or it's not. If it is, we can connect. If it's not, we can't. I'm fine not supporting adding a ceph cluster near an existing RGW without restarting that RGW.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If #1 means we cannot define a new store at load-time, it seems like a non-starter. #2 would be clean, if we could get away with it. It seems like we know it probably mostly works. Adam and others have certainly suggested to get rid of g_conf() usage in rgw. Using the right handle in RGW contexts seems manageable? #3 seems hacky. #4 seems hacky, but easily fixable later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to be able to load a completely unknown store at run-time? Is there some reason we can't know at startup all available, supported stores? Also, why do we need to support adding a ceph cluster to an already running RGW? Is someone really going to start a stand-alone rgw, decide it's not good enough, install a full ceph cluster around it, then add a rados store to the running RGW, all without restarting it?

I'm trying to understand the usecases for stores. My understanding of these usecases was much more limited than what the above comment implies.

(Note, none of this discussion applies to this PR, of course...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if by startup you mean startup, yes, that's fine

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what I meant that I don't want, is to have to have a static check in the core rgw code that recognizes loadable drivers, I assume that's the plan

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'll add this to the refactoring meeting agenda, rather than discussing it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I just meant when the RGW is started, not at compile time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4. A config option to say whether this RGW should connect to a ceph cluster. In general, either a ceph cluster is available, or it's not. If it is, we can connect. If it's not, we can't. I'm fine not supporting adding a ceph cluster near an existing RGW without restarting that RGW.

this is my favorite option. if there is a monitor available, all stores should have the option to take advantage of the mon's config database and its ability to push config changes at runtime

whatever is deploying rgw will have to know what type of store to use, and whether monitors are available. if there is no monitor, it'll have to generate a ceph.conf to pass in all of this config

@soumyakoduri
Copy link
Contributor

DBStore, and some other Stores like Motr, don't need to connect to the Mons to work. However, startup automatically connects to the mons. There's provision to not connect, but the split isn't quite right. We need to call global_pre_init() to get config from the file, to determine which store to start, but we then need to decide before calling global_init() whether the configured store needs to connect to the mons.

Does it mean ceph-mon service need not be started for other stores?

I am trying to verify the behaviour for dbstore .
Even with these changes applied,

#MON=0 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -o rgw_backend_store=dbstore -n -d

^^ throws error -
2022-04-25T18:38:35.596+0530 7f19164ae700 -1 monclient: get_monmap_and_config cannot identify monitors to contact
[errno 2] RADOS object not found (error connecting to the cluster)

@dang
Copy link
Contributor Author

dang commented Apr 25, 2022

Does it mean ceph-mon service need not be started for other stores?

I am trying to verify the behaviour for dbstore . Even with these changes applied,

#MON=0 OSD=1 MDS=0 MGR=0 RGW=1 ../src/vstart.sh -o rgw_backend_store=dbstore -n -d

^^ throws error - 2022-04-25T18:38:35.596+0530 7f19164ae700 -1 monclient: get_monmap_and_config cannot identify monitors to contact [errno 2] RADOS object not found (error connecting to the cluster)

That failure is from radosgw-admin trying to create the user. This PR doesn't touch radosgw-admin (or librgw), so those will still fail. If you have a ceph.conf already created by vstart, you can skip user creation by replacing -n with -k

I ran RGW directly like this:

radosgw -c /home/dang/src/ceph/master/zipper/install/data/ceph.conf --log-file=/home/dang/src/ceph/master/zipper/install/data/out/radosgw.8000.log --admin-socket=/home/dang/src/ceph/master/zipper/install/data/out/radosgw.8000.asok --pid-file=/home/dang/src/ceph/master/zipper/install/data/out/radosgw.8000.pid --rgw_luarocks_location=/home/dang/src/ceph/master/zipper/install/data/out/luarocks --debug-rgw=20 --debug-ms=1 -n client.rgw.8000 '--rgw_frontends=beast port=8000'

Note that this is just the same as the process run by vstart, so you can start a normal vstart, us ps to find the process invocation, stop vstart, then run radosgw directly.

@dang dang force-pushed the wip-dang-zipper-standalone branch from ab02ec7 to 1f2632f Compare April 25, 2022 16:53
@soumyakoduri
Copy link
Contributor

That failure is from radosgw-admin trying to create the user. This PR doesn't touch radosgw-admin (or librgw), so those will still fail. If you have a ceph.conf already created by vstart, you can skip user creation by replacing -n with -k

I ran RGW directly like this:

radosgw -c /home/dang/src/ceph/master/zipper/install/data/ceph.conf --log-file=/home/dang/src/ceph/master/zipper/install/data/out/radosgw.8000.log --admin-socket=/home/dang/src/ceph/master/zipper/install/data/out/radosgw.8000.asok --pid-file=/home/dang/src/ceph/master/zipper/install/data/out/radosgw.8000.pid --rgw_luarocks_location=/home/dang/src/ceph/master/zipper/install/data/out/luarocks --debug-rgw=20 --debug-ms=1 -n client.rgw.8000 '--rgw_frontends=beast port=8000'

Note that this is just the same as the process run by vstart, so you can start a normal vstart, us ps to find the process invocation, stop vstart, then run radosgw directly.

Got it. Thanks!

@dang dang force-pushed the wip-dang-zipper-standalone branch from 1f2632f to defeaba Compare April 25, 2022 17:26
@dang
Copy link
Contributor Author

dang commented Apr 26, 2022

jenkins test make check

@@ -1847,7 +1847,7 @@ namespace rgw::sal {

int DBStore::get_config_key_val(string name, bufferlist *bl)
{
return 0;
return -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe -ENOTSUP for stuff like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please....
Use error codes from errno.h as much as possible.
At the moment I'm trying to figure out why the test for this is giving me a hard time.
Some of the suff is because things like ECANCEL in the tests are hardcoded as a value.

I'm old, but my coding 101 told me to avoid litterals as much as possible.
And in the case of comparing errors there are better alternatives

DBStore, and some other Stores like Motr, don't need to connect to the
Mons to work.  However, startup automatically connects to the mons.
There's provision to not connect, but the split isn't quite right.  We
need to call global_pre_init() to get config from the file, to determine
which store to start, but we then need to decide before calling
global_init() whether the configured store needs to connect to the mons.

This requires a slight change to global_init() to set no_mon_config from
the new flags.

Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
For debugging purposes, allow radosgw-admin to run with stores other
than RadosStore.  Many operations will still fail (by crashing), so care
must be taken when running this way.

Signed-off-by: Daniel Gryniewicz <dang@redhat.com>
@dang dang force-pushed the wip-dang-zipper-standalone branch from defeaba to 4d1a039 Compare April 27, 2022 18:17
@dang dang mentioned this pull request Apr 29, 2022
14 tasks
@dang dang merged commit b286e3f into ceph:master Apr 29, 2022
@tchaikov
Copy link
Contributor

@dang hi Daniel, in future, could you please add "Reviewed-by" lines in the merge commit when merging pull requests?

@mattbenjamin
Copy link
Contributor

@dang hi Daniel, in future, could you please add "Reviewed-by" lines in the merge commit when merging pull requests?

Isn't that something we can rely on github to do, since we have approving github reviewers?

soumyakoduri added a commit to soumyakoduri/ceph that referenced this pull request May 3, 2022
With the changes in ceph#45987 ,
'radosgw-admin' command can be used to execute few admin operations on other stores.

This fix include changes to support  user creation/remove via `radosgw-admin`
command in dbstore.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
soumyakoduri added a commit to soumyakoduri/ceph that referenced this pull request May 4, 2022
With the changes in ceph#45987 ,
'radosgw-admin' command can be used to execute few admin operations on other stores.

This fix include changes to support  user creation/remove via `radosgw-admin`
command in dbstore.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
soumyakoduri added a commit to soumyakoduri/ceph that referenced this pull request May 5, 2022
With the changes in ceph#45987 ,
'radosgw-admin' command can be used to execute few admin operations on other stores.

This fix include changes to support  user creation/remove via `radosgw-admin`
command in dbstore.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
soumyakoduri added a commit to soumyakoduri/ceph that referenced this pull request May 6, 2022
With the changes in ceph#45987 ,
'radosgw-admin' command can be used to execute few admin operations on other stores.

This fix include changes to support  user creation/remove via `radosgw-admin`
command in dbstore.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
soumyakoduri added a commit to soumyakoduri/ceph that referenced this pull request May 9, 2022
With the changes in ceph#45987 ,
'radosgw-admin' command can be used to execute few admin operations on other stores.

This fix include changes to support  user creation/remove via `radosgw-admin`
command in dbstore.

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
soumyakoduri added a commit to ceph/ceph-ci that referenced this pull request May 10, 2022
With the changes in ceph/ceph#45987 ,
'radosgw-admin' command can be used to execute few admin operations on other stores.

This fix include changes to support  user creation/remove via `radosgw-admin`
command in dbstore.

Also fixed an issue with updating objv_tracker in op_state.user

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
dparmar18 pushed a commit to dparmar18/ceph that referenced this pull request May 19, 2022
With the changes in ceph#45987 ,
'radosgw-admin' command can be used to execute few admin operations on other stores.

This fix include changes to support  user creation/remove via `radosgw-admin`
command in dbstore.

Also fixed an issue with updating objv_tracker in op_state.user

Signed-off-by: Soumya Koduri <skoduri@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants