[feature] Added api/v1/bookie/cluster_info REST API #3710

Merged
merged 3 commits into from
Dec 22, 2022

@dlg99 dlg99 commented Dec 22, 2022

Descriptions of the changes in this PR:

Motivation

The information provided by the current REST API is not enough (and is cumbersome to combine) to answer a question such as "is any data in danger if I shut down one more bookie?".
E.g. getting the list of underreplicated ledgers provides some information, but the call is either fast (no such ledgers) or can be very slow on a large cluster with some bookies lost (it retrieves the full list of ledgers).
Even if no underreplicated (UR) ledgers are reported, there may still be a problem, e.g. the Auditor is down.

Changes

Added api/v1/bookie/cluster_info REST API

curl -s 127.0.0.1:8080/api/v1/bookie/cluster_info
{
  "auditorElected" : false,
  "auditorId" : "",
  "clusterUnderReplicated" : false,
  "ledgerReplicationEnabled" : true,
  "totalBookiesCount" : 1,
  "writableBookiesCount" : 1,
  "readonlyBookiesCount" : 0,
  "unavailableBookiesCount" : 0
}
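
For illustration, here is a minimal sketch of how a client could consume this response to approximate the "is it safe to stop one more bookie" check from the motivation. The class name `ClusterInfoCheck`, its helper methods, and the policy itself are assumptions made for this example and are not part of the PR; the regex-based extraction only stands in for a proper JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BookKeeper: interprets a cluster_info response body.
final class ClusterInfoCheck {

    // Returns the raw text of a top-level scalar field, e.g. "false", "1" or "\"\"".
    // A real client would use a JSON library; regex keeps the sketch dependency-free.
    static String field(String json, String name) {
        Pattern p = Pattern.compile(
                "\"" + Pattern.quote(name) + "\"\\s*:\\s*(\"[^\"]*\"|[^,}\\s]+)");
        Matcher m = p.matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("missing field: " + name);
        }
        return m.group(1);
    }

    static boolean flag(String json, String name) {
        return Boolean.parseBoolean(field(json, name));
    }

    static int count(String json, String name) {
        return Integer.parseInt(field(json, name));
    }

    // Assumed policy (not defined by the PR): only proceed when an Auditor is
    // elected, replication is enabled, nothing is reported as underreplicated,
    // and no bookie is already unavailable.
    static boolean looksSafeToStopOneBookie(String json) {
        return flag(json, "auditorElected")
                && flag(json, "ledgerReplicationEnabled")
                && !flag(json, "clusterUnderReplicated")
                && count(json, "unavailableBookiesCount") == 0;
    }

    public static void main(String[] args) {
        // Sample response copied from the PR description.
        String sample = "{"
                + "\"auditorElected\" : false,"
                + "\"auditorId\" : \"\","
                + "\"clusterUnderReplicated\" : false,"
                + "\"ledgerReplicationEnabled\" : true,"
                + "\"totalBookiesCount\" : 1,"
                + "\"writableBookiesCount\" : 1,"
                + "\"readonlyBookiesCount\" : 0,"
                + "\"unavailableBookiesCount\" : 0"
                + "}";
        System.out.println(looksSafeToStopOneBookie(sample));
    }
}
```

With the sample response above this prints `false`, because no Auditor is elected. A stricter or looser policy may be appropriate depending on the ensemble and quorum settings in use.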

Side-fix:
org.apache.bookkeeper.stream.cluster.StandaloneStarter (used by bookie standalone) did not pass LedgerManagerFactory to the HTTP server, so REST calls that needed it didn't work.

@StevenLuMT

rerun failure checks

@eolivelli eolivelli left a comment

Very useful!
In BKVM we had to call many internal BookKeeperAdmin APIs to get these pieces of information.

@nicoloboschi @mino181295 @aluccaroni @dianacle you may be interested

@@ -256,10 +260,19 @@ public static LifecycleComponent buildStorageServer(CompositeConfiguration conf,

// Build http service
if (bkServerConf.isHttpServerEnabled()) {
MetadataBookieDriver metadataDriver = BookieResources.createMetadataDriver(bkServerConf,

Are we going to create an additional connection to ZooKeeper?

AFAICT, there is an internal cache in MetadataDrivers so if there is one already it will be reused

dlg99 commented Dec 22, 2022

@StevenLuMT the dead link checker failure is fixed in #3712

@eolivelli eolivelli left a comment

Lgtm

@dlg99 dlg99 added this to the 4.16.0 milestone Dec 22, 2022
@dlg99 dlg99 self-assigned this Dec 22, 2022
@dlg99 dlg99 merged commit 032aef7 into apache:master Dec 22, 2022
dlg99 added a commit to dlg99/bookkeeper that referenced this pull request Dec 23, 2022
Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>

This closes apache#3710 from dlg99/rest-cluster-info

(cherry picked from commit 032aef7)
dlg99 added a commit to dlg99/bookkeeper that referenced this pull request Dec 23, 2022
dlg99 added a commit to dlg99/bookkeeper that referenced this pull request Dec 23, 2022
dlg99 added a commit that referenced this pull request Dec 27, 2022
dlg99 added a commit that referenced this pull request Dec 27, 2022
yaalsn pushed a commit to yaalsn/bookkeeper that referenced this pull request Jan 30, 2023
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024