[Feature] Add a new rest api to query instance host and ip information for query profile action in branch-1.2-lts #18726 #18728

Closed
bigben0204 wants to merge 769 commits into apache:master from bigben0204:branch-1.2-lts-instances-api

Conversation

@bigben0204
Contributor

Proposed changes

Issue Number: close #18726

Problem summary

Add a new API to query the host IP and port of instances.

Checklist(Required)

  • Does it affect the original behavior
    No
  • Have unit tests been added
    No need
  • Has the documentation been added or modified
  • Does it need to update dependencies
    No need
  • Does this PR support rollback (if NO, please explain WHY)

Further comments

BePPPower and others added 30 commits March 1, 2023 17:54
* (feature)[DOE]Support array for Doris on ES
…json with jackson (#16806)

* Support mapping es date format, default/yyyy-MM-dd HH:mm:ss/yyyy-MM-dd/epoch_millis

* Replace simple json with jackson, resolving the random column order problem

* Add es array doc version
…a from FE catalog (#16647)

When querying the information_schema database, BE calls an FE RPC
to get schema info such as the db name list, table name list, etc.
But some external catalogs may fail to return this info because of wrong connection info.
We should catch these kinds of exceptions and skip the failing catalog, so that the query can continue to
get the schema info of the other catalogs.
Otherwise, the whole query on information_schema will fail, even if the user just wants to get
info about the internal catalog.

Also set the jdbc connection timeout to 5s, to avoid the thrift rpc timeout from BE to FE (default is 30s).
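
The skip-on-failure behaviour described in this commit message can be sketched as follows. This is only a minimal illustration in Python, not the FE code: `collect_db_names`, the catalog names, and the callables are all hypothetical stand-ins for the real catalog objects.

```python
from typing import Callable, Dict, List


def collect_db_names(catalogs: Dict[str, Callable[[], List[str]]]) -> Dict[str, List[str]]:
    """Gather db names per catalog, skipping catalogs that raise (hypothetical helper)."""
    result: Dict[str, List[str]] = {}
    for name, list_dbs in catalogs.items():
        try:
            result[name] = list_dbs()
        except Exception as exc:
            # One unreachable catalog must not fail the whole information_schema query.
            print(f"skip catalog {name}: {exc}")
    return result


def _broken() -> List[str]:
    # Stands in for an external catalog with wrong connection info.
    raise ConnectionError("wrong connection info")


catalogs = {"internal": lambda: ["information_schema", "demo"], "bad_jdbc": _broken}
schema_info = collect_db_names(catalogs)
```

With this shape, the internal catalog is still returned even though `bad_jdbc` raises.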
After 1.2.0, Doris does not support native UDFs; return an error when creating a native function.
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Fix db name not being found in the internal catalog.
…atalog and fix jdbc catalog issue (#17209)

Check required properties when creating a catalog,
to avoid strange errors when required properties are missing.

This PR adds checks for:

hms catalog: check the validity of the dfs.ha properties

jdbc catalog: check that jdbc_url, driver_url, and driver_class are set.
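
A required-property check of this kind might look like the sketch below. It is written in Python purely for illustration; the function name and the choice to treat empty strings as missing are assumptions, and only the three property names come from the commit message.

```python
from typing import Dict, List

# Property names taken from the commit message above.
REQUIRED_JDBC_PROPERTIES = ("jdbc_url", "driver_url", "driver_class")


def missing_jdbc_properties(props: Dict[str, str]) -> List[str]:
    """Return the required jdbc catalog properties that are absent or empty (hypothetical helper)."""
    return [key for key in REQUIRED_JDBC_PROPERTIES if not props.get(key)]


incomplete = {"type": "jdbc", "jdbc_url": "jdbc:mysql://172.18.0.1:3306/doris_test"}
missing = missing_jdbc_properties(incomplete)
if missing:
    # Failing early gives a clear message instead of a strange error later on.
    print("missing required properties: " + ", ".join(missing))
```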

Fix NPE when init MasterCatalogExecutor
The MasterCatalogExecutor may be called by FrontendServiceImpl from BE, which does not have ConnectionContext.

Add more jdbc url params to resolve Chinese character issues

add useUnicode=true&characterEncoding=utf-8 by default in jdbc catalog when connecting to MySQL
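
Appending those defaults to a URL could be sketched like this. The helper name is hypothetical, the port 3306 is a placeholder, and leaving user-supplied values untouched is an assumption; the commit message does not say how conflicts are handled.

```python
# Defaults named in the commit message above.
DEFAULT_MYSQL_PARAMS = {"useUnicode": "true", "characterEncoding": "utf-8"}


def with_default_params(jdbc_url: str) -> str:
    """Append the default params to a MySQL jdbc url unless already present (hypothetical helper)."""
    base, _, query = jdbc_url.partition("?")
    pairs = [pair for pair in query.split("&") if pair]
    present = {pair.split("=", 1)[0] for pair in pairs}
    # Assumption: a value the user set explicitly is kept as-is.
    pairs += [f"{key}={value}" for key, value in DEFAULT_MYSQL_PARAMS.items() if key not in present]
    return base + "?" + "&".join(pairs)


url = with_default_params("jdbc:mysql://172.18.0.1:3306/doris_test?useSSL=false")
```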

Update FAQ doc of catalog
Co-authored-by: maochongxin <maochongxin@gmail.com>
…log (#17245)

1. The first property is `only_specified_database`:
In the past, `Jdbc Catalog` synchronized all databases from the source database.
Now we add a parameter called `only_specified_database` to the jdbc catalog so that only the specified database is synchronized, e.g.:

```sql
create resource if not exists ${resource_name} properties(
    "type"="jdbc",
    "user"="root",
    "password"="123456",
    "jdbc_url" = "jdbc:mysql://172.18.0.1:${mysql_port}/doris_test?useSSL=false",
    "driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/mysql-connector-java-8.0.25.jar",
    "driver_class" = "com.mysql.cj.jdbc.Driver",
    "only_specified_database" = "true"
);
```
If `only_specified_database` is `true`, the jdbc catalog will only synchronize the database specified in `jdbc_url`.

2. The second property is `lower_case_table_names`:
This property synchronizes the table names of the jdbc external data source in lower case.

```sql
create resource if not exists ${resource_name} properties(
  "type"="jdbc",
  "user"="doris_test",
  "password"="123456",
  "jdbc_url" = "jdbc:oracle:thin:@172.18.0.1:${oracle_port}:${SID}",
  "driver_url" = "https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/jdbc_driver/ojdbc8.jar",
  "driver_class" = "oracle.jdbc.driver.OracleDriver",
  "lower_case_table_names" = "true"
);
```
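
The effect of the two properties can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the catalog implementation: all function names are hypothetical, the port 3306 stands in for `${mysql_port}`, and the URL parsing only covers MySQL-style URLs (an Oracle thin URL like the one above has a different shape).

```python
from typing import Iterable, List
from urllib.parse import urlparse


def database_from_jdbc_url(jdbc_url: str) -> str:
    """Extract the database name from a MySQL-style jdbc url (hypothetical helper)."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("not a jdbc url: " + jdbc_url)
    # Dropping the "jdbc:" prefix leaves an ordinary URL that urlparse understands.
    return urlparse(jdbc_url[len("jdbc:"):]).path.lstrip("/")


def databases_to_sync(all_databases: Iterable[str], jdbc_url: str,
                      only_specified_database: bool) -> List[str]:
    """Keep every database, or only the one named in jdbc_url."""
    if not only_specified_database:
        return list(all_databases)
    wanted = database_from_jdbc_url(jdbc_url)
    return [db for db in all_databases if db == wanted]


def normalize_table_names(table_names: Iterable[str], lower_case_table_names: bool) -> List[str]:
    """Lower-case table names when the property is enabled."""
    return [t.lower() for t in table_names] if lower_case_table_names else list(table_names)


url = "jdbc:mysql://172.18.0.1:3306/doris_test?useSSL=false"
synced = databases_to_sync(["doris_test", "other_db"], url, only_specified_database=True)
tables = normalize_table_names(["STUDENT", "Orders"], lower_case_table_names=True)
```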
Return msg like:
`ERROR 1105 (HY000): errCode = 2, detailMessage = (172.21.0.101)failed to init HDFSCommonBuilder, please check check be/conf/hdfs-site.xml`
So that we can know where the error comes from.
Only for branch-1.2-lts, master has done this
…in corner case of tablet rebalance (#16889)" (#17386)

This reverts commit 783c7d3.
* [Feature](load) Add submitter and comments to load job
… (#17396)" (#17440)

This reverts commit 3fe77d2.
This PR changes the meta version, so it should not be backported to the 1.2 branch.

Alternative:
For 1.2, you can put the submitter and comments info in the jobProperties of LoadJob, so that we don't need to upgrade the meta version, because jobProperties is a map.
…stream load pipe instead of fragment instance id (#17439)

cherry-pick part of #17362

This PR does not affect load behavior; it just adds the load id to the Protobuf message,
so that users can upgrade from 1.2.3 to 2.x smoothly.
…14078) (#17472)

fmt::format doesn't support non-template objects as args, even if they implement
`to_string()` or `operator<<`, so the original code may cause false to be printed
instead of the real cause of the failure. So to_string() needs to be invoked manually.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
morningman and others added 18 commits April 13, 2023 23:22
For the int identity type of sqlserver, the column type read from JDBC is called int identity, so we need to deal with this case.
#18592)

Refresh the table object when refreshing an external table. This includes:
refresh catalog, refresh database, and refresh table.
Before visiting a database, we need to guarantee that the catalog has been initialized.
Before visiting a table, we need to guarantee that the catalog and database have been initialized.
… group by list (#18630)

Because of the limitation of ProjectPlanner, we have to keep set agg functions materialized if there are any virtual slots in the group by list, such as 'GROUPING_ID'.
Currently jdbc may have a problem where too many connections are opened and not released,
so change the datasource properties: init = 1, min = 1, max = 100, and idle time of 10 minutes.
When there is no snapshot, no result should be shown.
1. show backup where SnapshotName="xxx";
2. show backup where SnapshotName like "%XXX%"
…ake (#18624) (#18661)

Currently, our third party libraries are built by autotools or cmake. Under some scenarios, we may use system-wide headers or libraries to build them which may make the build process fail.

We can configure the search paths explicitly to help autotools and cmake find the right dependencies.
…titioned. (#17932)

Referring to `org.apache.doris.planner.external.HiveSplitter`, the file cache of `HiveMetaStoreCache`
may be created even if the table is a non-partitioned table,
so the `RefreshTableStmt` should consider this case and handle it.
When we use a hive client to submit an `INSERT INTO TBL SELECT * FROM ...` or `INSERT INTO TBL VALUES ...`
sql and the table is a non-partitioned table, the hms will generate an insert event. The insert stmt may change the
hdfs file distribution of this table, but currently we do not handle this, so the file cache of this table may be inaccurate.
…ow table command (#18645)

* [fix](trino catalog) To specify both catalog and database, run the show table command

* fix
…is too large or is not reachable (#18662)

When querying tables in the information_schema database, the query may time out because:

1. There are external catalogs with too many tables.
2. An external catalog is unreachable.

So I add a new FE config infodb_support_ext_catalog.
The default is false, which means that when selecting from tables in the information_schema database,
the result will not contain information about tables in external catalogs.
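
The effect of the flag can be pictured with the following sketch. It is a Python illustration only: `visible_catalogs` is a hypothetical helper, and the internal catalog being named `internal` is an assumption made for the example.

```python
from typing import Iterable, List


def visible_catalogs(catalog_names: Iterable[str],
                     infodb_support_ext_catalog: bool = False,
                     internal_name: str = "internal") -> List[str]:
    """Pick the catalogs that contribute rows to information_schema queries (hypothetical helper)."""
    if infodb_support_ext_catalog:
        return list(catalog_names)
    # Default (false): external catalogs are left out, so a huge or
    # unreachable catalog cannot make the query time out.
    return [name for name in catalog_names if name == internal_name]


names = ["internal", "hive_prod", "jdbc_mysql"]
default_view = visible_catalogs(names)
full_view = visible_catalogs(names, infodb_support_ext_catalog=True)
```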

Describe your changes.
For now, there are 3 packages for the release binaries of Doris: https://doris.apache.org/download
Users may be confused about how to download and deploy these packages.

So I provide a download script for each release; users can simply download the script and run it, like:

```
> sh download_x64_apache.sh

Begin to download FE from "https://mirrors.tuna.tsinghua.edu.cn/apache/doris/1.2/1.2.3-rc02/apache-doris-fe-1.2.3-bin-x86_64.tar.xz" to "apache-doris-1.2.3-bin/" ...
Total size: 408078012 Bytes
#################################################### 100.0%
Begin to download BE from "https://mirrors.tuna.tsinghua.edu.cn/apache/doris/1.2/1.2.3-rc02/apache-doris-be-1.2.3-bin-x86_64.tar.xz" to "apache-doris-1.2.3-bin/" ...
Total size: 606211324 Bytes
#################################################### 100.0%
Begin to download DEPS from "https://mirrors.tuna.tsinghua.edu.cn/apache/doris/1.2/1.2.3-rc02/apache-doris-dependencies-1.2.3-bin-x86_64.tar.xz" to "apache-doris-1.2.3-bin/" ...
Total size: 253869148 Bytes
#################################################### 100.0%
Begin to assemble the binaries ...
Move java-udf-jar-with-dependencies.jar to be/lib/ ...
Download complete!
You can now deploy Apache Doris from apache-doris-1.2.3-bin/
```

The script will do the rest.

This script will later be published on the Download page of the Apache Doris website, so that users can easily get
it and use it.

Currently it is only for the Linux platform. Other platforms are untested.
…rmation for query profile action in branch-1.2-lts(#18726)

Add api and docs description
@github-actions github-actions bot added the area/load, area/nereids, area/planner, area/spark-load, area/sql/function, area/vectorization, kind/docs, and kind/test labels Apr 17, 2023
dingben added 2 commits April 17, 2023 12:51
…rmation for query profile action in branch-1.2-lts(#18726)

Add api and docs description
@bigben0204 bigben0204 closed this Apr 17, 2023
@bigben0204 bigben0204 deleted the branch-1.2-lts-instances-api branch April 17, 2023 04:57