Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide Flink-K8s Runtime Support for StreamX #325

Merged
merged 39 commits into from
Sep 29, 2021
Merged

Provide Flink-K8s Runtime Support for StreamX #325

merged 39 commits into from
Sep 29, 2021

Conversation

wolfboys
Copy link
Member

Signed-off-by: Al-assad yulin.ying@outlook.com, wolfboys benjobs@qq.com

What problem does this PR solve?

Issue Number: 325

Problem Summary: Provide Flink-K8s Runtime Support for StreamX

What is changed and how it works?

What's Changed:

  • Support for submit/cancel Flink-SQL-Job on Flink-K8s-native session/application mode.
  • Support for tracking Flink job status and metrics information from Flink-K8s cluster.
  • Support for StreamX instance to manage both Flink-Yarn or Flink-K8s runtime Cluster, and to use either independently.

Al-assad and others added 30 commits July 2, 2021 19:41
…k8s-flink

# Conflicts:
#	streamx-plugin/streamx-flink-submit/src/main/scala/com/streamxhub/streamx/flink/submit/SubmitRequest.scala
# Conflicts:
#	streamx-plugin/streamx-flink-submit/src/main/scala/com/streamxhub/streamx/flink/submit/trait/YarnSubmitTrait.scala
* modify configuration constants of workspace(#251)

* typo(#251)

* add isAnyBank method(#251)

* add unified fs operator defined(#251)

* register FsOperator to SpringBoot Bean(#251)

* remove unnecessary import(#251)

* extend the signature of method upload, copy, copyDir(#251)

* Separate workspace storage type into configuration(#251)

* Separate workspace storage type into configuration(#251)

* add fileMd5 method(#251)

* replace the code reference of HdfsUtils to FsOperator(#251)

* change the bean injection behavior of FsOperator(#251)

* change the config key of streamx.workspace(#251)

* fix stack overflow bug

* LfsOperator.upload support dir source

* Update ConfigConst.scala

* Update HdfsOperator.scala

* Update LfsOperator.scala

* Update UnifiledFsOperator.scala

* Update Utils.scala

Co-authored-by: benjobs <benjobs@qq.com>
Workspace storage compatibility for HDFS and LFS (#253)
* modify configuration constants of workspace(#251)

* typo(#251)

* add isAnyBank method(#251)

* add unified fs operator defined(#251)

* register FsOperator to SpringBoot Bean(#251)

* remove unnecessary import(#251)

* extend the signature of method upload, copy, copyDir(#251)

* Separate workspace storage type into configuration(#251)

* Separate workspace storage type into configuration(#251)

* add fileMd5 method(#251)

* replace the code reference of HdfsUtils to FsOperator(#251)

* change the bean injection behavior of FsOperator(#251)

* change the config key of streamx.workspace(#251)

* fix stack overflow bug

* LfsOperator.upload support dir source

* Update ConfigConst.scala

* Update HdfsOperator.scala

* Update LfsOperator.scala

* Update UnifiledFsOperator.scala

* Update Utils.scala

* compatible with flink k8s submit

* compatible with flink k8s submit

* add unit test

* rename

* add code build module

* rename

* add maven tool

* resolve conflicts

Co-authored-by: benjobs <benjobs@qq.com>
* [bugfix] flinkSql verify bug fixed (#244)

 [bugfix] flinkSql verify bug fixed.

* Update github issue and pr template (#245)

* .github: update issue template

* .github: add chinese issue template

* .github: add default title for issue template

* [bugfix] KafkaSource topic no matching type when parsing (#254)

* "create temporary table..." syntax parse bug fixed (#257)

* [bugfix] flinkSql verify bug fixed.

* [feature]SqlConvertUtils format sql support.

* [245] "create temporary table..." syntax parse bug fixed

* [bugfix] flinkSql verify bug #265 (#268)
* modify configuration constants of workspace(#251)

* typo(#251)

* add isAnyBank method(#251)

* add unified fs operator defined(#251)

* register FsOperator to SpringBoot Bean(#251)

* remove unnecessary import(#251)

* extend the signature of method upload, copy, copyDir(#251)

* Separate workspace storage type into configuration(#251)

* Separate workspace storage type into configuration(#251)

* add fileMd5 method(#251)

* replace the code reference of HdfsUtils to FsOperator(#251)

* change the bean injection behavior of FsOperator(#251)

* change the config key of streamx.workspace(#251)

* fix stack overflow bug

* LfsOperator.upload support dir source

* Update ConfigConst.scala

* Update HdfsOperator.scala

* Update LfsOperator.scala

* Update UnifiledFsOperator.scala

* Update Utils.scala

* compatible with flink k8s submit

* compatible with flink k8s submit

* add unit test

* rename

* add code build module

* rename

* add maven tool

* resolve conflicts

* import scalatest

* typo exception message

* add unit test for MavenArtifact

* bug fix

* add unit test for MavenTool

* resolve conflict

* Update MavenArtifact.scala

You can use scala's match to make the code more gracefully

* Update MavenTool.scala

Optimize the code to make the code more concise and gracefully

Co-authored-by: benjobs <benjobs@qq.com>
* add .gitignore

* build fat-jar when submit flink k8s-session job
* add .gitignore

* build fat-jar when submit flink k8s-session job

* remove gitkeep

* config const for k8s and docker environment

* import docker-client dependencies

* add docker operator tool

* add tryWithResource support

* add default image namespace const

* add default image namespace const

* fix bug

* add default image namespace const
* add .gitignore

* build fat-jar when submit flink k8s-session job

* remove gitkeep

* config const for k8s and docker environment

* import docker-client dependencies

* add docker operator tool

* add tryWithResource support

* add default image namespace const

* add default image namespace const

* fix bug

* add default image namespace const

* change spec out path and add it to .gitignore

* add unit test

* remove unnecessary try-with-resource

* add unit test for DockerTools and bug fix

* Update DockerTool.scala

Co-authored-by: benjobs <benjobs@qq.com>
* add .gitignore

* build fat-jar when submit flink k8s-session job

* remove gitkeep

* config const for k8s and docker environment

* import docker-client dependencies

* add docker operator tool

* add tryWithResource support

* add default image namespace const

* add default image namespace const

* fix bug

* add default image namespace const

* change spec out path and add it to .gitignore

* add unit test

* remove unnecessary try-with-resource

* add unit test for DockerTools and bug fix

* Update DockerTool.scala

* update

* build flink image before submit flink on k8s-native application mode

Co-authored-by: benjobs <benjobs@qq.com>
* [optimize] streamx-codebuild rename to streamx-packer #249

* [Feature] storage auto adaptation #259
* [optimize] streamx-packer rename to streamx-flink-packer #249

* [Feature] streamx-console/streamx-console-webapp provide additional GUI adaptations for flink k8s mode #259
…UI adaptations for flink k8s mode (#286)

* [Feature] streamx-console/streamx-console-webapp provide additional GUI adaptations for flink k8s mode

* [Feature] streamx-console/streamx-console-webapp provide additional GUI adaptations for flink k8s mode

* [Feature] streamx-console/streamx-console-webapp provide additional GUI adaptations for flink k8s mode
* Using cache to speed up build fatjar
* update unit test

* fix bug: streamx.console.workspace setting missing

* force create local workspace directory during initialization

* remove refilling setting process

* fix bug: k8s clusterId missing

* fix bug: k8s application name exists
* fix bug: streamx.console.workspace setting missing

* remove refilling setting process

* init streamx-flink-kubernetes module

* init streamx-flink-kubernetes module

* temporary removal of fat-jar build cache

* add necessary enum

* coverter between FlinkAppState and k8s.enums.FlinkJobState

* add tracking info cache

* update tracking cache

* add tracking monitor

* replace cache with cachePool

* update FlinkTRKMonitor

* add CachePool

* add k8s connection checking method

* update k8s events cache value model

* add comment

* update model

* add isInTracking method

* add Kubernetes Event Watcher

* add enhanced try-with-resource method

* update jobStatus cache value

* remove tracking of k8s service events

* add new FlinkClusterClient method

* flink watcher trait

* update flink watcher trait

* update flink watcher trait

* update to be safe for thread

* add watcher config object

* add flink job status watcher

* update flink watcher trait

* update flink watcher trait

* update flink watcher trait

* update flink watcher

* add method comments

* update conf class name

* add @nonnull sign for external api

* replace safeSet method

* add httpclient dependencies

* update FlinkJobStatusWatcher

* update FlinkJobStatusWatcher

* update FlinkJobStatusWatcher

* add collectDistinctTrkIds method

* support implicit call of json marshal/unmarshal

* move conf class to new package

* flink metrics bean

* update flink job status watcher

* support flink metrics tracking

* update copyright

* refactor tracking watcher conf

* support closeable

* integrate watcher control to monitor

Co-authored-by: benjobs <benjobs@qq.com>
* metrics & status tracking support for flink-k8s-mode

* [WIP]Multi version flick support

* [WIP]Multi version flick support

* [WIP]Multi version flick support

* [WIP]Multi version flick support

* [WIP]Multi version flick support

* [WIP]Multi version flick support

* [WIP]Multi version flick support

* Multi version flick support
wolfboys and others added 5 commits September 18, 2021 11:53
* StreamX documentation!

* streamx documentation

* streamx documentation

* streamx documentation
* Update V1_1_2__upgrade_db.sql

脚本错误,该bug导致数据库字段错误,程序启动异常。

* Update V1_1_2__upgrade_db.sql

Co-authored-by: benjobs <benjobs@qq.com>
* fix bug: streamx.console.workspace setting missing

* remove refilling setting process

* init streamx-flink-kubernetes module

* init streamx-flink-kubernetes module

* temporary removal of fat-jar build cache

* add necessary enum

* coverter between FlinkAppState and k8s.enums.FlinkJobState

* add tracking info cache

* update tracking cache

* add tracking monitor

* replace cache with cachePool

* update FlinkTRKMonitor

* add CachePool

* add k8s connection checking method

* update k8s events cache value model

* add comment

* update model

* add isInTracking method

* add Kubernetes Event Watcher

* add enhanced try-with-resource method

* update jobStatus cache value

* remove tracking of k8s service events

* add new FlinkClusterClient method

* flink watcher trait

* update flink watcher trait

* update flink watcher trait

* update to be safe for thread

* add watcher config object

* add flink job status watcher

* update flink watcher trait

* update flink watcher trait

* update flink watcher trait

* update flink watcher

* add method comments

* update conf class name

* add @nonnull sign for external api

* replace safeSet method

* add httpclient dependencies

* update FlinkJobStatusWatcher

* update FlinkJobStatusWatcher

* update FlinkJobStatusWatcher

* add collectDistinctTrkIds method

* support implicit call of json marshal/unmarshal

* move conf class to new package

* flink metrics bean

* update flink job status watcher

* support flink metrics tracking

* update copyright

* refactor tracking watcher conf

* support closeable

* integrate watcher control to monitor

* resolve code conflicts

* resolve code conflicts

* resolve code conflicts

* import scalatest dependency

* move model

* default namespace value

* replace with copy

* update method decalaration

* add exception throw declaration

* limit flink-client timeout

* add test for single tracking task

* fix bug

* add default conf for debug

* fix type error bug

* support get all trackingIds from cache

* update single trk test

* add trkCache watching method for debug

* add FlinkTrkMonitor test

* move TrkConf

* add k8s event watcher

* add guava dependency

* support for flink job status change event publish/subscribe

* test for eventbus

* delete unnecessary comments

* update tryWithResourceException

* method for determining the existence of flink job on remote cluster

* fix bug: retrieve flink-cluster-client error

* ChangeEventBus support sync/async post

* check the legitimacy of the TrkId before the monitor trk method is called

* keep the idempotent of start method

* add the necessary comments

* update lazy start aop

* add necessary comments

* test lazy start behavior of FlinkTrkMonitor

* add necessary comments

* add TrkMonitor Factory

* rename default conf method

* rename FlinkTrkMonitor to K8sFlinkTrkMonitor

* remove unnecessary comments

* add covert method

* update ExecutionMode

* update Application

* untracking flink job on k8s mode

* Integrated K8sFlinkTrkMonitor

* refactor event

* support for K8sFlinkTrkMonitor event post and build-in event listener

* let postEvent method trigger TrkMonitor lazy start behavior

* support more flink metrics tracking

* rename K8S_DEPLOYING state

* catch all flink-k8s-native event

* speed up watcher tracking frequency

* add totalJob complete method

* checkIsInRemoteCluster supplemental unsafe trkId check

* untracking k8s flink job

* update wrapper

* differentiate flink LOST state into SILENT and LOST

* add flink SILENT state

* remove delay start trigger of method getClusterMetrics

* fix flywaydb error

* format k8sNamespace field of Application

* kubernetes mode support

* adaptation request

* update behavior of recovery data

* fix stupid bug

* add FlinkTrkMonitor debug helper

* fix bug: k8sNamespace storage missing

* fix bug: k8s info missing

* update jobName when clusterId changed

* fix bug: flink job cancel fail

* add log for key steps of flink submit

* resolve the conflicts of guava

* update

* resolve conflicts of guave

* refactor streamx-packer module

* fix bug: tag format error during push image

* fix bug: flink.pipeline.jars missing when submit job on application mode

* add DOCKER_IMAGE_NAMESPACE config

* add DOCKER_IMAGE_NAMESPACE config

* support check kubernetes deployment resource

* add new flink state

* optimization flink k8s job state inference algorithm

* add new flink job state

* optimize silent state inference algorithm

* update

* update

* update timeout logger

* update status persistence

* enhance JobStatusCV

* optimize JobStatusCV persistence

* support record flink cluster metrics for each flink job

* support record flink cluster metrics for each flink job

* add new cache debugging method

* set application name to flink job

* enhance flink-k8s session job status catching processing

* abandon the maintenance of separate flink cluster metrics aggregation cache

* fixbug: missing application metrics in view list

* further inference of TERMINAL/POS_TERMINAL status

* cache flink rest api on kubernetes cluster

* error single task timeout limitation

* fixbug: missing upload jar parameter

* support third party dependency for flink-k8s job

* integrate flink-k8s support to streamx-console-server

* explicitly throwing exceptions

* fixbug: resolved maven dependencies are incomplete

* support custom pod-template for flink-k8s mode

* temporarily block flame graph for flink-k8s job

* don't request the yarn interface when for the flink-k8s job

* temporarily remove support for remapping flink-k8s job

* remove some support of K8sTrkMonitor interface

* add retracking interface support

* remove retracking interface support

* custom savepoint support for flink-k8s mode

* [fix bug] OutOffBound Exception while activeProfiles is empty

* Support for customizing flink-k8s configuration from SpringBoot application configuration.

* temporarily block checkpoint failover setting for flink-k8s mode.

* fix wrong flink state inference

* support for email alert on flink-k8s mode

* temporarily block flink-k8s support in custom-code mode

* resolve merge conflicts

* typo

* cancel shade shims

* update default shims to flink-1.13

* fixbug: flinkUserJar path error

* update dockerfile template

* import flink-shims dependency explicitly for package programming on flink session mode

* add flink CLASSLOADER_RESOLVE_ORDER configuration

* rename sub building workspace to be more friendly for debugging

* rename sub workspace path, and add support for log4j configuration

* fix bug: flink job state should be shown in initialing when the k8s pod is in initialization phase for a long time

* support custom flink rest service exposed type on flink-native-k8s mode

* modify the execution mode text

* remove multiple space

* add log for start building docker image

* invalidate rest url cache when untracking flink job

* support get flink remote rest url from k8s

* support jump to flink web ui

* remove direct import from flink lib

* update

* replace with scala implicit converter

* resolve conflict

Co-authored-by: benjobs <benjobs@qq.com>
Fix the UI misalignment problem of Add.vue page
Copy link
Member

@Al-assad Al-assad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我们需要要将 pom.xml 中的 streamx 版本更新到 1.2.0

Copy link
Member

@Al-assad Al-assad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main code is fine, but we need to update all the streamx-components in pom.xml to version 1.2.0 : )

wolfboys and others added 3 commits September 28, 2021 18:38
* move document.

* character logo change.
Update streamx version to 1.2.0
Copy link
Member

@Al-assad Al-assad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wolfboys
Copy link
Member Author

LGTM

@wolfboys wolfboys merged commit 489c6df into main Sep 29, 2021
wolfboys added a commit that referenced this pull request Dec 10, 2021
* [WIP] compatible with both flink yarn/k8s modes task submission/termination (#261)

* [feature] flink k8s native mode support

* [feature] flink k8s native mode support

* [issue#220] refactoring SubmitRequest, SubmitResponse to adapt k8s submit operations

* [issue#220] refactoring SubmitRequest, SubmitResponse to adapt k8s submit operations

* [issue#220] New dto object for flink stop action parameter transfer process

* [issue#220] refactor: move the parameters of the flink stop method to a dedicated dto object

* modify configuration constants of workspace(#251)

* typo(#251)

* add isAnyBank method(#251)

* add unified fs operator defined(#251)

* register FsOperator to SpringBoot Bean(#251)

* remove unnecessary import(#251)

* extend the signature of method upload, copy, copyDir(#251)

* Separate workspace storage type into configuration(#251)

* Separate workspace storage type into configuration(#251)

* add fileMd5 method(#251)

* replace the code reference of HdfsUtils to FsOperator(#251)

* change the bean injection behavior of FsOperator(#251)

* change the config key of streamx.workspace(#251)

* fix stack overflow bug

* LfsOperator.upload support dir source

* Update ConfigConst.scala

* Update HdfsOperator.scala

* Update LfsOperator.scala

* Update UnifiledFsOperator.scala

* Update Utils.scala

* compatible with flink k8s submit

* compatible with flink k8s submit

Co-authored-by: benjobs <benjobs@qq.com>

* Revert "[WIP] compatible with both flink yarn/k8s modes task submission/termination (#261)" (#262)

This reverts commit 7ae4d15.

* Provide Flink-K8s Runtime Support for StreamX (#325)

[feature] Support for submit/cancel Flink-SQL-Job on Flink-K8s-native session/application mode.
[feature] Support for tracking Flink job status and metrics information from Flink-K8s cluster.
[feature] Support for StreamX instance to manage both Flink-Yarn or Flink-K8s runtime Cluster, and to use either

* [Enhancement #480] add code style framework

* [Enhancement #480] add code style framework

* [Enhancement #480] add code style framework

* [Enhancement #480] add code style framework

Co-authored-by: al-assad <yulin.ying@outlook.com>
Co-authored-by: benjobs <benjobs@qq.com>
wolfboys added a commit that referenced this pull request Dec 11, 2021
* [WIP] compatible with both flink yarn/k8s modes task submission/termination (#261)

* [feature] flink k8s native mode support

* [feature] flink k8s native mode support

* [issue#220] refactoring SubmitRequest, SubmitResponse to adapt k8s submit operations

* [issue#220] refactoring SubmitRequest, SubmitResponse to adapt k8s submit operations

* [issue#220] New dto object for flink stop action parameter transfer process

* [issue#220] refactor: move the parameters of the flink stop method to a dedicated dto object

* modify configuration constants of workspace(#251)

* typo(#251)

* add isAnyBank method(#251)

* add unified fs operator defined(#251)

* register FsOperator to SpringBoot Bean(#251)

* remove unnecessary import(#251)

* extend the signature of method upload, copy, copyDir(#251)

* Separate workspace storage type into configuration(#251)

* Separate workspace storage type into configuration(#251)

* add fileMd5 method(#251)

* replace the code reference of HdfsUtils to FsOperator(#251)

* change the bean injection behavior of FsOperator(#251)

* change the config key of streamx.workspace(#251)

* fix stack overflow bug

* LfsOperator.upload support dir source

* Update ConfigConst.scala

* Update HdfsOperator.scala

* Update LfsOperator.scala

* Update UnifiledFsOperator.scala

* Update Utils.scala

* compatible with flink k8s submit

* compatible with flink k8s submit

Co-authored-by: benjobs <benjobs@qq.com>

* Revert "[WIP] compatible with both flink yarn/k8s modes task submission/termination (#261)" (#262)

This reverts commit 7ae4d15.

* Provide Flink-K8s Runtime Support for StreamX (#325)

[feature] Support for submit/cancel Flink-SQL-Job on Flink-K8s-native session/application mode.
[feature] Support for tracking Flink job status and metrics information from Flink-K8s cluster.
[feature] Support for StreamX instance to manage both Flink-Yarn or Flink-K8s runtime Cluster, and to use either

* [Enhancement #480] add code style framework

* [Enhancement #480] add code style framework

* [Enhancement #480] add code style framework

* [Enhancement #480] add code style framework

* [Enhancement #512] add header check style

* reset

Co-authored-by: al-assad <yulin.ying@outlook.com>
Co-authored-by: benjobs <benjobs@qq.com>
@tisonkun tisonkun deleted the feature-k8s branch September 30, 2022 08:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants