[Discuss] Add Server for seatunnel #1947
Comments
This is a very good proposal. I think a web server is a very important feature for SeaTunnel.
Thank you for your comment.
@dijiekstra When will the feature you described be released?
It is still in the design stage. Such a large feature needs to be discussed and approved by the community before it can be developed.
@CalvinKirs @ruanwenjun @gaojun2048
Hi, contributors and everyone else:
I'm not sure whether we need
It depends on the community. If SeaTunnel is devoted to becoming a platform, this could be good.
Or the Scheduler pushes the result to us.
Great. This proposal will take a long time to build. I suggest taking integration into consideration, such as user integration with LDAP, auth integration with Ranger, etc.
Integration with Ranger is a good idea.
Since no one has raised objections, I will start development next week. I will update the progress in this issue regularly.
Good! I'm in.
Done.
We are looking for people who are willing to work together on this feature. Are you interested in participating?
Hi guys, my name is Monica and I am a PM. I designed some parts of the functionality below. Looking forward to #2099
ONLINE API VIEW
@dijiekstra I wanna join, how do I start??
I will handle the front-end work accordingly; please refer to the front-end changes in #2076.
Is ST not considering a theme-switching feature?
Basic script management is already done; I'll focus on developing the integration with the Scheduler.
What tasks are currently unclaimed?
I'm running into a problem while using it: No matched script save dir [/dj]. I don't know how to handle this.
Code of Conduct
Search before asking
Describe the proposal
Background
Suppose I am a SeaTunnel user who wants to import database data or business logs into an OLAP engine. Today I can only submit tasks from the command line, and stopping and maintaining tasks depends on Spark/Flink. This creates a huge amount of extra work for us.
Now consider the SeaTunnel developer's perspective:
A platform or service that offers only command-line interaction, with no visual control platform, is unreasonable.
What does the control-platform need to do?
The most important thing is to manage the configuration information of the data integration task.
Users can easily complete task configuration in the WebUI, including input and output data sources, field information, partition information, filter conditions, abnormal data handling, scheduling time, concurrency control, traffic control, and incremental or full data integration.
In short, the goal is to let users express their business requirements through simple configuration.
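As a rough illustration of the configuration items listed above, a task definition could be modeled as a small data structure. This is only a sketch under assumptions: the class names (`TaskConfig`, `Endpoint`, `SyncMode`), the field names, and the validation rules are hypothetical and are not taken from the SeaTunnel code base or from the designs in #1968 and #1969.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class SyncMode(Enum):
    """Full or incremental data integration (hypothetical enum)."""
    FULL = "full"
    INCREMENTAL = "incremental"


@dataclass
class Endpoint:
    """One side of a task: an input or output data source (hypothetical)."""
    datasource: str                       # name of a registered data source
    table: str
    fields: List[str] = field(default_factory=list)
    partition: Optional[str] = None       # partition information, if any


@dataclass
class TaskConfig:
    """Configuration a user would fill in through the WebUI (hypothetical)."""
    name: str
    source: Endpoint
    sink: Endpoint
    filter_condition: Optional[str] = None
    schedule_cron: Optional[str] = None          # scheduling time
    parallelism: int = 1                         # concurrency control
    max_records_per_second: Optional[int] = None  # traffic control
    sync_mode: SyncMode = SyncMode.FULL

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        problems = []
        if not self.name.strip():
            problems.append("task name is empty")
        if self.parallelism < 1:
            problems.append("parallelism must be >= 1")
        if self.max_records_per_second is not None and self.max_records_per_second <= 0:
            problems.append("max_records_per_second must be positive")
        if (self.source.fields and self.sink.fields
                and len(self.source.fields) != len(self.sink.fields)):
            problems.append("source and sink field counts differ")
        return problems
```

Collecting validation problems in a list, rather than raising on the first one, would let a WebUI show every issue with a form at once.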
Beyond better development, the remaining goal is to make operations easier:
Provide task execution logs so users can check how their tasks ran;
Provide management of data sources and permissions, which is common in multi-user and multi-tenant scenarios;
Provide system load monitoring and task execution alarms.
Of course, these capabilities would ideally be integrated with other types of jobs on the same platform, because other jobs have similar requirements. This places higher demands on the design of the control platform: it should reuse the existing capabilities of the scheduling system or maintenance center, and where no corresponding service exists, it should have a built-in capability to cover the gap.
But some things cannot be done by relying on other applications, or the ROI of doing so is too low.
Therefore, for users to make better use of SeaTunnel, a control platform is essential.
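One way to read the "reuse existing capabilities, otherwise fall back to a built-in one" requirement is as a pluggable interface with a default implementation. The sketch below assumes hypothetical names (`SchedulerEngine`, `BuiltInScheduler`, `pick_scheduler`); it does not reflect the actual design in #1968 or any real DolphinScheduler API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class SchedulerEngine(ABC):
    """Hypothetical interface the control platform would program against."""

    @abstractmethod
    def submit(self, task_name: str, cron: str) -> str:
        """Register a task with a cron expression and return a job id."""

    @abstractmethod
    def stop(self, job_id: str) -> None:
        """Stop a previously submitted job."""


class BuiltInScheduler(SchedulerEngine):
    """Minimal in-memory fallback used when no external scheduler exists."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}   # job id -> task name
        self._next_id = 0

    def submit(self, task_name: str, cron: str) -> str:
        self._next_id += 1
        job_id = f"builtin-{self._next_id}"
        self._jobs[job_id] = task_name
        return job_id

    def stop(self, job_id: str) -> None:
        # Stopping an unknown job is treated as a no-op in this sketch.
        self._jobs.pop(job_id, None)

    def running(self) -> List[str]:
        return sorted(self._jobs)


def pick_scheduler(external: Optional[SchedulerEngine]) -> SchedulerEngine:
    """Prefer an existing external scheduler; otherwise use the built-in one."""
    return external if external is not None else BuiltInScheduler()
```

With this shape, an adapter for an external scheduling system would be one more `SchedulerEngine` subclass, and the rest of the platform would not need to know which one is in use.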
Target
In a word: provide convenient task development, operation, and maintenance, so that end-to-end data integration can be completed easily.
Functional Target
Operability
Extensibility
Architectural Design
#1968
Detail Design
#1969
Subsequent planning
This round of design and development is v1.0; more people are welcome to join us to implement more functions.
Integration with DolphinScheduler, for example as scheduler-engine-DS
Web page development, using open-source front-end scaffolding to complete development quickly
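Background
Suppose I am a seatunnel user who wants to import a database or business logs into an OLAP engine. Right now I can only submit tasks through the command line, and stopping and operating tasks depends on spark/flink; this brings us a huge amount of extra work.
Now look at it from the seatunnel developer's perspective:
A platform or service that provides no visual control platform, only command-line interaction, is simply unacceptable.
What does the control platform need to do?
The most important thing is managing the configuration information of data integration tasks.
Let users easily complete task configuration through the WebUI: for example input & output data sources, field information, partition information, filter conditions, abnormal data handling, scheduling time, concurrency control, traffic control, incremental or full configuration, and so on.
In short, let users express their business requirements through configuration information as far as possible.
Beyond better development, what remains is making operations simpler:
Of course, ideally these capabilities can be integrated with other types of jobs on the same platform (because other jobs have similar requirements), so this places higher demands on the design of the control platform: it should reuse the existing capabilities of the scheduling system or operations center, and if no corresponding service exists, it should have its own built-in capability to support this.
But some things cannot be done by relying on other applications, or the ROI is too low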
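SE
, such as automatically adding new fields or inserting removed fields as empty data, is still feasible post-processing
capability, then seatunnel only needs to write the data into a partition of a temporary table, and the corresponding operation can then be completed through hive/spark/flink batch processing. So, in summary, for users to make better use of seatunnel, a control platform is essential for us.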
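Target
In one sentence: provide convenient task development and operations, so that end-to-end data integration can be completed easily.
Functional targets
Operability
Extensibility
Outline design
#1968
Detailed design
#1969
Follow-up planning
This design and development counts as v1.0; more people will need to join later to implement more features.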
Task list
// I will fill in the task list soon
Are you willing to submit PR?