SenseML is a machine learning framework built on Scala and Spark. Through simple configuration it provides feature engineering components such as automatic feature generation, feature combination and extension, and feature selection, along with hyperparameter tuning. It supports both high-level APIs for full-stack use and lower-level component APIs for customization, which simplifies the development of large-scale machine learning tasks. At present SenseML mainly consists of the Feature module, which covers feature engineering functionality.
- Time dimension features
- Groupby aggregation features: sum, avg, min, max, count
- TimeSeries features
- Pivot features
- [TODO] Feature quantile
- [TODO] Feature scale
- [TODO] Feature combination
- [TODO] Feature selection
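
To make the group-by aggregation and pivot items above concrete, here is a minimal sketch in plain Scala collections. It does not use SenseML's API, which is not shown in this README; the `Txn` record, its field names, and the `FeatureSketch` object are hypothetical and exist only for illustration. In Spark the same idea is typically expressed with `groupBy(...).agg(...)` and `groupBy(...).pivot(...)` on a DataFrame.

```scala
// Hypothetical input record; field names are illustrative, not part of SenseML.
final case class Txn(userId: String, category: String, amount: Double)

// One row of aggregation features per key: sum, avg, min, max, count.
final case class AggFeatures(sum: Double, avg: Double, min: Double, max: Double, count: Long)

object FeatureSketch {
  // Group-by aggregation: compute the five aggregates of `amount` per userId.
  def aggregate(rows: Seq[Txn]): Map[String, AggFeatures] =
    rows.groupBy(_.userId).map { case (user, txns) =>
      val amounts = txns.map(_.amount) // never empty: groupBy only yields non-empty groups
      val total = amounts.sum
      user -> AggFeatures(total, total / amounts.size, amounts.min, amounts.max, amounts.size.toLong)
    }

  // Pivot: one entry per userId, one "column" per category, cell = summed amount.
  def pivot(rows: Seq[Txn]): Map[String, Map[String, Double]] =
    rows.groupBy(_.userId).map { case (user, txns) =>
      user -> txns.groupBy(_.category).map { case (cat, ts) => cat -> ts.map(_.amount).sum }
    }
}
```

Calling `FeatureSketch.aggregate` on a sequence of `Txn` values returns one `AggFeatures` per user, and `FeatureSketch.pivot` returns a per-user map from category to total amount. A category a user never appears in is simply absent from that user's map, so callers would need to pick a default (for example 0.0) when flattening the result into fixed feature columns.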
Yintao Xue (ruogu.org)
Licensed to SenseML (http://senseml.org) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. SenseML licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Apache and the Apache feather logo are trademarks of The Apache Software Foundation. Apache Lucene, Apache Spark and their respective logos are trademarks of the Apache Software Foundation. Please see the Apache Trademark Policy for more information.