
Apache Spark


Apache Spark is an open source distributed general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

Here are 5,205 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Oct 1, 2020
  • Python
joshk0 commented Nov 4, 2020

Is your feature request related to a problem? Please describe.

It is idiomatic for JWTs to be accepted via a header of the form Authorization: Bearer <JWT> (see introduction). Historically, the RFCs surrounding the Authorization header have taken care to specify the authorization scheme as the first part of the header value (e.g. Basic, Di
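The convention described in the comment can be sketched with a small header parser: the first token of the Authorization value names the scheme, and for the Bearer scheme (RFC 6750) the remainder is the JWT. The function name and dict-based header access here are illustrative assumptions, not part of the project being discussed.

```python
def extract_bearer_token(headers):
    """Return the JWT from an 'Authorization: Bearer <JWT>' header,
    or None if the header is missing or uses a different scheme."""
    auth = headers.get("Authorization", "")
    # Per the Authorization header convention, the scheme comes first,
    # separated from the credentials by a single space.
    scheme, _, token = auth.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return None
    return token.strip()
```

Scheme names are compared case-insensitively, since HTTP auth scheme tokens are defined as case-insensitive.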


Flink learning blog. Covers Flink fundamentals, core concepts, internals, hands-on practice, performance tuning, and source-code analysis. Includes study examples for Flink Connectors, Metrics, Libraries, the DataStream API, and the Table API & SQL, plus large production use cases (PV/UV analytics, log storage, real-time deduplication over tens of billions of records, and monitoring/alerting). Please support my column, "Big Data Real-Time Compute Engine: Flink in Practice and Performance Optimization".

  • Updated Nov 11, 2020
  • Java

macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.

  • Updated Jun 20, 2020
  • Python

Open Source Fast Scalable Machine Learning Platform For Smarter Applications: Deep Learning, Gradient Boosting & XGBoost, Random Forest, Generalized Linear Modeling (Logistic Regression, Elastic Net), K-Means, PCA, Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Nov 23, 2020
  • Jupyter Notebook
whuawell commented Jun 19, 2019

Used Spark version:
Used Spark Job Server version (released version, git branch, or docker image version):
Deployed mode (client/cluster on Spark Standalone/YARN/Mesos/EMR, or default): client, Spark Standalone
Actual (wrong) behavior:
curl -d "input.string = a b c a b see hello world ssdsds " 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCo

brunocous commented Sep 2, 2020

I have a simple regression task (using a LightGBMRegressor) where I want to penalize negative predictions more than positive ones. Is there a way to achieve this with the default LightGBM regression objectives? If not, is it possible to define and pass a custom regression objective?
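One common way to express this is a custom objective with asymmetric squared error: residuals at negative predictions are weighted more heavily. The sketch below returns the per-sample gradient and Hessian pairs that LightGBM-style custom objectives expect; the function name, the `neg_penalty` weight, and the plain-list interface are assumptions for illustration, not an API of the project in question.

```python
def asymmetric_l2_objective(preds, labels, neg_penalty=3.0):
    """Asymmetric squared error: loss = w * (pred - y)^2,
    where w = neg_penalty when the prediction is negative, else 1.
    Returns (gradients, hessians) in the shape a LightGBM custom
    objective (fobj) is expected to produce."""
    grads, hesses = [], []
    for p, y in zip(preds, labels):
        w = neg_penalty if p < 0 else 1.0
        grads.append(2.0 * w * (p - y))   # d/dp of w*(p - y)^2
        hesses.append(2.0 * w)            # d^2/dp^2 of w*(p - y)^2
    return grads, hesses
```

A weight of 3.0 makes errors at negative predictions count three times as much; tuning `neg_penalty` trades off how strongly the model avoids going below zero.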

Created by Matei Zaharia

Released May 26, 2014


Related Topics

  • hadoop
  • scala