
Update to latest Spark master #1

Merged 1,085 commits on Feb 5, 2015
Changes from all commits
Commits (1,085)
fd3a8a1
[SPARK-733] Add documentation on use of accumulators in lazy transfor…
Jan 16, 2015
ee1c1f3
[SPARK-4937][SQL] Adding optimization to simplify the And, Or condit…
scwf Jan 16, 2015
61b427d
[SPARK-5193][SQL] Remove Spark SQL Java-specific API.
rxin Jan 17, 2015
f3bfc76
[SQL][minor] Improved Row documentation.
rxin Jan 17, 2015
c1f3c27
[SPARK-4937][SQL] Comment for the newly optimization rules in `Boolea…
scwf Jan 17, 2015
6999910
[SPARK-5096] Use sbt tasks instead of vals to get hadoop version
marmbrus Jan 18, 2015
e7884bc
[SQL][Minor] Added comments and examples to explain BooleanSimplifica…
rxin Jan 18, 2015
e12b5b6
MAINTENANCE: Automated closing of pull requests.
pwendell Jan 18, 2015
ad16da1
[HOTFIX]: Minor clean up regarding skipped artifacts in build files.
pwendell Jan 18, 2015
1727e08
[SPARK-5279][SQL] Use java.math.BigDecimal as the exposed Decimal type.
rxin Jan 18, 2015
1a200a3
[SQL][Minor] Update sql doc according to data type APIs changes
scwf Jan 18, 2015
1955645
[SQL][minor] Put DataTypes.java in java dir.
rxin Jan 19, 2015
7dbf1fd
[SQL] fix typo in class description
Jan 19, 2015
851b6a9
SPARK-5217 Spark UI should report pending stages during job execution…
ScrapCodes Jan 19, 2015
3453d57
[SPARK-3288] All fields in TaskMetrics should be private and use gett…
Jan 19, 2015
4a4f9cc
[SPARK-5088] Use spark-class for running executors directly
jongyoul Jan 19, 2015
1ac1c1d
MAINTENANCE: Automated closing of pull requests.
pwendell Jan 19, 2015
4432568
[SPARK-5282][mllib]: RowMatrix easily gets int overflow in the memory…
hhbyyh Jan 19, 2015
cd5da42
[SPARK-5284][SQL] Insert into Hive throws NPE when a inner complex ty…
yhuai Jan 19, 2015
2604bc3
[SPARK-5286][SQL] Fail to drop an invalid table when using the data s…
yhuai Jan 19, 2015
74de94e
[SPARK-4504][Examples] fix run-example failure if multiple assembly j…
gvramana Jan 19, 2015
e69fb8c
[SPARK-5214][Core] Add EventLoop and change DAGScheduler to an EventLoop
zsxwing Jan 20, 2015
306ff18
SPARK-5270 [CORE] Provide isEmpty() function in RDD API
srowen Jan 20, 2015
debc031
[SQL][minor] Add a log4j file for catalyst test.
rxin Jan 20, 2015
4afad9c
[SPARK-4803] [streaming] Remove duplicate RegisterReceiver message
ilayaperumalg Jan 20, 2015
9d9294a
[SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException
jongyoul Jan 20, 2015
8140802
[SQL][Minor] Refactors deeply nested FP style code in BooleanSimplifi…
liancheng Jan 20, 2015
c93a57f
SPARK-4660: Use correct class loader in JavaSerializer (copy of PR #3…
jacek-lewandowski Jan 20, 2015
769aced
[SPARK-5329][WebUI] UIWorkloadGenerator should stop SparkContext.
sarutak Jan 20, 2015
23e2554
SPARK-5019 [MLlib] - GaussianMixtureModel exposes instances of Multiv…
tgaloppo Jan 20, 2015
bc20a52
[SPARK-5287][SQL] Add defaultSizeOf to every data type.
yhuai Jan 20, 2015
d181c2a
[SPARK-5323][SQL] Remove Row's Seq inheritance.
rxin Jan 20, 2015
2f82c84
[SPARK-5186] [MLLIB] Vector.equals and Vector.hashCode are very inef…
hhbyyh Jan 20, 2015
9a151ce
[SPARK-5294][WebUI] Hide tables in AllStagePages for "Active Stages, …
sarutak Jan 21, 2015
bad6c57
[SPARK-5275] [Streaming] include python source code
Jan 21, 2015
ec5b0f2
[HOTFIX] Update pom.xml to pull MapR's Hadoop version 2.4.1.
rkannan82 Jan 21, 2015
424d8c6
[SPARK-5297][Streaming] Fix Java file stream type erasure problem
jerryshao Jan 21, 2015
8c06a5f
[SPARK-5336][YARN]spark.executor.cores must not be less than spark.ta…
WangTaoTheTonic Jan 21, 2015
2eeada3
SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in Ya…
sryza Jan 21, 2015
aa1e22b
[MLlib] [SPARK-5301] Missing conversions and operations on IndexedRow…
Jan 21, 2015
7450a99
[SPARK-4749] [mllib]: Allow initializing KMeans clusters using a seed
str-janus Jan 21, 2015
3ee3ab5
[SPARK-5064][GraphX] Add numEdges upperbound validation for R-MAT gra…
Jan 21, 2015
812d367
[SPARK-5244] [SQL] add coalesce() in sql parser
adrian-wang Jan 21, 2015
8361078
[SPARK-5009] [SQL] Long keyword support in SQL Parsers
chenghao-intel Jan 21, 2015
b328ac6
Revert "[SPARK-5244] [SQL] add coalesce() in sql parser"
JoshRosen Jan 21, 2015
ba19689
[SQL] [Minor] Remove deprecated parquet tests
liancheng Jan 21, 2015
3be2a88
[SPARK-4984][CORE][WEBUI] Adding a pop-up containing the full job des…
scwf Jan 21, 2015
9bad062
[SPARK-5355] make SparkConf thread-safe
Jan 22, 2015
27bccc5
[SPARK-5202] [SQL] Add hql variable substitution support
chenghao-intel Jan 22, 2015
ca7910d
[SPARK-3424][MLLIB] cache point distances during k-means|| init
mengxr Jan 22, 2015
fcb3e18
[SPARK-5317]Set BoostingStrategy.defaultParams With Enumeration Algo.…
Peishen-Jia Jan 22, 2015
3027f06
[SPARK-5147][Streaming] Delete the received data WAL log periodically
tdas Jan 22, 2015
246111d
[SPARK-5365][MLlib] Refactor KMeans to reduce redundant data
viirya Jan 22, 2015
820ce03
SPARK-5370. [YARN] Remove some unnecessary synchronization in YarnAll…
sryza Jan 22, 2015
3c3fa63
[SPARK-5233][Streaming] Fix error replaying of WAL introduced bug
jerryshao Jan 23, 2015
e0f7fb7
[SPARK-5315][Streaming] Fix reduceByWindow Java API not work bug
jerryshao Jan 23, 2015
ea74365
[SPARK-3541][MLLIB] New ALS implementation with improved storage
mengxr Jan 23, 2015
cef1f09
[SPARK-5063] More helpful error messages for several invalid operations
JoshRosen Jan 24, 2015
e224dbb
[SPARK-5351][GraphX] Do not use Partitioner.defaultPartitioner as a p…
maropu Jan 24, 2015
09e09c5
[SPARK-5058] Part 2. Typos and broken URL
jongyoul Jan 24, 2015
0d1e67e
[SPARK-5214][Test] Add a test to demonstrate EventLoop can be stopped…
zsxwing Jan 24, 2015
d22ca1e
Closes #4157
rxin Jan 25, 2015
412a58e
Add comment about defaultMinPartitions
idanz Jan 25, 2015
2d9887b
[SPARK-5401] set executor ID before creating MetricsSystem
ryan-williams Jan 25, 2015
aea2548
[SPARK-5402] log executor ID at executor-construction time
ryan-williams Jan 25, 2015
c586b45
SPARK-3852 [DOCS] Document spark.driver.extra* configs
srowen Jan 25, 2015
383425a
SPARK-3782 [CORE] Direct use of log4j in AkkaUtils interferes with ce…
srowen Jan 25, 2015
1c30afd
SPARK-5382: Use SPARK_CONF_DIR in spark-class if it is defined
jacek-lewandowski Jan 25, 2015
9f64357
SPARK-4506 [DOCS] Addendum: Update more docs to reflect that standalo…
srowen Jan 25, 2015
8f5c827
[SPARK-5344][WebUI] HistoryServer cannot recognize that inprogress fi…
sarutak Jan 25, 2015
fc2168f
[SPARK-5326] Show fetch wait time as optional metric in the UI
kayousterhout Jan 26, 2015
0528b85
SPARK-4430 [STREAMING] [TEST] Apache RAT Checks fail spuriously on te…
srowen Jan 26, 2015
8df9435
[SPARK-5268] don't stop CoarseGrainedExecutorBackend for irrelevant D…
CodingCat Jan 26, 2015
8125168
[SPARK-5384][mllib] Vectors.sqdist returns inconsistent results for s…
hhbyyh Jan 26, 2015
1420931
[SPARK-5355] use j.u.c.ConcurrentHashMap instead of TrieMap
Jan 26, 2015
c094c73
[SPARK-5339][BUILD] build/mvn doesn't work because of invalid URL for…
sarutak Jan 26, 2015
54e7b45
SPARK-4147 [CORE] Reduce log4j dependency
srowen Jan 26, 2015
b38034e
Fix command spaces issue in make-distribution.sh
dyross Jan 26, 2015
0497ea5
SPARK-960 [CORE] [TEST] JobCancellationSuite "two jobs sharing the sa…
srowen Jan 26, 2015
661e0fc
[SPARK-5052] Add common/base classes to fix guava methods signatures.
elmer-garduno Jan 27, 2015
f2ba5c6
[SPARK-5119] java.lang.ArrayIndexOutOfBoundsException on trying to tr…
Lewuathe Jan 27, 2015
d6894b1
[SPARK-3726] [MLlib] Allow sampling_rate not equal to 1.0 in RandomFo…
MechCoder Jan 27, 2015
7b0ed79
[SPARK-5419][Mllib] Fix the logic in Vectors.sqdist
viirya Jan 27, 2015
9142674
[SPARK-5321] Support for transposing local matrices
brkyvz Jan 27, 2015
ff356e2
SPARK-5308 [BUILD] MD5 / SHA1 hash format doesn't match standard Mave…
srowen Jan 27, 2015
fdaad4e
[MLlib] fix python example of ALS in guide
Jan 27, 2015
b1b35ca
SPARK-5199. FS read metrics should support CombineFileSplits and trac…
sryza Jan 27, 2015
119f45d
[SPARK-5097][SQL] DataFrame
rxin Jan 28, 2015
d743732
[SPARK-5097][SQL] Test cases for DataFrame expressions.
rxin Jan 28, 2015
37a5e27
[SPARK-4809] Rework Guava library shading.
Jan 28, 2015
661d3f9
[SPARK-5415] bump sbt to version to 0.13.7
ryan-williams Jan 28, 2015
622ff09
MAINTENANCE: Automated closing of pull requests.
pwendell Jan 28, 2015
eeb53bf
[SPARK-3974][MLlib] Distributed Block Matrix Abstractions
brkyvz Jan 28, 2015
0b35fcd
[SPARK-5291][CORE] Add timestamp and reason why an executor is remove…
sarutak Jan 28, 2015
453d799
[SPARK-5361]Multiple Java RDD <-> Python RDD conversions not working …
Jan 28, 2015
c8e934e
[SPARK-5447][SQL] Replaced reference to SchemaRDD with DataFrame.
rxin Jan 28, 2015
406f6d3
SPARK-5458. Refer to aggregateByKey instead of combineByKey in docs
sryza Jan 28, 2015
e902dc4
[SPARK-5188][BUILD] make-distribution.sh should support curl, not onl…
sarutak Jan 28, 2015
9b18009
SPARK-1934 [CORE] "this" reference escape to "selectorThread" during …
srowen Jan 28, 2015
456c11f
[SPARK-5440][pyspark] Add toLocalIterator to pyspark rdd
mnazbro Jan 28, 2015
81f8f34
[SPARK-4955]With executor dynamic scaling enabled,executor shoude be …
lianhuiwang Jan 28, 2015
84b6ecd
[SPARK-5437] Fix DriverSuite and SparkSubmitSuite timeout issues
Jan 28, 2015
d44ee43
[SPARK-5434] [EC2] Preserve spaces in EC2 path
nchammas Jan 28, 2015
a731314
[SPARK-5417] Remove redundant executor-id set() call
ryan-williams Jan 28, 2015
3bead67
[SPARK-4387][PySpark] Refactoring python profiling code to make it ex…
Jan 28, 2015
e023112
[SPARK-5441][pyspark] Make SerDeUtil PairRDD to Python conversions mo…
mnazbro Jan 28, 2015
e80dc1c
[SPARK-4586][MLLIB] Python API for ML pipeline and parameters
mengxr Jan 29, 2015
4ee79c7
[SPARK-5430] move treeReduce and treeAggregate from mllib to core
mengxr Jan 29, 2015
5b9760d
[SPARK-5445][SQL] Made DataFrame dsl usable in Java
rxin Jan 29, 2015
a63be1a
[SPARK-3977] Conversion methods for BlockMatrix to other Distributed …
brkyvz Jan 29, 2015
5ad78f6
[SQL] Various DataFrame DSL update.
rxin Jan 29, 2015
a3dc618
[SPARK-5477] refactor stat.py
mengxr Jan 29, 2015
f9e5694
[SPARK-5466] Add explicit guava dependencies where needed.
Jan 29, 2015
7156322
[SPARK-5445][SQL] Consolidate Java and Scala DSL static methods.
rxin Jan 29, 2015
bce0ba1
[SPARK-5429][SQL] Use javaXML plan serialization for Hive golden answ…
viirya Jan 29, 2015
940f375
[SPARK-5309][SQL] Add support for dictionaries in PrimitiveConverter …
MickDavies Jan 29, 2015
de221ea
[SPARK-4786][SQL]: Parquet filter pushdown for castable types
Jan 29, 2015
fbaf9e0
[SPARK-5367][SQL] Support star expression in udf
scwf Jan 29, 2015
c1b3eeb
[SPARK-5373][SQL] Literal in agg grouping expressions leads to incorr…
scwf Jan 29, 2015
c00d517
[SPARK-4296][SQL] Trims aliases when resolving and checking aggregate…
yhuai Jan 29, 2015
0bb15f2
[SPARK-5464] Fix help() for Python DataFrame instances
JoshRosen Jan 30, 2015
f240fe3
[WIP] [SPARK-3996]: Shade Jetty in Spark deliverables
pwendell Jan 30, 2015
5338772
remove 'return'
Jan 30, 2015
d2071e8
Revert "[WIP] [SPARK-3996]: Shade Jetty in Spark deliverables"
pwendell Jan 30, 2015
ce9c43b
[SQL] DataFrame API improvements
rxin Jan 30, 2015
5c746ee
[SPARK-5395] [PySpark] fix python process leak while coalesce()
Jan 30, 2015
22271f9
[SPARK-5462] [SQL] Use analyzed query plan in DataFrame.apply()
JoshRosen Jan 30, 2015
80def9d
[SQL] Support df("*") to select all columns in a data frame.
rxin Jan 30, 2015
dd4d84c
[SPARK-5322] Added transpose functionality to BlockMatrix
brkyvz Jan 30, 2015
bc1fc9b
[SPARK-5094][MLlib] Add Python API for Gradient Boosted Trees
Jan 30, 2015
6f21dce
[SPARK-5457][SQL] Add missing DSL for ApproxCountDistinct.
ueshin Jan 30, 2015
254eaa4
SPARK-5393. Flood of util.RackResolver log messages after SPARK-1714
sryza Jan 30, 2015
54d9575
[MLLIB] SPARK-4846: throw a RuntimeException and give users hints to …
jinntrance Jan 30, 2015
0a95085
[SPARK-5496][MLLIB] Allow both classification and Classification in A…
mengxr Jan 30, 2015
6ee8338
[SPARK-5486] Added validate method to BlockMatrix
brkyvz Jan 30, 2015
f377431
[SPARK-4259][MLlib]: Add Power Iteration Clustering Algorithm with Ga…
sboeschhuawei Jan 30, 2015
9869773
SPARK-5400 [MLlib] Changed name of GaussianMixtureEM to GaussianMixture
tgaloppo Jan 30, 2015
e643de4
[SPARK-5504] [sql] convertToCatalyst should support nested arrays
jkbradley Jan 30, 2015
740a568
[SPARK-5307] SerializationDebugger
rxin Jan 31, 2015
f54c9f6
[SQL] remove redundant field "childOutput" from execution.Aggregate, …
Jan 31, 2015
6364083
[SPARK-5307] Add a config option for SerializationDebugger.
rxin Jan 31, 2015
34250a6
[MLLIB][SPARK-3278] Monotone (Isotonic) regression using parallel poo…
zapletal-martin Jan 31, 2015
ef8974b
[SPARK-3975] Added support for BlockMatrix addition and multiplication
brkyvz Jan 31, 2015
c84d5a1
SPARK-3359 [CORE] [DOCS] `sbt/sbt unidoc` doesn't work with Java 8
srowen Jan 31, 2015
80bd715
[SPARK-5422] Add support for sending Graphite metrics via UDP
ryan-williams Feb 1, 2015
bdb0680
[SPARK-5207] [MLLIB] StandardScalerModel mean and variance re-use
ogeagla Feb 1, 2015
4a17122
[SPARK-5424][MLLIB] make the new ALS impl take generic ID types
mengxr Feb 1, 2015
883bc88
[SPARK-4859][Core][Streaming] Refactor LiveListenerBus and StreamingL…
zsxwing Feb 2, 2015
ef89b82
[Minor][SQL] Little refactor DataFrame related codes
viirya Feb 2, 2015
c80194b
[SPARK-5155] Build fails with spark-ganglia-lgpl profile
sarutak Feb 2, 2015
1ca0a10
[SPARK-5176] The thrift server does not support cluster mode
tpanningnextcen Feb 2, 2015
7712ed5
[SPARK-1825] Make Windows Spark client work fine with Linux YARN cluster
tsudukim Feb 2, 2015
1b56f1d
[SPARK-5196][SQL] Support `comment` in Create Table Field DDL
OopsOutOfMemory Feb 2, 2015
8cf4a1f
[SPARK-5262] [SPARK-5244] [SQL] add coalesce in SQLParser and widen t…
adrian-wang Feb 2, 2015
ec10032
[SPARK-5465] [SQL] Fixes filter push-down for Parquet data source
liancheng Feb 2, 2015
d85cd4e
[Spark-5406][MLlib] LocalLAPACK mode in RowMatrix.computeSVD should h…
hhbyyh Feb 2, 2015
859f724
[SPARK-4001][MLlib] adding parallel FP-Growth algorithm for frequent …
jackylk Feb 2, 2015
a15f6e3
[SPARK-3996]: Shade Jetty in Spark deliverables
pwendell Feb 2, 2015
9f0a6e1
[SPARK-5353] Log failures in REPL class loading
gzm0 Feb 2, 2015
63dfe21
[SPARK-5478][UI][Minor] Add missing right parentheses
jerryshao Feb 2, 2015
6f34131
SPARK-5492. Thread statistics can break with older Hadoop versions
sryza Feb 2, 2015
c081b21
[MLLIB] SPARK-5491 (ex SPARK-1473): Chi-square feature selection
avulanov Feb 2, 2015
b2047b5
SPARK-4585. Spark dynamic executor allocation should use minExecutors…
sryza Feb 2, 2015
f5e6375
[SPARK-5173]support python application running on yarn cluster mode
lianhuiwang Feb 2, 2015
3f941b6
[Docs] Fix Building Spark link text
nchammas Feb 2, 2015
62a93a1
[SPARK-5530] Add executor container to executorIdToContainer
XuTingjun Feb 2, 2015
683e938
[SPARK-5212][SQL] Add support of schema-less, custom field delimiter …
viirya Feb 2, 2015
e908322
[SPARK-4631][streaming][FIX] Wait for a receiver to start before publ…
dragos Feb 2, 2015
2321dd1
[HOTFIX] Add jetty references to build for YARN module.
pwendell Feb 2, 2015
52f5754
Make sure only owner can read / write to directories created for the …
Jan 21, 2015
bff65b5
Disabling Utils.chmod700 for Windows
MartinWeindel Feb 2, 2015
5a55261
SPARK-5425: Use synchronised methods in system properties to create S…
jacek-lewandowski Feb 2, 2015
842d000
[SPARK-5461] [graphx] Add isCheckpointed, getCheckpointedFiles method…
jkbradley Feb 2, 2015
8309349
SPARK-5500. Document that feeding hadoopFile into a shuffle operation…
sryza Feb 2, 2015
1646f89
[SPARK-4508] [SQL] build native date type to conform behavior to Hive
adrian-wang Feb 2, 2015
46d50f1
[SPARK-5513][MLLIB] Add nonnegative option to ml's ALS
mengxr Feb 2, 2015
b1aa8fe
[SPARK-2309][MLlib] Multinomial Logistic Regression
Feb 2, 2015
dca6faa
[SPARK-5195][sql]Update HiveMetastoreCatalog.scala(override the Metas…
seayi Feb 3, 2015
8aa3cff
[SPARK-5514] DataFrame.collect should call executeCollect
rxin Feb 3, 2015
f133dec
[SPARK-5534] [graphx] Graph getStorageLevel fix
jkbradley Feb 3, 2015
ef65cf0
[SPARK-5540] hide ALS.solveLeastSquares
mengxr Feb 3, 2015
cfea300
Spark 3883: SSL support for HttpServer and Akka
jacek-lewandowski Feb 3, 2015
eccb9fb
Revert "[SPARK-4508] [SQL] build native date type to conform behavior…
pwendell Feb 3, 2015
554403f
[SQL] Improve DataFrame API error reporting
rxin Feb 3, 2015
0561c45
[SPARK-5154] [PySpark] [Streaming] Kafka streaming support in Python
Feb 3, 2015
1bcd465
[SPARK-5512][Mllib] Run the PIC algorithm with initial vector suggect…
viirya Feb 3, 2015
8f471a6
[SPARK-5472][SQL] A JDBC data source for Spark SQL.
tmyklebu Feb 3, 2015
cb39f12
[SPARK-5543][WebUI] Remove unused import JsonUtil from from JsonProtocol
nemccarthy Feb 3, 2015
0ef38f5
SPARK-5542: Decouple publishing, packaging, and tagging in release sc…
pwendell Feb 3, 2015
7930d2b
SPARK-3996: Add jetty servlet and continuations.
pwendell Feb 3, 2015
60f67e7
[Doc] Minor: Fixes several formatting issues
liancheng Feb 3, 2015
c306555
[SPARK-5219][Core] Add locks to avoid scheduling race conditions
zsxwing Feb 3, 2015
eb0da6c
[SPARK-4979][MLLIB] Streaming logisitic regression
freeman-lab Feb 3, 2015
c31c36c
[SPARK-3778] newAPIHadoopRDD doesn't properly pass credentials for se…
tgravescs Feb 3, 2015
50a1a87
[SPARK-5012][MLLib][PySpark]Python API for Gaussian Mixture Model
FlytxtRnD Feb 3, 2015
13531dd
[SPARK-5501][SPARK-5420][SQL] Write support for the data source API
yhuai Feb 3, 2015
b8ebebe
[SPARK-5414] Add SparkFirehoseListener class for consuming all SparkL…
JoshRosen Feb 3, 2015
0cc7b88
[SPARK-5536] replace old ALS implementation by the new one
mengxr Feb 3, 2015
980764f
[SPARK-1405] [mllib] Latent Dirichlet Allocation (LDA) using EM
jkbradley Feb 3, 2015
659329f
[minor] update streaming linear algorithms
mengxr Feb 3, 2015
37df330
[SQL][DataFrame] Remove DataFrameApi, ExpressionApi, and GroupedDataF…
rxin Feb 3, 2015
523a935
[SPARK-5551][SQL] Create type alias for SchemaRDD for source backward…
rxin Feb 3, 2015
bebf4c4
[SPARK-5549] Define TaskContext interface in Scala.
rxin Feb 3, 2015
f7948f3
Minor: Fix TaskContext deprecated annotations.
rxin Feb 3, 2015
4204a12
[SQL] DataFrame API update
rxin Feb 3, 2015
0c20ce6
[SPARK-4987] [SQL] parquet timestamp type support
adrian-wang Feb 3, 2015
ca7a6cd
[SPARK-5550] [SQL] Support the case insensitive for UDF
chenghao-intel Feb 3, 2015
5adbb39
[SPARK-5383][SQL] Support alias for udtfs
scwf Feb 3, 2015
db821ed
[SPARK-4508] [SQL] build native date type to conform behavior to Hive
adrian-wang Feb 3, 2015
681f9df
[SPARK-5153][Streaming][Test] Increased timeout to deal with flaky Ka…
tdas Feb 3, 2015
1e8b539
[STREAMING] SPARK-4986 Wait for receivers to deregister and receiver …
Feb 3, 2015
068c0e2
[SPARK-5554] [SQL] [PySpark] add more tests for DataFrame Python API
Feb 4, 2015
e380d2d
[SPARK-5520][MLlib] Make FP-Growth implementation take generic item t…
jackylk Feb 4, 2015
1077f2e
[SPARK-5578][SQL][DataFrame] Provide a convenient way for Scala users…
rxin Feb 4, 2015
d37978d
[SPARK-4795][Core] Redesign the "primitive type => Writable" implicit…
zsxwing Feb 4, 2015
eb15631
[FIX][MLLIB] fix seed handling in Python GMM
mengxr Feb 4, 2015
40c4cb2
[SPARK-5579][SQL][DataFrame] Support for project/filter using SQL exp…
rxin Feb 4, 2015
242b4f0
[SPARK-4969][STREAMING][PYTHON] Add binaryRecords to streaming
freeman-lab Feb 4, 2015
83de71c
[SPARK-4939] revive offers periodically in LocalBackend
Feb 4, 2015
6aed719
[SPARK-5341] Use maven coordinates as dependencies in spark-shell and…
brkyvz Feb 4, 2015
4cf4cba
[SPARK-5379][Streaming] Add awaitTerminationOrTimeout
zsxwing Feb 4, 2015
a74cbbf
[Minor] Fix incorrect warning log
viirya Feb 4, 2015
5aa0f21
[SPARK-5574] use given name prefix in dir
squito Feb 4, 2015
38a416f
[SPARK-5585] Flaky test in MLlib python
Feb 4, 2015
ac0b2b7
[SPARK-5588] [SQL] support select/filter by SQL expression
Feb 4, 2015
b0c0021
[SPARK-4964] [Streaming] Exactly-once semantics for Kafka
koeninger Feb 4, 2015
f0500f9
[SPARK-4707][STREAMING] Reliable Kafka Receiver can lose data if the …
harishreedharan Feb 4, 2015
0a89b15
[SPARK-4939] move to next locality when no pending tasks
Feb 4, 2015
a9f0db1
[SPARK-5591][SQL] Fix NoSuchObjectException for CTAS
scwf Feb 4, 2015
b90dd39
[SPARK-5587][SQL] Support change database owner
scwf Feb 4, 2015
424cb69
[SPARK-5426][SQL] Add SparkSQL Java API helper methods.
kul Feb 4, 2015
417d111
[SPARK-5367][SQL] Support star expression in udfs
scwf Feb 4, 2015
b73d5ff
[SQL][Hiveconsole] Bring hive console code up to date and update READ…
OopsOutOfMemory Feb 4, 2015
0d81645
[SQL] Correct the default size of TimestampType and expose NumericType
yhuai Feb 4, 2015
548c9c2
[SQL] Use HiveContext's sessionState in HiveMetastoreCatalog.hiveDefa…
yhuai Feb 4, 2015
e0490e2
[SPARK-5118][SQL] Fix: create table test stored as parquet as select ..
guowei2 Feb 4, 2015
dc101b0
[SPARK-5577] Python udf for DataFrame
Feb 4, 2015
9a7ce70
[SPARK-5411] Allow SparkListeners to be specified in SparkConf and lo…
JoshRosen Feb 5, 2015
1fbd124
[SPARK-5605][SQL][DF] Allow using String to specify colum name in DSL…
rxin Feb 5, 2015
dba98bf
[SPARK-4520] [SQL] This pr fixes the ArrayIndexOutOfBoundsException a…
Feb 5, 2015
6b4c7f0
[SQL][DataFrame] Minor cleanup.
rxin Feb 5, 2015
206f9bc
[SPARK-5538][SQL] Fix flaky CachedTableSuite
rxin Feb 5, 2015
84acd08
[SPARK-5602][SQL] Better support for creating DataFrame from local da…
rxin Feb 5, 2015
c23ac03
SPARK-5607: Update to Kryo 2.24.0 to avoid including objenesis 1.2.
pwendell Feb 5, 2015
975bcef
[SPARK-5596] [mllib] ML model import/export for GLMs, NaiveBayes
jkbradley Feb 5, 2015
db34690
[SPARK-5599] Check MLlib public APIs for 1.3
mengxr Feb 5, 2015
9d3a75e
[SPARK-5606][SQL] Support plus sign in HiveContext
watermen Feb 5, 2015
7d789e1
[SPARK-5612][SQL] Move DataFrame implicit functions into SQLContext.i…
rxin Feb 5, 2015
c3ba4d4
[MLlib] Minor: UDF style update.
rxin Feb 5, 2015
6580929
[HOTFIX] MLlib build break.
rxin Feb 5, 2015
The diff you're trying to view is too large. We only load the first 3000 changed files.
2 changes: 2 additions & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
*.bat text eol=crlf
*.cmd text eol=crlf
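As an aside on the two rules above: `text eol=crlf` tells git to always check these files out with Windows line endings, which matters for `.bat`/`.cmd` scripts. A minimal sketch of verifying the effect with `git check-attr` (assumes git is installed; the temporary repo and the file name `run-example.bat` are hypothetical, not taken from this PR):

```shell
# Create a throwaway repo with the same .gitattributes rules and ask git
# which attributes a batch file would receive.
tmp=$(mktemp -d) && cd "$tmp"
git init -q .
printf '*.bat text eol=crlf\n*.cmd text eol=crlf\n' > .gitattributes
# check-attr reports one "path: attribute: value" line per attribute:
#   run-example.bat: text: set
#   run-example.bat: eol: crlf
git check-attr text eol -- run-example.bat
```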
13 changes: 10 additions & 3 deletions .gitignore
@@ -5,18 +5,22 @@
*.ipr
*.iml
*.iws
*.pyc
.idea/
.idea_modules/
sbt/*.jar
build/*.jar
.settings
.cache
cache
.generated-mima*
/build/
work/
out/
.DS_Store
third_party/libmesos.so
third_party/libmesos.dylib
build/apache-maven*
build/zinc*
build/scala*
conf/java-opts
conf/*.sh
conf/*.cmd
@@ -49,9 +53,12 @@ dependency-reduced-pom.xml
checkpoint
derby.log
dist/
spark-*-bin.tar.gz
dev/create-release/*txt
dev/create-release/*final
spark-*-bin-*.tgz
unit-tests.log
/lib/
ec2/lib/
rat-results.txt
scalastyle.txt
scalastyle-output.xml
4 changes: 4 additions & 0 deletions .rat-excludes
@@ -1,5 +1,6 @@
target
.gitignore
.gitattributes
.project
.classpath
.mima-excludes
@@ -43,11 +44,13 @@ SparkImports.scala
SparkJLineCompletion.scala
SparkJLineReader.scala
SparkMemberHandlers.scala
SparkReplReporter.scala
sbt
sbt-launch-lib.bash
plugins.sbt
work
.*\.q
.*\.qv
golden
test.out/*
.*iml
@@ -61,3 +64,4 @@ dist/*
logs
.*scalastyle-output.xml
.*dependency-reduced-pom.xml
known_translations
36 changes: 22 additions & 14 deletions LICENSE
@@ -646,7 +646,8 @@ THE SOFTWARE.

========================================================================
For Scala Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala):
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala:
========================================================================

Copyright (c) 2002-2013 EPFL
@@ -712,18 +713,6 @@ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

========================================================================
For colt:
========================================================================

Copyright (c) 1999 CERN - European Organization for Nuclear Research.
Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation. CERN makes no representations about the suitability of this software for any purpose. It is provided "as is" without expressed or implied warranty.

Packages hep.aida.*

Written by Pavel Binko, Dino Ferrero Merlino, Wolfgang Hoschek, Tony Johnson, Andreas Pfeiffer, and others. Check the FreeHEP home page for more info. Permission to use and/or redistribute this work is granted under the terms of the LGPL License, with the exception that any usage related to military applications is expressly forbidden. The software and documentation made available under the terms of this license are provided with no warranty.


========================================================================
For SnapTree:
========================================================================
@@ -766,7 +755,7 @@ SUCH DAMAGE.


========================================================================
For Timsort (core/src/main/java/org/apache/spark/util/collection/Sorter.java):
For Timsort (core/src/main/java/org/apache/spark/util/collection/TimSort.java):
========================================================================
Copyright (C) 2008 The Android Open Source Project

@@ -783,6 +772,25 @@ See the License for the specific language governing permissions and
limitations under the License.


========================================================================
For LimitedInputStream
(network/common/src/main/java/org/apache/spark/network/util/LimitedInputStream.java):
========================================================================
Copyright (C) 2007 The Guava Authors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


========================================================================
BSD-style licenses
========================================================================
7 changes: 4 additions & 3 deletions README.md
@@ -13,7 +13,8 @@ and Spark Streaming for stream processing.
## Online Documentation

You can find the latest Spark documentation, including a programming
guide, on the [project web page](http://spark.apache.org/documentation.html).
guide, on the [project web page](http://spark.apache.org/documentation.html)
and [project wiki](https://cwiki.apache.org/confluence/display/SPARK).
This README file only contains basic setup instructions.

## Building Spark
@@ -25,7 +26,7 @@ To build Spark and its example programs, run:

(You do not need to do this if you downloaded a pre-built package.)
More detailed documentation is available from the project site, at
["Building Spark with Maven"](http://spark.apache.org/docs/latest/building-with-maven.html).
["Building Spark"](http://spark.apache.org/docs/latest/building-spark.html).

## Interactive Scala Shell

@@ -84,7 +85,7 @@ storage systems. Because the protocols have changed in different versions of
Hadoop, you must build Spark against the same version that your cluster runs.

Please refer to the build documentation at
["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version)
["Specifying the Hadoop Version"](http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version)
for detailed guidance on building for a particular distribution of Hadoop, including
building for particular Hive and Hive Thriftserver distributions. See also
["Third Party Hadoop Distributions"](http://spark.apache.org/docs/latest/hadoop-third-party-distributions.html)
65 changes: 30 additions & 35 deletions assembly/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent</artifactId>
<version>1.2.0-SNAPSHOT</version>
<version>1.3.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

@@ -43,12 +43,6 @@
</properties>

<dependencies>
<!-- Promote Guava to compile scope in this module so it's included while shading. -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
@@ -66,22 +60,22 @@
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-repl_${scala.binary.version}</artifactId>
<artifactId>spark-streaming_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_${scala.binary.version}</artifactId>
<artifactId>spark-graphx_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-graphx_${scala.binary.version}</artifactId>
<artifactId>spark-sql_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.binary.version}</artifactId>
<artifactId>spark-repl_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
@@ -133,20 +127,6 @@
<goal>shade</goal>
</goals>
<configuration>
<relocations>
<relocation>
<pattern>com.google</pattern>
<shadedPattern>org.spark-project.guava</shadedPattern>
<includes>
<include>com.google.common.**</include>
</includes>
<excludes>
<exclude>com/google/common/base/Absent*</exclude>
<exclude>com/google/common/base/Optional*</exclude>
<exclude>com/google/common/base/Present*</exclude>
</excludes>
</relocation>
</relocations>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
@@ -169,16 +149,6 @@
</build>

<profiles>
-    <profile>
-      <id>yarn-alpha</id>
-      <dependencies>
-        <dependency>
-          <groupId>org.apache.spark</groupId>
-          <artifactId>spark-yarn-alpha_${scala.binary.version}</artifactId>
-          <version>${project.version}</version>
-        </dependency>
-      </dependencies>
-    </profile>
<profile>
<id>yarn</id>
<dependencies>
@@ -197,6 +167,11 @@
<artifactId>spark-hive_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
+      </dependencies>
+    </profile>
+    <profile>
+      <id>hive-thriftserver</id>
+      <dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive-thriftserver_${scala.binary.version}</artifactId>
@@ -359,5 +334,25 @@
</dependency>
</dependencies>
</profile>

+    <!-- Profiles that disable inclusion of certain dependencies. -->
+    <profile>
+      <id>hadoop-provided</id>
+      <properties>
+        <hadoop.deps.scope>provided</hadoop.deps.scope>
+      </properties>
+    </profile>
+    <profile>
+      <id>hive-provided</id>
+      <properties>
+        <hive.deps.scope>provided</hive.deps.scope>
+      </properties>
+    </profile>
+    <profile>
+      <id>parquet-provided</id>
+      <properties>
+        <parquet.deps.scope>provided</parquet.deps.scope>
+      </properties>
+    </profile>
</profiles>
</project>
17 changes: 1 addition & 16 deletions bagel/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent</artifactId>
-    <version>1.2.0-SNAPSHOT</version>
+    <version>1.3.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

@@ -40,15 +40,6 @@
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
-    <dependency>
-      <groupId>org.eclipse.jetty</groupId>
-      <artifactId>jetty-server</artifactId>
-    </dependency>
-    <dependency>
-      <groupId>org.scalatest</groupId>
-      <artifactId>scalatest_${scala.binary.version}</artifactId>
-      <scope>test</scope>
-    </dependency>
<dependency>
<groupId>org.scalacheck</groupId>
<artifactId>scalacheck_${scala.binary.version}</artifactId>
@@ -58,11 +49,5 @@
<build>
<outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
<testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
-    <plugins>
-      <plugin>
-        <groupId>org.scalatest</groupId>
-        <artifactId>scalatest-maven-plugin</artifactId>
-      </plugin>
-    </plugins>
</build>
</project>
4 changes: 2 additions & 2 deletions bagel/src/test/resources/log4j.properties
@@ -15,10 +15,10 @@
# limitations under the License.
#

-# Set everything to be logged to the file bagel/target/unit-tests.log
+# Set everything to be logged to the file target/unit-tests.log
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
-log4j.appender.file.append=false
+log4j.appender.file.append=true
log4j.appender.file.file=target/unit-tests.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss.SSS} %t %p %c{1}: %m%n
21 changes: 21 additions & 0 deletions bin/beeline.cmd
@@ -0,0 +1,21 @@
+@echo off
+
+rem
+rem Licensed to the Apache Software Foundation (ASF) under one or more
+rem contributor license agreements. See the NOTICE file distributed with
+rem this work for additional information regarding copyright ownership.
+rem The ASF licenses this file to You under the Apache License, Version 2.0
+rem (the "License"); you may not use this file except in compliance with
+rem the License. You may obtain a copy of the License at
+rem
+rem http://www.apache.org/licenses/LICENSE-2.0
+rem
+rem Unless required by applicable law or agreed to in writing, software
+rem distributed under the License is distributed on an "AS IS" BASIS,
+rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+rem See the License for the specific language governing permissions and
+rem limitations under the License.
+rem
+
+set SPARK_HOME=%~dp0..
+cmd /V /E /C %SPARK_HOME%\bin\spark-class.cmd org.apache.hive.beeline.BeeLine %*