Update #5

Closed
wants to merge 399 commits into from
d7e80c2
[SPARK-2790] [PySpark] fix zip with serializers which have different …
davies Aug 19, 2014
825d4fe
[SPARK-3136][MLLIB] Create Java-friendly methods in RandomRDDs
mengxr Aug 19, 2014
8b9dc99
[SPARK-2468] Netty based block server / client module
rxin Aug 20, 2014
1870dba
[MLLIB] minor update to word2vec
mengxr Aug 20, 2014
c7252b0
[SPARK-3112][MLLIB] Add documentation and example for StreamingLR
freeman-lab Aug 20, 2014
0e3ab94
[SQL] add note of use synchronizedMap in SQLConf
scwf Aug 20, 2014
068b6fe
[SPARK-3130][MLLIB] detect negative values in naive Bayes
mengxr Aug 20, 2014
fce5c0f
[HOTFIX][Streaming][MLlib] use temp folder for checkpoint
mengxr Aug 20, 2014
8adfbc2
[SPARK-3119] Re-implementation of TorrentBroadcast.
rxin Aug 20, 2014
0a984aa
[SPARK-3142][MLLIB] output shuffle data directly in Word2Vec
mengxr Aug 20, 2014
ebcb94f
[SPARK-2974] [SPARK-2975] Fix two bugs related to spark.local.dirs
JoshRosen Aug 20, 2014
8a74e4b
[DOCS] Fixed wrong links
giwa Aug 20, 2014
0a7ef63
[SPARK-3141] [PySpark] fix sortByKey() with take()
davies Aug 20, 2014
8c5a222
[SPARK-3054][STREAMING] Add unit tests for Spark Sink.
harishreedharan Aug 20, 2014
f2f26c2
SPARK-3092 [SQL]: Always include the thriftserver when -Phive is enab…
pwendell Aug 20, 2014
ceb1983
BUILD: Bump Hadoop versions in the release build.
pwendell Aug 20, 2014
cf46e72
[SPARK-3126][SPARK-3127][SQL] Fixed HiveThriftServer2Suite
liancheng Aug 20, 2014
0ea46ac
[SPARK-3062] [SPARK-2970] [SQL] spark-sql script ends with IOExceptio…
sarutak Aug 20, 2014
c1ba4cd
[SPARK-3149] Connection establishment information is not enough.
sarutak Aug 20, 2014
b3ec51b
[SPARK-2849] Handle driver configs separately in client mode
andrewor14 Aug 20, 2014
fb60bec
[SPARK-2298] Encode stage attempt in SparkListener & UI.
rxin Aug 20, 2014
a2e658d
[SPARK-2967][SQL] Fix sort based shuffle for spark sql.
marmbrus Aug 20, 2014
a1e8b1b
SPARK_LOGFILE and SPARK_ROOT_LOGGER no longer need in spark-daemon.sh
scwf Aug 20, 2014
d9e9414
[SPARK-2846][SQL] Add configureInputJobPropertiesForStorageHandler to…
alexoss68 Aug 20, 2014
c9f7439
[SPARK-2848] Shade Guava in uber-jars.
Aug 20, 2014
ba3c730
[SPARK-3140] Clarify confusing PySpark exception message
andrewor14 Aug 21, 2014
e157187
[SPARK-3143][MLLIB] add tf-idf user guide
mengxr Aug 21, 2014
e0f9462
[SPARK-2843][MLLIB] add a section about regularization parameter in ALS
mengxr Aug 21, 2014
050f8d0
[SPARK-2840] [mllib] DecisionTree doc update (Java, Python examples)
jkbradley Aug 21, 2014
220c2d7
[SPARK-2742][yarn] delete useless variables
XuTingjun Aug 22, 2014
a5219db
Link to Contributing to Spark wiki page on README.md.
rxin Aug 23, 2014
3004074
[SPARK-3169] Removed dependency on spark streaming test from spark fl…
tdas Aug 23, 2014
323cd92
[SPARK-2963] REGRESSION - The description about how to build for usin…
sarutak Aug 23, 2014
f3d65cd
[SPARK-3068]remove MaxPermSize option for jvm 1.8
adrian-wang Aug 23, 2014
76bb044
[Minor] fix typo
viirya Aug 23, 2014
2fb1c72
[SQL] Make functionRegistry in HiveContext transient.
yhuai Aug 23, 2014
7e191fe
[SPARK-2554][SQL] CountDistinct partial aggregation and object alloca…
marmbrus Aug 23, 2014
3519b5e
[SPARK-2967][SQL] Follow-up: Also copy hash expressions in sort base…
marmbrus Aug 23, 2014
db436e3
[SPARK-2871] [PySpark] add `key` argument for max(), min() and top(n)
davies Aug 24, 2014
8df4dad
[SPARK-2871] [PySpark] add approx API for RDD
davies Aug 24, 2014
8861cdf
Clean unused code in SortShuffleWriter
colorant Aug 24, 2014
ded6796
[SPARK-3192] Some scripts have 2 space indentation but other scripts …
sarutak Aug 24, 2014
572952a
[SPARK-2841][MLlib] Documentation for feature transformations
Aug 25, 2014
b1b2030
[MLlib][SPARK-2997] Update SVD documentation to reflect roughly square
rezazadeh Aug 25, 2014
fb0db77
[SPARK-2871] [PySpark] add zipWithIndex() and zipWithUniqueId()
davies Aug 25, 2014
220f413
[SPARK-2495][MLLIB] make KMeans constructor public
mengxr Aug 25, 2014
cd30db5
SPARK-2798 [BUILD] Correct several small errors in Flume module pom.x…
srowen Aug 25, 2014
cc40a70
SPARK-3180 - Better control of security groups
Aug 25, 2014
fd8ace2
[FIX] fix error message in sendMessageReliably
mengxr Aug 25, 2014
805fec8
Fixed a typo in docs/running-on-mesos.md
liancheng Aug 25, 2014
d299e2b
[SPARK-3204][SQL] MaxOf would be foldable if both left and right are …
ueshin Aug 25, 2014
cae9414
[SPARK-2929][SQL] Refactored Thrift server and CLI suites
liancheng Aug 25, 2014
156eb39
[SPARK-3058] [SQL] Support EXTENDED for EXPLAIN
chenghao-intel Aug 26, 2014
507a1b5
[SQL] logWarning should be logInfo in getResultSetSchema
scwf Aug 26, 2014
4243bb6
[SPARK-3011][SQL] _temporary directory should be filtered out by sqlC…
josephsu Aug 26, 2014
9f04db1
SPARK-2481: The environment variables SPARK_HISTORY_OPTS is covered i…
witgo Aug 26, 2014
62f5009
[SPARK-2976] Replace tabs with spaces
sarutak Aug 26, 2014
52fbdc2
[Spark-3222] [SQL] Cross join support in HiveQL
adrian-wang Aug 26, 2014
b21ae5b
[SPARK-2886] Use more specific actor system name than "spark"
andrewor14 Aug 26, 2014
8856c3d
[SPARK-3131][SQL] Allow user to set parquet compression codec for wri…
chutium Aug 26, 2014
3cedc4f
[SPARK-2871] [PySpark] add histgram() API
davies Aug 26, 2014
98c2bb0
[SPARK-2969][SQL] Make ScalaReflection be able to handle ArrayType.co…
ueshin Aug 26, 2014
6b5584e
[SPARK-3063][SQL] ExistingRdd should convert Map to catalyst Map.
ueshin Aug 26, 2014
adbd5c1
[SPARK-3226][MLLIB] doc update for native libraries
mengxr Aug 26, 2014
1208f72
[SPARK-2839][MLlib] Stats Toolkit documentation updated
brkyvz Aug 26, 2014
c4787a3
[SPARK-3194][SQL] Add AttributeSet to fix bugs with invalid compariso…
marmbrus Aug 26, 2014
f1e71d4
[SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey()
davies Aug 26, 2014
2ffd329
[SPARK-3225]Typo in script
WangTaoTheTonic Aug 27, 2014
faeb9c0
[SPARK-2964] [SQL] Remove duplicated code from spark-sql and start-th…
liancheng Aug 27, 2014
73b3089
[Docs] Run tests like in contributing guide
nchammas Aug 27, 2014
727cb25
[SPARK-3036][SPARK-3037][SQL] Add MapType/ArrayType containing null v…
ueshin Aug 27, 2014
be043e3
[SPARK-3240] Adding known issue for MESOS-1688
MartinWeindel Aug 27, 2014
d834547
Fix unclosed HTML tag in Yarn docs.
JoshRosen Aug 27, 2014
ee91eb8
Manually close some old pull requests
mateiz Aug 27, 2014
e70aff6
Manually close old pull requests
mateiz Aug 27, 2014
bf71905
[SPARK-3224] FetchFailed reduce stages should only show up once in fa…
rxin Aug 27, 2014
7557c4c
[SPARK-3167] Handle special driver configs in Windows
andrewor14 Aug 27, 2014
9d65f27
HOTFIX: Minor typo in conf template
pwendell Aug 27, 2014
3e2864e
[SPARK-3139] Made ContextCleaner to not block on shuffles
tdas Aug 27, 2014
e1139dd
[SPARK-3237][SQL] Fix parquet filters with UDFs
marmbrus Aug 27, 2014
43dfc84
[SPARK-2830][MLLIB] doc update for 1.1
mengxr Aug 27, 2014
171a41c
[SPARK-3227] [mllib] Added migration guide for v1.0 to v1.1
jkbradley Aug 27, 2014
6f671d0
[SPARK-3154][STREAMING] Make FlumePollingInputDStream shutdown cleaner.
harishreedharan Aug 27, 2014
b92d823
[SPARK-2933] [yarn] Refactor and cleanup Yarn AM code.
Aug 27, 2014
d8298c4
[SPARK-3170][CORE][BUG]:RDD info loss in "StorageTab" and "ExecutorTab"
uncleGen Aug 27, 2014
5ac4093
SPARK-3259 - User data should be given to the master
Aug 27, 2014
3b5eb70
[SPARK-3118][SQL]add "SHOW TBLPROPERTIES tblname;" and "SHOW COLUMNS …
wangxiaojing Aug 27, 2014
4238c17
[SPARK-3197] [SQL] Reduce the Expression tree object creations for ag…
chenghao-intel Aug 27, 2014
191d7cf
[SPARK-3256] Added support for :cp <jar> that was broken in Scala 2.1…
Aug 27, 2014
48f4278
[SPARK-3138][SQL] sqlContext.parquetFile should be able to take a sin…
chutium Aug 27, 2014
4fa2fda
[SPARK-2871] [PySpark] add RDD.lookup(key)
davies Aug 27, 2014
7faf755
Spark-3213 Fixes issue with spark-ec2 not detecting slaves created wi…
vidaha Aug 27, 2014
63a053a
[SPARK-3243] Don't use stale spark-driver.* system properties
andrewor14 Aug 27, 2014
28d41d6
[SPARK-3252][SQL] Add missing condition for test
viirya Aug 27, 2014
cc275f4
[SQL] [SPARK-3236] Reading Parquet tables from Metastore mangles loca…
aarondav Aug 27, 2014
6525350
[SPARK-3065][SQL] Add locale setting to fix results do not match for …
luogankun Aug 27, 2014
7d2a7a9
[SPARK-3235][SQL] Ensure in-memory tables don't always broadcast.
marmbrus Aug 27, 2014
8712653
HOTFIX: Don't build with YARN support for Mapr3
pwendell Aug 27, 2014
64d8ecb
Add line continuation for script to work w/ py2.7.5
mattf Aug 27, 2014
b86277c
[SPARK-3271] delete unused methods in Utils
scwf Aug 28, 2014
f38fab9
SPARK-3265 Allow using custom ipython executable with pyspark
robbles Aug 28, 2014
dafe343
[HOTFIX] Wait for EOF only for the PySpark shell
andrewor14 Aug 28, 2014
024178c
[HOTFIX][SQL] Remove cleaning of UDFs
marmbrus Aug 28, 2014
68f75dc
[SQL] Fixed 2 comment typos in SQLConf
liancheng Aug 28, 2014
76e3ba4
[SPARK-3230][SQL] Fix udfs that return structs
marmbrus Aug 28, 2014
70d8146
[SPARK-3150] Fix NullPointerException in in Spark recovery: Add initi…
tanyatik Aug 28, 2014
6d392b3
[SPARK-2608][Core] Fixed command line option passing issue over Mesos…
liancheng Aug 27, 2014
41dc598
[SPARK-3264] Allow users to set executor Spark home in Mesos
andrewor14 Aug 28, 2014
be53c54
[SPARK-3281] Remove Netty specific code in BlockManager / shuffle
rxin Aug 28, 2014
3901245
[SPARK-3285] [examples] Using values.sum is easier to understand than…
watermen Aug 28, 2014
96df929
[SPARK-3190] Avoid overflow in VertexRDD.count()
ankurdave Aug 28, 2014
92af231
SPARK-3082. yarn.Client.logClusterResourceDetails throws NPE if reque…
sryza Aug 28, 2014
a46b8f2
[SPARK-3277] Fix external spilling with LZ4 assertion error
andrewor14 Aug 29, 2014
3c517a8
[Spark QA] Link to console output on test time out
nchammas Aug 29, 2014
665e71d
[SPARK-1912] Lazily initialize buffers for local shuffle blocks.
rxin Aug 29, 2014
27df6ce
[SPARK-3279] Remove useless field variable in ApplicationMaster
sarutak Aug 29, 2014
e248328
[SPARK-3307] [PySpark] Fix doc string of SparkContext.broadcast()
davies Aug 29, 2014
53aa831
[Docs] SQL doc formatting and typo fixes
nchammas Aug 29, 2014
2f1519d
SPARK-2813: [SQL] Implement SQRT() directly in Spark SQL
willb Aug 29, 2014
287c0ac
[SPARK-3234][Build] Fixed environment variables that rely on deprecat…
liancheng Aug 29, 2014
dc4d577
[SPARK-3198] [SQL] Remove the TreeNode.id
chenghao-intel Aug 29, 2014
b1eccfc
[SQL] Turns on in-memory columnar compression in HiveCompatibilitySuite
liancheng Aug 29, 2014
d94a44d
[SPARK-3269][SQL] Decreases initial buffer size for row set to preven…
liancheng Aug 29, 2014
634d04b
[SPARK-3291][SQL]TestcaseName in createQueryTest should not contain ":"
Aug 29, 2014
98ddbe6
[SPARK-3173][SQL] Timestamp support in the parser
byF Aug 29, 2014
1390176
[SPARK-3296][mllib] spark-example should be run-example in head notat…
scwf Aug 30, 2014
32b18dd
[SPARK-3320][SQL] Made batched in-memory column buffer building work …
liancheng Aug 30, 2014
a004a8d
BUILD: Adding back CDH4 as per user requests
pwendell Aug 30, 2014
7e662af
[SPARK-3305] Remove unused import from UI classes.
sarutak Aug 30, 2014
acea928
[SPARK-2288] Hide ShuffleBlockManager behind ShuffleManager
colorant Aug 30, 2014
d90434c
Manually close old pull requests
rxin Aug 30, 2014
b6cf134
[SPARK-2889] Create Hadoop config objects consistently.
Aug 30, 2014
ba78383
SPARK-3318: Documentation update in addFile on how to use SparkFiles.get
holdenk Aug 30, 2014
9b8c228
MAINTENANCE: Automated closing of pull requests.
pwendell Aug 31, 2014
c567a68
[Spark QA] only check code files for new classes
nchammas Aug 31, 2014
725715c
[SPARK-3010] fix redundant conditional
scwf Aug 31, 2014
1f98add
MAINTENANCE: Automated closing of pull requests.
pwendell Sep 2, 2014
db16067
[SPARK-3135] Avoid extra mem copy in TorrentBroadcast via ByteArrayCh…
rxin Sep 2, 2014
44d3a6a
[SPARK-3342] Add SSDs to block device mapping
darabos Sep 2, 2014
fbf2678
SPARK-2636: Expose job ID in JobWaiter API
Sep 2, 2014
0f16b23
[MLlib] Squash bug in IndexedRowMatrix
rezazadeh Sep 2, 2014
32ec0a8
SPARK-3331 [BUILD] PEP8 tests fail because they check unzipped py4j code
srowen Sep 2, 2014
378b231
[SPARK-3061] Fix Maven build under Windows
JoshRosen Sep 2, 2014
8f1f9aa
[SPARK-1919] Fix Windows spark-shell --jars
andrewor14 Sep 2, 2014
066f31a
[SPARK-3347] [yarn] Fix yarn-alpha compilation.
Sep 2, 2014
81b9d5b
SPARK-3052. Misleading and spurious FileSystem closed errors whenever…
sryza Sep 2, 2014
e2c901b
[SPARK-2871] [PySpark] add countApproxDistinct() API
davies Sep 2, 2014
644e315
SPARK-3328 fixed make-distribution script --with-tachyon option.
prudhvi953 Sep 3, 2014
7c92b49
[SPARK-1986][GraphX]move lib.Analytics to org.apache.spark.examples
larryxiao Sep 3, 2014
7c9bbf1
[SPARK-3123][GraphX]: override the "setName" function to set EdgeRDD'…
uncleGen Sep 3, 2014
aa7de12
[SPARK-2981][GraphX] EdgePartition1D Int overflow
larryxiao Sep 3, 2014
e9bb12b
[SPARK-1981][Streaming][Hotfix] Fixed docs related to kinesis
tdas Sep 3, 2014
9b225ac
[SPARK-2823][GraphX]fix GraphX EdgeRDD zipPartitions
luluorta Sep 3, 2014
0cd91f6
[SPARK-3341][SQL] The dataType of Sqrt expression should be DoubleType.
ueshin Sep 3, 2014
19d3e1e
[SQL] Renamed ColumnStat to ColumnMetrics to avoid confusion between …
liancheng Sep 3, 2014
24ab384
[SPARK-3300][SQL] No need to call clear() and shorten build()
viirya Sep 3, 2014
c64cc43
SPARK-3358: [EC2] Switch back to HVM instances for m3.X.
pwendell Sep 3, 2014
6a72a36
[SPARK-3187] [yarn] Cleanup allocator code.
Sep 3, 2014
6481d27
[SPARK-3309] [PySpark] Put all public API in __all__
davies Sep 3, 2014
e5d3768
[SPARK-3263][GraphX] Fix changes made to GraphGenerator.logNormalGrap…
rnowling Sep 3, 2014
ccc69e2
[SPARK-2845] Add timestamps to block manager events.
Sep 3, 2014
f2b5b61
[SPARK-3388] Expose aplication ID in ApplicationStart event, use it i…
Sep 3, 2014
2784822
[Minor] Fix outdated Spark version
andrewor14 Sep 3, 2014
996b743
[SPARK-3345] Do correct parameters for ShuffleFileGroup
viirya Sep 4, 2014
a522407
[SPARK-2419][Streaming][Docs] Updates to the streaming programming guide
tdas Sep 4, 2014
e08ea73
[SPARK-3303][core] fix SparkContextSchedulerCreationSuite test error
scwf Sep 4, 2014
4bba10c
[SPARK-3233] Executor never stop its SparnEnv, BlockManager, Connecti…
sarutak Sep 4, 2014
f48420f
[SPARK-2973][SQL] Lightweight SQL commands without distributed jobs w…
liancheng Sep 4, 2014
248067a
[SPARK-2961][SQL] Use statistics to prune batches within cached parti…
liancheng Sep 4, 2014
c5cbc49
[SPARK-3335] [SQL] [PySpark] support broadcast in Python UDF
davies Sep 4, 2014
7c6e71f
[SPARK-2435] Add shutdown hook to pyspark
mattf Sep 4, 2014
1bed0a3
[SPARK-3372] [MLlib] MLlib doesn't pass maven build / checkstyle due …
sarutak Sep 4, 2014
00362da
[HOTFIX] [SPARK-3400] Revert 9b225ac "fix GraphX EdgeRDD zipPartitions"
ankurdave Sep 4, 2014
9058619
[Minor]Remove extra semicolon in FlumeStreamSuite.scala
witgo Sep 4, 2014
4feb46c
[SPARK-3401][PySpark] Wrong usage of tee command in python/run-tests
sarutak Sep 4, 2014
dc1ba9e
[SPARK-3378] [DOCS] Replace the word "SparkSQL" with right word "Spar…
sarutak Sep 4, 2014
0fdf2f5
Manually close old PR
mateiz Sep 5, 2014
90b17a7
Manually close old PR
mateiz Sep 5, 2014
3eb6ef3
[SPARK-3310][SQL] Directly use currentTable without unnecessary impli…
viirya Sep 5, 2014
ee575f1
[SPARK-2219][SQL] Added support for the "add jar" command
liancheng Sep 5, 2014
1904bac
[SPARK-3392] [SQL] Show value spark.sql.shuffle.partitions for mapred…
chenghao-intel Sep 5, 2014
1725a1a
[SPARK-3391][EC2] Support attaching up to 8 EBS volumes.
rxin Sep 5, 2014
6a37ed8
[Docs] fix minor MLlib case typo
nchammas Sep 5, 2014
51b53a7
[SPARK-3260] yarn - pass acls along with executor launch
tgravescs Sep 5, 2014
62c5576
[SPARK-3375] spark on yarn container allocation issues
tgravescs Sep 5, 2014
7ff8c45
[SPARK-3399][PySpark] Test for PySpark should ignore HADOOP_CONF_DIR …
sarutak Sep 5, 2014
ba5bcad
SPARK-3211 .take() is OOM-prone with empty partitions
ash211 Sep 6, 2014
19f61c1
[Build] suppress curl/wget progress bars
nchammas Sep 6, 2014
9422c4e
[SPARK-3361] Expand PEP 8 checks to include EC2 script and Python exa…
nchammas Sep 6, 2014
1b9001f
[SPARK-3409][SQL] Avoid pulling in Exchange operator itself in Exchan…
rxin Sep 6, 2014
0c681dd
[EC2] don't duplicate default values
nchammas Sep 6, 2014
baff7e9
[SPARK-2419][Streaming][Docs] More updates to the streaming programmi…
tdas Sep 6, 2014
da35330
Spark-3406 add a default storage level to python RDD persist API
holdenk Sep 6, 2014
607ae39
[SPARK-3397] Bump pom.xml version number of master branch to 1.2.0-SN…
witgo Sep 6, 2014
21a1e1b
[SPARK-3273][SPARK-3301]We should read the version information from t…
witgo Sep 6, 2014
110fb8b
[SPARK-2334] fix AttributeError when call PipelineRDD.id()
davies Sep 6, 2014
3fb57a0
[SPARK-3353] parent stage should have lower stage id.
rxin Sep 7, 2014
6754570
[SPARK-3394] [SQL] Fix crash in TakeOrdered when limit is 0
Sep 8, 2014
39db1bf
[SQL] Update SQL Programming Guide
marmbrus Sep 8, 2014
e261403
[SPARK-3408] Fixed Limit operator so it works with sort-based shuffle.
rxin Sep 8, 2014
ecfa76c
[SPARK-3415] [PySpark] removes SerializingAdapter code
Sep 8, 2014
9d69a78
Fixed typos in make-distribution.sh
liancheng Sep 8, 2014
4ba2673
[HOTFIX] Fix broken Mima tests on the master branch
JoshRosen Sep 8, 2014
f25bbbd
[SPARK-3280] Made sort-based shuffle the default implementation
rxin Sep 8, 2014
eddfedd
[SPARK-938][doc] Add OpenStack Swift support
rxin Sep 8, 2014
0d1cc4a
[HOTFIX] A left over version change. It should make mima happy.
ScrapCodes Sep 8, 2014
711356b
[SPARK-3086] [SPARK-3043] [SPARK-3156] [mllib] DecisionTree aggregat…
jkbradley Sep 8, 2014
e16a8e7
SPARK-3337 Paranoid quoting in shell to allow install dirs with space…
ScrapCodes Sep 8, 2014
16a73c2
SPARK-2978. Transformation with MR shuffle semantics
sryza Sep 8, 2014
386bc24
Provide a default PYSPARK_PYTHON for python/run_tests
mattf Sep 8, 2014
26bc765
[SQL] Minor edits to sql programming guide.
hcook Sep 8, 2014
939a322
[SPARK-3417] Use new-style classes in PySpark
mrocklin Sep 8, 2014
08ce188
[SPARK-3019] Pluggable block transfer interface (BlockTransferService)
rxin Sep 8, 2014
7db5339
[SPARK-3349][SQL] Output partitioning of limit should not be inherite…
Sep 8, 2014
50a4fa7
[SPARK-3443][MLLIB] update default values of tree:
mengxr Sep 9, 2014
ca0348e
SPARK-3423: [SQL] Implement BETWEEN for SQLParser
willb Sep 9, 2014
dc1dbf2
[SPARK-3414][SQL] Stores analyzed logical plan when registering a tem…
liancheng Sep 9, 2014
2b7ab81
[SPARK-3329][SQL] Don't depend on Hive SET pair ordering in tests.
willb Sep 9, 2014
092e2f1
SPARK-2425 Don't kill a still-running Application because of some mis…
markhamstra Sep 9, 2014
ce5cb32
[Build] Removed -Phive-thriftserver since this profile has been removed
liancheng Sep 9, 2014
c419e4f
[Docs] actorStream storageLevel default is MEMORY_AND_DISK_SER_2
melrief Sep 9, 2014
1e03cf7
[SPARK-3455] [SQL] **HOT FIX** Fix the unit test failure
chenghao-intel Sep 9, 2014
88547a0
SPARK-3422. JavaAPISuite.getHadoopInputSplits isn't used anywhere.
sryza Sep 9, 2014
f0f1ba0
SPARK-3404 [BUILD] SparkSubmitSuite fails with "spark-submit exits wi…
srowen Sep 9, 2014
2686233
[SPARK-3193]output errer info when Process exit code is not zero in t…
scwf Sep 9, 2014
02b5ac7
Minor - Fix trivial compilation warnings.
ScrapCodes Sep 9, 2014
07ee4a2
[SPARK-3176] Implement 'ABS and 'LAST' for sql
Sep 9, 2014
c110614
[SPARK-3448][SQL] Check for null in SpecificMutableRow.update
liancheng Sep 10, 2014
25b5b86
[SPARK-3458] enable python "with" statements for SparkContext
mattf Sep 10, 2014
b734ed0
[SPARK-3395] [SQL] DSL sometimes incorrectly reuses attribute ids, br…
Sep 10, 2014
6f7a768
[SPARK-3286] - Cannot view ApplicationMaster UI when Yarn’s url schem…
Sep 10, 2014
a028330
[SPARK-3362][SQL] Fix resolution for casewhen with nulls.
adrian-wang Sep 10, 2014
f0c87dc
[SPARK-3363][SQL] Type Coercion should promote null to all other types.
adrian-wang Sep 10, 2014
26503fd
[HOTFIX] Fix scala style issue introduced by #2276.
JoshRosen Sep 10, 2014
1f4a648
SPARK-1713. Use a thread pool for launching executors.
sryza Sep 10, 2014
e4f4886
[SPARK-2096][SQL] Correctly parse dot notations
cloud-fan Sep 10, 2014
558962a
[SPARK-3411] Improve load-balancing of concurrently-submitted drivers…
WangTaoTheTonic Sep 10, 2014
79cdb9b
[SPARK-2207][SPARK-3272][MLLib]Add minimum information gain and minim…
Sep 10, 2014
84e2c8b
[SQL] Add test case with workaround for reading partitioned Avro files
marmbrus Sep 11, 2014
f92cde2
[SPARK-3447][SQL] Remove explicit conversion with JListWrapper to avo…
marmbrus Sep 11, 2014
c27718f
[SPARK-2781][SQL] Check resolution of LogicalPlans in Analyzer.
staple Sep 11, 2014
ed1980f
[SPARK-2140] Updating heap memory calculation for YARN stable and alpha.
Sep 11, 2014
1ef656e
[SPARK-3047] [PySpark] add an option to use str in textFileRDD
davies Sep 11, 2014
ca83f1e
[SPARK-2917] [SQL] Avoid table creation in logical plan analyzing for…
chenghao-intel Sep 11, 2014
4bc9e04
[SPARK-3390][SQL] sqlContext.jsonRDD fails on a complex structure of …
yhuai Sep 11, 2014
6324eb7
[Spark-3490] Disable SparkUI for tests
andrewor14 Sep 12, 2014
ce59725
[SPARK-3429] Don't include the empty string "" as a defaultAclUser
ash211 Sep 12, 2014
f858f46
SPARK-3462 push down filters and projections into Unions
Sep 12, 2014
33c7a73
SPARK-2482: Resolve sbt warnings during build
witgo Sep 12, 2014
42904b8
[SPARK-3465] fix task metrics aggregation in local mode
davies Sep 12, 2014
2 changes: 2 additions & 0 deletions .rat-excludes
@@ -25,11 +25,13 @@ log4j-defaults.properties
bootstrap-tooltip.js
jquery-1.11.1.min.js
sorttable.js
.*avsc
.*txt
.*json
.*data
.*log
cloudpickle.py
heapq3.py
join.py
SparkExprTyper.scala
SparkILoop.scala
283 changes: 283 additions & 0 deletions LICENSE
@@ -338,6 +338,289 @@ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

========================================================================
For heapq (pyspark/heapq3.py):
========================================================================

# A. HISTORY OF THE SOFTWARE
# ==========================
#
# Python was created in the early 1990s by Guido van Rossum at Stichting
# Mathematisch Centrum (CWI, see http://www.cwi.nl) in the Netherlands
# as a successor of a language called ABC. Guido remains Python's
# principal author, although it includes many contributions from others.
#
# In 1995, Guido continued his work on Python at the Corporation for
# National Research Initiatives (CNRI, see http://www.cnri.reston.va.us)
# in Reston, Virginia where he released several versions of the
# software.
#
# In May 2000, Guido and the Python core development team moved to
# BeOpen.com to form the BeOpen PythonLabs team. In October of the same
# year, the PythonLabs team moved to Digital Creations (now Zope
# Corporation, see http://www.zope.com). In 2001, the Python Software
# Foundation (PSF, see http://www.python.org/psf/) was formed, a
# non-profit organization created specifically to own Python-related
# Intellectual Property. Zope Corporation is a sponsoring member of
# the PSF.
#
# All Python releases are Open Source (see http://www.opensource.org for
# the Open Source Definition). Historically, most, but not all, Python
# releases have also been GPL-compatible; the table below summarizes
# the various releases.
#
#     Release         Derived     Year        Owner       GPL-
#                     from                                compatible? (1)
#
#     0.9.0 thru 1.2              1991-1995   CWI         yes
#     1.3 thru 1.5.2  1.2         1995-1999   CNRI        yes
#     1.6             1.5.2       2000        CNRI        no
#     2.0             1.6         2000        BeOpen.com  no
#     1.6.1           1.6         2001        CNRI        yes (2)
#     2.1             2.0+1.6.1   2001        PSF         no
#     2.0.1           2.0+1.6.1   2001        PSF         yes
#     2.1.1           2.1+2.0.1   2001        PSF         yes
#     2.2             2.1.1       2001        PSF         yes
#     2.1.2           2.1.1       2002        PSF         yes
#     2.1.3           2.1.2       2002        PSF         yes
#     2.2.1           2.2         2002        PSF         yes
#     2.2.2           2.2.1       2002        PSF         yes
#     2.2.3           2.2.2       2003        PSF         yes
#     2.3             2.2.2       2002-2003   PSF         yes
#     2.3.1           2.3         2002-2003   PSF         yes
#     2.3.2           2.3.1       2002-2003   PSF         yes
#     2.3.3           2.3.2       2002-2003   PSF         yes
#     2.3.4           2.3.3       2004        PSF         yes
#     2.3.5           2.3.4       2005        PSF         yes
#     2.4             2.3         2004        PSF         yes
#     2.4.1           2.4         2005        PSF         yes
#     2.4.2           2.4.1       2005        PSF         yes
#     2.4.3           2.4.2       2006        PSF         yes
#     2.4.4           2.4.3       2006        PSF         yes
#     2.5             2.4         2006        PSF         yes
#     2.5.1           2.5         2007        PSF         yes
#     2.5.2           2.5.1       2008        PSF         yes
#     2.5.3           2.5.2       2008        PSF         yes
#     2.6             2.5         2008        PSF         yes
#     2.6.1           2.6         2008        PSF         yes
#     2.6.2           2.6.1       2009        PSF         yes
#     2.6.3           2.6.2       2009        PSF         yes
#     2.6.4           2.6.3       2009        PSF         yes
#     2.6.5           2.6.4       2010        PSF         yes
#     2.7             2.6         2010        PSF         yes
#
# Footnotes:
#
# (1) GPL-compatible doesn't mean that we're distributing Python under
# the GPL. All Python licenses, unlike the GPL, let you distribute
# a modified version without making your changes open source. The
# GPL-compatible licenses make it possible to combine Python with
# other software that is released under the GPL; the others don't.
#
# (2) According to Richard Stallman, 1.6.1 is not GPL-compatible,
# because its license has a choice of law clause. According to
# CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1
# is "not incompatible" with the GPL.
#
# Thanks to the many outside volunteers who have worked under Guido's
# direction to make these releases possible.
#
#
# B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON
# ===============================================================
#
# PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2
# --------------------------------------------
#
# 1. This LICENSE AGREEMENT is between the Python Software Foundation
# ("PSF"), and the Individual or Organization ("Licensee") accessing and
# otherwise using this software ("Python") in source or binary form and
# its associated documentation.
#
# 2. Subject to the terms and conditions of this License Agreement, PSF hereby
# grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce,
# analyze, test, perform and/or display publicly, prepare derivative works,
# distribute, and otherwise use Python alone or in any derivative version,
# provided, however, that PSF's License Agreement and PSF's notice of copyright,
# i.e., "Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
# 2011, 2012, 2013 Python Software Foundation; All Rights Reserved" are retained
# in Python alone or in any derivative version prepared by Licensee.
#
# 3. In the event Licensee prepares a derivative work that is based on
# or incorporates Python or any part thereof, and wants to make
# the derivative work available to others as provided herein, then
# Licensee hereby agrees to include in any such work a brief summary of
# the changes made to Python.
#
# 4. PSF is making Python available to Licensee on an "AS IS"
# basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
# IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND
# DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
# FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT
# INFRINGE ANY THIRD PARTY RIGHTS.
#
# 5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
# FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
# A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON,
# OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
#
# 6. This License Agreement will automatically terminate upon a material
# breach of its terms and conditions.
#
# 7. Nothing in this License Agreement shall be deemed to create any
# relationship of agency, partnership, or joint venture between PSF and
# Licensee. This License Agreement does not grant permission to use PSF
# trademarks or trade name in a trademark sense to endorse or promote
# products or services of Licensee, or any third party.
#
# 8. By copying, installing or otherwise using Python, Licensee
# agrees to be bound by the terms and conditions of this License
# Agreement.
#
#
# BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0
# -------------------------------------------
#
# BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1
#
# 1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an
# office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the
# Individual or Organization ("Licensee") accessing and otherwise using
# this software in source or binary form and its associated
# documentation ("the Software").
#
# 2. Subject to the terms and conditions of this BeOpen Python License
# Agreement, BeOpen hereby grants Licensee a non-exclusive,
# royalty-free, world-wide license to reproduce, analyze, test, perform
# and/or display publicly, prepare derivative works, distribute, and
# otherwise use the Software alone or in any derivative version,
# provided, however, that the BeOpen Python License is retained in the
# Software, alone or in any derivative version prepared by Licensee.
#
# 3. BeOpen is making the Software available to Licensee on an "AS IS"
# basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
# IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND
# DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
# FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT
# INFRINGE ANY THIRD PARTY RIGHTS.
#
# 4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
# SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS
# AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY
# DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
#
# 5. This License Agreement will automatically terminate upon a material
# breach of its terms and conditions.
#
# 6. This License Agreement shall be governed by and interpreted in all
# respects by the law of the State of California, excluding conflict of
# law provisions. Nothing in this License Agreement shall be deemed to
# create any relationship of agency, partnership, or joint venture
# between BeOpen and Licensee. This License Agreement does not grant
# permission to use BeOpen trademarks or trade names in a trademark
# sense to endorse or promote products or services of Licensee, or any
# third party. As an exception, the "BeOpen Python" logos available at
# http://www.pythonlabs.com/logos.html may be used according to the
# permissions granted on that web page.
#
# 7. By copying, installing or otherwise using the software, Licensee
# agrees to be bound by the terms and conditions of this License
# Agreement.
#
#
# CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1
# ---------------------------------------
#
# 1. This LICENSE AGREEMENT is between the Corporation for National
# Research Initiatives, having an office at 1895 Preston White Drive,
# Reston, VA 20191 ("CNRI"), and the Individual or Organization
# ("Licensee") accessing and otherwise using Python 1.6.1 software in
# source or binary form and its associated documentation.
#
# 2. Subject to the terms and conditions of this License Agreement, CNRI
# hereby grants Licensee a nonexclusive, royalty-free, world-wide
# license to reproduce, analyze, test, perform and/or display publicly,
# prepare derivative works, distribute, and otherwise use Python 1.6.1
# alone or in any derivative version, provided, however, that CNRI's
# License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
# 1995-2001 Corporation for National Research Initiatives; All Rights
# Reserved" are retained in Python 1.6.1 alone or in any derivative
# version prepared by Licensee. Alternately, in lieu of CNRI's License
# Agreement, Licensee may substitute the following text (omitting the
# quotes): "Python 1.6.1 is made available subject to the terms and
# conditions in CNRI's License Agreement. This Agreement together with
# Python 1.6.1 may be located on the Internet using the following
# unique, persistent identifier (known as a handle): 1895.22/1013. This
# Agreement may also be obtained from a proxy server on the Internet
# using the following URL: http://hdl.handle.net/1895.22/1013".
#
# 3. In the event Licensee prepares a derivative work that is based on
# or incorporates Python 1.6.1 or any part thereof, and wants to make
# the derivative work available to others as provided herein, then
# Licensee hereby agrees to include in any such work a brief summary of
# the changes made to Python 1.6.1.
#
# 4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS"
# basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
# IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND
# DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
# FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT
# INFRINGE ANY THIRD PARTY RIGHTS.
#
# 5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
# 1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
# A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1,
# OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
#
# 6. This License Agreement will automatically terminate upon a material
# breach of its terms and conditions.
#
# 7. This License Agreement shall be governed by the federal
# intellectual property law of the United States, including without
# limitation the federal copyright law, and, to the extent such
# U.S. federal law does not apply, by the law of the Commonwealth of
# Virginia, excluding Virginia's conflict of law provisions.
# Notwithstanding the foregoing, with regard to derivative works based
# on Python 1.6.1 that incorporate non-separable material that was
# previously distributed under the GNU General Public License (GPL), the
# law of the Commonwealth of Virginia shall govern this License
# Agreement only as to issues arising under or with respect to
# Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this
# License Agreement shall be deemed to create any relationship of
# agency, partnership, or joint venture between CNRI and Licensee. This
# License Agreement does not grant permission to use CNRI trademarks or
# trade name in a trademark sense to endorse or promote products or
# services of Licensee, or any third party.
#
# 8. By clicking on the "ACCEPT" button where indicated, or by copying,
# installing or otherwise using Python 1.6.1, Licensee agrees to be
# bound by the terms and conditions of this License Agreement.
#
# ACCEPT
#
#
# CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2
# --------------------------------------------------
#
# Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam,
# The Netherlands. All rights reserved.
#
# Permission to use, copy, modify, and distribute this software and its
# documentation for any purpose and without fee is hereby granted,
# provided that the above copyright notice appear in all copies and that
# both that copyright notice and this permission notice appear in
# supporting documentation, and that the name of Stichting Mathematisch
# Centrum or CWI not be used in advertising or publicity pertaining to
# distribution of the software without specific, written prior
# permission.
#
# STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
# THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
# FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
# FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
# ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
# OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

========================================================================
For sorttable (core/src/main/resources/org/apache/spark/ui/static/sorttable.js):
16 changes: 13 additions & 3 deletions README.md
@@ -4,8 +4,8 @@ Spark is a fast and general cluster computing system for Big Data. It provides
high-level APIs in Scala, Java, and Python, and an optimized engine that
supports general computation graphs for data analysis. It also supports a
rich set of higher-level tools including Spark SQL for SQL and structured
-data processing, MLLib for machine learning, GraphX for graph processing,
-and Spark Streaming.
+data processing, MLlib for machine learning, GraphX for graph processing,
+and Spark Streaming for stream processing.

<http://spark.apache.org/>

@@ -69,7 +69,7 @@ Many of the example programs print usage help if no params are given.
Testing first requires [building Spark](#building-spark). Once Spark is built, tests
can be run using:

-./sbt/sbt test
+./dev/run-tests

## A Note About Hadoop Versions

@@ -115,6 +123,14 @@ If your project is built with Maven, add this to your POM file's `<dependencies>
</dependency>


+## A Note About Thrift JDBC server and CLI for Spark SQL
+
+Spark SQL supports Thrift JDBC server and CLI.
+See sql-programming-guide.md for more information about using the JDBC server and CLI.
+You can use those features by setting `-Phive` when building Spark as follows.
+
+$ sbt/sbt -Phive assembly
+
## Configuration

Please refer to the [Configuration guide](http://spark.apache.org/docs/latest/configuration.html)
@@ -131,3 +139,5 @@ submitting any copyrighted material via pull request, email, or other means
you agree to license the material under the project's open source license and
warrant that you have the legal authority to do so.

+Please see [Contributing to Spark wiki page](https://cwiki.apache.org/SPARK/Contributing+to+Spark)
+for more information.
25 changes: 19 additions & 6 deletions assembly/pom.xml
@@ -21,7 +21,7 @@
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent</artifactId>
-<version>1.1.0-SNAPSHOT</version>
+<version>1.2.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>

@@ -43,6 +43,12 @@
</properties>

<dependencies>
+<!-- Promote Guava to compile scope in this module so it's included while shading. -->
+<dependency>
+<groupId>com.google.guava</groupId>
+<artifactId>guava</artifactId>
+<scope>compile</scope>
+</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
@@ -113,6 +119,18 @@
<goal>shade</goal>
</goals>
<configuration>
+<relocations>
+<relocation>
+<pattern>com.google</pattern>
+<shadedPattern>org.spark-project.guava</shadedPattern>
+<includes>
+<include>com.google.common.**</include>
+</includes>
+<excludes>
+<exclude>com.google.common.base.Optional**</exclude>
+</excludes>
+</relocation>
+</relocations>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
@@ -163,11 +181,6 @@
<artifactId>spark-hive_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
-</dependencies>
-</profile>
-<profile>
-<id>hive-thriftserver</id>
-<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive-thriftserver_${scala.binary.version}</artifactId>