Closed
Changes from all commits (654 commits)
f14f6d6
[SPARK-38357][SQL][TESTS] Add test coverage for file source with OR(d…
huaxingao Mar 2, 2022
3ab18cc
[SPARK-38383][K8S] Support `APP_ID` and `EXECUTOR_ID` placeholder in …
dongjoon-hyun Mar 2, 2022
42db298
Revert "[SPARK-37090][BUILD] Upgrade `libthrift` to 0.16.0 to avoid s…
dongjoon-hyun Mar 2, 2022
b141c15
[SPARK-38342][CORE] Clean up deprecated api usage of Ivy
LuciferYang Mar 2, 2022
f960328
[SPARK-38389][SQL] Add the `DATEDIFF()` and `DATE_DIFF()` aliases for…
MaxGekk Mar 2, 2022
4d4c044
[SPARK-38392][K8S][TESTS] Add `spark-` prefix to namespaces and `-dri…
martin-g Mar 2, 2022
ad5427e
[SPARK-36553][ML] KMeans avoid compute auxiliary statistics for large K
zhengruifeng Mar 2, 2022
829d7fb
[MINOR][SQL][DOCS] Add more examples to sql-ref-syntax-ddl-create-tab…
wangyum Mar 2, 2022
226bdec
[SPARK-38269][CORE][SQL][SS][ML][MLLIB][MESOS][YARN][K8S][EXAMPLES] C…
LuciferYang Mar 2, 2022
23db9b4
[SPARK-38191][CORE][FOLLOWUP] The staging directory of write job only…
weixiuli Mar 3, 2022
86e0903
[SPARK-38398][K8S][TESTS] Add `priorityClassName` integration test case
dongjoon-hyun Mar 3, 2022
dfff8d8
[SPARK-38353][PYTHON] Instrument __enter__ and __exit__ magic methods…
heyihong Mar 3, 2022
b71d6d0
[SPARK-38378][SQL] Refactoring of the ANTLR grammar definition into s…
zhenlineo Mar 3, 2022
b81d90b
[SPARK-38312][CORE] Use error class in GraphiteSink
bozhang2820 Mar 3, 2022
34618a7
[SPARK-38351][TESTS] Don't use deprecate symbol API in test classes
martin-g Mar 3, 2022
5039c0f
[SPARK-38345][SQL] Introduce SQL function ARRAY_SIZE
xinrong-meng Mar 4, 2022
83d8000
[SPARK-38196][SQL] Refactor framework so as JDBC dialect could compil…
beliefer Mar 4, 2022
ae9b804
[SPARK-38417][CORE] Remove `Experimental` from `RDD.cleanShuffleDepen…
dongjoon-hyun Mar 5, 2022
980d88d
[SPARK-38418][PYSPARK] Add PySpark `cleanShuffleDependencies` develop…
dongjoon-hyun Mar 5, 2022
727f044
[SPARK-38189][K8S][DOC] Add `Priority scheduling` doc for Spark on K8S
Yikun Mar 5, 2022
97716f7
[SPARK-38393][SQL] Clean up deprecated usage of `GenSeq/GenMap`
LuciferYang Mar 5, 2022
18219d4
[SPARK-37400][SPARK-37426][PYTHON][MLLIB] Inline type hints for pyspa…
zero323 Mar 6, 2022
69bc9d1
[SPARK-38239][PYTHON][MLLIB] Fix pyspark.mllib.LogisticRegressionMode…
zero323 Mar 6, 2022
135841f
[SPARK-38411][CORE] Use `UTF-8` when `doMergeApplicationListingIntern…
pan3793 Mar 6, 2022
b651617
[SPARK-38416][PYTHON][TESTS] Change day to month
bjornjorgensen Mar 7, 2022
3175d83
[SPARK-38394][BUILD] Upgrade `scala-maven-plugin` to 4.4.0 for Hadoop…
steveloughran Mar 7, 2022
b99f58a
[SPARK-38267][CORE][SQL][SS] Replace pattern matches on boolean expre…
LuciferYang Mar 7, 2022
d83ab94
[SPARK-38419][BUILD] Replace tabs that exist in the script with spaces
Mar 7, 2022
fc6b5e5
[SPARK-38188][K8S][TESTS][FOLLOWUP] Cleanup resources in `afterEach`
Yikun Mar 7, 2022
3bbc43d
[SPARK-38430][K8S][DOCS] Add `SBT` commands to K8s IT README
williamhyun Mar 7, 2022
f36d1bf
[SPARK-38423][K8S] Reuse driver pod's `priorityClassName` for `PodGroup`
Yikun Mar 7, 2022
4883a80
[SPARK-38382][DOC] Fix incorrect version infomation of migration guid…
AngersZhuuuu Mar 7, 2022
e21cb62
[SPARK-38335][SQL] Implement parser support for DEFAULT column values
dtenedor Mar 7, 2022
c1e5e8a
[SPARK-38407][SQL] ANSI Cast: loosen the limitation of casting non-nu…
gengliangwang Mar 7, 2022
1b31b7c
[SPARK-38434][SQL] Correct semantic of CheckAnalysis.getDataTypesAreC…
ivoson Mar 7, 2022
ed3a61d
[SPARK-38394][BUILD][FOLLOWUP] Update comments about `scala-maven-plu…
steveloughran Mar 7, 2022
60d3de1
[SPARK-38104][SQL] Migrate parsing errors of window into the new erro…
yutoacts Mar 7, 2022
ddc1803
[SPARK-38414][CORE][DSTREAM][EXAMPLES][ML][MLLIB][SQL] Remove redunda…
LuciferYang Mar 7, 2022
6c486d2
[SPARK-38436][PYTHON][TESTS] Fix `test_ceil` to test `ceil`
bjornjorgensen Mar 7, 2022
71991f7
[SPARK-38285][SQL] Avoid generator pruning for invalid extractor
viirya Mar 7, 2022
a13b478
[SPARK-38183][PYTHON][FOLLOWUP] Check the ANSI conf properly when cre…
itholic Mar 8, 2022
14cda58
[SPARK-38385][SQL] Improve error messages of 'mismatched input' cases…
anchovYu Mar 8, 2022
e80d979
[SPARK-37895][SQL] Filter push down column with quoted columns
planga82 Mar 8, 2022
e5ba617
[SPARK-38361][SQL] Add factory method `getConnection` into `JDBCDialect`
beliefer Mar 8, 2022
4df8512
[SPARK-37283][SQL][FOLLOWUP] Avoid trying to store a table which cont…
sarutak Mar 8, 2022
9e1d00c
[SPARK-38406][SQL] Improve perfermance of ShufflePartitionsUtil creat…
ulysses-you Mar 8, 2022
cd32c22
[SPARK-38240][SQL][FOLLOW-UP] Make RuntimeReplaceableAggregate as an …
HyukjinKwon Mar 8, 2022
9854456
[SPARK-35956][K8S][FOLLOWP] Fix typos in config names
dongjoon-hyun Mar 8, 2022
13021ed
[SPARK-38442][SQL] Fix ConstantFoldingSuite/ColumnExpressionSuite/Dat…
gengliangwang Mar 8, 2022
8a0b101
[SPARK-38112][SQL] Use error classes in the execution errors of date/…
ivoson Mar 8, 2022
8b08f19
[SPARK-37753][SQL] Fine tune logic to demote Broadcast hash join in D…
ekoifman Mar 8, 2022
b5589a9
[SPARK-38423][K8S][FOLLOWUP] PodGroup spec should not be null
dongjoon-hyun Mar 8, 2022
0ad7677
[SPARK-38309][CORE] Fix SHS `shuffleTotalReads` and `shuffleTotalBloc…
robreeves Mar 8, 2022
8fabd5e
[SPARK-38428][SHUFFLE] Check the FetchShuffleBlocks message only once…
weixiuli Mar 8, 2022
049d6d1
[SPARK-38443][SS][DOC] Document config STREAMING_SESSION_WINDOW_MERGE…
viirya Mar 9, 2022
59ce0a7
[SPARK-37865][SQL] Fix union deduplication correctness bug
karenfeng Mar 9, 2022
43c7824
[SPARK-38412][SS] Fix the swapped sequence of from and to in StateSch…
HeartSaVioR Mar 9, 2022
f2058eb
[SPARK-38450][SQL] Fix HiveQuerySuite//PushFoldableIntoBranchesSuite/…
gengliangwang Mar 9, 2022
35c0e5c
[MINOR][PYTHON] Fix `MultilayerPerceptronClassifierTest.test_raw_and_…
harupy Mar 9, 2022
4da04fc
[SPARK-37600][BUILD] Upgrade to Hadoop 3.3.2
sunchao Mar 9, 2022
b8c03ee
[SPARK-38455][SPARK-38187][K8S] Support driver/executor `PodGroup` te…
dongjoon-hyun Mar 9, 2022
587ec34
[SPARK-38449][SQL] Avoid call createTable when ignoreIfExists=true an…
AngersZhuuuu Mar 9, 2022
66ff4b6
[SPARK-38452][K8S][TESTS] Support pyDockerfile and rDockerfile in SBT…
Yikun Mar 9, 2022
52e7602
[SPARK-38458][SQL] Fix always false condition in `LogDivertAppender#i…
LuciferYang Mar 9, 2022
bd6a3b4
[SPARK-38437][SQL] Lenient serialization of datetime from datasource
MaxGekk Mar 9, 2022
62e4c29
[SPARK-37421][PYTHON] Inline type hints for python/pyspark/mllib/eval…
dchvn Mar 9, 2022
93a25a4
[SPARK-37947][SQL] Extract generator from GeneratorOuter expression c…
bersprockets Mar 9, 2022
1584366
[SPARK-38354][SQL] Add hash probes metric for shuffled hash join
c21 Mar 9, 2022
effef84
[SPARK-36681][CORE][TEST] Enable SnappyCodec test in FileSuite
viirya Mar 9, 2022
97df016
[SPARK-38480][K8S] Remove `spark.kubernetes.job.queue` in favor of `s…
dongjoon-hyun Mar 9, 2022
01014aa
[SPARK-38486][K8S][TESTS] Upgrade the minimum Minikube version to 1.18.0
dongjoon-hyun Mar 10, 2022
0f4c26a
[SPARK-38387][PYTHON] Support `na_action` and Series input correspond…
xinrong-meng Mar 10, 2022
bd08e79
[SPARK-38355][PYTHON][TESTS] Use `mkstemp` instead of `mktemp`
bjornjorgensen Mar 10, 2022
ecabfb1
[SPARK-38187][K8S][TESTS] Add K8S IT for `volcano` minResources cpu/m…
Yikun Mar 10, 2022
82b6194
[SPARK-38385][SQL] Improve error messages of empty statement and <EOF…
anchovYu Mar 10, 2022
f286416
[SPARK-38379][K8S] Fix Kubernetes Client mode when mounting persisten…
tgravescs Mar 10, 2022
ec544ad
[SPARK-38148][SQL] Do not add dynamic partition pruning if there exis…
ulysses-you Mar 10, 2022
e5a86a3
[SPARK-38453][K8S][DOCS] Add `volcano` section to K8s IT `README.md`
Yikun Mar 10, 2022
c483e29
[SPARK-38487][PYTHON][DOC] Fix docstrings of nlargest/nsmallest of Da…
xinrong-meng Mar 10, 2022
3ab2455
[SPARK-38499][BUILD] Upgrade Jackson to 2.13.2
dongjoon-hyun Mar 10, 2022
bcf7849
[SPARK-38489][SQL] Aggregate.groupOnly support foldable expressions
wangyum Mar 10, 2022
538c81b
[SPARK-38481][SQL] Substitute Java overflow exception from `TIMESTAMP…
MaxGekk Mar 10, 2022
5cbd9b4
[SPARK-38500][INFRA] Add ASF License header to all Service Provider c…
yaooqinn Mar 10, 2022
216b972
[SPARK-38360][SQL][SS][PYTHON] Introduce a `exists` function for `Tre…
LuciferYang Mar 10, 2022
0a4a12d
[SPARK-38490][SQL][INFRA] Add Github action test job for ANSI SQL mode
gengliangwang Mar 10, 2022
a26c01d
[SPARK-38451][R][TESTS] Fix `make_date` test case to pass with ANSI mode
HyukjinKwon Mar 10, 2022
024d03e
[SPARK-38501][SQL] Fix thriftserver test failures under ANSI mode
gengliangwang Mar 10, 2022
f852100
[SPARK-38513][K8S] Move custom scheduler-specific configs to under `s…
dongjoon-hyun Mar 10, 2022
2239e9d
[MINOR][DOCS] Fix minor typos at nulls_option in Window Functions
bfallik Mar 11, 2022
54abb85
[SPARK-38517][INFRA] Fix PySpark documentation generation (missing ip…
HyukjinKwon Mar 11, 2022
aec70e8
[SPARK-38511][K8S] Remove `priorityClassName` propagation in favor of…
dongjoon-hyun Mar 11, 2022
2e3ac4f
[SPARK-38509][SQL] Unregister the `TIMESTAMPADD/DIFF` functions and r…
MaxGekk Mar 11, 2022
34e3029
[SPARK-38107][SQL] Use error classes in the compilation errors of pyt…
itholic Mar 11, 2022
36023c2
[SPARK-38491][PYTHON] Support `ignore_index` of `Series.sort_values`
xinrong-meng Mar 11, 2022
b1d8f35
[SPARK-38518][PYTHON] Implement `skipna` of `Series.all/Index.all` to…
xinrong-meng Mar 11, 2022
fd5896b
[SPARK-38527][K8S][DOCS] Set the minimum Volcano version
dongjoon-hyun Mar 11, 2022
60334d7
[SPARK-38516][BUILD] Add log4j-core and log4j-api to classpath if act…
wangyum Mar 12, 2022
c91c2e9
[SPARK-38526][SQL] Fix misleading function alias name for RuntimeRepl…
cloud-fan Mar 12, 2022
a511ca1
[SPARK-38534][SQL][TESTS] Disable `to_timestamp('366', 'DD')` test case
dongjoon-hyun Mar 12, 2022
c032928
[SPARK-37430][PYTHON][MLLIB] Inline hints for pyspark.mllib.linalg.di…
hi-zir Mar 12, 2022
6becf4e
[SPARK-38538][K8S][TESTS] Fix driver environment verification in Basi…
dongjoon-hyun Mar 13, 2022
96e5446
[SPARK-36058][K8S][TESTS][FOLLOWUP] Fix error message to include exce…
dongjoon-hyun Mar 13, 2022
6b64e5d
[SPARK-38320][SS] Fix flatMapGroupsWithState timeout in batch with da…
alex-balikov Mar 13, 2022
786a70e
[SPARK-38537][K8S] Unify `Statefulset*` to `StatefulSet*`
dongjoon-hyun Mar 13, 2022
9bede26
[MINOR][K8S][TESTS] Remove `verifyPriority` from `VolcanoFeatureStepS…
williamhyun Mar 13, 2022
0840b23
[SPARK-38540][BUILD] Upgrade `compress-lzf` from 1.0.3 to 1.1
LuciferYang Mar 13, 2022
83673c8
[SPARK-38528][SQL] Eagerly iterate over aggregate sequence when build…
bersprockets Mar 14, 2022
715a06c
[SPARK-38532][SS][TESTS] Add test case for invalid gapDuration of ses…
nyingping Mar 14, 2022
5699095
[SPARK-38519][SQL] AQE throw exception should respect SparkFatalExcep…
ulysses-you Mar 14, 2022
efe4330
[SPARK-38410][SQL] Support specify initial partition number for rebal…
ulysses-you Mar 14, 2022
9596942
[SPARK-38523][SQL] Fix referring to the corrupt record column from CSV
MaxGekk Mar 14, 2022
35536a1
[SPARK-38103][SQL] Migrate parsing errors of transform into the new e…
Mar 14, 2022
8e44791
[SPARK-38504][SQL] Cannot read TimestampNTZ as TimestampLTZ
beliefer Mar 14, 2022
2844a18
[SPARK-38360][SQL][AVRO][SS][FOLLOWUP] Replace `TreeNode.collectFirst…
LuciferYang Mar 14, 2022
130bcce
[SPARK-38415][SQL] Update the histogram_numeric (x, y) result type to…
dtenedor Mar 14, 2022
a342214
[SPARK-38535][SQL] Add the `datetimeUnit` enum and use it in `TIMESTA…
MaxGekk Mar 14, 2022
5bb001b
[SPARK-36967][FOLLOWUP][CORE] Report accurate shuffle block size if i…
wankunde Mar 14, 2022
0005b41
[SPARK-38400][PYTHON] Enable Series.rename to change index labels
xinrong-meng Mar 14, 2022
c16a66a
[SPARK-36194][SQL] Add a logical plan visitor to propagate the distin…
wangyum Mar 14, 2022
f6c4634
[SPARK-37491][PYTHON] Fix Series.asof for unsorted values
pralabhkumar Mar 14, 2022
a30575e
[SPARK-38544][BUILD] Upgrade log4j2 to 2.17.2
LuciferYang Mar 14, 2022
1d4e917
[SPARK-38521][SQL] Change `partitionOverwriteMode` from string to var…
jackylee-ch Mar 15, 2022
8b5ec77
[SPARK-38549][SS] Add `numRowsDroppedByWatermark` to `SessionWindowSt…
viirya Mar 15, 2022
f17f078
[SPARK-38513][K8S][FOLLWUP] Cleanup executor-podgroup-template.yml
Yikun Mar 15, 2022
58c21e5
[SPARK-38527][K8S][DOCS][FOLLOWUP] Use v1.5.0 tag instead of release-1.5
dongjoon-hyun Mar 15, 2022
2a63fea
Revert "[SPARK-38544][BUILD] Upgrade log4j2 to 2.17.2"
wangyum Mar 15, 2022
c00942d
[SPARK-38524][SPARK-38553][K8S] Bump `Volcano` to v1.5.1 and fix Volc…
Yikun Mar 15, 2022
21db916
[SPARK-38484][PYTHON] Move usage logging instrumentation util functio…
heyihong Mar 15, 2022
4e31000
[SPARK-38204][SS] Use StatefulOpClusteredDistribution for stateful op…
HeartSaVioR Mar 15, 2022
f84018a
[SPARK-38424][PYTHON] Warn unused casts and ignores
zero323 Mar 16, 2022
1acadf3
[SPARK-38558][SQL] Remove unnecessary casts between IntegerType and I…
cashmand Mar 16, 2022
8476c8b
[SPARK-38542][SQL] UnsafeHashedRelation should serialize numKeys out
mcdull-zhang Mar 16, 2022
8193b40
[SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4
HyukjinKwon Mar 16, 2022
1b41416
[SPARK-38106][SQL] Use error classes in the parsing errors of functions
ivoson Mar 16, 2022
71e2110
[SPARK-38194][YARN][MESOS][K8S] Make memory overhead factor configurable
Kimahriman Mar 16, 2022
4ff40c1
[SPARK-38561][K8S][DOCS] Add doc for `Customized Kubernetes Schedulers`
Yikun Mar 16, 2022
5967f29
[SPARK-38545][BUILD] Upgarde scala-maven-plugin from 4.4.0 to 4.5.6
LuciferYang Mar 16, 2022
6d3e8eb
[SPARK-38555][NETWORK][SHUFFLE] Avoid contention and get or create cl…
weixiuli Mar 16, 2022
b16a9e9
[SPARK-38572][BUILD] Setting version to 3.4.0-SNAPSHOT
MaxGekk Mar 17, 2022
b348acd
[SPARK-38441][PYTHON] Support string and bool `regex` in `Series.repl…
xinrong-meng Mar 17, 2022
7d1ff01
[SPARK-38556][PYTHON] Disable Pandas usage logging for method calls i…
heyihong Mar 17, 2022
78ed4cc
[SPARK-38575][INFRA] Duduplicate branch specification in GitHub Actio…
HyukjinKwon Mar 17, 2022
7630787
[SPARK-38575][INFRA][FOLLOW-UP] Fix ** to '**' in ansi_sql_mode_test.yml
HyukjinKwon Mar 17, 2022
f0b836b
[SPARK-38560][SQL] If `Sum`, `Count`, `Any` accompany with distinct, …
beliefer Mar 17, 2022
3afc4fb
[SPARK-37995][SQL] PlanAdaptiveDynamicPruningFilters should use prepa…
ulysses-you Mar 17, 2022
46ccc22
[MINOR][INFRA] Add ANTLR generated files to .gitignore
Mar 17, 2022
5c4930a
[SPARK-38586][INFRA] Trigger notifying workflow in branch-3.3 and oth…
HyukjinKwon Mar 17, 2022
968bb34
[SPARK-38575][INFRA][FOLLOW-UP] Use GITHUB_REF to get the current branch
HyukjinKwon Mar 17, 2022
cd86df8
[SPARK-38575][INFRA][FOLLOW-UP] Pin the branch to `master` for forked…
HyukjinKwon Mar 17, 2022
2d1d18a
[SPARK-37425][PYTHON] Inline type hints for python/pyspark/mllib/reco…
dchvn Mar 17, 2022
54fdb88
Revert "[SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.4"
dongjoon-hyun Mar 17, 2022
f36a5fb
[SPARK-38194] Followup: Fix k8s memory overhead passing to executor pods
Mar 17, 2022
97335ea
[SPARK-38563][PYTHON] Upgrade to Py4J 0.10.9.5
HyukjinKwon Mar 18, 2022
681dfee
[SPARK-38583][SQL] Restore the behavior of `to_timestamp` that allows…
HyukjinKwon Mar 18, 2022
a9ad119
[SPARK-38593][SS] Carry over the metric of the number of dropped late…
HeartSaVioR Mar 18, 2022
53eaaf8
[SPARK-38600][SQL] Include `unit` into the sql string of `TIMESTAMPAD…
MaxGekk Mar 18, 2022
b0f21e1
[SPARK-38568][BUILD] Upgrade ZSTD-JNI to 1.5.2-2
wangyum Mar 19, 2022
56086cb
[SPARK-38541][BUILD] Upgrade Netty to 4.1.75
LuciferYang Mar 19, 2022
4661455
[SPARK-38544][BUILD] Upgrade log4j2 to 2.17.2
jackylee-ch Mar 19, 2022
91614ff
[SPARK-38510][SQL] Retry ClassSymbol.selfType to work around cyclic r…
shardulm94 Mar 19, 2022
dcc66e4
Revert "[SPARK-38556][PYTHON] Disable Pandas usage logging for method…
HyukjinKwon Mar 21, 2022
cae51ea
[SPARK-38607][INFRA] Test result report for ANSI mode
HyukjinKwon Mar 21, 2022
c34fee4
[SPARK-38548][SQL] New SQL function: try_sum
gengliangwang Mar 21, 2022
a627dac
[SPARK-38609][PYTHON] Add `PYSPARK_PANDAS_USAGE_LOGGER` environment v…
HyukjinKwon Mar 21, 2022
d3af3e5
[SPARK-30220] Enable using Exists/In subqueries outside of the Filter…
tanelk Mar 21, 2022
acb50d9
[SPARK-38612][PYTHON] Fix Inline type hint for duplicated.keep
Yikun Mar 21, 2022
f8fd023
[SPARK-34805][SQL] Propagate metadata from nested columns in Alias
kevinwallimann Mar 21, 2022
a876f00
[SPARK-38606][DOC] Update document to make a good guide of multiple v…
TonyDoen Mar 21, 2022
692e4b0
[SPARK-38604][SQL] Keep ceil and floor with only a single argument th…
revans2 Mar 22, 2022
2ca5d18
[SPARK-38488][INFRA] Upgrade ffi to 1.15.5 with --enable-libffi-alloc…
Yikun Mar 22, 2022
ee5121a
[SPARK-38574][DOCS] Enrich the documentation of option avroSchema
tianhanhu Mar 22, 2022
fc5e922
[SPARK-38564][SS] Support collecting metrics from streaming sinks
jerrypeng Mar 22, 2022
53df456
[SPARK-38432][SQL] Refactor framework so as JDBC dialect could compil…
beliefer Mar 22, 2022
99992a4
[SPARK-38579][SQL][WEBUI] Requesting Restful API can cause NullPointe…
yimin-yang Mar 22, 2022
27455ae
[SPARK-38456][SQL] Improve error messages of no viable alternative, e…
anchovYu Mar 22, 2022
7373cd2
[SPARK-38619][TESTS] Clean up Junit api usage in scalatest
LuciferYang Mar 22, 2022
768ab55
[SPARK-38194][FOLLOWUP] Update executor config description for memory…
Kimahriman Mar 22, 2022
c309cd1
[SPARK-38522][SS] Enrich the method contract of iterator in StateStor…
HeartSaVioR Mar 23, 2022
a89d289
[SPARK-38626][SQL] Make condition in DeleteFromTable plan required
aokolnychyi Mar 23, 2022
4e60638
[SPARK-38432][SQL][FOLLOWUP] Supplement test case for overflow and ad…
beliefer Mar 23, 2022
1f4e4c8
[SPARK-32268][SQL] Row-level Runtime Filtering
somani Mar 23, 2022
f73d528
[MINOR] Add @since for DSv2 API
pan3793 Mar 23, 2022
12be81a
[SPARK-38622][BUILD] Upgrade jersey to 2.35
LuciferYang Mar 23, 2022
43487cb
[SPARK-38630][K8S] K8s app name label should start and end with alpha…
dongjoon-hyun Mar 23, 2022
643e8a9
[SPARK-38564][SS][TESTS] Wait all events to arrive in ReportSinkMetri…
HyukjinKwon Mar 23, 2022
ac9ae98
[SPARK-38629][SQL][DOCS] Two links beneath Spark SQL Guide/Data Sourc…
Mar 23, 2022
f327dad
[SPARK-38533][SQL] DS V2 aggregate push-down supports project with alias
beliefer Mar 23, 2022
4817b01
[SPARK-38624][SQL] Reduce UnsafeProjection.create call times when Per…
lw33 Mar 23, 2022
a3776e0
[SPARK-38587][SQL] Validating new location for rename command should …
yaooqinn Mar 23, 2022
861e8b4
[SPARK-38628][SQL] Complete the copy method in subclasses of Internal…
ueshin Mar 23, 2022
4fe55c5
[SPARK-37483][SQL][FOLLOWUP] Rename `pushedTopN` to `PushedTopN` and …
beliefer Mar 23, 2022
c5ebdc6
[SPARK-18621][PYTHON] Make sql type reprs eval-able
crflynn Mar 23, 2022
71f1083
[SPARK-38635][YARN] Remove duplicate log
wangshengjie123 Mar 23, 2022
2eae3db
[SPARK-38613][CORE] Change the exception type thrown by `PushBlockStr…
LuciferYang Mar 23, 2022
39fc7ee
[SPARK-38611][TESTS] Replace `intercept` with `assertThrows` in `Cata…
LuciferYang Mar 23, 2022
7165123
[SPARK-32268][TESTS][FOLLOWUP] Fix `BloomFilterAggregateQuerySuite` f…
LuciferYang Mar 23, 2022
6743aaa
[SPARK-38625][SQL] DataSource V2: Add APIs for group-based row-level …
aokolnychyi Mar 24, 2022
057c051
[SPARK-38631][CORE] Uses Java-based implementation for un-tarring at …
HyukjinKwon Mar 24, 2022
650c774
[SPARK-38585][SQL] Simplify the code of `TreeNode.clone()`
LuciferYang Mar 24, 2022
3858bf0
[SPARK-38063][SQL] Support split_part Function
amaliujia Mar 24, 2022
b902936
[SPARK-38588][ML] Validate input dataset of ml.classification
zhengruifeng Mar 24, 2022
8eb8a42
[SPARK-37568][SQL] Support 2-arguments by the convert_timezone() func…
MaxGekk Mar 24, 2022
e410d98
[SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses int64
beliefer Mar 24, 2022
de960a5
[SPARK-38641][BUILD] Get rid of invalid configuration elements in mvn…
morvenhuang Mar 24, 2022
4c51851
[SPARK-38570][SQL] Incorrect DynamicPartitionPruning caused by Literal
mcdull-zhang Mar 25, 2022
18ff157
[SPARK-38646][PYTHON] Pull a trait out for Python functions
zhenlineo Mar 25, 2022
6d3149a
[SPARK-38643][ML] Validate input dataset of ml.regression
zhengruifeng Mar 25, 2022
53908be
[SPARK-38644][SQL] DS V2 topN push-down supports project with alias
beliefer Mar 25, 2022
b112528
[SPARK-38569][BUILD] Rename `external` top level dir to `connector`
alkis Mar 25, 2022
8ef0159
[SPARK-38654][SQL][PYTHON] Show default index type in SQL plans for p…
HyukjinKwon Mar 25, 2022
9a7596e
[SPARK-37618][CORE] Remove shuffle blocks using the shuffle service f…
Kimahriman Mar 25, 2022
8262a7b
[SPARK-38219][SQL] Support ANSI aggregation function `percentile_cont…
beliefer Mar 25, 2022
4e95738
[SPARK-38336][SQL] Support DEFAULT column values in CREATE/REPLACE TA…
dtenedor Mar 26, 2022
0a4de08
[SPARK-37512][PYTHON][FOLLOWUP] Add test_timedelta_ops to modules
Yikun Mar 27, 2022
eb30a27
[SPARK-38308][SQL] Eagerly iterate over sequence of window expression…
bersprockets Mar 27, 2022
c952b83
[SPARK-38665][BUILD] Upgrade jackson due to CVE-2020-36518
pan3793 Mar 27, 2022
ecfe049
[SPARK-38616][SQL] Keep track of SQL query text in Catalyst TreeNode
gengliangwang Mar 28, 2022
a8629a1
[SPARK-38391][SQL] Datasource v2 supports partial topN push-down
beliefer Mar 28, 2022
c0cb5bc
[SPARK-38623][SQL] Add more comments and tests for HashShuffleSpec
cloud-fan Mar 28, 2022
3ffe4ef
[SPARK-38655][SQL] `OffsetWindowFunctionFrameBase` cannot find the of…
beliefer Mar 28, 2022
6d32d2e
[SPARK-38671][INFRA] Publish snapshot from branch-3.3
wangyum Mar 28, 2022
6560825
[SPARK-38432][SQL][FOLLOWUP] Add test case for push down filter with …
beliefer Mar 28, 2022
9987d17
[SPARK-38257][BUILD] Upgrade `rockdbjni` to 7.0.3
LuciferYang Mar 28, 2022
34a39a2
[SPARK-38678][TESTS] Enable RocksDB tests on Apple Silicon on MacOS
dongjoon-hyun Mar 28, 2022
0562cac
[SPARK-38673][TESTS] Replace Java assert with Junit API in Java UTs
LuciferYang Mar 28, 2022
84bc452
[SPARK-37853][CORE][SQL][FOLLOWUP] Clean up log4j2 deprecation api usage
LuciferYang Mar 28, 2022
2d90659
[SPARK-38680][INFRA] Set upperbound for pandas-stubs in CI
HyukjinKwon Mar 29, 2022
264dbd7
[MINOR][PYTHON] Fix `MultilayerPerceptronClassifierTest.test_raw_and_…
harupy Mar 29, 2022
94abcd7
[SPARK-38656][UI][PYTHON] Show options for Pandas API on Spark in UI
HyukjinKwon Mar 29, 2022
cd222db
[SPARK-38633][SQL] Support push down Cast to JDBC data source V2
beliefer Mar 29, 2022
3e12ec9
[SPARK-38657][UI][SQL] Rename 'SQL' to 'SQL / DataFrame' in SQL UI page
HyukjinKwon Mar 29, 2022
b01d81e
[SPARK-37641][SQL] Support ANSI Aggregate Function: regr_r2
beliefer Mar 29, 2022
ca7200b
[SPARK-38633][SQL][FOLLOWUP] JDBCSQLBuilder should build cast to type…
beliefer Mar 29, 2022
8fab597
[SPARK-38670][SS] Add offset commit time to streaming query listener
jerrypeng Mar 29, 2022
c0c52dd
[SPARK-32268][SQL][FOLLOWUP] Add RewritePredicateSubquery below the I…
ulysses-you Mar 29, 2022
bc22161
[SPARK-38674][SQL] Avoid deduplication if the keys of the HashedRelat…
wangyum Mar 29, 2022
80deb24
[SPARK-38562][K8S][DOCS] Add doc for `Volcano` scheduler
Yikun Mar 29, 2022
42a9114
[SPARK-37982][SQL] Replace the exception by IllegalStateException in …
leesf Mar 29, 2022
a445536
[SPARK-38349][SS] No need to filter events when sessionwindow gapDura…
nyingping Mar 30, 2022
9f6aad4
[SPARK-38676][SQL] Provide SQL query context in runtime error message…
gengliangwang Mar 30, 2022
6b29b28
[SPARK-38696][BUILD] Add `commons-collections` back for hadoop-3 profile
dongjoon-hyun Mar 30, 2022
cab8aa1
[SPARK-38652][K8S] `uploadFileUri` should preserve file scheme
dongjoon-hyun Mar 30, 2022
ef8fb9b
[SPARK-38694][TESTS] Simplify Java UT code with Junit `assertThrows` Api
LuciferYang Mar 30, 2022
26e93f9
[SPARK-38705][SQL] Use function identifier in create and drop functio…
allisonwang-db Mar 31, 2022
60d0921
[SPARK-38706][CORE] Use URI in `FallbackStorage.copy`
williamhyun Mar 31, 2022
e96883d
[SPARK-38698][SQL] Provide query context in runtime error of Divide/D…
gengliangwang Mar 31, 2022
d678ed4
[SPARK-38650][SQL] Better ParseException message for char types witho…
anchovYu Mar 31, 2022
a364cc0
[SPARK-38336][SQL] Support INSERT INTO commands into tables with DEFA…
dtenedor Mar 31, 2022
f79a948
Remove literals from grouping expressions when using the DataFrame API
tanelk Aug 12, 2021
74cb9e9
Merge branch 'SPARK-36496_remove_grouping_literals' of github.com:tan…
tanelk Mar 31, 2022
118 changes: 81 additions & 37 deletions .github/workflows/build_and_test.yml
@@ -23,8 +23,8 @@ on:
push:
branches:
- '**'
- '!branch-*.*'
schedule:
# Note that the scheduled jobs are only for master branch.
# master, Hadoop 2
- cron: '0 1 * * *'
# master
@@ -37,6 +37,12 @@ on:
- cron: '0 13 * * *'
# Java 17
- cron: '0 16 * * *'
workflow_call:
inputs:
ansi_enabled:
required: false
type: boolean
default: false

jobs:
configure-jobs:
@@ -90,21 +96,55 @@ jobs:
echo '::set-output name=hadoop::hadoop3'
else
echo '::set-output name=java::8'
echo '::set-output name=branch::master' # Default branch to run on. CHANGE here when a branch is cut out.
echo '::set-output name=branch::master' # NOTE: UPDATE THIS WHEN CUTTING BRANCH
echo '::set-output name=type::regular'
echo '::set-output name=envs::{}'
echo '::set-output name=envs::{"SPARK_ANSI_SQL_MODE": "${{ inputs.ansi_enabled }}"}'
echo '::set-output name=hadoop::hadoop3'
fi

precondition:
name: Check changes
runs-on: ubuntu-20.04
needs: configure-jobs
env:
GITHUB_PREV_SHA: ${{ github.event.before }}
outputs:
required: ${{ steps.set-outputs.outputs.required }}
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
with:
fetch-depth: 0
repository: apache/spark
ref: ${{ needs.configure-jobs.outputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
echo "APACHE_SPARK_REF=$(git rev-parse HEAD)" >> $GITHUB_ENV
git fetch https://github.com/$GITHUB_REPOSITORY.git ${GITHUB_REF#refs/heads/}
git -c user.name='Apache Spark Test Account' -c user.email='sparktestacc@gmail.com' merge --no-commit --progress --squash FETCH_HEAD
git -c user.name='Apache Spark Test Account' -c user.email='sparktestacc@gmail.com' commit -m "Merged commit"
- name: Check all modules
id: set-outputs
run: |
build=`./dev/is-changed.py -m avro,build,catalyst,core,docker-integration-tests,examples,graphx,hadoop-cloud,hive,hive-thriftserver,kubernetes,kvstore,launcher,mesos,mllib,mllib-local,network-common,network-shuffle,pyspark-core,pyspark-ml,pyspark-mllib,pyspark-pandas,pyspark-pandas-slow,pyspark-resource,pyspark-sql,pyspark-streaming,repl,sketch,spark-ganglia-lgpl,sparkr,sql,sql-kafka-0-10,streaming,streaming-kafka-0-10,streaming-kinesis-asl,tags,unsafe,yarn`
pyspark=`./dev/is-changed.py -m avro,build,catalyst,core,graphx,hive,kvstore,launcher,mllib,mllib-local,network-common,network-shuffle,pyspark-core,pyspark-ml,pyspark-mllib,pyspark-pandas,pyspark-pandas-slow,pyspark-resource,pyspark-sql,pyspark-streaming,repl,sketch,sql,tags,unsafe`
sparkr=`./dev/is-changed.py -m avro,build,catalyst,core,hive,kvstore,launcher,mllib,mllib-local,network-common,network-shuffle,repl,sketch,sparkr,sql,tags,unsafe`
tpcds=`./dev/is-changed.py -m build,catalyst,core,hive,kvstore,launcher,network-common,network-shuffle,repl,sketch,sql,tags,unsafe`
docker=`./dev/is-changed.py -m build,catalyst,core,docker-integration-tests,hive,kvstore,launcher,network-common,network-shuffle,repl,sketch,sql,tags,unsafe`
echo "{\"build\": \"$build\", \"pyspark\": \"$pyspark\", \"sparkr\": \"$sparkr\", \"tpcds\": \"$tpcds\", \"docker\": \"$docker\"}" > required.json
cat required.json
echo "::set-output name=required::$(cat required.json)"

# Build: build Spark and run the tests for specified modules.
build:
name: "Build modules (${{ format('{0}, {1} job', needs.configure-jobs.outputs.branch, needs.configure-jobs.outputs.type) }}): ${{ matrix.modules }} ${{ matrix.comment }} (JDK ${{ matrix.java }}, ${{ matrix.hadoop }}, ${{ matrix.hive }})"
needs: configure-jobs
needs: [configure-jobs, precondition]
# Run scheduled jobs for Apache Spark only
# Run regular jobs for commit in both Apache Spark and forked repository
if: >-
(github.repository == 'apache/spark' && needs.configure-jobs.outputs.type == 'scheduled')
|| needs.configure-jobs.outputs.type == 'regular'
|| (needs.configure-jobs.outputs.type == 'regular' && fromJson(needs.precondition.outputs.required).build == 'true')
# Ubuntu 20.04 is the latest LTS. The next LTS is 22.04.
runs-on: ubuntu-20.04
strategy:
@@ -219,7 +259,7 @@ jobs:
- name: Install Python packages (Python 3.8)
if: (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
run: |
python3.8 -m pip install 'numpy>=1.20.0' 'pyarrow<5.0.0' pandas scipy xmlrunner
python3.8 -m pip install 'numpy>=1.20.0' pyarrow pandas scipy xmlrunner
python3.8 -m pip list
# Run the tests.
- name: Run tests
@@ -243,18 +283,18 @@ jobs:
path: "**/target/unit-tests.log"

pyspark:
needs: configure-jobs
needs: [configure-jobs, precondition]
# Run PySpark coverage scheduled jobs for Apache Spark only
# Run scheduled jobs with JDK 17 in Apache Spark
# Run regular jobs for commit in both Apache Spark and forked repository
if: >-
(github.repository == 'apache/spark' && needs.configure-jobs.outputs.type == 'pyspark-coverage-scheduled')
|| (github.repository == 'apache/spark' && needs.configure-jobs.outputs.type == 'scheduled' && needs.configure-jobs.outputs.java == '17')
|| needs.configure-jobs.outputs.type == 'regular'
|| (needs.configure-jobs.outputs.type == 'regular' && fromJson(needs.precondition.outputs.required).pyspark == 'true')
name: "Build modules (${{ format('{0}, {1} job', needs.configure-jobs.outputs.branch, needs.configure-jobs.outputs.type) }}): ${{ matrix.modules }}"
runs-on: ubuntu-20.04
container:
image: dongjoon/apache-spark-github-action-image:20211228
image: dongjoon/apache-spark-github-action-image:20220207
strategy:
fail-fast: false
matrix:
@@ -278,14 +318,15 @@ jobs:
SKIP_UNIDOC: true
SKIP_MIMA: true
METASPACE_SIZE: 1g
SPARK_ANSI_SQL_MODE: ${{ inputs.ansi_enabled }}
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
# In order to fetch changed files
with:
fetch-depth: 0
repository: apache/spark
ref: master
ref: ${{ needs.configure-jobs.outputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
@@ -351,28 +392,29 @@ jobs:
path: "**/target/unit-tests.log"

sparkr:
needs: configure-jobs
needs: [configure-jobs, precondition]
if: >-
needs.configure-jobs.outputs.type == 'regular'
(needs.configure-jobs.outputs.type == 'regular' && fromJson(needs.precondition.outputs.required).sparkr == 'true')
|| (github.repository == 'apache/spark' && needs.configure-jobs.outputs.type == 'scheduled' && needs.configure-jobs.outputs.java == '17')
name: "Build modules: sparkr"
runs-on: ubuntu-20.04
container:
image: dongjoon/apache-spark-github-action-image:20211228
image: dongjoon/apache-spark-github-action-image:20220207
env:
HADOOP_PROFILE: ${{ needs.configure-jobs.outputs.hadoop }}
HIVE_PROFILE: hive2.3
GITHUB_PREV_SHA: ${{ github.event.before }}
SPARK_LOCAL_IP: localhost
SKIP_MIMA: true
SPARK_ANSI_SQL_MODE: ${{ inputs.ansi_enabled }}
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
# In order to fetch changed files
with:
fetch-depth: 0
repository: apache/spark
ref: master
ref: ${{ needs.configure-jobs.outputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
@@ -429,14 +471,14 @@ jobs:
PYSPARK_DRIVER_PYTHON: python3.9
PYSPARK_PYTHON: python3.9
container:
image: dongjoon/apache-spark-github-action-image:20211228
image: dongjoon/apache-spark-github-action-image:20220207
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
with:
fetch-depth: 0
repository: apache/spark
ref: master
ref: ${{ needs.configure-jobs.outputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
@@ -475,10 +517,8 @@ jobs:
# See also https://github.com/sphinx-doc/sphinx/issues/7551.
# Jinja2 3.0.0+ causes error when building with Sphinx.
# See also https://issues.apache.org/jira/browse/SPARK-35375.
python3.9 -m pip install 'flake8==3.9.0' pydata_sphinx_theme 'mypy==0.920' numpydoc 'jinja2<3.0.0' 'black==21.12b0'
python3.9 -m pip install pandas-stubs
# TODO Update to PyPI
python3.9 -m pip install git+https://github.com/typeddjango/pytest-mypy-plugins.git@b0020061f48e85743ee3335bd62a3a608d17c6bd
python3.9 -m pip install 'flake8==3.9.0' pydata_sphinx_theme 'mypy==0.920' 'pytest-mypy-plugins==1.9.3' numpydoc 'jinja2<3.0.0' 'black==21.12b0'
python3.9 -m pip install 'pandas-stubs==1.2.0.53'
- name: Install R linter dependencies and SparkR
run: |
apt-get install -y libcurl4-openssl-dev libgit2-dev libssl-dev libxml2-dev
@@ -498,11 +538,14 @@ jobs:
# See also https://github.com/sphinx-doc/sphinx/issues/7551.
# Jinja2 3.0.0+ causes error when building with Sphinx.
# See also https://issues.apache.org/jira/browse/SPARK-35375.
python3.9 -m pip install 'sphinx<3.1.0' mkdocs pydata_sphinx_theme ipython nbsphinx numpydoc 'jinja2<3.0.0'
python3.9 -m pip install sphinx_plotly_directive 'numpy>=1.20.0' 'pyarrow<5.0.0' pandas 'plotly>=4.8'
# Pin the MarkupSafe to 2.0.1 to resolve the CI error.
# See also https://issues.apache.org/jira/browse/SPARK-38279.
python3.9 -m pip install 'sphinx<3.1.0' mkdocs pydata_sphinx_theme ipython nbsphinx numpydoc 'jinja2<3.0.0' 'markupsafe==2.0.1'
python3.9 -m pip install ipython_genutils # See SPARK-38517
python3.9 -m pip install sphinx_plotly_directive 'numpy>=1.20.0' pyarrow pandas 'plotly>=4.8'
apt-get update -y
apt-get install -y ruby ruby-dev
Rscript -e "install.packages(c('devtools', 'testthat', 'knitr', 'rmarkdown', 'roxygen2'), repos='https://cloud.r-project.org/')"
Rscript -e "install.packages(c('devtools', 'testthat', 'knitr', 'rmarkdown', 'markdown', 'e1071', 'roxygen2'), repos='https://cloud.r-project.org/')"
Rscript -e "devtools::install_version('pkgdown', version='2.0.1', repos='https://cloud.r-project.org')"
Rscript -e "devtools::install_version('preferably', version='0.4', repos='https://cloud.r-project.org')"
gem install bundler
@@ -532,8 +575,8 @@ jobs:
bundle exec jekyll build

java-11-17:
needs: configure-jobs
if: needs.configure-jobs.outputs.type == 'regular'
needs: [configure-jobs, precondition]
if: needs.configure-jobs.outputs.type == 'regular' && fromJson(needs.precondition.outputs.required).build == 'true'
name: Java ${{ matrix.java }} build with Maven
strategy:
fail-fast: false
@@ -548,7 +591,7 @@ jobs:
with:
fetch-depth: 0
repository: apache/spark
ref: master
ref: ${{ needs.configure-jobs.outputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
@@ -583,12 +626,12 @@ jobs:
export MAVEN_CLI_OPTS="--no-transfer-progress"
export JAVA_VERSION=${{ matrix.java }}
# It uses Maven's 'install' intentionally, see https://github.com/apache/spark/pull/26414.
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Djava.version=${JAVA_VERSION/-ea} install
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Djava.version=${JAVA_VERSION/-ea} install
rm -rf ~/.m2/repository/org/apache/spark

scala-213:
needs: configure-jobs
if: needs.configure-jobs.outputs.type == 'regular'
needs: [configure-jobs, precondition]
if: needs.configure-jobs.outputs.type == 'regular' && fromJson(needs.precondition.outputs.required).build == 'true'
name: Scala 2.13 build with SBT
runs-on: ubuntu-20.04
steps:
@@ -597,7 +640,7 @@ jobs:
with:
fetch-depth: 0
repository: apache/spark
ref: master
ref: ${{ needs.configure-jobs.outputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
@@ -629,22 +672,23 @@ jobs:
- name: Build with SBT
run: |
./dev/change-scala-version.sh 2.13
./build/sbt -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pdocker-integration-tests -Pkubernetes-integration-tests -Pspark-ganglia-lgpl -Pscala-2.13 compile test:compile
./build/sbt -Pyarn -Pmesos -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pdocker-integration-tests -Pkubernetes-integration-tests -Pspark-ganglia-lgpl -Pscala-2.13 compile test:compile

tpcds-1g:
needs: configure-jobs
if: needs.configure-jobs.outputs.type == 'regular'
needs: [configure-jobs, precondition]
if: needs.configure-jobs.outputs.type == 'regular' && fromJson(needs.precondition.outputs.required).tpcds == 'true'
name: Run TPC-DS queries with SF=1
runs-on: ubuntu-20.04
env:
SPARK_LOCAL_IP: localhost
SPARK_ANSI_SQL_MODE: ${{ inputs.ansi_enabled }}
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
with:
fetch-depth: 0
repository: apache/spark
ref: master
ref: ${{ needs.configure-jobs.outputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
@@ -726,8 +770,8 @@ jobs:
path: "**/target/unit-tests.log"

docker-integration-tests:
needs: configure-jobs
if: needs.configure-jobs.outputs.type == 'regular'
needs: [configure-jobs, precondition]
if: needs.configure-jobs.outputs.type == 'regular' && fromJson(needs.precondition.outputs.required).docker == 'true'
name: Run Docker integration tests
runs-on: ubuntu-20.04
env:
@@ -743,7 +787,7 @@ jobs:
with:
fetch-depth: 0
repository: apache/spark
ref: master
ref: ${{ needs.configure-jobs.outputs.branch }}
- name: Sync the current branch with the latest in Apache Spark
if: github.repository != 'apache/spark'
run: |
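Taken together, the changes to `build_and_test.yml` above do three things: they let the workflow be invoked as a reusable workflow (`workflow_call`) with an `ansi_enabled` flag, they replace every hardcoded `ref: master` checkout with `ref: ${{ needs.configure-jobs.outputs.branch }}` so release branches can reuse the same file, and they add a `precondition` job that runs `./dev/is-changed.py` once per test group and publishes the results as a JSON job output. Downstream jobs parse that output with `fromJson` in their `if:` conditions, so a push that touches only, say, SparkR code no longer pays for the Docker or TPC-DS jobs. Below is a minimal, self-contained sketch of that change-detection pattern; the module lists are trimmed for brevity and the final `echo` step stands in for a real test run (the full lists are in the diff above):

jobs:
  precondition:
    runs-on: ubuntu-20.04
    outputs:
      required: ${{ steps.set-outputs.outputs.required }}
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0  # is-changed.py diffs against history, so fetch it all
      - name: Check all modules
        id: set-outputs
        run: |
          # Each variable holds the string 'true' or 'false' depending on
          # whether any file under the listed modules changed.
          build=`./dev/is-changed.py -m build,catalyst,core,sql`
          pyspark=`./dev/is-changed.py -m pyspark-core,pyspark-sql,pyspark-pandas`
          echo "::set-output name=required::{\"build\": \"$build\", \"pyspark\": \"$pyspark\"}"

  build:
    needs: precondition
    # Skip the whole job when nothing it tests has changed.
    if: fromJson(needs.precondition.outputs.required).build == 'true'
    runs-on: ubuntu-20.04
    steps:
      - run: echo "build and test the changed modules here"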
@@ -15,26 +15,20 @@
 # KIND, either express or implied. See the License for the
 # specific language governing permissions and limitations
 # under the License.
-#
-
-from typing import Dict, List
-
-from pyspark.sql.types import Row, StructType
+name: "Build and test (ANSI)"
 
-from numpy import ndarray
+on:
+  push:
+    branches:
+      - '**'
 
-class _ImageSchema:
-    def __init__(self) -> None: ...
-    @property
-    def imageSchema(self) -> StructType: ...
-    @property
-    def ocvTypes(self) -> Dict[str, int]: ...
-    @property
-    def columnSchema(self) -> StructType: ...
-    @property
-    def imageFields(self) -> List[str]: ...
-    @property
-    def undefinedImageType(self) -> str: ...
-    def toNDArray(self, image: Row) -> ndarray: ...
-    def toImage(self, array: ndarray, origin: str = ...) -> Row: ...
+jobs:
+  call-build-and-test:
+    name: Call main build
+    uses: ./.github/workflows/build_and_test.yml
+    if: github.repository == 'apache/spark'
+    with:
+      ansi_enabled: true
 
-ImageSchema: _ImageSchema
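This last hunk is really two changes that the diff view has paired up, most likely because both files begin with the same Apache license header: the standalone type stubs for `pyspark.ml.image` are deleted (stub removals in this sync accompany the inline-type-hints commits listed above), and a new ANSI wrapper workflow is added. The wrapper is deliberately thin: it fires on every push, limits itself to the `apache/spark` repository, and invokes `build_and_test.yml` as a reusable workflow with `ansi_enabled: true`; the callee declares that input under `workflow_call` and forwards it to the test jobs as the `SPARK_ANSI_SQL_MODE` environment variable, so one workflow definition serves both ANSI and non-ANSI runs. A condensed sketch of the two halves, keeping only keys that appear in this diff:

# Caller: the new ANSI wrapper workflow.
name: "Build and test (ANSI)"
on:
  push:
    branches:
      - '**'
jobs:
  call-build-and-test:
    name: Call main build
    uses: ./.github/workflows/build_and_test.yml
    if: github.repository == 'apache/spark'
    with:
      ansi_enabled: true

# Callee: build_and_test.yml declares the input and threads it through.
on:
  workflow_call:
    inputs:
      ansi_enabled:
        required: false
        type: boolean
        default: false
jobs:
  pyspark:
    env:
      SPARK_ANSI_SQL_MODE: ${{ inputs.ansi_enabled }}  # the tests pick this up from the environment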