[SPARK-33813][SQL][3.0] Fix the issue that JDBC source can't treat MS SQL Server's spatial types #31289

Closed
wants to merge 1,280 commits into from
1280 commits
4656ee5
[SPARK-31511][FOLLOW-UP][TEST][SQL] Make BytesToBytesMap iterators th…
cxzl25 Sep 8, 2020
9b39e4b
[SPARK-32753][SQL][3.0] Only copy tags to node with no tags
manuzhang Sep 8, 2020
8c0b9cb
[SPARK-32815][ML][3.0] Fix LibSVM data source loading error on file p…
MaxGekk Sep 8, 2020
3f20f14
[SPARK-32638][SQL][3.0] Corrects references when adding aliases in Wi…
cloud-fan Sep 8, 2020
e86d90b
[SPARK-32824][CORE] Improve the error message when the user forgets t…
tgravescs Sep 9, 2020
86b9dd9
[SPARK-32823][WEB UI] Fix the master ui resources reporting
tgravescs Sep 9, 2020
4c0f9d8
[SPARK-32813][SQL] Get default config of ParquetSource vectorized rea…
viirya Sep 9, 2020
837843b
[SPARK-32810][SQL][TESTS][FOLLOWUP][3.0] Check path globbing in JSON/…
MaxGekk Sep 9, 2020
e632e7c
[SPARK-32794][SS] Fixed rare corner case error in micro-batch engine …
tdas Sep 9, 2020
5a81f60
[SPARK-32836][SS][TESTS] Fix DataStreamReaderWriterSuite to check wri…
dongjoon-hyun Sep 10, 2020
44acb5a
[SPARK-32832][SS] Use CaseInsensitiveMap for DataStreamReader/Writer …
dongjoon-hyun Sep 10, 2020
5708045
[SPARK-32819][SQL][3.0] ignoreNullability parameter should be effecti…
viirya Sep 10, 2020
4fdd818
[SPARK-32840][SQL][3.0] Invalid interval value can happen to be just …
yaooqinn Sep 11, 2020
cf14897
[SPARK-32677][SQL][DOCS][MINOR] Improve code comment in CreateFunctio…
cloud-fan Sep 11, 2020
ec45d10
[SPARK-32845][SS][TESTS] Add sinkParameter to check sink options robu…
dongjoon-hyun Sep 11, 2020
2e04689
[SPARK-32779][SQL][FOLLOW-UP] Delete Unused code
sandeep-katta Sep 12, 2020
d4d2f5c
[SPARK-32865][DOC] python section in quickstart page doesn't display …
bowenli86 Sep 13, 2020
828603d
[SPARK-32876][SQL] Change default fallback versions to 3.0.1 and 2.4.…
HyukjinKwon Sep 14, 2020
990d49a
[SPARK-32872][CORE] Prevent BytesToBytesMap at MAX_CAPACITY from exce…
ankurdave Sep 14, 2020
fe6ff15
[SPARK-32715][CORE] Fix memory leak when failed to store pieces of br…
LantaoJin Sep 15, 2020
cb6a0d0
[SPARK-32688][SQL][TEST] Add special values to LiteralGenerator for f…
tanelk Sep 16, 2020
75a225e
[SPARK-32888][DOCS] Add user document about header flag and RDD as pa…
viirya Sep 16, 2020
aa9563e
[SPARK-32897][PYTHON] Don't show a deprecation warning at SparkSessio…
HyukjinKwon Sep 16, 2020
2e94d9a
[SPARK-32900][CORE] Allow UnsafeExternalSorter to spill when there ar…
tomvanbussel Sep 17, 2020
b3b6f38
[SPARK-32887][DOC] Correct the typo for SHOW TABLE
Udbhav30 Sep 17, 2020
17a5195
[SPARK-32738][CORE][3.0] Should reduce the number of active threads i…
wzhfy Sep 17, 2020
ecc2f5d
[SPARK-32635][SQL] Fix foldable propagation
peter-toth Sep 17, 2020
5581a92
[SPARK-32908][SQL] Fix target error calculation in `percentile_approx()`
MaxGekk Sep 18, 2020
2d55de5
[SPARK-32906][SQL] Struct field names should not change after normali…
maropu Sep 18, 2020
ffcd757
[SPARK-32905][CORE][YARN] ApplicationMaster fails to receive UpdateDe…
yaooqinn Sep 18, 2020
20cd7bb
[SPARK-32930][CORE] Replace deprecated isFile/isDirectory methods
williamhyun Sep 18, 2020
7746c20
[SPARK-32635][SQL][FOLLOW-UP] Add a new test case in catalyst module
peter-toth Sep 18, 2020
03fb144
[SPARK-32898][CORE] Fix wrong executorRunTime when task killed before…
Ngone51 Sep 18, 2020
0a4b668
[SPARK-32886][WEBUI] fix 'undefined' link in event timeline view
zhli1142015 Sep 21, 2020
b27bbbb
[SPARK-32718][SQL][3.0] Remove unnecessary keywords for interval units
cloud-fan Sep 21, 2020
8a481d8
[SPARK-32659][SQL][FOLLOWUP][3.0] Broadcast Array instead of Set in I…
cloud-fan Sep 22, 2020
58124bd
[MINOR][SQL][3.0] Improve examples for `percentile_approx()`
MaxGekk Sep 23, 2020
542dc97
[SPARK-32306][SQL][DOCS][3.0] Clarify the result of `percentile_appro…
MaxGekk Sep 23, 2020
21b6b69
[SPARK-32977][SQL][DOCS] Fix JavaDoc on Default Save Mode
RussellSpitzer Sep 24, 2020
4b84e57
[SPARK-32877][SQL][TEST] Add test for Hive UDF complex decimal type
ulysses-you Sep 25, 2020
4425c3a
[SPARK-32999][SQL] Use Utils.getSimpleName to avoid hitting Malformed…
rednaxelafx Sep 26, 2020
424f16e
[SPARK-33015][SQL] Compute the current date only once
MaxGekk Sep 29, 2020
118de10
[MINOR][DOCS] Document when `current_date` and `current_timestamp` ar…
MaxGekk Sep 29, 2020
97d8634
[SPARK-33021][PYTHON][TESTS] Move functions related test cases into t…
HyukjinKwon Sep 29, 2020
2160dc5
[SPARK-33015][SQL][FOLLOWUP][3.0] Use millisToDays() in the ComputeCu…
MaxGekk Sep 29, 2020
d3cc564
[SPARK-32901][CORE] Do not allocate memory while spilling UnsafeExter…
tomvanbussel Sep 29, 2020
39bfae2
[MINOR][DOCS] Fixing log message for better clarity
akshatb1 Sep 29, 2020
ae8b35a
[SPARK-33018][SQL] Fix estimate statistics issue if child has 0 bytes
wangyum Sep 29, 2020
f3b80f8
[SPARK-33019][CORE] Use spark.hadoop.mapreduce.fileoutputcommitter.al…
dongjoon-hyun Sep 29, 2020
db6ba04
[SPARK-31753][SQL][DOCS][FOLLOW-UP] Add missing keywords in the SQL docs
GuoPhilipse Sep 30, 2020
bc29602
[SQL][DOC][MINOR] Corrects input table names in the examples of CREAT…
iRakson Oct 1, 2020
41e1919
[SPARK-32996][WEB-UI][3.0] Handle empty ExecutorMetrics in ExecutorMe…
shrutig Oct 2, 2020
31684d6
[SPARK-33051][INFRA][R] Uses setup-r to install R in GitHub Actions b…
HyukjinKwon Oct 2, 2020
c9b6271
[SPARK-33043][ML] Handle spark.driver.maxResultSize=0 in RowMatrix he…
srowen Oct 3, 2020
75003fc
[SPARK-33065][TESTS] Expand the stack size of a thread in a test in L…
sarutak Oct 4, 2020
46a62ca
[SPARK-33069][INFRA] Skip test result report if no JUnit XML files ar…
HyukjinKwon Oct 6, 2020
4f71231
[SPARK-33073][PYTHON] Improve error handling on Pandas to Arrow conve…
BryanCutler Oct 6, 2020
d51b8d6
[SPARK-27428][CORE][TEST] Increase receive buffer size used in Statsd…
mundaym Oct 6, 2020
2076abc
Revert "[SPARK-33073][PYTHON] Improve error handling on Pandas to Arr…
HyukjinKwon Oct 7, 2020
23207fc
[SPARK-33035][SQL][3.0] Updates the obsoleted entries of attribute ma…
maropu Oct 7, 2020
7981f67
[SPARK-33073][PYTHON][3.0] Improve error handling on Pandas to Arrow …
BryanCutler Oct 7, 2020
45475af
[SPARK-32067][K8S] Use unique ConfigMap name for executor pod template
stijndehaes Oct 7, 2020
a7e4318
[SPARK-33089][SQL] make avro format propagate Hadoop config from DS o…
yuningzh-db Oct 8, 2020
782ab8e
[SPARK-33091][SQL] Avoid using map instead of foreach to avoid potent…
HyukjinKwon Oct 8, 2020
c1b660e
[SPARK-33096][K8S] Use LinkedHashMap instead of Map for newlyCreatedE…
dongjoon-hyun Oct 8, 2020
dcffa56
[SPARK-33101][ML][3.0] Make LibSVM format propagate Hadoop config fro…
MaxGekk Oct 9, 2020
9892b3e
[SPARK-33094][SQL][3.0] Make ORC format propagate Hadoop config from …
MaxGekk Oct 9, 2020
0601fc7
[SPARK-33118][SQL] CREATE TEMPORARY TABLE fails with location
pablolanga-stratio Oct 12, 2020
9430ae6
[SPARK-33115][BUILD][DOCS] Fix javadoc errors in `kvstore` and `unsaf…
gemelen Oct 13, 2020
205b65e
[SPARK-33134][SQL][3.0] Return partial results only for root JSON obj…
MaxGekk Oct 14, 2020
2ebea13
[SPARK-33136][SQL] Fix mistakenly swapped parameter in V2WriteCommand…
HeartSaVioR Oct 14, 2020
d9669bd
[SPARK-33146][CORE] Check for non-fatal errors when loading new appli…
Oct 15, 2020
0b7b811
[SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVers…
dongjoon-hyun Oct 15, 2020
e40c147
Revert "[SPARK-33146][CORE] Check for non-fatal errors when loading n…
dongjoon-hyun Oct 15, 2020
d0f1120
[SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.le…
MaxGekk Oct 16, 2020
160f458
[SPARK-33165][SQL][TEST] Remove dependencies(scalatest,scalactic) fro…
maropu Oct 16, 2020
37d6b3c
[SPARK-32761][SQL][3.0] Allow aggregating multiple foldable distinct …
linhongliu-db Oct 16, 2020
698ac6a
[SPARK-33165][SQL][TESTS][FOLLOW-UP] Use scala.Predef.assert instead
HyukjinKwon Oct 16, 2020
b66bd79
[SPARK-33171][INFRA] Mark ParquetV*FilterSuite/ParquetV*SchemaPruning…
dongjoon-hyun Oct 16, 2020
1bec8a3
[SPARK-32436][CORE] Initialize numNonEmptyBlocks in HighlyCompressedM…
dongjoon-hyun Jul 25, 2020
fab10f0
[SPARK-33131][SQL][3.0] Fix grouping sets with having clause can not …
ulysses-you Oct 17, 2020
56a60ca
[SPARK-33170][SQL] Add SQL config to control fast-fail behavior in Fi…
viirya Oct 18, 2020
7e65b12
[MINOR][DOCS][EXAMPLE] Fix the Python manual_load_options_csv example
kjmrknsn Oct 18, 2020
05fbbb1
[SPARK-33176][K8S] Use 11-jre-slim as default in K8s Dockerfile
dongjoon-hyun Oct 18, 2020
0bff1f6
[SPARK-33123][INFRA] Ignore GitHub only changes in Amplab Jenkins build
williamhyun Oct 19, 2020
15ed312
[SPARK-32557][CORE] Logging and swallowing the exception per entry in…
yanxiaole Aug 9, 2020
02f80cf
Revert "Revert "[SPARK-33146][CORE] Check for non-fatal errors when l…
HeartSaVioR Oct 15, 2020
b1d5a08
Revert "[SPARK-33069][INFRA] Skip test result report if no JUnit XML …
HyukjinKwon Oct 19, 2020
c3af7c6
[SPARK-33181][SQL][DOCS] Document Load Table Directly from File in SQ…
liaoaoyuan97 Oct 20, 2020
3b5b533
[SPARK-33190][INFRA][TESTS] Set upper bound of PyArrow version in Git…
HyukjinKwon Oct 20, 2020
4373c71
[MINOR][DOCS] Fix the description about to_avro and from_avro functions
kjmrknsn Oct 20, 2020
5e33155
[SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested t…
BryanCutler Oct 21, 2020
a36b3c4
[SPARK-32785][SQL][DOCS][FOLLOWUP][3.0] Update migration guide for in…
yaooqinn Oct 21, 2020
e31fe6c
[SPARK-33189][FOLLOWUP][3.0] Fix syntax error in python/run-tests.py
dongjoon-hyun Oct 22, 2020
933dc6c
[SPARK-32247][INFRA] Install and test scipy with PyPy in GitHub Actions
HyukjinKwon Oct 15, 2020
f7c7f4f
[SPARK-30821][K8S] Handle executor failure with multiple containers
huskysun Oct 24, 2020
80716d1
[SPARK-33228][SQL] Don't uncache data when replacing a view having th…
maropu Oct 25, 2020
590ccb3
[SPARK-33197][SQL] Make changes to spark.sql.analyzer.maxIterations t…
yuningzh-db Oct 26, 2020
22392be
[SPARK-33230][SQL] Hadoop committers to get unique job ID in "spark.s…
steveloughran Oct 26, 2020
c95d925
[SPARK-33260][SQL] Fix incorrect results from SortExec when sortOrder…
ankurdave Oct 27, 2020
e37859a
[SPARK-33246][SQL][DOCS] Correct documentation for null semantics of …
Oct 27, 2020
737a850
[SPARK-32090][SQL] Improve UserDefinedType.equal() to make it be symm…
Ngone51 Jun 29, 2020
ba2a113
[SPARK-33264][SQL][DOCS] Add a dedicated page for SQL-on-file in SQL …
maropu Oct 28, 2020
f6c72e6
[SPARK-33208][SQL] Update the document of SparkSession#sql
waitinfuture Oct 28, 2020
3ce335d
[SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values co…
HeartSaVioR Oct 28, 2020
f5dc06e
[SPARK-32119][CORE][3.0] ExecutorPlugin doesn't work with Standalone …
sarutak Oct 28, 2020
f03bca8
[SQL][MINOR] Update from_unixtime doc
Obbay2 Oct 29, 2020
563a678
[SPARK-33292][SQL] Make Literal ArrayBasedMapData string representati…
dongjoon-hyun Oct 30, 2020
8f57603
[SPARK-33268][SQL][PYTHON][3.0] Fix bugs for casting data from/to Pyt…
maropu Oct 30, 2020
83f259f
[SPARK-33183][SQL][3.0] Fix Optimizer rule EliminateSorts and add a p…
allisonwang-db Oct 30, 2020
fc10531
[SPARK-33290][SQL] REFRESH TABLE should invalidate cache even though …
sunchao Oct 31, 2020
49e9575
[SPARK-33306][SQL] Timezone is needed when cast date to string
WangGuangxin Oct 31, 2020
92ba08d
[SPARK-33277][PYSPARK][SQL][3.0] Use ContextAwareIterator to stop con…
ueshin Nov 2, 2020
131179a
[SPARK-33313][TESTS][R][3.0][2.4] Add testthat 3.x support
HyukjinKwon Nov 2, 2020
71ef48e
[SPARK-33156][INFRA][3.0] Upgrade GithubAction image from 18.04 to 20.04
dongjoon-hyun Nov 3, 2020
d99ff20
[SPARK-24266][K8S][3.0] Restart the watcher when we receive a version…
stijndehaes Nov 3, 2020
55105a0
[SPARK-33284][WEB-UI] In the Storage UI page, clicking any field to s…
echohlne Nov 3, 2020
5dd36f3
[SPARK-33333][BUILD][3.0] Upgrade Jetty to 9.4.28.v20200408
dongjoon-hyun Nov 4, 2020
e7a6211
[SPARK-33338][SQL] GROUP BY using literal map should not fail
dongjoon-hyun Nov 4, 2020
b43572e
[SPARK-33162][INFRA][3.0] Use pre-built image at GitHub Action PySpar…
dongjoon-hyun Nov 5, 2020
14eb8b1
[SPARK-33239][INFRA][3.0] Use pre-built image at GitHub Action SparkR…
dongjoon-hyun Nov 5, 2020
74d8eac
Revert "[SPARK-33277][PYSPARK][SQL][3.0] Use ContextAwareIterator to …
HyukjinKwon Nov 5, 2020
c43231c
[MINOR][SS][DOCS] Update join type in stream static joins code examples
sarveshdave1 Nov 5, 2020
6da60bf
[SPARK-33362][SQL] skipSchemaResolution should still require query to…
cloud-fan Nov 5, 2020
3223e3e
[SPARK-32860][DOCS][SQL] Updating documentation about map support in …
Nov 8, 2020
808dd8f
[SPARK-33371][PYTHON][3.0] Update setup.py and tests for Python 3.9
HyukjinKwon Nov 9, 2020
c157fa3
[SPARK-33372][SQL] Fix InSet bucket pruning
wangyum Nov 9, 2020
a418495
[SPARK-33397][YARN][DOC] Fix generating md to html for available-patt…
yaooqinn Nov 10, 2020
1aa8f4f
[SPARK-33405][BUILD][3.0] Upgrade commons-compress to 1.20
dongjoon-hyun Nov 10, 2020
b905d65
[SPARK-33391][SQL] element_at with CreateArray not respect one based …
leanken-zz Nov 10, 2020
4a1c143
[SPARK-33339][PYTHON] Pyspark application will hang due to non Except…
Nov 10, 2020
577dbb9
[SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TP…
maropu Nov 11, 2020
1e2984b
[SPARK-33412][SQL][3.0] OverwriteByExpression should resolve its dele…
cloud-fan Nov 11, 2020
3edec10
[SPARK-33402][CORE] Jobs launched in same second have duplicate MapRe…
steveloughran Nov 11, 2020
00be83a
[SPARK-33404][SQL][3.0] Fix incorrect results in `date_trunc` expression
utkarsh39 Nov 11, 2020
2eadedc
[SPARK-33408][K8S][R][3.0] Use R 3.6.3 in K8s R image
dongjoon-hyun Nov 12, 2020
5ee76e6
[MINOR][DOC] spark.executor.memoryOverhead is not cluster-mode only
yaooqinn Nov 12, 2020
e684720
[SPARK-33435][SQL][3.0] DSv2: REFRESH TABLE should invalidate caches …
sunchao Nov 13, 2020
921daa8
[SPARK-33439][INFRA] Use SERIAL_SBT_TESTS=1 for SQL modules
dongjoon-hyun Nov 13, 2020
45bdb58
[SPARK-33358][SQL] Return code when command process failed
artiship Nov 16, 2020
265363d
[SPARK-33451][DOCS] Change to 'spark.sql.adaptive.skewJoin.skewedPart…
Southwest16 Nov 16, 2020
26c0404
[MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, extern…
jsoref Nov 17, 2020
c301d9c
[SPARK-33464][INFRA][3.0] Add/remove (un)necessary cache and restruct…
HyukjinKwon Nov 19, 2020
1101938
[SPARK-27421][SQL] Fix filter for int column and value class java.lan…
wangyum Nov 19, 2020
6b7172b
[SPARK-33483][INFRA][TESTS][3.0] Fix rat exclusion patterns and add a…
dongjoon-hyun Nov 19, 2020
d7c2dae
[SPARK-33422][DOC] Fix the correct display of left menu item
liucht-inspur Nov 20, 2020
1e525c1
[SPARK-33472][SQL][3.0] Adjust RemoveRedundantSorts rule order
allisonwang-db Nov 20, 2020
b70584f
[MINOR][INFRA] Suppress warning in check-license
williamhyun Nov 23, 2020
200417e
[SPARK-33524][SQL][TESTS] Change `InMemoryTable` not to use Tuple.has…
dongjoon-hyun Nov 24, 2020
efae8b6
[SPARK-33535][INFRA][TESTS] Export LANG to en_US.UTF-8 in run-tests-j…
LuciferYang Nov 24, 2020
8eedc41
[SPARK-33565][PYTHON][BUILD][3.0] Remove py38 spark3
shaneknapp Nov 25, 2020
7503c4a
[SPARK-33565][INFRA][FOLLOW-UP][3.0] Keep the test coverage with Pyth…
HyukjinKwon Nov 26, 2020
f67f80b
[SPARK-33585][SQL][DOCS] Fix the comment for `SQLContext.tables()` an…
MaxGekk Nov 29, 2020
f6638cf
[SPARK-33579][UI] Fix executor blank page behind proxy
Nov 30, 2020
03291c8
[SPARK-33588][SQL][3.0] Respect the `spark.sql.caseSensitive` config …
MaxGekk Nov 30, 2020
242581f
[SPARK-33440][CORE] Use current timestamp with warning log in HadoopF…
HeartSaVioR Nov 30, 2020
6abfeb6
[SPARK-33611][UI] Avoid encoding twice on the query parameter of rewr…
gengliangwang Dec 1, 2020
e59179b
[SPARK-33504][CORE] The application log in the Spark history server c…
echohlne Dec 2, 2020
3fb9f6f
Revert "[SPARK-33504][CORE] The application log in the Spark history …
tgravescs Dec 2, 2020
6f4587a
[SPARK-33631][DOCS][TEST] Clean up spark.core.connection.ack.wait.tim…
LuciferYang Dec 2, 2020
13ca88c
[SPARK-33636][PYTHON][ML][3.0] Add labelsArray to PySpark StringIndexer
viirya Dec 3, 2020
c4318a1
[SPARK-33629][PYTHON] Make spark.buffer.size configuration visible on…
gaborgsomogyi Dec 3, 2020
6121c8f
[SPARK-33660][DOCS][SS] Fix Kafka Headers Documentation
Gschiavon Dec 4, 2020
8743571
[SPARK-33571][SQL][DOCS][3.0] Add a ref to INT96 config from the doc …
MaxGekk Dec 4, 2020
66b1bdb
[MINOR] Fix string interpolation in CommandUtils.scala and KafkaDataC…
imback82 Dec 6, 2020
a11a07a
[SPARK-33667][SQL][3.0] Respect the `spark.sql.caseSensitive` config …
MaxGekk Dec 6, 2020
8029d66
[SPARK-33675][INFRA][3.0] Add GitHub Action job to publish snapshot
dongjoon-hyun Dec 7, 2020
313a460
[SPARK-33681][K8S][TESTS][3.0] Increase K8s IT timeout to 3 minutes
dongjoon-hyun Dec 7, 2020
1bb37dc
[SPARK-33675][INFRA][3.0][FOLLOWUP] Set GIT_REF to branch-3.0
dongjoon-hyun Dec 7, 2020
8acbe5b
[SPARK-33592][ML][PYTHON][3.0] Backport Fix: Pyspark ML Validator par…
WeichenXu123 Dec 7, 2020
9555658
[SPARK-33670][SQL][3.0] Verify the partition provider is Hive in v1 S…
MaxGekk Dec 7, 2020
46a0ec5
[SPARK-32680][SQL][3.0] Don't Preprocess V2 CTAS with Unresolved Query
linhongliu-db Dec 8, 2020
ea7c2a1
[SPARK-33677][SQL] Skip LikeSimplification rule if pattern contains a…
luluorta Dec 8, 2020
eae6a3e
[SPARK-32110][SQL] normalize special floating numbers in HyperLogLog++
cloud-fan Dec 8, 2020
a4c5e54
[SPARK-33727][K8S] Fall back from gnupg.net to openpgp.org
holdenk Dec 10, 2020
2921c4e
[SPARK-33725][BUILD][3.0] Upgrade snappy-java to 1.1.8.2
viirya Dec 10, 2020
83af036
[SPARK-33732][K8S][TESTS][3.0] Kubernetes integration tests doesn't w…
sarutak Dec 10, 2020
728bdb7
[SPARK-33749][BUILD][PYTHON] Exclude target directory in pycodestyle …
HyukjinKwon Dec 11, 2020
9439e11
[SPARK-33740][SQL][3.0] hadoop configs in hive-site.xml can overrides…
yaooqinn Dec 11, 2020
2534165
[SPARK-33757][INFRA][R] Fix the R dependencies build error on GitHub …
sarutak Dec 11, 2020
fe38821
[SPARK-33742][SQL][3.0] Throw PartitionsAlreadyExistException from Hi…
MaxGekk Dec 11, 2020
14e77ab
[MINOR][UI] Correct JobPage's skipped/pending tableHeaderId
linzebing Dec 13, 2020
7cd1aab
[SPARK-33757][INFRA][R][FOLLOWUP] Provide more simple solution
sarutak Dec 14, 2020
d652b47
[SPARK-33770][SQL][TESTS][3.1][3.0] Fix the `ALTER TABLE .. DROP PART…
MaxGekk Dec 14, 2020
f2c8079
[SPARK-33786][SQL][3.0] The storage level for a cache should be respe…
imback82 Dec 16, 2020
a77b70d
[SPARK-33788][SQL][3.1][3.0][2.4] Throw NoSuchPartitionsException fro…
MaxGekk Dec 16, 2020
da272f7
[SPARK-33793][TESTS][3.0] Introduce withExecutor to ensure proper cle…
sander-goos Dec 16, 2020
b86ea0f
[SPARK-33733][SQL][3.0] PullOutNondeterministic should check and coll…
ulysses-you Dec 17, 2020
cd683b3
[SPARK-33819][CORE][3.0] SingleFileEventLogFileReader/RollingEventLog…
dongjoon-hyun Dec 17, 2020
99eb027
[SPARK-33774][UI][CORE] Back to Master" returns 500 error in Standalo…
Ngone51 Dec 17, 2020
3ef6827
[SPARK-33822][SQL] Use the `CastSupport.cast` method in HashJoin
maropu Dec 18, 2020
5ce0b9f
Revert "[SPARK-33822][SQL] Use the `CastSupport.cast` method in HashJ…
dongjoon-hyun Dec 18, 2020
1615b0e
[SPARK-33822][SQL][3.0] Use the `CastSupport.cast` method in HashJoin
maropu Dec 18, 2020
faf4a0e
[SPARK-33831][UI] Update to jetty 9.4.34
srowen Dec 18, 2020
f67c3c2
[SPARK-33593][SQL][3.0] Vector reader got incorrect data with binary …
AngersZhuuuu Dec 18, 2020
7881622
[SPARK-33841][CORE][3.0] Fix issue with jobs disappearing intermitten…
vhlinskyi Dec 18, 2020
faf8dd5
[SPARK-33756][SQL] Make BytesToBytesMap's MapIterator idempotent
advancedxy Dec 20, 2020
78dbb4a
[SPARK-33853][SQL] EXPLAIN CODEGEN and BenchmarkQueryTest don't show …
sarutak Dec 21, 2020
c9fe712
[SPARK-33869][PYTHON][SQL][TESTS] Have a separate metastore directory…
HyukjinKwon Dec 21, 2020
0820beb
[SPARK-28863][SQL][FOLLOWUP][3.0] Make sure optimized plan will not b…
cloud-fan Dec 22, 2020
7af54fd
[SPARK-33860][SQL] Make CatalystTypeConverters.convertToCatalyst matc…
ulysses-you Dec 22, 2020
73f5626
[BUILD][MINOR] Do not publish snapshots from forks
EnricoMi Dec 22, 2020
4299a48
Revert "[SPARK-33860][SQL] Make CatalystTypeConverters.convertToCatal…
HyukjinKwon Dec 23, 2020
8c4e166
[SPARK-33891][DOCS][CORE] Update dynamic allocation related documents
dongjoon-hyun Dec 23, 2020
83adba7
[SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consumin…
ueshin Dec 23, 2020
1445129
[SPARK-33900][WEBUI] Show shuffle read size / records correctly when …
cxzl25 Dec 24, 2020
65dd1d0
[SPARK-33911][SQL][DOCS][3.0] Update the SQL migration guide about ch…
MaxGekk Dec 27, 2020
91a2260
[MINOR][SS] Call fetchEarliestOffsets when it is necessary
viirya Dec 30, 2020
b156c1f
[SPARK-33942][DOCS] Remove `hiveClientCalls.count` in `CodeGenerator`…
Dec 31, 2020
39867a8
[SPARK-33931][INFRA][3.0] Recover GitHub Action `build_and_test` job
dongjoon-hyun Jan 1, 2021
dda431a
[SPARK-33963][SQL] Canonicalize `HiveTableRelation` w/o table stats
MaxGekk Jan 3, 2021
9f1bf4e
[SPARK-33398] Fix loading tree models prior to Spark 3.0
zhengruifeng Jan 3, 2021
e882c90
[SPARK-33950][SQL][3.1][3.0] Refresh cache in v1 `ALTER TABLE .. DROP…
MaxGekk Jan 4, 2021
36e845b
[SPARK-34000][CORE] Fix stageAttemptToNumSpeculativeTasks java.util.N…
LantaoJin Jan 5, 2021
7a2f4da
[SPARK-34010][SQL][DODCS] Use python3 instead of python in SQL docume…
HyukjinKwon Jan 5, 2021
1179b8b
[SPARK-33935][SQL][3.0] Fix CBO cost function
Jan 5, 2021
9ba6db9
[SPARK-33844][SQL][3.0] InsertIntoHiveDir command should check col na…
AngersZhuuuu Jan 5, 2021
98cb0cd
[SPARK-33635][SS] Adjust the order of check in KafkaTokenUtil.needTok…
HeartSaVioR Jan 6, 2021
403bca4
[SPARK-33029][CORE][WEBUI][3.0] Fix the UI executor page incorrectly …
Jan 6, 2021
aaa3dcc
[SPARK-34012][SQL][3.0] Keep behavior consistent when conf `spark.sql…
AngersZhuuuu Jan 6, 2021
c9c3d6f
[SPARK-34011][SQL][3.1][3.0] Refresh cache in `ALTER TABLE .. RENAME …
MaxGekk Jan 6, 2021
e7d5344
[SPARK-33100][SQL][3.0] Ignore a semicolon inside a bracketed comment…
turboFei Jan 8, 2021
471a089
[SPARK-34055][SQL][3.0] Refresh cache in `ALTER TABLE .. ADD PARTITION`
MaxGekk Jan 11, 2021
16cab5c
[SPARK-33591][SQL][3.0] Recognize `null` in partition spec values
MaxGekk Jan 11, 2021
4cbc177
[MINOR][3.1][3.0] Improve flaky NaiveBayes test
WeichenXu123 Jan 11, 2021
ecfa015
[SPARK-34060][SQL][3.0] Fix Hive table caching while updating stats b…
MaxGekk Jan 11, 2021
27c03b6
[SPARK-34059][SQL][CORE][3.0] Use for/foreach rather than map to make…
HyukjinKwon Jan 12, 2021
a30d20f
[SPARK-31952][SQL][3.0] Fix incorrect memory spill metric when doing …
Ngone51 Jan 12, 2021
7cfc45b
[SPARK-32691][3.0] Bump commons-crypto to v1.1.0
huangtianhua Jan 12, 2021
0c4fdea
[SPARK-34084][SQL][3.0] Fix auto updating of table stats in `ALTER TA…
MaxGekk Jan 13, 2021
dbc18d6
[SPARK-34103][INFRA] Fix MiMaExcludes by moving SPARK-23429 from 2.4 …
dongjoon-hyun Jan 14, 2021
fcd10a6
[SPARK-33557][CORE][MESOS][3.0] Ensure the relationship between STORA…
LuciferYang Jan 14, 2021
dc1816d
[SPARK-34118][CORE][SQL][3.0] Replaces filter and check for emptiness…
LuciferYang Jan 15, 2021
d81f482
[SPARK-33790][CORE][3.0] Reduce the rpc call of getFileStatus in Sing…
cxzl25 Jan 15, 2021
70fa108
[SPARK-32598][SCHEDULER] Fix missing driver logs under UI App-Executo…
KevinSmile Jan 15, 2021
f7591e5
[SPARK-33711][K8S][3.0] Avoid race condition between POD lifecycle ma…
attilapiros Jan 15, 2021
1ab0f02
[SPARK-34060][SQL][FOLLOWUP] Preserve serializability of canonicalize…
MaxGekk Jan 16, 2021
403a4ac
[MINOR][DOCS] Update Parquet website link
williamhyun Jan 16, 2021
d8ce224
[MINOR][DOCS] Fix typos in sql-ref-datatypes.md
kariya-mitsuru Jan 18, 2021
70c0bc9
[SPARK-33819][CORE][FOLLOWUP][3.0] Restore the constructor of SingleF…
dongjoon-hyun Jan 18, 2021
f705b65
[SPARK-34027][SQL][3.0] Refresh cache in `ALTER TABLE .. RECOVER PART…
MaxGekk Jan 19, 2021
67b9f6c
[SPARK-34153][SQL][3.1][3.0] Remove unused `getRawTable()` from `Hive…
MaxGekk Jan 19, 2021
5a93bcb
[MINOR][ML] Increase the timeout for StreamingLinearRegressionSuite t…
liangz1 Jan 20, 2021
b5b1da9
[SPARK-34115][CORE] Check SPARK_TESTING as lazy val to avoid slowdown
nob13 Jan 20, 2021
89443ab
[SPARK-34178][SQL] Copy tags for the new node created by MultiInstanc…
Ngone51 Jan 20, 2021
4690063
[MINOR][TESTS] Increase tolerance to 0.2 for NaiveBayesSuite
Loquats Jan 21, 2021
4e80f8c
Revert "[SPARK-34178][SQL] Copy tags for the new node created by Mult…
HyukjinKwon Jan 21, 2021
785998b
[SPARK-34181][DOC] Update Prerequisites for build doc of ruby 3.0 issue
AngersZhuuuu Jan 21, 2021
c59a423
[SPARK-33813][SQL] Fix the issue that JDBC source can't treat MS SQL …
sarutak Jan 22, 2021
c36fe77
Modify the URL to comply with the version of mssql-jdbc driver.
sarutak Jan 22, 2021
372 changes: 372 additions & 0 deletions .github/workflows/build_and_test.yml
@@ -0,0 +1,372 @@
name: Build and test

on:
  push:
    branches:
    - branch-3.0
  pull_request:
    branches:
    - branch-3.0

jobs:
  # Build: build Spark and run the tests for specified modules.
  build:
    name: "Build modules: ${{ matrix.modules }} ${{ matrix.comment }} (JDK ${{ matrix.java }}, ${{ matrix.hadoop }}, ${{ matrix.hive }})"
    # Ubuntu 20.04 is the latest LTS. The next LTS is 22.04.
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
      matrix:
        java:
          - 8
        hadoop:
          - hadoop2.7
        hive:
          - hive2.3
        # TODO(SPARK-32246): We don't test 'streaming-kinesis-asl' for now.
        # Kinesis tests depend on the external Amazon Kinesis service.
        # Note that the modules below are from sparktestsupport/modules.py.
        modules:
          - >-
            core, unsafe, kvstore, avro,
            network-common, network-shuffle, repl, launcher,
            examples, sketch, graphx
          - >-
            catalyst, hive-thriftserver
          - >-
            streaming, sql-kafka-0-10, streaming-kafka-0-10,
            mllib-local, mllib,
            yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl
        # Here, we split the Hive and SQL tests into the slow ones and the rest.
        included-tags: [""]
        # Some tests are disabled in GitHub Actions. Ideally, we should remove this tag
        # and run all tests.
        excluded-tags: ["org.apache.spark.tags.GitHubActionsUnstableTest"]
        comment: [""]
        include:
          # Hive tests
          - modules: hive
            java: 8
            hadoop: hadoop2.7
            hive: hive2.3
            included-tags: org.apache.spark.tags.SlowHiveTest
            comment: "- slow tests"
          - modules: hive
            java: 8
            hadoop: hadoop2.7
            hive: hive2.3
            excluded-tags: org.apache.spark.tags.SlowHiveTest,org.apache.spark.tags.GitHubActionsUnstableTest
            comment: "- other tests"
          # SQL tests
          - modules: sql
            java: 8
            hadoop: hadoop2.7
            hive: hive2.3
            included-tags: org.apache.spark.tags.ExtendedSQLTest
            comment: "- slow tests"
          - modules: sql
            java: 8
            hadoop: hadoop2.7
            hive: hive2.3
            excluded-tags: org.apache.spark.tags.ExtendedSQLTest,org.apache.spark.tags.GitHubActionsUnstableTest
            comment: "- other tests"
    env:
      MODULES_TO_TEST: ${{ matrix.modules }}
      EXCLUDED_TAGS: ${{ matrix.excluded-tags }}
      INCLUDED_TAGS: ${{ matrix.included-tags }}
      HADOOP_PROFILE: ${{ matrix.hadoop }}
      HIVE_PROFILE: ${{ matrix.hive }}
      # GitHub Actions' default miniconda to use in pip packaging test.
      CONDA_PREFIX: /usr/share/miniconda
      GITHUB_PREV_SHA: ${{ github.event.before }}
    steps:
    - name: Checkout Spark repository
      uses: actions/checkout@v2
      # In order to fetch changed files
      with:
        fetch-depth: 0
    # Cache local repositories. Note that GitHub Actions cache has a 2G limit.
    - name: Cache Scala, SBT, Maven and Zinc
      uses: actions/cache@v2
      with:
        path: |
          build/apache-maven-*
          build/zinc-*
          build/scala-*
          build/*.jar
          ~/.sbt
        key: build-${{ hashFiles('**/pom.xml', 'project/build.properties', 'build/mvn', 'build/sbt', 'build/sbt-launch-lib.bash', 'build/spark-build-info') }}
        restore-keys: |
          build-
    - name: Cache Ivy local repository
      uses: actions/cache@v2
      with:
        path: ~/.ivy2/cache
        key: ${{ matrix.java }}-${{ matrix.hadoop }}-ivy-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
        restore-keys: |
          ${{ matrix.java }}-${{ matrix.hadoop }}-ivy-
    - name: Install Java ${{ matrix.java }}
      uses: actions/setup-java@v1
      with:
        java-version: ${{ matrix.java }}
    - name: Install Python 3.8
      uses: actions/setup-python@v2
      # We should install a Python 3 version for SQL and Yarn because:
      # - SQL component also has Python related tests, for example, IntegratedUDFTestUtils.
      # - Yarn has a Python specific test too, for example, YarnClusterSuite.
      if: contains(matrix.modules, 'yarn') || (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
      with:
        python-version: 3.8
        architecture: x64
    - name: Install Python packages (Python 3.8)
      if: (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-'))
      run: |
        python3.8 -m pip install numpy 'pyarrow<3.0.0' pandas scipy xmlrunner
        python3.8 -m pip list
    # Run the tests.
    - name: Run tests
      run: |
        # Hive and SQL tests become flaky when running in parallel as it's too intensive.
        if [[ "$MODULES_TO_TEST" == "hive" ]] || [[ "$MODULES_TO_TEST" == "sql" ]]; then export SERIAL_SBT_TESTS=1; fi
        ./dev/run-tests --parallelism 2 --modules "$MODULES_TO_TEST" --included-tags "$INCLUDED_TAGS" --excluded-tags "$EXCLUDED_TAGS"
    - name: Upload test results to report
      if: always()
      uses: actions/upload-artifact@v2
      with:
        name: test-results-${{ matrix.modules }}-${{ matrix.comment }}-${{ matrix.java }}-${{ matrix.hadoop }}-${{ matrix.hive }}
        path: "**/target/test-reports/*.xml"
    - name: Upload unit tests log files
      if: failure()
      uses: actions/upload-artifact@v2
      with:
        name: unit-tests-log-${{ matrix.modules }}-${{ matrix.comment }}-${{ matrix.java }}-${{ matrix.hadoop }}-${{ matrix.hive }}
        path: "**/target/unit-tests.log"
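
Reviewer note on the `build` job: the `strategy.matrix` combines a base matrix with four `include` entries. Below is a minimal Python sketch of how that expansion is expected to behave. It assumes the documented GitHub Actions rule that an `include` entry is merged into every base combination it does not conflict with, and otherwise becomes a job of its own. The helper names `expand` and `needs_python` are hypothetical and exist only for illustration; `needs_python` mirrors the `if:` condition on the "Install Python 3.8" step, treating `contains` as a case-insensitive substring test.

```python
from itertools import product

# Base matrix axes, copied from the workflow (folded ">-" scalars joined with spaces).
BASE = {
    "java": [8],
    "hadoop": ["hadoop2.7"],
    "hive": ["hive2.3"],
    "modules": [
        "core, unsafe, kvstore, avro, network-common, network-shuffle, "
        "repl, launcher, examples, sketch, graphx",
        "catalyst, hive-thriftserver",
        "streaming, sql-kafka-0-10, streaming-kafka-0-10, mllib-local, mllib, "
        "yarn, mesos, kubernetes, hadoop-cloud, spark-ganglia-lgpl",
    ],
    "included-tags": [""],
    "excluded-tags": ["org.apache.spark.tags.GitHubActionsUnstableTest"],
    "comment": [""],
}

COMMON = {"java": 8, "hadoop": "hadoop2.7", "hive": "hive2.3"}
INCLUDE = [
    {"modules": "hive", **COMMON, "comment": "- slow tests",
     "included-tags": "org.apache.spark.tags.SlowHiveTest"},
    {"modules": "hive", **COMMON, "comment": "- other tests",
     "excluded-tags": "org.apache.spark.tags.SlowHiveTest,"
                      "org.apache.spark.tags.GitHubActionsUnstableTest"},
    {"modules": "sql", **COMMON, "comment": "- slow tests",
     "included-tags": "org.apache.spark.tags.ExtendedSQLTest"},
    {"modules": "sql", **COMMON, "comment": "- other tests",
     "excluded-tags": "org.apache.spark.tags.ExtendedSQLTest,"
                      "org.apache.spark.tags.GitHubActionsUnstableTest"},
]


def expand(base, include):
    """Simplified model of matrix expansion (an assumption, not the real runner code)."""
    keys = list(base)
    combos = [dict(zip(keys, vals)) for vals in product(*(base[k] for k in keys))]
    originals = [dict(c) for c in combos]
    extra = []
    for entry in include:
        matched = False
        for combo, orig in zip(combos, originals):
            # Merge only if the entry would not overwrite an original matrix value.
            if all(orig.get(k, v) == v for k, v in entry.items()):
                combo.update(entry)
                matched = True
        if not matched:
            extra.append(dict(entry))  # conflicting entries become separate jobs
    return combos + extra


def needs_python(modules):
    """Mirror of the `if:` expression on the "Install Python 3.8" step."""
    m = modules.lower()
    return "yarn" in m or ("sql" in m and "sql-" not in m)


jobs = expand(BASE, INCLUDE)
for job in jobs:
    print(job["modules"][:30], "| python:", needs_python(job["modules"]))
```

Under these assumptions the matrix yields seven jobs: three module groups from the base matrix plus the two Hive and two SQL entries. Only the group containing `yarn` and the two `sql` jobs satisfy the Python condition; the `sql-kafka-0-10` token alone would not, because of the `sql-` exclusion.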

  pyspark:
    name: "Build modules: ${{ matrix.modules }}"
    runs-on: ubuntu-20.04
    container:
      image: dongjoon/apache-spark-github-action-image:20201025
    strategy:
      fail-fast: false
      matrix:
        modules:
          - >-
            pyspark-sql, pyspark-mllib
          - >-
            pyspark-core, pyspark-streaming, pyspark-ml
    env:
      MODULES_TO_TEST: ${{ matrix.modules }}
      HADOOP_PROFILE: hadoop2.7
      HIVE_PROFILE: hive2.3
      # GitHub Actions' default miniconda to use in pip packaging test.
      CONDA_PREFIX: /usr/share/miniconda
      GITHUB_PREV_SHA: ${{ github.event.before }}
    steps:
    - name: Checkout Spark repository
      uses: actions/checkout@v2
      # In order to fetch changed files
      with:
        fetch-depth: 0
    # Cache local repositories. Note that GitHub Actions cache has a 2G limit.
    - name: Cache Scala, SBT, Maven and Zinc
      uses: actions/cache@v2
      with:
        path: |
          build/apache-maven-*
          build/zinc-*
          build/scala-*
          build/*.jar
          ~/.sbt
        key: build-${{ hashFiles('**/pom.xml', 'project/build.properties', 'build/mvn', 'build/sbt', 'build/sbt-launch-lib.bash', 'build/spark-build-info') }}
        restore-keys: |
          build-
    - name: Cache Ivy local repository
      uses: actions/cache@v2
      with:
        path: ~/.ivy2/cache
        key: pyspark-ivy-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
        restore-keys: |
          pyspark-ivy-
    - name: Install Python 2.7
      uses: actions/setup-python@v2
      with:
        python-version: 2.7
        architecture: x64
    - name: Install Python packages (Python 2.7 )
      run: |
        python2.7 -m pip install numpy 'pyarrow<3.0.0' pandas scipy xmlrunner
        python2.7 -m pip list
    # Run the tests.
    - name: Run tests
      run: |
        ./dev/run-tests --parallelism 2 --modules "$MODULES_TO_TEST"
    - name: Upload test results to report
      if: always()
      uses: actions/upload-artifact@v2
      with:
        name: test-results-${{ matrix.modules }}--8-hadoop2.7-hive2.3
        path: "**/target/test-reports/*.xml"
    - name: Upload unit tests log files
      if: failure()
      uses: actions/upload-artifact@v2
      with:
        name: unit-tests-log-${{ matrix.modules }}--8-hadoop2.7-hive2.3
        path: "**/target/unit-tests.log"

sparkr:
name: "Build modules: sparkr"
runs-on: ubuntu-20.04
container:
image: dongjoon/apache-spark-github-action-image:20201025
env:
HADOOP_PROFILE: hadoop2.7
HIVE_PROFILE: hive2.3
GITHUB_PREV_SHA: ${{ github.event.before }}
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
# In order to fetch changed files
with:
fetch-depth: 0
# Cache local repositories. Note that GitHub Actions cache has a 2G limit.
- name: Cache Scala, SBT, Maven and Zinc
uses: actions/cache@v2
with:
path: |
build/apache-maven-*
build/zinc-*
build/scala-*
build/*.jar
~/.sbt
key: build-${{ hashFiles('**/pom.xml', 'project/build.properties', 'build/mvn', 'build/sbt', 'build/sbt-launch-lib.bash', 'build/spark-build-info') }}
restore-keys: |
build-
- name: Cache Ivy local repository
uses: actions/cache@v2
with:
path: ~/.ivy2/cache
key: sparkr-ivy-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
restore-keys: |
sparkr-ivy-
- name: Run tests
run: |
# The following are also set by `r-lib/actions/setup-r` to avoid
# R issues in the Docker environment
export TZ=UTC
export _R_CHECK_SYSTEM_CLOCK_=FALSE
./dev/run-tests --parallelism 2 --modules sparkr
- name: Upload test results to report
if: always()
uses: actions/upload-artifact@v2
with:
name: test-results-sparkr--8-hadoop2.7-hive2.3
path: "**/target/test-reports/*.xml"
- name: Upload unit tests log files
if: failure()
uses: actions/upload-artifact@v2
with:
name: unit-tests-log-sparkr--8-hadoop2.7-hive2.3
path: "**/target/unit-tests.log"

# Static analysis and documentation build
lint:
name: Linters, licenses, dependencies and documentation generation
runs-on: ubuntu-20.04
container:
image: dongjoon/apache-spark-github-action-image:20201025
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
# Cache local repositories. Note that GitHub Actions cache has a 2G limit.
- name: Cache Scala, SBT, Maven and Zinc
uses: actions/cache@v2
with:
path: |
build/apache-maven-*
build/zinc-*
build/scala-*
build/*.jar
~/.sbt
key: build-${{ hashFiles('**/pom.xml', 'project/build.properties', 'build/mvn', 'build/sbt', 'build/sbt-launch-lib.bash', 'build/spark-build-info') }}
restore-keys: |
build-
- name: Cache Ivy local repository
uses: actions/cache@v2
with:
path: ~/.ivy2/cache
key: docs-ivy-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
restore-keys: |
docs-ivy-
- name: Cache Maven local repository
uses: actions/cache@v2
with:
path: ~/.m2/repository
key: docs-maven-${{ hashFiles('**/pom.xml') }}
restore-keys: |
docs-maven-
- name: Install Python 3.6
uses: actions/setup-python@v2
with:
python-version: 3.6
architecture: x64
- name: Install Python linter dependencies
run: |
python3.6 -m pip install flake8 sphinx numpy
- name: Install R linter dependencies and SparkR
run: |
apt-get install -y libcurl4-openssl-dev libgit2-dev libssl-dev libxml2-dev
Rscript -e "install.packages(c('devtools'), repos='https://cloud.r-project.org/')"
Rscript -e "devtools::install_github('jimhester/lintr@v2.0.0')"
./R/install-dev.sh
- name: Install dependencies for documentation generation
run: |
# Refresh the package index before installing anything in this step.
apt-get update -y
apt-get install -y libcurl4-openssl-dev pandoc
python3.6 -m pip install sphinx mkdocs numpy
apt-get install -y ruby ruby-dev
gem install jekyll jekyll-redirect-from rouge
Rscript -e "install.packages(c('devtools', 'testthat', 'knitr', 'rmarkdown', 'roxygen2'), repos='https://cloud.r-project.org/')"
- name: Scala linter
run: ./dev/lint-scala
- name: Java linter
run: ./dev/lint-java
- name: Python linter
run: ./dev/lint-python
- name: R linter
run: ./dev/lint-r
- name: License test
run: ./dev/check-license
- name: Dependencies test
run: ./dev/test-dependencies.sh
- name: Run documentation build
run: |
cd docs
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
jekyll build

java-11:
name: Java 11 build with Maven
runs-on: ubuntu-20.04
steps:
- name: Checkout Spark repository
uses: actions/checkout@v2
- name: Cache Maven local repository
uses: actions/cache@v2
with:
path: ~/.m2/repository
key: java11-maven-${{ hashFiles('**/pom.xml') }}
restore-keys: |
java11-maven-
- name: Install Java 11
uses: actions/setup-java@v1
with:
java-version: 11
- name: Build with Maven
run: |
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=1g -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN"
export MAVEN_CLI_OPTS="--no-transfer-progress"
# It uses Maven's 'install' intentionally, see https://github.com/apache/spark/pull/26414.
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Djava.version=11 install
rm -rf ~/.m2/repository/org/apache/spark