Member

@zabetak zabetak commented Sep 17, 2025

Why are the changes needed?

The check-spelling action reports many false positives. Addressing them costs contributors additional work and extra commits, and costs CI extra resources, since every new commit triggers all tests again. Occasionally it also detects valid typos, but at this stage the negatives outweigh the positives.

See:
https://lists.apache.org/thread/bb2ncb5ytk50j73fcwzw0wbdsblkw9x3
https://lists.apache.org/thread/n7k808bkxtclww95fhgkznfsol32f0mn

Does this PR introduce any user-facing change?

No

How was this patch tested?

Verify that the action stops running after this pull request is merged.

Member Author

zabetak commented Sep 17, 2025

It seems that spell-checking will continue to run everywhere till we merge this to master.

@github-actions

@check-spelling-bot Report

🔴 Please review

See the files view or the action log for details.

Unrecognized words (12244)

Truncated; please see the log or artifact if available.

Some files were automatically ignored

These sample patterns would exclude them:

^\Qdata/files/2000_cols_data.csv\E$
^\Qdata/files/5col_data.txt\E$
^\Qdata/files/ProxyAuth.res\E$
^\Qdata/files/alltypesorc\E$
^\Qdata/files/alltypesorc3xcols\E$
^\Qdata/files/alltypesorc_voriginal\E$
^\Qdata/files/alltypesorcold\E$
^\Qdata/files/avro_charvarchar.txt\E$
^\Qdata/files/bucket_pruning/l3_monthly_dw_dimplan/000056_0\E$
^\Qdata/files/compressed_4line_file2.csv.bz2\E$
^\Qdata/files/control_characters.txt\E$
^\Qdata/files/csv/00000.csv\E$
^\Qdata/files/data_with_escape.txt\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c1a1.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c20.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c241.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c31.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c3a1.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c3b1.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c41.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c51.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c60.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c71.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c81.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/c90.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/ca1.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/cc0.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/cd1.dat\E$
^\Qdata/files/dataconnector_derbydb/db_for_connectortest.db/seg0/ce1.dat\E$
^\Qdata/files/emptyhead_4line_file1.csv.bz2\E$
^\Qdata/files/encoding-utf8.txt\E$
^\Qdata/files/exported_table/data/data\E$
^\Qdata/files/fullouter_long_big_1c.txt\E$
^\Qdata/files/fullouter_multikey_big_1b.txt\E$
^\Qdata/files/fullouter_multikey_small_1b.txt\E$
^\Qdata/files/fullouter_string_big_1a.txt\E$
^\Qdata/files/fullouter_string_big_1a_nonull.txt\E$
^\Qdata/files/fullouter_string_small_1a.txt\E$
^\Qdata/files/fullouter_string_small_1a_nonull.txt\E$
^\Qdata/files/hive22670.parquet\E$
^\Qdata/files/jsonserde.txt\E$
^\Qdata/files/keystore.jks\E$
^\Qdata/files/keystore_exampledotcom.jks\E$
^\Qdata/files/lt100.txt.deflate\E$
^\Qdata/files/orc_split_elim.orc\E$
^\Qdata/files/orc_test_ppd\E$
^\Qdata/files/over10k.gz\E$
^\Qdata/files/parquet_columnar.txt\E$
^\Qdata/files/parquet_types.txt\E$
^\Qdata/files/part.orc\E$
^\Qdata/files/primitive_type_arrays.txt\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl/data/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl1/data/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl1/data/delta_0000003_0000003_0000/bucket_00000\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl2/data/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl3/data/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl4/data/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl5/data/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl6/fld1=1/data/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/repl_dump/hive/test_hcube_2/tbl6/fld1=2/data/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/small_csv.csv\E$
^\Qdata/files/sortdp/000000_0\E$
^\Qdata/files/t4_multi_delimit.csv\E$
^\Qdata/files/teradata_binary_file/teradata_binary_table.deflate\E$
^\Qdata/files/test.csv.gz\E$
^\Qdata/files/test_hcube_2.db/tbl/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/test_hcube_2.db/tbl1/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/test_hcube_2.db/tbl1/delta_0000003_0000003_0000/bucket_00000\E$
^\Qdata/files/test_hcube_2.db/tbl2/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/test_hcube_2.db/tbl3/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/test_hcube_2.db/tbl4/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/test_hcube_2.db/tbl5/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/test_hcube_2.db/tbl6/delta_0000001_0000001_0000/bucket_00000\E$
^\Qdata/files/tjoin3.txt\E$
^\Qdata/files/tjoin4.txt\E$
^\Qdata/files/tpch/sf0_001/customer.tbl.bz2\E$
^\Qdata/files/tpch/sf0_001/lineitem.tbl.bz2\E$
^\Qdata/files/tpch/sf0_001/nation.tbl.bz2\E$
^\Qdata/files/tpch/sf0_001/orders.tbl.bz2\E$
^\Qdata/files/tpch/sf0_001/part.tbl.bz2\E$
^\Qdata/files/tpch/sf0_001/partsupp.tbl.bz2\E$
^\Qdata/files/tpch/sf0_001/region.tbl.bz2\E$
^\Qdata/files/tpch/sf0_001/supplier.tbl.bz2\E$
^\Qdata/files/tpch/tiny/lineitem.tbl.bz2\E$
^\Qdata/files/tpch/tiny/part.tbl.bz2\E$
^\Qdata/files/truststore.jks\E$
^\Qdata/files/vector_groupingsets_switchmode.csv\E$
^\Qdata/files/web_sales.parquet\E$
^\Qdata/files/windowing_distinct.txt\E$
^\Qdata/scripts/q_test_cleanup.sql\E$
^\Qerrata.txt\E$
^\Qhplsql/src/test/queries/local/to_timestamp.sql\E$
^\Qitests/hive-minikdc/src/test/resources/auth.jwt/jwt-authorized-key.json\E$
^\Qitests/hive-minikdc/src/test/resources/auth.jwt/jwt-unauthorized-key.json\E$
^\Qitests/hive-minikdc/src/test/resources/auth.jwt/jwt-verification-jwks.json\E$
^\Qitests/hive-unit/src/test/resources/auth.jwt/jwt-authorized-key.json\E$
^\Qitests/hive-unit/src/test/resources/auth.jwt/jwt-unauthorized-key.json\E$
^\Qitests/hive-unit/src/test/resources/auth.jwt/jwt-verification-jwks.json\E$
^\Qitests/hive-unit/src/test/resources/simple-saml-idp-metadata-template.xml\E$
^\Qitests/test-jdbc/src/test/resources/custom_hosts_file\E$
^\Qllap-server/src/main/resources/hive-webapps/llap/fonts/glyphicons-halflings-regular.eot\E$
^\Qllap-server/src/main/resources/hive-webapps/llap/fonts/glyphicons-halflings-regular.ttf\E$
^\Qllap-server/src/main/resources/hive-webapps/llap/fonts/glyphicons-halflings-regular.woff\E$
^\Qllap-server/src/main/resources/hive-webapps/llap/images/hive_logo.jpeg\E$
^\Qql/src/test/queries/clientnegative/invalid_create_tbl2.q\E$
^\Qql/src/test/queries/clientpositive/confirm_initial_tbl_stats.q\E$
^\Qql/src/test/queries/clientpositive/dfscmd.q\E$
^\Qql/src/test/queries/clientpositive/udf_deserialize.q\E$
^\Qql/src/test/resources/bucket_00952_0\E$
^\Qql/src/test/resources/hsmm/hsmm_cfg_01.yaml\E$
^\Qql/src/test/results/clientpositive/llap/compute_bit_vector.q.out\E$
^\Qql/src/test/results/clientpositive/llap/ctas_direct.q.out\E$
^\Qql/src/test/results/clientpositive/llap/ctas_direct_with_specified_locations.q.out\E$
^\Qql/src/test/results/clientpositive/llap/ctas_direct_with_suffixed_locations.q.out\E$
^\Qql/src/test/results/clientpositive/llap/cte_1.q.out\E$
^\Qql/src/test/results/clientpositive/llap/llap_0.q.out\E$
^\Qql/src/test/results/clientpositive/llap/manyViewJoin.q.out\E$
^\Qql/src/test/results/clientpositive/llap/orc_llap_nonvector.q.out\E$
^\Qql/src/test/results/clientpositive/llap/parquet_vectorization_0.q.out\E$
^\Qql/src/test/results/clientpositive/llap/parquet_vectorization_10.q.out\E$
^\Qql/src/test/results/clientpositive/llap/parquet_vectorization_12.q.out\E$
^\Qql/src/test/results/clientpositive/llap/parquet_vectorization_16.q.out\E$
^\Qql/src/test/results/clientpositive/llap/parquet_vectorization_6.q.out\E$
^\Qql/src/test/results/clientpositive/llap/parquet_vectorization_9.q.out\E$
^\Qql/src/test/results/clientpositive/llap/partition_explain_ddl.q.out\E$
^\Qql/src/test/results/clientpositive/llap/vector_leftsemi_mapjoin.q.out\E$
^\Qql/src/test/results/clientpositive/llap/vectorization_0.q.out\E$
^\Qql/src/test/results/clientpositive/llap/vectorization_10.q.out\E$
^\Qql/src/test/results/clientpositive/llap/vectorization_12.q.out\E$
^\Qql/src/test/results/clientpositive/llap/vectorization_16.q.out\E$
^\Qql/src/test/results/clientpositive/llap/vectorization_6.q.out\E$
^\Qql/src/test/results/clientpositive/llap/vectorization_9.q.out\E$
^\Qql/src/test/results/clientpositive/tez/update_orig_table.q.out\E$
^\Qserde/src/test/resources/json/single_pixel.json\E$
^\Qservice/src/resources/hive-webapps/ha-healthcheck/WEB-INF/web.xml\E$
^\Qservice/src/resources/hive-webapps/static/favicon.ico\E$
^\Qservice/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.eot\E$
^\Qservice/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.ttf\E$
^\Qservice/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.woff\E$
^\Qservice/src/resources/hive-webapps/static/hive_logo.jpeg\E$
^\Qstandalone-metastore/metastore-common/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp\E$
^\Qstandalone-metastore/metastore-common/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h\E$
^\Qstandalone-metastore/metastore-common/src/gen/thrift/gen-cpp/hive_metastore_types.cpp\E$
^\Qstandalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java\E$
^\Qstandalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py\E$
^\Qstandalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py\E$
^\Qstandalone-metastore/metastore-rest-catalog/src/test/resources/auth/jwt/jwt-authorized-key.json\E$
^\Qstandalone-metastore/metastore-rest-catalog/src/test/resources/auth/jwt/jwt-unauthorized-key.json\E$
^\Qstandalone-metastore/metastore-rest-catalog/src/test/resources/auth/jwt/jwt-verification-jwks.json\E$
^\Qstandalone-metastore/metastore-server/src/test/resources/auth/jwt/jwt-authorized-key.json\E$
^\Qstandalone-metastore/metastore-server/src/test/resources/auth/jwt/jwt-unauthorized-key.json\E$
^\Qstandalone-metastore/metastore-server/src/test/resources/auth/jwt/jwt-verification-jwks.json\E$
^\Qstandalone-metastore/metastore-server/src/test/resources/hive_keystore.p12\E$
^\Qstandalone-metastore/metastore-server/src/test/resources/hive_truststore.p12\E$
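
The sample patterns above wrap each path in `\Q…\E`, which in Perl regular expressions quotes everything in between so that characters such as `.` are matched literally. As a rough illustration, here is a minimal Python sketch of what one such pattern matches. Python's `re` module has no `\Q…\E`, so the helper `perl_quoted_to_python` (a name invented for this example) rewrites each quoted span with `re.escape`; it does not handle any other Perl-only syntax.

```python
import re

def perl_quoted_to_python(pattern: str) -> str:
    """Rewrite Perl \\Q...\\E literal spans using re.escape (illustrative helper)."""
    return re.sub(r"\\Q(.*?)\\E", lambda m: re.escape(m.group(1)), pattern)

# One of the sample patterns from the report above.
sample = r"^\Qdata/files/part.orc\E$"
regex = re.compile(perl_quoted_to_python(sample))

print(bool(regex.search("data/files/part.orc")))      # exact path: matches
print(bool(regex.search("data/files/partXorc")))      # "." is literal: no match
print(bool(regex.search("data/files/part.orc.bak")))  # "$" anchors the end: no match
```

Each sample pattern therefore excludes exactly one file, which is why the list is so long.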

You should consider excluding directory paths (e.g. (?:^|/)vendor/), filenames (e.g. (?:^|/)yarn\.lock$), or file extensions (e.g. \.gz$)

You should consider adding them to:

.github/actions/spelling/excludes.txt

File matching is via Perl regular expressions.
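
To illustrate that suggestion, the sketch below checks a few of the paths listed in this report against a handful of broader patterns. The patterns and the `is_excluded` helper are assumptions made for this example, not the contents of Hive's `excludes.txt`, and Python's `re` stands in for Perl regular expressions (the constructs used here behave the same in both).

```python
import re

# Hypothetical broader patterns, chosen only for illustration.
EXCLUDES = [
    r"\.(?:bz2|gz|deflate)$",   # compressed test data, by extension
    r"\.(?:jks|p12)$",          # keystores and truststores
    r"(?:^|/)fonts/",           # any directory named "fonts"
    r"(?:^|/)src/gen/thrift/",  # generated Thrift sources
]
COMPILED = [re.compile(p) for p in EXCLUDES]

def is_excluded(path: str) -> bool:
    """Return True if any pattern matches somewhere in the repository-relative path."""
    return any(rx.search(path) for rx in COMPILED)

paths = [
    "data/files/over10k.gz",
    "data/files/keystore.jks",
    "service/src/resources/hive-webapps/static/fonts/glyphicons-halflings-regular.ttf",
    "standalone-metastore/metastore-common/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp",
    "ql/src/test/queries/clientpositive/dfscmd.q",
]
for p in paths:
    print(f"{'excluded' if is_excluded(p) else 'checked':9} {p}")
```

A few patterns of this kind would stand in for dozens of the per-file entries, at the cost of also excluding any future file that happens to match.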

To check these files, more of their words need to be in the dictionary than not. You can use patterns.txt to exclude portions, add items to the dictionary (e.g. by adding them to allow.txt), or fix typos.
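
The rule in the first sentence can be pictured roughly as follows. This is a sketch under assumptions, not check-spelling's implementation: the tokenizer, the choice to count distinct words, and the `mostly_known_words` name are all invented for illustration.

```python
import re

def mostly_known_words(text: str, dictionary: set) -> bool:
    """Return True when more of the text's distinct words are in the dictionary than not."""
    # Assumed tokenization: runs of ASCII letters, compared case-insensitively.
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}
    known = sum(1 for w in words if w in dictionary)
    return known > len(words) - known

dictionary = {"select", "from", "table", "where", "key"}
print(mostly_known_words("SELECT key FROM src WHERE key", dictionary))  # mostly dictionary words
print(mostly_known_words("c1a1 xq9z 0x7f bzzt", dictionary))            # mostly unknown tokens
```

Under this reading, binary or generated files fail the test because most of their tokens are not dictionary words, which is consistent with the kinds of files listed above.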

Warnings (2)

See Warning descriptions for more information.

Warning      Count
large-file   12
noisy-file   142

Script unavailable

Truncated; please see the log or artifact if available.

If the flagged items do not appear to be text

If items relate to a ...

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

  • binary file.

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).
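
The anchoring note above can be tried out quickly. The following Python sketch is only an illustration (Python's `re` handles these particular constructs the same way Perl does), and the candidate paths are made up for the example.

```python
import re

anchored = re.compile(r"^README\.md$")   # pattern from the note above
unescaped = re.compile(r"^README.md$")   # same pattern with the dot left unescaped
floating = re.compile(r"README\.md$")    # same pattern without the leading ^

for path in ["README.md", "docs/README.md", "READMEXmd"]:
    print(path,
          bool(anchored.search(path)),
          bool(unescaped.search(path)),
          bool(floating.search(path)))
```

Only the anchored, escaped pattern singles out the root-level `README.md`: the unescaped variant also accepts `READMEXmd`, and the unanchored one also accepts `docs/README.md`.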

Member

@ayushtkn ayushtkn left a comment

LGTM. Go for it. I never liked this bot....

Contributor

@InvisibleProgrammer InvisibleProgrammer left a comment

LGTM. As a last attempt it says spell-checking failed :D

@zabetak zabetak merged commit f374f39 into apache:master Sep 18, 2025
5 of 6 checks passed
@zabetak zabetak deleted the HIVE-29207 branch September 18, 2025 07:15
Member Author

zabetak commented Sep 18, 2025

Thanks for the reviews @ayushtkn and @InvisibleProgrammer !
