Contributor

@allaboutevemirolive allaboutevemirolive commented Aug 26, 2023

Motivation

The previous code was refactored to remove unnecessary memory allocations and to add comments for clarity.

Optimizations in the new code include:

  • Chains operations and avoids unnecessary string allocations for case transformation.
  • Uses eq_ignore_ascii_case() instead of to_lowercase() to avoid memory allocations and performance overhead.
  • Combines checks within a single iteration using all() and logical conditions, reducing the overall number of operations.
  • Avoids repeated allocations of ascii_confusables by using const.
  • Relies on const values being evaluated and resolved at compile time, which means they don't result in runtime allocations.
  • Separates different checks into distinct parts, improving readability and maintainability.
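
As a rough illustration of the eq_ignore_ascii_case() point in the list above, here is a minimal sketch comparing the two approaches. The function names are hypothetical and do not come from the PR; only the two standard-library methods are real.

```rust
/// Allocating version: builds two new `String`s just to compare them.
/// `to_lowercase` is Unicode-aware, so for non-ASCII input it is not
/// strictly equivalent to the ASCII-only comparison below.
fn same_ignoring_case_alloc(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}

/// Non-allocating version: compares in place, folding ASCII letters only.
fn same_ignoring_case_no_alloc(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}
```

For ASCII-only identifiers the two functions agree; they can differ on non-ASCII text, which is worth keeping in mind given the FIXME about Unicode in the function under review.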

Testing

This change was tested with the command ./x test --stage 2 tests/ui on Debian.

@rustbot
Collaborator

rustbot commented Aug 26, 2023

r? @WaffleLapkin

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 26, 2023
@rust-log-analyzer
Collaborator

The job mingw-check-tidy failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
Prepare all required actions
Getting action download info
Download action repository 'actions/checkout@v3' (SHA:f43a0e5ff2bd294095638e18286ca9a3d1956744)
Download action repository 'actions/upload-artifact@v3' (SHA:0b7f8abb1508181956e8e162db84b466c27e18ce)
Complete job name: PR - mingw-check-tidy
git config --global core.autocrlf false
shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
---
GITHUB_ACTION=__run_7
GITHUB_ACTIONS=true
GITHUB_ACTION_REF=
GITHUB_ACTION_REPOSITORY=
GITHUB_ACTOR=allaboutevemirolive
GITHUB_API_URL=https://api.github.com
GITHUB_BASE_REF=master
GITHUB_ENV=/home/runner/work/_temp/_runner_file_commands/set_env_fed5aac9-aee4-431f-aa13-7caca75b250d
GITHUB_EVENT_NAME=pull_request
---
GITHUB_SERVER_URL=https://github.com
GITHUB_SHA=c248e02774a5e7615e0984063a7b6e08d9e2dc49
GITHUB_STATE=/home/runner/work/_temp/_runner_file_commands/save_state_fed5aac9-aee4-431f-aa13-7caca75b250d
GITHUB_STEP_SUMMARY=/home/runner/work/_temp/_runner_file_commands/step_summary_fed5aac9-aee4-431f-aa13-7caca75b250d
GITHUB_TRIGGERING_ACTOR=allaboutevemirolive
GITHUB_WORKFLOW_REF=rust-lang/rust/.github/workflows/ci.yml@refs/pull/115258/merge
GITHUB_WORKFLOW_SHA=c248e02774a5e7615e0984063a7b6e08d9e2dc49
GITHUB_WORKSPACE=/home/runner/work/rust/rust
GOROOT_1_18_X64=/opt/hostedtoolcache/go/1.18.10/x64
---
Removing intermediate container 3dfdb145cb15
 ---> 8930cc88ceee
Step 6/10 : COPY host-x86_64/mingw-check/reuse-requirements.txt /tmp/
 ---> 8b3e487210e0
Step 7/10 : RUN pip3 install --no-deps --no-cache-dir --require-hashes -r /tmp/reuse-requirements.txt     && pip3 install virtualenv
Collecting binaryornot==0.4.4
  Downloading binaryornot-0.4.4-py2.py3-none-any.whl (9.0 kB)
Collecting boolean-py==4.0
  Downloading boolean.py-4.0-py3-none-any.whl (25 kB)
---
Building wheels for collected packages: reuse
  Building wheel for reuse (pyproject.toml): started
  Building wheel for reuse (pyproject.toml): finished with status 'done'
  Created wheel for reuse: filename=reuse-1.1.0-cp310-cp310-manylinux_2_35_x86_64.whl size=180117 sha256=2196c9034bf565528bbb1ee6dad4f753eb813f58822363e6b768f09c73e4d4ff
  Stored in directory: /tmp/pip-ephem-wheel-cache-potf5fo3/wheels/c2/3c/b9/1120c2ab4bd82694f7e6f0537dc5b9a085c13e2c69a8d0c76d
Installing collected packages: boolean-py, binaryornot, setuptools, reuse, python-debian, markupsafe, license-expression, jinja2, chardet
  Attempting uninstall: setuptools
    Found existing installation: setuptools 59.6.0
    Not uninstalling setuptools at /usr/lib/python3/dist-packages, outside environment /usr
    Can't uninstall 'setuptools'. No files were found to uninstall.
Successfully installed binaryornot-0.4.4 boolean-py-4.0 chardet-5.1.0 jinja2-3.1.2 license-expression-30.0.0 markupsafe-2.1.1 python-debian-0.1.49 reuse-1.1.0 setuptools-66.0.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Collecting virtualenv
  Downloading virtualenv-20.24.3-py3-none-any.whl (3.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 20.2 MB/s eta 0:00:00
Collecting platformdirs<4,>=3.9.1
  Downloading platformdirs-3.10.0-py3-none-any.whl (17 kB)
Collecting filelock<4,>=3.12.2
  Downloading filelock-3.12.2-py3-none-any.whl (10 kB)
Collecting distlib<1,>=0.3.7
  Downloading distlib-0.3.7-py2.py3-none-any.whl (468 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 468.9/468.9 KB 38.3 MB/s eta 0:00:00
Installing collected packages: distlib, platformdirs, filelock, virtualenv
Successfully installed distlib-0.3.7 filelock-3.12.2 platformdirs-3.10.0 virtualenv-20.24.3
Removing intermediate container a8c8109cb9b8
 ---> bd809966ea0d
Step 8/10 : COPY host-x86_64/mingw-check/validate-toolstate.sh /scripts/
 ---> 6ef48ca3a404
Step 9/10 : COPY host-x86_64/mingw-check/validate-error-codes.sh /scripts/
 ---> b7757e143e8a
Step 10/10 : ENV SCRIPT TIDY_PRINT_DIFF=1 python2.7 ../x.py test            --stage 0 src/tools/tidy tidyselftest --extra-checks=py:lint
Removing intermediate container d94c0ba16219
 ---> 3393e1ab0d06
Successfully built 3393e1ab0d06
Successfully tagged rust-ci:latest
##[endgroup]
Built container sha256:3393e1ab0d061ff50edb499ccadd0c1e36122070daac9dfe333469ac36a88a81
Uploading finished image to https://ci-caches.rust-lang.org/docker/f76604104d6a6f4aa068dca8679411c51198f6c5562033174b89cf3ceb2f9ecc175231f1257b18e00fa57bb3eed5f53dbcc121e55de0452dda3db18894e27ea0

<botocore.awsrequest.AWSRequest object at 0x7f95df0fa450>
gzip: stdout: Broken pipe
xargs: docker: terminated by signal 13
[CI_JOB_NAME=mingw-check-tidy]
---
DirectMap4k:      276416 kB
DirectMap2M:     6014976 kB
DirectMap1G:    12582912 kB
##[endgroup]
Executing TIDY_PRINT_DIFF=1 python2.7 ../x.py test            --stage 0 src/tools/tidy tidyselftest --extra-checks=py:lint
+ TIDY_PRINT_DIFF=1 python2.7 ../x.py test --stage 0 src/tools/tidy tidyselftest --extra-checks=py:lint
    Finished dev [unoptimized] target(s) in 0.03s
##[endgroup]
downloading https://ci-artifacts.rust-lang.org/rustc-builds-alt/22d41ae90facbffdef9115809e8b6c1f71ebbf7c/rust-dev-nightly-x86_64-unknown-linux-gnu.tar.xz
extracting /checkout/obj/build/cache/llvm-22d41ae90facbffdef9115809e8b6c1f71ebbf7c-true/rust-dev-nightly-x86_64-unknown-linux-gnu.tar.xz to /checkout/obj/build/x86_64-unknown-linux-gnu/ci-llvm
---
   Compiling tidy v0.1.0 (/checkout/src/tools/tidy)
    Finished release [optimized] target(s) in 25.45s
##[endgroup]
fmt check
##[error]Diff in /checkout/compiler/rustc_errors/src/emitter.rs at line 2709:
 }
 
-/// Determines if the original code and the suggested code have sufficient visual similarity 
-/// to include extra textual information or descriptions to highlight the similarities or 
-/// differences between them. 
-/// 
+/// Determines if the original code and the suggested code have sufficient visual similarity
+/// to include extra textual information or descriptions to highlight the similarities or
+/// differences between them.
 /// # Returns
 ///
 ///
-/// Returns `true` if the original and suggested code are visually similar enough to warrant 
+/// Returns `true` if the original and suggested code are visually similar enough to warrant
 /// extra wording, otherwise `false`.
 pub fn is_case_difference(sm: &SourceMap, suggested: &str, sp: Span) -> bool {
     // FIXME: this should probably be extended to also account for `FO0` → `FOO` and unicode.
##[error]Diff in /checkout/compiler/rustc_errors/src/emitter.rs at line 2722:
     // Retrieve code
     let found = match sm.span_to_snippet(sp) {
-        Ok(snippet) => {
-            snippet
-        }
+        Ok(snippet) => snippet,
         Err(e) => {
             warn!(error = ?e, "Invalid span {:?}", sp);
             return false;
##[error]Diff in /checkout/compiler/rustc_errors/src/emitter.rs at line 2731:
 
 
     // ASCII confusable characters when capitalization.
-    const ASCII_CONFUSABLES: [char; 12] = ['c', 'f', 'i', 'k', 'o', 's', 'u', 'v', 'w', 'x', 'y', 'z'];
+    const ASCII_CONFUSABLES: [char; 12] =
+        ['c', 'f', 'i', 'k', 'o', 's', 'u', 'v', 'w', 'x', 'y', 'z'];
     // # CHECK 1
     // 'zip' function is used to iterate over two sequences in parallel
Running `"/checkout/obj/build/x86_64-unknown-linux-gnu/rustfmt/bin/rustfmt" "--config-path" "/checkout" "--edition" "2021" "--unstable-features" "--skip-children" "--check" "/checkout/compiler/rustc_incremental/src/persist/mod.rs" "/checkout/compiler/rustc_errors/src/tests.rs" "/checkout/compiler/rustc_errors/src/emitter.rs" "/checkout/compiler/rustc_incremental/src/persist/fs.rs" "/checkout/compiler/rustc_errors/src/translation.rs" "/checkout/compiler/rustc_incremental/src/persist/work_product.rs" "/checkout/compiler/rustc_errors/src/snippet.rs" "/checkout/compiler/rustc_errors/src/annotate_snippet_emitter_writer.rs"` failed.
If you're running `tidy`, try again with `--bless`. Or, if you just want to format code, run `./x.py fmt` instead.
  local time: Sat Aug 26 21:28:31 UTC 2023
  network time: Sat, 26 Aug 2023 21:28:31 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

Member

@WaffleLapkin WaffleLapkin left a comment

I don't think this function needs optimization, but your code is generally nicer and more documented here, so let's go with that.

I've left a few nitpicks; don't mind them much. It's always easy to find things to nitpick about on small PRs like this 😅

Thank you for the pull request!

pub fn is_case_difference(sm: &SourceMap, suggested: &str, sp: Span) -> bool {
    // FIXME: this should probably be extended to also account for `FO0` → `FOO` and unicode.
    // Retrieve code
    let found = match sm.span_to_snippet(sp) {
Member

could you turn this into a let-else?
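
A minimal sketch of what the let-else form might look like. `lookup_snippet` and `differs_from_suggestion` are hypothetical stand-ins written for this example; they are not the rustc `SourceMap` API, and plain byte offsets are used in place of a `Span`.

```rust
/// Hypothetical stand-in for `sm.span_to_snippet(sp)`; not the real rustc API.
fn lookup_snippet(source: &str, start: usize, end: usize) -> Result<String, String> {
    source
        .get(start..end)
        .map(str::to_owned)
        .ok_or_else(|| format!("invalid span {start}..{end}"))
}

/// Hypothetical caller showing the let-else shape.
fn differs_from_suggestion(source: &str, start: usize, end: usize, suggested: &str) -> bool {
    // `let ... else` binds `found` on success; the `else` branch must diverge.
    let Ok(found) = lookup_snippet(source, start, end) else {
        return false;
    };
    found != suggested
}
```

One trade-off: with an `Ok(..)` pattern the error value is not bound inside the `else` branch, so a log call that prints the error, as the original match arm does, would need to be reworked or dropped.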

&& found != suggested

// ASCII confusable characters when capitalization.
const ASCII_CONFUSABLES: [char; 12] = ['c', 'f', 'i', 'k', 'o', 's', 'u', 'v', 'w', 'x', 'y', 'z'];
Member

Using a const does not make a difference, actually. Consts are copied on access, so this is actually worse, if anything (as we now potentially spill two copies on the stack, instead of one).

Not that it actually matters, given that this is the error path, but still. You should probably change the type to &[char; 12] if you want to make it const (so that only the reference is copied, not the array).
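
To make the two alternatives concrete, here is a small sketch of a by-value constant next to a by-reference constant. All names are hypothetical, and whether any copies are actually emitted depends on the optimizer; the sketch only shows the difference in declared types.

```rust
// By-value constant: each use behaves as if the array literal were written
// inline at that point, so every use site may materialize its own copy.
const CONFUSABLES_BY_VALUE: [char; 12] =
    ['c', 'f', 'i', 'k', 'o', 's', 'u', 'v', 'w', 'x', 'y', 'z'];

// By-reference constant: each use copies only a reference to the array.
const CONFUSABLES_BY_REF: &[char; 12] =
    &['c', 'f', 'i', 'k', 'o', 's', 'u', 'v', 'w', 'x', 'y', 'z'];

/// Hypothetical helper using the by-value constant.
fn is_confusable_by_value(c: char) -> bool {
    CONFUSABLES_BY_VALUE.contains(&c)
}

/// Hypothetical helper using the by-reference constant.
fn is_confusable_by_ref(c: char) -> bool {
    CONFUSABLES_BY_REF.contains(&c)
}
```

Both helpers return the same results; the call sites look identical because `contains` is reached through auto-dereferencing in the by-reference case.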

// ASCII confusable characters when capitalization.
const ASCII_CONFUSABLES: [char; 12] = ['c', 'f', 'i', 'k', 'o', 's', 'u', 'v', 'w', 'x', 'y', 'z'];

// # CHECK 1
Member

Hm, do you think those titles are useful? To me they seem somewhat pointless, they don't really add information or make it easier to scan the code, I think?

Contributor Author

In the original code, the comment reads // All the chars that differ in capitalization are confusable (above):

Should I make it more descriptive?

Member

I'm not sure what you mean. I was only talking about the # CHECK titles; the comments themselves look fine.

Comment on lines +2737 to +2739
// 'zip' function is used to iterate over two sequences in parallel
found.chars()
.zip(suggested.chars())
Member

There is no need to explain zip, it's a well known function :)

Additionally I'd suggest using iter::zip, in my opinion it's nicer, as it highlights the symmetry.

Suggested change
- // 'zip' function is used to iterate over two sequences in parallel
- found.chars()
- .zip(suggested.chars())
+ iter::zip(found.chars(), suggested.chars())
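
For comparison, a short sketch of the method form and the free-function form side by side. The two counting functions are hypothetical and exist only to show that both spellings pair up characters the same way.

```rust
use std::iter;

/// Counts differing positions using the method form `a.zip(b)`.
fn count_diffs_method(a: &str, b: &str) -> usize {
    a.chars().zip(b.chars()).filter(|(x, y)| x != y).count()
}

/// Counts differing positions using the free function `iter::zip(a, b)`,
/// which puts both inputs on equal footing.
fn count_diffs_free(a: &str, b: &str) -> usize {
    iter::zip(a.chars(), b.chars()).filter(|(x, y)| x != y).count()
}
```

Both forms stop at the end of the shorter input, so a length difference between the two strings is not counted here and would have to be checked separately.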

Contributor Author

@allaboutevemirolive allaboutevemirolive Aug 26, 2023

Thank you for the reminder :)

Actually, I'm still new to Rust, and there's a lot I'm still catching on to.

About the iter::zip, I completely agree with you. I'll modify it later.

// - Check character `f` is present in the `ASCII_CONFUSABLES` array, OR
// - Check character `s` is present in the `ASCII_CONFUSABLES` array
//
// This line equivalent to the: .filter(|(f, s)| f != s).all(|(f, s)| (ascii_confusables.contains(&f) || ascii_confusables.contains(&s)));
Member

I think this is redundant, the filter representation is more confusing if anything
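
The equivalence that the code comment describes can be sketched as below. This is an assumption about the shape of the check, reconstructed from the quoted comment rather than copied from the PR; the function names are hypothetical, and the single-pass predicate `f == s || ...` is inferred from the `filter(|(f, s)| f != s)` form.

```rust
use std::iter;

// List copied from the snippet under review; declared by reference here.
const ASCII_CONFUSABLES: &[char; 12] =
    &['c', 'f', 'i', 'k', 'o', 's', 'u', 'v', 'w', 'x', 'y', 'z'];

/// Single pass: equal pairs pass, differing pairs must involve a confusable.
fn all_diffs_confusable(found: &str, suggested: &str) -> bool {
    iter::zip(found.chars(), suggested.chars()).all(|(f, s)| {
        f == s || ASCII_CONFUSABLES.contains(&f) || ASCII_CONFUSABLES.contains(&s)
    })
}

/// Filter form from the quoted comment: drop equal pairs, then check the rest.
fn all_diffs_confusable_filtered(found: &str, suggested: &str) -> bool {
    iter::zip(found.chars(), suggested.chars())
        .filter(|(f, s)| f != s)
        .all(|(f, s)| ASCII_CONFUSABLES.contains(&f) || ASCII_CONFUSABLES.contains(&s))
}
```

The two functions return the same result for every input, which is why restating one in terms of the other in a comment adds little.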

@WaffleLapkin
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 26, 2023
@allaboutevemirolive
Contributor Author

@rustbot review

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 28, 2023
@workingjubilee
Member

@allaboutevemirolive You have not materially responded to the reviewer's comments. Please do not return this to the reviewer's queue without making some meaningful change or at least requesting distinctive feedback.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 1, 2023
@allaboutevemirolive allaboutevemirolive closed this by deleting the head repository Sep 16, 2023
Labels

S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

5 participants