Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#239][#249] Further filepath refactor #263

Merged
merged 4 commits into from
Jan 27, 2023

Conversation

aeqz
Copy link
Contributor

@aeqz aeqz commented Jan 9, 2023

Description

Problem: after refactoring the FilePath usages in the codebase to have a canonical representation of them, we noticed that further improvements could be applied, such as clarifying whether the path is system dependent and avoiding absolute file system paths.

Solution: we now use POSIX relative paths during the analysis, and system dependant ones for reading file contents in the scan phase.

Related issue(s)

Fixes #239
Fixes #249
Relates #219

✅ Checklist for your Pull Request

Related changes

  • Tests

    • If I added new functionality, I added tests covering it.
    • If I fixed a bug, I added a regression test to prevent the bug from
      silently reappearing again.
  • Documentation

    • I checked whether I should update the docs and did so if necessary:
  • Public contracts

    • Any modifications of public contracts comply with the Evolution
      of Public Contracts
      policy.
    • I added an entry to the changelog if my changes are visible to the users
      and
    • provided a migration guide for breaking changes if possible

Stylistic guide

@Martoon-00
Copy link
Member

Martoon-00 commented Jan 9, 2023

1 Warning
⚠️ Please, only use letters, digits and dashes in the branch name.
Found: ["#"]

Generated by 🚫 Danger

src/Xrefcheck/Core.hs Outdated Show resolved Hide resolved
@aeqz
Copy link
Contributor Author

aeqz commented Jan 9, 2023

How should I properly write the PR and commit issue id referring to two different issues?

I have tackled both of them at the same time because, when trying to refine when POSIX links are used and when platform dependent ones, I realised that we just need relative POSIX links during the verification phase (either relative to the repository root or to other repository file) and system dependent ones during the scan phase, which do not require further processing.

@Martoon-00
Copy link
Member

I have tackled both of them at the same time because

Yeah, sounds fair.

Usually we write e.g. [#123][#456] Text in PR title or commit subject if it concerns multiple tickets.

Copy link
Member

@Martoon-00 Martoon-00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed superficially for now, and quite like how it looks like.

And otherall to me this looks simpler and more robust now.

@YuriRomanowski I would also appreciate your review, if you find time; meanwhile I'll try to spend more time on your PRs.

src/Xrefcheck/Core.hs Outdated Show resolved Hide resolved
src/Xrefcheck/Core.hs Show resolved Hide resolved
src/Xrefcheck/Core.hs Outdated Show resolved Hide resolved
src/Xrefcheck/System.hs Show resolved Hide resolved
src/Xrefcheck/Core.hs Outdated Show resolved Hide resolved
tests/Test/Xrefcheck/IgnoreAnnotationsSpec.hs Show resolved Hide resolved
@aeqz aeqz changed the title [#239 #249] Further filepath refactor [#239][#249] Further filepath refactor Jan 13, 2023
@aeqz aeqz force-pushed the aeqz/#239-#249-further-filepath-refactor branch from 57942fd to dde6f9d Compare January 13, 2023 09:55
src/Xrefcheck/Core.hs Outdated Show resolved Hide resolved
src/Xrefcheck/Core.hs Outdated Show resolved Hide resolved

-- | Join two 'CanonicalRelPosixLink's.
(</>) :: RelPosixLink -> RelPosixLink -> RelPosixLink
UnsafeRelPosixLink a </> UnsafeRelPosixLink b =
Copy link
Contributor Author

@aeqz aeqz Jan 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems quite arbitrary but, as now RelPosixLinks correspond to the original ones, with no normalisation applied, the output was displaying paths like "././file.md" (and this would be the same even by using </> from System.FilePath, I just changed it to avoid String <-> Text conversions):

ghci> import System.FilePath
ghci> "." </> "./file.md"
"././file.md"

src/Xrefcheck/Core.hs Outdated Show resolved Hide resolved
src/Xrefcheck/Core.hs Outdated Show resolved Hide resolved
@aeqz
Copy link
Contributor Author

aeqz commented Jan 16, 2023

Okay, I guess it is now better than before. Although if I got from the beginning that everything is consistent in this regard, I would probably abort my suggestion to change the existing code.

I will undo that change. As you say, it was already consistent.

Now I wonder if counting "other" references as local progress could be better initialised by using lenses:

vrLocal = initProgress $
  (length $ references ^.. folded . to rInfo . _RIFile) +
  (length $ references ^.. folded . to rInfo . _RIExternal . _ELOther)

This is not valid, but something similar would be cool instead:

vrLocal = initProgress $ length $ 
  references ^.. folded . to rInfo . (_RIFile <|> _RIExternal . _ELOther)

@aeqz aeqz requested a review from Martoon-00 January 16, 2023 19:53
@aeqz aeqz force-pushed the aeqz/#239-#249-further-filepath-refactor branch from 96befba to 0a72726 Compare January 18, 2023 12:36
@Martoon-00
Copy link
Member

I will undo that change. As you say, it was already consistent.

Thanks 🙏

This is not valid, but something similar would be cool instead:

Oh, I cannot miss this call for perfection!

AFAIR internally Traversals are about lists, not Maybes, so you should try <> instead of <|>.

Plus _RIFile and _RIExternal . _ELOther may return different types, so to lead everything to one type I would use united lens.

This works for me:

> let vals = [ Left "a", Right (Left 0), Right (Right 1) ] :: [Either String (Either Word Int)]
> vals ^.. L.folded . (L._Left . L.united <> L._Right . L._Right . L.united)
[(),()]

@aeqz
Copy link
Contributor Author

aeqz commented Jan 20, 2023

AFAIR internally Traversals are about lists, not Maybes, so you should try <> instead of <|>

Nice! I was also trying to unify types, but to Either instead of (), and also thinking that this sum type getters are "partial", so that's why my intuition was going more like "try with this getter that goes to Maybe (Left a) or, if it fails, this other one that goes to Maybe (Right b). But, as you pointed out, ^.. is about lists, not maybes.

Your proposal is simpler and worked perfectly 👌

@aeqz aeqz force-pushed the aeqz/#239-#249-further-filepath-refactor branch from e879454 to a0cd95b Compare January 20, 2023 13:23
@aeqz aeqz requested a review from Martoon-00 January 20, 2023 13:27
Copy link
Member

@Martoon-00 Martoon-00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more comment

src/Xrefcheck/Verify.hs Outdated Show resolved Hide resolved
@aeqz aeqz requested a review from Martoon-00 January 24, 2023 10:30
@aeqz aeqz force-pushed the aeqz/#239-#249-further-filepath-refactor branch from 327f2bb to 446b52b Compare January 24, 2023 18:37
@aeqz aeqz requested a review from Martoon-00 January 24, 2023 19:17
src/Xrefcheck/Scan.hs Outdated Show resolved Hide resolved
src/Xrefcheck/Verify.hs Outdated Show resolved Hide resolved
src/Xrefcheck/System.hs Show resolved Hide resolved
src/Xrefcheck/Verify.hs Outdated Show resolved Hide resolved
@aeqz aeqz force-pushed the aeqz/#239-#249-further-filepath-refactor branch 2 times, most recently from aa1818e to 8013bf7 Compare January 25, 2023 11:36
src/Xrefcheck/Scan.hs Outdated Show resolved Hide resolved
@aeqz
Copy link
Contributor Author

aeqz commented Jan 25, 2023

Mmm, this seems like a justified case to start using OS-dependent bats tests 🤔 Like, set up some env variable in CI config, and rely on it in the test?

I would be worried about some points anyways:

  • We should set the corresponding OS env variable for running bats tests in our device.
  • Perhaps some Windows user or developer could have some trouble when cloning the repository if it contains such test files. I currently don't have a Windows device for trying it.

src/Xrefcheck/Verify.hs Outdated Show resolved Hide resolved
src/Xrefcheck/Verify.hs Outdated Show resolved Hide resolved
src/Xrefcheck/Verify.hs Outdated Show resolved Hide resolved
@aeqz aeqz force-pushed the aeqz/#239-#249-further-filepath-refactor branch from a8428f8 to 4f96c10 Compare January 25, 2023 12:26
@aeqz aeqz requested a review from Martoon-00 January 25, 2023 13:11
src/Xrefcheck/Verify.hs Outdated Show resolved Hide resolved
@Martoon-00
Copy link
Member

Mmm, this seems like a justified case to start using OS-dependent bats tests thinking Like, set up some env variable in CI config, and rely on it in the test?

I would be worried about some points anyways:

* We should set the corresponding OS env variable for running bats tests in our device.

* Perhaps some Windows user or developer could have some trouble when cloning the repository if it contains such test files. I currently don't have a Windows device for trying it.

That's true, I forgot that tests are run not only by CIs.

Hmm, if I remember correctly, we already had tests that imply different output under Linux and Windows.

That test is written as assert_diff ... || assert_diff ... which is a bit sad, but if in our case at Windows we expect quite custom "File does not exist", I think that should work. What would you say?

@aeqz
Copy link
Contributor Author

aeqz commented Jan 27, 2023

That test is written as assert_diff ... || assert_diff ... which is a bit sad, but if in our case at Windows we expect quite custom "File does not exist", I think that should work. What would you say?

I tried to add my example and the Windows CI failed: the repo cannot be cloned if it contains the a\a.md file. Then I tried to create this file on the fly during the test and it cannot be created on Windows, so I made the test to pass if this file cannot be created.

@aeqz aeqz requested a review from Martoon-00 January 27, 2023 10:32
@Martoon-00
Copy link
Member

The last request: do we have tests on paths case-sensitivity? That aBc.md file would not be considered as referred by abc.md link?

I think that would be aa finishing touch for this refactoring.

@aeqz
Copy link
Contributor Author

aeqz commented Jan 27, 2023

The last request: do we have tests on paths case-sensitivity?

We have tests for anchor case-sensitivity, but not for path case-sensitivity. I will add them 👍

Copy link
Member

@Martoon-00 Martoon-00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you for this work!

Please squash the fixup commits and let's merge.

Problem: After refactoring the FilePath usages in the codebase to have a
canonical representation of them, we noticed that further improvements
could be applied, such as clarifying whether the path is system
dependent and avoiding absolute file system paths.

Solution: We now use POSIX relative paths during the analysis, and
system dependent ones for reading file contents in the scan phase.
Problem: After changing the progress bar interface to require progress
unit witnesses, some functions related to an anonymous task timestamp
also started to require a progress unit witness, which complicates its
usage unnecessarily.

Solution: Do not require a progress unit witness for getTaskTimestamp.
Problem: When a file name contains unusual characters, its output line
in git ls-files is quoted and Xrefcheck parses the file name wrong.

Solution: Use the git ls-files -z option to output lines verbatim with
null character line terminators. The file not found message has also
been improved for the case in which its link contains a backslash.
Problem: We currenlty have golden tests for anchor case-sensitivity, but
not for path case-sensitivity.

Solution: A golden test for path case-sensitivity has been added and the
previous one has been renamed.
@aeqz aeqz force-pushed the aeqz/#239-#249-further-filepath-refactor branch from c4a4827 to 236a006 Compare January 27, 2023 18:54
@aeqz aeqz merged commit 25ec1a2 into master Jan 27, 2023
@aeqz aeqz deleted the aeqz/#239-#249-further-filepath-refactor branch January 27, 2023 19:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants