Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Various changes for tools/check-typo #1287

Merged
merged 18 commits into from Oct 25, 2017

Conversation

@dra27
Copy link
Contributor

commented Aug 12, 2017

This is a prerequisite for #1148 moving towards having the entire repository pass the tools/check-typo script.

Review points definitely required:

  • @damiendoligez - I've altered tools/check-typo to permit UTF-8 characters in copyright and authors lines in the headers (allows several existing headers to retain accented characters and also means no author in future should feel compelled to use incorrect/alternate spellings of names just to satisfy a script)
  • I've used a URL shortener to reduce some line-lengths in the documentation, which may be contentious for several reasons (@gasche: thoughts?)
  • The two changes in Makefile are worthwhile in their own right, but consequently should be reviewed as code, not reformatting
  • I'm reasonably confident that CCOUTPUT in otherlibs/systhreads/Makefile is a merge artefact from @shindere's marvellous merge, but I didn't triple-check it

There are various people who should briefly sign off on (hopefully) trivial alterations:

  • @stedolan - tools/check-symbol-names was lacking a copyright and I assume that you're happy to be credited with your script 馃槈
  • @edwintorok - tools/lintapidiff.ml has a mis-formatted header, but I believe you are the both author and copyright holder!
  • @nojb - tools/make_opcodes.mll has a wrong licence, either because it was miscopied or possibly because the patch originally predated the change of OCaml's licence. Please confirm that switching the file to declare LGPL 2.1 as for the rest of the tree is OK.
@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 13, 2017

@dra27 yes, it was miscopied, change is OK.

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 13, 2017

Indeed, I don't like the URL shortener idea -- thanks for asking. We win very little (the only strong positive I see is to respect the 80 columns limit), and we start depending on an extra external service that (1) introduces a new way our links may become dead (although it's reasonable to hope that even if Google killed the service (likely), it would preserve existing URLs) and (2) leaks information from people reading documentation to a third-party for no good reason鹿.

One thing that would be useful, I think, is to ask archive.org for a version of these pages, which has the side-effect of getting it to record them. (This is not failsafe, if the site later uses a robots.txt to disable archiving they will still be unavailable.)

鹿: for the record, although I like privacy and believe that google is probably evil or may be in the future, I don't personally mind that much the use of a Google-supported (or someone-else-supported) url shortener, but I know that some people do mind; in particular the German-speaking school of free software and privacy activism is very sensitive about privacy issues, even little things, and I think it's nice to avoid moves that could turn them off OCaml contributions.

@@ -4,7 +4,7 @@ open Bigarray

let pp_sep ppf () = Format.fprintf ppf ";@ "
let print_array pp ppf a =
Format.fprintf ppf "@[<hov>鉄%a鉄@]"
Format.fprintf ppf "@[<hov>\xe2\x9f\xa6%a\xe2\x9f\xa7@]"

This comment has been minimized.

Copy link
@Octachron

Octachron Aug 13, 2017

Contributor

May I suggest to replace the format string with a simple "[|%a|]"? (This is the failing path of the test, so there would be no need to update the reference file).

This comment has been minimized.

Copy link
@dra27

dra27 Aug 14, 2017

Author Contributor

Sure - I just (pedantically) mapped it to the characters you'd used!

This comment has been minimized.

Copy link
@gasche

gasche Aug 14, 2017

Member

(I also find [|%a|] more readable.)

Makefile Outdated
$(call SUBST,CCOMPTYPE) \
$(call SUBST,CC_PROFILE) \
$(call SUBST,EXT_ASM) \
$(call SUBST,EXT_DLL) \
-e 's|%%EXT_EXE%%|$(EXE)|' \

This comment has been minimized.

Copy link
@stedolan

stedolan Aug 14, 2017

Contributor

Any reason why EXT_EXE doesn't get the same treatment?

This comment has been minimized.

Copy link
@dra27

dra27 Aug 14, 2017

Author Contributor

It's the only one where the substitution variable name doesn't match the Makefile variable name... I guess utils/config.mlp could be changed...

@stedolan

This comment has been minimized.

Copy link
Contributor

commented Aug 14, 2017

license header on tools/check-symbol-names looks fine, thanks!

@shindere

This comment has been minimized.

Copy link
Contributor

commented Aug 16, 2017

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Aug 16, 2017

@shindere - for this pull request, it means that Makefile now passes tools/check-typo as the WITH_SPACETIME_CALL_COUNTS options was tricky to split.

Semantically, it does also decrease the chance of generating invalid strings because it means that \ is always correctly escaped for each variable. Previously this was done manually for FLEXLINK_FLAGS, MKDLL, MKEXE and MKMAINDLL only but in theory one could have paths in any of the compiler flags variables too.

@shindere

This comment has been minimized.

Copy link
Contributor

commented Aug 16, 2017

@edwintorok
Copy link
Contributor

left a comment

Changes to lintapidiff look fine

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Aug 16, 2017

@gasche - thanks! I found a way to alter the ASCIIdoc output to reduce the line length with the URL. There's no equivalent in Markdown, and it seems a shame to enable long-line just for that one line. I've pushed a littlebig further hack to check-typo which permits the ignoring of a long URL(ish) at the end of a line.

@gasche

gasche approved these changes Aug 21, 2017

Copy link
Member

left a comment

Approved with minor comments.

Int32.(logor (shift_left (of_int (Bytes.get s (j+3) |> Char.code)) 24)
(logor (shift_left (of_int (Bytes.get s (j+2) |> Char.code)) 16)
(logor (shift_left (of_int (Bytes.get s (j+1) |> Char.code)) 8)
(of_int (Bytes.get s j |> Char.code)))))

This comment has been minimized.

Copy link
@gasche

gasche Aug 21, 2017

Member

Is there a way to format this code to make it readable?

let byte n = Bytes.get s (j+n) |> Char.code |> Int32.of_int in
let open Int32 in
byte 0
|> logor (shift_left (byte 1) 8)
|> logor (shift_left (byte 2) 16)
|> logor (shift_left (byte 3) 24)

This comment has been minimized.

Copy link
@dra27

dra27 Oct 12, 2017

Author Contributor

Readable test cases?! Whatever next!

Sorry, I'd missed this one previously, but I've changed that - far better to improve the readability of the code, than just reduce the line lengths below the maximum.

@@ -3,7 +3,7 @@ will need a good understanding of the OCaml type system and type
inference. Here is a reading list to ease your discovery of the
typechecker:

http://caml.inria.fr/pub/docs/u3-ocaml/index.html[Using, Understanding, and Unraveling the OCaml Language by Didier R茅my] ::
http://caml.inria.fr/pub/docs/u3-ocaml/index.html[Using, Understanding, and Unraveling the OCaml Language by Didier R&eacute;my] ::

This comment has been minimized.

Copy link
@gasche

gasche Aug 21, 2017

Member

Why is this necessary, is this file not valid UTF8?

This comment has been minimized.

Copy link
@damiendoligez

damiendoligez Aug 21, 2017

Member

Funny fact: I've known Didier for more than 25 years, and I learned two weeks ago that his name officially doesn't have an accent...

@damiendoligez

This comment has been minimized.

Copy link
Member

commented Aug 21, 2017

I've altered tools/check-typo to permit UTF-8 characters in copyright and authors lines in the headers (allows several existing headers to retain accented characters and also means no author in future should feel compelled to use incorrect/alternate spellings of names just to satisfy a script)

This is slightly dangerous: I've once seen a file with UTF-8 in the comments and Latin-1 in the code. Emacs got confused by this and introduced mojibake in the code (which made it buggy).

But if we make sure there's no non-ascii character anywhere else than in headers, I guess it's OK.

@damiendoligez damiendoligez added this to the 4.06.0 milestone Sep 25, 2017

@dra27 dra27 force-pushed the dra27:check-typo-changes branch from 6871c4c to 316319d Oct 11, 2017

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Oct 11, 2017

@damiendoligez - I've pushed an extra commit (316319d) which adds a non-ascii-utf8 warning which activates only if the non-ascii attribute has been added for the file and a both a non-ASCII character is found and a UTF-8 sequence is skipped in the headers. I eventually managed to persuade my editor to concoct a test file for it 馃檪

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Oct 11, 2017

This is now rebased. I have hopefully applied enough care in the root Makefile (the change for updating utils/config.ml) to reflect recent additions, but this must therefore be allowed to run through CI.

All previous review points had been dealt with so as long as Damien is happy with the new UTF-8/non-ASCII check, I think this it then good to go, and I'll get #1288 updated. (I'm liking having tools/check-typo always check my commits, but I'm liking the fact it reveals other people's violations less!!)

@dra27 dra27 force-pushed the dra27:check-typo-changes branch from 316319d to 6d40526 Oct 11, 2017

@@ -297,63 +297,71 @@ ifeq "$(FLEXDLL_SUBMODULE_PRESENT)" ""
else
BOOT_FLEXLINK_CMD = FLEXLINK_CMD="../boot/ocamlrun ../flexdll/flexlink.exe"
CAMLOPT := OCAML_FLEXLINK="boot/ocamlrun flexdll/flexlink.exe" $(CAMLOPT)
FLEXDLL_DIR=$(if $(wildcard flexdll/flexdll_*.$(O)),"+flexdll")
FLEXDLL_DIR=$(if $(wildcard flexdll/flexdll_*.$(O)),+flexdll)

This comment has been minimized.

Copy link
@damiendoligez

damiendoligez Oct 11, 2017

Member

Travis complains about this change. The value of FLEXDLL_DIR must be a string in OCaml syntax, so the quotes are needed.

This comment has been minimized.

Copy link
@dra27

dra27 Oct 11, 2017

Author Contributor

It's not quite this - it's a mistake in one of the macros further down. I've just pushed a fix. The issue is that FLEXDLL_DIR is non-empty then it must be a complete OCaml string (with the double quotes) but if it's empty then it should be empty - as the result goes into a string list into sys/config.mlp. The new generic SUBST macro escapes double-quotes which was what was causing the problem for this specific variable.

@shindere

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2017

@shindere

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2017

@dra27 dra27 force-pushed the dra27:check-typo-changes branch from 6d40526 to b914315 Oct 11, 2017

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Oct 11, 2017

@shindere: yes, I will run this through precheck (once it also passes Travis - careless of me not to have checked Linux in the first place).

@shindere

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2017

@dra27 dra27 force-pushed the dra27:check-typo-changes branch 2 times, most recently from 9b54cd4 to 6f16a74 Oct 11, 2017

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Oct 11, 2017

@shindere:

"Permit lines between 80-132 in .travis-ci.sh": why is that needed? Why
isn't it possible to have this file respect the 80 characters line
limit?

The particular line was the API query because the URL is very long. Specifically for the CI, it seemed easier just to allow this one to have a few overly long lines because it should change very infrequently. I can revisit this if you prefer.

"Tweak .gitattributes for fakeclock.c in testsuite": same question
here, why this?

I thought we decided not to have copyright headers in test programs, but
perhaps I am mistaken here?

We had, but fakeclock.c is a massive piece of unscrupulous C on which I wish to be identified (both for the "oh my God, who did this" reason and also for copyright!).

"Permit UTF-8 characters in copyright and authors": are we really sure
we want to go in this direction?

It's easy for me to say no one should have accents in their names! That said, I think we should permit it, yes, even though the support here is not perfect (part of the "TODO Ensure the file is valid UTF-8" part would also include "and ensure that the user has not used any combining characters", i.e. that it's normalised, but I think that then strays into over-engineering... and I'm certainly not planning on implementing Unicode normalisation in awk!).

"Ignore very long URLs in tools/check-typo": typo in the commit message:
This is only to allow a allow a very long URL

Fixed 馃槉

Am I correct that you will get rid of the URL shortener, finally?

I should have re-worded this when I did the fixup (I have now) - I've used as many markdown/ASCIIdoc tricks as possible to shorten those lines.

@shindere

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2017

@dra27 dra27 force-pushed the dra27:check-typo-changes branch from fd4015e to 1104c58 Oct 20, 2017

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Oct 20, 2017

Done - the change in Makefile was to rename WITH_SPACETIME_CALL_COUNTS to ENABLE_CALL_COUNTS from 8f6c6ac

@mshinwell

This comment has been minimized.

Copy link
Contributor

commented Oct 20, 2017

@dra27 Please write a Changes entry and I think you can merge this; @damiendoligez approved yesterday.

@shindere

This comment has been minimized.

Copy link
Contributor

commented Oct 20, 2017

@mshinwell

This comment has been minimized.

Copy link
Contributor

commented Oct 25, 2017

@dra27 Is there a reason this hasn't been merged to 4.06? You just need to write a Changes entry, or so I thought.

dra27 added some commits Aug 12, 2017

Tweak .gitattributes for fakeclock.c in testsuite
Large piece of code which does have a header.
Permit UTF-8 characters in copyright and authors
Allows names to include accented characters, but only in a comment.
Correct licence for tools/make_opcodes.mll
Incorrectly copied from an old QPL header.
Reformat MD5 test
Reduce line lengths and tidy-up the code while at it.
Tidy-up generation of utils/config.ml
Simultaneously deals with an overly-long line and also ensure that
backslashes would be escaped no matter which variables they happened to
appear in.
Ignore very long URLs in tools/check-typo
This is only to allow for a few specific instances, so for the time
being implemented for long-long and not very-long-line.
Warn if a file contains UTF-8 and Latin-1
Add a new warning non-ascii-utf8 displayed only if the non-ascii
attribute is specified and UTF-8 characters were ignored in the
copyright or authors lines in the header.

@dra27 dra27 force-pushed the dra27:check-typo-changes branch from 1104c58 to f282223 Oct 25, 2017

@dra27 dra27 merged commit fd7df86 into ocaml:trunk Oct 25, 2017

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details

dra27 added a commit that referenced this pull request Oct 25, 2017

@dra27

This comment has been minimized.

Copy link
Contributor Author

commented Oct 25, 2017

Travis Changes test appears to be overloaded. Merged and cherry-picked to 4.06 as 4e03f87

@dra27 dra27 referenced this pull request Apr 18, 2019

Merged

Fix stdlib build #8626

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can鈥檛 perform that action at this time.