Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing @since annotations for 4.00 .. 4.05 #916

Merged
merged 2 commits into from Feb 21, 2017
Merged

Add missing @since annotations for 4.00 .. 4.05 #916

merged 2 commits into from Feb 21, 2017

Conversation

edwintorok
Copy link
Contributor

Fixes https://caml.inria.fr/mantis/view.php?id=7412

After finding a few missing @SInCE annotations in the Unix module, I wrote the script below to find more (it is not guaranteed that it finds all missing annotations, but it found quite a few).

URL_PREFIX="https://github.com/ocaml/ocaml/commit"
check () {
    git diff --name-only $1..$2 -- "$3/*.mli" |\
    grep -Ev internal\|obj\|threadUnix\|raw | while read f; do
        echo "$f"
        rm -f commit_ids lines
        # find newly introduces functions/modules/exceptions/types
        git diff $1..$2 -- "$f" |\
            grep -Eo '^[+-] *(val|external|module|exception|type) [^:=*;]+'|\
            colrm 1 1|sed -re 's/[.]? *?$//'|sort|uniq -c|grep '^ *1'|\
            cut -f2 -d'1'|while read line; do
            echo " $line" >>lines
            # find the commit that introduced it
            # COMMIT=$(git log --pretty=format:%H -S "$line" -- "$f"|head -1)
            # echo "$COMMIT" >>commit_ids
        done
        if [ -f lines ]; then
            echo "$f"
            cat lines | while read line; do
                echo " $line"
                # find existing @since annotation if any
                grep "^ *$line" -A10 -- "$f" | tail -n +2| tr '\n' ' '|grep -Eo '^ *\(\*\*[^*]+@since[^*]+'|cut -f2 -d@
            done
            test -f commit_ids && sort -u commit_ids | while read id; do
                # unique commit ids for verification
                echo " $URL_PREFIX/$id";
                git tag --contains "$id" | head -1|while read tag; do
                    # @since tag based on commit (could be higher if cherry-picked and edited)
                    echo " @since $tag"
                done
            done
        fi
    done
}

VPREV=3.12.1
for V in 4.00.0 4.01.0 4.02.0 4.03.0 4.04.0 origin/trunk; do
    echo "------------------"
    echo "@since $V"
    for dir in otherlibs/* stdlib; do
        check "$VPREV" "$V" "$dir"
    done
    VPREV="$V"
done

@alainfrisch
Copy link
Contributor

Thanks, this is very useful. I'm tempted to merge without checking each annotation one by one if you are reasonably confident in the methodology (and unless someone else objects to it, of course).

@dra27
Copy link
Contributor

dra27 commented Nov 14, 2016

There's a little inconsistency of, e.g., 4.03 vs 4.03.0. Is it a good idea to put 4.05 before that's confirmed?

@@ -40,7 +40,8 @@ val self : unit -> t
val id : t -> int
(** Return the identifier of the given thread. A thread identifier
is an integer that identifies uniquely the thread.
It can be used to build data structures indexed by threads. *)
It can be used to build data structures indexed by threads.
@since 4.03 *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This annotation is not quite right, Thread.id was added by ee63e8d back in 1996.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed it used to be external id before, but I don't think that distinction matters for user code.

@gasche
Copy link
Member

gasche commented Nov 14, 2016

This scripts looks phenomenally useful. Which raises two questions:

  • Could you include it in tools/? We have other related scripts for quality analysis, eg. tools/check-typo
  • Is it possible to make it generic enough to work for user packages? The compiler distribution is far from the only library that would benefit from this. Maybe we could think of reimplementing it as an OCaml script to have a more convenient user-interface? (Not using bash is in no way mandatory, but in my personal experience as soon as I want to make something reusable I quickly need more robust abstraction facilities.)

@alainfrisch : You will call me a perfectionist, but I think that at least having a look at all functions to see if something looks odd would be warranted. I only looked at a few and found a couple errors, it would be interesting to iterate to fix the script.

@gasche
Copy link
Member

gasche commented Nov 14, 2016

There's a little inconsistency of, e.g., 4.03 vs 4.03.0.

Indeed it would be better to use correct minor/bugfix versions when possible -- although if it's too painful, I think it's fine to merge without.

Is it a good idea to put 4.05 before that's confirmed?

In Batteries I use @since NEXT_RELEASE and grep and replace at release time. That said, while we don't know yet whether we will have a minor/bugfix release, I don't think that it should have any new standard library functions in it, so 4.05.0 is probably fine.

@dbuenzli
Copy link
Contributor

Is it possible to make it generic enough to work for user packages?

Except maybe for larger projects I think that for user packages you rather want to simply use the documentation that corresponds to the smallest version of the package you code against. odigging your installed packages gives you just that.

@edwintorok
Copy link
Contributor Author

edwintorok commented Nov 14, 2016

@gasche I don't think there is any need to rush in merging this PR, I can iterate/improve the script if you want this in tools/, and go through the patch once more by hand. Regex based match won't be perfect, (but I don't want to turn this into an overkil by looking at ocamlobjinfo output or something).
@dra27 which form is preferred for @SInCE? I've seen both used in existing documentation, I tend to prefer the full version number form.

@gasche
Copy link
Member

gasche commented Nov 14, 2016

I definitely think that it would be good to have the script in tools/, because it would make it easier to distribute its usage (and the responsibility to use it) among contributors in the future. It's of course fine if the output is imperfect -- and for future runs the output should be much shorter, so having a human review it would not be an issue. (You could even have the script write a small message pointing out that the output is known to be imperfect, if you would feel guilty about it otherwise.) The goal of the current review/iteration should not be to have a "perfect output", but to find among the inevitable glitches all the one that you can relatively easily solve programmatically.

I don't think there are strong conventions on whether you should use @since X.Y.Z or just @since X.Y, but I think that the former should be preferred if it does not make the script noticeably more complex. In particular, I'm sure some older releases added functions in X.Y.Z that were not in available X.Y.0.

@edwintorok
Copy link
Contributor Author

Sorry for the delay. I have updated my PR, changes since last time:

  • rebased on latest trunk
  • added tools/lintapidiff.{ml,sh}
    • works on a Parsetree (Env.fold* would be nice, but it is harder to extract the module annotations). might've exagerated a bit with this, maybe this is not what you meant by 'reimplement in OCaml'
    • assumptions
      • you are running the tool on a development branch that is newer than any other tagged release (otherwise it complains that you deleted a bunch of values without marking them as deprecated first)
      • you have OCaml and findlib already installed (it is not integrated into the compiler's build system)
      • you are running the tool in a git checkout
      • version numbers are of form major.minor or major.minor.patch
      • if a value is marked as deprecated no further error is reported
      • @since/@deprecated annotations are considered to be "inherited" from the parent module (although when reporting you get one report for each value)
      • @SInCE annotations can be extract from ocaml.doc attributes using a regex
      • missing annotations on undocumented symbols of undocumented modules are not reported
      • it analyzes all tagged releases since 3.0: takes ~11s in total, YMMV (needed to correctly report @SInCE mismatches for now)
      • its scope is limited to that of this PR (analyzes only values, types and exceptions; reports only on missing @SInCE for >= 4.00.0). Tried to look at type constructors too, but there are too many false positives from Bigarray's GADTs (which I assume shouldn't be all annotated)
    • only looks at newly added, deleted symbols and mismatched @SInCE. Detecting other ways of breaking an API is out of scope.
    • running on trunk shows these problems
  • fixed the problems reported by the above tool
    • incorrect @SInCE in my previous PR on Hashtbl.hash_param, Format.get_formatter_output_functions, Thread.id
    • add annotations for Bigarray.Array1.slice, Bigarray.genarray_of_array0, Format.formatter_out_functions, Hashtbl.*.{find_opt, filter_map_inplace,reset}, Hashtbl.{stats,statistics}, StringLabels.init, Sys.backend_type
  • use major.minor.patch format for newly added @SInCE annotations consistently

If you prefer I can put the tool into a separate PR, it can probably still be improved.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a review to make sure that I would be comfortable adding the code to the compiler distribution. It is nitpicky, mostly as a way to force myself to read the code. I agree that the design is reasonably solid and I like this implementation approach.

(One disadvantage over the more greppy previous approach is that it won't work very well for code that is not immediately valid OCaml code, eg. if it uses a cpp-like preprocessor, but I think people can adjust their build to run the tool after preprocessing in this case.)

open Location
open Parsetree

module IdMap = Depend.StringMap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any particular reason to use Depend.StringMap instead of Map.Make(String) here? I would find the latter more clear as to what the interface is, and I don't immediately see why you chose to reference Depend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at the code of depend.ml to learn how to use the parsetree, I'll remove it.


let find_attr lst attrs =
try Some (List.find (fun (loc, _) -> List.mem loc.txt lst) attrs)
with Not_found -> None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking, but match List.find ... with exception Not_found -> None | attr -> Some attr is a better style nowadays. Don't change it here if this one is more natural to you -- also having code that works on older OCaml version might make sense for this program.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I used this once below when I needed tail recursion, but otherwise I'm still used to the old way.

with Not_found -> false

let get parent_info loc attrs =
let doc = get_doc ["ocaml.doc"; "ocaml.text"] attrs in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, shouldn't we also support unqualified doc and text? Are such explicitly-user-written attributes never considered by ocamldoc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ocamldoc does not use documentation attributes currently: their generation by the lexer is even disabled by ocamldoc since #348. (I also think it might be a good idea to support unqualified doc and text.)

match get_doc ["ocaml.text"] attrs with (* for toplevel module annotation *)
| None -> false
| Some text ->
try Misc.search_substring "@deprecated" text 0 >= 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking: the semicolon here is mildly confusing (usually people use it to indicate that a value is of type unit, not that it is ignored) and would break with -strict-sequence.

deprecated = parent_info.deprecated || is_deprecated attrs;
loc;
has_doc_parent = parent_info.has_doc;
has_doc = doc <> None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is has_doc_parent used for? One thing that is a bit strange in this code is that the other attributes will behave nicely if there are several nested parents, but these two attributes will only refer to the immediate parent and not the aggregated result on all parents. In particular, if I understand correctly, a undocumented value in an undocumented module in a documented module will be consider undocumented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making has_doc_parent fully propagated (i.e. parent_info.has_doc_parent || parent_info.has_doc) produces a lot of noise for MoreLabels, although I could easily just exclude morelabels from checking, and enable full propagation.

#!/bin/sh
set -e
TAGS=$(git tag|grep '^[0-9]*.[0-9]*.[0-9]*$'|grep -v '^[12].')
ocamlfind ocamlopt -package compiler-libs.common,str tools/lintapidiff.ml \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you documented the assumption that this tool relies on an existing installation with findlib, but is there any particular reason for this, or just that you found it easier to iterate on it as a separate tool? I think the reasoning can go both ways: if it's in tools it would be consistent to link with the compiler distribution directly, but having it external like this makes it easier for other people to just steal the scripts and use them for their projects. I like that second advantage and I suspect it could also benefit some other tools, so I would leave things the way you created them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially this had #use and #require (easier to edit/run from within Emacs). I could move this to the Makefile (and not built it by default, just have a target you can run that builds and runs the linter).


module Diff = struct
type seen_info = {
not_seen: version option;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is not_seen set? I understand from the report that it identifies a version in which the item is missing (I would thus find option list more natural), but it seems to be None everywhere in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In parse_file_at_rev merge: 20e6a66#diff-8de361ea648ec4912974450f0d705a9fR193. I think last_not_seen, would be a better name.

| None, None -> assert false
| _, Some {has_doc_parent=false;has_doc=false;_} -> None (* undocumented *)
| Some {deprecated=true;first_seen;_}, None -> None (* deleted deprecated *)
| Some _, None ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case above {deprecated=true;_} should suffice. In the present case I think {deprecated=false;_} would be clearer -- it's best to avoid relying on case ordering when it is easy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first_seen in {deprecated=true;first_seen;_} was required for disambiguation, although I can see how thats a bit ugly, I'll add a type annotation instead.

| Some {deprecated=true;first_seen;_}, None -> None (* deleted deprecated *)
| Some _, None ->
Some (default, "deleted non-deprecated", seen, latest)
| _, Some {deprecated=true;since=None;_} -> None (* marked as deprecated *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it also a problem if deprecated items don't have a @since marker?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if something is marked as deprecated users would be better of avoiding to use it, than figuring out the required compiler version constraints where they can use it. e.g. Array.make_float was rather short-lived: introduced in 4.02.0, then marked as deprecated, removing the since annotation.

| Some {first_seen;_}, Some {since=None;_} when is_old first_seen ->
None (* OK, very old *)
| Some {first_seen;_}, Some {loc; since=None;_} ->
Some (loc, "missing @since", seen, latest)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the when + case ordering between two clauses is less readable than having a single clause with a conditional test.

@edwintorok
Copy link
Contributor Author

edwintorok commented Dec 10, 2016

Thanks a lot for the review. I appreciate nitpicky comments, and pushed some commits to address most of them (I haven't replied to each comment individually, but see the commit message in the fixup commits).

@shindere
Copy link
Contributor

shindere commented Dec 12, 2016 via email

@gasche
Copy link
Member

gasche commented Dec 12, 2016

@damiendoligez may know about the policy. @damiendoligez, I would also be interested in your general opinion on the patch. I personally think that it is good and that we should consider merging it.

@edwintorok thanks for the fixups. I agree with your justifications in the no-change cases.

@damiendoligez
Copy link
Member

I like the idea of this patch (both the tool and the fixes) although I don't have time to review (and I trust @gasche for that).

The copyright header is mandatory on all source files. @edwintorok have you signed the CLA already for some other contribution?

@edwintorok
Copy link
Contributor Author

@damiendoligez I haven't signed a CLA yet. Can I contact you via e-mail to discuss the details of signing that please?

@gasche
Copy link
Member

gasche commented Feb 14, 2017

@edwintorok it is of course fine to contact Damien if you have further questions, but there is an explanation on the CLA in the CONTRIBUTING.md file, both of what it is (and why) and what is the process to complete it (essentially the process is to print a document, sign it, scan it back, and send it to Xavier Leroy by email).

@edwintorok
Copy link
Contributor Author

@gasche I have rebased the branch, fixed some new warnings from lintapidiff, and added the copyright header on lintapidiff.ml. Please let me know if these changes are ok, I'll then squash the commits marked with squash!

@xavierleroy
Copy link
Contributor

CLA received in good order. I don't know what the next step is, but if anyone does, please proceed.

@gasche
Copy link
Member

gasche commented Feb 21, 2017

I think this should be merged in trunk and 4.05, and I can take care of it. @edwintorok, could you squash and also add a Changes entry (this was a sizeable amount of work so it deserves one)?

@edwintorok
Copy link
Contributor Author

edwintorok commented Feb 21, 2017

Thanks, squashed it down to 2 commits and added the Changes entry.

@gasche
Copy link
Member

gasche commented Feb 21, 2017

(Hm, I was only thinking of squashing the squash! commit and you squashed a bit more. That's fine.)

The commit message in 640bc8b and the comment in lintapidiff.ml still mention a lintapidiff.sh script, but as far as I can tell this script does not exist in the current version of the patch.

Add missing @SInCE annotations for OCaml versions 4.00.0 - 4.05.0,
and fix existing annotations as needed:

Format.ikprintf: clarify ambiguity on @SInCE 4.0 annotation
See b815196

Hashtbl.is_randomized and ListLabels.sort_uniq should be @SInCE 4.03
List.sort_uniq is 4.02 but ListLabels.sort_uniq is 4.03
See:
512d128
189d29b
Run 'make lintapidiff' in the root of a git checkout to get a list of
potentially missing or wrong @SInCE annotations.

The tool is not built by default, you have to first run 'make
world.opt', and then run 'make lintapidiff'.

lintapidiff doesn't support stop comments: add explicit list of changes to ignore.

see copyright header for license.
@edwintorok
Copy link
Contributor Author

Thanks for spotting that, pushed updated branch.

@gasche gasche merged commit 4f46291 into ocaml:trunk Feb 21, 2017
@gasche
Copy link
Member

gasche commented Feb 21, 2017

Merged, and pushed to 4.05 as well. Thanks!

@gasche
Copy link
Member

gasche commented Oct 2, 2017

Note: we just used lintapidiff in preparation for the 4.06, and it was useful (we caught a missing @since in stdlib/uchar.mli). Thanks again!

@edwintorok edwintorok deleted the add-missing-since-annots branch October 2, 2017 17:22
sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Feb 21, 2023
stedolan added a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
b11eea1 flambda-backend: Introduce Import_info (ocaml#1036)
bc5b135 flambda-backend: Fix `ocamlobjinfo` on flambda2 .cmx files (ocaml#1029)
c8babbd flambda-backend: Compilation_unit optimisations (ocaml#1035)
e8d3e22 flambda-backend: Use 4.14.0 opam switch for building (includes upgrading ocamlformat to 0.24.1) (ocaml#1030)
eb14a86 flambda-backend: Port PR81 from ocaml-jst (ocaml#1024)
131bc12 flambda-backend: Merge ocaml-jst 2022-12-13 (ocaml#1022)
06c189a flambda-backend: Make stack allocation the default (ocaml#1013)
98debd5 flambda-backend: Initial support for value slots not of value kind (ocaml#946)
deb1714 flambda-backend: Add is_last flag to closinfo words (ocaml#938)
d07fce1 flambda-backend: Disable poll insertion in Configure (ocaml#967)
0f1ce0e flambda-backend: Regenerate ocaml/configure autoconf 2.69 (instead of 2.71) (ocaml#1012)
27132d8 flambda-backend: Fix for spurious typing error related to expanding through functor arguments (ocaml#997)
724fb68 flambda-backend: Use `Compilation_unit.t` instead of `Ident.t` for globals (ocaml#871)
396d5b8 flambda-backend: Add a test for frametable setup in natdynlinked libraries (ocaml#983)
b73ab12 flambda-backend: Fix invocation of `caml_shared_startup` in native dynlink (ocaml#980)
7c7d75a flambda-backend: Fix split_default_wrapper which did not trigger anymore with flambda2 (ocaml#970)
8fb75bd flambda-backend: Port ocaml#11727 and ocaml#11732 (ocaml#965)
fdb7987 flambda-backend: Fix include functor issue after 4.14 merge. (ocaml#948)
9745cdb flambda-backend: Print -dprofile/-dtimings output to stdout like 4.12 (ocaml#943)
5f51f21 flambda-backend: Merge pull request ocaml#932 from mshinwell/4.14-upgrade
841687d flambda-backend: Run make alldepend in ocaml/ (ocaml#936)
72a7658 flambda-backend: Remove reformatting changes only in dynlink/dune (preserving PR889 and adjusting to minimise diff)
6d758cd flambda-backend: Revert whitespace changes in dune files, to match upstream
c86bf6e flambda-backend: Remove duplicate tests for polling
971dbeb flambda-backend: Testsuite fixes
32f8356 flambda-backend: Topeval fix for symbols patch
befea01 flambda-backend: Compilation fixes / rectify merge faults
a84543f flambda-backend: Merge ocaml-jst
8e65056 flambda-backend: Merge ocaml-jst
4d70045 flambda-backend: Remove filename from system frametable (amd64) (ocaml#920)
5e57b7d flambda-backend: Bugfix for runtime frame_descr logic for C frames (ocaml#918)
6423d5e flambda-backend: Merge pull request ocaml#914 from mshinwell/merge-ocaml-jst-2022-10-24
ead605c flambda-backend: Add a missing Extract_exception (ocaml#916)
c8f1481 flambda-backend: Resolve conflicts and add specialise/specialised attributes to Builtin_attributes
cf4d0d3 flambda-backend: Merge fixes (ocaml#21)
c2f742f flambda-backend: Re-enable some tests for Flambda2 (ocaml#881)
3d38d13 flambda-backend: Long frames in frametable (ocaml#797)
85aec7b flambda-backend: Add loop attribute to Builtin_attributes
c0f16e3 flambda-backend: Compilation fixes
90dea23 flambda-backend: Merge flambda-backend/main
5acc6ea flambda-backend: Fixes after merge
e501946 flambda-backend: Merge ocaml-jst
115083b flambda-backend: Merge ocaml-jst
9943b2e flambda-backend: Revert "Revert "Transform tail-recursive functions into recursive continuations (ocaml#893)"" (ocaml#909)
ce339f1 flambda-backend: Fix alloc modes and call kinds for overapplications (ocaml#902)
e6a317c flambda-backend: Revert "Transform tail-recursive functions into recursive continuations (ocaml#893)"
853c488 flambda-backend: Transform tail-recursive functions into recursive continuations (ocaml#893)
5a977e4 flambda-backend: Fix missing End_region primitives on switch arms (ocaml#898)
7fa7f9d flambda-backend: Add missing dependencies to Dune files (ocaml#889)
3cd36f0 flambda-backend: Have Lambda `Pgetglobal` and `Psetglobal` take `Compilation_unit.t` (ocaml#896)
7565915 flambda-backend: [@poll error] attribute (ocaml#745)
9eb9448 flambda-backend: Backport the main safepoints PRs (ocaml#740)
689bdda flambda-backend: Add strict mode for ocamldep (ocaml#892)

git-subtree-dir: ocaml
git-subtree-split: b11eea1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants