Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restoring test on ident validity while browsing directory structure. #1054

Merged
merged 2 commits into from Oct 11, 2017

Conversation

herbelin
Copy link
Member

The test was abandoned at the time of merging subdirectory browsing between coqdep and coqtop, and to limit at the same time the dependency of coqdep in files such as unicode.cmo.

But checking ident validity speeds up browsing in arbitrary non-Coq-specific directory structures and we restore it for this reason.

(One could also consider that browsing arbitrary directory structures is not intended, but in practice this may happen, as e.g. reported in BZ#5734.)

@letouzey, I know that you care about minimal dependencies for coqdep.

Is it ok for you that we make coqdep aware of the exact (UTF-8) syntax of module names? Otherwise we can also consider that users are not supposed to do Add LoadPath on non-Coq related directory structure and that the regression in efficiency of Add LoadPath between 8.5 and 8.6 is not a problem. Or, we can have different ocaml code to locate modules in coqdep and coqtop (but with the risk of inconsistencies as we had).

Note that the question is also about whether we want coqdep to be aware of the exact (UTF-8) syntax of library names (see commit e88dfed in PR#1041). (Though here, I could not understand if there were a global agreement on supporting arbitrary letters in library names, as we do for modules in general, rather than just ascii letters.)

@Zimmi48 Zimmi48 added kind: fix This fixes a bug or incorrect documentation. needs: fixing The proposed code change is broken. needs: discussion Further discussion is needed. kind: performance Improvements to performance and efficiency. and removed kind: fix This fixes a bug or incorrect documentation. labels Sep 18, 2017
@Zimmi48
Copy link
Member

Zimmi48 commented Sep 18, 2017

This commit apparently breaks the compilation of the standard library.

Now, for the design question itself: I'm unsure that what is fixed here is the real problem that ought to be fixed (not saying though that I'm opposing to this PR which may be good in any case). Maybe we can consider indeed that users should only use Add LoadPath on Coq-related directory structures.

To me, the real issue of BZ#5734 is that CoqIDE calls Add LoadPath without the user either wanting it, needing it, nor knowing about it. And it isn't even consistent since this Add LoadPath command is not always sent depending on where the file is located related to where CoqIDE was opened. Remark how absurd this is: if I'm a console user, I will browse to a directory then call coqide test.v (no Add LoadPath). If I'm a desktop user, I will open CoqIDE from a desktop shortcut then use the menu to open a file and browse to the location of test.v (there Add LoadPath is sent!).

What would it cost to simply stop sending Add LoadPath when opening a file? (Actual question, not rhetorical: I'm not well aware of the consequences.)

@ejgallego
Copy link
Member

I am afraid I have not a clear picture of what direction we would like to move here :( I have been thinking about the LoadPath issue for a while [as in my own toplevels I need to make a choice], but I am not sure what we'd like to do.

What do other compilers do? Do they support "recursive" path binding?

@silene
Copy link
Contributor

silene commented Sep 19, 2017

This commit apparently breaks the compilation of the standard library.

Actually, it should break almost everything. Indeed, at the time of 8.5, ok_dirname was only called when recovering the list of subdirectories. But 1be9c4d changed the code so that ok_dirname would also be called when recovering files (despite its name). Thus, this pull request causes coqdep to become completely blind, since no interesting filenames (e.g. .v files) can successfully pass the 8.5 ok_dirname check.

@herbelin
Copy link
Member Author

The commit was multiply broken, and not only for what @silene is reporting. Hopefully ok now.

@Zimmi48 Zimmi48 removed the needs: fixing The proposed code change is broken. label Sep 19, 2017
@silene
Copy link
Contributor

silene commented Sep 19, 2017

Hopefully ok now.

It indeed looks a lot better.

I don't think we are ever interested in iterating over filenames that cannot be mapped to Coq identifiers. So I suggest making f (FileRegular name) conditional on ok_coqname (basename name). It would make all the changes to tools/coqdep_common.ml pointless, I think.

@herbelin
Copy link
Member Author

@silene: it remains the case of OCaml files (option -I) which obey the OCaml syntax (call to process_directory via Coqdep_common.add_caml_dir). The OCaml syntax for file names is strictly more restricted than Coq's one (at the current time) but I was a bit reluctant to mix them. Actually, coqdep currently accepts any possible string in Declare ML Module which would match any file of same base name on disk! Don't know what to think about it.

@silene
Copy link
Contributor

silene commented Sep 20, 2017

The OCaml syntax for file names is strictly more restricted than Coq's one

Then we don't have to worry about it; no need to overengineer it, in my opinion.

Thinking a bit more about the code, we would still be doing too many syscalls in some degenerate cases. So the code could just look as follows; this would fully recover the time-complexity of Coq 8.5.

let apply_subdir f path name =
  let base = basename name in
  if ok_dirname base then
    let path = if path = "." then name else path//name in
    match try (Unix.stat path).Unix.st_kind with Unix.Unix_error _ -> Unix.S_BLK with
    | Unix.S_DIR when name = base -> f (FileDir (path,name))
    | Unix.S_REG -> f (FileRegular name)

@herbelin
Copy link
Member Author

Even if more restricted than the one of Coq for module name, the OCaml syntax is more liberal for file name. I can call a file e.g. #.ml and make a Coq plugin of this name with only the warning 24 as complaint. So, you suggest that we make the warning 24 constraint strict in the context of Coq?

@silene
Copy link
Contributor

silene commented Sep 20, 2017

I can call a file e.g. #.ml and make a Coq plugin of this name with only the warning 24 as complaint.

Is it possible to then load the plugin? (just curious)

So, you suggest that we make the warning 24 constraint strict in the context of Coq?

Yes, at least when it comes to plugins.

@herbelin
Copy link
Member Author

Is it possible to then load the plugin? (just curious)

Yes:

echo "1" > #.ml
ocamlopt -shared #.ml -o #.cmxs
bin/coqtop
Declare ML Module "#".
(* [Loading ML file #.cmxs ... done] *)

So, you suggest that we make the warning 24 constraint strict in the context of Coq?

Yes, at least when it comes to plugins.

Not unreasonable. Let's give us some time to ponder it.

@mattam82
Copy link
Member

I'm fine with a restricted charset for plugin objects.

@maximedenes
Copy link
Member

I don't like that we rely on some file names to be outside of the allowed subset for identifiers in order not to crash. But this patch seems like a good emergency fix until we improve the design. So if that's ok with everyone, I can merge it and leave https://coq.inria.fr/bugs/show_bug.cgi?id=5734 open.

@Zimmi48
Copy link
Member

Zimmi48 commented Sep 26, 2017

I'm not sure I understand what you mean @maximedenes but yes, I think this PR can be merged and the bug report left open. I don't think that this LoadPath - CoqIDE story makes any sense.

@maximedenes
Copy link
Member

I'm not sure I understand what you mean @maximedenes

What I meant is that IIUC, this patch does not really fix the underlying pb (browsing of potentially large subtrees of .), it simply relies on the fact that many folder names are not valid identifiers, and that not browsing such folders is enough to make the browsing not diverge too often in practice.

@silene
Copy link
Contributor

silene commented Sep 26, 2017

this patch does not really fix the underlying pb (browsing of potentially large subtrees of .)

No patch can ever fix that. As long as -R exists (or -Q and From), you will always have to browse some subtree at one time or another.

@maximedenes
Copy link
Member

Sure, but it still does some browsing that was not asked explicitly by the user, doesn't it? Or did I miss that it also removes the Add LoadPath .?

@silene
Copy link
Contributor

silene commented Sep 26, 2017

Sure, but it still does some browsing that was not asked explicitly by the user, doesn't it? Or did I miss that it also removes the Add LoadPath .?

No, it does not remove it, but that is kind of orthogonal. For example, I did not quite follow what the issue with hangul characters was, but I would not be surprised if this commit had prevented it.

@Zimmi48 Zimmi48 removed the needs: discussion Further discussion is needed. label Oct 6, 2017
@Zimmi48 Zimmi48 added this to the 8.7.0 milestone Oct 6, 2017
@Zimmi48 Zimmi48 added this to Request 8.7 inclusion in Coq 8.7 Oct 6, 2017
@maximedenes maximedenes added the needs: rebase Should be rebased on the latest master to solve conflicts or have a newer CI run. label Oct 9, 2017
@maximedenes
Copy link
Member

maximedenes commented Oct 10, 2017

@herbelin can you rebase this one, removing the dependency fix that was already merged? (as part of #1133)

The test was abandoned at the time of merging subdirectory browsing
between coqdep and coqtop, and to limit at the same time the
dependency of coqdep in files such as unicode.cmo.

But checking ident validity speeds up browsing in arbitrary directory
structure and we restore it for this reason.

(One could also say that browsing arbitrary directory structures is
not intended, but in practice this may happen, as e.g. reported in
BZ#5734.)
@herbelin
Copy link
Member Author

I rebased (it now includes @silene's optimization).

Two questions remain for future work:

  • split unicode.ml into a stand-alone part which just recognizes idents (for coqdep) and a part which depends on unicodetable.ml (for coqtop)
  • add OCaml warning 24 on plugin file names (strict check on OCaml module names)

@Zimmi48 Zimmi48 removed the needs: rebase Should be rebased on the latest master to solve conflicts or have a newer CI run. label Oct 10, 2017
@maximedenes
Copy link
Member

The failures look unrelated so merging.

@coqbot coqbot merged commit 7c36b13 into coq:master Oct 11, 2017
coqbot pushed a commit that referenced this pull request Oct 11, 2017
@Zimmi48 Zimmi48 moved this from Request 8.7 inclusion to Waiting to be backported in Coq 8.7 Oct 11, 2017
Zimmi48 added a commit to Zimmi48/coq that referenced this pull request Oct 11, 2017
@Zimmi48 Zimmi48 moved this from Waiting to be backported to Waiting for CI in Coq 8.7 Oct 11, 2017
@Zimmi48 Zimmi48 moved this from Waiting for CI to Shipped in 8.7.0 in Coq 8.7 Oct 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: performance Improvements to performance and efficiency.
Projects
No open projects
Coq 8.7
  
Shipped in 8.7.0
Development

Successfully merging this pull request may close these issues.

None yet

7 participants