Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repeated calls to find_in_path degrades performances #5551

Closed
vicuna opened this issue Mar 20, 2012 · 4 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link

commented Mar 20, 2012

Original bug ID: 5551
Reporter: @alainfrisch
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2013-08-31T10:48:52Z)
Resolution: fixed
Priority: normal
Severity: minor
Category: ~DO NOT USE (was: OCaml general)
Monitored by: @protz jmeber

Bug description

In Env.find_pers_struct, when the the .cmi file is not found, this outcome is not cached. The same lookup can thus occur over and over again with the same module name if the .cmi file is not found. This can happen in two cases:

  • The module name if "predef" (created internally by the compiler) --> in this case, find_pers_struct should fail immediatly.

  • A real module name, whose .cmi is not in the load path. This is not a fatal error if the module is only used to expand abbreviations (types are considered abstract).

I propose to define Env.persistent_structures as:

let persistent_structures =
(Hashtbl.create 17 : (string, pers_struct option) Hashtbl.t)

and store None when the .cmi file is missing.

We have seen huge speedups by applying this change (under Windows). (E.g. a 6-times speedup for a series of tests calling the toplevel on large files, from 3 minutes to 30 seconds.)

@vicuna

This comment has been minimized.

Copy link
Author

commented Mar 20, 2012

Comment author: @alainfrisch

FWIW, this micro benchmark:

for i = 1 to 500000 do ignore (Sys.file_exists "foobar") done;;

takes about 18s under Windows (Intel Core i5, 2.4 Ghz), and 0.8 under Linux (VIA Nano U2250, 1.6Ghz). (When "foobar" does not exist. Results are quite close when it does.)

As a side note, it seems that "_access" is about twice as fast as "stat" under Windows. Should we use it in sys.c (under Windows)?

@vicuna

This comment has been minimized.

Copy link
Author

commented Mar 21, 2012

Comment author: @alainfrisch

We need to be careful in the toplevel, in interactive mode, because persistent_structures is not cleared between phrases, and one might want to use a .cmi file which became available only during the session.

@vicuna

This comment has been minimized.

Copy link
Author

commented Mar 21, 2012

Comment author: @alainfrisch

Commit 12251 (in trunk, not 4.00).

FWIW, the tests I was mentioning now runs in 8sec on another machine (Windows, with an SSD drive) where it used to take 300sec.

@vicuna

This comment has been minimized.

Copy link
Author

commented Mar 26, 2012

Comment author: @alainfrisch

Pushed to the trunk (r12279).

@vicuna vicuna closed this Aug 31, 2013

@vicuna vicuna added the bug label Mar 20, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.