Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BLocking stubs not releasing the runtime in stdlib and unix #6276

Closed
vicuna opened this Issue Dec 17, 2013 · 12 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link
Collaborator

commented Dec 17, 2013

Original bug ID: 6276
Reporter: @diml
Assigned to: @diml
Status: closed (set by @xavierleroy on 2015-12-11T18:25:42Z)
Resolution: fixed
Priority: normal
Severity: minor
Target version: 4.01.1+dev
Fixed in version: 4.01.1+dev
Category: runtime system and C interface
Has duplicate: #4995
Related to: #4801
Monitored by: @gasche @hcarty @dbuenzli

Bug description

The following functions might block and do not release the runtime lock:

Sys.chdir
Sys.close
Sys.file_exists
Sys.is_directory
Sys.readdir
Sys.remove
Sys.rename
Unix.readdir
Unix.opendir
Unix.close
Unix.closedir
Unix.rename
Unix.remove

This can be a problem when using NFS for instance (that's how we noticed it).

File attachments

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 17, 2013

Comment author: @diml

I don't think there is any particular reason not to release the runtime in these functions? I'm preparing a patch.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 17, 2013

Comment author: @diml

Patch attached. If there is no objections I'll commit it.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 17, 2013

Comment author: @gasche

Have I correctly understood that then change involves doing copy for the string arguments (usually file paths) of those functions, when there often was not before? Would that make a performance difference for, say, recursive traversal of a large filesystem?

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 18, 2013

Comment author: @diml

Have I correctly understood that then change involves doing copy for the string arguments (usually file paths) of those functions, when there often was not before?

Yes

Would that make a performance difference for, say, recursive traversal of a large filesystem?

It is going to be slightly slower, yes. With the attached traverse.ml, on a hierarchy of 1M files, I get a 3% slowdown.

I can understand that all these copy are a bit frustrating, but not releasing the lock means that you can block all threads because one of them is accessing a NFS.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 18, 2013

Comment author: @gasche

With the attached traverse.ml, on a hierarchy of 1M files, I get a 3% slowdown.

That seems reasonable.

I can understand that all these copy are a bit frustrating,
but not releasing the lock means that you can block all threads
because one of them is accessing a NFS.

I agree copying and releasing is a better compromise.

I'm not familiar with the FFI, so I can't say whether your patch is correct. (There are good explanations of enter_section/leave_section in http://d.hatena.ne.jp/camlspotter/20100309/1268111257 ).

Can you explain a bit when exactly do you need to add CAMLparam/CAMLreturn? My understanding was that they were need when the GC is called (and caml_stat_alloc{_string} does not call the GC). Does releasing the lock require using those? Why does your patch add them to caml_sys_remove, but not caml_sys_rename, for example?

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 18, 2013

Comment author: @diml

Can you explain a bit when exactly do you need to add CAMLparam/CAMLreturn?

Some of the stubs are using the [path] variable for the error message after the call. So we need to ensure it is still valid.

My understanding was that they were need when the GC is called (and caml_stat_alloc{_string} does not call the GC). Does releasing the lock require using those?

Yes, the GC might run in another thread for instance.

Why does your patch add them to caml_sys_remove, but not caml_sys_rename, for example?

caml_sys_remove uses [path] for the error message but caml_sys_rename doesn't. I guess it wouldn't harm to add them everywhere.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 23, 2013

Comment author: @diml

commited, revision 14384 and 14385.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 24, 2013

Comment author: @yallop

I think 'File_offset_val' needs to be moved outside the blocking section in 'unix_truncate_64', and the call to 'access' in 'unix_access' needs to use 'p' instead of 'String_val(path)'.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 24, 2013

Comment author: @diml

Indeed. Fixed in commits 14387 and 14388. Thanks!

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 2, 2014

Comment author: @yallop

Another thought: it's probably very slightly better to have the calls to caml_stat_free inside the blocking sections.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 29, 2014

Comment author: dsheets

I believe ftruncate, fchown, and fchmod also need to have the lock released.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Jan 29, 2014

Comment author: @diml

Indeed. Fixed in commits 14425 and 14426. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.