Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include name of binary into default BISECT_FILE name. #96

Closed
wants to merge 1 commit into from

Conversation

lindig
Copy link

@lindig lindig commented May 12, 2016

By including the name of the binary running, logs for multiple binaries
can go to the same directory without having to use the env variable.

Signed-off-by: Christian Lindig christian.lindig@citrix.com

By including the name of the binary running, logs for multiple binaries
can go to the same directory without having to use the env variable.

Signed-off-by: Christian Lindig <christian.lindig@citrix.com>
@aantron
Copy link
Owner

aantron commented May 12, 2016

Looks reasonable. It might break some build processes, but hopefully most projects are using something like bisect*.out or *.out to match the .out files.

This commit breaks the usage test. Can you please amend it by including changes to tests/usage/Makefile, specifically the lines mentioning bisect0001.out?

@aantron
Copy link
Owner

aantron commented May 12, 2016

One thing that concerns me is that BISECT_FILE can be used to change the .out file directory, but someone that wants to do that would lose the benefit of this PR. What do you think about this?

@lindig
Copy link
Author

lindig commented May 12, 2016

If env var BISECT_FILE is set to a path, this is still respected and treated as before. Only in the case when BISECT_FILE is not set, rather than bisect now bisect-progname is used. But I see your point: in that case the user would have to supply a good name. But in my opinion using BISECT_FILE is the use case when a user wants control. A slightly fancier model would be to allow BISECT_FILE=/some/path/to/bisect-%n-%c.out where %n and %c are replaced by dump():

  • %n is replaced by the binary name
  • %c is replaced by the counter

Hence, in that case the user is expected to supply a name that at least includes %c. One could think of more patterns, like %p for pid. The default would then amount to bisect-%n-%c.out.

I don't see the failing test when doing make test or make unit from tests/ although I agree that it should fail. How can I trigger it?

@aantron
Copy link
Owner

aantron commented May 13, 2016

Hmm, to run that test, you have to install your updated Bisect_ppx and then do make -C tests/usage. Not exactly very usable; I will open an issue for improving that.

I think a simpler solution to the environment variable thing is to provide something like BISECT_DIR and BISECT_FILE, where the actual directory will be the Filename.concat of the two (may have to be careful if BISECT_DIR is empty or BISECT_FILE is an absolute path). We can then change the docs to encourage using BISECT_DIR for changing the directory only and BISECT_FILE for changing the filename only. The Filename.concat will be a hidden detail for backwards compatibility. BISECT_DIR should then be emphasized in the docs, since presumably that is what "most" people are using BISECT_FILE for now. Thoughts?

@aantron
Copy link
Owner

aantron commented May 13, 2016

Actually, your proposal may be better. Would you be okay with the default being bisect%c.out, thus maintaining backwards compatibility?

EDIT: That wouldn't be fully backwards-compatible, actually. I think for full backwards-compatibility, you would have to treat BISECT_FILE set but without %c as a prefix, as now.

EDIT 2: And hope nobody's prefix has %c in it... it's amazing how complicated this simple issue is.

@lindig
Copy link
Author

lindig commented May 13, 2016

My use case is: we are instrumenting a bunch of daemons and the less configuration the better. Right now they would log all into the same place, using the same name. Hence, I would prefer a default where I can see which daemon wrote which file. The dump function is clever enough to not write over existing files. For me, the best default would be "/tmp/bisect-%n-%-c.out" or similar. I see your point of maintaining backward compatibility. If the default is to keep "bisectXXXX.out", maybe we could log more information in the head of the *.out file such that by inspecting them, one knows where it is coming from?

@rleonid
Copy link
Collaborator

rleonid commented May 13, 2016

Although, I'll defer to @aantron on this, but I would be very hesitant to accept this patch.

You're trying to push your desired configuration into bisect_ppx when it can be implemented by wrapping the calls to your daemons with the appropriate environment variable. I'm not certain that it is worthwhile for bisect_ppx to maintain this feature.

@lindig
Copy link
Author

lindig commented May 13, 2016

I'm using the env variable right now. However, I believe not to be the only one having the problem that the connection between the files produced and where they come from is weak and a stronger connection is desirable. As I had suggested, it could be even inside the *.out file. Profiling also not always happens in the context where all source files are available and this is why a stronger connection would help.

@aantron
Copy link
Owner

aantron commented May 13, 2016

I agree with @rleonid. It seems potentially worthwhile to have the ability to include the binary name, or PID, or some other identifier(s) in BISECT_FILE, but I would be hesitant to make support for the above use case the default. Can you do something like export BISECT_FILE=/tmp/bisect-%n-%c.out && my_daemon in your configuration(s)? Also, why not export BISECT_FILE=/tmp/bisect-my_daemon-, using just the current implementation?

If you know specific people that are on GitHub and can comment on this issue, we would like to hear their opinions as well :)

@lindig
Copy link
Author

lindig commented May 19, 2016

I fine with giving up my pull request. To continue the discussion, bisect_ppx could also provide an init function such that a program being profiled could programatically direct where log files are written. Today this is possible by setting the env variable from within the program but an official API would be a little cleaner. I'm using the code below; it can stay in programs that don't link against the bisect libs as there is no direct connection.

let init name =
  let (//)    = Filename.concat in
  let tmpdir  = Filename.get_temp_dir_name () in
    try 
      ignore (Sys.getenv "BISECT_FILE") 
    with Not_found ->
      Unix.putenv "BISECT_FILE" (tmpdir // Printf.sprintf "bisect-%s-" name)

@aantron
Copy link
Owner

aantron commented May 19, 2016

it can stay in programs that don't link against the bisect libs as there is no direct connection.

This is cool, and probably a reason not to add a "proper" API. I imagine few projects actually want a hard dependency on Bisect_ppx, or to introduce conditional compilation due to it.

I am closing this for now. I am under the impression that setting BISECT_FILE outside the program at all is what you are trying to avoid. However, I would be reluctant to make Bisect_ppx change defaults specifically to support one project, specifically in its current form, and complicate it for others. At least, it would be nice to get some more opinions before making such a change.

You are welcome to reopen if you have additional thoughts. There are good ideas in the PR – perhaps some variation of it should be added. Also, parts of the issue might be addressed by modifying the documentation. Perhaps the init function could be made into a tip.

@aantron aantron closed this May 19, 2016
@aantron
Copy link
Owner

aantron commented May 19, 2016

To summarize, the proposals were:

  1. Originally: include the binary name into default value of BISECT_FILE.

    This made it difficult to both include the binary name, and use BISECT_FILE to change the directory at the same time.

  2. Make it possible to both include the binary name and change the directory, using two variables BISECT_FILE and BISECT_DIR.

    This seemed complicated and not very general.

  3. Make it possible to include the binary name and keep the ability to change the directory, by using a placeholder such as %n: export BISECT_FILE=my_directory/bisect-%n-%c.out.

    Seem to have settled on this, but not on including %n in the BISECT_FILE default. The considerations were backwards compatibility and complicated filenames for simple (common?) usage.

  4. Log the binary name to the .out file, and perhaps log other values as well. Provide some way to filter .out files during reporting.

    Potentially a very nice solution. @rleonid? PRs, or amendments to this PR, welcome :)

  5. Provide an API for the program/test runner to set the .out file name programmatically.

    It seems better not to do this to avoid hard dependencies on Bisect_ppx, though perhaps some projects are only ever tested while linked with Bisect_ppx.

  6. Improve the documentation on this topic.

    If you are aware of a part of the docs, where you had looked, and where discussing this would have been particularly helpful to save time on this issue, please let me know. A PR is also welcome, as well :)

@rleonid
Copy link
Collaborator

rleonid commented May 19, 2016

Logging the desired output seems safe and easy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants