New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement couch_file:format_status to log filepath #1601

Merged
merged 1 commit into from Sep 12, 2018

Conversation

Projects
None yet
2 participants
@iilyak
Contributor

iilyak commented Sep 12, 2018

Overview

Erlang OTP logs the fact when gen_server behavior crashes. However in case of couch_file the filename is not part of the state. So it is quite hard to figure out what happened given the current log entry format. This PR adds information about filepath to the log entry. Keep in mind though that the filepath value is captured on couch_file:start_link. It is not representing the current file name if file is renamed or moved.

Testing recommendations

  1. Start couchdb dev/run --admin=adm:pass
  2. Start remsh session dev/remsh
  3. Inject failure in one of the couch_file processes (by reading with position greater than the file size)
{_Port, Pid, _Fd, _Name} = hd(couch_debug:opened_files()).
couch_file:pread_binary(Pid, 999999999999999).
  1. Check logs to make sure the following messages are present:
[error] 2018-09-12T20:01:02.957946Z node1@127.0.0.1 <0.390.0> -------- gen_server <0.390.0> terminated with reason: bad return value {read_beyond_eof,"/home/jenkins/dev/lib/node1/data/dbs/nodes.couch"}
  last msg: {pread_iolist,123123123}
     state: [{data,[{"State",{file,{file_descriptor,prim_file,{#Port<0.51359>,29}},true,139454,#Ref<0.0.3.65>,infinity}},{"InitialFilePath","/home/jenkins/dev/lib/node1/data/dbs/nodes.couch"}]}]

Related Issues or Pull Requests

This is a port of PR from apache/couchdb-couch#215 to couchdb repository.

Checklist

  • Code is written and works correctly;
  • Changes are covered by tests;
  • Documentation reflects the changes;
@iilyak

This comment has been minimized.

Contributor

iilyak commented Sep 12, 2018

Reusing +1 from previous review of the same change apache/couchdb-couch#215 (comment)

@iilyak iilyak merged commit b4bfc52 into apache:master Sep 12, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@iilyak iilyak deleted the cloudant:log_file_path_on_crash branch Sep 12, 2018

@eiri

This comment has been minimized.

Member

eiri commented Sep 27, 2018

This is a great fix, but why did you did it in CamelCase? This is so strange to see javaism in erlang's stack trace...

@iilyak iilyak referenced this pull request Sep 27, 2018

Closed

Don't use CamelCase in format_status #1630

1 of 3 tasks complete
@iilyak

This comment has been minimized.

Contributor

iilyak commented Sep 27, 2018

@eiri: Would this work #1630 or you prefer a string "initial_file_path"?

@eiri

This comment has been minimized.

Member

eiri commented Sep 27, 2018

@iilyak Personally I prefer atoms for both initial_file_path and state just so it's consistent with the rest of stack trace.

But to be clear I didn't try to twist your arm here, it's obviously a question of personal preferences, I was just was surprised when I was walking through commits happened while I was away.

@iilyak

This comment has been minimized.

Contributor

iilyak commented Sep 28, 2018

@eiri: Here is an extract from [gen_server documentation] (http://erlang.org/doc/man/gen_server.html#Module:format_status-2)

the recommended form for the Status value is [{data, [{"State", Term}]}], where Term provides relevant details of the gen_server state. Following this recommendation is not required, but it makes the callback module status consistent with the rest of the sys:get_status/1,2 return value.

I remember being inspired by the following example from OTP

format_status(_Opt, [_PDict, #state{os = OS, threshold = Threshold,
				    timeout = Timeout,
				    diskdata = DiskData}]) ->
    [{data, [{"OS", OS},
	     {"Timeout", Timeout},
	     {"Threshold", Threshold},
             {"DiskData", DiskData}]}].

The use of the function in otp is inconsistent:

Specific = [{data, [{"State", State}]}],
[{header, Header},
     {data, [{"Status", SysState},
	     {"Parent", Parent},
	     {"Logged events", Log}]} |
Specfic]. 
[{data, [{"StateData", State}]}]; 
[{data, [{"State", State}]},
 {supervisor, [{"Callback", State#state.module}]}]
[{data, [{"Timeout", Timeout}]},
 {items, {"Memory Usage", [
      {"Allocated", Allocated},
      {"Total", Total}]}},
 {items, {"Worst Memory User", WorstMemFormat}}].
[{data, [{"State", [{foo, 1}, {bar, 2}]}]}].

My guess is it might have to do with the way how observer or SASL uses this data. Since we are not using either we should be able to use any format we want.

@eiri

This comment has been minimized.

Member

eiri commented Sep 28, 2018

Huh. I haven't seen this gen_server doc's snippet, thank you for pointing out.

To be honest I feel very confused on what OTP folks are doing here in source code, mixing strings, atoms, camelcase and capitalized spaced. Dunno, maybe it just me.

Anyway, I take my complaints back, since you are following worded OTP recommendation here let's keep it this way, sorry about whole confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment