Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FreeBSD/EventKqueue.{h,cc} Added code to restore events on (thread)fork #11430

Merged
merged 3 commits into from Oct 16, 2016

Conversation

wjwithagen
Copy link
Contributor

According the FreeBSD man page of kqueue(), the kq-descriptors become invalid
upon fork. It looks like the same happens when a kq-handle is created and then
a thread is created.

So we keep a list of assigned events with each kq-handle and when the handle
has beccome invalid, we recreated the kq-handle and the events that go with it.

Signed-off-by: Willem Jan Withagen wjw@digiware.nl

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
I've taken that rather rigorous approach to redo the events on a kqfd after the thread has forked.
But it seems to work, since all tests that are available run to completion. (This is for the first time.)

So would you care to review the code, and comment....
More that happy to fix any suggestions.

@liewegas
Copy link
Member

@yuyuyu101

@wjwithagen
Copy link
Contributor Author

@liewegas
I pinged @yuyuyu101 already to see what he thinks of the code.
Probably is not going to win a first prize in a beauty contest, but it does get the thing done.

Copy link
Member

@yuyuyu101 yuyuyu101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basicly looks good. But why we not add handle_fork in EventDriver interface which may more clean?

CephContext *cct;
int size;

private:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: don't need to private here

<< cpp_strerror(errno) << dendl;
return -errno;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move these checks to a function to make it shared

ldout(cct,10) << __func__ << " We changed thread from " << mythread << " to " << pthread_self() << dendl;
mythread = pthread_self();
kqfd = -1;
} else if ((kqfd != -1) && (test_kqfd() < 0)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if thread id isn't changed, it may destroy kqfd?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuyuyu101
I have seen cases where it triggered an "invalid" kqfd. But perhaps that was during testing and the code still had bugs. Lets keep this code in for now. And in due time if we full understand what is going on, perhaps change it into an 'assert()' if this actually should not happen. But now I'm not 100% certain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuyuyu101
Or even just return a negative result. The layer above is not equipped to handle kqueue-event faults, and assert in most cases anyways.

struct SaveEvent{
int fd;
int mask;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move struct kevent to this structure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuyuyu101
That would actually make it hard to use res_events[] the structure that kevent() can return the triggerd events in. So we'd again need to alloc(nevents-size) an array for that, or do that on the stack. Blowing the stack out of proportion on every call to event_wait()

}
}

ldout(cct,10) << __func__ << " add event kqfd = " << kqfd << " fd = " << fd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

20 should be better. 10 is too verbose

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuyuyu101
Will fix all verbose levels, since they are too low.

int mask = cur_mask & delmask;
int mask = cur_mask & del_mask;

ldout(cct,10) << __func__ << " delete event kqfd = " << kqfd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

20

@@ -99,22 +211,66 @@ int KqueueDriver::event_wait(vector<FiredFileEvent> &fired_events, struct timeva
int retval, numevents = 0;
struct timespec timeout;

ldout(cct,10) << __func__ << " kqfd = " << kqfd << dendl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope we don't print this or improve this to 30

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wjwithagen this line... we don't want to print level 10 log when entering event_wait each time

}

if (retval > 0) {
ldout(cct,10) << __func__ << " kevent retval: "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

20 or 30

int j;

numevents = retval;
fired_events.resize(numevents);
for (j = 0; j < numevents; j++) {
int mask = 0;
struct kevent *e = events + j;
struct kevent *e = res_events + j;
ldout(cct,10) << __func__ << " j: " << j << dendl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hope we don't need this

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
Glad you think it is not a total mess.... I will work on the changes.

And yes, I would love to do this at a higher level, but on the other hand it would perhaps obfuscate the code at that level. This is obviously a Kqueue issue, and as such should be hidden in the kqueue-code.
Perhaps for me the most important thing: I'm not 100% confident that I understand all details of the thread creation in the stack module.
Perhaps we should later on, when the dust has settled, and we are quite sure that this is a workable solution that has passed QA.

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
If you want to go the handle_fork way, you will have to give me some pointers/example code of what your idea is. And I'll start thinking about it, and we can see if it delivers cleaner code.

@yuyuyu101
Copy link
Member

@wjwithagen hmm, I think we can go through this way at first. Let's see it will work firstly. Then if available, I could just refactor this workable codes and make it in higher level. So let's make it first!

Thanks!!

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
Great, thanx for the positive feedback.

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
I think I addressed most/all of your comments?
If you are oke with this, I'll rebase it again into 2 commits.

int j;

numevents = retval;
fired_events.resize(numevents);
for (j = 0; j < numevents; j++) {
int mask = 0;
struct kevent *e = events + j;
struct kevent *e = res_events + j;
ldout(cct,20) << __func__ << " j: " << j << dendl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont know why need to print j.... since we already know the number of events returned

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuyuyu101
This is really debug stuff to make sure all events are processed.
So yes it can go.

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
Ping...

@@ -22,34 +22,129 @@
#undef dout_prefix
#define dout_prefix *_dout << "KqueueDriver."

#define KEVENT_NOWAIT 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuyuyu101
Leftover from previous code. Usually I replace hard values with descriptive constants.
I'll see if it still fits the purpose.

ldout(cct,0) << __func__ << " unable to add event: "
<< cpp_strerror(errno) << dendl;
return -errno;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to reset num = 0 here? otherwise, num will continue to grow.....

@@ -99,22 +211,66 @@ int KqueueDriver::event_wait(vector<FiredFileEvent> &fired_events, struct timeva
int retval, numevents = 0;
struct timespec timeout;

ldout(cct,10) << __func__ << " kqfd = " << kqfd << dendl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

public:
explicit KqueueDriver(CephContext *c): kqfd(-1), events(NULL), cct(c), size(0) {}
explicit KqueueDriver(CephContext *c): kqfd(-1), res_events(NULL), cct(c), size(0) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to init sav_events = 0

int mask;
};
struct SaveEvent *sav_events;
int sav_index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right the idea was to keep save usage here.
Best to name it sav_max and keep the size of sav_events for resizing.

int fd;
int mask;
};
struct SaveEvent *sav_events;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because you save all ongoing kqueue events, I think you need to resize in KqueueDriver::resize_events in case of fd is exceed the init number

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yuyuyu101
Right, I was hoping not to run into that :), but it does require action.
Would be a lot when 1 OSD has like 5000 fd's open.

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
Good review, I'm currently also busy with other things, so it might take more than a few hours to rework this.

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
I think I got all the issues you touched and did a bit of cleaning up.
Especially catching the num=0 was a nice one, could be the reason that I sometimes saw restore_event errors, and reset the FD. Probably events were reinserted that were already there.
So can you do another review?

}
memset(&sav_events[size], 0, sizeof(struct SaveEvent)*(newsize-sav_max));
sav_max = newsize;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks format is totally unaligned

@yuyuyu101
Copy link
Member

others look better to me

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
You are right. Looks like vi did not pick up on tabbing.
Fixed that.

Copy link
Member

@yuyuyu101 yuyuyu101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yuyuyu101
Copy link
Member

Because the qa won't compile out Kqueue driver, I don't think we need qa process. But before merging, I want to make sure @wjwithagen have tested the fresh commits. And got ack from you....

@wjwithagen
Copy link
Contributor Author

@yuyuyu101
Before I left yesterday, i fired up a testrun with this code in my work tree:

99% tests passed, 1 tests failed out of 129
Total Test time (real) = 3531.31 sec
The following tests FAILED:
         12 - run-tox-ceph-disk (Failed)

And the failure on the last on is expected, because I have not done any work on it.
So I think it is good to go.

…ead)fork

According the FreeBSD man page of kqueue(), the kq-descriptors become invalid
upon fork. It looks like the same happens when a kq-handle is created and then
a thread is created.

So we keep a list of assigned events with each kq-handle and when the handle
has beccome invalid, we recreated the kq-handle and the events that go with it.

Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
Signed-off-by: Willem Jan Withagen <wjw@digiware.nl>
@wjwithagen
Copy link
Contributor Author

@yuyuyu101
Rebased to compress the commits into 3 pieces.

@yuyuyu101 yuyuyu101 merged commit 21d5f49 into ceph:master Oct 16, 2016
@wjwithagen
Copy link
Contributor Author

@yuyuyu101
thank ....

@wjwithagen wjwithagen deleted the wip-wjw-freebsd-EventKqueue branch January 4, 2017 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants