
interrupt opcode handling #15

Closed
wants to merge 9 commits into from

Conversation

aarzilli

No description provided.

@aarzilli
Author

actually reqPool used to work like a free list. Can you rename it so we have reqUsed and reqFree? Or reqFree and reqPending?

reqPool currently hosts both free and used requests; calling it reqUsed or reqPending would be misleading. If you don't like reqPool I could call it reqList.

the sleep looks dodgy. What is this function supposed to do? Does libfuse have the sleep too?

doc/kernel.txt in libfuse at line 148 says this:

If the filesystem cannot find the original request, it should wait for
some timeout and/or a number of new requests to arrive, after which it
should reply to the INTERRUPT request with an EAGAIN error.

libfuse, unless I'm reading it wrong, seems to put every interrupt request that can't be immediately satisfied into a list, and checks every incoming request against that list of interrupted requests.

To do the same we would have to guarantee that an interrupt request is fully processed before we start processing any other request, and that all requests prior to an interrupt request have at least been parsed before we process it. That seems like a big rework of the way go-fuse does things.

Speaking of races...

are you sure you need this? Whether or not a req is in use is already determined by which list it is on?

actually yes, but not the way I was using it. I need it to know that a request has been parsed.

rather than sending, can you close the channel instead? otherwise clients have to worry about extracting values exactly once from the channel.

After closing the channel, you have to put a new channel into the request.

AFAIK there is no way to know that a channel is closed; I would have to create a new channel for every request, regardless of whether it gets closed or not.

Beyond specific commands, I would like to see a test.

I'll get on it

@aarzilli
Author

AFAIK there is no way to know that a channel is closed; I would have to create a new channel for every request, regardless of whether it gets closed or not.

I take it back, there is a way:

func isclosed(ch chan bool) bool {
    select {
    case _, ok := <-ch:
        if !ok {
            return true
        }
    default:
    }
    return false
}
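For example, exercising the helper directly (note that the receive would also consume a buffered value if one were present, so this is only reliable on unbuffered channels):

```go
package main

import "fmt"

func isclosed(ch chan bool) bool {
	select {
	case _, ok := <-ch:
		if !ok {
			return true
		}
	default:
	}
	return false
}

func main() {
	ch := make(chan bool)
	fmt.Println(isclosed(ch)) // false: channel is open and empty
	close(ch)
	fmt.Println(isclosed(ch)) // true: the receive succeeds with ok == false
}
```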

Do you want to do it this way?

@hanwen
Owner

hanwen commented Jun 30, 2013

About the sleep: sleeping in concurrent code is wrong almost by definition. How about the following:

  • on receiving the interrupt, put it on a list of pending interrupted requests if the original is not found.
  • after parsing a normal request successfully, check whether there is a pending interrupt request for it:
    • yes: discard the original request and return EINTR for it.
    • no: continue normally.
  • the pending interrupts list, and the check against it, would have to be protected by reqMu.
  • if a request for which we have a pending interrupt is returned to the pool (i.e. the request was answered successfully), we clear the pending interrupt.
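The scheme above could be sketched roughly like this (server, reqMu, pendingIntr, and the method names are placeholders for this sketch, not go-fuse's actual fields):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errIntr = errors.New("EINTR") // stand-in for the real EINTR status

type server struct {
	reqMu       sync.Mutex
	pendingIntr map[uint64]bool // unique IDs of interrupts whose original wasn't found
}

// onInterrupt records an interrupt whose original request is not in flight.
func (s *server) onInterrupt(unique uint64) {
	s.reqMu.Lock()
	defer s.reqMu.Unlock()
	s.pendingIntr[unique] = true
}

// afterParse runs once a normal request has been parsed; if an interrupt
// for it is pending, the request is discarded with EINTR.
func (s *server) afterParse(unique uint64) error {
	s.reqMu.Lock()
	defer s.reqMu.Unlock()
	if s.pendingIntr[unique] {
		delete(s.pendingIntr, unique)
		return errIntr
	}
	return nil
}

// onDone clears any pending interrupt when a request is answered normally.
func (s *server) onDone(unique uint64) {
	s.reqMu.Lock()
	defer s.reqMu.Unlock()
	delete(s.pendingIntr, unique)
}

func main() {
	s := &server{pendingIntr: map[uint64]bool{}}
	s.onInterrupt(7)             // interrupt arrives before its original request is seen
	fmt.Println(s.afterParse(7)) // the late original is discarded with EINTR
	fmt.Println(s.afterParse(8)) // an unrelated request proceeds normally: <nil>
}
```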

For the management of the channel, how about the following:

Rather than a getInFlightMethod() which returns the wanted request, have an interruptRequest() method taking the unique ID. If it can find the request to interrupt, it does the following:

  • close channel
  • set req.wasInterrupted = true

On returning the request to the pool of free requests, replace the channel if req.wasInterrupted is set.

Both actions should happen under protection by reqMu.

I really don't want to have individual values on the interrupt channel; imagine one request that results in two goroutines being fired. You'd want the channel to be used for canceling both goroutines, and you can't do that by sending a single value.
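The broadcast property of close can be seen with two workers watching the same channel (just the idiom, not go-fuse code):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	interrupt := make(chan struct{})
	var wg sync.WaitGroup

	// Two goroutines serving the same request both watch one channel.
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			<-interrupt // a closed channel unblocks every waiter
			fmt.Printf("worker %d canceled\n", id)
		}(i)
	}

	// One close cancels both; a single value sent on the channel
	// would wake only one of them.
	close(interrupt)
	wg.Wait()
}
```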

@aarzilli
Author

Have you seen my last two commits? Do they address your concerns?

@hanwen
Owner

hanwen commented Jul 11, 2013

I saw them, but didn't realize they were already ready for consideration.
I'll try to look at it in the following days.


Han-Wen Nienhuys - hanwen@xs4all.nl - http://www.xs4all.nl/~hanwen

@hanwen
Owner

hanwen commented Jul 14, 2013

I posted some comments. Also, I have been doing a lot of cleanup, so you'll
need to rebase this on current head. As a side-effect, we'll probably need
to pass the interrupt channel directly in the raw FS methods, and introduce
a nodefs.Context to hold fuse.Context plus the interrupt channel.


- removed println
- direct test that interrupt actually happened
@aarzilli
Author

Can you parametrize times in this test, relative to some fixed constant?

Done

Also, I think the 500ms and 20s are too long.

Pick what you like

the printlns should not be in the final code.

Done

I still don't get the reason for inUse.

request structs need to be removed from reqFree immediately, or multiple requests may use the same struct; however, they don't contain valid data until parsing is done. inUse signals to the interrupt handling code that the data in a request is valid.

I'd much rather have a
reqFree []*request // free for reading into
reqInflight []*request // being processed
reqIntr []*request // pending interrupts

managing reqInflight is going to be O(n) on the fast path. BTW, this is how the patch started (only with an inflight map, which would be at least amortized constant on the fast path).

in particular, if an FS had a highly parallel load at some point previously, a reqPool containing both free and in-use reqs will be large, and you'd be paying unnecessarily for looping through them.

I don't think there is any expectation that an interrupt request will be fast. If this is really a concern, a way to fix it would be compacting and shrinking the request pool on the fast path every nth request, whenever the pool is larger than m elements (but I have no way to tune this heuristic).

Also, I have been doing a lot of cleanup, so you'll need to rebase this on current head. As a side-effect, we'll probably need to pass the interrupt channel directly in the raw FS methods, and introduce a nodefs.Context to hold fuse.Context plus the interrupt channel.

Will do.

@hanwen
Owner

hanwen commented Jul 15, 2013


Also, I think the 500ms and 20s are too long.

Pick what you like

Can you experiment with this for some values? You can define a TTL and have the sleep be 10*TTL. I think you should be able to get the TTL into the 10ms range, but you have to test it.

managing reqInflight is going to be O(n) on the fast path.

No. Here is how you do it:

type request struct {
    // ...
    index int
}

when adding to the inflight list:

    r.index = len(server.inflight)
    server.inflight = append(server.inflight, r)

when removing, swap the last element into the hole, fix its index, then truncate (the index must be fixed before truncating, otherwise removing the last element would index out of bounds):

    s.inflight[r.index] = s.inflight[len(s.inflight)-1]
    s.inflight[r.index].index = r.index
    s.inflight = s.inflight[:len(s.inflight)-1]
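A self-contained check of this O(1) swap-remove (a simplified request type, with an id field added just to tell requests apart):

```go
package main

import "fmt"

type request struct {
	id    int // payload, only to distinguish requests in this sketch
	index int // position in the inflight slice
}

type server struct {
	inflight []*request
}

func (s *server) add(r *request) {
	r.index = len(s.inflight)
	s.inflight = append(s.inflight, r)
}

// remove swaps the last element into r's slot, fixes that element's
// index, then truncates: constant time, no scan of the slice.
func (s *server) remove(r *request) {
	last := len(s.inflight) - 1
	s.inflight[r.index] = s.inflight[last]
	s.inflight[r.index].index = r.index
	s.inflight = s.inflight[:last]
}

func main() {
	s := &server{}
	a, b, c := &request{id: 1}, &request{id: 2}, &request{id: 3}
	s.add(a)
	s.add(b)
	s.add(c)
	s.remove(a) // c is swapped into slot 0 and its index updated
	fmt.Println(s.inflight[0].id, s.inflight[0].index) // 3 0
}
```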



@navytux navytux mentioned this pull request Jan 31, 2019
@navytux
Contributor

navytux commented Mar 12, 2019

Update: commit e028a29e09 added INTERRUPT request handling by closing the req.cancel channel. However, that cancel channel is not yet available in any way to user request handlers.
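If that channel were exposed to handlers, usage would presumably follow the usual Go cancellation idiom (a sketch with made-up names, not go-fuse's actual API):

```go
package main

import (
	"errors"
	"fmt"
)

var errIntr = errors.New("EINTR") // stand-in for the real interrupted status

// handle simulates a request handler racing its work against a cancel
// channel that is closed when the kernel sends INTERRUPT.
func handle(cancel <-chan struct{}, work <-chan string) (string, error) {
	select {
	case <-cancel:
		return "", errIntr
	case res := <-work:
		return res, nil
	}
}

func main() {
	cancel := make(chan struct{})
	work := make(chan string, 1)

	close(cancel) // the kernel interrupted this request before work finished
	_, err := handle(cancel, work)
	fmt.Println(err)
}
```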

@hanwen hanwen closed this Mar 28, 2019