SEGV when MULTI_PROCESS enabled #27

Closed
MasahikoSawada opened this issue Dec 16, 2016 · 7 comments

@MasahikoSawada

Hi,

I got a SEGV with a large input file containing parse-error data when MULTI_PROCESS is enabled. The input file is a 13 MB, 500,000-line CSV file that contains malformed data (an unterminated quote, "") at line 153249. The control file is written as follows:

OUTPUT = wikipedia_title                   # [<schema_name>.]table_name
#INPUT = /path/to/test.csv  # Input data location (absolute path)
TYPE = CSV                            # Input file type
QUOTE = "\""                          # Quoting character
ESCAPE = \                            # Escape character for Quoting
DELIMITER = ","                       # Delimiter
WRITER = BUFFERED
VERBOSE = YES
MULTI_PROCESS = YES

The error message I got is the following:

WARNING:  Parse error Record 1: Input Record 153249: Rejected - column 92. unterminated CSV quoted field
WARNING:  Maximum parse error count exceeded - 1 error(s) found in input file
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: LOG:  server process (PID 88994) was terminated by signal 11: Segmentation fault

I think the main cause of this problem is that AsyncSource does not allocate new memory in an appropriate memory context when expanding the read buffer. The memory context in which AsyncSource allocates the read buffer can be switched by the main thread, and the current pg_bulkload does not account for that. Because pg_bulkload resets the current memory context when a parse error occurs and then cleans up the read buffer, it can end up trying to free memory that has already been released by MemoryContextReset().

amitlan added a commit that referenced this issue Dec 19, 2016
AsyncSource mechanism which uses a separate thread to read data can
sometimes try to manipulate an address that may be invalid because of
memory context reset or delete by the main thread.

Fix that by implementing a dedicated memory context for AsyncSource,
which is only manipulated in the reading thread.

Fixes #27.
@amitlan
Contributor

amitlan commented Dec 19, 2016

Thanks for the report!

One solution might be to have a dedicated memory context for AsyncSource, such that, whenever it needs to allocate memory for its buffer, it allocates from the dedicated context and frees the old memory previously allocated in the same context. AsyncSourceClose() should delete the context at the end, or if an error occurs.

Could you review PR #28?

@MasahikoSawada
Author

Thank you for the reply!

One solution might be to have a dedicated memory context for AsyncSource, such that, whenever it needs to allocate memory for its buffer, it allocates from the dedicated context and frees the old memory previously allocated in the same context. AsyncSourceClose() should delete the context at the end, or if an error occurs.

This matches what I had imagined. PR #28 looks good, but since I'm not familiar with the pg_bulkload code, I'm not sure whether the current approach is safe. Is it possible that the main thread unexpectedly switches the memory context from AsyncSource to another context, such as TupleChecker, after we have switched to AsyncSource?

@amitlan
Contributor

amitlan commented Dec 21, 2016

To answer your question: you'll see that we switch back to the AsyncSource context every time we need to palloc for AsyncSource's needs. We also pfree the previously allocated memory while we are in the same context. When an error occurs, as in this issue, we simply delete the context using MemoryContextDelete(), which we know is our own.

Also, as I commented on PR #28, the problem is not caused by the interaction of different threads. It is rather a lack of coordination between different modules of pg_bulkload (such as parser, reader, and source). We already have a dedicated context for the parallel writer, named "ParallelWriter", which I think was designed to guard against exactly this kind of hazard. With this PR, we do the same for AsyncSource.

I noticed, however, that context is a member of the base class Writer, not of ParallelWriter. Maybe we should do something similar for AsyncSource, that is, make the newly introduced context a member of the base class Source instead of AsyncSource. Then again, only AsyncSource seems to need to palloc; other sources such as FileSource and RemoteSource do not.

@MasahikoSawada
Author

I now understand exactly what happened. You're right: reading the source file is done by the child thread, but extending the read buffer is done by the main thread. The main thread extends the read buffer after the memory context has been switched to "ParallelWriter", which is the cause of the SEGV.

I noticed, however, that context is a member of the base class Writer, not of ParallelWriter. Maybe we should do something similar for AsyncSource, that is, make the newly introduced context a member of the base class Source instead of AsyncSource. Then again, only AsyncSource seems to need to palloc; other sources such as FileSource and RemoteSource do not.

ISTM that AsyncSource can have its own context as your current PR does; alternatively, could we use repalloc instead?

@amitlan
Contributor

amitlan commented Dec 21, 2016

ISTM that AsyncSource can have its own context as your current PR does; alternatively, could we use repalloc instead?

I assume you mean, repalloc(self->buffer, newsize) within the proposed dedicated context, instead of the following stanza:

        newbuf = palloc0(newsize);
        /* copy it in new buffer from old buffer */
        pthread_mutex_lock(&self->lock);
        if (self->begin > self->end)
        {
            memcpy(newbuf, self->buffer + self->begin,
                   self->size - self->begin);
            memcpy(newbuf + self->size - self->begin, self->buffer, self->end);
            self->end = self->size - self->begin + self->end;
        }
        else
        {
            memcpy(newbuf, self->buffer + self->begin, self->end - self->begin);
            self->end = self->end - self->begin;
        }
        pfree(self->buffer);
        self->buffer = newbuf;
        self->size = newsize;
        self->begin = 0;
        pthread_mutex_unlock(&self->lock);

Due to the locking between threads involved here, I'm not immediately sure that would be straightforward. It could probably be made to work with enough care, though.

@MasahikoSawada
Author

I assume you mean, repalloc(self->buffer, newsize) within the proposed dedicated context, instead of the following stanza:

Yes. I guessed that we could use repalloc(self->buffer, newsize) instead of switching the memory context and then doing palloc0, memcpy, and pfree. But ISTM that your original PR's approach is safer.

amitlan added a commit that referenced this issue Dec 27, 2016
AsyncSource mechanism which uses a separate thread to read data can
sometimes try to manipulate an address that may be invalid because of
memory context reset or delete by the main thread.

Fix that by implementing a dedicated memory context for AsyncSource,
which is only manipulated in the reading thread.

Fixes #27.
@amitlan
Contributor

amitlan commented Dec 27, 2016

OK, merged the PR for now. Thanks for the review!
