
Memory leak when selecting a large table #29

Closed
shawjef3 opened this issue Jun 19, 2015 · 27 comments

@shawjef3

I have a large table in SQL Server 2008 R2: 57,731,898 rows by 35 columns. The following query causes my PostgreSQL service to crash because the server runs out of memory. The server is running CentOS 7 x64, PostgreSQL 9.4.4, tds_fdw 1.0.3, FreeTDS 0.95, and 256 GB of RAM.

Orders is a native table with the same columns as the tds_fdw-backed foreign table, foreign_orders.

insert into orders
select *
from foreign_orders;

@GeoffMontee
Collaborator

Thanks for the report!

Could you please provide the definitions for your foreign server and foreign table?

I think I may know the cause. I started using MemoryContexts in tds_fdw 1.0.3. Now that PostgreSQL is no longer auto-cleaning tds_fdw's memory, I may have to reset the MemoryContext when the query is complete. That reset should probably happen in tdsEndForeignScan().

Unfortunately, I'm not at home right now, which is where my tds_fdw development environment is set up. I probably won't have an opportunity to work on this for a week or so.
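
For what it's worth, the fix should be fairly small. A minimal sketch of what I have in mind (the TdsFdwExecutionState struct and its mem_cxt field here are stand-ins for whatever tds_fdw actually keeps in fdw_state):

#include "postgres.h"
#include "foreign/fdwapi.h"
#include "utils/memutils.h"

/* Stand-in for tds_fdw's real per-scan state. */
typedef struct TdsFdwExecutionState
{
	MemoryContext mem_cxt;		/* holds scan-lifetime allocations */
} TdsFdwExecutionState;

static void
tdsEndForeignScan(ForeignScanState *node)
{
	TdsFdwExecutionState *festate = (TdsFdwExecutionState *) node->fdw_state;

	/* Free everything allocated in the scan's context in one shot. */
	if (festate->mem_cxt != NULL)
	{
		MemoryContextDelete(festate->mem_cxt);
		festate->mem_cxt = NULL;
	}

	/* ...then close the TDS connection as usual... */
}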

@GeoffMontee
Collaborator

@shawjef3,

I can't test it in my current environment, but I did just commit code that resets the MemoryContext in tdsEndForeignScan().

Commit: 780060b

If it's important for this to be fixed before next week, could you please see if the code in the current git master still has this issue?

@shawjef3
Author

Geoff, thanks for the amazingly quick reply! I'll test it now.

@shawjef3
Author

Here is the table definition, though I've changed the column names just in case my employer would prefer that.

CREATE TABLE api.orders
(
a integer NOT NULL,
b character varying(50) NOT NULL,
c timestamp(3) without time zone,
d integer NOT NULL,
e boolean,
f numeric(22,4),
g character varying(5),
h character varying(5),
i integer,
j bigint,
k bigint,
k2 integer,
l character varying(50),
m character varying(20),
n integer,
o integer,
p timestamp(3) without time zone,
q integer,
r integer,
s bigint,
t bigint,
u timestamp(3) without time zone,
v integer,
w timestamp(3) without time zone,
x integer,
y boolean,
z integer,
aa character varying(255),
ab character varying(255),
ac integer,
ad bigint,
ae bigint,
af character varying(50),
ag character varying(50),
ah bigint,
ai bigint,
aj character varying(255),
CONSTRAINT orders_pkey PRIMARY KEY (a)
)
WITH (
OIDS=FALSE
);
ALTER TABLE api.orders
OWNER TO postgres;

@shawjef3
Author

The foreign table has the same columns, but it also has

SERVER server
OPTIONS (table 'Marketing.API.Orders');

@shawjef3
Author

I don't know if it's relevant, but the PostgreSQL server being inserted into is a master with streaming replication to one slave.

Unfortunately, 780060b doesn't fix the problem. It's not important for this to be fixed over the weekend.

Thanks again, Geoff!

@GeoffMontee
Collaborator

Rather than this being a memory leak, I wonder if I'm just using a very inefficient method to store tds_fdw results at the moment.

Right now, tds_fdw uses ExecStoreTuple().

I see that mysql_fdw uses ExecStoreVirtualTuple().

Even the PostgreSQL documentation says:

> Fetch one row from the foreign source, returning it in a tuple table slot (the node's ScanTupleSlot should be used for this purpose). Return NULL if no more rows are available. The tuple table slot infrastructure allows either a physical or virtual tuple to be returned; in most cases the latter choice is preferable from a performance standpoint. Note that this is called in a short-lived memory context that will be reset between invocations. Create a memory context in BeginForeignScan if you need longer-lived storage, or use the es_query_cxt of the node's EState.

So I guess I should look into that as a possible fix when I get a chance.
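
To illustrate the difference, an iterate callback that returns a virtual tuple fills in the slot's datum/null arrays directly instead of materializing a HeapTuple. A rough sketch (the fetch helper and the single int column are made up for the example, not tds_fdw's actual code):

#include "postgres.h"
#include "executor/tuptable.h"
#include "foreign/fdwapi.h"

static bool fetch_next_tds_row(ForeignScanState *node);	/* hypothetical */

static TupleTableSlot *
tdsIterateForeignScan(ForeignScanState *node)
{
	TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;

	ExecClearTuple(slot);

	if (fetch_next_tds_row(node))
	{
		/* Write the row's values straight into the slot's arrays. */
		slot->tts_values[0] = Int32GetDatum(42);	/* example column */
		slot->tts_isnull[0] = false;

		/* Mark the slot as holding a valid virtual tuple. */
		ExecStoreVirtualTuple(slot);
	}

	/* An empty slot tells the executor there are no more rows. */
	return slot;
}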

@shawjef3
Author

That'd be great. I was honestly surprised when I found out my problem was due to OOM, since the table can in principle fit in memory. On disk it's 18.6 GB.

@shawjef3
Author

I will try 1.0.2 on Monday and see if that resolves the memory problem.

@GeoffMontee
Collaborator

Thanks!

By the way, when you tried the most recent commit, did you restart the PostgreSQL server after doing make install? If not, PostgreSQL would most likely continue using the old shared library. I'm not sure if there's actually a way to replace an extension's shared library without restarting the server or also dropping all objects that use the extension (such as foreign tables).

@shawjef3
Author

Yup, I definitely restarted PostgreSQL.

@shawjef3
Author

I tried 1.0.2 this morning. Memory usage went up while the query ran, but much more slowly. It finished successfully.

So if 1.0.2 has a memory leak, it's much better than 1.0.3.

@GeoffMontee
Collaborator

Thanks for running the test!

Since 1.0.2 performs better, it probably is related to the MemoryContext somehow. I'll let you know when I might have a fix.

@GeoffMontee
Collaborator

I think it would be useful to see how much memory tds_fdw is using at different steps of your query.

The latest commit adds a few new variables:

  • tds_fdw.show_finished_memory_stats - show memory stats after the query has finished but before tds_fdw has cleaned up.
  • tds_fdw.show_before_row_memory_stats - show memory stats before each row is fetched.
  • tds_fdw.show_after_row_memory_stats - show memory stats after each row is fetched.

Commit: 03879e3
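
(For reference, settings like these are defined in the extension's _PG_init() with PostgreSQL's DefineCustomBoolVariable(); a minimal sketch for the first one, not the exact tds_fdw code:)

#include "postgres.h"
#include "utils/guc.h"

static bool show_finished_memory_stats = false;

void
_PG_init(void)
{
	DefineCustomBoolVariable("tds_fdw.show_finished_memory_stats",
							 "Show memory stats once a query has finished.",
							 NULL,					/* no long description */
							 &show_finished_memory_stats,
							 false,					/* default */
							 PGC_USERSET,			/* settable with SET */
							 0,						/* no flags */
							 NULL, NULL, NULL);		/* no hooks */
}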

Would you be willing to create a smaller copy of your table (maybe 100 rows or so) and then use these variables to show how your memory usage grows?

e.g. first set the variables, and execute the query:

postgres=# SET tds_fdw.show_finished_memory_stats=true;
SET
postgres=# SET tds_fdw.show_before_row_memory_stats=true;
SET
postgres=# SET tds_fdw.show_after_row_memory_stats=true;
SET
postgres=# select * from mssql_table;
 id |  data   
----+---------
  1 | horse
  2 | correct
  3 | battery
  4 | staple
  5 | 
  6 | 
(6 rows)

And then your PostgreSQL log should have data that looks like this:

Showing memory statistics before row 1.
ExecutorState: 8192 total in 1 blocks; 2416 free (1 chunks); 5776 used
  printtup: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  tds_fdw data: 8192 total in 1 blocks; 8112 free (0 chunks); 80 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Showing memory statistics after row 1.
ExecutorState: 8192 total in 1 blocks; 2416 free (1 chunks); 5776 used
  printtup: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  tds_fdw data: 8192 total in 1 blocks; 7016 free (0 chunks); 1176 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Showing memory statistics before row 2.
ExecutorState: 8192 total in 1 blocks; 2272 free (1 chunks); 5920 used
  printtup: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
  tds_fdw data: 8192 total in 1 blocks; 6664 free (2 chunks); 1528 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Showing memory statistics after row 2.
ExecutorState: 8192 total in 1 blocks; 2272 free (1 chunks); 5920 used
  printtup: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
  tds_fdw data: 8192 total in 1 blocks; 5568 free (0 chunks); 2624 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Showing memory statistics before row 3.
ExecutorState: 8192 total in 1 blocks; 2272 free (1 chunks); 5920 used
  printtup: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
  tds_fdw data: 8192 total in 1 blocks; 5216 free (2 chunks); 2976 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Showing memory statistics after row 3.
ExecutorState: 8192 total in 1 blocks; 2272 free (1 chunks); 5920 used
  printtup: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
  tds_fdw data: 8192 total in 1 blocks; 4120 free (0 chunks); 4072 used
  ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used

The tds_fdw data context does appear to be growing very quickly, even with the very small table that I tested this with. I'll see if I can track down the issue using my data, but it might be helpful to see your statistics as well.
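
(In case anyone wants to reproduce this kind of report in their own code: each block above is essentially one call to PostgreSQL's MemoryContextStats(), which writes the context tree to stderr, i.e. to the server log. Roughly, with an assumed helper:)

#include "postgres.h"
#include "utils/memutils.h"

/* Assumed debugging helper; tds_fdw presumably passes a context near
 * the top of the executor's tree, since ExecutorState appears first. */
static void
show_memory_stats(MemoryContext cxt, const char *when, int row)
{
	elog(NOTICE, "Showing memory statistics %s row %d.", when, row);
	MemoryContextStats(cxt);
}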

@shulcsm

shulcsm commented Jul 15, 2015

Same issue here; it eats up an unreasonable amount of memory compared to 1.0.1.

@tskenb

tskenb commented Sep 15, 2015

This problem has bitten us, and when we downgrade to 1.0.2 we get bitten by the empty/null string problem that was fixed in the same commit that introduced memory contexts. Is this issue being worked on now? Would the kind of debugging data that you requested back on July 3 still be helpful? I tried playing with the code, but I'm new to memory contexts so progress was slow.

@GeoffMontee
Collaborator

@tskenb,

Sorry to hear that this bug is causing you problems. I would like to fix this bug, but I haven't had the time to work on tds_fdw recently. Any information that you could provide would be great.

I'm guessing that moving tds_fdw from ExecStoreTuple() to ExecStoreVirtualTuple() would be good for memory usage in the long run. However, maybe there is an easier way to fix this problem in the short term.

GeoffMontee added a commit that referenced this issue Sep 15, 2015
@GeoffMontee
Collaborator

@tskenb,

As a quick and dirty attempt, I just committed some changes that may fix the leak, but they still need to be tested. Can you try out the commit to see how things work for you?

fd5b441

The changes in this commit cause the C-string array used for a row's values to be freed after the row is built with ExecStoreTuple(). I'm not really sure if this is allowed by PostgreSQL's API.
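
In outline, the idea looks like this (values, ncols, and the shouldFree flag are my best guesses at the relevant pieces, not a verbatim copy of the commit):

#include "postgres.h"
#include "access/htup_details.h"
#include "executor/executor.h"
#include "funcapi.h"
#include "storage/buf.h"

/* Sketch: build the tuple from the fetched C strings, store it in the
 * slot, then free the string array since the tuple holds its own copies. */
static void
store_row_and_free(AttInMetadata *attinmeta, char **values, int ncols,
				   TupleTableSlot *slot)
{
	HeapTuple	tuple = BuildTupleFromCStrings(attinmeta, values);
	int			i;

	/* shouldFree = true is a guess: let the slot own the tuple too. */
	ExecStoreTuple(tuple, slot, InvalidBuffer, true);

	for (i = 0; i < ncols; i++)
	{
		if (values[i] != NULL)
			pfree(values[i]);
	}
	pfree(values);
}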

@tskenb

tskenb commented Sep 16, 2015

This new commit is better, but there is still a leak, to the point where I couldn't transfer a table of 160 million rows, 80 bytes per row (about 13 GB). The columns are fixed length - int, float or datetime.

I created a table of 100 rows and 2 columns (int and float), and logged the stats that you mentioned with both commits fd5b441 and 2b09f99. Here are the logs:

https://gist.github.com/tskenb/223b5bd3d3b2b772c255
https://gist.github.com/tskenb/c82a2aa2e8b97af2fac2

GeoffMontee added a commit that referenced this issue Sep 16, 2015
@GeoffMontee
Collaborator

@tskenb,

Thanks for the logs! I can definitely tell that there is a big difference in memory usage before and after the patch. Before the patch, tds_fdw was using 243,408 bytes at query end. After the patch, it was using 32,144 bytes.

I just submitted a new commit:

8f15569

Can you please try that commit to see if that's better for you?

By the way, if you are trying to transfer 13 GB, how much total memory does the system have? Also, what is shared_buffers set to?

@tskenb

tskenb commented Sep 17, 2015

The machine has 32 GB of memory, and shared_buffers is set to 12800MB.

The latest commit is a further improvement, but there is still a leak and I still can't transfer that table.

The new log is here:

https://gist.github.com/tskenb/ca320fb3abdf06602e22

Thanks for your efforts so far. I'll see if I can work on it a bit too.

GeoffMontee added a commit that referenced this issue Sep 18, 2015
…ntext in tdsGetColumnMetadata to prevent too much bloating over the iterations of tdsIterateForeignScan
@GeoffMontee
Collaborator

@tskenb,

After reading this, it sounds like memory contexts don't necessarily shrink when data is freed.

I tried something a bit more aggressive. I changed tds_fdw so that the memory context is only used for data that needs to persist across rows. Currently, that data only includes the column metadata that is fetched right after the query is executed.

Can you please try the latest commit?

5b78ee7
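
The shape of the change is roughly this (the column struct and field names are invented for the sketch):

#include "postgres.h"
#include "utils/memutils.h"

typedef struct TdsColumnMeta		/* invented name */
{
	char	   *name;
	int			type;
} TdsColumnMeta;

typedef struct TdsFdwExecutionState
{
	MemoryContext	mem_cxt;		/* the "tds_fdw data" context */
	TdsColumnMeta  *columns;
	int				ncols;
} TdsFdwExecutionState;

static void
tdsGetColumnMetadata(TdsFdwExecutionState *festate, int ncols)
{
	/* Allocate only the column metadata in the long-lived context... */
	MemoryContext oldcxt = MemoryContextSwitchTo(festate->mem_cxt);

	festate->ncols = ncols;
	festate->columns = palloc0(ncols * sizeof(TdsColumnMeta));
	/* ...fill in names/types from the TDS result set here... */

	/* ...and switch back, so per-row allocations land in the executor's
	 * short-lived context, which PostgreSQL resets between rows. */
	MemoryContextSwitchTo(oldcxt);
}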

@tskenb

tskenb commented Sep 18, 2015

Interesting. The log file is here.

On one hand, I can transfer the whole table, and the memory pattern looks more like 1.0.2, meaning it doesn't seem to grow with the number of rows. And the resulting table looks close enough to correct that the diffs are probably due to other issues.

On the other hand, that log file has some strange numbers in it and the solution is awfully hacky. I have been looking at it, but have so far only succeeded in making things worse. In the latest commit, it looks like stuff allocated in tdsIterateForeignScan doesn't use the "tds_fdw data" context at all. What was the original motivation for introducing a new memory context? Does some data need to be preserved across iterations?

@GeoffMontee
Collaborator

@tskenb,

> On one hand, I can transfer the whole table, and the memory pattern looks more like 1.0.2, meaning it doesn't seem to grow with the number of rows.

Awesome! That's great news!

> And the resulting table looks close enough to correct that the diffs are probably due to other issues.

If any differences point towards any more issues in tds_fdw, let me know.

If any differences are related to character encodings, keep in mind that encodings can be tricky, since several different encodings are in play with tds_fdw and all need to match up or be compatible:

  • The encoding of the database/table/column on the remote server.
  • The encoding used in transit over the wire between tds_fdw and the remote SQL Server or Sybase server. (This is always UCS-2 if tds version is set to 7 or higher.)
  • The encoding used by your PostgreSQL client's connection. (e.g. see PostgreSQL's client_encoding variable)
  • The encoding used by the PostgreSQL database that contains the foreign table.

> On the other hand, that log file has some strange numbers in it and the solution is awfully hacky. I have been looking at it, but have so far only succeeded in making things worse.

While the solution sounds a bit strange, I'm not so sure that it's hacky. After thinking about it, this might be a better way to use PostgreSQL's API. (more on that next)

> In the latest commit, it looks like stuff allocated in tdsIterateForeignScan doesn't use the "tds_fdw data" context at all. What was the original motivation for introducing a new memory context? Does some data need to be preserved across iterations?

The reason for introducing the memory context was that I wanted to preserve column metadata between iterations of tdsIterateForeignScan. Originally, I decided to use the memory context for the whole execution of tdsIterateForeignScan, rather than just in tdsGetColumnMetadata. This meant that every memory block allocated during the iterations would be added to the memory context.

I didn't think that would be a problem, as long as I freed the memory that didn't need to persist at the end of a given iteration. That didn't quite work out as I expected because when the memory was freed, it seems that the memory context doesn't necessarily shrink.

Also, even after freeing all of the memory that I allocated (except the column metadata), memory usage still seemed to grow too much for you. That makes me wonder if using the memory context in that way was unintentionally causing other data to be included in it when I called various PostgreSQL API functions that may make their own palloc calls.

I'm still relatively new to PostgreSQL extension development, but I think this most recent commit is probably using the memory context in a more efficient way. It seems that different memory contexts should be used for persistent data and short-lived data.
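
For the record, the persistent context is the "tds_fdw data" context that shows up in the logs above. It gets created once per scan, parented to the executor's query context as the documentation suggests; something along these lines (pre-9.6 AllocSet constants, with the state struct sketched as before):

#include "postgres.h"
#include "foreign/fdwapi.h"
#include "utils/memutils.h"

typedef struct TdsFdwExecutionState
{
	MemoryContext mem_cxt;
} TdsFdwExecutionState;

static void
tdsBeginForeignScan(ForeignScanState *node, int eflags)
{
	EState	   *estate = node->ss.ps.state;
	TdsFdwExecutionState *festate;

	festate = palloc0(sizeof(TdsFdwExecutionState));
	festate->mem_cxt = AllocSetContextCreate(estate->es_query_cxt,
											 "tds_fdw data",
											 ALLOCSET_DEFAULT_MINSIZE,
											 ALLOCSET_DEFAULT_INITSIZE,
											 ALLOCSET_DEFAULT_MAXSIZE);
	node->fdw_state = (void *) festate;

	/* ...connect to the remote server, send the query, etc... */
}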

@tskenb

tskenb commented Sep 18, 2015

Thanks for the explanation. I agree now that the latest commit is good. I just have one tweak: I tried reverting the previous two commits - using false in the ExecStoreTuple call, and deleting the following loop and the pfree(values) line. It didn't change the memory usage, and the log file looks a lot better.
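
In code terms, the tweak amounts to this (same assumed names as in the earlier sketches):

	/* Store the tuple without the slot taking ownership, and let the
	 * executor's short-lived per-row context handle cleanup; the
	 * explicit loop over values[] and the pfree(values) are gone. */
	tuple = BuildTupleFromCStrings(attinmeta, values);
	ExecStoreTuple(tuple, slot, InvalidBuffer, false);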

Yeah, I think my diffs are not real diffs, just variances in representation.

Thanks again for your help.

GeoffMontee added a commit that referenced this issue Sep 18, 2015
…hat it pfrees memory when it's cleaning up"

This reverts commit 8f15569.
@GeoffMontee
Collaborator

@tskenb,

> I just have one tweak: I tried reverting the previous two commits - using false in the ExecStoreTuple call, and deleting the following loop and the pfree(values) line. It didn't change the memory usage, and the log file looks a lot better.

Good catch! I've reverted those two commits.

> Thanks again for your help.

No problem! I'm glad you are finding tds_fdw useful. Let me know if you run into any other issues.

It might be time for a new release soon now that this memory issue appears to be under control.

@GeoffMontee
Collaborator

Fixed in v1.0.4.
