OutOfMemory when copying a large table #5

Open
mishael-w opened this issue Jul 12, 2018 · 7 comments
@mishael-w

Hey,

I'm trying to use your tool to migrate my project from MSSQL to Postgres. I've been able to migrate all of my tables and data except one, which has about 20 million records.

While running the migration for that specific table, I get the following error:

java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
at java.util.concurrent.FutureTask.report(Unknown Source)
at java.util.concurrent.FutureTask.get(Unknown Source)
at net.twentyonesolutions.m2pg.PgMigrator.lambda$doDml$1(PgMigrator.java:155)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(Unknown Source)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(Unknown Source)
at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
at java.util.stream.ForEachOps$ForEachTask.compute(Unknown Source)
at java.util.concurrent.CountedCompleter.exec(Unknown Source)
at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
at java.util.concurrent.ForkJoinTask.doInvoke(Unknown Source)
at java.util.concurrent.ForkJoinTask.invoke(Unknown Source)
at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(Unknown Source)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(Unknown Source)
at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
at java.util.stream.ReferencePipeline.forEach(Unknown Source)
at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source)
at net.twentyonesolutions.m2pg.PgMigrator.doDml(PgMigrator.java:153)
at net.twentyonesolutions.m2pg.PgMigrator.main(PgMigrator.java:83)
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuilder.append(Unknown Source)
at net.twentyonesolutions.m2pg.Schema.copyTable(Schema.java:254)
at net.twentyonesolutions.m2pg.PgMigrator.lambda$doDml$0(PgMigrator.java:138)
at net.twentyonesolutions.m2pg.PgMigrator$$Lambda$17/6581689.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Is there any solution?

@isapir
Owner

isapir commented Jul 12, 2018

Strange. That's not supposed to happen even for very large tables.

Did you see any "Insert Failed" error messages before it crashed?

@mishael-w
Author

mishael-w commented Jul 12, 2018 via email

@mishael-w
Author

Another thing I should have mentioned is that it happens somewhere after 40% of the table has been copied (as your program indicates in the output).

@isapir
Owner

isapir commented Jul 12, 2018

OK, so the way it works is that the log output for each table is built up in memory, because the copy runs in multiple threads. Only when the table finishes copying is the buffered log written to the log file.

The process can handle large tables, but yours ran out of memory because the buffered log entries filled up the heap. The reason you don't see the errors in the log file is that the process crashed before it got to the point of writing them.

I will think of a better way of handling this in the future, but for now you should try to figure out why you get the "Insert Failed" errors and fix that. You should then be able to copy your 20M-record table with no issue.

After you see the first "Insert Failed" message, hit CTRL+C so that you can read the message.
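To make the failure mode concrete, here is a minimal sketch of the per-table log buffering described above. It is not the actual Schema.copyTable code; the class and method names (BufferedTableLog, copyTableWithBufferedLog, tryInsert) are invented for illustration. The point is that every failed insert appends another line to an in-memory StringBuilder, and nothing is flushed to disk until the whole table is done, so millions of errors can exhaust the heap before anything is logged.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BufferedTableLog {

    // One buffer per table; every failed insert appends another line.
    // With millions of failures the buffer alone can exhaust the heap.
    public static void copyTableWithBufferedLog(String tableName,
                                                Iterable<String> rows,
                                                Path logFile) throws IOException {
        StringBuilder log = new StringBuilder();
        for (String row : rows) {
            boolean ok = tryInsert(row);   // placeholder for the real INSERT into Postgres
            if (!ok) {
                log.append("Insert Failed: ").append(tableName)
                   .append(": ").append(row).append('\n');
            }
        }
        // The buffered log is written only after the whole table completes,
        // which is why a crash mid-copy leaves no errors in the log file.
        Files.writeString(logFile, log.toString(),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    private static boolean tryInsert(String row) {
        return !row.isEmpty();             // stand-in for a JDBC insert
    }
}
```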

@isapir
Owner

isapir commented Jul 12, 2018

Another thing I should have mentioned is that it happens somewhere after 40% of the table has been copied (as your program indicates in the output).

8 million records should not be an issue, but millions (or even fewer) of error messages buffered in memory for later logging would easily exhaust the heap.

Find the source of the "Insert Failed" errors.
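As a rough back-of-the-envelope check (the numbers here are assumptions, not measurements), buffered error lines add up quickly: if a large share of the ~8 million rows copied before the crash each produced an "Insert Failed" line of around 200 characters, the in-memory log alone would be several gigabytes, well beyond a typical default heap.

```java
public class LogMemoryEstimate {
    public static void main(String[] args) {
        long failedRows = 8_000_000L;   // assumed: most rows copied so far failed
        long charsPerLine = 200L;       // assumed average length of an "Insert Failed" line
        long bytesPerChar = 2L;         // chars in a StringBuilder are UTF-16
        long bytes = failedRows * charsPerLine * bytesPerChar;
        System.out.printf("~%.1f GB of log data buffered on the heap%n", bytes / 1e9);
    }
}
```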

@mishael-w
Author

mishael-w commented Jul 16, 2018 via email

@isapir
Owner

isapir commented Jul 16, 2018

OK, thanks for the feedback.

I have no idea how many people have found this project useful, or whether it's worth putting more time and effort into it, so it's good to hear from someone that it works.
