Azkaban solo server & H2 : The file is locked #1038

Open
hriviere opened this issue Apr 28, 2017 · 4 comments

Comments

@hriviere

Hi,
We recently upgraded our Azkaban solo server instance, which uses an embedded H2 database, from release 3.0.0 to 3.20.0.

We pointed h2.path at a new location so that the database would be recreated from scratch, to avoid any version conflict.

After some time (between 2 and 12 hours), Azkaban crashes because the H2 database is locked:

org.h2.message.DbException: General error: "java.lang.IllegalStateException: The file is locked: nio:/opt/data/azkaban.mv.db [1.4.193/7]" [50000-193]

The database never unlocks (so all SQL queries from Azkaban fail) unless we restart the Azkaban instance.

We have only one instance on the server, and the H2 file is used only by the Azkaban solo server.

With the previous release (3.0.0) we never had this error.

Does anyone have any insight into what is happening?

Thanks!

@kunkun-tang
Contributor

Have you tried forcing H2 to never lock the file, to see what happens?
http://stackoverflow.com/a/39075506

@hriviere
Author

hriviere commented May 3, 2017

That did not solve the issue, but we now get a different error!

I deleted all the H2 files to restart from scratch and configured Azkaban with:

database.type=h2
h2.path=/var/lib/azkaban/azkaban;DB_CLOSE_ON_EXIT=TRUE;FILE_LOCK=NO
h2.create.tables=true
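
For readers unfamiliar with this form of h2.path: everything after the first ";" is a list of H2 connection settings rather than part of the file name. Below is a minimal sketch of how such a value could end up in a JDBC URL. It assumes the solo server simply prefixes the value with "jdbc:h2:file:", which is not confirmed anywhere in this thread, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class H2UrlSketch {
    // Hypothetical helper. Assumption: the configured path is appended
    // verbatim to the "jdbc:h2:file:" prefix.
    static String toJdbcUrl(String h2Path) {
        return "jdbc:h2:file:" + h2Path;
    }

    // Splits the ";KEY=VALUE" settings that follow the file path,
    // preserving the order in which they were written.
    static Map<String, String> settings(String h2Path) {
        Map<String, String> out = new LinkedHashMap<>();
        String[] parts = h2Path.split(";");
        for (int i = 1; i < parts.length; i++) {   // parts[0] is the file path
            String[] kv = parts[i].split("=", 2);
            out.put(kv[0], kv.length > 1 ? kv[1] : "");
        }
        return out;
    }
}
```

Calling H2UrlSketch.settings(...) on the h2.path value above yields DB_CLOSE_ON_EXIT=TRUE and FILE_LOCK=NO, i.e. file locking is switched off for this run.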

After ~24 hours of running, Azkaban crashed (the web server UI is up, but every SQL query fails):

2017-05-03 10:30:17 jdbc[18]: exception
org.h2.jdbc.JdbcSQLException: General error: "java.lang.IllegalStateException: Reading from nio:/var/lib/azkaban/azkaban.mv.db failed; file length -1 read length 768 at 50986998 [1.4.193/1]" [50000-193]
        at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
        at org.h2.message.DbException.get(DbException.java:168)
        at org.h2.message.DbException.convert(DbException.java:295)
        at org.h2.message.DbException.toSQLException(DbException.java:268)
        at org.h2.message.TraceObject.logAndConvert(TraceObject.java:352)
        at org.h2.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:151)
        at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:98)
        at org.apache.commons.dbcp2.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:98)
        at org.apache.commons.dbutils.QueryRunner.update(QueryRunner.java:487)
        at org.apache.commons.dbutils.QueryRunner.update(QueryRunner.java:403)
        at azkaban.executor.JdbcExecutorLoader.updateExecutableFlow(JdbcExecutorLoader.java:154)
        at azkaban.executor.JdbcExecutorLoader.updateExecutableFlow(JdbcExecutorLoader.java:126)
        at azkaban.execapp.FlowRunner.updateFlow(FlowRunner.java:307)
        at azkaban.execapp.FlowRunner.updateFlow(FlowRunner.java:301)
        at azkaban.execapp.FlowRunner.progressGraph(FlowRunner.java:504)
        at azkaban.execapp.FlowRunner.runFlow(FlowRunner.java:395)
        at azkaban.execapp.FlowRunner.run(FlowRunner.java:223)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Reading from nio:/var/lib/azkaban/azkaban.mv.db failed; file length -1 read length 768 at 50986998 [1.4.193/1]
        at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:765)
        at org.h2.mvstore.DataUtils.readFully(DataUtils.java:435)
        at org.h2.mvstore.FileStore.readFully(FileStore.java:98)
        at org.h2.mvstore.Page.read(Page.java:190)
        at org.h2.mvstore.MVStore.readPage(MVStore.java:1954)
        at org.h2.mvstore.MVMap.readPage(MVMap.java:736)
        at org.h2.mvstore.Page.getChildPage(Page.java:217)
        at org.h2.mvstore.MVMap.binarySearch(MVMap.java:468)
        at org.h2.mvstore.MVMap.binarySearch(MVMap.java:469)
        at org.h2.mvstore.MVMap.get(MVMap.java:450)
        at org.h2.mvstore.MVStore.getMapName(MVStore.java:2469)
        at org.h2.mvstore.MVMap.getName(MVMap.java:945)
        at org.h2.mvstore.db.TransactionStore$1.fetchNext(TransactionStore.java:555)
        at org.h2.mvstore.db.TransactionStore$1.<init>(TransactionStore.java:530)
        at org.h2.mvstore.db.TransactionStore.getChanges(TransactionStore.java:524)
        at org.h2.mvstore.db.TransactionStore$Transaction.getChanges(TransactionStore.java:813)
        at org.h2.engine.Session.rollbackTo(Session.java:751)
        at org.h2.command.Command.executeUpdate(Command.java:282)
        at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:160)
        at org.h2.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:146)
        ... 16 more
Caused by: java.nio.channels.ClosedChannelException
        at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
        at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:721)
        at org.h2.store.fs.FileNio.read(FilePathNio.java:74)
        at org.h2.mvstore.DataUtils.readFully(DataUtils.java:421)
        ... 34 more
2017-05-03 10:30:17 jdbc[16]: exception
org.h2.jdbc.JdbcSQLException: The object is already closed; SQL statement:
INSERT INTO execution_jobs (exec_id, project_id, version, flow_id, job_id, start_time, end_time, status, input_params, attempt) VALUES (?,?,?,?,?,?,?,?,?,?) [90007-193]
 (...)
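
The innermost cause in the trace is a java.nio.channels.ClosedChannelException raised from FileChannelImpl.ensureOpen, meaning the file channel was already closed when H2 tried to read from it. One general way a FileChannel gets into that state in Java is a thread being interrupted while using it. Whether that is what happens inside Azkaban is not established in this thread. The sketch below uses only the JDK (no H2, no Azkaban), and its class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class InterruptClosesChannel {
    // Returns the names of the exceptions seen on two consecutive reads.
    static String[] demo() throws IOException {
        Path tmp = Files.createTempFile("channel-demo", ".bin");
        Files.write(tmp, new byte[1024]);
        String first = "none";
        String second = "none";
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            // Set the interrupt status before doing I/O on the channel.
            Thread.currentThread().interrupt();
            try {
                ch.read(ByteBuffer.allocate(16), 0);
            } catch (ClosedByInterruptException e) {
                // The channel closes itself as a side effect of the interrupt.
                first = "ClosedByInterruptException";
            }
            Thread.interrupted(); // clear the interrupt status
            try {
                // The thread is no longer interrupted, but the channel stays closed.
                ch.read(ByteBuffer.allocate(16), 0);
            } catch (ClosedChannelException e) {
                second = "ClosedChannelException";
            }
        } finally {
            Thread.interrupted(); // make sure the flag is clear before cleanup
            Files.deleteIfExists(tmp);
        }
        return new String[] {first, second};
    }
}
```

The point of the sketch is that the channel does not recover once it has been closed this way: every later read fails until the file is reopened, which would be consistent with a failure that persists until restart.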

@hriviere
Author

hriviere commented May 3, 2017

For testing purposes, I relaunched the 3.20 instance with the H2 jar from the 3.0.0 release (h2-1.3.170.jar in place of h2-1.4.193.jar).
The server is booting; I am now waiting to see whether I can reproduce the issue with this H2 version...

@simrandeep11

@hriviere
Were you able to solve the issue?
