Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLite errors when copying files to file system -- series/files get stuck in the drop folder #866

Open
jmjf opened this issue Jan 12, 2021 · 4 comments

Comments

@jmjf
Copy link

jmjf commented Jan 12, 2021

VERSION INFORMATION

Server Version: Linux Docker image daily (January 9, 2021, at 3:11 a.m. US Eastern time) Digest:sha256:a45b9630aa08a58e0ccbffd1302497bb08220bda756a83abc8478a9271ac2a87

Desktop Version: n/a, this is a server issue, but I pulled a daily desktop on January 9, 2021, at ~10:00 a.m. (US Eastern time).

LOG FILE

2021-01-12.log

DESCRIPTION

SQLite/SQLite library throws errors when Shoko is working while other activity is happening on the file system.
Examples of other activity include copying files into the drop folder, running rm -R on a deep directory tree from the Linux command line.

This seems to cause files/series to get stuck in the drop folder.

  • One or more series is counted in the drop folder on the web UI.
    • Currently, my drop folder shows 1 series, 4.06 GB in the web UI.
  • Files for the series remain in the drop folder (aren't moved, but are renamed).
  • The web UI Import Folders shows no series in the drop folder.
  • The web UI and desktop UI show no unrecognized files.
  • Stuck series does not appear in the desktop UI or appears with no files, no images, not metadata (seems to be a broken group).
  • Series that do appear in desktop may have missing files (that are still in the drop folder).
  • Rescanning the drop folder (web ui icon in the import folders widget) moves some files, but not all.
    • Moved missing files from series that Shoko previously recognized.
    • Moved a movie Shoko previously did not recognize (not even in the drop folder).
    • Did not move a series that Shoko seems to believe belongs in the drop folder (could not move errors described below--series cannot be found for file).
  • To fix stuck series, delete files from the drop folder, force Shoko to rescan and remove missing files (so it forgets they exist, I guess), then copy data back to the drop folder.
  • Usually, this issue only affects the first series Shoko attempts to process.
    • But if the first copy finishes and, while Shoko is still processing, I copy another chunk of files, other series may be affected.

Workaround

Pause all queues.
Copy files. Wait until finished.
Restart all queues.

Speculation

Root cause seems to be one or more of:

  • SQLite is failing because the file system is busy copying files.
  • C#/.NET SQLite library fails because SQLite is waiting on the file system.
  • Shoko does not handle (or doesn't get info it needs to handle) the failure cleanly (maybe retry with exponential backoff or dump all commands for the file so it can be rescanned later).

If there is no other solution, maybe wait to start processing queues until no new files/folders in watched folders for a period of time. ("A period of time" could be long if someone is moving a 10+GB movie, so maybe detect file size changes if possible.)

That doesn't resolve the rm -R case, but I'll acknowledge that's rare.

STEPS TO REPRODUCE

Server setup notes

Ubuntu 20.04.1, last updated <7 days before posting.
Data is on a mirrored BTRFS volume (2x 6 TB Ironwolf CMR drives).
sudo docker pull shokoanime:server/daily reports I have the latest image as of posting.
Data shared via Samba to a Windows machine.

Steps

  • Create a Shoko docker-compose setup for a test environment.
  • Ensure drop folder and destination folder are empty.
  • Ensure Shoko.CLI doesn't exist (clean init).
  • Start Shoko, enter user names, passwords.
  • Add import folder AnimeSeries (drop destination, do not watch for new files)
  • Add import folder ShokoDrop (drop source, watch for new files)
  • Wait for a few minutes
    • Shoko is getting anime info after startup in a clean init (maybe from AniDB MyList ???).
    • Wait for it to finish pulling data.
  • When all queues go idle, use Windows File Manager to copy folders for Ajin (2 seasons plus specials, movies), Code Geass (2 seasons, movies, specials, etc.), Dororo (episodes) and Evil or Live (episodes) into ShokoDrop. (~100 GB, takes several minutes to copy)
    • This is a copy, not a move/rename, so it takes several minutes to copy everything. Files are copying when Shoko starts processing.
  • After detecting all Evil or Live (first folder copied) and starting Ajin I see errors like SQLite errors (example below, full log attached above). Errors repeat periodically.
  • Files related to series affected by SQLite errors fail to move to the destination folder and get stuck in the drop folder (example below, full log attached above). Errors repeat for all files in affected series.
  • Log file attach includes a rescan of the drop folder.

SQLite Error Example

2021-01-12T19:04:27.18775Z
New file detected: /ShokoDev/ShokoDrop/EvilOrLive/[HorribleSubs] Evil or Live - 12 [720p].mkv: Created
2021-01-12T19:04:27.18775Z
Found file /ShokoDev/ShokoDrop/EvilOrLive/[HorribleSubs] Evil or Live - 12 [720p].mkv
2021-01-12T19:04:30.145757Z
New folder detected: /ShokoDev/ShokoDrop/Ajin: Created
2021-01-12T19:04:30.1462683Z
Found file /ShokoDev/ShokoDrop/Ajin/[AHNM-Subs] Ajin S00E001 Shoudou.mp4 under folder /ShokoDev/ShokoDrop/Ajin
2021-01-12T19:04:45.2726324Z
New file detected: /ShokoDev/ShokoDrop/Ajin/[AHNM-Subs] Ajin S00E001 Shoudou.mp4: Created
2021-01-12T19:04:45.2726324Z
Found file /ShokoDev/ShokoDrop/Ajin/[AHNM-Subs] Ajin S00E001 Shoudou.mp4
2021-01-12T19:04:45.2891873Z
Trying to load 13417 anime data from cache.
2021-01-12T19:04:46.6342444Z
Hashing File: /ShokoDev/ShokoDrop/EvilOrLive/[HorribleSubs] Evil or Live - 02 [720p].mkv
2021-01-12T19:05:46.5391115Z
There was an error retrieving the next command for the Image Queue: NHibernate.Exceptions.GenericADOException: could not execute query [ SELECT this_.CommandRequestID as commandrequestid1_0_0_, this_.CommandDetails as commanddetails2_0_0_, this_.CommandID as commandid3_0_0_, this_.CommandType as commandtype4_0_0_, this_.DateTimeUpdated as datetimeupdated5_0_0_, this_.Priority as priority6_0_0_ FROM "CommandRequest" this_ WHERE this_.CommandType in (@p0, @p1, @p2) ORDER BY this_.Priority asc, this_.DateTimeUpdated asc limit @p3 ] Name:cp0 - Value:31 Name:cp1 - Value:33 Name:cp2 - Value:96 [SQL: SELECT this_.CommandRequestID as commandrequestid1_0_0_, this_.CommandDetails as commanddetails2_0_0_, this_.CommandID as commandid3_0_0_, this_.CommandType as commandtype4_0_0_, this_.DateTimeUpdated as datetimeupdated5_0_0_, this_.Priority as priority6_0_0_ FROM "CommandRequest" this_ WHERE this_.CommandType in (@p0, @p1, @p2) ORDER BY this_.Priority asc, this_.DateTimeUpdated asc limit @p3] ---> code = Busy (5), message = System.Data.SQLite.SQLiteException (0x800007AF): database is locked database is locked at System.Data.SQLite.SQLite3.Step(SQLiteStatement stmt) at System.Data.SQLite.SQLiteDataReader.NextResult() at System.Data.SQLite.SQLiteDataReader..ctor(SQLiteCommand cmd, CommandBehavior behave) at System.Data.SQLite.SQLiteCommand.ExecuteReader(CommandBehavior behavior) at NHibernate.AdoNet.AbstractBatcher.DoExecuteReader(DbCommand cmd) at NHibernate.AdoNet.AbstractBatcher.ExecuteReader(DbCommand cmd) at NHibernate.Loader.Loader.GetResultSet(DbCommand st, QueryParameters queryParameters, ISessionImplementor session, IResultTransformer forcedResultTransformer) at NHibernate.Loader.Loader.DoQuery(ISessionImplementor session, QueryParameters queryParameters, Boolean returnProxies, IResultTransformer forcedResultTransformer, QueryCacheResultBuilder queryCacheResultBuilder) at NHibernate.Loader.Loader.DoQueryAndInitializeNonLazyCollections(ISessionImplementor session, QueryParameters queryParameters, Boolean returnProxies, IResultTransformer forcedResultTransformer, QueryCacheResultBuilder queryCacheResultBuilder) at NHibernate.Loader.Loader.DoList(ISessionImplementor session, QueryParameters queryParameters, IResultTransformer forcedResultTransformer, QueryCacheResultBuilder queryCacheResultBuilder) --- End of inner exception stack trace --- at NHibernate.Loader.Loader.DoList(ISessionImplementor session, QueryParameters queryParameters, IResultTransformer forcedResultTransformer, QueryCacheResultBuilder queryCacheResultBuilder) at NHibernate.Loader.Criteria.CriteriaLoaderExtensions.LoadAllToList[T](IList`1 loaders, ISessionImplementor session) at NHibernate.Impl.SessionImpl.List[T](CriteriaImpl criteria) at NHibernate.Impl.CriteriaImpl.List[T]() at Shoko.Server.Repositories.Direct.CommandRequestRepository.GetNextDBCommandRequestImages() in /usr/src/app/source/Shoko.Server/Repositories/Direct/CommandRequestRepository.cs:line 228
2021-01-12T19:05:56.9706734Z
Hashing File: /ShokoDev/ShokoDrop/EvilOrLive/[HorribleSubs] Evil or Live - 03 [720p].mkv
2021-01-12T19:06:15.616062Z
While calculating group filters, an AnimeSeries without a group was found: Evil or Live
2021-01-12T19:06:22.4797887Z
Hashing File: /ShokoDev/ShokoDrop/EvilOrLive/[HorribleSubs] Evil or Live - 04 [720p].mkv
2021-01-12T19:06:30.8837071Z
While calculating group filters, an AnimeSeries without a group was found: Evil or Live
2021-01-12T19:06:42.9218327Z
While calculating group filters, an AnimeSeries without a group was found: Evil or Live
2021-01-12T19:06:46.3534693Z
While calculating group filters, an AnimeSeries without a group was found: Evil or Live
2021-01-12T19:06:50.746166Z
While calculating group filters, an AnimeSeries without a group was found: Evil or Live
2021-01-12T19:06:53.4198414Z
While calculating group filters, an AnimeSeries without a group was found: Evil or Live
2021-01-12T19:06:54.3225987Z
Error processing file: /ShokoDev/ShokoDrop/EvilOrLive/[HorribleSubs] Evil or Live - 04 [720p].mkv NHibernate.TransactionException: Begin failed with SQL exception ---> code = Busy (5), message = System.Data.SQLite.SQLiteException (0x800007AF): database is locked database is locked at System.Data.SQLite.SQLite3.Step(SQLiteStatement stmt) at System.Data.SQLite.SQLiteDataReader.NextResult() at System.Data.SQLite.SQLiteDataReader..ctor(SQLiteCommand cmd, CommandBehavior behave) at System.Data.SQLite.SQLiteCommand.ExecuteReader(CommandBehavior behavior) at System.Data.SQLite.SQLiteCommand.ExecuteNonQuery(CommandBehavior behavior) at System.Data.SQLite.SQLiteTransaction.Begin(Boolean deferredLock) at System.Data.SQLite.SQLiteConnection.BeginDbTransaction(IsolationLevel isolationLevel) at NHibernate.Transaction.AdoTransaction.Begin(IsolationLevel isolationLevel) --- End of inner exception stack trace --- at NHibernate.Transaction.AdoTransaction.Begin(IsolationLevel isolationLevel) at NHibernate.Impl.AbstractSessionImpl.BeginTransaction() at Shoko.Server.Repositories.BaseCachedRepository`2.Save(T obj) in /usr/src/app/source/Shoko.Server/Repositories/BaseCachedRepository.cs:line 262 at Shoko.Server.Repositories.Cached.VideoLocalRepository.Save(SVR_VideoLocal obj, Boolean updateEpisodes) in /usr/src/app/source/Shoko.Server/Repositories/Cached/VideoLocalRepository.cs:line 227 at Shoko.Server.Commands.CommandRequest_HashFile.ProcessFile_LocalInfo() in /usr/src/app/source/Shoko.Server/Commands/Import/CommandRequest_HashFile.cs:line 476 at Shoko.Server.Commands.CommandRequest_HashFile.ProcessCommand() in /usr/src/app/source/Shoko.Server/Commands/Import/CommandRequest_HashFile.cs:line 66
2021-01-12T19:06:56.0898786Z
While calculating group filters, an AnimeSeries without a group was found: Evil or Live
2021-01-12T19:07:05.844586Z
While calculating group filters, an AnimeSeries without a group was found: Evil or Live
2021-01-12T19:07:08.2631906Z
New file detected: /ShokoDev/ShokoDrop/Ajin/[Prof] S01E01 - A Topic That Has Nothing to Do with Us.mkv: Created

Could not MOVE Error Example

2021-01-12T19:15:17.8426299Z
Could not MOVE file: /ShokoDev/ShokoDrop/EvilOrLive/[HorribleSubs] Evil or Live - 05 [720p].mkv -- System.ArgumentException: Series cannot be found for file at Shoko.Server.Renamer.GroupAwareRenamer.GetDestination(MoveEventArgs args) in /usr/src/app/source/Shoko.Server/Renamer/GroupAwareSortingRenamer.cs:line 23 at Shoko.Server.RenameFileHelper.GetDestination(SVR_VideoLocal_Place place, String scriptName) in /usr/src/app/source/Shoko.Server/Renamer/RenameFileHelper.cs:line 84 at Shoko.Server.Models.SVR_VideoLocal_Place.MoveFileIfRequired(Boolean deleteEmpty) in /usr/src/app/source/Shoko.Server/Models/SVR_VideoLocal_Place.cs:line 875

Note that I'm running the default setup, which Desktop claims is using the Legacy renamer, though the error suggests it's using the group aware renamer. If I delete files stuck in the drop folder, run import and remove missing files, then switch to the group aware renamer and copy data, it will fix the issue (and seems to get about the same result after manually renaming files moved).

@da3dsoul
Copy link
Member

I appreciate the research. The short version is that SQLite has issues dealing with concurrency, and that's a specifc repeatable case where concurrent transactions are likely.

@jmjf
Copy link
Author

jmjf commented Jan 15, 2021

So, sounds like the guidance for now is one or more of:

  • Pause queues unless actively processing and monitoring them (reduce risk of contention).
  • Move the shoko config directory to a different device (reduce risk of contention).
  • Be prepared to clean up if something loads up the device.

Docs say no new MySQL/MariaDB, so that isn't a possible solution. (The web UI allows it, but testing with a MariaDB container failed to connect, so I'm assuming the docs are right.)

Testing with the shoko config directory moved to the SSD looks like it's avoiding the issue. (At least until something hammers the SSD. :) )

If you want to add fault tolerance, you may find something like Polly (https://github.com/App-vNext/Polly) helpful. While most people use it to deal with network issues, here's an example with SQL Server https://scottdorman.blog/2020/09/13/database-resiliency-with-polly/. Might make it easier to set up a policy that certain types of errors wait and retry while others respond differently. Their GitHub page doesn't mention .NET Core 3.x, but CHANGELOG.md does in the latest update.

If this was NodeJS, I'd be digging into the code to offer a pull request. With .NET, all I can do is throw out ideas.

@Cazzar
Copy link
Member

Cazzar commented Jan 16, 2021

While Polly at a glance might work, it would be just as easy to add a retry mechanism in place.
The other complication with this, is it does seem to be very specific to setups, for example, we have seen it occur specifically when the OS, or DE on Linux scans into Sqlite files for some reason.

As for connecting to MSSQL/MySql(+variants), it should still be possible, as I use MariaDB myself, running in a docker container.

@jmjf
Copy link
Author

jmjf commented Jan 18, 2021

Another workaround option.

Create the shoko config directory on a different device that is low risk for i/o contention.
On the device you want to expose the directory ln -s <path to directory> shoko to create a symlink.

If you want to make it available over a Samba share, https://unix.stackexchange.com/questions/5120/how-do-you-make-samba-follow-symlink-outside-the-shared-path (probably the second set of smb.conf options in the top answer unless you're years out of date).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants