Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLite: Replace LIKE with SUBSTR for file path matching #4162

Merged
merged 5 commits into from
Aug 7, 2021
Merged

SQLite: Replace LIKE with SUBSTR for file path matching #4162

merged 5 commits into from
Aug 7, 2021

Conversation

uklotzde
Copy link
Contributor

This refactoring makes SqlLikeWildcardEscaper obsolete and ensures that file paths in the database are matched case-sensitive.

Follow-up of #4146. It also fixes the incorrect file path matching in TrackDAO. No backport to 2.3.

Note: We don't properly escape SQL LIKE wildcards in other, informal search queries, e.g. for crates. But this is just a minor issue and if SqlLikeWildcardEscaper is ever needed again it could be brought back. Currently it is unused and should disappear.

@coveralls
Copy link

coveralls commented Jul 31, 2021

Pull Request Test Coverage Report for Build 1105672637

  • 3 of 12 (25.0%) changed or added relevant lines in 2 files are covered.
  • 992 unchanged lines in 12 files lost coverage.
  • Overall coverage decreased (-0.04%) to 25.989%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/library/dao/trackdao.cpp 1 10 10.0%
Files with Coverage Reduction New Missed Lines %
src/track/serato/beatgrid.cpp 2 62.5%
src/engine/enginebuffer.cpp 14 86.26%
src/track/beatutils.cpp 17 48.44%
src/util/db/dbconnection.cpp 18 69.78%
src/controllers/scripting/controllerscriptenginebase.cpp 31 39.42%
src/util/cmdlineargs.cpp 43 72.73%
src/library/basetracktablemodel.cpp 64 21.99%
src/library/dlgtrackinfo.cpp 84 0%
src/engine/controls/bpmcontrol.cpp 95 63.44%
src/coreservices.cpp 115 0%
Totals Coverage Status
Change from base Build 1083902170: -0.04%
Covered Lines: 19997
Relevant Lines: 76944

💛 - Coveralls

@Be-ing
Copy link
Contributor

Be-ing commented Jul 31, 2021

Are you sure this does not affect 2.3?

@uklotzde
Copy link
Contributor Author

Are you sure this does not affect 2.3?

As I mentioned somehow it does, but no one complained so far. I won't fix it, let's move forward.

@uklotzde
Copy link
Contributor Author

uklotzde commented Jul 31, 2021

I don't want to apply this to 2.3, too risky. We are even installing a custom sqliteLike() function for handling UTF-16. Not sure how SUBSTR behaves compared to LIKE or if it matters. Edit: The sqliteLike() might still be needed for i18n-aware filtering, not sure. Strings are in fact stored as UTF-8 by SQLite.

We cannot fix all the issues for each and every version. Hopefully we can get rid of the legacy database code and Qt UTF-16 strings someday.

@uklotzde
Copy link
Contributor Author

The database code is completely broken because the LIKE operator is tweaked unconditionally for searching, i.e. by treating 'ä' like 'a'. But we also used it for exact file path prefix matches which obviously doesn't work as intended and could cause false positives.

@uklotzde
Copy link
Contributor Author

The modified test even succeeds for 2.3.

"track_locations.location LIKE :startsWithOldDirectory ESCAPE :escape"));
query.bindValue(":startsWithOldDirectory", startsWithOldDirectory);
query.bindValue(":escape", kSqlLikeMatchAll);
"SUBSTR(track_locations.location,1,:oldDirectoryPrefixLength)=:oldDirectoryPrefix"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually don't explain the obvious in comments, but in this case it might be good to add a comment that 1 refers to the first character in the string (i.e. it's not 0-based like arrays/lists/whatever in C/C++ are).

Copy link
Member

@Holzhaus Holzhaus Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to implement something like STARTSWITH() would be

INSTR(track_locations.location, :oldDirectoryPrefix) = 1

but I guess that would be less performant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't replicate the same comment at 3 different locations. This can be looked up in the SQLite docs.

Using INSTR is less chatty, we should probably use that!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming of the variables clearly reveals the purpose, so there is no need to explain the 1-based indexing.

Thinking about the performance implications SUBSTR might in fact be faster than INSTR, because you have to know the outer context for optimizing the INSTR operation. I will keep using it in the aoide backend.

Copy link
Member

@Holzhaus Holzhaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@Holzhaus Holzhaus merged commit 7074cf8 into mixxxdj:main Aug 7, 2021
@uklotzde uklotzde deleted the sqlite-substr branch August 7, 2021 07:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants