genre search functionality #559

Merged 9 commits on Sep 2, 2018

Conversation

whatdoineed2do
Contributor

@whatdoineed2do whatdoineed2do commented Jul 29, 2018

Add endpoints to:

  • list all genres known to server
  • search by genres, returning all albums matching a genre

modeled heavily on existing album code.

Functionality would be coupled with update to UI that would allow users to queue music by a given genre, such as classical. Currently, users need to locate and queue individual songs/artists/albums.

The web UI updates are not included here; issue #558 is logged for them. Web UI updates (generated/drop-in htdocs files) are available on a fork:
https://github.com/whatdoineed2do/forked-daapd-web/files/2258629/dist-0cd890b3128c50510c776d455adb0c61a9b3b376.tar.gz
dist-search-genre-98ce8de37dfda03a766e5745e5e1219be764545b.tar.gz

EDIT: 2018-08-24

(Please use in conjunction with this build to see the intended use cases; the web UI code is NOT included in this PR.)

Once the server functionality is merged, I will submit a PR to https://github.com/chme/forked-daapd-web.

comments / feedback? @chme , @ejurgensen
Thanks

@whatdoineed2do
Contributor Author

commit bb26ca3 is required because genres are outside the strict control of the db (not a generated value). The genre name can therefore be anything (ultimately it generates a search/match in the db), so the json regex needs to handle that.

@chme
Collaborator

chme commented Aug 12, 2018

I did a short test with this branch in combination with chme/forked-daapd-web#6
(I only had a brief look at the code changes)

Here is a truncated response to "api/library/genres" from my test system:

{
  "items": [
    {
      "name": "Alternative",
      "album_count": 1,
      "track_count": 1,
      "length_ms": 230423
    },
    ...
    {
      "name": "Rock/Pop",
      "album_count": 1,
      "track_count": 1,
      "length_ms": 257105
    },
    ...
    {
      "name": "'90s Alternative",
      "album_count": 1,
      "track_count": 1,
      "length_ms": 257488
    },
    ....
  ],
  "total": 28,
  "offset": 0,
  "limit": -1
}

The web interface does not properly escape the genre name, and therefore only the first one, "Alternative", works.

The second one simply returns zero items. The web interface URI is /#/music/genre/Singer/Songwriter. My guess is that the vue component to render this URI cannot be resolved.

And the last one results in an sql error:

[DEBUG]       db: Running query 'SELECT COUNT(DISTINCT f.songalbumid) FROM files f WHERE f.disabled = 0 AND (f.genre= ''90s Alternative');'
[  LOG]       db: Could not prepare statement: unrecognized token: "90s"
[  LOG]       db: Could not create query, unknown type 5

Did you consider handling genres similarly to artists/albums by generating a unique id? Maybe the "groups" table could be used with a new "type" for genres to get a persistent id for a genre. Of course this would be a lot more work ...

query_params.filter = db_mprintf("(f.genre= '%s')", genre);

if (media_kind)
query_params.filter = db_mprintf("(f.media_kind = %d)", media_kind);
Collaborator

The media_kind filter replaces the filter to fetch a genre (and results in a memory leak). The filters should instead be combined to only fetch the albums of a genre that match the given media_kind.
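
A minimal sketch of combining the two filters instead of overwriting one with the other. `filter_and` is a hypothetical helper, not an existing forked-daapd function, and it uses plain malloc/snprintf rather than db_mprintf so that it stands alone:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not a forked-daapd function: returns a newly
 * allocated "(base AND extra)" filter, or a copy of base when extra is
 * NULL. The caller frees the result; NULL means allocation failed.
 */
char *
filter_and(const char *base, const char *extra)
{
  char *out;
  size_t size;

  assert(base != NULL);

  if (!extra)
    {
      /* Nothing to join, hand back a private copy of the base filter */
      size = strlen(base) + 1;
      out = malloc(size);
      if (out)
        memcpy(out, base, size);
      return out;
    }

  /* '(' + base + " AND " + extra + ')' + NUL terminator */
  size = strlen(base) + strlen(extra) + 8;
  out = malloc(size);
  if (out)
    snprintf(out, size, "(%s AND %s)", base, extra);

  return out;
}
```

In the reviewed code, the idea would be to build the genre filter first, pass it to such a helper together with the media_kind condition when media_kind is set, and free the original string afterwards so that nothing leaks.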


query_params.type = Q_GROUP_ALBUMS;
query_params.sort = S_ALBUM;
query_params.filter = db_mprintf("(f.genre= '%s')", genre);
Collaborator

This needs to be '%q'. Should solve the SQL injection problem with genres that contain '.
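
For illustration, SQLite's %q conversion doubles every single quote in the substituted string, which is what keeps '90s Alternative inside the string literal. The sketch below mimics that in plain C; `sql_quote_double` is a hypothetical name, and real code would simply use %q (assuming db_mprintf forwards its format to SQLite's printf routines):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustration only: mimics what SQLite's %q does to a string argument,
 * i.e. every ' becomes ''. The caller frees the result; NULL means
 * allocation failed.
 */
char *
sql_quote_double(const char *in)
{
  const char *p;
  char *out;
  char *q;

  assert(in != NULL);

  /* Worst case: every character is a quote and gets doubled */
  out = malloc(2 * strlen(in) + 1);
  if (!out)
    return NULL;

  q = out;
  for (p = in; *p; p++)
    {
      if (*p == '\'')
        *q++ = '\'';
      *q++ = *p;
    }
  *q = '\0';

  return out;
}
```

With this, '90s Alternative becomes ''90s Alternative, so the filter reads (f.genre = '''90s Alternative') and parses as a single string literal.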

@whatdoineed2do
Contributor Author

whatdoineed2do commented Aug 12, 2018

Appreciate the review.

I created a dummy file with the genre '90s alt and this works for me locally (bb26ca3 fixed the spaces in names) but we have more of a problem with the genre named Rock/Pop since we can't html escape /.

Looks like I'll have to refactor api/library/genres/<genre_name> to api/library/genres?q="<genre_name>" to handle arbitrary names and solve this. I'll try to get this done on the commute to work over the next week if I can't complete it today.

EDIT:
Doing this via strings is a pain on the parsing .. looks like we'll have to do this with numeric ids like the albums etc. This'll take me some time to dig through the db table / generation stuff

@ejurgensen
Member

I think you should strive to do this with the existing data model and queries. As far as I understand, what you want to do is return all genres + albums for a particular genre. These are queries that are already supported and implemented for daap (possibly also for rsp), even with optimised indices, as I recall. So try to use that framework.

There are many ways to group albums and tracks (year, composer, type etc.), and I don't like the idea of having id's for each of these.

@whatdoineed2do
Contributor Author

whatdoineed2do commented Aug 13, 2018

Fixes for the issues raised with genre names containing spaces, /, ' etc. These use the existing data model and query params to accept URI-encoded and escaped names, e.g.:

api/library/genre?name=Pop
api/library/genre?name=Rock%2FPop            # Rock/Pop
api/library/genre?name=Drum+%26+Bass     # Drum & Bass
api/library/genre?name=%2790s+Alternative  # '90s Alternative
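
A client has to produce these encoded values itself. A rough sketch of one way to do that follows; `query_value_encode` is a hypothetical helper (not part of forked-daapd or the web UI) that leaves unreserved characters alone, turns a space into + and percent-encodes everything else:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical client-side helper: encodes a genre name for use as a
 * query parameter value. The caller frees the result; NULL means
 * allocation failed.
 */
char *
query_value_encode(const char *in)
{
  static const char hex[] = "0123456789ABCDEF";
  const unsigned char *p;
  char *out;
  char *q;

  assert(in != NULL);

  /* Worst case: every byte expands to three characters (%XX) */
  out = malloc(3 * strlen(in) + 1);
  if (!out)
    return NULL;

  q = out;
  for (p = (const unsigned char *)in; *p; p++)
    {
      if (isalnum(*p) || *p == '-' || *p == '_' || *p == '.' || *p == '~')
        *q++ = (char)*p;
      else if (*p == ' ')
        *q++ = '+';
      else
        {
          *q++ = '%';
          *q++ = hex[*p >> 4];
          *q++ = hex[*p & 0x0F];
        }
    }
  *q = '\0';

  return out;
}
```

Called with Rock/Pop, Drum & Bass and '90s Alternative, it produces the encoded values shown in the examples above.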

WebUI updated, build here:

dist-search-genre-98ce8de37dfda03a766e5745e5e1219be764545b.tar.gz

@whatdoineed2do
Contributor Author

Previous issues fixed, commits squashed.

Please review, thanks

Collaborator

@chme chme left a comment

If it is not possible to create REST-like endpoints for genres, I would prefer to drop the library/genre?q= endpoint.
In order to retrieve the albums for a genre, it should already be possible to use the search endpoint (with a smart pl query): api/search?type=album&expression=genre+is+"Alternative".

Another possibility is to extend the library/albums endpoint to support a "genre" query parameter (similar to the media_kind parameter) that allows filtering by genre.

I think, @ejurgensen had the Q_BROWSE_GENRES query type in mind. Q_BROWSE_GENRES can be used to get a list of genres. This query type only returns the genre name and no information how many artists/albums are matching a genre. Not sure how important the album and artist count information is ...

src/logger.c Outdated
@@ -321,7 +321,7 @@ logger_init(char *file, char *domains, int severity)
return -1;
}

-ret = fchown(fileno(logfile), runas_uid, 0);
+ret = fchown(fileno(logfile), runas_uid, -1);
Collaborator

What is the effect of this change? This should at least be a separate commit and better a separate pull request.

@whatdoineed2do
Contributor Author

I've been a bit dumb here - I opted for /api/library/genre?name=... to overcome the issues with having to escape the genre names having spaces/ampersands/fwd slashes etc.

BUT we can go with /api/library/genres/<uri encoded/escaped genre name>

With escaped genre names we can send across "'90s Alternative", "Rock/Pop" and "Drum & Bass" with no issues, so it solves the bug you found last week (my library consists of simple single-word genres: Pop, Rock, Classical...)

Does this end point work for you?

On this laptop, using the api/search?type=album&expression=genre+is+"Alternative caused the process to lock up.

@ejurgensen
Member

I agree with @chme that it would be good to use the search api for this - i.e. not add a specific endpoint. If it doesn't work then it should be fixed, of course. I also agree with him about the logger.c change. The same goes for admin.html.

@chme is also correct that I think the BROWSE query should be used. Genre is not a group in the data model, and the query with a join on the group table doesn't make sense to me. I suggest first doing the endpoint/UI with existing metadata, and then afterwards expand the BROWSE result if more metadata is required for a better UX.

@chme
Collaborator

chme commented Aug 19, 2018

@whatdoineed2do In your comment the closing " is missing after Alternative. The uri should look like: api/search?type=album&expression=genre+is+"Alternative"

I can get forked-daapd to crash, if i am missing the leading " (crashes somewhere in the antlr3 code). With a missing trailing ", i get a "Bad request" response (correct behavior).

@whatdoineed2do
Contributor Author

It appears my local master has local commits (admin.html / logger.c) that were never meant to see the light of day and that have crept into this feature branch. I'll revert and sync with upstream.

Regarding the genre functionality. One Q:

are we recommending that we don't introduce a /api/library/genres endpoint to provide a list of known genres (will remove the Q_GROUP_GENRE)? What options do we currently have?

Adding the genre search to /api/library/albums/genres looks out of key with the other part of that interface (under album the i/f looks like its all about specific albums and searches within those specific albums). In looking for genres, I think we'd only expect to do this based on albums.

Adding an equivalent just give me a list of X (ie albums/artists/tracks/genres) seems to be most in line with current json i/f but I'll go with your views (after all you will review/merge/maintain). All of these listings seem to have a corresponding /api/library/x endpoint

Implementing the get list of albums belonging to genre into the search is good .. it works already (except for the bug I raised regarding missing / single quotes) so less code to maintain.

@ejurgensen
Member

/api/library/genres is fine, but as far as I can tell you can use Q_BROWSE_GENRE to query the db, you don't need Q_GROUP_GENRE. You can also later add a /api/library/composers and then use Q_BROWSE_COMPOSERS ... and so on.

I didn't quite understand the two paragraphs after your question, but yes for searching albums belonging to genre it seems good to use the search interface.

@whatdoineed2do
Contributor Author

thanks. Yes, just pushed the most recent fixes to this branch (wasn't 100% ready for review) that now reflects the discussion above:

  • /api/library/genres
  • search is via existing Q_BROWSE_GROUPS (not using GROUPS)

Need to fix the webui but need to over issues raised in #570

@whatdoineed2do
Contributor Author

I think this is clean for review now - appreciate the patience

Member

@ejurgensen ejurgensen left a comment

I notice there are some style things you should fix

genre_to_json(const char* g_, const struct filecount_info* fci_)
{
json_object *item = NULL;
if (g_ == NULL) {
Member

Please indent like in the rest of forked-daapd

@@ -232,6 +232,23 @@ playlist_to_json(struct db_playlist_info *dbpli)
return item;
}

static json_object *
genre_to_json(const char* g_, const struct filecount_info* fci_)
Member

why the underscores? "genre" and "fci" would be good names

static json_object *
genre_to_json(const char* g_, const struct filecount_info* fci_)
{
json_object *item = NULL;
Member

this assignment doesn't seem to do anything

@@ -330,6 +347,7 @@ fetch_artist(const char *artist_id)
return artist;
}


Member

?

json_object *item;
int ret = 0;
char* bi;
char* si;
Member

why two spaces before bi and si?
naming of bi and si could be better
assignment of ret does nothing


while (((ret = db_query_fetch_string_sort(query_params, &bi, &si)) == 0) && (bi))
{
struct filecount_info fci;
Member

no mid function declarations in forked-daapd

json_object *reply;
json_object *items;
int total;
int ret = 0;
Member

assignment with no effect

@whatdoineed2do
Contributor Author

whatdoineed2do commented Aug 24, 2018

styling fixed. I still need to squash the commits - we need to agree that the front end functionality is good for you too (a tar file/drop in replacement for the htdocs was in the last comment/updated in the OP) in case we need to extend backend further.


urght.. hold off - the search bar needs a fix too

@whatdoineed2do
Contributor Author

The last bug I mention above is when we use the webui's search page looking for value that doesn't match any genre. The query from the ui arrives at the server:
api/search?type=track,artist,album,genre&query=XXX&media_kind=music&limit=3&offset=0

If XXX exists then this is fine. However, if XXX is not a known genre then, because of using the Q_BROWSE_GENRE, db_query_start() drops us into

      default:
        if (qp->type & Q_F_BROWSE)
          query = db_build_query_browse(qp);

but this returns NULL from db_build_query_check() .. db_get_one_int() because there's no matching rows.

Q - does this mean that the BROWSE searches ALWAYS expect a result?

The web ui's listing of genre uses api/search?type=albums&expression=genre+is+\"Pop\" so it's dropping into the db_build_query_group_albums()

Member

@ejurgensen ejurgensen left a comment

Took another look, I think these are the final comments from my side. When fixed, and @chme also has checked, I think it is ready to be merged.

@@ -234,12 +234,32 @@ playlist_to_json(struct db_playlist_info *dbpli)
return item;
}

static json_object *
genre_to_json(const char* genre, const struct filecount_info* fci)
Member

inconsistent asterisk placement

json_object *item;
int ret;
char* genre;
char* sort_item;
Member

asterisk

qp.type = Q_COUNT_ITEMS;
qp.idx_type = I_NONE;
qp.filter = db_mprintf("(f.genre = '%q')", genre);
db_query_start(&qp);
Member

db_filecount_get starts the query, so you don't need to do it here

db_query_start(&qp);
db_filecount_get(&fci, &qp);
free(qp.filter);
db_query_end(&qp);
Member

also not required

memset(&qp, 0, sizeof(qp));

qp.type = Q_COUNT_ITEMS;
qp.idx_type = I_NONE;
Member

is memset of qp and setting consts required inside the loop?

Contributor Author

all fixed - memset(&qp...) was a left over of prev review where struct was local to loop

qp.idx_type = I_NONE;
qp.filter = db_mprintf("(f.genre = '%q')", genre);
db_query_start(&qp);
db_filecount_get(&fci, &qp);
Member

check return value?

Contributor Author

I was in two minds. If it fails and we ignore the return code, the fci is already zeroed out, so we could continue and tell the client that there are 0 albums/artists/tracks for that genre (even though the info is not correct due to an internal library failure); at least the client would know that the system knows about the genre.

I added a check for ret, so now if there's a problem getting info about a genre, we don't present any of its info back to the client.

@ejurgensen
Member

Thanks for the fixing up, the pr looks fine to me now. As mentioned @chme will also have to check it before merging.

BTW forgot to answer your question: The browse query should work with zero results as well, and if it doesn't it is a bug. I checked the code and it looks ok, db_get_one_int() should get a row because it is a count query (so there should be a row with a "0"). Can I ask you to check again?

@whatdoineed2do
Contributor Author

whatdoineed2do commented Aug 27, 2018

The json queries for a found and a not-found genre are identical. On a deeper look, we see that the GROUP BY f.genre (when no matching genre is found) causes the db handling to be off.

At the sql prompt both queries run happily, but note the result of the first query: when there's no match, no rows are returned.

sqlite> SELECT COUNT(*) FROM files f WHERE f.disabled = 0 AND (f.genre LIKE '%xxx%') AND f.genre != '' GROUP BY f.genre;
sqlite> SELECT COUNT(*) FROM files f WHERE f.disabled = 0 AND (f.genre LIKE '%Drum%') AND f.genre != '' GROUP BY f.genre;
1

Without the GROUP BY, we get 0 for zero results as expected, but we need the GROUP BY clause.

sqlite> SELECT COUNT(*) FROM files f WHERE f.disabled = 0 AND (f.genre LIKE '%xxx%') AND f.genre != '';
0
sqlite> 

Proposed Fix
The fix would be here in db_get_one_int():

diff --git a/src/db.c b/src/db.c
index 267d8b73..68e900c8 100644
--- a/src/db.c
+++ b/src/db.c
@@ -1332,12 +1332,18 @@ db_get_one_int(const char *query)
   if (ret != SQLITE_ROW)
     {
       if (ret == SQLITE_DONE)
-       DPRINTF(E_INFO, L_DB, "No matching row found for query: %s\n", query);
+      {
+       DPRINTF(E_DBG, L_DB, "No matching row found for query: %s\n", query);
+        ret = 0;
+      }
       else
+      {
        DPRINTF(E_LOG, L_DB, "Could not step: %s (%s)\n", sqlite3_errmsg(hdl), query);
+        ret = 1;
+      }
 
       sqlite3_finalize(stmt);
-      return -1;
+      return ret;
     }
 
   ret = sqlite3_column_int(stmt, 0);

Let me know if this fix is something you are comfortable with and I'll add it to the PR.

@ejurgensen
Member

ejurgensen commented Aug 27, 2018

Oh sorry, I completely missed that you were using BROWSE for searching; I somehow thought it was just for listing. Let me get back to you...

**Example**

```
curl -X GET "http://localhost:3689/api/search?type=albums&expression=genre+is+\"Pop\"
```
Member

The ending quotation mark is missing in these examples

@ejurgensen
Member

In your original post you said the purpose of the pr was:

"Add endpoints to:

list all genres known to server
search by genres, returning all albums matching a genre"

But the pr now also extends the search api to return a json genre object, which contains matching genre names plus album, artist and track count. I missed that part in the above conversation, and also when reviewing, I'll admit. I also missed what the purpose of this search is, so I hope you can reiterate - is it to support some sort of future type-ahead search? You wrote above that the web ui only uses type=album, so I am unsure what type=genre is for?

@whatdoineed2do
Contributor Author

whatdoineed2do commented Aug 28, 2018

The work here was to support the following (in order of priority):

  • (MUST HAVE) ability for the user to:
    .. see the list of genres for all files on the server, and then play or add the albums directly via the returned genre name (as when you search for an album)
    .. deep dive into a returned genre listing (from above) to list all albums for a given genre
    In the web UI (as implemented) this functionality is available from music->tabs, which has been updated to be queue | artists | albums | genres
    For the use case above, this functionality is implemented in the backend with api/library/genres and api/search?type=album&expression=genre is xxx
    [screenshot: genre-list0]

  • (USEFUL) ability to search for genres by name (similar to looking for albums or tracks by name) on the search screen and have the matching genre names listed; clicking on a genre loads the list of albums for that genre, as in the MUST HAVE above.
    The web UI is implemented to search for a value across album/playlist/artist/genre, and this functionality is handled by the code:
    2940 if (strstr(param_type, "genre")) ... and via the entry api/search?type=genre...
    This only returns the genre name/total counts for the limited display on the search results screen.
    You may wonder why have a search for genre names? As genre metadata is simply a string, we can have a varied and long list of genres, and a user may want to search for genres like "Pop" and get a returned set of "CantoPop", "JPop", "KPop", "MandrinPop", "Pop" on the search screen, which they can quickly add to the play queue as opposed to scrolling through the entire list of genres (via music->tabs->genre). The updated web UI search page will send api/search?type=track,artist,album,playlist,genre&query=XXX&media_kind=music&limit=3&offset=0 (including genre as part of the search).
    [screenshot: search0]

The tar file updated in the OP (https://github.com/ejurgensen/forked-daapd/files/2318320/dist-search-genre-7d5becba4a8cee675eadbdbadec194dba3386651.tar.gz) can be dropped in place of the existing htdocs for this tree; it may make it easier to see how these 2 items work.


To further clarify, the #559 issue only applies to the api/search?type=genre... (searching for genres via search box)

db_admin_get() .. admin_get() .. strdup()
fetch_album() .. db_mprintf() .. strdup()
fetch_artist() .. db_mprintf() .. strdup()
Collaborator

@chme chme left a comment

Thanks for fixing these memleaks!

If you could remove the count values from the genre object, that would be great. I would prefer to look at these in a separate pr. (Instead of an inner db query, i think it would be better to extend the browse queries to return the count values).


item = json_object_new_object();
safe_json_add_string(item, "name", genre);
json_object_object_add(item, "album_count", json_object_new_int(fci ? fci->album_count : 0));
Collaborator

I would remove the count values from the genre object for this pr. Extending the genre object with more metadata can be done in follow up pr. Genre would then only contain the "name" property.

memset(&fci, 0, sizeof(fci));

qp.filter = db_mprintf("(f.genre = '%q')", genre);
ret = db_filecount_get(&fci, &qp);
Collaborator

Without the count values in the genre object, this inner db query can be dropped.

safe_json_add_time_from_string(jreply, "started_at", (s = db_admin_get(DB_ADMIN_START_TIME)), true);
free(s);
safe_json_add_time_from_string(jreply, "updated_at", (s = db_admin_get(DB_ADMIN_DB_UPDATE)), true);
free(s);
Collaborator

Nice find!

Contributor Author

@whatdoineed2do whatdoineed2do Aug 31, 2018

thanks to everyone's friend, valgrind

fyi - there's one other leak I saw in httpd.c's httpd_init() .. exitev = event_new(evbase_httpd, exit_efd, EV_READ, exit_cb, NULL); but I'm not familiar enough with libevent to fix this. It leaks 136 bytes IIRC.

@@ -2679,6 +2894,22 @@ jsonapi_reply_search(struct httpd_request *hreq)
if (param_expression)
{
expression = safe_asprintf("\"query\" { %s }", param_expression);

#ifndef ANTLR_PARSER_FIX
Collaborator

This hack only avoids the parser bug in a specific case and will not work in cases where '+' is a valid character. So please drop this hack.

@ejurgensen
Member

I concur with @chme, and also suggest that you move the genre search to another pr. That way we can get the "must have" merged and continue the discussion on the other stuff separately.

@whatdoineed2do
Contributor Author

whatdoineed2do commented Aug 31, 2018

Updated as per review

webui PR: chme/forked-daapd-web#6

Collaborator

@chme chme left a comment

Looks good to me

ret = fetch_genres(&query_params, items, NULL);
if (ret < 0)
goto error;
else
Collaborator

Just a minor: the assignment to "total" can be done unconditionally (the if-part jumps to "error").

@ejurgensen ejurgensen merged commit e3ce003 into owntone:master Sep 2, 2018
@ejurgensen
Member

Merged, thanks!
