This repository has been archived by the owner on Feb 22, 2023. It is now read-only.

[Feature] Add audio fields to audio search response #212

Closed
1 task
krysal opened this issue Sep 17, 2021 · 6 comments · Fixed by #291
Comments

@krysal
Member

krysal commented Sep 17, 2021

Problem

Currently, the audio search endpoint (http://localhost:8000/v1/audio?q=<search_term>) returns a list of audio results with the same generic fields as images plus the audio links. However, the duration and category are missing, and both are required to show the track component with complete info in the audio search list.

{
  "id": "02ea0f38-5426-452d-8b9a-49fa00a05df6",
  "title": "I Love U",
  "foreign_landing_url": "https://www.jamendo.com/track/14428",
  "creator": "Dance2003",
  "creator_url": "https://www.jamendo.com/artist/1799/dance2003",
  "url": "https://mp3d.jamendo.com/download/track/14428/mp32",
  "license": "by-nc-sa",
  "license_version": "2.5",
  "license_url": "https://creativecommons.org/licenses/by-nc-sa/2.5/",
  "provider": "jamendo",
  "source": "jamendo",
  "tags": [
    {
      "name": "instrumental"
    }
  ],
  "fields_matched": [
    "title"
  ],
  "thumbnail": "http://localhost:8000/v1/audio/02ea0f38-5426-452d-8b9a-49fa00a05df6/thumb/",
  "waveform": "http://localhost:8000/v1/audio/02ea0f38-5426-452d-8b9a-49fa00a05df6/waveform/",
  "detail_url": "http://localhost:8000/v1/audio/02ea0f38-5426-452d-8b9a-49fa00a05df6/",
  "related_url": "http://localhost:8000/v1/audio/02ea0f38-5426-452d-8b9a-49fa00a05df6/related/"
}

Description

Add the duration and category to each audio object.

Implementation

  • 🙋 I would be interested in implementing this feature.
@krysal krysal added 🟧 priority: high Stalls work on the project or its dependents ✨ goal: improvement Improvement to an existing user-facing feature 💻 aspect: code Concerns the software code in the repository labels Sep 17, 2021
@dhruvkb dhruvkb added this to Backlog in Openverse Sep 17, 2021
@sarayourfriend sarayourfriend self-assigned this Sep 20, 2021
@sarayourfriend
Contributor

@krysal just to make sure I'm understanding this ticket right, is it still true that the audio doesn't return the duration? It looks like the AudioSerializer has a duration field, am I looking in the wrong spot?

@krysal
Member Author

krysal commented Sep 20, 2021

@sarayourfriend yes, you can check it yourself by running the project (docker-compose up, then ./load_sample_data.sh to load the sample data if you haven't already) and making a request, for example:

curl http://localhost:8000/v1/audio?q=rock

You will see that no audio in the results array has the duration or category fields. Something is missing, maybe some Elasticsearch mapping; I have to check again to find the root cause.
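
For anyone who wants to script that check rather than read the curl output by eye, here is a minimal sketch. It assumes the response body is JSON with a results array, as described above; the helper names missing_fields and check_search are made up for illustration and are not part of the API code.

```python
import json
from urllib.request import urlopen

# Fields this issue reports as absent from the search response.
REQUIRED_FIELDS = ("duration", "category")


def missing_fields(result, required=REQUIRED_FIELDS):
    """Return the required field names that are absent from one search result."""
    return [name for name in required if name not in result]


def check_search(url="http://localhost:8000/v1/audio?q=rock"):
    """Fetch the search endpoint and map each result id to its missing fields.

    Assumes the local stack is running and that the JSON body has a
    "results" list whose items each carry an "id".
    """
    with urlopen(url) as response:
        payload = json.load(response)
    return {item["id"]: missing_fields(item) for item in payload.get("results", [])}
```

Calling check_search() against the local stack should list duration and category for every result until the fix lands.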

@sarayourfriend
Contributor

sarayourfriend commented Sep 20, 2021

Okay! Thanks for the instructions. I'll start taking a look 🙂

Update: It looks like it's actually missing more than just duration and category. attribution, for example, is also declared as a field on the base MediaSerializer, but it isn't present in the response body.

@krysal
Member Author

krysal commented Sep 20, 2021

You're right. Actually, I only mentioned duration and category because they are what we need right now for the audio results page; these fields are part of the AudioTrack component. But we can also add all the audio-related fields at once.

Thanks for taking a look at this!

@obulat
Contributor

obulat commented Sep 21, 2021

The Elasticsearch mapping does not contain all the fields. I think we've only added the ones that are relevant for searching. However, this means that ES returns only the mapped fields, and if we want other fields as well (e.g. duration), we would need to query the database at request time.

There are probably two solutions:

  1. Add all audio fields to the ES mapping so that all fields are returned in ES search results.
  2. After receiving ES search results, call the DB to get the missing fields.

I wonder which would be better for performance? My guess is the first option, but I am not sure.
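
To make the second option concrete, here is a rough sketch of the merge step, not the project's actual code. It assumes the ES hits and the database rows are plain dicts sharing an id key; the function name merge_missing_fields and the sample values are hypothetical.

```python
def merge_missing_fields(hits, db_rows, fields=("duration", "category")):
    """Copy the given fields from database rows onto matching ES hits.

    Both inputs are lists of dicts keyed by "id" (an assumption made for
    this sketch). Hits without a matching row are returned unchanged.
    """
    rows_by_id = {row["id"]: row for row in db_rows}
    merged = []
    for hit in hits:
        row = rows_by_id.get(hit["id"], {})
        extra = {name: row[name] for name in fields if name in row}
        merged.append({**hit, **extra})
    return merged


# Hypothetical data: one hit from ES, one row from the database.
hits = [{"id": "02ea0f38", "title": "I Love U"}]
db_rows = [{"id": "02ea0f38", "duration": 1000, "category": "music"}]
enriched = merge_missing_fields(hits, db_rows)
```

The cost of this approach is one extra database query per search request, which is the performance trade-off raised above.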

@sarayourfriend
Contributor

It seems like for search we should prioritize read speed. The second option would be better for writes, but write times don't matter much as far as I understand: this isn't a "dynamic" application that responds to user input (like searching for new users or products that users are adding to a database, where the search needs to be up to date with the latest information).

What's the process for adding and testing new ES mappings?

@dhruvkb dhruvkb moved this from Backlog to In progress in Openverse Oct 11, 2021
Openverse automation moved this from In progress to Done! Oct 13, 2021
3 participants