Skip to content

Conversation

@matt-bernhardt
Copy link
Member

@matt-bernhardt matt-bernhardt commented Jul 7, 2022

This does a few things - full details are in the individual commit messages.

  1. It removes a few facets that are no longer desired. The only remaining facet is for content types.
  2. It makes that facet "real", turning the previously-faked URL into one that will actually filter results for that facet value.
  3. A few things get refactored along the way with this work - a controller method gets moved to a facet helper, and the search feedback portion of the UI gets some changes (adding the content types facet, and changing the source of truth for this feedback).
  4. There are more tests for the facet helpers.
  5. A bunch of test cassettes are updated.

Developer

  • All new ENV is documented in README
  • All new ENV has been added to Heroku Pipeline, Staging and Prod
  • ANDI or Wave has been run in accordance to
    our guide and
    all issues introduced by these changes have been resolved or opened as new
    issues (link to those issues in the Pull Request details above)
  • Stakeholder approval has been confirmed (or is not needed)

Code Reviewer

  • The commit message is clear and follows our guidelines
    (not just this pull request message)
  • There are appropriate tests covering any new functionality
  • The documentation has been updated or is unnecessary
  • The changes have been verified
  • New dependencies are appropriate or there were no changes

Requires database migrations?

NO

Includes new or updated dependencies?

NO

@mitlib mitlib temporarily deployed to timdex-ui-pi-rdi-197-co-xqcfx6 July 7, 2022 20:52 Inactive
@coveralls
Copy link

coveralls commented Jul 7, 2022

Pull Request Test Coverage Report for Build 3798605372bb63eee33ded5593030aa8e9dfddab-PR-59

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 16 of 16 (100.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.02%) to 99.394%

Totals Coverage Status
Change from base Build f24783c31538c16dead288ce09e7a9416f97b364: 0.02%
Covered Lines: 328
Relevant Lines: 330

💛 - Coveralls

@matt-bernhardt matt-bernhardt force-pushed the rdi-197-content-type-facet branch from f41e76b to 73acf52 Compare July 7, 2022 21:04
@mitlib mitlib temporarily deployed to timdex-ui-pi-rdi-197-co-xqcfx6 July 7, 2022 21:04 Inactive
@JPrevost JPrevost self-assigned this Jul 8, 2022
facets = response&.data&.search&.to_h&.dig('aggregations')

facets&.map { |k, v| { facet_labels[k] => v } }
response&.data&.search&.to_h&.dig('aggregations')&.map { |k, v| { k => v } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're overcomplicating this data structure with this mapping.

Removing it leaves (I added Source back in locally to confirm how this will work with additional facets in the future as I was worried my perspective was being clouded with just the current contentType facet being returned:

response&.data&.search&.to_h&.dig('aggregations')
{"contentType"=>
  [{"key"=>"dataset", "docCount"=>218},
   {"key"=>"article", "docCount"=>132},
   {"key"=>"working paper", "docCount"=>70},
   {"key"=>"software", "docCount"=>58},
   {"key"=>"text", "docCount"=>36},
   {"key"=>"preprint", "docCount"=>17},
   {"key"=>"journalarticle", "docCount"=>9},
   {"key"=>"conferencepaper", "docCount"=>6},
   {"key"=>"other", "docCount"=>5},
   {"key"=>"image", "docCount"=>3}],
 "source"=>
  [{"key"=>"zenodo", "docCount"=>242},
   {"key"=>"woods hole open access server", "docCount"=>146},
   {"key"=>"abdul latif jameel poverty action lab dataverse", "docCount"=>99},
   {"key"=>"dspace@mit", "docCount"=>70}]}

and the map just surrounds the whole thing in an array that doesn't seem to add value and complicated accessing this data by always needing to grab the first element of the array (which is the only element) and then working with the structure that results:

response&.data&.search&.to_h&.dig('aggregations')&.map { |k, v| { k => v } }
[{"contentType"=>
   [{"key"=>"dataset", "docCount"=>218},
    {"key"=>"article", "docCount"=>132},
    {"key"=>"working paper", "docCount"=>70},
    {"key"=>"software", "docCount"=>58},
    {"key"=>"text", "docCount"=>36},
    {"key"=>"preprint", "docCount"=>17},
    {"key"=>"journalarticle", "docCount"=>9},
    {"key"=>"conferencepaper", "docCount"=>6},
    {"key"=>"other", "docCount"=>5},
    {"key"=>"image", "docCount"=>3}]},
 {"source"=>
   [{"key"=>"zenodo", "docCount"=>242},
    {"key"=>"woods hole open access server", "docCount"=>146},
    {"key"=>"abdul latif jameel poverty action lab dataverse", "docCount"=>99},
    {"key"=>"dspace@mit", "docCount"=>70}]}]

Before I continue with this review, I'd like to either discuss this to better understand what I'm missing, or see this move to the simpler data structure without the map as I think it'll require a bunch of changes and reviewing it all at once will be more efficient after the change (if we choose to make this change I'm suggesting).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just read the commit notes more closely and see Because the hash is moving to a helper, the extract_facets method on the controller gets simpler. The last .map call is still needed, because removing it messes with the returned data in a way that would force the view partial to change even more significantly. Apologies for not reading that more closely before my comment.

I still think I'd like to better understand the reason we want to avoid this change now though. I'll see if I can understand the reluctance by looking at the code, but may want to chat a bit about this to get your perspective on a zoom in a bit if you are around.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your help with this - the PR as it now stands avoids this issue.

@matt-bernhardt matt-bernhardt requested review from JPrevost and removed request for JPrevost and jazairi July 8, 2022 18:17
@matt-bernhardt matt-bernhardt marked this pull request as draft July 8, 2022 18:17
** Why are these changes being introduced:

* We need to make the facets in the left sidebar into something usable.
* Because of inconsistencies in the imported records, certain facets
  need to be removed - leaving only those that we feel comfortable
  providing.

** Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/rdi-197

** How does this address that need:

A number of things are happening here:

* A $contentType query variable has been added to the TimdexSearch
  model, which maps to the contentTypeFacet variable in the API. This
  requires changes to the enhancer and query_builder models to support
  this new value. This variable has been added to the search feedback
  in the results.html.erb view template.

* We are removing a number of facets that are no longer desired in the
  UI. Removing them from the aggregations of the TimdexSearch model
  requires regenerating a large number of cassettes - I've kept those
  changes to a second commit in this PR for hopefully easier review.

* The facet_labels method on the search controller is moving to a new
  facet helper, named nice_labels. This is not so much a method as a
  hash which can be used to look up human-readable names for facet
  categories. Note the invocation in the _facet partial on line 4 to see
  what I mean here.

* Because the hash is moving to a helper, the extract_facets method on
  the controller gets simpler. Thanks to Jeremy for working through the
  best way to achieve this.

* The facet helper also includes two methods for manipulating the query
  passed to results_path in the view partial. add_facet and remove_facet
  should be named pretty clearly, and take the expected parameters.

* There is a new set of unit tests for all of the facet helpers.

* Facets can be removed by clicking on a link below that facet. Active
  facets are called out in black, which hopefully will help identify
  them to the user. This should still be evaluated by UX when they join
  the project, obviously.

** Document any side effects to this change:

* This changes the search feedback in the results.html.erb to work from
  the @enhanced_query instance variable rather than the raw params
  object. This should better reflect what the search actually was,
  giving space for the user to be informed of any manipulations by the
  application, rather than simply working from whatever the user had
  requested.
** Why are these changes being introduced:

* Because the TimdexSearch model has been updated (to remove a number of
  aggregations / facets which are not desired), a large number of our
  cassettes need to be udpated.

** Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/rdi-197

** How does this address that need:

* This updates the cassettes which need to be changed. As usual, the
  timdex_error cassette is only updated for the request - the response
  is retained as a 500 error.

** Document any side effects to this change:

* None
@matt-bernhardt matt-bernhardt force-pushed the rdi-197-content-type-facet branch from 73acf52 to d28cf9b Compare July 8, 2022 19:44
@matt-bernhardt matt-bernhardt marked this pull request as ready for review July 8, 2022 19:44
@mitlib mitlib temporarily deployed to timdex-ui-pi-rdi-197-co-xqcfx6 July 8, 2022 19:44 Inactive
Copy link
Member

@JPrevost JPrevost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made two comments about adding and removing facets that allow multiple values will (I think) not work with this logic. As we are only displaying facets where single values make sense, I think we should go with this as written and address the more complex case later when we are more confident we actually care about displaying the more complex facets.

:shipit:

facets = response&.data&.search&.to_h&.dig('aggregations')

facets&.map { |k, v| { facet_labels[k] => v } }
response&.data&.search&.to_h&.dig('aggregations')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fits my brain so much better. Thanks so much for talking that through with me so we both understood a bit better what was going on and why here :)

def add_facet(query, facet, term)
new_query = query.clone
new_query[:page] = 1
new_query[facet.to_sym] = term
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is okay with the facets we currently are displaying because they are all "only one makes sense" kinda facets. We may need to revisit this in the future to ensure this doesn't remove facets that are applied during a drill down application (i.e. search > apply "foo" from group A > apply "bar" from group A... is the result just "bar" applied or are both being preserved in the query with this code?

Even if this only works for our currently displayed types of facets, I'm okay with coming back to this later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this work is predicated on only one value being applicable from a given vocabulary - so I'm not pushing values onto an array, or purging only one value, etc.

This work does support applying facets from multiple vocabularies, which I think you've noted somewhere in these conversations, but more work will be needed if we get into dealing with arrays of values.

def remove_facet(query, facet)
new_query = query.clone
new_query[:page] = 1
new_query.delete facet.to_sym
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to my note above. I think this works with contentType really well as it's a one value facet, but with facets in which multiple tiers of filtering can happen (such as subjects) this probably won't hold up. I'm okay with dealing with this later, but wanted to share the concern so we aren't surprised later if this needs to get more complex.

@JPrevost JPrevost merged commit fff30c6 into main Jul 11, 2022
@JPrevost JPrevost deleted the rdi-197-content-type-facet branch July 11, 2022 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants