Feature Request/Idea: Change the language of facets in the search API #8286

Closed
bappun opened this issue Dec 7, 2021 · 6 comments · Fixed by #8435

Comments

bappun commented Dec 7, 2021

Overview of the Feature Request
I would like to be able to get the facets in different languages directly in the API. Maybe by adding a language parameter to the request?

What kind of user is the feature intended for?
(Example users roles: API User, Curator, Depositor, Guest, Superuser, Sysadmin)
API User

What inspired the request?
Integration of Dataverse search API in another website using javascript.

What existing behavior do you want changed?
The translation of facets in the search API.

Any related open or closed issues to this feature request?
#8287

Member

qqmyers commented Feb 17, 2022

@bappun - see the PR for some limits on what I was able to do in one PR/on current funds. It should allow API calls using translated terms to work, though you do need to search against the field (e.g. subject) rather than the special facet field (subject_ss).

pdurbin added this to the 5.11 milestone Mar 29, 2022

Author

bappun commented Apr 11, 2022

@qqmyers

We upgraded our instance to 5.10.1 but we are not able to search for the translated value.

I get no results when I search for topicClassValue:Elections (EN) or topicClassValue:Élections (FR) instead of topicClassValue:Politics.Elections/topicClassValue_ss:Politics.Elections (original value from the TSV).

Is there something we need to do to index the translated properties? We restarted payara and also did a reexportAll at the end of the upgrade.
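For reference, the two query forms compared above can be written out as Search API URLs. This is only a minimal sketch: the host name is hypothetical, the helper name `build_search_url` is made up for illustration, and wrapping the value in double quotes (so that values containing spaces stay one phrase) is an assumption rather than something the thread prescribes.

```python
from urllib.parse import urlencode

# Hypothetical installation URL; replace with your own Dataverse host.
BASE_URL = "https://dataverse.example.edu"


def build_search_url(base_url, field, value):
    """Build a Search API URL for a fielded query such as field:"value"."""
    # Quoting the value is an assumption made here so multi-word terms
    # are treated as a single phrase.
    query = f'{field}:"{value}"'
    return f"{base_url.rstrip('/')}/api/search?" + urlencode({"q": query})


# Original value from the TSV, searched through the facet field.
original_url = build_search_url(BASE_URL, "topicClassValue_ss", "Politics.Elections")

# Translated value, searched against the regular field.
translated_url = build_search_url(BASE_URL, "topicClassValue", "Élections")

print(original_url)
print(translated_url)
```

`urlencode` percent-encodes the colon, the quotes, and the accented character, so the translated term can be passed as an ordinary Python string.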

Member

qqmyers commented Apr 11, 2022

Yes, a reindex would be required. (The 5.10 release includes Additional Release Steps: Solr Upgrade which includes doing a reindex all. Reindexing wasn't listed as a separate step due to that.)
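A full reindex can be started from a script. The sketch below assumes the admin indexing endpoint is exposed at `/api/admin/index` on `localhost:8080`, which is the usual default but may be blocked or relocated on your installation. The function names are illustrative only.

```python
from urllib.request import Request, urlopen

# Assumed default address of the Dataverse application server.
DEFAULT_SERVER = "http://localhost:8080"


def reindex_all_request(server=DEFAULT_SERVER):
    """Build (but do not send) the request that asks for a full reindex."""
    return Request(f"{server.rstrip('/')}/api/admin/index", method="GET")


def start_reindex(server=DEFAULT_SERVER):
    """Send the request and return the raw response body as text."""
    with urlopen(reindex_all_request(server), timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only contacts the server when run as a script.
    print(start_reindex())
```

Building the request separately from sending it makes it easy to check the target URL before anything is sent to the server.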

Author

bappun commented Apr 11, 2022

Thank you @qqmyers, the reindex fixes the issue.

However, the reindex stops with an error when it processes a dataset with a field that does not allow multiple values (we have not changed the "allow multiple values" setting in our TSVs). Here is the error we got:

2022-04-11 13:30:27.850 ERROR (qtp261748192-18) [   x:collection1] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: ERROR: [doc=dataset_1094_draft] multiple values encountered for non multiValued field samplingProcedure: [Non-probability: Respondent-assisted, Non probabiliste : participation volontaire]
        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:160)
        at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:100)
        at org.apache.solr.update.AddUpdateCommand.lambda$null$0(AddUpdateCommand.java:261)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
        at java.base/java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1632)
        at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
        at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)

Non probabiliste : participation volontaire is the French translation for Non-probability: Respondent-assisted. There is only one value selected for the field, but it seems each translation is counted as a value.

I am just guessing, but could that be possible? The only hot fix I can see is to allow multiple values for our controlled vocabulary fields, update the schema, and reindex. What do you think?
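To find every field affected by this error in a Solr log, the message can be parsed with a regular expression. This is a rough sketch: the pattern is derived only from the single log line quoted above, and the function name `parse_multivalue_error` is made up for illustration.

```python
import re

# Pattern based on the one error line quoted in this thread; other Solr
# versions may word the message differently.
MULTIVALUE_ERROR = re.compile(
    r"\[doc=(?P<doc>[^\]]+)\] multiple values encountered for "
    r"non multiValued field (?P<field>\w+): \[(?P<values>.*)\]"
)


def parse_multivalue_error(log_line):
    """Return (doc_id, field_name, raw_values) or None if the line differs."""
    match = MULTIVALUE_ERROR.search(log_line)
    if match is None:
        return None
    return match.group("doc"), match.group("field"), match.group("values")


sample = (
    "2022-04-11 13:30:27.850 ERROR (qtp261748192-18) [   x:collection1] "
    "o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: "
    "ERROR: [doc=dataset_1094_draft] multiple values encountered for "
    "non multiValued field samplingProcedure: [Non-probability: "
    "Respondent-assisted, Non probabiliste : participation volontaire]"
)

print(parse_multivalue_error(sample))
```

The values are returned as one raw string rather than split on commas, since a controlled vocabulary term may itself contain a comma.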

Member

qqmyers commented Apr 11, 2022

Ergh! We must have only tested with multivalue fields? In any case, I think there's a simpler fix - just change the schema.xml line for any single-valued CVV field to have multiValued="true" - i.e. this line for samplingProcedure. You'll probably have to restart Solr and reindex after that change. This allows multiple values in Solr but doesn't allow the user to add multiple entries in the UI.

Please submit an issue, as this is something where we need to update the default schema for any fields in the standard blocks and the update schema script.
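The schema change described above could be scripted along these lines. This is a sketch under stated assumptions: it assumes each field is declared on a single `<field name="..." ... multiValued="false" .../>` element in schema.xml, the sample line is illustrative rather than copied from a real schema, and `set_multivalued` is a hypothetical helper name. Back up schema.xml before applying anything like this.

```python
import re

# Matches a whole <field .../> element and captures its name attribute.
# <fieldType> and <dynamicField> elements are not matched.
FIELD_TAG = re.compile(r'<field\b[^>]*\bname="([^"]+)"[^>]*>')


def set_multivalued(schema_text, field_names):
    """Return schema_text with multiValued="true" on the named fields."""

    def patch(match):
        tag = match.group(0)
        if match.group(1) not in field_names:
            return tag  # leave every other field untouched
        if "multiValued=" in tag:
            return re.sub(r'multiValued="[^"]*"', 'multiValued="true"', tag)
        # Attribute missing: add it just before the closing "/>" or ">".
        return re.sub(r'\s*(/?>)$', r' multiValued="true"\1', tag)

    return FIELD_TAG.sub(patch, schema_text)


# Illustrative schema fragment (assumed layout, not taken from a real file).
schema = (
    '<field name="samplingProcedure" type="text_en" multiValued="false" '
    'stored="true" indexed="true"/>\n'
    '<field name="title" type="text_en" multiValued="false" '
    'stored="true" indexed="true"/>'
)

print(set_multivalued(schema, {"samplingProcedure"}))
```

Only the fields named in the set are changed, so the list of single-valued CVV fields has to be collected first (for example from the reindex errors). Solr still needs a restart and a reindex afterwards, as noted above.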

Author

bappun commented Apr 11, 2022

Thank you! We changed multiValued to true for our CVV fields and the reindex worked. I will create the issue.
