[BUG] Date search results in error #936

Closed
csidirop opened this issue Mar 31, 2023 · 12 comments
@csidirop
Contributor

Description

Trying to use the date search results in an error.

This input:
[screenshot]

leads to this error:
[screenshot]

Nothing is written to the log.

Reproduction

Steps to reproduce the behaviour:

  1. Go to your search view page
  2. Set the date range
  3. Click on search
  4. See error

Expected Behavior

Filter results.

Screenshots and Examples

Environment

  • OS version: Debian
  • RDBMS version: MariaDB (MySQL 5.5.5-10.11.2-MariaDB-1:10.11.2+maria~ubu2204)
  • Apache Solr version: 8.11.2 (docker)
  • TYPO3 version: 10.4.37-dev
  • PHP version: 7.4.33
@michaelkubina
Collaborator

michaelkubina commented Apr 11, 2023

Hello Chris,
I am trying to reproduce your error, but I cannot trigger this exception. I see that you are using the extended search, but it worked in my tests as well, so that does not seem to be the cause.

Could you provide me some additional information?

  • Which browser are you using?
  • Are you using the general schema.xml in your Solr instance?
  • What does an indexed document look like in your Solr? Have dates been indexed?
  • What does the METS look like for a particular object?

The reason I am asking is that Solr is strict about the format of the date input:

  • If your browser sends "16000101" instead of "1600-01-01", we would run into a Solr error. In that case we would need to sanitize the date format at query time as well, as is already done in the indexer.
  • If the indexer was not able to extract any dates (e.g. because of a wrong XPath or invalid date formats), none of your indexed documents would contain a date field, and we would run into an "undefined field" error.
  • Or maybe it's something completely different that I have not thought of yet.
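
To illustrate the kind of query-time sanitizing mentioned in the first point, here is a minimal Python sketch. It is not the code Kitodo.Presentation uses; the function name normalize_date and the two accepted input shapes are assumptions made for this example.

```python
import re
from datetime import date
from typing import Optional

# Assumption for this sketch: only "YYYYMMDD" and "YYYY-MM-DD" are accepted.
_COMPACT = re.compile(r"^(\d{4})(\d{2})(\d{2})$")
_HYPHENATED = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")


def normalize_date(raw: str) -> Optional[str]:
    """Return the input as "YYYY-MM-DD", or None if it is not a usable date."""
    value = raw.strip()
    match = _COMPACT.match(value) or _HYPHENATED.match(value)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        # date() rejects impossible values such as month 13 or day 32.
        return date(year, month, day).isoformat()
    except ValueError:
        return None
```

A caller would pass the raw form value through normalize_date before building the Solr query and treat None as "reject the input" rather than sending it on.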

In any case, I need some further insight.

With best regards,
Michael

@csidirop
Contributor Author

csidirop commented Apr 17, 2023

Hi Michael,
thank you for your reply!

Could you provide me some additional information?

It is still possible that my setup or the config is not correct. You can try out my docker environment: https://github.com/UB-Mannheim/kitodo-presentation-docker/tree/dfg-viewer-6.x-ocr
Just start it with docker compose --profile with-solr up (the first start will take a while)

@michaelkubina
Collaborator

I have no Docker setup yet and would need to set it up on my private desktop (which I will do if we can't resolve this earlier), as it is a longer process to get it onto my employer's machine.

But I took a quick look at your repository. Am I right in assuming that your Docker setup actually uses this configuration? https://github.com/UB-Mannheim/kitodo-presentation-docker/tree/dfg-viewer-6.x-ocr/volumes/solr/solrconfig/dlf/conf

Or are you really using the one you mentioned, and I just can't find it in your repository? Are you sideloading it from somewhere else? I can't find it in the docker-compose.yml. https://github.com/kitodo/kitodo-presentation/tree/4.x/Configuration/ApacheSolr/configsets/dlf/conf

Because the one in your repository is different from the one in the current master branch. In https://github.com/UB-Mannheim/kitodo-presentation-docker/tree/dfg-viewer-6.x-ocr/volumes/solr/solrconfig/dlf/conf your schema.xml is missing the new "daterange" field type and the declaration of the "date" field, both of which are required for the date search.

Also, the dynamic field "*_sorting" is still of the type "standard", which causes erratic sorting because it uses an analyzer chain. It should be changed to the "string" field type instead (see this issue, which was resolved in PR #863).
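
The three schema points above can be checked mechanically. The following Python sketch is only an illustration: check_schema is a made-up helper, and the two sample schemas are invented stand-ins, not the actual Kitodo.Presentation files. Only the names "daterange", "date", "*_sorting", "standard" and "string" come from this thread.

```python
import xml.etree.ElementTree as ET

# Invented miniature schemas for illustration only.
OLD_SCHEMA = """<schema name="dlf" version="1.6">
  <fieldType name="standard" class="solr.TextField"/>
  <dynamicField name="*_sorting" type="standard" indexed="true" stored="true"/>
</schema>"""

NEW_SCHEMA = """<schema name="dlf" version="1.7">
  <fieldType name="string" class="solr.StrField"/>
  <fieldType name="daterange" class="solr.DateRangeField"/>
  <field name="date" type="daterange" indexed="true" stored="true"/>
  <dynamicField name="*_sorting" type="string" indexed="true" stored="true"/>
</schema>"""


def check_schema(schema_xml: str) -> list:
    """Return a list of problems that would break the date search or sorting."""
    root = ET.fromstring(schema_xml)
    # Note: this sketch only looks at the <fieldType> spelling of the element.
    field_types = {el.get("name") for el in root.iter("fieldType")}
    fields = {el.get("name") for el in root.iter("field")}
    dynamic = {el.get("name"): el.get("type") for el in root.iter("dynamicField")}

    problems = []
    if "daterange" not in field_types:
        problems.append('missing fieldType "daterange"')
    if "date" not in fields:
        problems.append('missing field "date"')
    if dynamic.get("*_sorting") != "string":
        problems.append('dynamicField "*_sorting" is not of type "string"')
    return problems
```

Running check_schema on the contents of the schema.xml that the Solr container actually mounts shows quickly whether an outdated configset is in use.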

Years like 17XX, or ascertained dates like [1812] or [ca. 1850], are not valid ISO 8601 or w3cdtf values, so the indexer will omit them from the date field. This needs some care in some of our METS files as well, because the data was imported from our OPAC this way. A full overview of the accepted formats for the date field (in compliance with ISO 8601 and extended ISO 8601) can be found here: #869 (comment)
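
As a rough way to find such values in existing metadata before indexing, the Python sketch below splits a list of date strings into ones that look indexable and ones that would be skipped. The pattern is a deliberately simplified assumption (year, year-month, or full date only); the authoritative list of accepted formats is the one linked in #869, and split_indexable is a hypothetical helper.

```python
import re

# Simplified assumption: accept only YYYY, YYYY-MM or YYYY-MM-DD.
ISO_SUBSET = re.compile(
    r"^\d{4}(-(0[1-9]|1[0-2])(-(0[1-9]|[12]\d|3[01]))?)?$"
)


def split_indexable(values):
    """Split date strings into (indexable, skipped) by the simplified pattern."""
    indexable, skipped = [], []
    for value in values:
        target = indexable if ISO_SUBSET.match(value.strip()) else skipped
        target.append(value)
    return indexable, skipped
```

Values that end up in the skipped list, such as the bracketed catalogue forms, need to be cleaned up in the METS/MODS source rather than at query time.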

Do you mind checking your schema.xml again?

@csidirop
Contributor Author

Because the one in your repository is different from the one in the current master branch. In https://github.com/UB-Mannheim/kitodo-presentation-docker/tree/dfg-viewer-6.x-ocr/volumes/solr/solrconfig/dlf/conf your schema.xml is missing the new "daterange" field type and the declaration of the "date" field, both of which are required for the date search.

You are right, I was using the outdated v1.6 instead of v1.7.

The error is now gone, but no results are shown. Just the same search page. Anything still missing?

@michaelkubina
Collaborator

This is promising...great to hear!

Sorry that I have to ask, but have you re-indexed the documents or indexed any new documents afterwards? If not, the documents in Solr are still indexed according to the old schema.xml and are therefore missing the date field in the index itself. Re-indexing is required so that the value gets written to the index.
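
One way to see whether the re-index actually wrote the date field is to ask Solr how many documents match a range on it. The Python sketch below only builds the request URL. The base URL http://localhost:8983/solr and the core name dlf are assumptions about the local setup, and date_query_url is a made-up helper.

```python
from urllib.parse import urlencode


def date_query_url(base_url: str, core: str, start: str = "*", end: str = "*") -> str:
    """Build a Solr select URL that counts documents with a date in [start, end].

    With the default "*" bounds this matches every document that has any
    value in the "date" field.
    """
    params = {
        "q": f"date:[{start} TO {end}]",
        "rows": 0,  # only the numFound count is of interest here
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


# Assumed local defaults; adjust to the actual Solr host and core.
print(date_query_url("http://localhost:8983/solr", "dlf"))
print(date_query_url("http://localhost:8983/solr", "dlf", "1600-01-01", "1700-12-31"))
```

If numFound in the response is 0 for the open-ended query, no document in the index carries a date value, and the date search cannot return results.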

Sidenote: If you have applied the changes to "*_sorting" as well, then you need to start with a fresh index anyway, because it would otherwise result in datatype-mismatch errors.

@csidirop
Contributor Author

I'm starting a fresh container and thus have to re-index the documents every time for those tests.

Sidenote: If you have applied the changes to "*_sorting" as well

I replaced the whole config folder with the updated one in presentation v4. Should I have done something else?

@michaelkubina
Collaborator

I replaced the whole config folder with the updated one in presentation v4. Should I have done something else?

No, this is fine!

I believe I understand the issue now...

You will not find the date field in the documents of your document repository (in your TYPO3 backend), as there is currently no output intended for it in tx_dlf_documents.php and no label in Labels.xml, though that may be worth a discussion.
It's the same with the language field, which has no representation there either, although it has its own view helper for translating the ISO 639 codes, which is applied to language facets, metadata and the list view.
So looking at a document in the TYPO3 backend will always show you a "missing" date field and a "missing" language code, even if both get indexed properly.

For the language code to work, the metadata needs to be set up in the metadata repository with the index name that is referenced in the source code, in this case language.

I believe the issue you have right now is that the metadata needs to be set up in your TYPO3 backend as well. You need a field with the index name date, either with a custom XPath or with the built-in fallback from MetadataDefaults.php => ./mods:originInfo/*[@encoding="iso8601" or @encoding="w3cdtf"][@keyDate="yes"]. Without it, Kitodo.Presentation won't index the date by itself, so it will be missing from the Solr index and the search returns no results.
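
To make the fallback XPath more concrete, here is a small Python sketch that applies the same selection rule to a MODS snippet. The sample record is invented, key_dates is a hypothetical helper, and the predicate is emulated in plain Python because the standard library's ElementTree does not support "or" inside XPath predicates. It says nothing about how Kitodo.Presentation evaluates the expression internally.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Invented sample record: one free-text date and one encoded key date.
SAMPLE_MODS = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:originInfo>
    <mods:dateIssued>[ca. 1850]</mods:dateIssued>
    <mods:dateIssued encoding="w3cdtf" keyDate="yes">1850</mods:dateIssued>
  </mods:originInfo>
</mods:mods>"""


def key_dates(mods_xml: str) -> list:
    """Emulate ./mods:originInfo/*[@encoding="iso8601" or
    @encoding="w3cdtf"][@keyDate="yes"] relative to the root element."""
    root = ET.fromstring(mods_xml)
    found = []
    for origin in root.findall("./mods:originInfo", MODS_NS):
        for child in origin:
            has_encoding = child.get("encoding") in ("iso8601", "w3cdtf")
            is_key_date = child.get("keyDate") == "yes"
            if has_encoding and is_key_date:
                found.append((child.text or "").strip())
    return found


print(key_dates(SAMPLE_MODS))
```

Only the element that carries both an accepted encoding and keyDate="yes" is selected; the bracketed free-text date is ignored, which matches the behaviour described earlier in this thread.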

[screenshot: metadata]

@csidirop
Contributor Author

So the Creation Date is not the same as date?

But after adding date as you suggested, the date search works as expected:
[screenshot]
[screenshot]

Thanks!

Is there any reason why it's not a default metadata index? I am not that familiar with the MODS specs, but the date should be more or less the same key.

@sebastian-meyer
Member

Is there any reason why it's not a default metadata index? I am not that familiar with the MODS specs, but the date should be more or less the same key.

It actually is part of the default metadata configuration: https://github.com/kitodo/kitodo-presentation/blob/master/Resources/Private/Data/MetadataDefaults.php#L83-L101

@csidirop
Contributor Author

Oh how could I miss that. I'm still working on the 4.0 branch, which doesn't have my changes (4d732a4) here yet.

Another question, not 100% related: in Presentation v3.x all metadata and structures were created automatically (or were they created by the DFG-Viewer setup?), but in Presentation v4 they are not:

TYPO3 v9 & presentation v3.3.4 & DFG-Viewer 5.2:
[screenshot]

TYPO3 v10 & presentation v4.0 & DFG-Viewer 6.0:
[screenshot]

Is there a reason for this?

@sebastian-meyer
Member

Kitodo.Presentation always required manually importing the default settings for every tenant using the "New Tenant" backend module. That's because we can't guess which sysfolder or pagetree the user wants to have Kitodo.Presentation installed to (or if he/she wants to use the default settings at all). Also, if you have a multi-tenant installation you'll want to import the settings into multiple sysfolders.

The DFG-Viewer on the other hand is meant to be a standalone web service always using the official METS/MODS mappings of the "DFG-Praxisregeln", that's why its installation routine automatically creates all necessary pages and sysfolders, imports the default configuration and sets everything up ready to go.
(It's just a coincidence that many developers use the DFG-Viewer to conveniently set up a Kitodo.Presentation instance for development and testing. That's fine, just be aware that not everything you see is purely Kitodo.Presentation.)

@csidirop
Contributor Author

csidirop commented Apr 24, 2023

Yes, but what I meant is that structures and metadata used to be automatically generated and now they are not.
But - as I now understand - that is probably a problem with the DFG-Viewer and should be addressed there. Thanks
