Enhancement: sort items by name of file type #963

Closed
hofmannc opened this Issue Nov 29, 2017 · 4 comments

hofmannc commented Nov 29, 2017

Current situation: Items are grouped by file type, not by file type name.
[Screenshot: grafik]

Proposed situation: Sort by name of file type.
We discussed this earlier; Melissa brought it up again.

haarli added this to the imeji 4.3 milestone Jul 27, 2018

MPDLbrede commented Jul 30, 2018

Items are currently sorted alphabetically by their MIME type, i.e.:

| File name | MIME type | Shown abbreviation |
|---|---|---|
| test.pdf | "application/pdf" | pdf |
| test.docx | "application/vnd.openxml-formats .." | docx |
| test.zip | "application/zip" | zip |
| test.mp3 | "audio/mpeg" | mpga |
| test.jpg | "image/jpg" | jpg |
| test.tiff | "image/tiff" | tiff |
| test.txt | "text/plain" | txt |
| test.r | "text/plain" | txt |
| test.wmv | "video/x-ms-wmv" | wmv |

(1) The MIME type is used as the sort criterion, yet it isn't shown to the user.
The user has no indication why files are sorted the way they are.
(2) Sorting alphabetically by MIME type classifies image, audio and video files well.
These files have consistent MIME types like "image/file extension",
"audio/file extension", "video/file extension".
(3) This is not the case for files that belong to the overall category "text documents":
- .dat, .txt, .r etc. are classified as "text/plain"
- .doc is classified as "application/msword"
- .docx is classified as "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
- .ppt is classified as "application/mspowerpoint"
- .pdf is classified as "application/pdf"
etc.
Thus "text" documents are torn apart by this type of ordering.
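
To illustrate point (3), here is a minimal Python sketch that sorts a handful of the files listed above by their MIME type. The sample list and the variable names are assumptions made for illustration; this is not imeji code.

```python
# (file name, MIME type) pairs taken from the table and list above.
files = [
    ("test.txt", "text/plain"),
    ("test.pdf", "application/pdf"),
    ("test.jpg", "image/jpg"),
    ("test.doc", "application/msword"),
    ("test.tiff", "image/tiff"),
    ("test.r", "text/plain"),
]

# Sort alphabetically by MIME type, as the Items view currently does.
by_mime = sorted(files, key=lambda f: f[1])
order = [name for name, _ in by_mime]
print(order)
```

In the resulting order, test.doc and test.pdf come first ("application/..."), the images follow, and test.txt and test.r come last ("text/plain"), so the "text" documents end up on both sides of the images.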

ioverka (Member) commented Jul 30, 2018

@MPDLbrede Thanks a lot for the detailed information!

From a functional perspective, I believe that most users have no grasp of the structure or meaning of MIME types. I would therefore suggest switching the sorting to use the "shown abbreviation" or the file extension instead.

MPDLbrede added a commit that referenced this issue Aug 17, 2018

MPDLbrede added a commit that referenced this issue Aug 29, 2018

#963 Sort items by name of file type
In Items view the items were previously sorted by their mime type.
New: The items are sorted by their file extension.

(1) File extension:
The file extension of a file is solely derived from the provided file
name. In case there is no file extension provided, the program uses "".
A new field "fileextension" is added to "items" type in ElasticSearch
and used for sorting of results.

(2) Multilevel sorting:
Items that are sorted by last modification, file name, file size or file
extension are additionally sorted alphabetically by file name whenever
several results exist for one category.

(3) GUI: The label "sort by file type" stays. On Windows systems, users
sort by "file type" when they sort by file extension, so the current
label matches users' expectations.

You need to re-index the data in order to use the new functionality.
Follow these steps:
- Start up Server
- Login as admin
- Do Admin>Tools>Re-Index
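
Points (1) and (2) of the commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the imeji implementation: the function name `file_extension`, the sample file names, and the field name `filename` in the sort clause are assumptions; only `fileextension` is named in the commit message.

```python
import os


def file_extension(filename: str) -> str:
    """Derive the extension from the file name only; "" if there is none."""
    _, ext = os.path.splitext(filename)
    return ext[1:]  # drop the leading dot


names = ["b.txt", "a.pdf", "README", "a.txt", "c.pdf"]

# Multilevel sort: first by extension, then alphabetically by file name.
ordered = sorted(names, key=lambda n: (file_extension(n), n))
print(ordered)

# A comparable two-level sort clause for an Elasticsearch query body.
# "filename" is a hypothetical field name.
sort_clause = [{"fileextension": "asc"}, {"filename": "asc"}]
```

Files without an extension sort first because their key is the empty string.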
MPDLbrede commented Aug 30, 2018

On dev-imeji.mpdl.mpg.de/imeji the "alphabetical sorting" is lexical. This is due to the ElasticSearch analyzer that is configured for the dev environment ("keyword" instead of "ducet_sort").

Note: Reconfigure the dev environment to use the "ducet_sort" analyzer with ElasticSearch.

In the QA environment "ducet_sort" is configured.
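
The difference between the two orderings can be approximated in plain Python. This is only a rough sketch: `str.casefold` stands in for a collation-based sort and is not the DUCET (Default Unicode Collation Element Table) algorithm itself, and the sample names are made up for illustration.

```python
names = ["banana.txt", "Apple.txt", "apple.txt", "Cherry.txt"]

# Lexical order compares raw code points, so all uppercase letters
# sort before any lowercase letter.
lexical = sorted(names)

# Case-insensitive order, used here as a simple stand-in for
# collation-based sorting.
collated = sorted(names, key=str.casefold)

print(lexical)
print(collated)
```

With lexical ordering "Cherry.txt" lands before "apple.txt", which is usually not what users expect from alphabetical sorting.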

MPDLbrede added a commit that referenced this issue Sep 21, 2018

#963 Added check for unusual/unexpected file extensions
If a file extension
- has no characters or more than 20 characters, or
- contains characters other than a-z, A-Z, 0-9, _
a blank is saved as the file extension instead.
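
The check described in this commit message could look roughly like the following Python sketch. The function name `checked_extension` is hypothetical, and the sketch assumes that "a blank" means the empty string, in line with the earlier commit message.

```python
import re

# 1 to 20 characters, each one of a-z, A-Z, 0-9 or underscore.
_VALID_EXTENSION = re.compile(r"[A-Za-z0-9_]{1,20}")


def checked_extension(ext: str) -> str:
    """Return ext if it looks like a usual extension, otherwise ""."""
    return ext if _VALID_EXTENSION.fullmatch(ext) else ""


print(checked_extension("pdf"))      # accepted
print(checked_extension("tar.gz"))   # rejected: contains a dot
print(checked_extension("x" * 21))   # rejected: longer than 20 characters
```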
hofmannc commented Oct 2, 2018

Testserver: qa imeji
Browser: ff
Version: version 4.3 - build date 2018-09-24 11:15:25       
User: standard user
result: ok

hofmannc closed this Oct 2, 2018
