creators browse view sort #338

Open
photomedia opened this issue Aug 17, 2015 · 2 comments

@photomedia photomedia commented Aug 17, 2015

After upgrading from 3.2.4 to 3.3.12, the sort wasn't working well on the creators browse view.

For example, “Liscio, Sinella” is listed in between “Li, Sanujun” and “Li, Shixiong”. People with the same last name should be listed next to each other.

Screenshot: [image "1" omitted]

Our original browse view definition for authors (cfg.d/views.pl) was the following:

    {
        id => "creators",
        allow_null => 0,
        hideempty => 1,
        menus => [
            {
                fields => [ "creators_name" ],
                new_column_at => [1, 1],
                mode => "sections",
                open_first_section => 1,
                group_range_function => "EPrints::Update::Views::cluster_ranges_30",
                grouping_function => "EPrints::Update::Views::group_by_a_to_z",
            },
        ],
        order => "-date/title",
        variations => [
            "type",
            "DEFAULT",
        ],
    },

With Adam Field's help, we were able to patch this issue in the following way:

Ordervalues are generated in the metafield classes. The one used by a name is here:

https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/MetaField/Multipart.pm#L185

...which is called from MetaField::sort_values:

https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/MetaField.pm#L946

...which uses the superclass of Name fields:

https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/MetaField/Multipart.pm#L185

...which tab-joins each sub-part. If tabs are sorted before the letter 'A', this could be what's causing the problem.

Note that this can be overridden in a field definition, which will allow you to test this:

https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/MetaField.pm#L1929

So, in your field definition (in eprint_field.pl), add something like:

    name => 'creators',
    type => 'compound',
    multiple => 1,
    fields => [
        {
            sub_name => 'name',
            type => 'name',
            hide_honourific => 1,
            hide_lineage => 1,
            family_first => 1,
            make_single_value_orderkey => 'the_name_orderval_function_af',
        },

and in cfg/cfg.d/z_local.pl:

    $c->{the_name_orderval_function_af} = sub
    {
        my ($field, $value, $dataset) = @_;

        # Join family and given name with a separator to build the order key.
        return $value->{family} . 'ZZZZZ' . $value->{given};
    };

I tested this using the "ZZZZZ" string and it worked in terms of keeping the people with the same last name next to each other, but it puts the shorter names at the end, rather than the beginning – and I think that the convention (probably based on how the phone book does it) is to put them first. This is why I used "11111" as the join string, and that worked well.
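For reference, here is a rough, standalone sketch of the same override with the "11111" join string and a guard against missing name parts. The `$c = {}` line and the `$SEP` variable are stand-ins added for illustration only; in a real repository `$c` is supplied by EPrints and the separator would simply be written inline.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the EPrints config object, so the sketch runs on its own.
my $c = {};

# Assumed separator: "11111" is the join string reported to work above.
my $SEP = '11111';

$c->{the_name_orderval_function_af} = sub
{
    my ($field, $value, $dataset) = @_;

    # Treat a missing name part as an empty string to avoid undef warnings.
    my $family = defined $value->{family} ? $value->{family} : '';
    my $given  = defined $value->{given}  ? $value->{given}  : '';

    return $family . $SEP . $given;
};

# Quick check outside EPrints: the field and dataset arguments are unused here.
print $c->{the_name_orderval_function_af}->(undef, { family => 'Li', given => 'Peng' }, undef), "\n";
# prints "Li11111Peng"
```

Only `family` and `given` are used because the field definition above hides the honourific and lineage parts.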

I also tested it with the default "\t" and with “\t\t\t\t\t\t\t\t” and it didn’t solve the problem – the result didn't keep the people with the same last name together. For example, we would have something like this:

Li, Peng
Li, Pengjie
Lipscombe, Carla L
Lipsky, Naomi
Lipton, Rebecca
Li, Qiao
Li, Qing

It looks to me like EPrints is actually ignoring the default \t separator for the purposes of sorting.
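The interleaving can be reproduced outside EPrints with a small script. This is only a sketch: it assumes the browse view ends up sorting the tab-joined keys with a default `Unicode::Collate` object, which I have not confirmed against the EPrints source, and the keys are made up from the names listed above.

```perl
use strict;
use warnings;
use Unicode::Collate;

# Tab-joined "family\tgiven" keys, mimicking the default order values.
my @keys = ("Li\tQiao", "Lipsky\tNaomi", "Li\tPeng", "Lipton\tRebecca");

# Plain string comparison: the tab sorts before any letter.
my @bytewise = sort { $a cmp $b } @keys;

# Default Unicode collation: the tab is ignored at the primary level.
my @collated = Unicode::Collate->new->sort(@keys);

# Render keys as "Family, Given" for display.
sub show {
    my @k = @_;
    s/\t/, / for @k;
    return join('; ', @k);
}

print "cmp:     ", show(@bytewise), "\n";
print "collate: ", show(@collated), "\n";
```

With plain `cmp` both "Li" entries come first, while the collated order places "Lipsky" and "Lipton" between "Li, Peng" and "Li, Qiao", the same pattern as in the browse view.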

@phluid61 phluid61 commented Feb 17, 2016

It's because of the Unicode Collation Algorithm used by Unicode::Collate (see: EPrints::Update::Views::default_sort).

Certain characters (whitespace, punctuation, etc.) are handled differently than in simple ASCII lexicographic sort. In other words, it seems to intentionally skip the tab character.

A dollar sign ($) seems to work the way we expect, should have as little chance of appearing in an author name as the numeral 1, and stands out (for debugging).
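A quick way to compare candidate separators is to sort a few sample names with each one. This is a sketch under the assumption that a default `Unicode::Collate` object behaves like the collator EPrints uses; the sample names come from the original report and `sorted_with` is a helper made up for this example.

```perl
use strict;
use warnings;
use Unicode::Collate;

my @people = (
    { family => 'Li',     given => 'Shixiong' },
    { family => 'Liscio', given => 'Sinella'  },
    { family => 'Li',     given => 'Sanujun'  },
);

my $collator = Unicode::Collate->new;

# Return display names sorted by an order key built with the given separator.
sub sorted_with {
    my ($sep) = @_;
    my %display;
    for my $p (@people) {
        $display{ $p->{family} . $sep . $p->{given} } = "$p->{family}, $p->{given}";
    }
    return map { $display{$_} } $collator->sort(keys %display);
}

for my $sep ("\t", '$', '11111', 'ZZZZZ') {
    (my $label = $sep) =~ s/\t/\\t/;    # make the tab visible in the output
    printf "%-6s %s\n", $label, join(' | ', sorted_with($sep));
}
```

With the default collation table, `$` and `11111` should keep the two "Li" entries together ahead of "Liscio", the tab should leave "Liscio" between them, and `ZZZZZ` should push the shorter family name to the end.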

@Pfiffikus Pfiffikus commented Aug 12, 2020

Presumably some of the functions added in 3.4.2 could help?
