Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to_param when attribute has multibyte character #13386

Merged
merged 1 commit into from Dec 18, 2013
Merged

Fix to_param when attribute has multibyte character #13386

merged 1 commit into from Dec 18, 2013

Conversation

rono23
Copy link
Contributor

@rono23 rono23 commented Dec 18, 2013

When attribute has only multibyte character, to_param returns id + "-". I think it is not pretty.

@rafaelfranca
Copy link
Member

This seems good to me. @javan what do you think? Will this break any behavior we are relying?

@pixeltrix
Copy link
Contributor

Typically to_param values are to be used as part of a URI path so should be valid pchar according to RFC 3986, so I think this change makes sense.

@javan
Copy link
Contributor

javan commented Dec 18, 2013

👍 on the change. I'm not crazy about building on the nest of (...).present? checks though. Perhaps define each var first and just use those in the conditional. Up to you guys.

@norman
Copy link
Contributor

norman commented Dec 18, 2013

When I wrote the transliterate method used internally by parameterize I didn't change the API at all in order to get it into Rails 3 fairly close to its release. But one thing I've never liked is the fact that transliterate has no option to preserve UTF-8 characters it can't transliterate. Essentially, to_param is more or less useless for Russian, Arabic, Hebrew, Japanese, Korean, Chinese, Vietnamese, Hindi, and other languages used by more than half of the world's population.

Unicorn mode ON:

It would be nice, I think, to allow an option to not delete UTF-8 characters, so that you could generate a parameterized string like 4-戦場ヶ原-ひたぎ if you want to.

Unicorn mode OFF.

@rafaelfranca
Copy link
Member

@norman I agree with you but even if we generate parameterized string with UTF-8 characters our router is not prepared to handle it very well (I was bite for a lot of bugs related with this in the last month). So I think we should stay with this version here and when we change the router to handle UTF-8 characters we can change transliterate

@pixeltrix
Copy link
Contributor

Yep, fine for 4.1 and then look at it again when we support IRIs

"#{default}-#{result.squish.truncate(20, separator: /\s/, omission: nil).parameterize}"
if (default = super()) && (result = send(method_name).to_s).present? &&
(param = result.squish.truncate(20, separator: /\s/, omission: nil).parameterize).present?
"#{default}-#{param}"
else
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@javan Thank you for your feedback. I will fix like below and then rebase, fine?

result = send(method_name).to_s
param = result.squish.truncate(20, separator: /\s/, omission: nil).parameterize

if (default = super()) && param.present?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the short-circuiting nature of the current structure makes the most sense. We don't need result unless there's a default and don't need param unless there's a result. I guess leave it as-is or fix up the formatting a little. I'm not sure what the preferred style in Rails is, perhaps:

define_method :to_param do
  if (default = super()) && 
       (result = send(method_name).to_s).present? &&
         (param = result.squish.truncate(20, separator: /\s/, omission: nil).parameterize).present?
    "#{default}-#{param}"
  else
    default
  end
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know preferred style, but we can find both, ex) here and here etc.
I will wait for responses from others. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use the one @javan used here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rafaelfranca Thank you! I will fix it and force push.

@rono23
Copy link
Contributor Author

rono23 commented Dec 18, 2013

@javan @rafaelfranca Thank you for your advice. I updated the style.

rafaelfranca added a commit that referenced this pull request Dec 18, 2013
Fix to_param when attribute has multibyte character
@rafaelfranca rafaelfranca merged commit 4779aaa into rails:master Dec 18, 2013
@rafaelfranca
Copy link
Member

Thank you so much for trying the beta and fixing this bug.

@kuldeepaggarwal
Copy link
Contributor

@rafaelfranca So, does Rails support Unicode character routes ?

@pixeltrix
Copy link
Contributor

@kuldeepaggarwal RFC 3986 requires path chars other than alphanumerics to be %-encoded. HTML5 includes partial support for IRIs and it's something we'll be investigating for a future version of Rails.

@jeremy
Copy link
Member

jeremy commented Dec 19, 2013

Please Do Investigate support for routing IRIs in Journey!

Major performance win if we can avoid transliteration when we to_param to generate slugs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants