Add note about utf8mb4

1 parent 2b2e1d3 commit 49435c38aa4acfd16ea2da7fa60b964e63fe6260 @oscardelben committed May 4, 2013
Showing with 4 additions and 2 deletions.
  1. +4 −2 2013/05/
@@ -12,14 +12,16 @@ message.body # => "a simple \ud83d\udcaa test"
message.reload.body # => "a simple "
-As you can see the message gets truncated.
+As you can see the message gets truncated.
The fix depends on whether you want to patch ActiveRecord or do a conversion only on the fields you care about. Here's how you would overwrite a specific field:
def body=(value)
- write_attribute :body, value.gsub(/[\u{10000}-\u{10FFFF}]/, "\uFFFF")
+ write_attribute :body, value.gsub(/[\u{10000}-\u{10FFFF}]/, "something")
This solution is pretty fast (can convert a 1 mil string in ~20ms) and non-invasive. If you want to monkey-patch every field, you'll have to hook into ActiveRecord's write_attribute directly, with similar outcomes.
+Note that this is only a temporary hack; you should upgrade to MySQL 5.5+ and use the utf8mb4 character set, which fully supports 4-byte characters.
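
For readers who want to try the per-field workaround from the diff above outside a Rails model, here is a minimal standalone sketch. The helper name `strip_four_byte_chars`, the constant `FOUR_BYTE_CHARS`, and the default replacement U+FFFD are assumptions made for illustration; the post itself only shows the `gsub` call inside a `body=` setter and leaves the replacement string up to you.

```ruby
# Hypothetical helper (not from the original post): replaces every code point
# outside the Basic Multilingual Plane. These are the characters that need
# 4 bytes in UTF-8, which MySQL's 3-byte "utf8" character set cannot store.
FOUR_BYTE_CHARS = /[\u{10000}-\u{10FFFF}]/

# The replacement is an assumption: U+FFFD (REPLACEMENT CHARACTER) sits inside
# the BMP, so it encodes to 3 bytes and fits in a "utf8" column.
def strip_four_byte_chars(value, replacement = "\uFFFD")
  value.gsub(FOUR_BYTE_CHARS, replacement)
end

# "\u{1F4AA}" is the same character the post writes as the surrogate pair
# \ud83d\udcaa.
cleaned = strip_four_byte_chars("a simple \u{1F4AA} test")
puts cleaned.bytes.length
```

Inside a model, the setter would pass the result to `write_attribute`, as in the diff. Text before and after the replaced character is preserved, unlike the truncation shown in the `message.reload.body` example.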
